* [Bug tree-optimization/106415] New: loop-ivopts prevents correct usage of dbra with 16-bit loop counters on m68k
@ 2022-07-22 17:32 undefinedopcode2 at gmail dot com
  2022-07-22 18:50 ` [Bug tree-optimization/106415] " undefinedopcode2 at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: undefinedopcode2 at gmail dot com @ 2022-07-22 17:32 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106415

            Bug ID: 106415
           Summary: loop-ivopts prevents correct usage of dbra with 16-bit
                    loop counters on m68k
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: undefinedopcode2 at gmail dot com
  Target Milestone: ---

Created attachment 53338
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53338&action=edit
C file that reproduces the problem.

When targeting m68k, certain loops with 16-bit counters should trivially
generate a DBRA instruction. Instead, GCC's optimization passes convert the IV
to 32 bits, which requires extra logic to check the upper half of the register.
More specifically, this happens in loops where the number of iterations is
known at compile time.

This additional code is completely useless since we know the loop count fits in
16 bits.

I am using GCC 11.2.0 hosted on ARM64 macOS and targeting m68k. All code
snippets were compiled with `-O3 -std=c99 -march=68000 -mtune=68000`.

Consider the following function:
void dbra_test1(short i) {
    do {
        foo(i);
    } while(--i != -1);
}

As expected, the generated body is a tiny loop consisting solely of call setup,
the call itself, call cleanup, and a DBRA:
.L2:
        movew %d2,%a0
        movel %a0,%sp@-
        jsr %a2@
        addql #4,%sp
        dbra %d2,.L2

Now consider this function, where we change the initial value of the loop count
to be a constant:
void dbra_test2(void) {
    short i = 15;
    do {
        foo(i);
    } while(--i != -1);
}

GCC generates the following code for the body of the loop:
.L7:
        movel %d2,%sp@-
        jsr %a2@
        addql #4,%sp
        dbra %d2,.L7
        clrw %d2
        subql #1,%d2
        jcc .L7

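Note the extraneous clrw/subql/jcc after the dbra. DBRA only counts in the low
16 bits of the register, and this tail is the sequence that extends it to a
32-bit counter.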
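To illustrate why the tail is dead code here, the following is a rough C model of the loop above. It is only a sketch: `dbra_step` and `count_calls` are hypothetical names invented for this illustration, the call to foo is replaced by a counter, and only the carry flag relevant to the jcc is modeled.

```c
#include <stdint.h>

/* Model of "dbra %d2,label": decrement the low 16 bits of the register
   and report whether the branch is taken (low word did not become -1).
   The upper 16 bits are left untouched. */
static int dbra_step(uint32_t *reg)
{
    uint16_t low = (uint16_t)(*reg & 0xFFFFu);
    low = (uint16_t)(low - 1u);
    *reg = (*reg & 0xFFFF0000u) | low;
    return low != 0xFFFFu;
}

/* Model of the .L7 loop: body, dbra, then the clrw/subql/jcc tail.
   Returns how many times the loop body (the call to foo) would run. */
static unsigned long count_calls(uint32_t d2)
{
    unsigned long calls = 0;
    for (;;) {
        do {
            calls++;                  /* stands in for the call to foo */
        } while (dbra_step(&d2));     /* dbra %d2,.L7 */

        d2 &= 0xFFFF0000u;            /* clrw %d2: low word is now 0 */
        int borrow = (d2 == 0);       /* subql #1,%d2 sets carry on borrow */
        d2 -= 1u;
        if (borrow)                   /* jcc .L7: taken only if carry clear */
            break;
    }
    return calls;
}
```

With `count_calls(15)` the tail falls straight through after 16 iterations, because the upper half of the register is already zero. The tail only loops again when the upper half is nonzero, for example `count_calls(0x0001000F)`, which cannot happen for a counter known to start at 15.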

During ivcanon, GCC transforms the second loop to run from 16 to 0 instead of
15 to -1. Later, during ivopts, it transforms the loop back into the 15 to -1
form, but promotes the variable from short to int. Subsequent passes can no
longer tell that the counter fits in a short, and we end up with extraneous
checks inserted during codegen.
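As a rough illustration of the three loop shapes described above, here is a hand-written C approximation. These are not actual GIMPLE dumps; the `trace` type, `foo_stub`, and the three `form_*` functions are hypothetical names made up for this sketch, and the real intermediate code will differ in detail.

```c
#include <string.h>

/* Records the arguments that foo would have received. */
typedef struct {
    short vals[32];
    int n;
} trace;

static void foo_stub(trace *t, short v)
{
    if (t->n < 32)
        t->vals[t->n++] = v;
}

/* The loop as written in dbra_test2: a short counting 15 down to -1. */
static void form_source(trace *t)
{
    short i = 15;
    do {
        foo_stub(t, i);
    } while (--i != -1);
}

/* Roughly the shape after ivcanon: a separate count running 16 down to 0. */
static void form_ivcanon(trace *t)
{
    short i = 15;
    unsigned short remaining = 16;
    do {
        foo_stub(t, i);
        i--;
    } while (--remaining != 0);
}

/* Roughly the shape after ivopts: back to 15 down to -1, but in an int. */
static void form_ivopts(trace *t)
{
    int i = 15;
    do {
        foo_stub(t, (short)i);
    } while (--i != -1);
}

/* Returns 1 if two traces saw the same sequence of arguments. */
static int same_trace(const trace *a, const trace *b)
{
    return a->n == b->n &&
           memcmp(a->vals, b->vals, (size_t)a->n * sizeof a->vals[0]) == 0;
}
```

All three forms call the stub 16 times with the values 15 down to 0, so the promotion is semantically harmless. The cost is only that the 32-bit form no longer matches the 16-bit DBRA pattern directly.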

I've attached a simple file that reproduces the problem. GCC 2.95.3 performed
the operation correctly, but it's been broken since at least 4.3.2, possibly
earlier.

Thanks
--UD2


end of thread, other threads:[~2022-08-01 23:44 UTC | newest]

Thread overview: 8+ messages
2022-07-22 17:32 [Bug tree-optimization/106415] New: loop-ivopts prevents correct usage of dbra with 16-bit loop counters on m68k undefinedopcode2 at gmail dot com
2022-07-22 18:50 ` [Bug tree-optimization/106415] " undefinedopcode2 at gmail dot com
2022-07-25 16:51 ` [Bug target/106415] " pinskia at gcc dot gnu.org
2022-07-25 17:27 ` undefinedopcode2 at gmail dot com
2022-07-26  5:01 ` undefinedopcode2 at gmail dot com
2022-07-26  5:59 ` linkw at gcc dot gnu.org
2022-07-27 18:28 ` undefinedopcode2 at gmail dot com
2022-08-01 23:44 ` undefinedopcode2 at gmail dot com
