[Bug rtl-optimization/55160] New: [4.8 Regression] Counterproductive loop induction variable optimization


From: olegendo at gcc dot gnu.org @ 2012-11-01  1:41 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55160

             Bug #: 55160
           Summary: [4.8 Regression] Counterproductive loop induction
                    variable optimization
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
                CC: amylaar@gcc.gnu.org
            Target: sh*-*-* arm*-*-*


Starting with rev 192505 the following

int test_04 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += *--x;
  return s;
}

gets compiled to (SH, -O2 -m4 -ml):

        cmp/pl  r5
        bf/s    .L12
        mov     #0,r1
        mov     #0,r0
.L11:
        add     #-4,r4
        mov.l   @r4,r2
        add     #1,r1
        cmp/eq  r5,r1
        bf/s    .L11
        add     r2,r0
        rts
        nop
.L12:
        rts
        mov     #0,r0

whereas before (also on 4.7.3) it was:

        cmp/pl  r5
        bf/s    .L11
        mov     #0,r0
.L10:
        add     #-4,r4
        mov.l   @r4,r1
        dt      r5
        bf/s    .L10
        add     r1,r0
        rts    
        nop
.L11:
        rts    
        nop

In this case the inner loop code size effectively does not increase, but there
is overhead in setting up the loop.  Similar code is also generated on ARM.
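
As a rough C-level picture of the two listings above: the rev 192505 code
keeps a separate up-counting register and compares it against c on every
iteration, while the older code decrements c itself and branches until it
reaches zero (on SH the decrement and the test are the single `dt`
instruction).  The function names below are made up for illustration; this is
only a sketch of the two loop shapes, not compiler output.

```c
/* Shape of the rev 192505 loop: an extra counter i is incremented and
   compared against c (add #1,r1 ; cmp/eq r5,r1).  */
int sum_down_count_up (int *x, int c)
{
  int s = 0;
  if (c > 0)
    {
      int i = 0;
      do
        {
          s += *--x;
          ++i;
        }
      while (i != c);
    }
  return s;
}

/* Shape of the older loop: c itself is decremented and tested against
   zero (dt r5), so no second counter register is needed.  */
int sum_down_count_down (int *x, int c)
{
  int s = 0;
  if (c > 0)
    {
      do
        s += *--x;
      while (--c != 0);
    }
  return s;
}
```

Both functions walk the array backwards from x and return the same sum; the
only difference is how the trip count is tracked.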


Another similar case:

int test_03 (int* x, int c)
{
  int s = 0;
  for (int i = 0; i < c; ++i)
    s += x[i];
  return s;
}

With rev 192505 this gets compiled to:

        cmp/pl  r5
        bf/s    .L4
        shll2   r5
        add     r4,r5
        mov     #0,r0
.L3:
        mov.l   @r4+,r1
        cmp/eq  r5,r4
        bf/s    .L3
        add     r1,r0
        rts
        nop
.L4:
        rts
        mov     #0,r0
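
Read back into C, the listing above seems to drop the counter altogether and
run the loop until the pointer reaches a precomputed end address.  The sketch
below is my reading of that assembly, with a hypothetical function name; the
comments map each C statement to the instructions I believe it corresponds to.

```c
/* Shape of the rev 192505 loop for test_03: no counter, the loop ends
   when the pointer equals a precomputed end address.  */
int sum_until_end_pointer (int *x, int c)
{
  int s = 0;
  if (c > 0)              /* cmp/pl r5 ; bf/s .L4 */
    {
      int *end = x + c;   /* shll2 r5 ; add r4,r5 */
      do
        s += *x++;        /* mov.l @r4+,r1 ; add r1,r0 */
      while (x != end);   /* cmp/eq r5,r4 ; bf/s .L3 */
    }
  return s;
}
```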


Before rev 192505 it was:

        cmp/pl  r5
        bf/s    .L6
        mov     #0,r0
        shll2   r5
        add     #-4,r5
        shlr2   r5
        add     #1,r5
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts    
        nop

In this case, the old code contained useless loop setup code (the
shll2 / add / shlr2 / add sequence before the loop).  Ideally this should be
something like:

        cmp/pl  r5
        bf/s    .L6
        mov     #0,r0
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts    
        nop
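
To spell out why the setup sequence in the pre-192505 listing looks
redundant: if it is read as shll2 (multiply by 4), add #-4, shlr2 (logical
shift right by 2), add #1, then it computes ((4c - 4) >> 2) + 1, which is c
again whenever c > 0 and 4*c fits in 32 bits.  The sketch below mirrors that
arithmetic in C.  The function name is hypothetical, and the instruction
reading is my interpretation of the listing, not something taken from the
compiler sources.

```c
#include <stdint.h>

/* Mirrors the four setup instructions of the older test_03 listing,
   assuming 32-bit registers and a logical right shift for shlr2.  */
uint32_t old_setup_trip_count (uint32_t c)
{
  uint32_t r5 = c;
  r5 <<= 2;   /* shll2 r5   : 4 * c, the array size in bytes    */
  r5 -= 4;    /* add #-4,r5 : byte offset of the last element   */
  r5 >>= 2;   /* shlr2 r5   : back to an element index, c - 1   */
  r5 += 1;    /* add #1,r5  : number of iterations, c again     */
  return r5;
}
```

For any count that actually reaches the loop, the result equals the incoming
c, so `dt r5` could operate on c directly, as in the ideal listing above.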

Jörn, I've added you to CC because your commit (rev 192505) seems to have
triggered something there.  I'm not sure whether it is actually the cause of
this counterproductive transformation.



* [Bug rtl-optimization/55160] [4.8 Regression] Counterproductive loop induction variable optimization
From: amylaar at gcc dot gnu.org @ 2012-11-01  6:28 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55160

--- Comment #1 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> 2012-11-01 06:28:14 UTC ---
Author: amylaar
Date: Thu Nov  1 06:28:06 2012
New Revision: 193060

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193060
Log:
        PR target/55160
        * config/sh/sh.md (doloop_end): Use emit_jump_insn.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md



* [Bug target/55160] [4.8 Regression] Counterproductive loop induction variable optimization
From: amylaar at gcc dot gnu.org @ 2012-11-01  7:09 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55160

Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|sh*-*-* arm*-*-*            |sh*-*-*
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> 2012-11-01 07:09:25 UTC ---
(In reply to comment #0)
> Starting with rev 192505 the following
> 
> int test_04 (int* x, int c)
> {
>   int s = 0;
>   for (int i = 0; i < c; ++i)
>     s += *--x;
>   return s;
> }
> 
..  
> In this case the inner loop code size effectively does not increase, but there
> is overhead in setting up the loop.  Similar code is also generated on ARM.

In order to get the r192504 arm port to generate a doloop_end pattern,
I have to enable thumb2 support and modulo scheduling, e.g.:
-O2 --std=c99 -mthumb -march=armv7 -fmodulo-sched

With these options, r192505 also gets the doloop_end pattern.
Therefore, I have removed arm from the target list.



* [Bug target/55160] [4.8 Regression] Counterproductive loop induction variable optimization
From: olegendo at gcc dot gnu.org @ 2012-11-01 21:29 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55160

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-01 21:28:53 UTC ---
Author: olegendo
Date: Thu Nov  1 21:28:49 2012
New Revision: 193071

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193071
Log:
    PR target/55160
    * gcc.target/sh/pr55160.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr55160.c
Modified:
    trunk/gcc/testsuite/ChangeLog

