From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-405363-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 10855 invoked by alias); 1 Nov 2012 10:08:15 -0000
Received: (qmail 9896 invoked by uid 48); 1 Nov 2012 10:07:48 -0000
From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/55162] New: Loop ivopts cuts off top bits of loop counter
Date: Thu, 01 Nov 2012 10:08:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: olegendo at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID: <bug-55162-4@http.gcc.gnu.org/bugzilla/>
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-11/txt/msg00025.txt.bz2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55162

             Bug #: 55162
           Summary: Loop ivopts cuts off top bits of loop counter
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
            Target: sh*-*-*


The following function:

int test (int* x, unsigned int c)
{
  int s = 0;
  unsigned int i;
  for (i = 0; i < c; ++i)
    s += x[i];
  return s;
}

compiled for SH (-O2 -m4 -ml) results in the following code:

        tst     r5,r5      // c == 0 ?
        bt/s    .L6
        mov     #0,r0
        shll2   r5         // c <<= 2
        add     #-4,r5     // c += -4
        shlr2   r5         // c >>= 2 (unsigned shift)
        add     #1,r5      // c += 1
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts
        nop

If the function above is invoked with c = 0x80000000, the loop does only
0x40000000 iterations, because the shll2 / shlr2 pair drops the top two bits of
the counter.  This looks suspicious.

For example, passing a virtual address 0x00001000 and c = 0x80000000 to the
function should actually run over the address range 0x00001000 .. 0x80001000,
not 0x00001000 .. 0x40001000.
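
To make the arithmetic easier to follow, here is a minimal C sketch of the
four setup insns before .L3, assuming a 32-bit register that wraps on
overflow.  The helper name sh_setup_trip_count is made up for illustration and
is not part of GCC.  The c == 0 case is left out because the tst / bt/s pair
skips the loop before the setup runs.

```c
#include <stdint.h>

/* Hypothetical helper: models the SH loop setup on a 32-bit register.
   Each statement corresponds to one insn from the listing above. */
uint32_t sh_setup_trip_count(uint32_t c)
{
    uint32_t r5 = c;

    r5 = (uint32_t)(r5 << 2);   /* shll2 r5   : top two bits are shifted out */
    r5 = r5 - 4u;               /* add #-4,r5 : wraps modulo 2^32 */
    r5 = r5 >> 2;               /* shlr2 r5   : unsigned shift, zero fill */
    r5 = r5 + 1u;               /* add #1,r5 */

    return r5;                  /* value that dt counts down from */
}
```

Under these assumptions the helper returns 0x40000000 for c = 0x80000000,
and 1 for c = 0x40000001.  Small values such as c = 1 come back unchanged.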

I've also checked this on ARM.  There, the loop counter is transformed into an
end address, and the loop compares addresses instead of using a
decrement-and-test insn:

        cmp     r1, #0
        beq     .L4
        mov     r3, r0
        add     r1, r0, r1, asl #2
        mov     r0, #0
.L3:
        ldr     r2, [r3], #4
        cmp     r3, r1
        add     r0, r0, r2
        bne     .L3
        bx      lr
.L4:
        mov     r0, r1
        bx      lr
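
Written back as C, the ARM loop shape looks roughly like the sketch below.
This is my reading of the listing, not compiler output; the function name
sum_until_end is made up for illustration.

```c
/* Hypothetical C rendering of the ARM code: the counter is replaced by an
   end pointer, and the loop exit test compares pointers. */
int sum_until_end(const int *x, unsigned int c)
{
    int s = 0;

    if (c == 0)                 /* cmp r1, #0 ; beq .L4 */
        return 0;

    const int *p = x;           /* mov r3, r0 */
    const int *end = x + c;     /* add r1, r0, r1, asl #2 */

    do {
        s += *p++;              /* ldr r2, [r3], #4 ; add r0, r0, r2 */
    } while (p != end);         /* cmp r3, r1 ; bne .L3 */

    return s;
}
```

The extra work compared to a plain counter is the one-time computation of
the end pointer before the loop.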

The same could be done on SH, too (comparing against the end address instead of
using a loop counter), but it would add loop setup overhead.  In the optimal
case, the function above would compile to the following SH code:

        tst     r5,r5
        bt/s    .L6
        mov     #0,r0
.L3:
        mov.l   @r4+,r1
        dt      r5
        bf/s    .L3
        add     r1,r0
.L6:
        rts
        nop
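
In C terms, the optimal SH code above is a plain count-down loop that uses c
itself as the counter, with no shifts in the setup.  A minimal sketch, with
the made-up name sum_countdown:

```c
/* Hypothetical C rendering of the optimal SH code: c is decremented
   directly and the loop ends when it reaches zero. */
int sum_countdown(const int *x, unsigned int c)
{
    int s = 0;

    if (c == 0)                 /* tst r5,r5 ; bt/s .L6 */
        return 0;

    do {
        s += *x++;              /* mov.l @r4+,r1 ; add r1,r0 */
    } while (--c != 0);         /* dt r5 ; bf/s .L3 */

    return s;
}
```

Because the counter is never scaled by the element size, all 32 bits of c
take part in the exit test.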


This problem is present on rev 193061 as well as on the 4.7 branch.