From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-370511-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 26085 invoked by alias); 13 Oct 2011 15:48:58 -0000
Received: (qmail 26073 invoked by uid 22791); 13 Oct 2011 15:48:56 -0000
X-SWARE-Spam-Status: No, hits=-2.9 required=5.0	tests=ALL_TRUSTED,AWL,BAYES_00
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 13 Oct 2011 15:48:42 +0000
From: "mgretton at sourceware dot org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/50717] New: Silent code gen fault with incorrect widening of multiply
Date: Thu, 13 Oct 2011 15:48:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: mgretton at sourceware dot org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID: <bug-50717-4@http.gcc.gnu.org/bugzilla/>
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2011-10/txt/msg01255.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50717

             Bug #: 50717
           Summary: Silent code gen fault with incorrect widening of
                    multiply
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: mgretton@sourceware.org
              Host: x86_64-linux-gnu
            Target: arm-none-eabi


Created attachment 25483
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25483
Executable test case.

The attached test case fails when compiled and executed as follows:

arm-none-eabi-gcc -O2 gen_exec.c -o gen_exec.axf -fno-expensive-optimizations
.../linaro-qemu/0.15.50/bin/qemu-arm  ./gen_exec.axf

The two functions in the test case, f0a and f0b, are identical; the only
difference is that f0a is compiled with -fexpensive-optimizations off and f0b
with it on.  The code generation differences produce an incorrect result.

The attached file gen_exec_simple.c contains the individual f0b function for
compilation.

The attached tree dumps show the first difference between compiling
gen_exec_simple.c with and without -fexpensive-optimizations.  The main
difference seems to be the following:


--- gen_exec_simple.c.135t.tailc.cheap  2011-10-13 15:02:50.000000000 +0100
+++ gen_exec_simple.c.135t.tailc.expensive      2011-10-13 15:03:15.000000000 +0100
@@ -3,6 +3,7 @@

 f0b (uint32_t * restrict arg1, uint64_t * restrict arg2, uint8_t * restrict arg3)
 {
+  <unnamed-unsigned:32> D.8363;
   void * D.8362;
   sizetype D.8361;
   void * D.8360;
@@ -67,7 +68,8 @@ f0b (uint32_t * restrict arg1, uint64_t 
   D.8255_41 = MEM[base: D.8362_127, offset: 0B];
   D.8256_42 = D.8252_36 * D.8255_41;
   D.8257_43 = (uint64_t) D.8256_42;
-  D.8258_44 = D.8257_43 + temp_1_18;
+  D.8363_7 = (<unnamed-unsigned:32>) D.8245_16;
+  D.8258_44 = WIDEN_MULT_PLUS_EXPR <D.8255_41, D.8363_7, temp_1_18>;
   D.8259_45 = D.8258_44 >> 1;
   D.8260_46 = D.8259_45 >> 24;
   D.8272_57 = D.8251_31 | 1;

That is, a widening multiply/accumulate has been added to the tree.  This
ultimately becomes a UMLAL in the output.

This widening multiply/accumulate is incorrect.  It is trying to do the
following:

result += ((((((arg3[idx] * arg1[idx]) + temp_1)/2))>>24) / (temp_2 | 1));

Where arg3[idx] is a uint8_t, arg1[idx] is a uint32_t and temp_1 is a uint64_t.

As written in C, the multiply is performed in 32-bit unsigned arithmetic, so
its result is truncated to a 32-bit value, which is then zero-extended and
added to the 64-bit value.

The widening multiply/accumulate, however, widens the inputs to 64 bits, and
does a 64-bit multiply before adding it to the 64-bit accumulator.

These produce different results when the result of the multiply overflows
32 bits.
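
To make the difference concrete, here is a minimal standalone sketch of the
two computations.  It is not taken from the attached test case: the function
names mul32_then_add, widen_mul_add and check_example are hypothetical, the
operand values are chosen only to force a 33-bit product, and a 32-bit int
(as on arm-none-eabi) is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: what the C source asks for.  The uint8_t operand is
   promoted and the multiply is done in 32-bit unsigned arithmetic, so the
   product wraps modulo 2^32 before it is zero-extended and added.
   Assumes int is 32 bits wide. */
uint64_t mul32_then_add(uint8_t a, uint32_t b, uint64_t acc)
{
    uint32_t product = a * b;          /* truncated to 32 bits */
    return (uint64_t) product + acc;
}

/* Hypothetical helper: what a widening multiply/accumulate computes.
   Both operands are widened first, so the full product is kept. */
uint64_t widen_mul_add(uint8_t a, uint32_t b, uint64_t acc)
{
    return (uint64_t) a * (uint64_t) b + acc;
}

/* Illustrative values only: 2 * 0x80000000 = 0x100000000 needs 33 bits. */
void check_example(void)
{
    /* The 32-bit product wraps to 0, so only the accumulator remains. */
    assert(mul32_then_add(2, 0x80000000u, 5) == 5);
    /* The widened product keeps bit 32. */
    assert(widen_mul_add(2, 0x80000000u, 5) == 0x100000005ull);
    /* When the product fits in 32 bits the two forms agree. */
    assert(mul32_then_add(3, 7, 5) == widen_mul_add(3, 7, 5));
}
```

The two helpers agree whenever the product fits in 32 bits, which is why the
transformation looks harmless on small inputs and only fails once the
multiply overflows.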

A bisect of the source leads me to believe that revision 177907 is responsible:
http://gcc.gnu.org/viewcvs?view=revision&revision=177907