public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug fortran/35659]  New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
@ 2008-03-21 18:49 kmccarty at debian dot org
  2008-03-22 13:22 ` [Bug target/35659] " rguenth at gcc dot gnu dot org
                   ` (28 more replies)
  0 siblings, 29 replies; 30+ messages in thread
From: kmccarty at debian dot org @ 2008-03-21 18:49 UTC (permalink / raw)
  To: gcc-bugs

Hi,

[This bug was initially submitted to the Debian BTS at
http://bugs.debian.org/466948 -- at the request of Debian's gcc maintainer, I
am also sending it here]

There is apparently a miscompilation of CERNLIB code in the "kernlib" library
on ia64 with gfortran-4.3 at optimization level -O2 or higher.  (Oddly,
compiling with -O2 -funroll-loops does not expose the problem, which is why I
didn't notice it earlier.)

I've verified that this problem occurs on ia64 with gfortran-4.3 versions
4.3-20080202-1, 4.3-20080219-1 and 4.3.0.  For the most recent trial:

(sid)kmccarty@merulo:~$ gfortran-4.3 -v
Using built-in specs.
Target: ia64-linux-gnu
Configured with: ../src/configure linux gnu
Thread model: posix
gcc version 4.3.1 20080309 (prerelease) (Debian 4.3.0-1)

Please see the complete test case I've provided at
http://people.debian.org/~kmccarty/tlsc-test-case.tar.gz

[Note, the files libkernlib.a and libkerngent.a in the tarball were compiled on
ia64 from CERNLIB source code, which is GPL; it is available from the Debian
source package of "cernlib" version 2006.dfsg.2-10]

[Note 2, I have updated this test case tarball to the gfortran version noted
above since I originally submitted this bug to the Debian BTS.]

The file tlsc.F in the test case is where the optimization bug happens.  I
saved files tlsc.f, tlsc.s and tlsc.o into subdirectories of the test case for
three different combinations of compile flags:

-O2 (causes miscompiled code) in obj-fail subdir

-O0 (seems fine)
-O2 -funroll-loops (seems fine) both in obj-success subdir.

Also the output of the test case, saved as output.txt in each subdir.

You can regenerate all of these files with these three combinations of compiler
flags quickly with "make output".

The presence or absence of -fno-automatic on the gfortran command line (I
omitted it in all three cases) appears to make no difference.

best regards,
Kevin McCarty


-- 
           Summary: Miscompiled code with -O2 (but not with -O2 -funroll-
                    loops) on ia64
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: kmccarty at debian dot org
 GCC build triplet: ia64-linux-gnu
  GCC host triplet: ia64-linux-gnu
GCC target triplet: ia64-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
@ 2008-03-22 13:22 ` rguenth at gcc dot gnu dot org
  2008-03-24 16:48 ` kmccarty at debian dot org
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-22 13:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenth at gcc dot gnu dot org  2008-03-22 13:21 -------
Does it work with gcc 4.2?


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
  2008-03-22 13:22 ` [Bug target/35659] " rguenth at gcc dot gnu dot org
@ 2008-03-24 16:48 ` kmccarty at debian dot org
  2008-03-24 17:33 ` [Bug target/35659] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: kmccarty at debian dot org @ 2008-03-24 16:48 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from kmccarty at debian dot org  2008-03-24 16:47 -------
(In reply to comment #1)
> Does it work with gcc 4.2?

Yes, it does, at least with this version:

(sid)kmccarty@merulo:~$ gfortran-4.2 -v
Using built-in specs.
Target: ia64-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr
--disable-libmudflap --disable-libssp --with-system-libunwind
--enable-checking=release --build=ia64-linux-gnu --host=ia64-linux-gnu
--target=ia64-linux-gnu
Thread model: posix
gcc version 4.2.3 (Debian 4.2.3-2)

The test case works in all cases with gfortran 4.2, regardless of whether I use
libkernlib.a and libkerngent.a compiled with gfortran 4.2 or 4.3.  (But please
let me know if you need copies of these libraries built with gfortran 4.2
anyway.)

I also just noticed, surprisingly, that with gfortran 4.3, if I compile the
test program with -O3 or with -O3 -funroll-loops, the test succeeds.  So oddly
enough it is ONLY with -O2 (without -funroll-loops) that the test fails. 
(Again, this behavior is independent of which gfortran version I use to build
libkernlib.a and libkerngent.a.)  Please also let me know if you need the
intermediate .f and .o files built with gfortran-4.3 -O3.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
  2008-03-22 13:22 ` [Bug target/35659] " rguenth at gcc dot gnu dot org
  2008-03-24 16:48 ` kmccarty at debian dot org
@ 2008-03-24 17:33 ` rguenth at gcc dot gnu dot org
  2008-04-28  4:29 ` mmitchel at gcc dot gnu dot org
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-03-24 17:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2008-03-24 17:32 -------
Possibly a scheduling issue.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.2.3
            Summary|Miscompiled code with -O2   |[4.3/4.4 Regression]
                   |(but not with -O2 -funroll- |Miscompiled code with -O2
                   |loops) on ia64              |(but not with -O2 -funroll-
                   |                            |loops) on ia64
   Target Milestone|---                         |4.3.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (2 preceding siblings ...)
  2008-03-24 17:33 ` [Bug target/35659] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
@ 2008-04-28  4:29 ` mmitchel at gcc dot gnu dot org
  2008-05-20 21:02 ` sje at cup dot hp dot com
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2008-04-28  4:29 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (3 preceding siblings ...)
  2008-04-28  4:29 ` mmitchel at gcc dot gnu dot org
@ 2008-05-20 21:02 ` sje at cup dot hp dot com
  2008-05-21  3:51 ` kmccarty at debian dot org
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: sje at cup dot hp dot com @ 2008-05-20 21:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from sje at cup dot hp dot com  2008-05-20 21:01 -------
I wonder if this is related to PR target/35695, the floating point division bug
that Jim Wilson fixed.  Could you try it with ToT or with the latest 4.3
branch, both of which have Jim's fix in them.


-- 

sje at cup dot hp dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sje at cup dot hp dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (4 preceding siblings ...)
  2008-05-20 21:02 ` sje at cup dot hp dot com
@ 2008-05-21  3:51 ` kmccarty at debian dot org
  2008-05-21 10:40 ` rguenth at gcc dot gnu dot org
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: kmccarty at debian dot org @ 2008-05-21  3:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from kmccarty at debian dot org  2008-05-21 03:50 -------
(In reply to comment #4)
> I wonder if this is related to PR target/35695, the floating point division bug
> that Jim Wilson fixed.  Could you try it with ToT or with the latest 4.3
> branch, both of which have Jim's fix in them.

I tried with the gcc-4_3-branch from Subversion today, but it doesn't change
things: plain -O2 fails but both -O2 -funroll-loops and -O1 succeed on the test
case.

For the record:

(sid)kmccarty@merulo:~$ ~/gcc-4.3-branch/bin/gfortran -v
Using built-in specs.
Target: ia64-unknown-linux-gnu
Configured with: ./configure --enable-languages=c,fortran
--prefix=/home/kmccarty/gcc-4.3-branch --enable-shared
--with-mpfr=/home/kmccarty/gcc-4.3-branch
Thread model: posix
gcc version 4.3.1 20080521 (prerelease) (GCC)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (5 preceding siblings ...)
  2008-05-21  3:51 ` kmccarty at debian dot org
@ 2008-05-21 10:40 ` rguenth at gcc dot gnu dot org
  2008-05-23  1:29 ` mmitchel at gcc dot gnu dot org
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-05-21 10:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2008-05-21 10:39 -------
Mark, you made this P5, but ia64-linux is a secondary platform.  P3 again to
get it on the radar.  But not higher priority because we don't have exactly
what
I would call a testcase.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mmitchel at gcc dot gnu dot
                   |                            |org
      Known to fail|                            |4.3.0
           Priority|P5                          |P3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (6 preceding siblings ...)
  2008-05-21 10:40 ` rguenth at gcc dot gnu dot org
@ 2008-05-23  1:29 ` mmitchel at gcc dot gnu dot org
  2008-06-03 18:00 ` sje at cup dot hp dot com
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2008-05-23  1:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from mmitchel at gcc dot gnu dot org  2008-05-23 01:29 -------
I agree on both points: (1) I should not have marked this P5, and (2) we do not
yet have a test case.  Marked as "WAITING".


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (7 preceding siblings ...)
  2008-05-23  1:29 ` mmitchel at gcc dot gnu dot org
@ 2008-06-03 18:00 ` sje at cup dot hp dot com
  2008-06-04  9:00 ` rguenth at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: sje at cup dot hp dot com @ 2008-06-03 18:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from sje at cup dot hp dot com  2008-06-03 17:59 -------
I looked at this bug and I can reproduce it using the precompiled archives from
the link.  I have not tried to get the CERN sources to create a small 'real'
test case.  I noticed that the bug does not happen on ToT and found that the
test started working correctly with version 133081.  The patch submitted in
this version is for PR 34677 which claims to be a missed optimization issue as
opposed to a codegen bug fix so I am not sure if the bug is really fixed by
that change or if that change just masks the problem like -funroll-loops does.

2008-03-10  Richard Guenther  <rguenther@suse.de>

        PR tree-optimization/34677
        * tree-ssa-pre.c (modify_expr_node_pool): Remove.
        (poolify_tree): Likewise.
        (modify_expr_template): Likewise.
        (poolify_modify_stmt): Likewise.
        (insert_fake_stores): Handle all component-ref style stores
        in addition to INDIRECT_REF.  Also handle complex types.
        Do not poolify the inserted load.
        (realify_fake_stores): Do not rebuild the tree but only
        make it a SSA_NAME copy.
        (init_pre): Remove initialzation of modify_expr_template.
        Do not allocate modify_expr_node_pool.
        (fini_pre): Do not free modify_expr_node_pool.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (8 preceding siblings ...)
  2008-06-03 18:00 ` sje at cup dot hp dot com
@ 2008-06-04  9:00 ` rguenth at gcc dot gnu dot org
  2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-06-04  9:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from rguenth at gcc dot gnu dot org  2008-06-04 08:59 -------
The only code-generation changing part of the patch is

        (insert_fake_stores): Handle all component-ref style stores
        in addition to INDIRECT_REF.  Also handle complex types.

but that only fixes a missed-optimization, so it probably makes the problem
latent on the trunk.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (9 preceding siblings ...)
  2008-06-04  9:00 ` rguenth at gcc dot gnu dot org
@ 2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
  2008-06-24 14:45 ` jakub at gcc dot gnu dot org
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-06-06 15:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rguenth at gcc dot gnu dot org  2008-06-06 14:59 -------
4.3.1 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.1                       |4.3.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (10 preceding siblings ...)
  2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
@ 2008-06-24 14:45 ` jakub at gcc dot gnu dot org
  2008-06-24 15:05 ` jakub at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-24 14:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jakub at gcc dot gnu dot org  2008-06-24 14:44 -------
      PROGRAM PR35659
      DIMENSION A(1000), B(1010), AUX(8), IPIV(8), X(16)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ V1,V2,IAR(24),DUM(14)
      DATA A/0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 1, 0.200000003,
     1  0.0399999991, 0.00800000038, 1, 0.400000006, 0.159999996,
     2  0.064000003, 1, 0.600000024, 0.360000014, 0.216000006, 1,
     3  0.800000012, 0.639999986, 0.512000024, 1, 1, 1, 1, 968*0./
      DATA B/1, 2, 1, 1.22140002, 1.49179995, 1.82210004,
     4  2.22550011, 2.7183001, 0, 0, 1000*0./
      M1 = 2
      M = 8
      N = 4
      L = 1
      IER = 0
      V1 = 0
      V2 = 1.40129846e-45
      IAR(:) = 538976288
      DUM(:) = 1.35631564e-19
      CALL TLSC(A,B,AUX,IPIV,1.,X)
      IF (ABS(X(1) - 0.99785352).GE.0.1) CALL ABORT
      IF (ABS(X(2) - 1.0).GE.0.1) CALL ABORT
      IF (ABS(X(3) - 0.50107324).GE.0.1) CALL ABORT
      IF (ABS(X(4) - 0.21670136).GE.0.1) CALL ABORT
      END
      SUBROUTINE TLSMSQ (B,L,M,F)                                               
      COMMON /SLATE/ DUM(38),I,JB                                               
      DIMENSION      B(*)                                                       
      F  = 0.                                                                   
      JB = 1                                                                    
      DO           10        I=1,M                                              
      F  = F + B(JB)*B(JB)                                                      
   10 JB = JB + L                                                               
      RETURN                                                                    
      END                                   
      SUBROUTINE TLSWOP (A,AD,N,NR)                                             
      COMMON /SLATE/ DUM(37),H,I,JA                                             
      DIMENSION      A(*), AD(*)                                                
      JA = 1                                                                    
      DO           10        I=1,NR                                             
      H  = A(JA)                                                                
      A(JA)  = AD(JA)                                                           
      AD(JA) = H                                                                
   10 JA = JA + N                                                               
      RETURN                                                                    
      END                                      
      SUBROUTINE TLUK (A,IASEP,NR,SIG,BETA)
      COMMON /SLATE/ DUM(37),I,JA,LL
      DIMENSION A(*)
      SIG= 0.
      JA = 1
      LL = 0
      DO           10        I=1,NR
      IF     (A(JA).EQ.0.)             GO TO     10
      LL = I
      SIG= SIG + A(JA)* A(JA)
   10 JA = JA + IASEP
      NR = LL
      IF     (NR.EQ.0)                 RETURN
      SIG  = SIGN (SQRT (SIG),A(1))
      BETA = A(1) + SIG
      A(1) = BETA
      BETA = 1. / (SIG * BETA)
      RETURN
      END
      SUBROUTINE TLSTEP (A,B,IASEP,IBSEP,NR,NC,BETA)
      COMMON /SLATE/ DUM(34),H,I,IB,J,JA,JB
      DIMENSION      A(*), B(*)
      IB = 0
      DO           30        J=1,NC
      IB = IB + 1
      H  = 0.
      JA = 1
      JB = IB
      DO           10        I=1,NR
      H  = H + A(JA) * B(JB)
      JA = JA +IASEP
   10 JB = JB + IBSEP
      H  = H * BETA
      JA = 1
      JB = IB
      DO           20        I=1,NR
      B(JB) = B(JB) - A(JA) * H
      JA = JA +IASEP
   20 JB = JB + IBSEP
   30 CONTINUE
      RETURN
      END

together with tlsc.f can work as a testcase, and reproduces the problem with
current 4.3 branch.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (11 preceding siblings ...)
  2008-06-24 14:45 ` jakub at gcc dot gnu dot org
@ 2008-06-24 15:05 ` jakub at gcc dot gnu dot org
  2008-06-25 10:21 ` jakub at gcc dot gnu dot org
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-24 15:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from jakub at gcc dot gnu dot org  2008-06-24 15:04 -------
Even smaller reproducer:
      PROGRAM PR35659
      DIMENSION A(1000), B(1010), AUX(8), IPIV(8), X(16)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ V1,V2,IAR(24),DUM(14)
      DATA A/0, 1, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0, 1, 0.200000003,
     1  0.0399999991, 0.00800000038, 1, 0.400000006, 0.159999996,
     2  0.064000003, 1, 0.600000024, 0.360000014, 0.216000006, 1,
     3  0.800000012, 0.639999986, 0.512000024, 1, 1, 1, 1, 968*0./
      DATA B/1, 2, 1, 1.22140002, 1.49179995, 1.82210004,
     4  2.22550011, 2.7183001, 0, 0, 1000*0./
      M1 = 2
      M = 8
      N = 4
      L = 1
      IER = 0
      V1 = 0
      V2 = 1.40129846e-45
      IAR(:) = 538976288
      DUM(:) = 1.35631564e-19
      CALL TLSC(A,B,AUX,IPIV,1.,X)
      END
      SUBROUTINE TLSMSQ (B,L,M,F)
      DIMENSION      B(*)
      IF (M.NE.2) CALL ABORT
      STOP 0
      END
      SUBROUTINE TLSWOP (A,AD,N,NR)
      DIMENSION      A(*), AD(*)
      CALL ABORT
      END
      SUBROUTINE TLUK (A,IASEP,NR,SIG,BETA)
      DIMENSION A(*)
      CALL ABORT
      END
      SUBROUTINE TLSTEP (A,B,IASEP,IBSEP,NR,NC,BETA)
      DIMENSION      A(*), B(*)
      CALL ABORT
      END

The miscompiled TLSC calls the first TLSMSQ routine with 8.0 rather than 2.0
as the 3rd argument.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-06-24 15:04:50
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (12 preceding siblings ...)
  2008-06-24 15:05 ` jakub at gcc dot gnu dot org
@ 2008-06-25 10:21 ` jakub at gcc dot gnu dot org
  2008-06-25 11:32 ` jakub at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-25 10:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from jakub at gcc dot gnu dot org  2008-06-25 10:20 -------
And the miscompiled tlsc.f inline (compile with just -O2):
      SUBROUTINE TLSC (A,B,AUX,IPIV,EPS,X)
      COMMON /TLSDIM/ M1,M,N,L,IER
      COMMON /SLATE/ BETA,H,I,IB,IB1,ID,ID1,IEND,II,IST,J,JA,JB,JK
     +              ,JST,K,KPIV,KR,KST,KT,K1,LV,MR,M11,NK,NR,PIV,PIVT
     +              ,SIG,DUM(11)
      DIMENSION      A(*), AUX(*), B(*), IPIV(*), X(*)
      IF (N.GT.M.OR.M1.GT.N)  GO TO 90
      K1  = MAX (N,L)
      IER = 1
      DO           5         K=1,N
    5 IPIV(K) = K
      IST = - N
      JB  = 1 - L
      M11 = M1 + 1
      MR  = M1
      DO           50        K=1,N
      IF     (K.GT.M1)                 MR = M
      IST = IST + N + 1
      JB  = JB + L
      LV  = MR - K + 1
      PIV = 0.
      ID  = IST - N
      DO           20        J=K,N
      IF     (K.EQ.1 .OR. K.EQ.M11)    GO TO     10
      PIVT = AUX(J) - A(ID)*A(ID)
      GO TO      15
   10 I = ID + N
      IF (LV .EQ. 1) GO TO 12
      CALL TLSMSQ (A(I),N,LV,PIVT)
      GO TO 15
   12 PIVT= A(I)*A(I)
   15 AUX(J) = PIVT
      ID = ID + 1
      IF     (PIVT*EPS.LE.PIV)         GO TO     20
      PIV  = PIVT
      KPIV = J
   20 CONTINUE
      I = KPIV - K
      IF     (I.LE.0)                  GO TO     25
      H = AUX(K)
      AUX(K) = AUX(KPIV)
      AUX(KPIV) = H
      ID = IST + I
      NR = M - K + 1
      CALL TLSWOP (A(IST),A(ID),N,NR)
   25 CALL TLUK (A(IST),N,LV,SIG,BETA)
      IF     (LV.EQ.0)                 GO TO     90
      J = K1 + K
      AUX(J)=-SIG
      IF     (K.GE.N)                  GO TO     30
      NK = N - K
      IF (LV.EQ.1)                     GO TO     27
      CALL TLSTEP (A(IST),A(IST+1),N,N,LV,NK,BETA)
      GO TO        30
   27 DO           28        J=1,NK
      JST = IST + J
   28 A(JST) = A(JST)*(1.-BETA*A(IST)**2)
   30 IB = (K-1) * L + 1
      IF (LV.EQ.1)                     GO TO     32
      CALL TLSTEP (A(IST),B(IB),N,L,LV,L,BETA)
      GO TO      34
   32 DO           33        J=1,L
      JST = IB + J - 1
   33 B(JST) = B(JST)*(1.-BETA*A(IST)**2)
   34 IPIV(KPIV) = IPIV(K)
      IPIV(K) = KPIV
      IF     (K.GT.M1)                 GO TO     50
      DO           45        I=M11,M
      ID1 = IST + (I-K)*N
      IF     (A(ID1).EQ.0)             GO TO     45
      H  = - A(ID1)/SIG
      A(ID1) = H
      ID1 = ID1 + 1
      ID = IST + 1
      DO           35        J=1,NK
      A(ID1) = A(ID1) - H*A(ID)
      ID1 = ID1 + 1
   35 ID  = ID + 1
      IB1 = 1 + (I-1)*L
      IB = JB
      DO           40        J=1,L
      B(IB1) = B(IB1) - H*B(IB)
      IB1 = IB1 + 1
   40 IB  = IB + 1
   45 CONTINUE
   50 CONTINUE
      IER = N * IER
      KT  = N
      JK  = (N-1) * L
      K   = K1 + N
      PIV = 1./AUX(K)
      DO           55        K=1,L
      JK = JK + 1
   55 X(JK) = PIV * B(JK)
      KR = N - 1
      IF     (KR.LE.0)                 GO TO     70
      JST = KR * (N+1) + 2
      DO           65        J=1,KR
      JST = JST - N - 1
      IEND= (KR-J+1) * N
      K   = K1 + KR - J + 1
      PIV = 1./AUX(K)
      KST = K-K1
      ID  = IPIV(KST)-KST
      KST = (KR-J) * L
      DO           65        K=1,L
      KST = KST + 1
      H=B(KST)
      II = KST
      DO           60        I=JST,IEND
      II = II + L
   60 H  = H - A(I) * X(II)
      II = KST + ID *L
      X(KST) = X(II)
      X(II)  = PIV * H
   65 CONTINUE
   70 IST = KT*L
      DO           80        J=1,L
      IST = IST + 1
      H   = 0.
      JA  = IST
      IF     (M.LE.KT)                 GO TO     80
      NR  = M - KT
      IF     (NR.EQ.1)                 GO TO     75
      CALL TLSMSQ (B(IST),L,NR,H)
      GO TO      80
   75 H = B(IST)*B(IST)
   80 AUX(J) = H
      RETURN
   90 IER = -1001
      RETURN
      END

The problem is that slate.k (aka prephitmp.78) is read before it is stored,
so it has the 0x20202020 value instead of 1.
At tlsc.f.198r.compgotos the code still looks correct:
(insn 1857 1434 1443 8 tlsc.f:16 (set (reg:SI 14 r14 [1392])
        (const_int 1 [0x1])) 4 {*movsi_internal} (expr_list:REG_EQUIV
(const_int 1 [0x1])
        (nil)))

(insn 1443 1857 242 8 tlsc.f:16 (set (mem/s/c:SI (post_modify:DI (reg/f:DI 53
r59 [1516])
                (plus:DI (reg/f:DI 53 r59 [1516])
                    (const_int -60 [0xffffffffffffffc4]))) [2 slate.k+0 S4
A32])
        (reg:SI 14 r14 [1392])) 4 {*movsi_internal} (expr_list:REG_INC
(reg/f:DI 53 r59 [1516])
        (expr_list:REG_EQUAL (const_int 1 [0x1])
            (nil))))
...
(insn 197 228 247 8 tlsc.f:17 (set (reg:DI 18 r18)
        (zero_extend:DI (mem/s/c:SI (reg/f:DI 38 r44 [1517]) [2 slate.k+0 S4
A32]))) 103 {zero_extendsidi2} (expr_list:REG_EQUAL 
(mem/s/c:SI (const:DI (plus:DI (symbol_ref:DI ("slate_") <var_decl
0x2000000003c54c80 slate>)
                    (const_int 60 [0x3c]))) [2 slate.k+0 S4 A32])
        (nil)))

(insn 247 197 198 8 tlsc.f:22 (set (reg:SI 19 r19 [665])
        (minus:SI (reg:SI 20 r20 [orig:560 D.775 ] [560])
            (reg:SI 32 r38 [orig:529 prephitmp.68 ] [529]))) 165 {subsi3}
(nil))

(insn 198 247 253 8 tlsc.f:17 (set (reg:BI 262 p6 [626])
        (le:BI (reg:SI 18 r18 [orig:522 prephitmp.78 ] [522])
            (reg:SI 22 r22 [orig:459 prephitmp.258 ] [459]))) 298
{*cmpsi_normal} (nil))
but tlsc.f.200r.mach is already wrong:
(insn:TI 197 2150 1443 8 tlsc.f:17 (set (reg:DI 18 r18)
        (zero_extend:DI (unspec:SI [
                    (mem/s/c:SI (reg/f:DI 38 r44 [1517]) [2 slate.k+0 S4 A32])
                ] 40))) 17 {zero_extendsidi2_advanced} (expr_list:REG_EQUAL
(mem/s/c:SI (const:DI (plus:DI (symbol_ref:DI ("slate_") <var_decl
0x2000000003c54c80 slate>)
                    (const_int 60 [0x3c]))) [2 slate.k+0 S4 A32])
        (nil)))

(insn 1443 197 1427 8 tlsc.f:16 (set (mem/s/c:SI (post_modify:DI (reg/f:DI 53
r59 [1516])
                (plus:DI (reg/f:DI 53 r59 [1516])
                    (const_int -60 [0xffffffffffffffc4]))) [2 slate.k+0 S4
A32])
        (reg:SI 14 r14 [1392])) 4 {*movsi_internal} (expr_list:REG_DEAD (reg:SI
14 r14 [1392])
        (expr_list:REG_INC (reg/f:DI 53 r59 [1516])
            (expr_list:REG_EQUAL (const_int 1 [0x1])
                (nil)))))
...
(insn:TI 198 2148 223 8 tlsc.f:17 (set (reg:BI 262 p6 [626])
        (le:BI (reg:SI 18 r18 [orig:522 prephitmp.78 ] [522])
            (reg:SI 22 r22 [orig:459 prephitmp.258 ] [459]))) 298
{*cmpsi_normal} (nil))

Note that the scheduler swapped the slate_.k = 1 store with prephitmp.78 =
slate_.k read.  The same can be seen in the assembly:
        ld4.a r18 = [r44]       //, slate.k
[.LBE19:]
        .loc 1 16 0
        st4 [r59] = r14, -60    // slate.k, tmp1392
where r44 == r59.
(gdb) disas $pc $pc+16
Dump of assembler code from 0x4000000000000ec0 to 0x4000000000000ed0:
0x4000000000000ec0 <tlsc+512>:  [MMI]       ld4.a r18=[r44]
0x4000000000000ec1 <tlsc+513>:              st4 [r59]=r14,-60
0x4000000000000ec2 <tlsc+514>:              adds r16=48,r41
End of assembler dump.
(gdb) p/x $r44
$3 = 0x6000000000004e4c
(gdb) p/x $r59
$4 = 0x6000000000004e4c

insn 197 is the only ld4.a in tlsc_ routine.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu dot
                   |                            |org, wilson at gcc dot gnu
                   |                            |dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (13 preceding siblings ...)
  2008-06-25 10:21 ` jakub at gcc dot gnu dot org
@ 2008-06-25 11:32 ` jakub at gcc dot gnu dot org
  2008-06-25 11:33 ` jakub at gcc dot gnu dot org
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-25 11:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from jakub at gcc dot gnu dot org  2008-06-25 11:31 -------
I have no idea why is speculation even attempted here (it doesn't make any
sense,
the pointer is surely non-NULL, it points to a global variable), and apparently
nothing checks whether it is safe to move over the speculative load over
the store (at least, I've put a breakpoint on nonoverlapping_memrefs_p
and {{,canon_}true,anti,output}_dependence and none of them hit with any MEMs
with r44 or POST_MODIFY r59, -60.  Maxim, speculation is your baby, could you
please have a look?


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (14 preceding siblings ...)
  2008-06-25 11:32 ` jakub at gcc dot gnu dot org
@ 2008-06-25 11:33 ` jakub at gcc dot gnu dot org
  2008-06-25 21:34 ` mkuvyrkov at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-25 11:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from jakub at gcc dot gnu dot org  2008-06-25 11:32 -------
Wrong-code bug on secondary arch.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (15 preceding siblings ...)
  2008-06-25 11:33 ` jakub at gcc dot gnu dot org
@ 2008-06-25 21:34 ` mkuvyrkov at gcc dot gnu dot org
  2008-06-25 22:16 ` doko at cs dot tu-berlin dot de
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-06-25 21:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from mkuvyrkov at gcc dot gnu dot org  2008-06-25 21:33 -------
I can't reproduce the error with today mainline.  When I put in one file
'PROGRAM PR35659' and 'SUBROUTINE TLSC (A,B,AUX,IPIV,EPS,X)' and compile it
with any optimization level I get the same "STOP 0" message.  Am I doing
something wrong?

I can't spot the problem in the dumps you posted at debian.org.  tlsc.s has a
single example of data speculation which seems to be fine.  Scheduler
speculatively moves
        ld4.a r18 = [r44]
before
        st4 [r59] = r14, -60
and then also speculates several uses of r18:
        cmp4.ge p6, p7 = r22, r18
...
        (p7) ld4 r14 = [r62]
...
        (p7) st4 [r77] = r14
then it checks the speculation:
        chk.a.clr r18, .L69
and recovers if speculation failed:
.L69:
        .mmi
        ld4 r18 = [r44]
        ;;
        cmp4.ge p6, p7 = r22, r18
        nop 0
        ;;
        .mmi
        nop 0
        (p7) ld4 r14 = [r62]
        nop 0
        ;;
        .mib
        (p7) st4 [r77] = r14
        nop 0
        br .L70

Anyway, can you help me reproduce the issue, so I can take a closer look?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (16 preceding siblings ...)
  2008-06-25 21:34 ` mkuvyrkov at gcc dot gnu dot org
@ 2008-06-25 22:16 ` doko at cs dot tu-berlin dot de
  2008-06-26  8:02 ` jakub at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: doko at cs dot tu-berlin dot de @ 2008-06-25 22:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from doko at cs dot tu-berlin dot de  2008-06-25 22:16 -------
Subject: Re:  [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2
-funroll-loops) on ia64

mkuvyrkov at gcc dot gnu dot org writes:
> Anyway, can you help me reproduce the issue, so I can take a closer look?

please email me a ssh key, if access to a Debian machine would help.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (17 preceding siblings ...)
  2008-06-25 22:16 ` doko at cs dot tu-berlin dot de
@ 2008-06-26  8:02 ` jakub at gcc dot gnu dot org
  2008-06-26  8:42 ` jakub at gcc dot gnu dot org
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-26  8:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from jakub at gcc dot gnu dot org  2008-06-26 08:01 -------
You can reproduce it with a cross compiler too.
Just use 4.3 branch (IMHO unrelated changes on the trunk made this bug latent)
and don't combine everything into the same file.
The #c12 program+routines have to be in one file, #c13 in a different one.
The latter is miscompiled, if you are on ia64, you can compile the former
with any optimization options, the latter with -O2, link, run.
If you have just a cross compiler, just compile the #c13 source and inspect the
assembly (look at ld4.a:
        addl r41 = @ltoff(slate_#), r1
...
        adds r14 = 60, r41
...
        .mmi
        mov r44 = r14
        mov r50 = r15
        mov r59 = r14
... ! Instructions that don't modify r44 nor r59, no labels
        .mmi
        ld4.a r18 = [r44]      ! *r44 aka slate_.k is uninitialized here,
0x20202020
        st4 [r59] = r14, -60   ! This stores 1 into slate_.k
        adds r16 = 48, r41
and look at the *.compgotos dump that right before scheduling that slate_.k = 1
preceeded the prephitmp.78 (== r18) = slate_.k load, while after scheduling
it is the other way around.  The alias set looks correct for both MEMs (4) and
MEM_EXPR/MEM_OFFSET etc. too, so IMHO if alias.c was asked if the two MEMs can
overlap, it would certainly say so.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (18 preceding siblings ...)
  2008-06-26  8:02 ` jakub at gcc dot gnu dot org
@ 2008-06-26  8:42 ` jakub at gcc dot gnu dot org
  2008-06-26  9:22 ` maxim at codesourcery dot com
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: jakub at gcc dot gnu dot org @ 2008-06-26  8:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from jakub at gcc dot gnu dot org  2008-06-26 08:41 -------
To be more precise, the problem is in speculating conditional store:
        ld4.a r18 = [r44]
        st4 [r59] = r14, -60
...
        cmp4.ge p6, p7 = r22, r18
        (p7) ld4 r14 = [r62]
...
        (p7) st4 [r77] = r14
        chk.a.clr r18, .L69
.L70:
...
.L69:
        ld4 r18 = [r44]
        ;;
        cmp4.ge p6, p7 = r22, r18
        ;;
        (p7) ld4 r14 = [r62]
        ;;
        (p7) st4 [r77] = r14
        br .L70

Now, ld4.a returns the stale value in r18 (0x20202020), $r22 is 2 and so in the
code before chk.a.clr p7 is 1, so [r62] is loaded into r14 and stored into
[r77]
(i.e. MR = M in IF     (K.GT.M1)                 MR = M is executed).
Then chk.a.clr realizes the [r44] memory changed, so branches to .L69 to do the
speculated stuff again.  This time r18 is 1, so p7 will be 0 and MR = M is not
done.  But this really doesn't and can't undo the st4 [r77] = r14 store that
already happened before chk.a.clr and wasn't supposed to happen.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (19 preceding siblings ...)
  2008-06-26  8:42 ` jakub at gcc dot gnu dot org
@ 2008-06-26  9:22 ` maxim at codesourcery dot com
  2008-06-26  9:24 ` mkuvyrkov at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: maxim at codesourcery dot com @ 2008-06-26  9:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from maxim at codesourcery dot com  2008-06-26 09:21 -------
Subject: Re:  [4.3/4.4 Regression] Miscompiled code with
 -O2 (but not with -O2 -funroll-loops) on ia64

jakub at gcc dot gnu dot org wrote:
> ------- Comment #19 from jakub at gcc dot gnu dot org  2008-06-26 08:41 -------
> To be more precise, the problem is in speculating conditional store:
>         ld4.a r18 = [r44]
>         st4 [r59] = r14, -60
> ...
>         cmp4.ge p6, p7 = r22, r18
>         (p7) ld4 r14 = [r62]
> ...
>         (p7) st4 [r77] = r14
>         chk.a.clr r18, .L69
> .L70:
> ...
> .L69:
>         ld4 r18 = [r44]
>         ;;
>         cmp4.ge p6, p7 = r22, r18
>         ;;
>         (p7) ld4 r14 = [r62]
>         ;;
>         (p7) st4 [r77] = r14
>         br .L70
> 
> Now, ld4.a returns the stale value in r18 (0x20202020), $r22 is 2 and so in the
> code before chk.a.clr p7 is 1, so [r62] is loaded into r14 and stored into
> [r77]
> (i.e. MR = M in IF     (K.GT.M1)                 MR = M is executed).
> Then chk.a.clr realizes the [r44] memory changed, so branches to .L69 to do the
> speculated stuff again.  This time r18 is 1, so p7 will be 0 and MR = M is not
> done.  But this really doesn't and can't undo the st4 [r77] = r14 store that
> already happened before chk.a.clr and wasn't supposed to happen.

Oh, now I see the problem.  I need some time to check what the best fix 
would be.  Probably, the fix will be to exclude predicated instructions 
from speculation set, but I'd like to consider alternatives.


Thanks,

Maxim


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (20 preceding siblings ...)
  2008-06-26  9:22 ` maxim at codesourcery dot com
@ 2008-06-26  9:24 ` mkuvyrkov at gcc dot gnu dot org
  2008-08-01 23:29 ` sje at cup dot hp dot com
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-06-26  9:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from mkuvyrkov at gcc dot gnu dot org  2008-06-26 09:23 -------
Assign to self


-- 

mkuvyrkov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |mkuvyrkov at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2008-06-24 15:04:50         |2008-06-26 09:23:37
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (21 preceding siblings ...)
  2008-06-26  9:24 ` mkuvyrkov at gcc dot gnu dot org
@ 2008-08-01 23:29 ` sje at cup dot hp dot com
  2008-08-02 18:16 ` mkuvyrkov at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: sje at cup dot hp dot com @ 2008-08-01 23:29 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from sje at cup dot hp dot com  2008-08-01 23:28 -------
Maxim, have you had time to look at this bug?  Given that it is generating bad
code and is in 4.3.0 and 4.3.1 I was wondering if it will be fixed for 4.3.2.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (22 preceding siblings ...)
  2008-08-01 23:29 ` sje at cup dot hp dot com
@ 2008-08-02 18:16 ` mkuvyrkov at gcc dot gnu dot org
  2008-08-04 17:36 ` sje at cup dot hp dot com
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-08-02 18:16 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from mkuvyrkov at gcc dot gnu dot org  2008-08-02 18:15 -------
(In reply to comment #22)
> Maxim, have you had time to look at this bug?  Given that it is generating bad
> code and is in 4.3.0 and 4.3.1 I was wondering if it will be fixed for 4.3.2.

Sorry for the delay.  I posted the fix at
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00112.html.

I would appreciate if someone could test the cumulative patch of three fixes
for ia64 speculation support or provide a single-file executable testcase for
this bug.  Here are links to two other bugfixes for ia64 speculation support:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00110.html
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00111.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (23 preceding siblings ...)
  2008-08-02 18:16 ` mkuvyrkov at gcc dot gnu dot org
@ 2008-08-04 17:36 ` sje at cup dot hp dot com
  2008-08-06  6:26 ` mkuvyrkov at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: sje at cup dot hp dot com @ 2008-08-04 17:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from sje at cup dot hp dot com  2008-08-04 17:34 -------
I bootstrapped the 3 patches on mainline and 4.3 branch and verified that they
fix the problem that is reported on the 4.3 branch with the patches.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (24 preceding siblings ...)
  2008-08-04 17:36 ` sje at cup dot hp dot com
@ 2008-08-06  6:26 ` mkuvyrkov at gcc dot gnu dot org
  2008-08-06  6:38 ` mkuvyrkov at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-08-06  6:26 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #25 from mkuvyrkov at gcc dot gnu dot org  2008-08-06 06:25 -------
Subject: Bug 35659

Author: mkuvyrkov
Date: Wed Aug  6 06:23:47 2008
New Revision: 138759

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138759
Log:
        PR target/35659
        * haifa-sched.c (sched_insn_is_legitimate_for_speculation_p): Move ...
        * sched-deps.c (sched_insn_is_legitimate_for_speculation_p): ... here.
        Don't allow predicated instructions for data speculation.
        * sched-int.h (sched_insn_is_legitimate_for_speculation_p): Move
        declaration.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/haifa-sched.c
    trunk/gcc/sched-deps.c
    trunk/gcc/sched-int.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (25 preceding siblings ...)
  2008-08-06  6:26 ` mkuvyrkov at gcc dot gnu dot org
@ 2008-08-06  6:38 ` mkuvyrkov at gcc dot gnu dot org
  2008-08-06  6:40 ` mkuvyrkov at gcc dot gnu dot org
  2009-02-06  7:33 ` bonzini at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-08-06  6:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #26 from mkuvyrkov at gcc dot gnu dot org  2008-08-06 06:36 -------
Subject: Bug 35659

Author: mkuvyrkov
Date: Wed Aug  6 06:34:18 2008
New Revision: 138762

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=138762
Log:
        Backport from mainline:
        2008-08-06  Maxim Kuvyrkov  <maxim@codesourcery.com>

        PR target/35659
        * haifa-sched.c (sched_insn_is_legitimate_for_speculation_p): Move ...
        * sched-deps.c (sched_insn_is_legitimate_for_speculation_p): ... here.
        Don't allow predicated instructions for data speculation.
        * sched-int.h (sched_insn_is_legitimate_for_speculation_p): Move
        declaration.

Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/haifa-sched.c
    branches/gcc-4_3-branch/gcc/sched-deps.c
    branches/gcc-4_3-branch/gcc/sched-int.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (26 preceding siblings ...)
  2008-08-06  6:38 ` mkuvyrkov at gcc dot gnu dot org
@ 2008-08-06  6:40 ` mkuvyrkov at gcc dot gnu dot org
  2009-02-06  7:33 ` bonzini at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: mkuvyrkov at gcc dot gnu dot org @ 2008-08-06  6:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #27 from mkuvyrkov at gcc dot gnu dot org  2008-08-06 06:38 -------
Should be fixed on both trunk and 4_3-branch.


-- 

mkuvyrkov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Bug target/35659] [4.3/4.4 Regression] Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64
  2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
                   ` (27 preceding siblings ...)
  2008-08-06  6:40 ` mkuvyrkov at gcc dot gnu dot org
@ 2009-02-06  7:33 ` bonzini at gcc dot gnu dot org
  28 siblings, 0 replies; 30+ messages in thread
From: bonzini at gcc dot gnu dot org @ 2009-02-06  7:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #28 from bonzini at gnu dot org  2009-02-06 07:33 -------
Subject: Bug 35659

Author: bonzini
Date: Fri Feb  6 07:33:05 2009
New Revision: 143980

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143980
Log:
2009-02-06  Paolo Bonzini  <bonzini@gnu.org>

        PR tree-optimization/35659
        * tree-ssa-sccvn.c (vn_constant_eq, vn_reference_eq, vn_nary_op_eq
        vn_phi_eq): Shortcut if hashcode does not match.
        (vn_reference_op_compute_hash): Do not call iterative_hash_expr for
        NULL operands.
        * tree-ssa-pre.c (pre_expr_hash): Look at hashcode if available,
        and avoid iterative_hash_expr.
        (FOR_EACH_VALUE_ID_IN_SET): New.
        (value_id_compare): Remove.
        (sorted_array_from_bitmap_set): Use FOR_EACH_VALUE_ID_IN_SET to
        sort expressions by value id.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35659


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2009-02-06  7:33 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-21 18:49 [Bug fortran/35659] New: Miscompiled code with -O2 (but not with -O2 -funroll-loops) on ia64 kmccarty at debian dot org
2008-03-22 13:22 ` [Bug target/35659] " rguenth at gcc dot gnu dot org
2008-03-24 16:48 ` kmccarty at debian dot org
2008-03-24 17:33 ` [Bug target/35659] [4.3/4.4 Regression] " rguenth at gcc dot gnu dot org
2008-04-28  4:29 ` mmitchel at gcc dot gnu dot org
2008-05-20 21:02 ` sje at cup dot hp dot com
2008-05-21  3:51 ` kmccarty at debian dot org
2008-05-21 10:40 ` rguenth at gcc dot gnu dot org
2008-05-23  1:29 ` mmitchel at gcc dot gnu dot org
2008-06-03 18:00 ` sje at cup dot hp dot com
2008-06-04  9:00 ` rguenth at gcc dot gnu dot org
2008-06-06 15:03 ` rguenth at gcc dot gnu dot org
2008-06-24 14:45 ` jakub at gcc dot gnu dot org
2008-06-24 15:05 ` jakub at gcc dot gnu dot org
2008-06-25 10:21 ` jakub at gcc dot gnu dot org
2008-06-25 11:32 ` jakub at gcc dot gnu dot org
2008-06-25 11:33 ` jakub at gcc dot gnu dot org
2008-06-25 21:34 ` mkuvyrkov at gcc dot gnu dot org
2008-06-25 22:16 ` doko at cs dot tu-berlin dot de
2008-06-26  8:02 ` jakub at gcc dot gnu dot org
2008-06-26  8:42 ` jakub at gcc dot gnu dot org
2008-06-26  9:22 ` maxim at codesourcery dot com
2008-06-26  9:24 ` mkuvyrkov at gcc dot gnu dot org
2008-08-01 23:29 ` sje at cup dot hp dot com
2008-08-02 18:16 ` mkuvyrkov at gcc dot gnu dot org
2008-08-04 17:36 ` sje at cup dot hp dot com
2008-08-06  6:26 ` mkuvyrkov at gcc dot gnu dot org
2008-08-06  6:38 ` mkuvyrkov at gcc dot gnu dot org
2008-08-06  6:40 ` mkuvyrkov at gcc dot gnu dot org
2009-02-06  7:33 ` bonzini at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).