* [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
@ 2010-04-23 9:21 ` jv244 at cam dot ac dot uk
2010-04-23 9:25 ` jv244 at cam dot ac dot uk
` (23 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-23 9:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from jv244 at cam dot ac dot uk 2010-04-23 09:21 -------
The error also appears with:
gfortran -fbounds-check -O1 -funswitch-loops test.f90
--
jv244 at cam dot ac dot uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|wrong code with -O3 - |wrong code with -fbounds-
|fbounds-check |check -funswitch-loops
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
2010-04-23 9:21 ` [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops jv244 at cam dot ac dot uk
@ 2010-04-23 9:25 ` jv244 at cam dot ac dot uk
2010-04-23 10:30 ` jv244 at cam dot ac dot uk
` (22 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-23 9:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from jv244 at cam dot ac dot uk 2010-04-23 09:25 -------
reduced testcase:
IMPLICIT NONE
INTEGER, PARAMETER :: sp=4, dp=8
TYPE cp_fm_type
REAL(KIND=sp), DIMENSION(:,:), POINTER :: local_data_sp
REAL(KIND=dp), DIMENSION(:,:), POINTER :: local_data
INTEGER :: nrow_global,ncol_global
LOGICAL :: use_sp
END TYPE
TYPE(cp_fm_type), POINTER :: amat
INTEGER :: N
INTEGER :: irow_global, ncol_global, nrow_global, icol_global
REAL(KIND = dp), DIMENSION(:,:), POINTER :: a
REAL(KIND = sp), DIMENSION(:,:), POINTER :: a_sp
N=17
ALLOCATE(amat)
ALLOCATE(amat%local_data(N,N))
amat%nrow_global=N
amat%ncol_global=N
amat%use_sp=.FALSE.
amat%local_data=0
nrow_global=amat%nrow_global
ncol_global=amat%ncol_global
a => amat%local_data
a_sp => amat%local_data_sp
DO irow_global=1,nrow_global
DO icol_global=irow_global+1,ncol_global
IF(amat%use_sp) THEN
a_sp(icol_global,irow_global)=a_sp(irow_global,icol_global)
ELSE
a(icol_global,irow_global)=a(irow_global,icol_global)
ENDIF
ENDDO
ENDDO
END
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
2010-04-23 9:21 ` [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops jv244 at cam dot ac dot uk
2010-04-23 9:25 ` jv244 at cam dot ac dot uk
@ 2010-04-23 10:30 ` jv244 at cam dot ac dot uk
2010-04-23 12:30 ` rguenth at gcc dot gnu dot org
` (21 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-23 10:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jv244 at cam dot ac dot uk 2010-04-23 10:30 -------
It looks like in the .optimized dump these:
a_sp$offset = amat->local_data_sp.offset;
a_sp$dim$1$stride = amat->local_data_sp.dim[1].stride;
a_sp$dim$1$ubound = amat->local_data_sp.dim[1].ubound;
a_sp$dim$1$lbound = amat->local_data_sp.dim[1].lbound;
are indeed accessing fields of local_data_sp even though it never got
allocated.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (2 preceding siblings ...)
2010-04-23 10:30 ` jv244 at cam dot ac dot uk
@ 2010-04-23 12:30 ` rguenth at gcc dot gnu dot org
2010-04-23 13:32 ` jv244 at cam dot ac dot uk
` (20 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-04-23 12:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2010-04-23 12:29 -------
Hm, I can't reproduce this.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (3 preceding siblings ...)
2010-04-23 12:30 ` rguenth at gcc dot gnu dot org
@ 2010-04-23 13:32 ` jv244 at cam dot ac dot uk
2010-04-23 17:29 ` [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] " jv244 at cam dot ac dot uk
` (19 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-23 13:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from jv244 at cam dot ac dot uk 2010-04-23 13:32 -------
(In reply to comment #4)
> Hm, I can't reproduce this.
I see, the reduced testcase (comment #2) indeed doesn't fail with trunk
anymore, but the original does (but only at -O3 -fbounds-check).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (4 preceding siblings ...)
2010-04-23 13:32 ` jv244 at cam dot ac dot uk
@ 2010-04-23 17:29 ` jv244 at cam dot ac dot uk
2010-04-26 11:07 ` jv244 at cam dot ac dot uk
` (18 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-23 17:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from jv244 at cam dot ac dot uk 2010-04-23 17:29 -------
both testcases work with 4.1.2. I've also checked various versions of valgrind,
which all report consistent results. Also 4.5 only fails on the original
testcase. for reference, -v gives:
gcc-4.5/bin/gfortran-4.5 -O3 -fbounds-check -g -static -v t.f90
Driving: gcc-4.5/bin/gfortran-4.5 -O3 -fbounds-check -g -static -v t.f90
-lgfortran -lm
Using built-in specs.
COLLECT_GCC=gcc-4.5/bin/gfortran-4.5
COLLECT_LTO_WRAPPER=/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /lustre/ESPFS/scratch/pci/vondele/gcc-4.5/gcc-4.5.0/configure
--prefix=/home/pci/vondele/gcc-4.5 --enable-languages=c,c++,fortran
--program-suffix=-4.5 --disable-multilib --with-gmp=/home/pci/vondele/gcc-4.5
--with-mpfr=/home/pci/vondele/gcc-4.5 --with-mpc=/home/pci/vondele/gcc-4.5
Thread model: posix
gcc version 4.5.0 (GCC)
COLLECT_GCC_OPTIONS='-O3' '-fbounds-check' '-g' '-static' '-v' '-mtune=generic'
'-march=x86-64'
/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/f951
t.f90 -quiet -dumpbase t.f90 -mtune=generic -march=x86-64 -auxbase t -g -O3
-version -fbounds-check -fintrinsic-modules-path
/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/finclude
-o /tmp/cc4wFHAz.s
GNU Fortran (GCC) version 4.5.0 (x86_64-unknown-linux-gnu)
compiled by GNU C version 4.5.0, GMP version 4.2.4, MPFR version 2.4.1,
MPC version 0.8
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU Fortran (GCC) version 4.5.0 (x86_64-unknown-linux-gnu)
compiled by GNU C version 4.5.0, GMP version 4.2.4, MPFR version 2.4.1,
MPC version 0.8
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
COLLECT_GCC_OPTIONS='-O3' '-fbounds-check' '-g' '-static' '-v' '-mtune=generic'
'-march=x86-64'
as -V -Qy --64 -o /tmp/ccvIH9et.o /tmp/cc4wFHAz.s
GNU assembler version 2.16.91.0.5 (x86_64-suse-linux) using BFD version
2.16.91.0.5 20051219 (SUSE Linux)
COMPILER_PATH=/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/:/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../libexec/gcc/
LIBRARY_PATH=/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/:/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/:/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-O3' '-fbounds-check' '-g' '-static' '-v' '-mtune=generic'
'-march=x86-64'
/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/collect2
-m elf_x86_64 -static /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o
/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/crtbeginT.o
-L/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0
-L/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc
-L/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../../lib64
-L/lib/../lib64 -L/usr/lib/../lib64
-L/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../..
/tmp/ccvIH9et.o -lgfortran -lm --start-group -lgcc -lgcc_eh -lc --end-group
/panfs/panfs0.ften.es.hpcn.uzh.ch/es/home/pci/vondele/gcc-4.5/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/crtend.o
/usr/lib/../lib64/crtn.o
--
jv244 at cam dot ac dot uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail|4.3.1 4.4.0 4.6.0 |4.3.1 4.4.0 4.5.0 4.6.0
Known to work| |4.1.2
Summary|wrong code with -fbounds- |[4.3/4.4/4.5/4.6 Regression]
|check -funswitch-loops |wrong code with -fbounds-
| |check -funswitch-loops
Target Milestone|--- |4.3.5
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (5 preceding siblings ...)
2010-04-23 17:29 ` [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] " jv244 at cam dot ac dot uk
@ 2010-04-26 11:07 ` jv244 at cam dot ac dot uk
2010-05-14 12:57 ` rguenth at gcc dot gnu dot org
` (17 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-04-26 11:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from jv244 at cam dot ac dot uk 2010-04-26 11:06 -------
I have been trying to reduce more, but this is the smallest so far, seems like
it needs the derived types, the 2D arrays, the pointers. I've reduced this with
4.5.1, using '-O2 -funswitch-loops'
MODULE M1
IMPLICIT NONE
TYPE cp_fm_type
REAL(KIND=4), DIMENSION(:,:), POINTER :: local_data_sp
REAL(KIND=8), DIMENSION(:,:), POINTER :: local_data
END TYPE
CONTAINS
SUBROUTINE cp_fm_upper_to_full(matrix,nrow_global,ncol_global,use_sp)
TYPE(cp_fm_type), POINTER :: matrix
INTEGER, INTENT(IN) :: ncol_global, nrow_global
INTEGER :: irow_global, icol_global
LOGICAL :: use_sp
REAL(KIND = 4), DIMENSION(:,:), POINTER :: a_sp
REAL(KIND = 8), DIMENSION(:,:), POINTER :: a
a => matrix%local_data
a_sp => matrix%local_data_sp
DO irow_global=1,nrow_global
DO icol_global=irow_global+1,ncol_global
IF(use_sp) THEN
a_sp(icol_global,irow_global)=a_sp(irow_global,icol_global)
ELSE
a(icol_global,irow_global)=a(irow_global,icol_global)
ENDIF
ENDDO
ENDDO
END SUBROUTINE cp_fm_upper_to_full
END MODULE M1
USE M1
TYPE(cp_fm_type), POINTER :: a
INTEGER, PARAMETER :: N=17
ALLOCATE(a)
ALLOCATE(a%local_data(N,N))
a%local_data=0
CALL cp_fm_upper_to_full(a,N,N,.FALSE.)
END
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (6 preceding siblings ...)
2010-04-26 11:07 ` jv244 at cam dot ac dot uk
@ 2010-05-14 12:57 ` rguenth at gcc dot gnu dot org
2010-05-22 14:09 ` jv244 at cam dot ac dot uk
` (16 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-05-14 12:57 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (7 preceding siblings ...)
2010-05-14 12:57 ` rguenth at gcc dot gnu dot org
@ 2010-05-22 14:09 ` jv244 at cam dot ac dot uk
2010-05-22 19:00 ` rguenth at gcc dot gnu dot org
` (15 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-05-22 14:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from jv244 at cam dot ac dot uk 2010-05-22 14:09 -------
(In reply to comment #7)
> 4.5.1, using '-O2 -funswitch-loops'
obviously '-O2 -funswitch-loops -fbounds-check'
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (8 preceding siblings ...)
2010-05-22 14:09 ` jv244 at cam dot ac dot uk
@ 2010-05-22 19:00 ` rguenth at gcc dot gnu dot org
2010-06-24 13:31 ` jakub at gcc dot gnu dot org
` (14 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-05-22 19:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from rguenth at gcc dot gnu dot org 2010-05-22 18:14 -------
GCC 4.3.5 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.5 |4.3.6
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (9 preceding siblings ...)
2010-05-22 19:00 ` rguenth at gcc dot gnu dot org
@ 2010-06-24 13:31 ` jakub at gcc dot gnu dot org
2010-06-24 16:21 ` jakub at gcc dot gnu dot org
` (13 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-24 13:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from jakub at gcc dot gnu dot org 2010-06-24 13:31 -------
This looks like a bug in the source to me, certainly not a middle-end or
tree-optimization bug.
The
a_sp => matrix%local_data_sp
assignment when
matrix%local_data_sp
is uninitialized means you are copying uninitialized data. Either nullify
it in the caller or do this assignment only if (use_sp).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (10 preceding siblings ...)
2010-06-24 13:31 ` jakub at gcc dot gnu dot org
@ 2010-06-24 16:21 ` jakub at gcc dot gnu dot org
2010-06-24 16:24 ` jakub at gcc dot gnu dot org
` (12 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-24 16:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from jakub at gcc dot gnu dot org 2010-06-24 16:21 -------
Created an attachment (id=20997)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20997&action=view)
unswitch-tuplification.patch
While the validity of the testcase is questionable, what unswitching does looks
certainly wrong.
I'm attaching two small patches that fix things I've found while reading
tree-ssa-loop-unswitch.
The real problem for this testcase though seems to be that there is no
unreachable block removal in between the unswitching levels, so it can very
well choose to unswitch on a condition that is never reachable in the
unswitched loop.
Say there is
while (...)
{
if (cond1)
{
if (cond2)
A
else
B
}
else
{
if (cond3)
C
else
D
}
E
}
The first level chooses to unswitch on the cond1 condition, so we have:
if (cond1)
while (...)
{
if (1 != 0)
{
if (cond2)
A
else
B
}
else
{
if (cond3)
C
else
D
}
E
}
else
while (...)
{
if (0 != 0)
{
if (cond2)
A
else
B
}
else
{
if (cond3)
C
else
D
}
E
}
Now, in the first loop if we decide to unswitch on cond3, it transforms this
into:
if (cond1)
{
if (cond3)
while (...)
{
if (1 != 0)
{
if (cond2)
A
else
B
}
else
{
if (1 != 0)
C
else
D
}
E
}
else
while (...)
{
if (1 != 0)
{
if (cond2)
A
else
B
}
else
{
if (0 != 0)
C
else
D
}
E
}
}
else
while (...)
{
if (0 != 0)
{
if (cond2)
A
else
B
}
else
{
if (cond3)
C
else
D
}
E
}
If cond3 tests some variable that is initialized only if cond1 is false, this
unswitching (besides not being very useful because cond3 is never tested when
cond1 is false in the original program) results in jumps on uninitialized
values comparison.
Anyway, this first patch tries to fix a bug caused by tuplification. Before
tuplification, COND_EXPR_COND would be 0 or 1 when optimized, but 0 != 0 or 1
!= 0 condition never satisfies integer_zerop nor integer_nonzerop. We could
fold it, but we already have predicates for this stuff.
This patch actually causes more unswitching to happen, because it never
unswitches on 0 != 0 or 1 != 0 conditions.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (11 preceding siblings ...)
2010-06-24 16:21 ` jakub at gcc dot gnu dot org
@ 2010-06-24 16:24 ` jakub at gcc dot gnu dot org
2010-06-24 16:47 ` jakub at gcc dot gnu dot org
` (11 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-24 16:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from jakub at gcc dot gnu dot org 2010-06-24 16:24 -------
Created an attachment (id=20998)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20998&action=view)
level-opt.patch
When we reach the max-unswitch-level param, the function returns immediately,
which means the conditions the last unswitching optimized aren't simplified
(likely that happens in some later optimization passes, but unswitching has
code to handle it, so I think it won't hurt to do it immediately).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (12 preceding siblings ...)
2010-06-24 16:24 ` jakub at gcc dot gnu dot org
@ 2010-06-24 16:47 ` jakub at gcc dot gnu dot org
2010-06-24 20:49 ` rguenth at gcc dot gnu dot org
` (10 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-24 16:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from jakub at gcc dot gnu dot org 2010-06-24 16:47 -------
For the last issue, perhaps doing delete_unreachable_blocks in between the
levels wouldn't work too well, loops would need to be discovered again etc.
Maybe we can just do something similar to find_unreachable_blocks
limited only to the loop bbs and ignoring false resp. true edges if
gimple_cond_{true,false}_p on the last stmt and not consider as loop
unswitching candidate any bb's that aren't reachable from the header in the
loop.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (13 preceding siblings ...)
2010-06-24 16:47 ` jakub at gcc dot gnu dot org
@ 2010-06-24 20:49 ` rguenth at gcc dot gnu dot org
2010-06-25 8:13 ` jakub at gcc dot gnu dot org
` (9 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2010-06-24 20:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from rguenth at gcc dot gnu dot org 2010-06-24 20:49 -------
Both patches look ok. For the remaining issue it would probably work to
not copy unreachable blocks during unswitching?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (14 preceding siblings ...)
2010-06-24 20:49 ` rguenth at gcc dot gnu dot org
@ 2010-06-25 8:13 ` jakub at gcc dot gnu dot org
2010-06-25 9:04 ` rakdver at gcc dot gnu dot org
` (8 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 8:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from jakub at gcc dot gnu dot org 2010-06-25 08:13 -------
Created an attachment (id=21000)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21000&action=view)
gcc46-pr43866.patch
Here is a complete fix.
This includes what the earlier two patches did, plus for nested
tree_unswitch_single_loop call in the first pass optimizes all conditions using
entry tests, then if there are some possible unswitching condition candidates
does a still reachable loop bb discovery and only considers bbs in still
unreachable bbs.
Not copying the unreachable blocks would be IMHO harder. The current pass was
trying to be very simple, not modify the original loop, just duplicate it and
add a guard condition in front of the two loops, then in the nested call just
simplify conditions using the entry checks and leave all the cleanups to cfg
cleanup after the pass (e.g. to avoid redoing loop discovery etc.).
With the patch pr43866.s is actually smaller, eventhough more loops are
unswitched (previously some of the 15 unswitched loops were on 0 != 0 or 1 != 0
conditions and thus thrown away immediately).
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #20997|0 |1
is obsolete| |
Attachment #20998|0 |1
is obsolete| |
AssignedTo|unassigned at gcc dot gnu |jakub at gcc dot gnu dot org
|dot org |
Status|UNCONFIRMED |ASSIGNED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (15 preceding siblings ...)
2010-06-25 8:13 ` jakub at gcc dot gnu dot org
@ 2010-06-25 9:04 ` rakdver at gcc dot gnu dot org
2010-06-25 9:12 ` rakdver at gcc dot gnu dot org
` (7 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2010-06-25 9:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from rakdver at gcc dot gnu dot org 2010-06-25 09:04 -------
> Now, in the first loop if we decide to unswitch on cond3, it transforms this
> into:
...
> If cond3 tests some variable that is initialized only if cond1 is false, this
> unswitching (besides not being very useful because cond3 is never tested when
> cond1 is false in the original program) results in jumps on uninitialized
> values comparison.
while unswitching on cond3 indeed does not make sense, I don't think it can
result in a jump on uninitialized value comparison. Unswitching only works if
the arguments of the comparison are ssa names whose values are computed outside
of the loop (it does not do invariant motion by itself). So, the computations
based on uninitialized values would have to be introduced by some other pass
(possibly invariant motion).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (16 preceding siblings ...)
2010-06-25 9:04 ` rakdver at gcc dot gnu dot org
@ 2010-06-25 9:12 ` rakdver at gcc dot gnu dot org
2010-06-25 9:15 ` jakub at gcc dot gnu dot org
` (6 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2010-06-25 9:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from rakdver at gcc dot gnu dot org 2010-06-25 09:12 -------
(In reply to comment #16)
> > Now, in the first loop if we decide to unswitch on cond3, it transforms this
> > into:
> ...
> > If cond3 tests some variable that is initialized only if cond1 is false, this
> > unswitching (besides not being very useful because cond3 is never tested when
> > cond1 is false in the original program) results in jumps on uninitialized
> > values comparison.
>
> while unswitching on cond3 indeed does not make sense, I don't think it can
> result in a jump on uninitialized value comparison. Unswitching only works if
> the arguments of the comparison are ssa names whose values are computed outside
> of the loop (it does not do invariant motion by itself). So, the computations
> based on uninitialized values would have to be introduced by some other pass
> (possibly invariant motion).
unless the arguments of the comparison are completely uninitialized (their
values are not defined in the function and are not parameters of the function),
of course -- is that the case here? We should avoid unswitching on such
conditions (in addition to the proposed patch), since the comparison could be
unreachable due to some reason that cannot be detected in compile-time.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (17 preceding siblings ...)
2010-06-25 9:12 ` rakdver at gcc dot gnu dot org
@ 2010-06-25 9:15 ` jakub at gcc dot gnu dot org
2010-06-25 9:20 ` jakub at gcc dot gnu dot org
` (5 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 9:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from jakub at gcc dot gnu dot org 2010-06-25 09:14 -------
That part comes from the questionable testcase, which does
a_sp => matrix%local_data_sp
before the loop unconditionally, eventhough matrix%local_data_sp is
uninitialized unless use_sp is .true.
Without the unswitching the values from the uninitialized struct
matrix%local_data_sp are read, but not tested when use_sp is .false.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (18 preceding siblings ...)
2010-06-25 9:15 ` jakub at gcc dot gnu dot org
@ 2010-06-25 9:20 ` jakub at gcc dot gnu dot org
2010-06-25 9:29 ` jv244 at cam dot ac dot uk
` (4 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 9:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from jakub at gcc dot gnu dot org 2010-06-25 09:19 -------
The compiler doesn't know that matrix%local_data_sp is uninitialized (at least
unless it would propagate that info in some IPA pass), matrix%local_data_sp is
a field in what an arguments points at (and the argument is known to be
non-NULL).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (19 preceding siblings ...)
2010-06-25 9:20 ` jakub at gcc dot gnu dot org
@ 2010-06-25 9:29 ` jv244 at cam dot ac dot uk
2010-06-25 9:39 ` jakub at gcc dot gnu dot org
` (3 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jv244 at cam dot ac dot uk @ 2010-06-25 9:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from jv244 at cam dot ac dot uk 2010-06-25 09:28 -------
(In reply to comment #18)
> That part comes from the questionable testcase, which does
> a_sp => matrix%local_data_sp
> before the loop unconditionally, eventhough matrix%local_data_sp is
> uninitialized unless use_sp is .true.
thanks for looking into the bug report. Actually the testcase can be turned in
'unquestionable' by adding a 'NULLIFY(a%local_data_sp)' in the main program. At
that point the pointer assignment (a_sp => matrix%local_data_sp) makes a_sp
defined, but valgrind still yields an error.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (20 preceding siblings ...)
2010-06-25 9:29 ` jv244 at cam dot ac dot uk
@ 2010-06-25 9:39 ` jakub at gcc dot gnu dot org
2010-06-25 12:11 ` jakub at gcc dot gnu dot org
` (2 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 9:39 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from jakub at gcc dot gnu dot org 2010-06-25 09:38 -------
True. Most of the fields of the pointer are still uninitialized (NULLIFY only
clears the data field).
I'm afraid when NULLIFY doesn't clear the whole struct there will be always a
possibility valgrind might complain, the tree optimizers see the comparison as
not having any side-effects, not possibly trapping etc., so some pass might
exchange the two comparisons somewhere. But worse case it would be just a
valgrind warning, conditional jump based on uninitialized values which in the
end wouldn't have any runtime visible effect.
This patch doesn't try to do anything about that, just improve unswitching such
that it chooses conditions that it can usefully unswitch on (and not unswitch
on always false/true conditions).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (21 preceding siblings ...)
2010-06-25 9:39 ` jakub at gcc dot gnu dot org
@ 2010-06-25 12:11 ` jakub at gcc dot gnu dot org
2010-06-25 12:31 ` jakub at gcc dot gnu dot org
2010-06-30 15:34 ` [Bug middle-end/43866] [4.3/4.4/4.5 " jakub at gcc dot gnu dot org
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 12:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from jakub at gcc dot gnu dot org 2010-06-25 12:11 -------
Subject: Bug 43866
Author: jakub
Date: Fri Jun 25 12:10:42 2010
New Revision: 161375
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161375
Log:
PR middle-end/43866
* tree-ssa-loop-unswitch.c (tree_may_unswitch_on): If stmt is always
true or always false, return NULL_TREE.
(tree_unswitch_single_loop): Optimize conditions even when reaching
max-unswitch-level parameter. If num > 0, optimize first all
conditions
using entry checks, then do still reachable block discovery and
consider
only conditions in still reachable basic blocks in the loop.
* gfortran.dg/pr43866.f90: New test.
Added:
trunk/gcc/testsuite/gfortran.dg/pr43866.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-unswitch.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5/4.6 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (22 preceding siblings ...)
2010-06-25 12:11 ` jakub at gcc dot gnu dot org
@ 2010-06-25 12:31 ` jakub at gcc dot gnu dot org
2010-06-30 15:34 ` [Bug middle-end/43866] [4.3/4.4/4.5 " jakub at gcc dot gnu dot org
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-25 12:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from jakub at gcc dot gnu dot org 2010-06-25 12:30 -------
Fixed on the trunk so far.
--
jakub at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail|4.3.1 4.4.0 4.5.0 4.6.0 |4.3.1 4.4.0 4.5.0
Known to work|4.1.2 |4.1.2 4.6.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug middle-end/43866] [4.3/4.4/4.5 Regression] wrong code with -fbounds-check -funswitch-loops
2010-04-23 9:14 [Bug middle-end/43866] New: wrong code with -O3 -fbounds-check jv244 at cam dot ac dot uk
` (23 preceding siblings ...)
2010-06-25 12:31 ` jakub at gcc dot gnu dot org
@ 2010-06-30 15:34 ` jakub at gcc dot gnu dot org
24 siblings, 0 replies; 26+ messages in thread
From: jakub at gcc dot gnu dot org @ 2010-06-30 15:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #24 from jakub at gcc dot gnu dot org 2010-06-30 15:34 -------
Subject: Bug 43866
Author: jakub
Date: Wed Jun 30 15:34:09 2010
New Revision: 161609
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161609
Log:
Backport from mainline
2010-06-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/43866
* tree-ssa-loop-unswitch.c (tree_may_unswitch_on): If stmt is always
true or always false, return NULL_TREE.
(tree_unswitch_single_loop): Optimize conditions even when reaching
max-unswitch-level parameter. If num > 0, optimize first all
conditions
using entry checks, then do still reachable block discovery and
consider
only conditions in still reachable basic blocks in the loop.
* gfortran.dg/pr43866.f90: New test.
Added:
branches/gcc-4_5-branch/gcc/testsuite/gfortran.dg/pr43866.f90
- copied unchanged from r161375,
trunk/gcc/testsuite/gfortran.dg/pr43866.f90
Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
branches/gcc-4_5-branch/gcc/tree-ssa-loop-unswitch.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43866
^ permalink raw reply [flat|nested] 26+ messages in thread