public inbox for gcc-bugs@sourceware.org
* [Bug target/36713]  New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
@ 2008-07-03  9:09 dfranke at gcc dot gnu dot org
  2008-07-03  9:13 ` [Bug target/36713] " rguenther at suse dot de
                   ` (24 more replies)
  0 siblings, 25 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03  9:09 UTC (permalink / raw)
  To: gcc-bugs

------------------------------------------------------------------------
r137252 | rguenth | 2008-06-29 17:44:00 +0200 (Sun, 29 Jun 2008) | 27 lines

2008-06-29  Richard Guenther  <rguenther@suse.de>


For now, all I can tell is: my Fortran-based project gives acceptable results
with r137251 (FCFLAGS="-O2"), but r137252 produces strange output that cannot
be justified numerically. With FCFLAGS="-O1", both revisions seem to work.

Can't provide a testcase for now as I'm not sure where I should start looking.
Pointers welcome :)


-- 
           Summary: [4.4 regression] r137252 breaks -O2 optimization on
                    x86_64-unknown-linux-gnu
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dfranke at gcc dot gnu dot org
  GCC host triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
@ 2008-07-03  9:13 ` rguenther at suse dot de
  2008-07-03  9:15 ` dfranke at gcc dot gnu dot org
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03  9:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from rguenther at suse dot de  2008-07-03 09:12 -------
Subject: Re:   New: [4.4 regression] r137252 breaks -O2
 optimization on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------------------------------------------------------------------------
> r137252 | rguenth | 2008-06-29 17:44:00 +0200 (Sun, 29 Jun 2008) | 27 lines
>
> 2008-06-29  Richard Guenther  <rguenther@suse.de>
>
>
> For now, all I can tell is: my Fortran-based project gives acceptable results
> with r137251 (FCFLAGS="-O2"), but r137252 produces strange output that cannot
> be justified numerically. With FCFLAGS="-O1", both revisions seem to work.
>
> Can't provide a testcase for now as I'm not sure where I should start looking.
> Pointers welcome :)

Try updating to at least

2008-06-30  Richard Guenther  <rguenther@suse.de>

         PR middle-end/36671
         * tree-ssa-structalias.c (handle_lhs_call): Add flags argument,
         handle calls from ECF_MALLOC functions.
         (handle_pure_call): ECF_MALLOC functions do not return
         call-used memory.
         (find_func_aliases): Handle all calls, adjust calls to
         handle_lhs_call.

as this fixes some Fortran problems.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
  2008-07-03  9:13 ` [Bug target/36713] " rguenther at suse dot de
@ 2008-07-03  9:15 ` dfranke at gcc dot gnu dot org
  2008-07-03  9:43 ` rguenther at suse dot de
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03  9:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from dfranke at gcc dot gnu dot org  2008-07-03 09:15 -------
Updated last night to:

Last Changed Rev: 137396
Last Changed Date: 2008-07-03 00:31:11 +0200 (Thu, 03 Jul 2008)

Problem persists.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
  2008-07-03  9:13 ` [Bug target/36713] " rguenther at suse dot de
  2008-07-03  9:15 ` dfranke at gcc dot gnu dot org
@ 2008-07-03  9:43 ` rguenther at suse dot de
  2008-07-03  9:54 ` franke dot daniel at gmail dot com
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03  9:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenther at suse dot de  2008-07-03 09:42 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------- Comment #2 from dfranke at gcc dot gnu dot org  2008-07-03 09:15 -------
> Updated last night to:
>
> Last Changed Rev: 137396
> Last Changed Date: 2008-07-03 00:31:11 +0200 (Thu, 03 Jul 2008)
>
> Problem persists.

Hm, I'm out of quick ideas ;)  Polyhedron still seems to work fine
for me, as does SPEC 2000.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2008-07-03  9:43 ` rguenther at suse dot de
@ 2008-07-03  9:54 ` franke dot daniel at gmail dot com
  2008-07-03 10:09 ` rguenther at suse dot de
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: franke dot daniel at gmail dot com @ 2008-07-03  9:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from franke dot daniel at gmail dot com  2008-07-03 09:53 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization on
x86_64-unknown-linux-gnu

3 Jul 2008 09:42:44 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org>:
> Hm, I'm out of quick ideas ;)  Polyhedron still seems to work fine
> for me, as does SPEC 2000.

Any hint of what I should look for? Some specific feature that the
change in question might affect?
If nothing else, I'll try to compare dumps between the two revisions.
Is this a specific pass or will -fdump-tree-optimized do?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2008-07-03  9:54 ` franke dot daniel at gmail dot com
@ 2008-07-03 10:09 ` rguenther at suse dot de
  2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 10:09 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from rguenther at suse dot de  2008-07-03 10:08 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, franke dot daniel at gmail dot com wrote:

> ------- Comment #4 from franke dot daniel at gmail dot com  2008-07-03 09:53 -------
> Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization on
> x86_64-unknown-linux-gnu
>
> 3 Jul 2008 09:42:44 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org>:
>> Hm, I'm out of quick ideas ;)  Polyhedron still seems to work fine
>> for me, as does SPEC 2000.
>
> Any hint of what I should look for? Some specific feature that the
> change in question might affect?
> If nothing else, I'll try to compare dumps between the two revisions.
> Is this a specific pass or will -fdump-tree-optimized do?

The patch changed us to assume that on function entry global variables
only point to global memory, not function local memory.  In the end
this should result in less call-clobbering and thus "more" optimization 
...

-fdump-tree-optimized differences should hint at where it goes wrong,
for further analysis I would suggest -fdump-tree-alias-vops.

To simplify your life try if you can reproduce it with -O 
-fstrict-aliasing or with -O2 -fno-tree-pre.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2008-07-03 10:09 ` rguenther at suse dot de
@ 2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
  2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 10:41 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from dfranke at gcc dot gnu dot org  2008-07-03 10:40 -------
> To simplify your life try if you can reproduce it with -O 
> -fstrict-aliasing or with -O2 -fno-tree-pre.

FCFLAGS="-O -fstrict-aliasing"      - not reproducible
FCFLAGS="-O2 -fno-tree-pre"         - reproducible
FCFLAGS="-O -ftree-pre"             - not reproducible
FCFLAGS="-O2 -fno-strict-aliasing"  - reproducible

Will do dump comparison next.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
  2008-07-03 11:50 ` rguenther at suse dot de
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 11:42 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from dfranke at gcc dot gnu dot org  2008-07-03 11:42 -------
Differences in *.optimized dumps seem to be variable renaming only, e.g.
  -  temp.1140 = tdata->p;
  +  temp.1137 = tdata->p;

Differences in *.alias seem to be mostly of the kind:
  -PARM_NOALIAS.895 = &ANYTHING
  +PARM_NOALIAS.895 = &ESCAPED

However, I also found:
  -PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
  -D.3410_6 = same as thread_data
  -thread_data = { ANYTHING }
  -D.3411_7 = same as thread_data
  +PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
  +D.3410_6 = same as derefaddrtmp.893
  +thread_data = same as derefaddrtmp.893         <---- ?!?
  +D.3411_7 = same as derefaddrtmp.893

which might hint at the problem: the variable thread_data is an allocated
POINTER to a vector of structures (which hold more pointers), public within a
module. USE-ing it makes it (sort of) local to the function. Most (>80% of
all) changes in *.optimized were in the one source file that uses this module
...

Richard, would this fit the bill?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
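
Since most of the *.optimized differences reported in comment #7 are renamed
temporaries (temp.1140 vs. temp.1137), one way to shrink such a diff is to mask
the numeric suffixes before comparing. What follows is only a minimal sketch,
assuming the two dumps have already been read into lists of lines. The helper
names `normalize` and `meaningful_diff` are hypothetical and not part of GCC,
and masking can also hide a genuine swap of two temporaries, so it is a filter
rather than a proof of equivalence.

```python
import difflib
import re

# Compiler-generated names carry a numeric suffix that shifts between builds,
# e.g. "temp.1140" vs. "temp.1137", or "D.3410_6" with an SSA version.
_NUMBERED = re.compile(r"\b([A-Za-z_][A-Za-z_0-9]*)\.(\d+)(_\d+)?")


def normalize(line):
    """Mask numeric suffixes so that renamed temporaries compare equal."""
    def mask(match):
        ssa = "_N" if match.group(3) else ""
        return match.group(1) + ".N" + ssa
    return _NUMBERED.sub(mask, line)


def meaningful_diff(old_lines, new_lines):
    """Return only the +/- lines that remain after masking numbered names."""
    old = [normalize(line) for line in old_lines]
    new = [normalize(line) for line in new_lines]
    diff = list(difflib.unified_diff(old, new, lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; "@@" lines are
    # hunk markers. Everything else is a removed or added line.
    return [d for d in diff[2:] if not d.startswith("@@")]
```

With the lines quoted in comment #7, the temp.* pair drops out of the result
while the PARM_NOALIAS.895 change from &ANYTHING to &ESCAPED is still reported.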



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 11:50 ` rguenther at suse dot de
  2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 11:50 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from rguenther at suse dot de  2008-07-03 11:49 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------- Comment #7 from dfranke at gcc dot gnu dot org  2008-07-03 11:42 -------
> Differences in *.optimized dumps seem to be variable renaming only, e.g.
>  -  temp.1140 = tdata->p;
>  +  temp.1137 = tdata->p;
>
> Differences in *.alias seem to be mostly of the kind:
>  -PARM_NOALIAS.895 = &ANYTHING
>  +PARM_NOALIAS.895 = &ESCAPED
>
> However, I also found:
>  -PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
>  -D.3410_6 = same as thread_data
>  -thread_data = { ANYTHING }
>  -D.3411_7 = same as thread_data
>  +PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
>  +D.3410_6 = same as derefaddrtmp.893
>  +thread_data = same as derefaddrtmp.893         <---- ?!?
>  +D.3411_7 = same as derefaddrtmp.893

Well, it's not really different - only thread_data now doesn't
point to anything but to the same as derefaddrtmp.893 (I guess
that points to { ESCAPED NONLOCAL }?).

> which might hint at the problem: the variable thread_data is an allocated
> POINTER to a vector of structures (which hold more pointers), public within a
> module. USE-ing it makes it (sort of) local to the function. Most (>80% of
> all) changes in *.optimized were in the one source file that uses this module
> ...
>
> Richard, would this fit the bill?

Well, if there is no difference in the optimized dumps then this
doesn't explain it (but the points-to differences also look good).

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2008-07-03 11:50 ` rguenther at suse dot de
@ 2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
  2008-07-03 12:04 ` rguenther at suse dot de
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 12:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from dfranke at gcc dot gnu dot org  2008-07-03 11:59 -------
> Well, it's not really different - only thread_data now doesn't
> point to anything but to the same as derefaddrtmp.893 (I guess
> that points to { ESCAPED NONLOCAL }?).

Just ESCAPED, see below.

r137251
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
D.3410_6 = same as thread_data
thread_data = { ANYTHING }
D.3411_7 = same as thread_data
D.3417_13 = { ANYTHING }

r137252
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
D.3410_6 = same as derefaddrtmp.893
thread_data = same as derefaddrtmp.893
D.3411_7 = same as derefaddrtmp.893


> Well, if there is no difference in the optimized dumps then this
> doesn't explain it (but the points-to differences also look good)

None that I could spot by scrolling through 1000+ lines of differences.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
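
The two points-to listings in comment #9 can also be compared mechanically
rather than by eye. The sketch below copies the quoted lines verbatim and
resolves each "same as X" entry to the set it refers to. It assumes the dump
format is exactly "name = { ... }" or "name = same as other", as shown in the
thread; `parse_points_to` and `changed_sets` are hypothetical helper names, not
GCC functionality.

```python
R137251 = """\
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
D.3410_6 = same as thread_data
thread_data = { ANYTHING }
D.3411_7 = same as thread_data
D.3417_13 = { ANYTHING }
"""

R137252 = """\
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
D.3410_6 = same as derefaddrtmp.893
thread_data = same as derefaddrtmp.893
D.3411_7 = same as derefaddrtmp.893
"""


def parse_points_to(lines):
    """Map each variable to its points-to set, following 'same as X' links."""
    raw = {}
    for line in lines:
        name, sep, rhs = line.partition(" = ")
        if sep:
            raw[name.strip()] = rhs.strip()

    def resolve(name, seen=()):
        rhs = raw.get(name)
        if rhs is None or name in seen:  # unknown name or a cycle
            return frozenset()
        if rhs.startswith("same as "):
            return resolve(rhs[len("same as "):].strip(), seen + (name,))
        return frozenset(rhs.strip("{} ").split())

    return {name: resolve(name) for name in raw}


def changed_sets(old, new):
    """Variables whose resolved set differs; None marks a missing entry."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


old = parse_points_to(R137251.splitlines())
new = parse_points_to(R137252.splitlines())
for name, (before, after) in changed_sets(old, new).items():
    print(name, sorted(before or []), "->", sorted(after or []))
```

For the data above this reports thread_data, D.3410_6 and D.3411_7 moving from
{ ANYTHING } to { ESCAPED }, PARM_NOALIAS.895 losing ANYTHING, and D.3417_13
missing from the r137252 listing as quoted.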



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 12:04 ` rguenther at suse dot de
  2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 12:04 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from rguenther at suse dot de  2008-07-03 12:03 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

>
>
> ------- Comment #9 from dfranke at gcc dot gnu dot org  2008-07-03 11:59 -------
>> Well, it's not really different - only thread_data now doesn't
>> point to anything but to the same as derefaddrtmp.893 (I guess
>> that points to { ESCAPED NONLOCAL }?).
>
> Just ESCAPED, see below.
>
> r137251
> derefaddrtmp.893 = { ESCAPED }
> derefaddrtmp.894 = { NONLOCAL }
> this = { PARM_NOALIAS.895 }
> PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
> D.3410_6 = same as thread_data
> thread_data = { ANYTHING }
> D.3411_7 = same as thread_data
> D.3417_13 = { ANYTHING }
>
> r137252
> derefaddrtmp.893 = { ESCAPED }
> derefaddrtmp.894 = { NONLOCAL }
> this = { PARM_NOALIAS.895 }
> PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
> D.3410_6 = same as derefaddrtmp.893
> thread_data = same as derefaddrtmp.893
> D.3411_7 = same as derefaddrtmp.893

That looks correct.

>> Well, if there is no difference in the optimized dumps then this
>> doesn't explain it (but the points-to differences also look good)
>
> None that I could spot by scrolling through 1000+ lines of differences.

Yeah.  I guess the assembly is different, right? ;)

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (9 preceding siblings ...)
  2008-07-03 12:04 ` rguenther at suse dot de
@ 2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
  2008-07-03 12:55 ` rguenther at suse dot de
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 12:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from dfranke at gcc dot gnu dot org  2008-07-03 12:45 -------
> I guess the assembly is different, right?

Yes, it is. Comparing *.s files from FCFLAGS="-O2 -save-temps", there are
differences in one file - though not the one suspected in comment #7.

There is one difference in the .optimized file that I could easily identify:
-  D.1935 = dummy_atom_model_inertia_tensor (this, phase);
-  __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
+  D.1935 = dummy_atom_model_inertia_tensor (this, phase) [return slot optimization];
+  __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935) [tail call];

And there are plenty in *.alias, e.g.:
  Aliased symbols

  -D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 8, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 2, call clobbered (passed to call)
  +D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 0, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 0

The difference file has 3000+ lines here, one could argue that the dumps are
significantly different ;)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (10 preceding siblings ...)
  2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 12:55 ` rguenther at suse dot de
  2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 12:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from rguenther at suse dot de  2008-07-03 12:54 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------- Comment #11 from dfranke at gcc dot gnu dot org  2008-07-03 12:45 -------
>> I guess the assembly is different, right?
>
> Yes, it is. Comparing *.s files from FCFLAGS="-O2 -save-temps", there are
> differences in one file - though not the one suspected in comment #7.
>
> There is one difference in the .optimized file that I could easily identify:
> -  D.1935 = dummy_atom_model_inertia_tensor (this, phase);
> -  __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
> +  D.1935 = dummy_atom_model_inertia_tensor (this, phase) [return slot
> optimization];
> +  __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935) [tail
> call];

Ok, that is an interesting difference.  Can you try if 
-fno-optimize-sibling-calls fixes it (that should remove the [tail call]).
Unfortunately there is no option to disable the return slot optimization,
so you have to disable that in the source (tree-nrv.c, just return early
from execute_return_slot_opt).

Obviously the above may be a valid extra optimization ...

Richard.

>
> And there are plenty in *.alias, e.g.:
>  Aliased symbols
>
>  -D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 8, direct
> reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 2, call
> clobbered (passed to call)
>  +D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 0, direct
> reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 0
>
> The difference file has 3000+ lines here, one could argue that the dumps are
> significantly different ;)
>
>
>


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (11 preceding siblings ...)
  2008-07-03 12:55 ` rguenther at suse dot de
@ 2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
  2008-07-03 13:11 ` rguenther at suse dot de
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 13:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from dfranke at gcc dot gnu dot org  2008-07-03 13:00 -------
> Can you try if -fno-optimize-sibling-calls fixes it
> (that should remove the [tail call]).

FCFLAGS="-O2 -fno-optimize-sibling-calls" - works
FCFLAGS="-O1 -foptimize-sibling-calls"    - broken


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
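
The flag experiments in comments #6 and #13 follow a simple pattern: start from
a base level that works and switch on one optimization at a time. A minimal
sketch of that loop is given below. It assumes a caller-supplied predicate that
rebuilds the project with the given flags and reports whether the output is
wrong; `culprit_flags` and `observed_in_thread` are hypothetical names, and the
candidate list contains only the three flags that appear in this thread, not
everything that -O2 enables. A bug that needs two flags together would not be
found this way.

```python
# Flags mentioned in this thread; not the complete set enabled by -O2.
CANDIDATES = [
    "-fstrict-aliasing",
    "-ftree-pre",
    "-foptimize-sibling-calls",
]


def culprit_flags(candidates, is_broken, base="-O1"):
    """Return the flags that turn a working base build into a broken one.

    is_broken(flags) is supplied by the caller; in practice it would rebuild
    with FCFLAGS set to the joined flags and check the numerical results.
    """
    if is_broken([base]):
        raise ValueError("base flags already fail; nothing to isolate")
    return [flag for flag in candidates if is_broken([base, flag])]


def observed_in_thread(flags):
    # Stand-in predicate that only encodes the results reported above for
    # -O/-O1 plus a single flag; it does not build or run anything.
    return "-foptimize-sibling-calls" in flags


print(culprit_flags(CANDIDATES, observed_in_thread))
```

A real predicate would replace `observed_in_thread`; the loop itself stays the
same.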



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (12 preceding siblings ...)
  2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 13:11 ` rguenther at suse dot de
  2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 13:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from rguenther at suse dot de  2008-07-03 13:10 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------- Comment #13 from dfranke at gcc dot gnu dot org  2008-07-03 13:00 -------
>> Can you try if -fno-optimize-sibling-calls fixes it
>> (that should remove the [tail call]).
>
> FCFLAGS="-O2 -fno-optimize-sibling-calls" - works
> FCFLAGS="-O1 -foptimize-sibling-calls"    - broken

Ok.  Probably some latent bug got uncovered.  Your challenge is now
to produce a testcase from your source ;)  Basically isolate this
function somehow, but make sure the [tail call] and the
[return slot optimization] are still there.

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (13 preceding siblings ...)
  2008-07-03 13:11 ` rguenther at suse dot de
@ 2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
  2008-07-03 19:27 ` rguenther at suse dot de
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 19:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from dfranke at gcc dot gnu dot org  2008-07-03 18:59 -------
> Your challenge is now to produce a testcase from your source ;)

I do have a testcase that shows said flags. The difference in assembler between
-foptimize-sibling-calls and -fno-optimize-sibling-calls is:

--- without.s 2008-07-03 20:51:59.000000000 +0200
+++ with.s    2008-07-03 20:52:26.000000000 +0200
@@ -64,10 +64,9 @@
        movq    %rsp, %rdi
        call    dummy_atom_model_inertia_tensor_
        movq    %rsp, %rdi
-       call    inertia_tensor_to_rg_
        addq    $80, %rsp
        popq    %rbx
-       ret
+       jmp     inertia_tensor_to_rg_
 .LFE2:
        .size   dummy_atom_model_radius_of_gyration_, .-dummy_atom_model_radius_of_gyration_
        .section        .eh_frame,"a",@progbits

Is this correct? Running the code compiled with -foptimize-sibling-calls seems
to yield the correct result. Not sure if I need to investigate further.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (14 preceding siblings ...)
  2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 19:27 ` rguenther at suse dot de
  2008-07-03 19:34 ` franke dot daniel at gmail dot com
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 19:27 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #16 from rguenther at suse dot de  2008-07-03 19:26 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:

> ------- Comment #15 from dfranke at gcc dot gnu dot org  2008-07-03 18:59 -------
>> Your challenge is now to produce a testcase from your source ;)
>
> I do have a testcase that shows said flags. The difference in assembler between
> -foptimize-sibling-calls and -fno-optimize-sibling-calls is:
>
> --- without.s 2008-07-03 20:51:59.000000000 +0200
> +++ with.s    2008-07-03 20:52:26.000000000 +0200
> @@ -64,10 +64,9 @@
>        movq    %rsp, %rdi
>        call    dummy_atom_model_inertia_tensor_
>        movq    %rsp, %rdi
> -       call    inertia_tensor_to_rg_
>        addq    $80, %rsp
>        popq    %rbx
> -       ret
> +       jmp     inertia_tensor_to_rg_
> .LFE2:
>        .size   dummy_atom_model_radius_of_gyration_,
> .-dummy_atom_model_radius_of_gyration_
>        .section        .eh_frame,"a",@progbits
>
> Is this correct? Running the code compiled with -foptimize-sibling-calls seems
> to yield the correct result. Not sure if I need to investigate further.

Well, the circumstances where the miscompilation occurs may be tricky.
So you say you have a testcase but that doesn't show wrong behavior?
That's unfortunate ... :/

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (15 preceding siblings ...)
  2008-07-03 19:27 ` rguenther at suse dot de
@ 2008-07-03 19:34 ` franke dot daniel at gmail dot com
  2008-07-03 19:45 ` rguenther at suse dot de
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: franke dot daniel at gmail dot com @ 2008-07-03 19:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #17 from franke dot daniel at gmail dot com  2008-07-03 19:33 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization on
x86_64-unknown-linux-gnu


> Well, the circumstances where the miscompilation occurs may be tricky.
> So you say you have a testcase but that doesn't show wrong behavior?
> That's unfortunate ... :/

There was this small glimmer of hope that you could tell from the assembly 
difference that something's horribly, horribly wrong and you know perfectly 
well how to fix it. No? :)

A properly failing testcase might take a while ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
  2008-07-03  9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
                   ` (16 preceding siblings ...)
  2008-07-03 19:34 ` franke dot daniel at gmail dot com
@ 2008-07-03 19:45 ` rguenther at suse dot de
  2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 19:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #18 from rguenther at suse dot de  2008-07-03 19:44 -------
Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization
 on x86_64-unknown-linux-gnu

On Thu, 3 Jul 2008, franke dot daniel at gmail dot com wrote:

> ------- Comment #17 from franke dot daniel at gmail dot com  2008-07-03 19:33 -------
> Subject: Re:  [4.4 regression] r137252 breaks -O2 optimization on
> x86_64-unknown-linux-gnu
>
>
>> Well, the circumstances where the miscompilation occurs may be tricky.
>> So you say you have a testcase but that doesn't show wrong behavior?
>> That's unfortunate ... :/
>
> There was this small glimmer of hope that you could tell from the assembly
> difference that something's horribly, horribly wrong and you know perfectly
> well how to fix it. No? :)

No, sorry - this is just what a sibcall looks like ;)

> A properly failing testcase might take a while ...

Understood ...

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:07 UTC (permalink / raw)
  To: gcc-bugs



-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #19 from rguenth at gcc dot gnu dot org  2008-07-04 15:06 -------
Created an attachment (id=15854)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15854&action=view)
patch

I found some more latent bugs.  Can you try the attached patch?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:47 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #20 from rguenth at gcc dot gnu dot org  2008-07-04 15:46 -------
Created an attachment (id=15855)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15855&action=view)
more fixes

Some more fixes (includes the first patch)


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #15854|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: dfranke at gcc dot gnu dot org @ 2008-07-04 21:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #21 from dfranke at gcc dot gnu dot org  2008-07-04 21:36 -------
Based on:
Last Changed Rev: 137437
Last Changed Date: 2008-07-04 00:02:18 +0200 (Fri, 04 Jul 2008)

without patch: fails, with patch: works.

Please note that the optimized dump of the function in question once again
looks like:
<bb 2>:
  D.1935 = dummy_atom_model_inertia_tensor (this, phase);
  __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
  return __result_dummy_atom_model_rad.21;

i.e. with the patch it lost the [return slot optimization]/[sibling call] shown
in comment #11.
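
For readers unfamiliar with the pattern, here is a minimal C sketch of the
shape of that dump. It is an illustration only, not the Fortran code from the
project: `struct tensor`, `make_tensor`, `trace_of` and `radius_like` are
hypothetical names standing in for the derived type and the two calls. The
second call sits in tail position but receives the address of a temporary that
lives in the caller's frame, so emitting it as a sibling call (which reuses
that frame) is presumably only safe if the compiler can show the callee never
reads through the pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the derived type returned by value. */
struct tensor { double m[3][3]; };

/* Stand-in for dummy_atom_model_inertia_tensor: returns an aggregate by
   value, which is where a return slot optimization could apply. */
static struct tensor make_tensor(double scale)
{
    struct tensor t = {{{0}}};
    for (size_t i = 0; i < 3; ++i)
        t.m[i][i] = scale * (double)(i + 1);
    return t;
}

/* Stand-in for inertia_tensor_to_rg: reads through a pointer that refers
   to storage in the caller's stack frame. */
static double trace_of(const struct tensor *t)
{
    return t->m[0][0] + t->m[1][1] + t->m[2][2];
}

/* Same shape as the dump above: a temporary (D.1935 in the dump) is filled
   by the first call and its address is passed to the second call, whose
   result is returned directly. */
double radius_like(double scale)
{
    struct tensor tmp = make_tensor(scale);
    return trace_of(&tmp);
}
```

Compiled without a sibling call, `tmp` stays alive until `trace_of` returns,
which matches the plain call sequence in the patched dump.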


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-06 13:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #22 from rguenth at gcc dot gnu dot org  2008-07-06 13:29 -------
Mine.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-07-06 13:29:20
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-07 15:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #23 from rguenth at gcc dot gnu dot org  2008-07-07 15:12 -------
Subject: Bug 36713

Author: rguenth
Date: Mon Jul  7 15:11:29 2008
New Revision: 137571

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137571
Log:
2008-07-07  Richard Guenther  <rguenther@suse.de>

        PR tree-optimization/36713
        * tree-flow-inline.h (is_call_used): New function.
        * tree-nrv.c (dest_safe_for_nrv_p): Use it.
        * tree-tailcall.c (suitable_for_tail_opt_p): Likewise.
        * tree-outof-ssa.c (create_temp): Set call-used flag if required.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-flow-inline.h
    trunk/gcc/tree-nrv.c
    trunk/gcc/tree-outof-ssa.c
    trunk/gcc/tree-tailcall.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-07 15:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #24 from rguenth at gcc dot gnu dot org  2008-07-07 15:16 -------
Fixed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713



