* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
@ 2008-07-03 9:13 ` rguenther at suse dot de
2008-07-03 9:15 ` dfranke at gcc dot gnu dot org
` (23 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 9:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenther at suse dot de 2008-07-03 09:12 -------
Subject: Re: New: [4.4 regression] r137252 breaks -O2
optimization on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------------------------------------------------------------------------
> r137252 | rguenth | 2008-06-29 17:44:00 +0200 (Sun, 29 Jun 2008) | 27 lines
>
> 2008-06-29 Richard Guenther <rguenther@suse.de>
>
>
> For now, all I can tell is: my Fortran-based project gives acceptable results
> with r137251 (FCFLAGS="-O2"), but r137252 produces strange output that can,
> numerically, not be justified. For FCFLAGS="-O1", both revisions seem to work.
>
> Can't provide a testcase for now as I'm not sure where I should start looking.
> Pointers welcome :)
Try updating to at least
2008-06-30 Richard Guenther <rguenther@suse.de>
PR middle-end/36671
* tree-ssa-structalias.c (handle_lhs_call): Add flags argument,
handle calls from ECF_MALLOC functions.
(handle_pure_call): ECF_MALLOC functions do not return
call-used memory.
(find_func_aliases): Handle all calls, adjust calls to
handle_lhs_call.
as this fixes some fortran problems.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
^ permalink raw reply [flat|nested] 26+ messages in thread
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
2008-07-03 9:13 ` [Bug target/36713] " rguenther at suse dot de
@ 2008-07-03 9:15 ` dfranke at gcc dot gnu dot org
2008-07-03 9:43 ` rguenther at suse dot de
` (22 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 9:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from dfranke at gcc dot gnu dot org 2008-07-03 09:15 -------
Updated last night to:
Last Changed Rev: 137396
Last Changed Date: 2008-07-03 00:31:11 +0200 (Thu, 03 Jul 2008)
Problem persists.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
2008-07-03 9:13 ` [Bug target/36713] " rguenther at suse dot de
2008-07-03 9:15 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 9:43 ` rguenther at suse dot de
2008-07-03 9:54 ` franke dot daniel at gmail dot com
` (21 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 9:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenther at suse dot de 2008-07-03 09:42 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------- Comment #2 from dfranke at gcc dot gnu dot org 2008-07-03 09:15 -------
> Updated last night to:
>
> Last Changed Rev: 137396
> Last Changed Date: 2008-07-03 00:31:11 +0200 (Thu, 03 Jul 2008)
>
> Problem persists.
Hm, I'm out of quick ideas ;) Polyhedron still seems to work fine
for me, as does SPEC 2000.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (2 preceding siblings ...)
2008-07-03 9:43 ` rguenther at suse dot de
@ 2008-07-03 9:54 ` franke dot daniel at gmail dot com
2008-07-03 10:09 ` rguenther at suse dot de
` (20 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: franke dot daniel at gmail dot com @ 2008-07-03 9:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from franke dot daniel at gmail dot com 2008-07-03 09:53 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization on
x86_64-unknown-linux-gnu
3 Jul 2008 09:42:44 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org>:
> Hm, I'm out of quick ideas ;) Polyhedron still seems to work fine
> for me, as does SPEC 2000.
Any hint of what I should look for? Some specific feature that the
change in question might affect?
If nothing else, I'll try to compare dumps between the two revisions.
Is this a specific pass or will -fdump-tree-optimized do?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (3 preceding siblings ...)
2008-07-03 9:54 ` franke dot daniel at gmail dot com
@ 2008-07-03 10:09 ` rguenther at suse dot de
2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
` (19 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 10:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenther at suse dot de 2008-07-03 10:08 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, franke dot daniel at gmail dot com wrote:
> ------- Comment #4 from franke dot daniel at gmail dot com 2008-07-03 09:53 -------
> Subject: Re: [4.4 regression] r137252 breaks -O2 optimization on
> x86_64-unknown-linux-gnu
>
> 3 Jul 2008 09:42:44 -0000, rguenther at suse dot de <gcc-bugzilla@gcc.gnu.org>:
>> Hm, I'm out of quick ideas ;) Polyhedron still seems to work fine
>> for me, as does SPEC 2000.
>
> Any hint of what I should look for? Some specific feature that the
> change in question might affect?
> If nothing else, I'll try to compare dumps between the two revisions.
> Is this a specific pass or will -fdump-tree-optimized do?
The patch changed us to assume that on function entry global variables
only point to global memory, not function local memory. In the end
this should result in less call-clobbering and thus "more" optimization
...
-fdump-tree-optimized differences should hint at where it goes wrong,
for further analysis I would suggest -fdump-tree-alias-vops.
To simplify your life try if you can reproduce it with -O
-fstrict-aliasing or with -O2 -fno-tree-pre.
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (4 preceding siblings ...)
2008-07-03 10:09 ` rguenther at suse dot de
@ 2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
` (18 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 10:41 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from dfranke at gcc dot gnu dot org 2008-07-03 10:40 -------
> To simplify your life try if you can reproduce it with -O
> -fstrict-aliasing or with -O2 -fno-tree-pre.
FCFLAGS="-O -fstrict-aliasing" - not reproducible
FCFLAGS="-O2 -fno-tree-pre" - reproducible
FCFLAGS="-O -ftree-pre" - not reproducible
FCFLAGS="-O2 -fno-strict-aliasing" - reproducible
Will do dump comparison next.
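Read as a truth table, the four results above point at something that -O2 enables beyond PRE and strict aliasing, since toggling either of those does not change the outcome. A minimal Python sketch of that elimination step follows; the helper name `suspect_flags` and the dictionary layout are illustrative assumptions, not part of any GCC tooling.

```python
def suspect_flags(results):
    """Return flags present in every failing flag set and in no passing one.

    `results` maps an FCFLAGS string to True if the bug reproduced with it.
    """
    failing = [set(flags.split()) for flags, bad in results.items() if bad]
    passing = [set(flags.split()) for flags, bad in results.items() if not bad]
    if not failing:
        return set()
    common_to_failures = set.intersection(*failing)
    seen_in_passes = set().union(*passing)
    return common_to_failures - seen_in_passes


# Outcomes as reported in this comment.
results = {
    "-O -fstrict-aliasing": False,
    "-O2 -fno-tree-pre": True,
    "-O -ftree-pre": False,
    "-O2 -fno-strict-aliasing": True,
}
print(suspect_flags(results))
```

The sketch compares flag strings literally, so -O and -O1 would count as different flags even though GCC treats them the same.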
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (5 preceding siblings ...)
2008-07-03 10:41 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
2008-07-03 11:50 ` rguenther at suse dot de
` (17 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 11:42 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from dfranke at gcc dot gnu dot org 2008-07-03 11:42 -------
Differences in *.optimized dumps seem to be variable renaming only, e.g.
- temp.1140 = tdata->p;
+ temp.1137 = tdata->p;
Differences in *.alias seem to be mostly of the kind:
-PARM_NOALIAS.895 = &ANYTHING
+PARM_NOALIAS.895 = &ESCAPED
However, I also found:
-PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
-D.3410_6 = same as thread_data
-thread_data = { ANYTHING }
-D.3411_7 = same as thread_data
+PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
+D.3410_6 = same as derefaddrtmp.893
+thread_data = same as derefaddrtmp.893 <---- ?!?
+D.3411_7 = same as derefaddrtmp.893
which might hint at the problem: the variable thread_data is an allocated
POINTER to a vector of structures (which hold more pointers), public within a
module. USE-ing it makes it (sort of) local to the function. Most (>80% of
all) changes in *.optimized were in the one source file that uses this module
...
Richard, would this fit the bill?
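Since most of the *.optimized differences are renumbered temporaries, it can help to canonicalize those names before diffing. The Python sketch below is one way to do that, assuming numbered names of the form `temp.N` and `D.N` (optionally with an `_M` suffix) as they appear in the lines quoted above; `canonicalize` and `meaningful_diff` are hypothetical helper names.

```python
import difflib
import re

# Numbered compiler temporaries such as temp.1140, D.3410 or D.3410_6.
_NUMBERED = re.compile(r"\b(temp|D)\.\d+(?:_\d+)?")


def canonicalize(dump):
    """Rename numbered temporaries in order of first appearance."""
    mapping = {}

    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = "%s.#%d" % (match.group(1), len(mapping))
        return mapping[name]

    return _NUMBERED.sub(rename, dump)


def meaningful_diff(old_dump, new_dump):
    """Return the -/+ lines that still differ after canonicalization."""
    old = canonicalize(old_dump).splitlines()
    new = canonicalize(new_dump).splitlines()
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes += ["-" + line for line in old[i1:i2]]
            changes += ["+" + line for line in new[j1:j2]]
    return changes


print(meaningful_diff("temp.1140 = tdata->p;\n", "temp.1137 = tdata->p;\n"))
```

Renaming by first appearance keeps distinct temporaries distinct. If one revision introduces an extra temporary early in a file, later names shift and may still show up as differences, so the result is a filter rather than a proof of equivalence.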
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (6 preceding siblings ...)
2008-07-03 11:42 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 11:50 ` rguenther at suse dot de
2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
` (16 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 11:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from rguenther at suse dot de 2008-07-03 11:49 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------- Comment #7 from dfranke at gcc dot gnu dot org 2008-07-03 11:42 -------
> Differences in *.optimized dumps seem to be variable renaming only, e.g.
> - temp.1140 = tdata->p;
> + temp.1137 = tdata->p;
>
> Differences in *.alias seem to be mostly of the kind:
> -PARM_NOALIAS.895 = &ANYTHING
> +PARM_NOALIAS.895 = &ESCAPED
>
> However, I also found:
> -PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
> -D.3410_6 = same as thread_data
> -thread_data = { ANYTHING }
> -D.3411_7 = same as thread_data
> +PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
> +D.3410_6 = same as derefaddrtmp.893
> +thread_data = same as derefaddrtmp.893 <---- ?!?
> +D.3411_7 = same as derefaddrtmp.893
Well, it's not really different - only thread_data now doesn't
point to anything but to the same as derefaddrtmp.893 (I guess
that points to { ESCAPED NONLOCAL }?).
> which might hint at the problem: the variable thread_data is an allocated
> POINTER to a vector of structures (which hold more pointers), public within a
> module. USE-ing it makes it (sort of) local to the function. Most (>80% of
> all) changes in *.optimized were in the one source file that uses this module
> ...
>
> Richard, would this fit the bill?
Well, if there is no difference in the optimized dumps then this
doesn't explain it (but the points-to differences also look good).
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (7 preceding siblings ...)
2008-07-03 11:50 ` rguenther at suse dot de
@ 2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
2008-07-03 12:04 ` rguenther at suse dot de
` (15 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 12:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from dfranke at gcc dot gnu dot org 2008-07-03 11:59 -------
> Well, it's not really different - only thread_data now doesn't
> point to anything but to the same as derefaddrtmp.893 (I guess
> that points to { ESCAPED NONLOCAL }?).
Just ESCAPED, see below.
r137251
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
D.3410_6 = same as thread_data
thread_data = { ANYTHING }
D.3411_7 = same as thread_data
D.3417_13 = { ANYTHING }
r137252
derefaddrtmp.893 = { ESCAPED }
derefaddrtmp.894 = { NONLOCAL }
this = { PARM_NOALIAS.895 }
PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
D.3410_6 = same as derefaddrtmp.893
thread_data = same as derefaddrtmp.893
D.3411_7 = same as derefaddrtmp.893
> Well, if there is no difference in the optimized dumps then this
> doesn't explain it (but the points-to differences also look good)
None that I could spot by scrolling through 1000+ lines of differences.
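For comparing points-to sets like the two listings above across revisions, a small script can resolve the "same as" links and report only the variables whose final set changed. This is a sketch under the assumption that the lines have exactly the two shapes quoted here (`name = { ... }` and `name = same as other`); the function names are hypothetical and the real -fdump-tree-alias output contains much more than this.

```python
import re


def parse_points_to(dump):
    """Map each variable to its points-to set, following 'same as' links."""
    raw = {}
    for line in dump.splitlines():
        line = line.strip()
        m = re.match(r"(\S+) = \{(.*)\}$", line)
        if m:
            raw[m.group(1)] = frozenset(m.group(2).split())
            continue
        m = re.match(r"(\S+) = same as (\S+)$", line)
        if m:
            raw[m.group(1)] = m.group(2)  # link to another variable

    def resolve(name, seen=()):
        value = raw.get(name)
        if isinstance(value, str):
            if name in seen:  # guard against cyclic links
                return frozenset()
            return resolve(value, seen + (name,))
        return value if value is not None else frozenset()

    return {name: resolve(name) for name in raw}


def changed_sets(old_dump, new_dump):
    """Return {name: (old_set, new_set)} for variables present in both."""
    old, new = parse_points_to(old_dump), parse_points_to(new_dump)
    return {n: (old[n], new[n]) for n in old.keys() & new.keys() if old[n] != new[n]}


r137251 = """
derefaddrtmp.893 = { ESCAPED }
PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
D.3410_6 = same as thread_data
thread_data = { ANYTHING }
"""
r137252 = """
derefaddrtmp.893 = { ESCAPED }
PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
D.3410_6 = same as derefaddrtmp.893
thread_data = same as derefaddrtmp.893
"""
for name, (before, after) in sorted(changed_sets(r137251, r137252).items()):
    print(name, sorted(before), "->", sorted(after))
```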
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (8 preceding siblings ...)
2008-07-03 12:00 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 12:04 ` rguenther at suse dot de
2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
` (14 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 12:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rguenther at suse dot de 2008-07-03 12:03 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
>
>
> ------- Comment #9 from dfranke at gcc dot gnu dot org 2008-07-03 11:59 -------
>> Well, it's not really different - only thread_data now doesn't
>> point to anything but to the same as derefaddrtmp.893 (I guess
>> that points to { ESCAPED NONLOCAL }?).
>
> Just ESCAPED, see below.
>
> r137251
> derefaddrtmp.893 = { ESCAPED }
> derefaddrtmp.894 = { NONLOCAL }
> this = { PARM_NOALIAS.895 }
> PARM_NOALIAS.895 = { ANYTHING ESCAPED NONLOCAL }
> D.3410_6 = same as thread_data
> thread_data = { ANYTHING }
> D.3411_7 = same as thread_data
> D.3417_13 = { ANYTHING }
>
> r137252
> derefaddrtmp.893 = { ESCAPED }
> derefaddrtmp.894 = { NONLOCAL }
> this = { PARM_NOALIAS.895 }
> PARM_NOALIAS.895 = { ESCAPED NONLOCAL }
> D.3410_6 = same as derefaddrtmp.893
> thread_data = same as derefaddrtmp.893
> D.3411_7 = same as derefaddrtmp.893
That looks correct.
>> Well, if there is no difference in the optimized dumps then this
>> doesn't explain it (but the points-to differences also look good)
>
> None that I could spot by scrolling through 1000+ lines of differences.
Yeah. I guess the assembly is different, right? ;)
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (9 preceding siblings ...)
2008-07-03 12:04 ` rguenther at suse dot de
@ 2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
2008-07-03 12:55 ` rguenther at suse dot de
` (13 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 12:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from dfranke at gcc dot gnu dot org 2008-07-03 12:45 -------
> I guess the assembly is different, right?
Yes, it is. Comparing *.s files from FCFLAGS="-O2 -save-temps", there are
differences in one file - though not the one suspected in comment #7.
There is one difference in the .optimized file that I could easily identify:
- D.1935 = dummy_atom_model_inertia_tensor (this, phase);
- __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
+ D.1935 = dummy_atom_model_inertia_tensor (this, phase) [return slot optimization];
+ __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935) [tail call];
And there are plenty in *.alias, e.g.:
Aliased symbols
-D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 8, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 2, call clobbered (passed to call)
+D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 0, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 0
The difference file has 3000+ lines here, one could argue that the dumps are
significantly different ;)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (10 preceding siblings ...)
2008-07-03 12:46 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 12:55 ` rguenther at suse dot de
2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
` (12 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 12:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from rguenther at suse dot de 2008-07-03 12:54 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------- Comment #11 from dfranke at gcc dot gnu dot org 2008-07-03 12:45 -------
>> I guess the assembly is different, right?
>
> Yes, it is. Comparing *.s files from FCFLAGS="-O2 -save-temps", there are
> differences in one file - though not the one suspected in comment #7.
>
> There is one difference in the .optimized file that I could easily identify:
> - D.1935 = dummy_atom_model_inertia_tensor (this, phase);
> - __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
> + D.1935 = dummy_atom_model_inertia_tensor (this, phase) [return slot optimization];
> + __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935) [tail call];
Ok, that is an interesting difference. Can you try if
-fno-optimize-sibling-calls fixes it (that should remove the [tail call]).
Unfortunately there is no option to disable the return slot optimization,
so you have to disable that in the source (tree-nrv.c, just return early
from execute_return_slot_opt).
Obviously the above may be a valid extra optimization ...
Richard.
>
> And there are plenty in *.alias, e.g.:
> Aliased symbols
>
> -D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 8, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 2, call clobbered (passed to call)
> +D.1935, UID D.1935, struct inertia_tensor, is addressable, score: 0, direct reads: 0, direct writes: 0, indirect reads: 0, indirect writes: 0
>
> The difference file has 3000+ lines here, one could argue that the dumps are
> significantly different ;)
>
>
>
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (11 preceding siblings ...)
2008-07-03 12:55 ` rguenther at suse dot de
@ 2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
2008-07-03 13:11 ` rguenther at suse dot de
` (11 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 13:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from dfranke at gcc dot gnu dot org 2008-07-03 13:00 -------
> Can you try if -fno-optimize-sibling-calls fixes it
> (that should remove the [tail call]).
FCFLAGS="-O2 -fno-optimize-sibling-calls" - works
FCFLAGS="-O1 -foptimize-sibling-calls" - broken
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (12 preceding siblings ...)
2008-07-03 13:01 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 13:11 ` rguenther at suse dot de
2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
` (10 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 13:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from rguenther at suse dot de 2008-07-03 13:10 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------- Comment #13 from dfranke at gcc dot gnu dot org 2008-07-03 13:00 -------
>> Can you try if -fno-optimize-sibling-calls fixes it
>> (that should remove the [tail call]).
>
> FCFLAGS="-O2 -fno-optimize-sibling-calls" - works
> FCFLAGS="-O1 -foptimize-sibling-calls" - broken
Ok. Probably some latent bug got uncovered. Your challenge is now
to produce a testcase from your source ;) Basically isolate this
function somehow, but make sure the [tail call] and the
[return slot optimization] are still there.
Richard.
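While reducing the source, the check asked for above can be automated by scanning the -fdump-tree-optimized output for the two annotations. The Python sketch below assumes they appear textually as `[tail call]` and `[return slot optimization]`, as in the dump lines quoted in comment #11; `call_annotations` is a hypothetical helper name.

```python
import re
from collections import Counter

_ANNOTATION = re.compile(r"\[(return slot optimization|tail call)\]")


def call_annotations(dump):
    """Count the call annotations found in an optimized-dump excerpt."""
    # Collapse whitespace first so annotations wrapped across lines
    # (as happens in mailed excerpts) are still recognized.
    flat = " ".join(dump.split())
    return Counter(_ANNOTATION.findall(flat))


excerpt = """
D.1935 = dummy_atom_model_inertia_tensor (this, phase) [return slot
optimization];
__result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935) [tail
call];
"""
print(call_annotations(excerpt))
```

A reduction step would be kept only if both counts stay non-zero for the function of interest.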
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (13 preceding siblings ...)
2008-07-03 13:11 ` rguenther at suse dot de
@ 2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
2008-07-03 19:27 ` rguenther at suse dot de
` (9 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: dfranke at gcc dot gnu dot org @ 2008-07-03 19:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from dfranke at gcc dot gnu dot org 2008-07-03 18:59 -------
> Your challenge is now to produce a testcase from your source ;)
I do have a testcase that shows said flags. The difference in assembler between
-foptimize-sibling-calls and -fno-optimize-sibling-calls is:
--- without.s 2008-07-03 20:51:59.000000000 +0200
+++ with.s 2008-07-03 20:52:26.000000000 +0200
@@ -64,10 +64,9 @@
movq %rsp, %rdi
call dummy_atom_model_inertia_tensor_
movq %rsp, %rdi
- call inertia_tensor_to_rg_
addq $80, %rsp
popq %rbx
- ret
+ jmp inertia_tensor_to_rg_
.LFE2:
.size dummy_atom_model_radius_of_gyration_, .-dummy_atom_model_radius_of_gyration_
.section .eh_frame,"a",@progbits
Is this correct? Running the code compiled with -foptimize-sibling-calls seems
to yield the correct result. Not sure if I need to investigate further.
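One thing the diff does show: in with.s, %rdi is loaded from %rsp, the frame is then released with addq, and control jumps to inertia_tensor_to_rg_ with %rdi still holding the old stack address. Whether that is the actual miscompilation is not established in this thread. As a rough aid for spotting the same shape in larger .s files, here is a Python heuristic; it only recognizes the exact instruction pattern quoted above, and `stack_pointer_tail_jumps` is a hypothetical name.

```python
import re


def stack_pointer_tail_jumps(asm):
    """List jmp targets reached with %rdi = old %rsp after the frame is popped."""
    hits = []
    rdi_is_frame_ptr = False  # %rdi currently holds a copy of %rsp
    frame_released = False    # %rsp was raised after that copy was taken
    for raw in asm.splitlines():
        ins = raw.strip()
        if re.match(r"movq\s+%rsp,\s*%rdi$", ins):
            rdi_is_frame_ptr, frame_released = True, False
        elif re.match(r"call\b", ins) or re.search(r",\s*%rdi$", ins):
            # A call may clobber %rdi; any other write replaces its value.
            rdi_is_frame_ptr = False
        elif re.match(r"addq\s+\$\d+,\s*%rsp$", ins):
            frame_released = rdi_is_frame_ptr
        elif rdi_is_frame_ptr and frame_released:
            # Only jumps to named symbols count; local labels (.L*) are skipped.
            m = re.match(r"jmp\s+([A-Za-z_][\w.$]*)$", ins)
            if m:
                hits.append(m.group(1))
    return hits


with_s = """
movq %rsp, %rdi
call dummy_atom_model_inertia_tensor_
movq %rsp, %rdi
addq $80, %rsp
popq %rbx
jmp inertia_tensor_to_rg_
"""
print(stack_pointer_tail_jumps(with_s))
```

Other ways of forming a stack address (for example leaq with an offset) are not covered, so an empty result proves nothing.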
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (14 preceding siblings ...)
2008-07-03 19:00 ` dfranke at gcc dot gnu dot org
@ 2008-07-03 19:27 ` rguenther at suse dot de
2008-07-03 19:34 ` franke dot daniel at gmail dot com
` (8 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 19:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from rguenther at suse dot de 2008-07-03 19:26 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, dfranke at gcc dot gnu dot org wrote:
> ------- Comment #15 from dfranke at gcc dot gnu dot org 2008-07-03 18:59 -------
>> Your challenge is now to produce a testcase from your source ;)
>
> I do have a testcase that shows said flags. The difference in assembler between
> -foptimize-sibling-calls and -fno-optimize-sibling-calls is:
>
> --- without.s 2008-07-03 20:51:59.000000000 +0200
> +++ with.s 2008-07-03 20:52:26.000000000 +0200
> @@ -64,10 +64,9 @@
> movq %rsp, %rdi
> call dummy_atom_model_inertia_tensor_
> movq %rsp, %rdi
> - call inertia_tensor_to_rg_
> addq $80, %rsp
> popq %rbx
> - ret
> + jmp inertia_tensor_to_rg_
> .LFE2:
> .size dummy_atom_model_radius_of_gyration_, .-dummy_atom_model_radius_of_gyration_
> .section .eh_frame,"a",@progbits
>
> Is this correct? Running the code compiled with -foptimize-sibling-calls seems
> to yield the correct result. Not sure if I need to investigate further.
Well, the circumstances where the miscompilation occurs may be tricky.
So you say you have a testcase but that doesn't show wrong behavior?
That's unfortunate ... :/
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (15 preceding siblings ...)
2008-07-03 19:27 ` rguenther at suse dot de
@ 2008-07-03 19:34 ` franke dot daniel at gmail dot com
2008-07-03 19:45 ` rguenther at suse dot de
` (7 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: franke dot daniel at gmail dot com @ 2008-07-03 19:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from franke dot daniel at gmail dot com 2008-07-03 19:33 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization on
x86_64-unknown-linux-gnu
> Well, the circumstances where the miscompilation occurs may be tricky.
> So you say you have a testcase but that doesn't show wrong behavior?
> That's unfortunate ... :/
There was this small glimmer of hope that you could tell from the assembly
difference that something's horribly, horribly wrong and you know perfectly
well how to fix it. No? :)
A properly failing testcase might take a while ...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (16 preceding siblings ...)
2008-07-03 19:34 ` franke dot daniel at gmail dot com
@ 2008-07-03 19:45 ` rguenther at suse dot de
2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
` (6 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenther at suse dot de @ 2008-07-03 19:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from rguenther at suse dot de 2008-07-03 19:44 -------
Subject: Re: [4.4 regression] r137252 breaks -O2 optimization
on x86_64-unknown-linux-gnu
On Thu, 3 Jul 2008, franke dot daniel at gmail dot com wrote:
> ------- Comment #17 from franke dot daniel at gmail dot com 2008-07-03 19:33 -------
> Subject: Re: [4.4 regression] r137252 breaks -O2 optimization on
> x86_64-unknown-linux-gnu
>
>
>> Well, the circumstances where the miscompilation occurs may be tricky.
>> So you say you have a testcase but that doesn't show wrong behavior?
>> That's unfortunate ... :/
>
> There was this small glimmer of hope that you could tell from the assembly
> difference that something's horribly, horribly wrong and you know perfectly
> well how to fix it. No? :)
No, sorry - this is just what a sibcall looks like ;)
> A properly failing testcase might take a while ...
Understood ...
Richard.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (17 preceding siblings ...)
2008-07-03 19:45 ` rguenther at suse dot de
@ 2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
` (5 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:07 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.4.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (18 preceding siblings ...)
2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
@ 2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
2008-07-04 15:47 ` rguenth at gcc dot gnu dot org
` (4 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from rguenth at gcc dot gnu dot org 2008-07-04 15:06 -------
Created an attachment (id=15854)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15854&action=view)
patch
I found some more latent bugs. Can you try the attached patch?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
2008-07-03 9:09 [Bug target/36713] New: [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu dfranke at gcc dot gnu dot org
` (19 preceding siblings ...)
2008-07-04 15:07 ` rguenth at gcc dot gnu dot org
@ 2008-07-04 15:47 ` rguenth at gcc dot gnu dot org
2008-07-04 21:36 ` dfranke at gcc dot gnu dot org
` (3 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-07-04 15:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from rguenth at gcc dot gnu dot org 2008-07-04 15:46 -------
Created an attachment (id=15855)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15855&action=view)
more fixes
Some more fixes (includes the first patch)
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #15854|0                           |1
        is obsolete|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: dfranke at gcc dot gnu dot org @ 2008-07-04 21:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from dfranke at gcc dot gnu dot org 2008-07-04 21:36 -------
Based on:
  Last Changed Rev: 137437
  Last Changed Date: 2008-07-04 00:02:18 +0200 (Fri, 04 Jul 2008)
Without the patch: fails; with the patch: works.
Please note that the optimized dump of the function in question once again
looks like:
  <bb 2>:
    D.1935 = dummy_atom_model_inertia_tensor (this, phase);
    __result_dummy_atom_model_rad.21 = inertia_tensor_to_rg (&D.1935);
    return __result_dummy_atom_model_rad.21;
i.e., with the patch it lost the [return slot optimization]/[sibling call] shown
in comment #11.
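For readers not used to these dumps, here is a rough C analogue of the shape
of the code above. It is only a sketch: the type and function names are made
up for illustration (the real code is Fortran), and it says nothing about what
the patch itself changes. The first call returns an aggregate by value into a
temporary (the D.1935 of the dump); the second call is handed the address of
that temporary. Return slot optimization would let the first callee build its
result directly in the caller's storage, and a sibling call would let the last
call reuse the caller's frame, so both are only safe if the storage involved
cannot be reached by the callee in some other way.

```c
#include <stddef.h>

/* Hypothetical stand-in for the derived type that is returned by value. */
typedef struct { double m[3][3]; } tensor;

/* Hypothetical analogue of dummy_atom_model_inertia_tensor:
   returns an aggregate by value (here simply scale * identity). */
static tensor make_tensor(double scale)
{
    tensor t;
    for (size_t i = 0; i < 3; i++)
        for (size_t j = 0; j < 3; j++)
            t.m[i][j] = (i == j) ? scale : 0.0;
    return t;
}

/* Hypothetical analogue of inertia_tensor_to_rg:
   reads the aggregate through a pointer (here it sums the diagonal). */
static double reduce_tensor(const tensor *t)
{
    return t->m[0][0] + t->m[1][1] + t->m[2][2];
}

/* Same shape as <bb 2> in the dump. */
double rad(double scale)
{
    tensor tmp = make_tensor(scale);      /* D.1935 = ... (this, phase); */
    double result = reduce_tensor(&tmp);  /* result = ... (&D.1935);     */
    return result;
}
```

Because the address of the local temporary is passed to the second call, that
call must not be emitted as a sibling call: the temporary lives in the frame
the sibling call would reuse.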
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-06 13:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from rguenth at gcc dot gnu dot org 2008-07-06 13:29 -------
Mine.
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-07-06 13:29:20
               date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-07 15:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from rguenth at gcc dot gnu dot org 2008-07-07 15:12 -------
Subject: Bug 36713
Author: rguenth
Date: Mon Jul 7 15:11:29 2008
New Revision: 137571
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137571
Log:
2008-07-07 Richard Guenther <rguenther@suse.de>
        PR tree-optimization/36713
        * tree-flow-inline.h (is_call_used): New function.
        * tree-nrv.c (dest_safe_for_nrv_p): Use it.
        * tree-tailcall.c (suitable_for_tail_opt_p): Likewise.
        * tree-outof-ssa.c (create_temp): Set call-used flag if required.
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-flow-inline.h
    trunk/gcc/tree-nrv.c
    trunk/gcc/tree-outof-ssa.c
    trunk/gcc/tree-tailcall.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713
* [Bug target/36713] [4.4 regression] r137252 breaks -O2 optimization on x86_64-unknown-linux-gnu
From: rguenth at gcc dot gnu dot org @ 2008-07-07 15:17 UTC (permalink / raw)
To: gcc-bugs
------- Comment #24 from rguenth at gcc dot gnu dot org 2008-07-07 15:16 -------
Fixed.
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36713