* Difference between -O3 and -O2 with the -f options -O3 adds
@ 2009-07-29 15:43 Matthias Kretz
2009-07-29 15:52 ` John (Eljay) Love-Jensen
2009-07-29 16:57 ` Michael Meissner
0 siblings, 2 replies; 6+ messages in thread
From: Matthias Kretz @ 2009-07-29 15:43 UTC (permalink / raw)
To: gcc-help
Hi,
I just updated to 4.4.1 and now one of my unit tests fails when it is compiled
with -O3.
OTOH when I compile with -O2 -finline-functions -funswitch-loops
-ftree-vectorize -fpredictive-commoning -fgcse-after-reload -fipa-cp-clone the
resulting binary does not fail.
I looked at the assembly with objdump -dwC and I can see at least that the -O3
compiled binary has some loop unrolling done that's not in the other binary.
The gcc manual says that -O3 is the same as -O2 plus -finline-functions
-funswitch-loops -ftree-vectorize -fpredictive-commoning -fgcse-after-reload.
Comparing the output of gcc -c -Q -O3 --help=optimizers with the same for -O2
shows that -fipa-cp-clone also belongs in that list. But still the result is
not the same.
Any ideas how to debug this regression further?
Regards,
Matthias
* Re: Difference between -O3 and -O2 with the -f options -O3 adds
2009-07-29 15:43 Difference between -O3 and -O2 with the -f options -O3 adds Matthias Kretz
@ 2009-07-29 15:52 ` John (Eljay) Love-Jensen
2009-07-30 7:39 ` Matthias Kretz
2009-07-29 16:57 ` Michael Meissner
1 sibling, 1 reply; 6+ messages in thread
From: John (Eljay) Love-Jensen @ 2009-07-29 15:52 UTC (permalink / raw)
To: Matthias Kretz, GCC-help
Hi Matthias,
> Any ideas how to debug this regression further?
Any interesting deltas between these two outputs (O2.s and O3.s):
touch Empty.cpp
gcc -O2 -fverbose-asm -S Empty.cpp -o O2.s
gcc -O3 -fverbose-asm -S Empty.cpp -o O3.s
HTH,
--Eljay
* Re: Difference between -O3 and -O2 with the -f options -O3 adds
2009-07-29 15:43 Difference between -O3 and -O2 with the -f options -O3 adds Matthias Kretz
2009-07-29 15:52 ` John (Eljay) Love-Jensen
@ 2009-07-29 16:57 ` Michael Meissner
2009-07-30 8:29 ` Matthias Kretz
1 sibling, 1 reply; 6+ messages in thread
From: Michael Meissner @ 2009-07-29 16:57 UTC (permalink / raw)
To: Matthias Kretz; +Cc: gcc-help
On Wed, Jul 29, 2009 at 05:43:25PM +0200, Matthias Kretz wrote:
> Hi,
>
> I just updated to 4.4.1 and now one of my unit tests fails when it is compiled
> with -O3.
> OTOH when I compile with -O2 -finline-functions -funswitch-loops
> -ftree-vectorize -fpredictive-commoning -fgcse-after-reload -fipa-cp-clone the
> resulting binary does not fail.
>
> I looked at the Assembly with objdump -dwC and I can see at least that the -O3
> compiled binary has some loop unrolling done that's not in the other binary.
>
> The gcc manual says that -O3 is the same as -O2 plus -finline-functions
> -funswitch-loops -ftree-vectorize -fpredictive-commoning -fgcse-after-reload
>
> gcc -c -Q -O3 --help=optimizers compared to the same with -O2 shows that also
> -fipa-cp-clone goes in that list. But still the result is not the same.
>
> Any ideas how to debug this regression further?
>
> Regards,
> Matthias
This is likely due to two places in the compiler that look at the optimization
level as a value, instead of just at the component switches.
The first place is in tree-ssa-pre.c, which sets the boolean do_partial_partial
if -O3 is given; there is no separate switch to control this:
  /* Main entry point to the SSA-PRE pass.  DO_FRE is true if the caller
     only wants to do full redundancy elimination.  */
  static unsigned int
  execute_pre (bool do_fre ATTRIBUTE_UNUSED)
  {
    unsigned int todo = 0;
    do_partial_partial = optimize > 2;
    /* This has to happen before SCCVN runs because
       loop_optimizer_init may create new phis, etc.  */
    if (!do_fre)
      loop_optimizer_init (LOOPS_NORMAL);
    if (!run_scc_vn (do_fre))
      {
        if (!do_fre)
          {
            remove_dead_inserted_code ();
            loop_optimizer_finalize ();
          }
        return 0;
      }
    init_pre (do_fre);
    /* ... */
In opts.c, two parameter values are adjusted at -O3, in addition to the -f
options being enabled:
  /* Allow even more virtual operators.  Max-aliased-vops was set above for
     -O2, so don't reset it unless we are at -O3.  */
  if (opt3)
    set_param_value ("max-aliased-vops", 1000);
  set_param_value ("avg-aliased-vops", (opt3) ? 3 : initial_avg_aliased_vops);
For the second case, you can set those parameters yourself to see whether
they make any difference. For the first case, you would need to debug the
compiler and/or change the source to try it out.
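The distinction drawn above, between testing the numeric level and testing the individual switches, can be illustrated with a toy model. This is only a sketch: struct opt_state and every function name below are hypothetical and are not GCC's real data structures. Only the test "optimize > 2" is taken from the excerpt above.

```c
#include <stdbool.h>

/* Toy model of option state.  These names are hypothetical and are not
   GCC's real data structures.  */
struct opt_state {
    int  optimize;          /* numeric level given by -O<n> */
    bool inline_functions;  /* -finline-functions */
    bool unswitch_loops;    /* -funswitch-loops */
    bool tree_vectorize;    /* -ftree-vectorize */
};

/* State after a plain -O3: level 3, modelled flags on.  */
static struct opt_state state_for_O3(void)
{
    struct opt_state s = { 3, true, true, true };
    return s;
}

/* State after -O2 plus the individual -f switches: same flags, level 2.  */
static struct opt_state state_for_O2_plus_flags(void)
{
    struct opt_state s = { 2, true, true, true };
    return s;
}

/* Same shape as "do_partial_partial = optimize > 2": only the level
   is consulted, so no -f switch can turn this on.  */
static bool partial_partial_enabled(const struct opt_state *s)
{
    return s->optimize > 2;
}

/* True when both states agree on every modelled -f switch.  */
static bool same_flags(const struct opt_state *a, const struct opt_state *b)
{
    return a->inline_functions == b->inline_functions
        && a->unswitch_loops == b->unswitch_loops
        && a->tree_vectorize == b->tree_vectorize;
}
```

In this model the two states compare equal under same_flags, yet partial_partial_enabled is true only for the -O3 state. That is the kind of gap that a flag-by-flag comparison cannot reveal.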
--
Michael Meissner, IBM
4 Technology Place Drive, MS 2203A, Westford, MA, 01886, USA
meissner@linux.vnet.ibm.com
* Re: Difference between -O3 and -O2 with the -f options -O3 adds
2009-07-29 15:52 ` John (Eljay) Love-Jensen
@ 2009-07-30 7:39 ` Matthias Kretz
0 siblings, 0 replies; 6+ messages in thread
From: Matthias Kretz @ 2009-07-30 7:39 UTC (permalink / raw)
To: gcc-help
On Wednesday 29 July 2009 17:52:27 John (Eljay) Love-Jensen wrote:
> Any interesting deltas between these two outputs (O2.s and O3.s):
>
> touch Empty.cpp
> gcc -O2 -fverbose-asm -S Empty.cpp -o O2.s
> gcc -O3 -fverbose-asm -S Empty.cpp -o O3.s
That gives me the same list of differences that I had before:
-finline-functions -fgcse-after-reload -fipa-cp-clone -fpredictive-commoning -
ftree-vectorize -funswitch-loops
I also looked at the diff of compiling my code with s/-c/-S -fverbose-asm/ and
there's no difference (I'm using the -O2 and -f... switches).
Regards,
Matthias
* Re: Difference between -O3 and -O2 with the -f options -O3 adds
2009-07-29 16:57 ` Michael Meissner
@ 2009-07-30 8:29 ` Matthias Kretz
2009-07-31 9:26 ` Matthias Kretz
0 siblings, 1 reply; 6+ messages in thread
From: Matthias Kretz @ 2009-07-30 8:29 UTC (permalink / raw)
To: Michael Meissner, gcc-help
On Wednesday 29 July 2009 18:57:30 Michael Meissner wrote:
> On Wed, Jul 29, 2009 at 05:43:25PM +0200, Matthias Kretz wrote:
> This is likely due to two places in the compiler that look at the optimization
> level as a value, instead of just at the component switches.
There seem to be more places than what you pointed out...
> The first place is in tree-ssa-pre.c, which sets the boolean
> do_partial_partial if -O3 is given; there is no separate switch to control this:
I changed
> do_partial_partial = optimize > 2;
to
do_partial_partial = optimize > 3;
compiled with -O3 and the problem remains.
> In opts.c, two parameter values are adjusted if -O3 in addition to the -f
> options:
>
>
> /* Allow even more virtual operators. Max-aliased-vops was set above
> for -O2, so don't reset it unless we are at -O3. */
> if (opt3)
> set_param_value ("max-aliased-vops", 1000);
>
> set_param_value ("avg-aliased-vops", (opt3) ? 3 :
> initial_avg_aliased_vops);
I used -O2 -f... --param max-aliased-vops=1000 --param avg-aliased-vops=3 and
it didn't fail.
Now looking at the other places grep shows:
tree-ssa-loop.c:509: changed >= 3 to > 3: still fails.
tree-ssa-loop.c:550: changed >= 3 to > 3: still fails.
tree-ssa-loop-niter.c:1831: changed >= 3 to > 3: still fails.
So it's a combination?
Commented out the set_param_value calls in opts.c and changed all optimize >=
3 tests to > 3 (i.e. leaving the do_partial_partial test alone): test passes.
Only changed all optimize >= 3 tests to > 3: test passes.
Only changed tree-ssa-loop.c lines 509 and 550 together: test passes.
So the result is that if tree-ssa-loop.c calls tree_unroll_loops_completely
with may_increase_size = true then my test fails.
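A rough sketch of why the -f switches cannot reach this code path: the boolean is derived from the numeric level at the call site. The helper names, the size budget and the decision rule below are all assumptions made for illustration and are not GCC's actual heuristic. Only the "optimize >= 3" test and the parameter name may_increase_size come from the findings above.

```c
#include <stdbool.h>

/* Toy stand-in for the call site: the argument is derived from the
   numeric level, as in the "optimize >= 3" tests mentioned above.  */
static bool may_increase_size_for_level(int optimize)
{
    return optimize >= 3;
}

/* Hypothetical decision rule, NOT GCC's heuristic: a loop with a known
   trip count is fully unrolled if that does not grow the code, or if
   growth is permitted and the result stays within a made-up budget.  */
static bool unroll_completely(unsigned trip_count, unsigned body_insns,
                              bool may_increase_size)
{
    const unsigned budget = 64;  /* assumed limit, for illustration only */
    unsigned unrolled = trip_count * body_insns;

    if (unrolled <= body_insns)
        return true;             /* no growth: always acceptable */
    return may_increase_size && unrolled <= budget;
}
```

With these assumed numbers, unroll_completely(4, 8, may_increase_size_for_level(3)) is true while the same loop at level 2 is left alone, whatever -f switches were added on the command line.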
Does that ring a bell somewhere - any more tips for debugging? Or should I try
to reduce a testcase next?
Regards,
Matthias
* Re: Difference between -O3 and -O2 with the -f options -O3 adds
2009-07-30 8:29 ` Matthias Kretz
@ 2009-07-31 9:26 ` Matthias Kretz
0 siblings, 0 replies; 6+ messages in thread
From: Matthias Kretz @ 2009-07-31 9:26 UTC (permalink / raw)
To: gcc-help
On Thursday 30 July 2009 10:28:58 Matthias Kretz wrote:
> Does that ring a bell somewhere - any more tips for debugging? Or should I
> try to reduce a testcase next?
I just opened http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40924 for this.
Regards,
Matthias