public inbox for gcc-help@gcc.gnu.org
* Difference between -O3 and -O2 with the -f options -O3 adds
From: Matthias Kretz @ 2009-07-29 15:43 UTC
  To: gcc-help

Hi,

I just updated to 4.4.1 and now one of my unit tests fails when it is compiled 
with -O3.
OTOH when I compile with -O2 -finline-functions -funswitch-loops
-ftree-vectorize -fpredictive-commoning -fgcse-after-reload -fipa-cp-clone the
resulting binary does not fail.

I looked at the disassembly with objdump -dwC, and I can see that the -O3
binary at least has some loop unrolling that is not in the other binary.

The gcc manual says that -O3 is the same as -O2 plus -finline-functions
-funswitch-loops -ftree-vectorize -fpredictive-commoning -fgcse-after-reload.

Comparing gcc -c -Q -O3 --help=optimizers with the same command at -O2 shows
that -fipa-cp-clone also belongs in that list. But the result is still not the
same.
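
For readers who want to repeat this comparison, here is a minimal shell
sketch. It assumes the two -Q --help=optimizers dumps have been saved to files
first. The function name changed_flags and the file names O2.txt and O3.txt
are made up for illustration, and the sketch assumes no line is repeated
within a single dump.

```shell
# changed_flags FILE_A FILE_B
# Print the -f option lines that appear in only one of two saved
# `gcc -Q --help=optimizers` dumps, i.e. the flags whose state differs.
# (Hypothetical helper, not part of GCC.)
changed_flags() {
    # Keep only the option lines, then drop every line that occurs twice
    # (once per dump); whatever is left differs between the two dumps.
    cat "$1" "$2" | grep -E '^[[:space:]]*-f' | sort | uniq -u
}

# Usage, with the commands from the message above:
#   gcc -c -Q -O2 --help=optimizers > O2.txt
#   gcc -c -Q -O3 --help=optimizers > O3.txt
#   changed_flags O2.txt O3.txt
```

An empty result only means the reported flag states match; it says nothing
about code paths that test the optimization level directly.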

Any ideas how to debug this regression further?

Regards,
	Matthias

* Re: Difference between -O3 and -O2 with the -f options -O3 adds
From: John (Eljay) Love-Jensen @ 2009-07-29 15:52 UTC
  To: Matthias Kretz, GCC-help

Hi Matthias,

> Any ideas how to debug this regression further?

Are there any interesting deltas between these two outputs (O2.s and O3.s)?

touch Empty.cpp
gcc -O2 -fverbose-asm -S Empty.cpp -o O2.s
gcc -O3 -fverbose-asm -S Empty.cpp -o O3.s

HTH,
--Eljay

* Re: Difference between -O3 and -O2 with the -f options -O3 adds
From: Michael Meissner @ 2009-07-29 16:57 UTC
  To: Matthias Kretz; +Cc: gcc-help

On Wed, Jul 29, 2009 at 05:43:25PM +0200, Matthias Kretz wrote:
> Hi,
> 
> I just updated to 4.4.1 and now one of my unit tests fails when it is compiled 
> with -O3.
> OTOH when I compile with -O2 -finline-functions -funswitch-loops -ftree-
> vectorize -fpredictive-commoning -fgcse-after-reload -fipa-cp-clone the 
> resulting binary does not fail.
> 
> I looked at the Assembly with objdump -dwC and I can see at least that the -O3 
> compiled binary has some loop unrolling done that's not in the other binary.
> 
> The gcc manual says that -O3 is the same as -O2 plus -finline-functions -
> funswitch-loops -ftree-vectorize -fpredictive-commoning -fgcse-after-reload
> 
> gcc -c -Q -O3 --help=optimizers compared to the same with -O2 shows that also 
> -fipa-cp-clone goes in that list. But still the result is not the same.
> 
> Any ideas how to debug this regression further?
> 
> Regards,
> 	Matthias

This is likely due to two places in the compiler that look at the optimization
level as a value, instead of just at the component switches.

The first place is in tree-ssa-pre.c, which sets the boolean do_partial_partial
at -O3; there is no separate switch to control this:

	/* Main entry point to the SSA-PRE pass.  DO_FRE is true if the caller
	   only wants to do full redundancy elimination.  */

	static unsigned int
	execute_pre (bool do_fre ATTRIBUTE_UNUSED)
	{
	  unsigned int todo = 0;

	  do_partial_partial = optimize > 2;

	  /* This has to happen before SCCVN runs because
	     loop_optimizer_init may create new phis, etc.  */
	  if (!do_fre)
	    loop_optimizer_init (LOOPS_NORMAL);

	  if (!run_scc_vn (do_fre))
	    {
	      if (!do_fre)
		{
		  remove_dead_inserted_code ();
		  loop_optimizer_finalize ();
		}

	      return 0;
	    }
	  init_pre (do_fre);
	  /* ... */

In opts.c, -O3 also adjusts two parameter values in addition to enabling the -f
options:


	  /* Allow even more virtual operators.  Max-aliased-vops was set above for
	     -O2, so don't reset it unless we are at -O3.  */
	  if (opt3)
	    set_param_value ("max-aliased-vops", 1000);

	  set_param_value ("avg-aliased-vops", (opt3) ? 3 : initial_avg_aliased_vops);

For the second set, you can set those parameters yourself to see whether they
make any difference.  For the first case, you would need to debug the compiler
and/or change the source to try it out.
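
As a rough sketch of that second experiment: the flag list is the one given in
the original message, and the two --param names and values are the ones in the
opts.c excerpt above. The function name o3_like_flags and the file name
test.cpp are placeholders, and releases other than the 4.4 series discussed
here may not accept these parameters.

```shell
# o3_like_flags
# Print, one per line, -O2 plus the -f options this thread identifies as
# the -O2/-O3 difference on GCC 4.4.1, followed by the two --param
# settings that opts.c applies at -O3. (Hypothetical helper.)
o3_like_flags() {
    printf '%s\n' \
        -O2 \
        -finline-functions \
        -funswitch-loops \
        -ftree-vectorize \
        -fpredictive-commoning \
        -fgcse-after-reload \
        -fipa-cp-clone \
        --param max-aliased-vops=1000 \
        --param avg-aliased-vops=3
}

# Usage (test.cpp is a placeholder file name):
#   gcc $(o3_like_flags) -c test.cpp -o test.o
```

If a binary built this way behaves like the -O2 build rather than the -O3
build, the parameters alone are probably not the cause.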

-- 
Michael Meissner, IBM
4 Technology Place Drive, MS 2203A, Westford, MA, 01886, USA
meissner@linux.vnet.ibm.com

* Re: Difference between -O3 and -O2 with the -f options -O3 adds
From: Matthias Kretz @ 2009-07-30  7:39 UTC
  To: gcc-help

On Wednesday 29 July 2009 17:52:27 John (Eljay) Love-Jensen wrote:
> Any interesting deltas between these two outputs (O2.s and O3.s):
>
> touch Empty.cpp
> gcc -O2 -fverbose-asm -S Empty.cpp -o O2.s
> gcc -O3 -fverbose-asm -S Empty.cpp -o O3.s

That gives me the same list of differences that I had before:

-finline-functions -fgcse-after-reload -fipa-cp-clone -fpredictive-commoning
-ftree-vectorize -funswitch-loops

I also looked at the diff of compiling my code with s/-c/-S -fverbose-asm/ and 
there's no difference (I'm using the -O2 and -f... switches).

Regards,
	Matthias

* Re: Difference between -O3 and -O2 with the -f options -O3 adds
From: Matthias Kretz @ 2009-07-30  8:29 UTC
  To: Michael Meissner, gcc-help

On Wednesday 29 July 2009 18:57:30 Michael Meissner wrote:
> On Wed, Jul 29, 2009 at 05:43:25PM +0200, Matthias Kretz wrote:
> This is likely due to two places in the compiler that look at the optimization
> level as a value, instead of just at the component switches.

There seem to be more places than what you pointed out...

> The first place is in tree-ssa-pre.c, which sets the boolean
> do_partial_partial if -O3 and there is no extra switch to control this::

I changed
> 	  do_partial_partial = optimize > 2;
to
	  do_partial_partial = optimize > 3;
compiled with -O3 and the problem remains.

> In opts.c, two parameter values are adjusted if -O3 in addition to the -f
> options:
>
>
> 	  /* Allow even more virtual operators.  Max-aliased-vops was set above for
> 	     -O2, so don't reset it unless we are at -O3.  */
> 	  if (opt3)
> 	    set_param_value ("max-aliased-vops", 1000);
>
> 	  set_param_value ("avg-aliased-vops", (opt3) ? 3 : initial_avg_aliased_vops);

I used -O2 -f... --param max-aliased-vops=1000 --param avg-aliased-vops=3 and 
it didn't fail.

Now looking at the other places grep shows:

tree-ssa-loop.c:509: changed >= 3 to > 3: still fails.
tree-ssa-loop.c:550: changed >= 3 to > 3: still fails.
tree-ssa-loop-niter.c:1831: changed >= 3 to > 3: still fails.
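
A search of this kind can be sketched as follows. The regular expression and
the function name find_level_checks are assumptions, not the command actually
used for this message; the pattern only catches direct comparisons of
`optimize` against a numeric literal.

```shell
# find_level_checks DIR
# List places under DIR where source code compares the variable `optimize`
# against a number, e.g. "optimize >= 3" or "optimize > 2".
# (Hypothetical helper; the pattern is a guess at a useful search.)
find_level_checks() {
    # The leading group keeps identifiers such as flag_optimize from matching.
    grep -rnE '(^|[^[:alnum:]_])optimize[[:space:]]*[<>]=?[[:space:]]*[0-9]' "$1"
}

# Usage from the top of a GCC source tree:
#   find_level_checks gcc
```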

So is it a combination?

Commented out the set_param_value calls in opts.c and changed all optimize >=
3 to > 3 (i.e. not the do_partial_partial test): test passes.

Only changed all optimize >= 3 to > 3: test passes.

Only changed tree-ssa-loop.c lines 509 and 550 (both together): test passes.

So the result is that if tree-ssa-loop.c calls tree_unroll_loops_completely 
with may_increase_size = true then my test fails.

Does that ring a bell somewhere - any more tips for debugging? Or should I try 
to reduce a testcase next?

Regards,
	Matthias

* Re: Difference between -O3 and -O2 with the -f options -O3 adds
From: Matthias Kretz @ 2009-07-31  9:26 UTC
  To: gcc-help

On Thursday 30 July 2009 10:28:58 Matthias Kretz wrote:
> Does that ring a bell somewhere - any more tips for debugging? Or should I
> try to reduce a testcase next?

I just opened http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40924 for this.

Regards,
	Matthias
