public inbox for gcc@gcc.gnu.org
* Re: GCSE store motion
@ 2002-05-16  5:30 Robert Dewar
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Robert Dewar @ 2002-05-16  5:30 UTC (permalink / raw)
  To: dberlin, dewar, mark, roger; +Cc: aj, davem, gcc, rth

> That means we shouldn't be spending much time trying to do software
> loop pipelining when compiling GCC, so the optimization shouldn't
> make compiling the compiler significantly slower.

I don't see how you conclude this. You have to do the analysis on every
loop. There will definitely be loops in GCC where the optimization is
possible, there will be loops where it is not. I would expect the
compiler to spend quite a bit of time trying to improve code for
loops in GCC. What I am saying is that I doubt that the overall
effect will be that beneficial for GCC.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  5:30 GCSE store motion Robert Dewar
@ 2002-05-16  7:33 ` Jan Hubicka
  2002-05-16  8:04   ` Mark Mitchell
  2002-05-16  8:04   ` Geert Bosch
  2002-05-16  7:59 ` GCSE store motion Mark Mitchell
  2002-05-16  8:31 ` Daniel Berlin
  2 siblings, 2 replies; 58+ messages in thread
From: Jan Hubicka @ 2002-05-16  7:33 UTC (permalink / raw)
  To: Robert Dewar; +Cc: dberlin, mark, roger, aj, davem, gcc, rth

> > That means we shouldn't be spending much time trying to do software
> > loop pipelining when compiling GCC, so the optimization shouldn't
> > make compiling the compiler significantly slower.
> 
> I don't see how you conclude this. You have to do the analysis on every
> loop. There will definitely be loops in GCC where the optimization is
> possible, there will be loops where it is not. I would expect the
> compiler to spend quite a bit of time trying to improve code for
> loops in GCC. What I am saying is that I doubt that the overall
> effect will be that beneficial for GCC.

I don't think the rule should be taken literally for each optimization.
Software pipelining, profile feedback, loop unrolling, function inlining,
prefetch code generation, and scheduling on i386 are all optimizations that
will lose in such a test and are still worthwhile to have, since for numeric
code, for instance, they are a must.

I think we have -O1 for the "I want sane code but don't have time to wait"
case and -O2 for "I can wait to save an extra few %".

On the other hand, what I think is worthwhile is to reconsider which
optimizations should be enabled at -O1. Currently we do:

      flag_defer_pop = 1;
      flag_thread_jumps = 1;
#ifdef DELAY_SLOTS
      flag_delayed_branch = 1;
#endif
#ifdef CAN_DEBUG_WITHOUT_FP
      flag_omit_frame_pointer = 1;
#endif
      flag_guess_branch_prob = 1;
      flag_cprop_registers = 1;
      flag_loop_optimize = 1;
      flag_crossjumping = 1;
      flag_if_conversion = 1;
      flag_if_conversion2 = 1;

I believe crossjumping, jump threading and perhaps if conversion 2 are examples
of optimizations that are expensive and bring little benefit.
Do you think it makes sense to run some tests and think about disabling them?
Would "bootstrap -O1" be considered a valuable rule of thumb?

On the other hand at -O2 we do some bits that are not that expensive
and may come to -O1 category.  I would guess for:

      flag_optimize_sibling_calls = 1;
      flag_rename_registers = 1;
      flag_caller_saves = 1;
      flag_force_mem = 1;
      flag_regmove = 1;
      flag_strict_aliasing = 1;
      flag_reorder_blocks = 1;
      flag_reorder_functions = 1;

What do you think?  If we get some kind of agreement, I can run a series of
tests for these optimizations...

Another thing I believe can be worthwhile is to have a switch that enables
the aggressive bits, like loop unrolling or prefetch, that people can use for
benchmarks or very CPU-bound code.  It appears to be a common problem of
GCC reviews that they use suboptimal switches, and partly it is our mistake,
I guess. It is very difficult to set it up.

Honza

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-16  5:30 GCSE store motion Robert Dewar
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
@ 2002-05-16  7:59 ` Mark Mitchell
  2002-05-16  8:31 ` Daniel Berlin
  2 siblings, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-16  7:59 UTC (permalink / raw)
  To: Robert Dewar, dberlin, roger; +Cc: aj, davem, gcc, rth



--On Thursday, May 16, 2002 07:48:38 AM -0400 Robert Dewar <dewar@gnat.com> 
wrote:

>> That means we shouldn't be spending much time trying to do software
>> loop pipelining when compiling GCC, so the optimization shouldn't
>> make compiling the compiler significantly slower.
>
> I don't see how you conclude this. You have to do the analysis on every
> loop. There will definitely be loops in GCC where the optimization is
> possible, there will be loops where it is not. I would expect the
> compiler to spend quite a bit of time trying to improve code for
> loops in GCC. What I am saying is that I doubt that the overall
> effect will be that beneficial for GCC.

I think we're wandering into a good pub conversation, rather than a useful
debate about criteria for accepting changes in the compiler, but...

There aren't very many loops in GCC; certainly nowhere near as many (per
line of code) as in your average scientific application, say.  And, a
little heuristicness (most of these loops have lots of internal branches
and even function calls, therefore they're less likely to be hot spots,
therefore I won't spend a lot of effort trying to wring a few cycles out
of them), doesn't seem unreasonable to me.
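
A minimal sketch of that kind of cheap screen, assuming a hypothetical
per-loop summary (the struct, its fields, and the branch threshold are all
made up for illustration and are not GCC's loop data structures):

```c
#include <stdbool.h>

/* Hypothetical per-loop summary; these fields are illustrative and are
   not taken from GCC's internal loop representation.  */
struct loop_summary
{
  int n_insns;     /* instructions in the loop body */
  int n_branches;  /* conditional branches inside the body */
  int n_calls;     /* function calls inside the body */
};

/* Cheap screen run before any expensive analysis: reject loops that
   contain calls or that are dominated by internal branches.  The
   one-branch-per-eight-instructions limit is an arbitrary placeholder.  */
bool
worth_pipelining (const struct loop_summary *loop)
{
  if (loop->n_calls > 0)
    return false;
  if (loop->n_insns == 0)
    return false;
  /* Allow at most one internal branch per eight instructions.  */
  return loop->n_branches * 8 <= loop->n_insns;
}
```

The point of such a screen is that it costs a few comparisons per loop, so
the expensive analysis only runs on loops that look like plausible hot spots.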

I picked a random GCC file (regmove.c) and counted 127 lines with the
word "for", 8 with "while", and 54 with "do" out of 73694 lines.  (That's
probably an over-estimate; I'd bet some of those show up in comments.)

How much time can/should the compiler waste compiling those relatively
few loops?

My point is that I can well imagine the compiler spending 10% of its time
on software pipelining for scientific code, but that would seem highly
out of whack on code like that in GCC.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
  2002-05-16  8:04   ` Mark Mitchell
@ 2002-05-16  8:04   ` Geert Bosch
  2002-05-16  8:06     ` Geert Bosch
  1 sibling, 1 reply; 58+ messages in thread
From: Geert Bosch @ 2002-05-16  8:04 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Robert Dewar, dberlin, mark, roger, aj, davem, gcc, rth


On Thursday, May 16, 2002, at 10:07 , Jan Hubicka wrote:

> On the other hand at -O2 we do some bits that are not that expensive
> and may come to -O1 category.  I would guess for:
>
>       flag_optimize_sibling_calls = 1;
>       flag_rename_registers = 1;
>       flag_caller_saves = 1;
>       flag_force_mem = 1;
>       flag_regmove = 1;
>       flag_strict_aliasing = 1;
>       flag_reorder_blocks = 1;
>       flag_reorder_functions = 1;

A second important goal for -O1 is to keep code as debuggable as
possible. This rules out at least the sibling calls optimization, but
probably others as well.
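
To illustrate the debuggability concern, here is a small sketch with two
hypothetical functions (the names are invented; `__attribute__ ((noinline))`
is a GCC extension used only to keep the callee out of line).  When the call
in `average' is turned into a sibling call, the caller's frame may be reused,
so a backtrace taken inside `checked_div' may no longer show `average':

```c
/* Hypothetical helper, kept out of line so the call below stays a
   real call rather than being inlined.  */
__attribute__ ((noinline)) int
checked_div (int num, int den)
{
  return den == 0 ? 0 : num / den;
}

/* The call is in tail position: nothing remains to be done in
   `average' after `checked_div' returns, so the compiler may replace
   the call-plus-return with a jump that reuses the caller's frame.  */
int
average (int sum, int count)
{
  return checked_div (sum, count);
}
```

The result of the program is unchanged either way; only the visible call
chain in the debugger differs.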

   -Geert

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
@ 2002-05-16  8:04   ` Mark Mitchell
  2002-05-16  8:04   ` Geert Bosch
  1 sibling, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-16  8:04 UTC (permalink / raw)
  To: Jan Hubicka, Robert Dewar; +Cc: dberlin, roger, aj, davem, gcc, rth


I'm not sure how to change around what's in -O2 and -O1.  We don't want
to confuse people who are used to one set of things, of course.  It's
a tricky question.

> Another thing I believe can be worthwhile is to have a switch that enables
> the aggressive bits, like loop unrolling or prefetch, that people can use for
> benchmarks or very CPU-bound code.  It appears to be a common problem of
> GCC reviews that they use suboptimal switches, and partly it is our
> mistake, I guess. It is very difficult to set it up.

See my earlier rants about why it is bad to have so dang many options...

I'm not sure what to do, exactly, but you're right that it would be nice
if you tended to get the fastest code with "-O2" or "-O3" and not
"-O2 -fno-this -fthat".  If that's not turning out to be true, we should
see if we could tune it somewhat.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: -O2 versus -O1 (Was: Re: GCSE store motion)
  2002-05-16  8:04   ` Geert Bosch
@ 2002-05-16  8:06     ` Geert Bosch
  0 siblings, 0 replies; 58+ messages in thread
From: Geert Bosch @ 2002-05-16  8:06 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Robert Dewar, dberlin, mark, roger, aj, davem, gcc, rth


On Thursday, May 16, 2002, at 10:07 , Jan Hubicka wrote:

> On the other hand at -O2 we do some bits that are not that expensive
> and may come to -O1 category.  I would guess for:
>
>       flag_optimize_sibling_calls = 1;
>       flag_rename_registers = 1;
>       flag_caller_saves = 1;
>       flag_force_mem = 1;
>       flag_regmove = 1;
>       flag_strict_aliasing = 1;
>       flag_reorder_blocks = 1;
>       flag_reorder_functions = 1;

A second important goal for -O1 is to keep code as debuggable as
possible. This rules out at least the sibling calls optimization, but
probably others as well.

   -Geert

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-16  5:30 GCSE store motion Robert Dewar
  2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
  2002-05-16  7:59 ` GCSE store motion Mark Mitchell
@ 2002-05-16  8:31 ` Daniel Berlin
  2 siblings, 0 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-16  8:31 UTC (permalink / raw)
  To: Robert Dewar; +Cc: mark, roger, aj, davem, gcc, rth

On Thu, 16 May 2002, Robert Dewar wrote:

> > That means we shouldn't be spending much time trying to do software
> > loop pipelining when compiling GCC, so the optimization shouldn't
> > make compiling the compiler significantly slower.
> 
> I don't see how you conclude this. You have to do the analysis on every
> loop.
Not really.
Intel's compiler immediately discounts loops with calls in them.

> There will definitely be loops in GCC where the optimization is
> possible, there will be loops where it is not. I would expect the
> compiler to spend quite a bit of time trying to improve code for
> loops in GCC. What I am saying is that I doubt that the overall
> effect will be that beneficial for GCC.

> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 13:48             ` Toon Moene
  2002-05-16 13:50               ` law
@ 2002-05-17 15:03               ` Tim Hollebeek
  1 sibling, 0 replies; 58+ messages in thread
From: Tim Hollebeek @ 2002-05-17 15:03 UTC (permalink / raw)
  To: Toon Moene; +Cc: Mark Mitchell, law, Roger Sayle, Daniel Berlin, gcc

> 
> No, but it'll help any and all Fortran programs that have to be compiled
> with -fno-automatic because they were developed on systems where local
> variables in subroutines kept their values from call to call.
> 
> Given the number of times my advice on comp.lang.fortran of compiling
> with -fno-automatic actually `stopped the bug', I'd think that's rather
> important ...

If it has its own -f flag, perhaps that flag could default to true
for fortran (or fortran + -fno-automatic) and false elsewhere.  Though
it'd be nice to verify that it actually helps such programs first.

-Tim

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-16 14:44               ` Daniel Berlin
@ 2002-05-16 21:12                 ` Eric Christopher
  0 siblings, 0 replies; 58+ messages in thread
From: Eric Christopher @ 2002-05-16 21:12 UTC (permalink / raw)
  To: gcc

Daniel,

> However, please note that *I* would only consider this very popular game
> console if the port is in the FSF tree, or it helps some game console
> port in the FSF tree.
> I'm not sure, but IIRC, most of the game console ports don't exist in
> the FSF tree. Please correct me if i'm wrong. After all, no offense
> Jeff, but *I* don't care if it only helps Red Hat's GCC customers, and
> not the people who use the FSF tree.  The Red Hat customers can just use
> the flag or whatever .
> 
> Others may have differing views.

I don't think you understood what Jeff was talking about. He wasn't
talking about a particular chip architecture, he was talking more about a
particular type of user - for which that architecture was important.
There's a difference there. At any rate, if it will help you look at
these problems you'll have the port in the next few weeks. There was some
cleanup that I wanted to do first, but it's on my list.

-eric


-- 
I will not carve gods

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-16 13:53             ` law
@ 2002-05-16 14:44               ` Daniel Berlin
  2002-05-16 21:12                 ` Eric Christopher
  0 siblings, 1 reply; 58+ messages in thread
From: Daniel Berlin @ 2002-05-16 14:44 UTC (permalink / raw)
  To: law; +Cc: Mark Mitchell, Roger Sayle, gcc

On Thu, 2002-05-16 at 12:39, law@redhat.com wrote:
> In message <23630000.1021488625@gandalf.codesourcery.com>, Mark Mitchell writes:
>  > 
>  > 
>  > --On Wednesday, May 15, 2002 11:38:03 AM -0600 "law@redhat.com" 
>  > <law@redhat.com> wrote:
>  > 
>  > > In message <17950000.1021482109@gandalf.codesourcery.com>, Mark Mitchell
>  > > writes:
>  > >  > Dan's claim seems to be that nobody has a real-world application that
>  > >  > shows an improvement with store motion enabled.  If that's true, we
>  > >  > don't need that optimization enabled.  We can keep the code, and use
>  > >  > it when it becomes more useful, but there's no reason to be running
>  > >  > that pass.
>  > >  >
>  > >  > If, however, someone has real applications that show measurable
>  > >  > improvements -- the Linux kernel would certainly qualify -- then we
>  > >  > should rethink the issue.
>  > 
>  > > Would games on a very popular game console work?
>  > 
>  > Sure!  Do we have any numbers at all?  (I know you said it was difficult
>  > to measure...)

However, please note that *I* would only consider this very popular game
console if the port is in the FSF tree, or it helps some game console
port in the FSF tree.
I'm not sure, but IIRC, most of the game console ports don't exist in
the FSF tree. Please correct me if I'm wrong.
After all, no offense Jeff, but *I* don't care if it only helps Red
Hat's GCC customers, and not the people who use the FSF tree.  The Red
Hat customers can just use the flag or whatever.

Others may have differing views.

>  > 
>  > I think there are two issues:
>  > 
>  > 1. Correctness.
>  > 
>  > 2. Efficacy.
>  > 
>  > There seems to be some debate on (1), but assuming that the optimization
>  > is correct, we're down to (2).  As long as the optimization doesn't
>  > take unreasonably long to run, and as long as it helps some real programs
>  > without hurting most of them, we should have it.
> Given Toon's reference -fno-automatic or whatever it was, we can probably
> address #2 by running spec with that switch -- once with LSM, once
> without LSM.
You mean SM, we aren't talking about LM.

> 
> Maybe the folks at SuSE could cover that since it seems they have a
> good infrastructure for this kind of thing.
I've already done it for SPEC95's fortran programs.
Without -fno-automatic, it removes no stores on any of the fortran
programs.
With -fno-automatic, it moves quite a few, but all of the programs
actually take *longer* to run (i.e. it's a pessimization).
-fno-automatic -fno-gcse-sm actually speeds a few  back up, but not to
the speed without -fno-automatic.


So GCSE SM is actually *hurting* us here in some cases, by a small but
consistent percentage. When it does help, it's not helping enough to
overcome the hurt it causes (it's 1%).

Looking at dumps, it seems to insert way too many stores to delete too
few.  This would be symptomatic of it thinking stores were not available
when they were, or something of the sort (causing it to insert extra
unneeded stores).
I'm hoping, anyway, because certainly, it shouldn't be slowing programs
down.
At least, one could disable the motion part, and keep the store removal
portion (it removes stores it comes across that are already marked
available, as it computes the bitmaps).

This is all non-official, running the programs multiple times by hand on
the right inputs and outputs (where needed), so don't take it as
official or anything. I'd still like to see a real run done.


> jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 12:07           ` Mark Mitchell
  2002-05-15 13:48             ` Toon Moene
@ 2002-05-16 13:53             ` law
  2002-05-16 14:44               ` Daniel Berlin
  1 sibling, 1 reply; 58+ messages in thread
From: law @ 2002-05-16 13:53 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Roger Sayle, Daniel Berlin, gcc

In message <23630000.1021488625@gandalf.codesourcery.com>, Mark Mitchell writes:
 > 
 > 
 > --On Wednesday, May 15, 2002 11:38:03 AM -0600 "law@redhat.com" 
 > <law@redhat.com> wrote:
 > 
 > > In message <17950000.1021482109@gandalf.codesourcery.com>, Mark Mitchell
 > > writes:
 > >  > Dan's claim seems to be that nobody has a real-world application that
 > >  > shows an improvement with store motion enabled.  If that's true, we
 > >  > don't need that optimization enabled.  We can keep the code, and use
 > >  > it when it becomes more useful, but there's no reason to be running
 > >  > that pass.
 > >  >
 > >  > If, however, someone has real applications that show measurable
 > >  > improvements -- the Linux kernel would certainly qualify -- then we
 > >  > should rethink the issue.
 > 
 > > Would games on a very popular game console work?
 > 
 > Sure!  Do we have any numbers at all?  (I know you said it was difficult
 > to measure...)
 > 
 > I think there are two issues:
 > 
 > 1. Correctness.
 > 
 > 2. Efficacy.
 > 
 > There seems to be some debate on (1), but assuming that the optimization
 > is correct, we're down to (2).  As long as the optimization doesn't
 > take unreasonably long to run, and as long as it helps some real programs
 > without hurting most of them, we should have it.
Given Toon's reference -fno-automatic or whatever it was, we can probably
address #2 by running spec with that switch -- once with LSM, once
without LSM.

Maybe the folks at SuSE could cover that since it seems they have a
good infrastructure for this kind of thing.
jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 13:48             ` Toon Moene
@ 2002-05-16 13:50               ` law
  2002-05-17 15:03               ` Tim Hollebeek
  1 sibling, 0 replies; 58+ messages in thread
From: law @ 2002-05-16 13:50 UTC (permalink / raw)
  To: Toon Moene; +Cc: Mark Mitchell, Roger Sayle, Daniel Berlin, gcc

 In message <3CE2BCB4.D6D36685@moene.indiv.nluug.nl>, Toon Moene writes:
 > No, but it'll help any and all Fortran programs that have to be compiled
 > with -fno-automatic because they were developed on systems where local
 > variables in subroutines kept their values from call to call.
 > 
 > Given the number of times my advice on comp.lang.fortran of compiling
 > with -fno-automatic actually `stopped the bug', I'd think that's rather
 > important ...
So this means there may be another way to test/benchmark this code -- using
the Fortran tests from spec with -fno-automatic.

jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
       [not found] ` <164620000.1021559673@gandalf.codesourcery.com.suse.lists.egcs>
@ 2002-05-16 11:42   ` Andi Kleen
  0 siblings, 0 replies; 58+ messages in thread
From: Andi Kleen @ 2002-05-16 11:42 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

Mark Mitchell <mark@codesourcery.com> writes:
> 
> My point is that I can well imagine the compiler spending 10% of its time
> on software pipelining for scientific code, but that would seem highly
> out of whack on code like that in GCC.

I just did a full profile of a gcc 3.1 bootstrap. The top offender
(~5%) is ggc_alloc in ggc-page where I suspect the bitmap scanning loop
eats most of the CPU time. If software pipelining of that single simple loop
cut off some of the 5% then it would surely be noticeable in gcc bootstrap
times :-)
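
For readers unfamiliar with this kind of loop, here is a rough sketch of a
bitmap scan that looks for the first free slot.  It is not the ggc-page
code; the function name and bitmap layout are assumptions made for
illustration only:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof (unsigned long) * CHAR_BIT)

/* Return the index of the first clear bit in the NWORDS-word bitmap
   IN_USE, or -1 if every bit is set.  A hypothetical stand-in for an
   allocator's "find a free object in this page" scan.  */
long
first_free_object (const unsigned long *in_use, size_t nwords)
{
  for (size_t word = 0; word < nwords; word++)
    {
      /* Skip words in which every slot is already taken.  */
      if (in_use[word] == ULONG_MAX)
        continue;
      for (size_t bit = 0; bit < BITS_PER_WORD; bit++)
        if ((in_use[word] & (1UL << bit)) == 0)
          return (long) (word * BITS_PER_WORD + bit);
    }
  return -1;
}
```

The outer loop has no calls and a simple body, which is the shape of loop
where a loop optimization has a chance of paying off.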

-Andi

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-16  1:51       ` Jan Hubicka
@ 2002-05-16  9:59         ` Daniel Berlin
  0 siblings, 0 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-16  9:59 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

On Thu, 16 May 2002, Jan Hubicka wrote:

> > On Wed, 15 May 2002, Roger Sayle wrote:
> > 
> > > 
> > > > In addition, never, in any RTL dumps of any code, ever, have I seen it
> > > > remove a single store.
> > > 
> > > I'd suggest compiling the testcase in the patch below with -O3 on CVS
> > > mainline (before store motion was disabled).  The test is ill-formed
> > > and the duplicate store should be moved, the call to abort() reveals
> > > that the optimizer did its job.  Feel free to step through with a
> > > debugger to convince yourself that it was GCSE's store motion at
> > > work.  For example, it doesn't abort compiling with "-O3 -fno-gcse".
> > > 
> > > Seeing is believing.
> > Yes, as I mentioned, I misspoke. It now 
> > removes 2 stores through global store removal during compilation of gcc, and moves 
> > 3.
> 
> Can you please describe, for someone who didn't have time to investigate it
> in detail, why the optimization is not effective and what needs to be done
> to fix it? (Or point me to a mail mentioning it.)

Sure.

It's not effective because the only stores it considers are those that 
meet these tests:
1. The destination is a symbol ref (not based on a symbol ref, just a 
symbol ref)
2. rtx_varies_p == false

If you attempt to change #1, you'll start to find that #2 lies.
It'll claim things don't vary even over two executions of the program when 
they do.

On a side rant, based on what rtx_varies_p and rtx_unstable_p 
*CLAIM* to do (by the comments in front of each function), one shouldn't need to check the 
operands of the store if it's rtx_unstable_p, since !rtx_unstable_p 
means that it wouldn't be different at a different point in the program.
But of course, this is likely just an inaccurate comment or bugs in 
setting the flags it's checking.

The stores that pass both of these tests are going to be, in 99% of 
cases stores into globals (I won't discount that it's possible to come up 
with non-global stores that meet these tests).

Theoretically, one could at least use rtx_unstable_p instead of
rtx_varies_p.   But, as I mentioned, you'll get wrong answers sometimes.
With a bit of extra screening and whatnot elsewhere to account for this, I
had it down to miscompiling one function in one file that was part of
java.

The big gains in effectiveness would come from letting it move around
basically any store as long as the destination wouldn't change
because of where we moved it to (and I'm sure there are a few other
conditions that I'm forgetting at the moment).
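
As a source-level illustration of the restricted case described above, here
is a sketch of a store to a global (whose address is a plain symbol
reference) inside a loop, next to a hand-written version with the store sunk
below the loop.  The function names are invented, this is not output from the
pass, and it assumes `v' never points at `total':

```c
int total;  /* global: its address is a bare symbol reference */

/* Before: one store to `total' on every iteration.  */
void
sum_naive (const int *v, int n)
{
  total = 0;
  for (int i = 0; i < n; i++)
    total = total + v[i];
}

/* After (written by hand): the running value lives in a local, and a
   single store to `total' is placed after the loop.  Only valid if
   nothing in the loop can observe or alias `total'.  */
void
sum_sunk (const int *v, int n)
{
  int t = 0;
  for (int i = 0; i < n; i++)
    t += v[i];
  total = t;
}
```

A store through an arbitrary pointer would not pass the first test above,
which is why the pass finds so few candidates outside of globals.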
> I think it would be worth a comment in the code disabling it.
> 
> Honza
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
@ 2002-05-16  8:29 Robert Dewar
  0 siblings, 0 replies; 58+ messages in thread
From: Robert Dewar @ 2002-05-16  8:29 UTC (permalink / raw)
  To: dberlin, dewar; +Cc: aj, davem, gcc, mark, roger, rth

> Intel's compiler immediately discounts loops with calls in them.

Reasonable, though of course on the ia64 you typically want to do very
aggressive inlining to have a chance of extracting sufficient ILP.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
@ 2002-05-16  5:36 Robert Dewar
  0 siblings, 0 replies; 58+ messages in thread
From: Robert Dewar @ 2002-05-16  5:36 UTC (permalink / raw)
  To: dewar, toon; +Cc: dberlin, gcc, law, mark, roger

> (Of course, those programmers should have used the SAVE statement to
>  cause the value of the variable to be kept)

The SAVE statement was a late addition to Fortran, so it is unfamiliar
to many Fortran programmers. I wonder if compilers could manage to
detect some cases of obvious omission and generate warnings?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:06     ` Daniel Berlin
  2002-05-15 10:15       ` David Edelsohn
  2002-05-15 10:18       ` Roger Sayle
@ 2002-05-16  1:51       ` Jan Hubicka
  2002-05-16  9:59         ` Daniel Berlin
  2 siblings, 1 reply; 58+ messages in thread
From: Jan Hubicka @ 2002-05-16  1:51 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

> On Wed, 15 May 2002, Roger Sayle wrote:
> 
> > 
> > > In addition, never, in any RTL dumps of any code, ever, have I seen it
> > > remove a single store.
> > 
> > I'd suggest compiling the testcase in the patch below with -O3 on CVS
> > mainline (before store motion was disabled).  The test is ill-formed
> > and the duplicate store should be moved, the call to abort() reveals
> > that the optimizer did its job.  Feel free to step through with a
> > debugger to convince yourself that it was GCSE's store motion at
> > work.  For example, it doesn't abort compiling with "-O3 -fno-gcse".
> > 
> > Seeing is believing.
> Yes, as I mentioned, I misspoke. It now 
> removes 2 stores through global store removal during compilation of gcc, and moves 
> 3.

Can you please describe, for someone who didn't have time to investigate it
in detail, why the optimization is not effective and what needs to be done
to fix it? (Or point me to a mail mentioning it.)
I think it would be worth a comment in the code disabling it.

Honza

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 13:56 Robert Dewar
  2002-05-15 14:06 ` Gabriel Dos Reis
  2002-05-15 15:09 ` Toon Moene
@ 2002-05-15 15:20 ` Dale Johannesen
  2 siblings, 0 replies; 58+ messages in thread
From: Dale Johannesen @ 2002-05-15 15:20 UTC (permalink / raw)
  To: Robert Dewar; +Cc: Dale Johannesen, mark, toon, dberlin, gcc, law, roger


On Wednesday, May 15, 2002, at 01:05 PM, Robert Dewar wrote:

> <No, but it'll help any and all Fortran programs that have to be compiled
> with -fno-automatic because they were developed on systems where local
> variables in subroutines kept their values from call to call.
>
> Given the number of times my advice on comp.lang.fortran of compiling
> with -fno-automatic actually `stopped the bug', I'd think that's rather
> important ...
>
> Interesting, considering that *every* version of the Fortran standard has
> emphasized that there is no requirement for local variables in subroutines
> to keep their values from call to call (the only exception is initialized
> data that is never reassigned).

True, but prior to the f77 standard there was no standard way to get the SAVE
functionality.  And just because a standard was adopted in 1978 didn't mean
code written using the new features of that standard was portable; that didn't
happen until the mid-80s.  Most old compilers SAVEd all locals by default,
and there is a lot of old Fortran around that depends on that behavior.  At
the time, such code was reasonably portable, as much as anything was (f66
was not complete enough or specified fully enough to support true portability).
To support these programs, every f77 compiler I know of has a switch
equivalent to -fno-automatic, and they always will.
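
For readers who know C better than Fortran, a loose analogy (not Fortran
semantics, and with invented function names): a `static' local behaves like
a SAVEd variable, while an ordinary automatic local does not survive the
return.  The automatic variable is initialized explicitly here so that the
example stays well defined:

```c
/* `calls' has static storage duration, so it keeps its value between
   calls, much like a SAVEd Fortran local.  */
int
next_id_saved (void)
{
  static int calls = 0;
  calls = calls + 1;
  return calls;
}

/* `calls' has automatic storage: it is created afresh, and here
   re-initialized to zero, on every call.  */
int
next_id_automatic (void)
{
  int calls = 0;
  calls = calls + 1;
  return calls;
}
```

Old code that relied on the first behavior without asking for it is what a
switch like -fno-automatic is meant to keep working.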

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 13:56 Robert Dewar
  2002-05-15 14:06 ` Gabriel Dos Reis
@ 2002-05-15 15:09 ` Toon Moene
  2002-05-15 15:20 ` Dale Johannesen
  2 siblings, 0 replies; 58+ messages in thread
From: Toon Moene @ 2002-05-15 15:09 UTC (permalink / raw)
  To: Robert Dewar; +Cc: mark, dberlin, gcc, law, roger

Robert Dewar wrote:

> I wrote:

> Given the number of times my advice on comp.lang.fortran of compiling
> with -fno-automatic actually `stopped the bug', I'd think that's rather
> important ...
> >
> 
> Interesting, considering that *every* version of the Fortran standard has
> emphasized that there is no requirement for local variables in subroutines
> to keep their values from call to call

[ ... sigh, yes, I know - this is from the my-compiler-accepts-this
      so-it-must-be-valid department ... ]

At least from Fortran 77 onwards:

"17.3 Events That Cause Entities to Become Undefined

    6. The execution of a RETURN statement or an END statement within a
       subprogram causes all entities within the subprogram to become
       undefined except for the following:
          a. Entities in blank common
          b. Initially defined entities that have neither been redefined
             nor become undefined
          c. Entities specified by SAVE statements
          d. Entities in a named common block that appears in the
             subprogram and appears in at least one other program unit
             that is either directly or indirectly referencing the
             subprogram"

(Of course, those programmers should have used the SAVE statement to
 cause the value of the variable to be kept)

> (the only exception is initialized
> data that is never reassigned).

Indeed, see the above.

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 13:56 Robert Dewar
@ 2002-05-15 14:06 ` Gabriel Dos Reis
  2002-05-15 15:09 ` Toon Moene
  2002-05-15 15:20 ` Dale Johannesen
  2 siblings, 0 replies; 58+ messages in thread
From: Gabriel Dos Reis @ 2002-05-15 14:06 UTC (permalink / raw)
  To: Robert Dewar; +Cc: mark, toon, dberlin, gcc, law, roger

dewar@gnat.com (Robert Dewar) writes:

[...]

| Interesting, considering that *every* version of the Fortran standard has
| emphasized that there is no requirement for local variables in subroutines
| to keep their values from call to call (the only exception is initialized
| data that is never reassigned).

In theory, there is no difference between theory and practice...

-- Gaby

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
@ 2002-05-15 13:56 Robert Dewar
  2002-05-15 14:06 ` Gabriel Dos Reis
                   ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Robert Dewar @ 2002-05-15 13:56 UTC (permalink / raw)
  To: mark, toon; +Cc: dberlin, gcc, law, roger

<No, but it'll help any and all Fortran programs that have to be compiled
with -fno-automatic because they were developed on systems where local
variables in subroutines kept their values from call to call.

Given the number of times my advice on comp.lang.fortran of compiling
with -fno-automatic actually `stopped the bug', I'd think that's rather
important ...
>

Interesting, considering that *every* version of the Fortran standard has
emphasized that there is no requirement for local variables in subroutines
to keep their values from call to call (the only exception is initialized
data that is never reassigned).

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 12:07           ` Mark Mitchell
@ 2002-05-15 13:48             ` Toon Moene
  2002-05-16 13:50               ` law
  2002-05-17 15:03               ` Tim Hollebeek
  2002-05-16 13:53             ` law
  1 sibling, 2 replies; 58+ messages in thread
From: Toon Moene @ 2002-05-15 13:48 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: law, Roger Sayle, Daniel Berlin, gcc

Mark Mitchell wrote:
> 
> --On Wednesday, May 15, 2002 11:38:03 AM -0600 "law@redhat.com"
> <law@redhat.com> wrote:

[ Why do we need store motion ]

> > Would games on a very popular game console work?
> 
> Sure!  Do we have any numbers at all?  (I know you said it was difficult
> to measure...)

No, but it'll help any and all Fortran programs that have to be compiled
with -fno-automatic because they were developed on systems where local
variables in subroutines kept their values from call to call.

Given the number of times my advice on comp.lang.fortran of compiling
with -fno-automatic actually `stopped the bug', I'd think that's rather
important ...
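
A rough C analogue of the behaviour those programs depend on, as a sketch
only: a `static` local plays the role of a Fortran local that is SAVEd (or
of every local under -fno-automatic), while a plain local is automatic. The
function names are made up for illustration. Such variables live in memory
rather than in a register for the duration of a call, which is presumably
why an optimization that moves stores to memory would matter for this kind
of code.

```c
#include <assert.h>

/* Static storage: the value survives from call to call, like a Fortran
   local with SAVE or any local compiled with -fno-automatic. */
int next_ticket_saved(void)
{
    static int counter = 0;   /* initialized once, lives for the whole run */
    counter += 1;
    return counter;
}

/* Automatic storage: the variable is re-created on every call.  It is
   initialized here on purpose; reading an uninitialized automatic
   variable in C yields an indeterminate value. */
int next_ticket_automatic(void)
{
    int counter = 0;
    counter += 1;
    return counter;           /* always 1 */
}

/* Self-check that does not depend on how often it has been called. */
void check_saved_vs_automatic(void)
{
    int first = next_ticket_saved();
    int second = next_ticket_saved();
    assert(second == first + 1);            /* state was kept */
    assert(next_ticket_automatic() == 1);   /* state was not kept */
    assert(next_ticket_automatic() == 1);
}
```

Calling check_saved_vs_automatic() from any main() exercises both variants.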

-- 
Toon Moene - mailto:toon@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
Join GNU Fortran 95: http://g95.sourceforge.net/ (under construction)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 12:45 Robert Dewar
  2002-05-15 12:56 ` Mark Mitchell
@ 2002-05-15 13:29 ` Daniel Berlin
  1 sibling, 0 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15 13:29 UTC (permalink / raw)
  To: Robert Dewar; +Cc: mark, roger, aj, davem, gcc, rth

On Wed, 15 May 2002, Robert Dewar wrote:

> <  If the optimization makes the compiler go slower when compiling itself,
>   it ain't worth having.
> >
> 
> I think that's much too harsh. For example a thorough job of software loop
> pipelining using the rotating registers of the ia64 may require *quite*
> a bit of compilation time, and since this optimization is unlikely to help
> GCC itself much that seems unfortunate.

ia64 seems to be a special case.
I was toying with the ia64 version of intel's compiler yesterday, and i 
was *amazed* at how many times it reruns optimizations and whatnot, 
compared to their IA32 compiler.

It seems they threw everything they could to try to get it to move at a 
reasonable pace.

> 
> Perhaps a more appropriate rule is that this is the criterion for putting
> an optimization in -O1.
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 12:45 Robert Dewar
@ 2002-05-15 12:56 ` Mark Mitchell
  2002-05-15 13:29 ` Daniel Berlin
  1 sibling, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 12:56 UTC (permalink / raw)
  To: Robert Dewar, dberlin, roger; +Cc: aj, davem, gcc, rth



--On Wednesday, May 15, 2002 03:29:29 PM -0400 Robert Dewar 
<dewar@gnat.com> wrote:

> <  If the optimization makes the compiler go slower when compiling itself,
>   it ain't worth having.
>>
>
> I think that's much too harsh.

I'd tried to make clear I didn't mean it literally, or as an absolute.

> For example a thorough job of software loop
> pipelining using the rotating registers of the ia64 may require *quite*
> a bit of compilation time, and since this optimization is unlikely to help
> GCC itself much that seems unfortunate.

That means we shouldn't be spending much time trying to do software
loop pipelining when compiling GCC, so the optimization shouldn't
make compiling the compiler significantly slower.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
@ 2002-05-15 12:45 Robert Dewar
  2002-05-15 12:56 ` Mark Mitchell
  2002-05-15 13:29 ` Daniel Berlin
  0 siblings, 2 replies; 58+ messages in thread
From: Robert Dewar @ 2002-05-15 12:45 UTC (permalink / raw)
  To: dberlin, mark, roger; +Cc: aj, davem, gcc, rth

<  If the optimization makes the compiler go slower when compiling itself,
  it ain't worth having.
>

I think that's much too harsh. For example a thorough job of software loop
pipelining using the rotating registers of the ia64 may require *quite*
a bit of compilation time, and since this optimization is unlikely to help
GCC itself much that seems unfortunate.

Perhaps a more appropriate rule is that this is the criterion for putting
an optimization in -O1.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15  9:21 ` Daniel Berlin
                     ` (2 preceding siblings ...)
  2002-05-15 10:02   ` Mark Mitchell
@ 2002-05-15 12:32   ` law
  3 siblings, 0 replies; 58+ messages in thread
From: law @ 2002-05-15 12:32 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

In message <Pine.LNX.4.44.0205151154080.24022-100000@dberlin.org>, Daniel Berlin writes:
 > I "claimed" it wasn't doing anything because SPEC95/2000 runs show it 
 > making no improvement whatsoever.
 > 
 > In addition, never, in any RTL dumps of any code, ever, have I seen it 
 > remove a single store. 
 > 
 > Nobody has claimed that it is generally useful in its current state. In 
 > fact, the person who submitted it has claimed otherwise.
 > It was written to address a specific case, which i've no doubt it does.
 > This case rarely, if ever, occurs.
 > If you want to claim it is a functional optimization that has useful 
 > application, please provide benchmarks that show store motion making any
 > difference.
If I legally could give you the code to prove this stuff was useful to
the customer paying for it, then I would.  Unfortunately that code is
highly proprietary.

You might consider looking at EEMBC which has certain gross similarities
to the code from our customer.  Though I haven't looked closely enough
at EEMBC to determine if the similarities are enough to trigger the
optimization.

jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 11:56         ` law
@ 2002-05-15 12:07           ` Mark Mitchell
  0 siblings, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 12:07 UTC (permalink / raw)
  To: law, David Edelsohn
  Cc: Roger Sayle, Daniel Berlin, gcc, David S. Miller, Andreas Jaeger,
	Richard Henderson



--On Wednesday, May 15, 2002 11:40:25 AM -0600 "law@redhat.com" 
<law@redhat.com> wrote:

> In message <200205151706.NAA23886@makai.watson.ibm.com>, David Edelsohn writes:
>  > 	Maybe we should rearrange the disabling so that toplev.c defaults
>  > flag_gcse_sm to 0 instead of disabling the optimization in gcse.c itself.
>  > Then this file in glibc could be compiled with -fgcse-sm in the one
>  > instance it has been shown to be useful.
>  >
>  > 	When store motion shows performance improvement across a wider
>  > range of applications, then we can enable that optimization by default.
> Seems reasonable to me.

Me too, as long as we have evidence that it is useful to some people.

I agree that a separate optimization flag, not turned on by default
with any level of -O, would make sense in that case.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 11:56         ` law
@ 2002-05-15 12:07           ` Mark Mitchell
  2002-05-15 13:48             ` Toon Moene
  2002-05-16 13:53             ` law
  0 siblings, 2 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 12:07 UTC (permalink / raw)
  To: law; +Cc: Roger Sayle, Daniel Berlin, gcc



--On Wednesday, May 15, 2002 11:38:03 AM -0600 "law@redhat.com" 
<law@redhat.com> wrote:

> In message <17950000.1021482109@gandalf.codesourcery.com>, Mark Mitchell writes:
>  > Dan's claim seems to be that nobody has a real-world application that
>  > shows an improvement with store motion enabled.  If that's true, we
>  > don't need that optimization enabled.  We can keep the code, and use
>  > it when it becomes more useful, but there's no reason to be running
>  > that pass.
>  >
>  > If, however, someone has real applications that show measurable
>  > improvements -- the Linux kernel would certainly qualify -- then we
>  > should rethink the issue.

> Would games on a very popular game console work?

Sure!  Do we have any numbers at all?  (I know you said it was difficult
to measure...)

I think there are two issues:

1. Correctness.

2. Efficacy.

There seems to be some debate on (1), but assuming that the optimization
is correct, we're down to (2).  As long as the optimization doesn't
take unreasonably long to run, and as long as it helps some real programs
without hurting most of them, we should have it.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:15       ` David Edelsohn
@ 2002-05-15 11:56         ` law
  2002-05-15 12:07           ` Mark Mitchell
  0 siblings, 1 reply; 58+ messages in thread
From: law @ 2002-05-15 11:56 UTC (permalink / raw)
  To: David Edelsohn
  Cc: Roger Sayle, Daniel Berlin, gcc, Mark Mitchell, David S. Miller,
	Andreas Jaeger, Richard Henderson

 In message <200205151706.NAA23886@makai.watson.ibm.com>, David Edelsohn writes:
 > 	Maybe we should rearrange the disabling so that toplev.c defaults
 > flag_gcse_sm to 0 instead of disabling the optimization in gcse.c itself.
 > Then this file in glibc could be compiled with -fgcse-sm in the one
 > instance it has been shown to be useful.
 > 
 > 	When store motion shows performance improvement across a wider
 > range of applications, then we can enable that optimization by default.
Seems reasonable to me.
jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:09       ` Daniel Berlin
  2002-05-15 10:21         ` Jakub Jelinek
@ 2002-05-15 11:56         ` law
  1 sibling, 0 replies; 58+ messages in thread
From: law @ 2002-05-15 11:56 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

In message <Pine.LNX.4.44.0205151258170.24022-100000@dberlin.org>, Daniel Berlin writes:
 > But nobody has any plans to improve it, it's been bitrotting since it was 
 > introduced.
No.  It's more a matter of time.

 > When the number of places the optimizations is applicable approaches 0, 
 > the optimization should be disabled.
If you happen to have a representative sample of all code sets, then yes.
But reality is you don't have that representative sample.  

 > I've not suggested we *remove* store motion until it is superseded.
 > I do think it should be disabled, as it just wastes time.
 > 
 > Or, as you say, move it to -fexpensive-optimizations.
Moving it to -fexpensive-optimizations or even having the user 
explicitly specify that they want those optimizations would be
fine by me.

jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:14       ` Mark Mitchell
  2002-05-15 10:41         ` Roger Sayle
@ 2002-05-15 11:56         ` law
  2002-05-15 12:07           ` Mark Mitchell
  1 sibling, 1 reply; 58+ messages in thread
From: law @ 2002-05-15 11:56 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Roger Sayle, Daniel Berlin, gcc

In message <17950000.1021482109@gandalf.codesourcery.com>, Mark Mitchell writes:
 > Dan's claim seems to be that nobody has a real-world application that
 > shows an improvement with store motion enabled.  If that's true, we
 > don't need that optimization enabled.  We can keep the code, and use
 > it when it becomes more useful, but there's no reason to be running
 > that pass.
 > 
 > If, however, someone has real applications that show measurable
 > improvements -- the Linux kernel would certainly qualify -- then we
 > should rethink the issue.
Would games on a very popular game console work?    While I realize
it will be difficult/impossible to benchmark them given their environment,
the code was developed in response to the code programmers were 
writing for that particular game console.

Jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15  9:53   ` Daniel Berlin
  2002-05-15 10:05     ` Roger Sayle
  2002-05-15 11:05     ` Kevin Handy
@ 2002-05-15 11:54     ` law
  2 siblings, 0 replies; 58+ messages in thread
From: law @ 2002-05-15 11:54 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

In message <Pine.LNX.4.44.0205151212260.24022-100000@dberlin.org>, Daniel Berlin writes:
 > I am completely for disabling optimizations on the mainline that do 
 > nothing but waste time in their current state (though even a 1% 
 > improvement might be arguably worth it).
 > 
 > One would imagine if it's such a functional optimization, it 
 > would at least move or remove greater than 2 stores during
 > bootstrapping gcc.
 > There are plenty of stores to be moved/eliminated.
 > But it doesn't.
While it hasn't helped with GCC bootstraps or stuff like SPEC, the
current incarnation of load/store motion was designed to deal with
a set of problems found often by a class of Red Hat's customers.

Those customers really do write code which, for example, has 
global variables as loop indices.
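
As a hand-written sketch of that pattern (not the customer's code, and not
what gcse.c literally emits): with a global as the loop index, the
straightforward code stores to memory on every iteration, and store motion
aims to sink that store to the loop exit. The names g_index, sum_naive and
sum_store_moved are hypothetical. The transformation is only valid when
nothing inside the loop can observe the global, for example through an
aliasing pointer or a call to an unknown function.

```c
#include <assert.h>

/* Hypothetical global used as a loop index. */
int g_index;

/* As written by the programmer: each iteration updates g_index in
   memory unless the compiler can prove it may stay in a register. */
long sum_naive(const int *data, int n)
{
    long total = 0;
    for (g_index = 0; g_index < n; g_index++)
        total += data[g_index];
    return total;
}

/* The effect store motion is after, applied by hand: the index lives
   in a local, and a single store is sunk to the loop exit. */
long sum_store_moved(const int *data, int n)
{
    long total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += data[i];
    g_index = i;              /* the one remaining store */
    return total;
}

/* Both versions must agree on the result and on the final g_index. */
void check_equivalent(const int *data, int n)
{
    long a = sum_naive(data, n);
    int index_after_naive = g_index;
    long b = sum_store_moved(data, n);
    assert(a == b);
    assert(index_after_naive == g_index);
}
```

Calling check_equivalent() on a small array confirms that the two versions
leave the same observable state behind.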

The fact that it's not useful for spec, gcc bootstraps, etc is more
a failing of not making the code general enough in the kinds of memory
references it tracks.  The underlying algorithms are sound, we're just
not exposing the larger set of memory references to the optimizer.

Be very careful claiming that an optimization is a waste of time because
you have not seen cases where it is useful.

jeff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 11:05     ` Kevin Handy
@ 2002-05-15 11:50       ` Janis Johnson
  0 siblings, 0 replies; 58+ messages in thread
From: Janis Johnson @ 2002-05-15 11:50 UTC (permalink / raw)
  To: Kevin Handy; +Cc: gcc

On Wed, May 15, 2002 at 11:05:36AM -0600, Kevin Handy wrote:
> 
> Just a dumb question, but is there a way to display which optimizations
> were triggered within a compile and how many times?  Probably not
> very useful generally, but still might be interesting to see.  Might let you
> see if an optimization suddenly stopped working.
>
See the -d<letter> options for gcc, which produce debugging dumps.  Many
optimization passes record information about what optimizations are done
or why they cannot be done.

Janis

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15  9:53   ` Daniel Berlin
  2002-05-15 10:05     ` Roger Sayle
@ 2002-05-15 11:05     ` Kevin Handy
  2002-05-15 11:50       ` Janis Johnson
  2002-05-15 11:54     ` law
  2 siblings, 1 reply; 58+ messages in thread
From: Kevin Handy @ 2002-05-15 11:05 UTC (permalink / raw)
  To: gcc

Daniel Berlin wrote:

>And for the record, the platforms i've stared at untold numbers of rtl 
>dumps of store motion on include x86, PPC, and alpha.
>And I misspoke, I neglected to mention it moves 3 stores during 
>bootstrapping.
>My apologies.
>The global store removal portion also removes 2 stores during 
>bootstrapping gcc, total.
>
>I am completely for disabling optimizations on the mainline that do 
>nothing but waste time in their current state (though even a 1% 
>improvement might be arguably worth it).
>
>One would imagine if it's such a functional optimization, it 
>would at least move or remove greater than 2 stores during
>bootstrapping gcc.
>There are plenty of stores to be moved/eliminated.
>But it doesn't.
>
>Believe me, I'd like to see a useful store motion.
>It's just not what we've got now.
>

Just a dumb question, but is there a way to display which optimizations
were triggered within a compile and how many times?  Probably not
very useful generally, but still might be interesting to see.  Might let you
see if an optimization suddenly stopped working.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:14       ` Mark Mitchell
@ 2002-05-15 10:41         ` Roger Sayle
  2002-05-15 11:56         ` law
  1 sibling, 0 replies; 58+ messages in thread
From: Roger Sayle @ 2002-05-15 10:41 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Daniel Berlin, gcc


Ok. I concede.  I'm all for disabling and even deleting optimizations
that have no real world effect.  I wouldn't be surprised if store
motion rarely applied even to large real world examples that made
heavy use of __asm__ (such as glibc and linux).  Current store motion
is probably only catching the holes in GCC's CSE and GCSE cprop anyway.

> Over the past few years, there's a perception that we've had a bad
> tendency to make the compiler slower by adding nice optimizations that
> don't actually make programs faster, but sometimes do make them buggier.
> We need to combat that problem by vetting new optimizations.

But, you'll also remember from disabling builtins in g++ for v3.0x, that
tight release schedules lead to pieces of code being disabled rather than
fixed (despite the performance loss), and that code that is at least
exercised infrequently is less likely to bitrot than code paths that are
disabled.

To summarise, I agree with everyone, now that we're all aware of what
the real issues are.  Perhaps, PR opt/5200 should be closed or given a
testcase to avoid any further confusion?

Roger
--

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:18       ` Roger Sayle
@ 2002-05-15 10:32         ` David Edelsohn
  0 siblings, 0 replies; 58+ messages in thread
From: David Edelsohn @ 2002-05-15 10:32 UTC (permalink / raw)
  To: Roger Sayle; +Cc: Daniel Berlin, gcc, Mark Mitchell

>>>>> Roger Sayle writes:

Roger> I apologise profusely.  I'm not arguing that store motion is
Roger> useful or shouldn't be disabled per se, just that store motion
Roger> probably isn't broken without any evidence to the contrary.

Roger> I don't want Mark or any other MAINTAINER disabling it just on
Roger> the mistaken understanding that it was causing regressions.

	The lack of regressions does not mean that the implementation of
store motion is correct.  Given that the current implementation of store
motion generally does not move any stores, it is even more difficult to
create a testcase which shows how store motion may produce incorrect code.
I personally think that this argues *for* disabling it by default because
we do not have a coverage tool to ensure that the specific optimization is
working properly.

	Daniel performed an analysis of the store motion implementation
itself and he believes that he found an error (a reversed condition, if I
remember correctly).  Given that we cannot test GCSE store motion
directly, I think it is best to leave it disabled by default until store
motion is fixed or someone finds a flaw in Daniel's analysis.

David

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:21         ` Jakub Jelinek
@ 2002-05-15 10:31           ` Mark Mitchell
  0 siblings, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 10:31 UTC (permalink / raw)
  To: Jakub Jelinek, Daniel Berlin
  Cc: Roger Sayle, gcc, David S. Miller, Andreas Jaeger, Richard Henderson



--On Wednesday, May 15, 2002 07:09:02 PM +0200 Jakub Jelinek 
<jakub@redhat.com> wrote:

> On Wed, May 15, 2002 at 01:02:03PM -0400, Daniel Berlin wrote:
>> I've not suggested we *remove* store motion until it is superseded.
>> I do think it should be disabled, as it just wastes time.
>>
>> Or, as you say, move it to -fexpensive-optimizations.
>
> But -fexpensive-optimizations is enabled at -O2, so it should be
> really disabled, not moved.

Furthermore, there is simply no point in having a feature in the compiler
that nobody knows how to use effectively.

In the best case, the feature is useless.  In the worst case, it turns
out to have bugs.

It's fine to keep the code, in case we want it for something, but if it
really provides no benefit, there should be no way for users to cause
that code to run.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:09       ` Daniel Berlin
@ 2002-05-15 10:21         ` Jakub Jelinek
  2002-05-15 10:31           ` Mark Mitchell
  2002-05-15 11:56         ` law
  1 sibling, 1 reply; 58+ messages in thread
From: Jakub Jelinek @ 2002-05-15 10:21 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: Roger Sayle, gcc, Mark Mitchell, David S. Miller, Andreas Jaeger,
	Richard Henderson

On Wed, May 15, 2002 at 01:02:03PM -0400, Daniel Berlin wrote:
> I've not suggested we *remove* store motion until it is superseded.
> I do think it should be disabled, as it just wastes time.
> 
> Or, as you say, move it to -fexpensive-optimizations.

But -fexpensive-optimizations is enabled at -O2, so it should be
really disabled, not moved.

	Jakub

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:06     ` Daniel Berlin
  2002-05-15 10:15       ` David Edelsohn
@ 2002-05-15 10:18       ` Roger Sayle
  2002-05-15 10:32         ` David Edelsohn
  2002-05-16  1:51       ` Jan Hubicka
  2 siblings, 1 reply; 58+ messages in thread
From: Roger Sayle @ 2002-05-15 10:18 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc, Mark Mitchell


> You still haven't addressed the issues I raised.
> You simply ignored them.

I apologise profusely.  I'm not arguing that store motion is
useful or shouldn't be disabled per se, just that store motion
probably isn't broken without any evidence to the contrary.

I don't want Mark or any other MAINTAINER disabling it just on
the mistaken understanding that it was causing regressions.

Roger
--

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:06     ` Daniel Berlin
@ 2002-05-15 10:15       ` David Edelsohn
  2002-05-15 11:56         ` law
  2002-05-15 10:18       ` Roger Sayle
  2002-05-16  1:51       ` Jan Hubicka
  2 siblings, 1 reply; 58+ messages in thread
From: David Edelsohn @ 2002-05-15 10:15 UTC (permalink / raw)
  To: Roger Sayle
  Cc: Daniel Berlin, gcc, Mark Mitchell, David S. Miller,
	Andreas Jaeger, Richard Henderson

Roger,

	There is no argument that GCSE store motion and store merging can
be useful.  The only question is whether the current implementation is
effective.

	We all concede that the current GCSE store-motion implementation
does have an effect on a store in your example from glibc.  We are asking
that GCSE store motion show demonstrable improvement in macroscopic
performance of code produced by GCC.

	Maybe we should rearrange the disabling so that toplev.c defaults
flag_gcse_sm to 0 instead of disabling the optimization in gcse.c itself.
Then this file in glibc could be compiled with -fgcse-sm in the one
instance it has been shown to be useful.

	When store motion shows performance improvement across a wider
range of applications, then we can enable that optimization by default.

David

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:08     ` Roger Sayle
@ 2002-05-15 10:14       ` Mark Mitchell
  2002-05-15 10:41         ` Roger Sayle
  2002-05-15 11:56         ` law
  0 siblings, 2 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 10:14 UTC (permalink / raw)
  To: Roger Sayle; +Cc: Daniel Berlin, gcc



--On Wednesday, May 15, 2002 10:45:31 AM -0600 Roger Sayle 
<roger@eyesopen.com> wrote:

>
> Hi Mark,
>>   If the optimization makes the compiler go slower when compiling
>>   itself, it ain't worth having.

> You'll also appreciate Hennessy and Patterson's pitfalls and
> fallacies on the existence of "typical applications".

Right.

> The only example of store motion that I'm aware of comes from glibc
> source code containing an __asm__ statement.  Inline assembly language
> is pretty rare in the GCC source code, but that would be a poor reason
> not to support it.  Dan's analysis of SPEC should perhaps have included
> the timings of disabling GCC's optimizations after the Linux kernel
> and GLibC had been rebuilt without what could be a system performance
> critical transformation.

Dan's claim seems to be that nobody has a real-world application that
shows an improvement with store motion enabled.  If that's true, we
don't need that optimization enabled.  We can keep the code, and use
it when it becomes more useful, but there's no reason to be running
that pass.

If, however, someone has real applications that show measurable
improvements -- the Linux kernel would certainly qualify -- then we
should rethink the issue.

Over the past few years, there's a perception that we've had a bad
tendency to make the compiler slower by adding nice optimizations that
don't actually make programs faster, but sometimes do make them buggier.
We need to combat that problem by vetting new optimizations.

> I don't use a GCC bootstrap to judge floating performance either :>

Of course. :-)

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:05     ` Roger Sayle
@ 2002-05-15 10:09       ` Daniel Berlin
  2002-05-15 10:21         ` Jakub Jelinek
  2002-05-15 11:56         ` law
  0 siblings, 2 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15 10:09 UTC (permalink / raw)
  To: Roger Sayle
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson

On Wed, 15 May 2002, Roger Sayle wrote:

> 
> > I am completely for disabling optimizations on the mainline that do
> > nothing but waste time in their current state (though even a 1%
> > improvement might be arguably worth it).
> 
> Hi Dan,
> 
> You'll also be aware of the "Store merging" section of the "Optimizer
> inadequacies" page, http://gcc.gnu.org/projects/optimize.html#storemerge
> Although store motion isn't particularly functional at the moment, it
> provides a framework for further GCC improvements in the future.

But nobody has any plans to improve it, it's been bitrotting since it was 
introduced.

In fact, it was introduced somewhat broken due to merge botches.
It wasn't until I looked at it 6 months after it was introduced that these 
merge botches were even noticed.

Not that I blame the person who committed it or anything, i'm just 
pointing out evidence that nobody even looks at it.

> 
> I'd agree that perhaps it should be in -fexpensive-optimizations, but
> I'm not convinced that it's broken.  

If store_ops_ok was fixed, i'm sure it's not broken.
If it wasn't, i'm not.

I don't believe it was fixed, but my memory is fuzzy.
If it's not fixed, because of how non-functional it is, it's amazingly 
difficult to come up with test cases that show it.
> As you point out in your commentary
> of its deficiencies, it pessimizes const/pure functions etc..., but
> these don't affect the correctness of the code, just decrease the number
> of places this optimization is applicable.

When the number of places the optimizations is applicable approaches 0, 
the optimization should be disabled.

I've not suggested we *remove* store motion until it is superseded.
I do think it should be disabled, as it just wastes time.

Or, as you say, move it to -fexpensive-optimizations.

This would let people who want to improve it, improve it such that it is 
generally useful.



> 
> Roger
> --
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15 10:02   ` Mark Mitchell
@ 2002-05-15 10:08     ` Roger Sayle
  2002-05-15 10:14       ` Mark Mitchell
  0 siblings, 1 reply; 58+ messages in thread
From: Roger Sayle @ 2002-05-15 10:08 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Daniel Berlin, gcc


Hi Mark,
>   If the optimization makes the compiler go slower when compiling
>   itself, it ain't worth having.


You'll also appreciate Hennessy and Patterson's pitfalls and
fallacies on the existence of "typical applications".

The only example of store motion that I'm aware of comes from glibc
source code containing an __asm__ statement.  Inline assembly language
is pretty rare in the GCC source code, but that would be a poor reason
not to support it.  Dan's analysis of SPEC should perhaps have included
the timings of disabling GCC's optimizations after the Linux kernel
and GLibC had been rebuilt without what could be a system performance
critical transformation.

I don't use a GCC bootstrap to judge floating performance either :>

Roger
--

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15  9:58   ` Roger Sayle
@ 2002-05-15 10:06     ` Daniel Berlin
  2002-05-15 10:15       ` David Edelsohn
                         ` (2 more replies)
  0 siblings, 3 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15 10:06 UTC (permalink / raw)
  To: Roger Sayle
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson

On Wed, 15 May 2002, Roger Sayle wrote:

> 
> > In addition, never, in any RTL dumps of any code, ever, have I seen it
> > remove a single store.
> 
> I'd suggest compiling the testcase in the patch below with -O3 on CVS
> mainline (before store motion was disabled).  The test is ill-formed
> and the duplicate store should be moved, the call to abort() reveals
> that the optimizer did its job.  Feel free to step through with a
> debugger to convince yourself that it was GCSE's store motion at
> work.  For example, it doesn't abort compiling with "-O3 -fno-gcse".
> 
> Seeing is believing.
Yes, as I mentioned, I misspoke. It now removes 2 stores through global
store removal during compilation of gcc, and moves 3.

You still haven't addressed the issues I raised.
You simply ignored them.

The fact that store motion removes a single store in compilation of glibc 
at -O3 does not make it useful.
I've provided evidence that it is not useful.
This is in the form of statements by the person who wrote it, SPEC runs, 
statistics on the number of stores removed during bootstrapping gcc, etc.

No data or person has claimed, besides you, that store motion in its 
current form is useful.

Please provide evidence it is, since all evidence points to the contrary.

 > 
> Roger
> --
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: GCSE store motion
  2002-05-15  9:53   ` Daniel Berlin
@ 2002-05-15 10:05     ` Roger Sayle
  2002-05-15 10:09       ` Daniel Berlin
  2002-05-15 11:05     ` Kevin Handy
  2002-05-15 11:54     ` law
  2 siblings, 1 reply; 58+ messages in thread
From: Roger Sayle @ 2002-05-15 10:05 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson


> I am completely for disabling optimizations on the mainline that do
> nothing but waste time in their current state (though even a 1%
> improvement might be arguably worth it).

Hi Dan,

You'll also be aware of the "Store merging" section of the "Optimizer
inadequacies" page, http://gcc.gnu.org/projects/optimize.html#storemerge
Although store motion isn't particularly functional at the moment, it
provides a framework for further GCC improvements in the future.

I'd agree that perhaps it should be in -fexpensive-optimizations, but
I'm not convinced that it's broken.  As you point out in your commentary
of its deficiencies, it pessimizes const/pure functions etc..., but
these don't affect the correctness of the code, just decrease the number
of places this optimization is applicable.

Roger
--

* Re: GCSE store motion
  2002-05-15  9:21 ` Daniel Berlin
  2002-05-15  9:53   ` Daniel Berlin
  2002-05-15  9:58   ` Roger Sayle
@ 2002-05-15 10:02   ` Mark Mitchell
  2002-05-15 10:08     ` Roger Sayle
  2002-05-15 12:32   ` law
  3 siblings, 1 reply; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15 10:02 UTC (permalink / raw)
  To: Daniel Berlin, Roger Sayle
  Cc: gcc, David S. Miller, Andreas Jaeger, Richard Henderson

> I "claimed" it wasn't doing anything because SPEC95/2000 runs show it
> making no improvement whatsoever.

I'm not knowledgeable about this particular optimization, but Sassan
Hazeghi, a very experienced compiler engineer and manager, recently
provided me with the following sensible metric:

  If the optimization makes the compiler go slower when compiling itself,
  it ain't worth having.

The colorful American colloquialism is mine; Sassan is much more
articulate.

It's not that this is a hard-and-fast rule -- and it certainly wouldn't
apply to a compiler that had no optimizations yet! -- but for a
relatively mature optimizing compiler it seems a decent rule of thumb.
If your new optimization pass makes the compiler bootstrap 2% slower,
well, then, it's a) pretty expensive, and b) not making things run too
much faster.  If it were really a big win, the compiler itself should
run faster.
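
As a rough illustration of this rule of thumb, assume a deliberately
simplified model that is not from this thread: a pass adds a fraction c
of extra work to every compilation, and speeds up the code it generates
(including the compiler itself, once self-compiled) by a fraction s.
The function names below are hypothetical, and real bootstrap timings
depend on much more than these two numbers.

```c
/* Simplified model (an assumption, not a measurement):
   c = fractional extra compile-time work added by a pass (0.02 means 2%)
   s = fractional speedup the pass gives to the code it generates,
       including the compiler's own code when it compiles itself. */

/* Running time of the self-compiled compiler, relative to a compiler
   built and run without the pass (1.0 means "no change"). */
double relative_self_compile_time(double c, double s)
{
    return (1.0 + c) * (1.0 - s);
}

/* Smallest speedup s at which the pass pays for itself in this model:
   solve (1 + c) * (1 - s) = 1 for s. */
double break_even_speedup(double c)
{
    return c / (1.0 + c);
}
```

Under these assumptions, a pass costing c = 0.02 has to make generated
code a little under 2% faster (0.02 / 1.02) just to break even on the
compiler's own build.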

Now, I know all about how compilers aren't representative test cases,
and all about how certain optimizations are big wins for certain other
programs, and so forth and so on, but there's an idea in there.  There's
certainly no point in having an optimization that a) introduces risks,
and b) increases compile-time, if it doesn't provide measurable
improvements on real code.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: GCSE store motion
  2002-05-15  9:21 ` Daniel Berlin
  2002-05-15  9:53   ` Daniel Berlin
@ 2002-05-15  9:58   ` Roger Sayle
  2002-05-15 10:06     ` Daniel Berlin
  2002-05-15 10:02   ` Mark Mitchell
  2002-05-15 12:32   ` law
  3 siblings, 1 reply; 58+ messages in thread
From: Roger Sayle @ 2002-05-15  9:58 UTC (permalink / raw)
  To: Daniel Berlin
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson


> In addition, never, in any RTL dumps of any code, ever, have I seen it
> remove a single store.

I'd suggest compiling the testcase in the patch below with -O3 on CVS
mainline (before store motion was disabled).  The test is ill-formed
and the duplicate store should be moved, the call to abort() reveals
that the optimizer did its job.  Feel free to step through with a
debugger to convince yourself that it was GCSE's store motion at
work.  For example, it doesn't abort compiling with "-O3 -fno-gcse".

Seeing is believing.

Roger
--

* Re: GCSE store motion
  2002-05-15  9:21 ` Daniel Berlin
@ 2002-05-15  9:53   ` Daniel Berlin
  2002-05-15 10:05     ` Roger Sayle
                       ` (2 more replies)
  2002-05-15  9:58   ` Roger Sayle
                     ` (2 subsequent siblings)
  3 siblings, 3 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15  9:53 UTC (permalink / raw)
  To: Roger Sayle
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson

On Wed, 15 May 2002, Daniel Berlin wrote:

> On Wed, 15 May 2002, Roger Sayle wrote:
> 
> > 
> > I'm more than happy to attempt to fix GCSE store motion, but my current
> > understanding is that it's working perfectly!
> And doing nothing.
> > 
> > PR opt/5200 was filed by Andreas Jaeger on the supporting evidence of
> > PR opt/5172 and postings by Dan Berlin to gcc-patches last year.
> > 
> > In the review of my patch that attempted to fix PR/5172
> > http://gcc.gnu.org/ml/gcc-patches/2002-01/msg01142.html, it
> > became clear that it was glibc at fault and not the store motion
> > pass at all.
> > 
> > Hence the only remaining claims that store motion is broken are
> > the comments by Dan Berlin, but as far as I am aware there are
> > no known examples of failing test cases.
> 
> I'm not going through this again.
> We've been over this before.
> 
> 
> >  Dan also claimed that
> > store motion wasn't currently doing anything, but clearly PR/5172
> > showed that duplicate stores are being eliminated in real code.
> 
> I "claimed" it wasn't doing anything because SPEC95/2000 runs show it 
> making no improvement whatsoever.
> 
> In addition, never, in any RTL dumps of any code, ever, have I seen it 
> remove a single store. 

And for the record, the platforms i've stared at untold numbers of rtl 
dumps of store motion on include x86, PPC, and alpha.
And I misspoke, I neglected to mention it moves 3 stores during 
bootstrapping.
My apologies.
The global store removal portion also removes 2 stores during 
bootstrapping gcc, total.

I am completely for disabling optimizations on the mainline that do 
nothing but waste time in their current state (though even a 1% 
improvement might be arguably worth it).

One would imagine if it's such a functional optimization, it 
would at least move or remove more than 2 stores
during bootstrapping gcc.
There are plenty of stores to be moved/eliminated.
But it doesn't.

Believe me, I'd like to see a useful store motion.
It's just not what we've got now.

--Dan

* Re: GCSE store motion
  2002-05-15  9:07 Roger Sayle
@ 2002-05-15  9:21 ` Daniel Berlin
  2002-05-15  9:53   ` Daniel Berlin
                     ` (3 more replies)
  0 siblings, 4 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15  9:21 UTC (permalink / raw)
  To: Roger Sayle
  Cc: gcc, Mark Mitchell, David S. Miller, Andreas Jaeger, Richard Henderson

On Wed, 15 May 2002, Roger Sayle wrote:

> 
> I'm more than happy to attempt to fix GCSE store motion, but my current
> understanding is that it's working perfectly!
And doing nothing.
> 
> PR opt/5200 was filed by Andreas Jaeger on the supporting evidence of
> PR opt/5172 and postings by Dan Berlin to gcc-patches last year.
> 
> In the review of my patch that attempted to fix PR/5172
> http://gcc.gnu.org/ml/gcc-patches/2002-01/msg01142.html, it
> became clear that it was glibc at fault and not the store motion
> pass at all.
> 
> Hence the only remaining claims that store motion is broken are
> the comments by Dan Berlin, but as far as I am aware there are
> no known examples of failing test cases.

I'm not going through this again.
We've been over this before.


>  Dan also claimed that
> store motion wasn't currently doing anything, but clearly PR/5172
> showed that duplicate stores are being eliminated in real code.

I "claimed" it wasn't doing anything because SPEC95/2000 runs show it 
making no improvement whatsoever.

In addition, never, in any RTL dumps of any code, ever, have I seen it 
remove a single store. 

Nobody has claimed that it is generally useful in its current state. In 
fact, the person who submitted it has claimed otherwise.
It was written to address a specific case, which i've no doubt it does.
This case rarely, if ever, occurs.
If you want to claim it is a functional optimization that has useful 
application, please provide benchmarks that show store motion making any
difference.
If you want to improve store motion such that it is generally useful, again, 
feel free.  I'd be happy to guide you in doing this.
However, nobody needs to provide a failing test case for you to do this.

Please stop trying to chalk up my statement that store motion is not 
useful in its current state to bullshit or whatnot.

No benchmarks, tests, or anything else have shown it helping code in its 
current state.  The only person to claim otherwise is you.

--Dan

* Re: GCSE store motion
@ 2002-05-15  9:07 Roger Sayle
  2002-05-15  9:21 ` Daniel Berlin
  0 siblings, 1 reply; 58+ messages in thread
From: Roger Sayle @ 2002-05-15  9:07 UTC (permalink / raw)
  To: gcc
  Cc: Mark Mitchell, David S. Miller, Andreas Jaeger, Daniel Berlin,
	Richard Henderson


I'm more than happy to attempt to fix GCSE store motion, but my current
understanding is that it's working perfectly!

PR opt/5200 was filed by Andreas Jaeger on the supporting evidence of
PR opt/5172 and postings by Dan Berlin to gcc-patches last year.

In the review of my patch that attempted to fix PR/5172
http://gcc.gnu.org/ml/gcc-patches/2002-01/msg01142.html, it
became clear that it was glibc at fault and not the store motion
pass at all.

Hence the only remaining claims that store motion is broken are
the comments by Dan Berlin, but as far as I am aware there are
no known examples of failing test cases.  Dan also claimed that
store motion wasn't currently doing anything, but clearly PR/5172
showed that duplicate stores are being eliminated in real code.

To appease the paranoia of the pending GCC 3.1 release it was
decided to disable this optimization even without conclusive
evidence that anything was broken.  Unless a failing test in
the GCC test suite starts passing, I'm against disabling a
functional optimization on the CVS mainline.  Finding a failing
test case would also help the call for volunteers to fix it.

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

* Re: GCSE store motion
  2002-05-15  1:29   ` David S. Miller
@ 2002-05-15  1:32     ` Mark Mitchell
  0 siblings, 0 replies; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: gcc, gcc-patches



--On Wednesday, May 15, 2002 12:33:48 AM -0700 "David S. Miller" 
<davem@redhat.com> wrote:

>    From: Mark Mitchell <mark@codesourcery.com>
>    Date: Wed, 15 May 2002 00:26:49 -0700
>
>    --On Tuesday, May 14, 2002 10:03:13 PM -0700 "David S. Miller"
>    <davem@redhat.com> wrote:
>
>    >
>    > While walking over diffs between 3.1 and the mainline I came across
>    > PR/5200.  Basically the fix on the branch just disabled store-motion
>    > in GCSE.
>    >
>    > This is one of those "fix it properly on the mainline" cases.
>    >
>    > So who is going to step up and work on fixing store motion?
>    > Apparently Dan Berlin and Jakub have some idea of the problems.
>    > Refer to the GNATS PR for more information.
>
>    For the time being, please disable it on the mainline too.
>
> Done, as follows:

Thanks!

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: GCSE store motion
  2002-05-15  1:22 ` Mark Mitchell
@ 2002-05-15  1:29   ` David S. Miller
  2002-05-15  1:32     ` Mark Mitchell
  0 siblings, 1 reply; 58+ messages in thread
From: David S. Miller @ 2002-05-15  1:29 UTC (permalink / raw)
  To: mark; +Cc: gcc, gcc-patches

   From: Mark Mitchell <mark@codesourcery.com>
   Date: Wed, 15 May 2002 00:26:49 -0700
   
   --On Tuesday, May 14, 2002 10:03:13 PM -0700 "David S. Miller" 
   <davem@redhat.com> wrote:
   
   >
   > While walking over diffs between 3.1 and the mainline I came across
   > PR/5200.  Basically the fix on the branch just disabled store-motion
   > in GCSE.
   >
   > This is one of those "fix it properly on the mainline" cases.
   >
   > So who is going to step up and work on fixing store motion?
   > Apparently Dan Berlin and Jakub have some idea of the problems.
   > Refer to the GNATS PR for more information.
   
   For the time being, please disable it on the mainline too.

Done, as follows:

2002-03-09  Jakub Jelinek  <jakub@redhat.com>

	PR optimization/5172, optimization/5200
	* gcse.c (gcse_main): Disable store_motion.

--- gcse.c.~1~	Sun Apr 28 21:54:59 2002
+++ gcse.c	Tue May 14 21:00:24 2002
@@ -905,7 +905,8 @@
   end_alias_analysis ();
   allocate_reg_info (max_reg_num (), FALSE, FALSE);
 
-  if (!optimize_size && flag_gcse_sm)
+  /* Store motion disabled until it is fixed.  */
+  if (0 && !optimize_size && flag_gcse_sm)
     store_motion ();
   /* Record where pseudo-registers are set.  */
   return run_jump_opt_after_gcse;
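
The patch works because `&&` short-circuits: with a constant 0 as the
first operand, the rest of the condition is never evaluated and the call
is never made, while the code remains in the source for later re-enabling.
A minimal stand-alone sketch of that pattern follows. The names
`gcse_tail_sketch`, `store_motion_stub`, and `store_motion_calls` are
hypothetical, and the two flags are modelled as parameters here rather
than as the variables used in gcse.c.

```c
/* Counts how often the stand-in pass runs (hypothetical, for illustration). */
static int store_motion_calls;

/* Stand-in for the real store_motion () pass. */
static void store_motion_stub(void)
{
    store_motion_calls++;
}

/* Mirrors the shape of the patched condition: the leading constant 0
   makes the whole && chain false, so the stub is never called no
   matter what the two flags say. */
void gcse_tail_sketch(int optimize_size, int flag_gcse_sm)
{
    /* Store motion disabled until it is fixed.  */
    if (0 && !optimize_size && flag_gcse_sm)
        store_motion_stub();
}
```

Calling `gcse_tail_sketch(0, 1)`, the combination that previously enabled
the pass, leaves `store_motion_calls` at zero.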

* Re: GCSE store motion
  2002-05-15  0:03   ` David S. Miller
  2002-05-15  0:41     ` Daniel Berlin
@ 2002-05-15  1:27     ` Jan Hubicka
  1 sibling, 0 replies; 58+ messages in thread
From: Jan Hubicka @ 2002-05-15  1:27 UTC (permalink / raw)
  To: David S. Miller; +Cc: dberlin, gcc, mark

>    From: Daniel Berlin <dberlin@dberlin.org>
>    Date: Wed, 15 May 2002 01:50:37 -0400 (EDT)
> 
>    I had too much fun funking around with hard to track down alias problems
>    the last time i spent a month trying to improve and fix the RTL version.
>    It's rather easy to fix what's there now, but it doesn't do anything 
>    (really. It shows 0% difference on benchmarks), except for whatever 
>    special case it's currently designed to handle.
>    
>    So i'm out.
> 
> If store-motion is useless and something that really does the
> job it is trying to do is "in the pipeline", why don't we
> just delete the store-motion GCSE stuff?

I guess this can more or less be said about the majority of the GCSE stuff.
We need to figure out how to make our global optimizations really effective.
I believe they make sense at the RTL level, as code lowering, spill code
generation and friends are a good source of code optimizable by such
algorithms, but we need to develop the infrastructure to implement them
properly first.

Honza
> 
> If we are going to keep the store-motion stuff, we have the problem
> that the person who knows how to fix it is not willing to do so.
> So we have this regression on the mainline with no resolution.

* Re: GCSE store motion
  2002-05-14 23:27 David S. Miller
  2002-05-14 23:33 ` Daniel Berlin
@ 2002-05-15  1:22 ` Mark Mitchell
  2002-05-15  1:29   ` David S. Miller
  1 sibling, 1 reply; 58+ messages in thread
From: Mark Mitchell @ 2002-05-15  1:22 UTC (permalink / raw)
  To: David S. Miller, gcc



--On Tuesday, May 14, 2002 10:03:13 PM -0700 "David S. Miller" 
<davem@redhat.com> wrote:

>
> While walking over diffs between 3.1 and the mainline I came across
> PR/5200.  Basically the fix on the branch just disabled store-motion
> in GCSE.
>
> This is one of those "fix it properly on the mainline" cases.
>
> So who is going to step up and work on fixing store motion?
> Apparently Dan Berlin and Jakub have some idea of the problems.
> Refer to the GNATS PR for more information.

For the time being, please disable it on the mainline too.

It's fine to fix it properly, but until somebody does there's no reason
to make things difficult for everybody else.

--
Mark Mitchell                   mark@codesourcery.com
CodeSourcery, LLC               http://www.codesourcery.com

* Re: GCSE store motion
  2002-05-15  0:03   ` David S. Miller
@ 2002-05-15  0:41     ` Daniel Berlin
  2002-05-15  1:27     ` Jan Hubicka
  1 sibling, 0 replies; 58+ messages in thread
From: Daniel Berlin @ 2002-05-15  0:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: gcc, mark

On Tue, 14 May 2002, David S. Miller wrote:

>    From: Daniel Berlin <dberlin@dberlin.org>
>    Date: Wed, 15 May 2002 01:50:37 -0400 (EDT)
> 
>    I had too much fun funking around with hard to track down alias problems
>    the last time i spent a month trying to improve and fix the RTL version.
>    It's rather easy to fix what's there now, but it doesn't do anything 
>    (really. It shows 0% difference on benchmarks), except for whatever 
>    special case it's currently designed to handle.
>    
>    So i'm out.
> 
> If store-motion is useless and something that really does the
> job it is trying to do is "in the pipeline", why don't we
> just delete the store-motion GCSE stuff?

It's not useless in general, it's useless in the current form, except for 
the special case of stores to symbol_ref's killing stores to 
symbol_ref's inside loops (IIRC).  One could extend it, and it's easy in 
terms of store motion code, but hard in terms  of tracking down bugs 
elsewhere (particularly, aliasing) that affect your  
results. 
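
To make that special case concrete, here is a hand-written sketch of the
kind of source where a store to a global (a symbol_ref in RTL) inside a
loop is overwritten by the next iteration's store. The variable and
function names are hypothetical, and the second function only shows by
hand the effect such a pass aims for; it is not output from GCC. The
rewrite is only valid if `v` does not alias `total`, which is exactly
where the aliasing trouble comes in.

```c
/* Hypothetical global; in RTL its address would be a symbol_ref. */
int total;

/* Before: every iteration loads and stores `total`, and each store is
   killed by the store in the next iteration. */
void sum_before(const int *v, int n)
{
    for (int i = 0; i < n; i++)
        total += v[i];
}

/* After (written by hand): the running value lives in a local, and a
   single store is sunk below the loop.  Assumes v does not point at
   `total`; if it did, the two functions could compute different results. */
void sum_after(const int *v, int n)
{
    int t = total;
    for (int i = 0; i < n; i++)
        t += v[i];
    total = t;
}
```

For a non-aliasing array both functions leave the same value in `total`,
but the second performs one store instead of n.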

The "in the pipeline" SSAPRE works at the AST level (it's on the 
ast-optimizer-branch).

Thus, it can't be run after spill code generation or anything.
On machines where we spill frequently, good store motion would be very 
useful at the RTL level to run after spilling.

I can't think of other reasons we would introduce lots of optimizable 
stores in RTL other than spilling.

So you could be quite right that, for non-x86 architectures, it 
eventually won't make sense to run store motion at the RTL level.

It doesn't make sense to delete it now, however, since the speed at which 
i work on SSAPRE will be slowing down soon as I start my summer internship 
as a law clerk.
But, if it's not useful by the time SSAPRE can do store motion, it 
would likely make sense to delete the code, even if SSAPRE is still only 
existing on a branch. 

 > 
> If we are going to keep the store-motion stuff, we have the problem
> that the person who knows how to fix it is not willing to do so.

There are others that know how to fix it.  
And improving it is conceptually easy.

And hey, maybe all the alias work that was done a few months after i last 
spent moons fixing and improving store motion has gotten rid of the bugs 
i ran into.

It's quite possible i'm now incorrect, and you won't run into the fun i 
did.

But I'm not going to promise something i'm not 100% positive i can 
deliver. Having tried fixing and improving store motion once, i'm not 100% 
positive i could fix it without getting so frustrated i burn out 
for months. In fact, i'm not even 50% positive.

Given that, and the fact that i have no deadlines, i'm trying to 
spend time on that which i think will improve gcc's compilation speed, 
memory usage, and generated code the most. (with the occasional side trip 
to improving debugging information, particularly for optimized code).

> So we have this regression on the mainline with no resolution.
> 


* Re: GCSE store motion
  2002-05-14 23:33 ` Daniel Berlin
@ 2002-05-15  0:03   ` David S. Miller
  2002-05-15  0:41     ` Daniel Berlin
  2002-05-15  1:27     ` Jan Hubicka
  0 siblings, 2 replies; 58+ messages in thread
From: David S. Miller @ 2002-05-15  0:03 UTC (permalink / raw)
  To: dberlin; +Cc: gcc, mark

   From: Daniel Berlin <dberlin@dberlin.org>
   Date: Wed, 15 May 2002 01:50:37 -0400 (EDT)

   I had too much fun funking around with hard to track down alias problems
   the last time i spent a month trying to improve and fix the RTL version.
   It's rather easy to fix what's there now, but it doesn't do anything 
   (really. It shows 0% difference on benchmarks), except for whatever 
   special case it's currently designed to handle.
   
   So i'm out.

If store-motion is useless and something that really does the
job it is trying to do is "in the pipeline", why don't we
just delete the store-motion GCSE stuff?

If we are going to keep the store-motion stuff, we have the problem
that the person who knows how to fix it is not willing to do so.
So we have this regression on the mainline with no resolution.

* Re: GCSE store motion
  2002-05-14 23:27 David S. Miller
@ 2002-05-14 23:33 ` Daniel Berlin
  2002-05-15  0:03   ` David S. Miller
  2002-05-15  1:22 ` Mark Mitchell
  1 sibling, 1 reply; 58+ messages in thread
From: Daniel Berlin @ 2002-05-14 23:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: gcc, mark

On Tue, 14 May 2002, David S. Miller wrote:

> 
> While walking over diffs between 3.1 and the mainline I came across
> PR/5200.  Basically the fix on the branch just disabled store-motion
> in GCSE.
> 
> This is one of those "fix it properly on the mainline" cases.
> 
> So who is going to step up and work on fixing store motion?

I had too much fun funking around with hard to track down alias problems
the last time i spent a month trying to improve and fix the RTL version.
It's rather easy to fix what's there now, but it doesn't do anything 
(really. It shows 0% difference on benchmarks), except for whatever 
special case it's currently designed to handle.

So i'm out.

Though, on the upside, SSAPRE will be able to move them soon enough, 
which, except on x86, will likely give you most of the benefit (on x86, 
of course, store motion after spilling would likely be helpful).

I'm concentrating on the strength reduction stuff and cleaning up the 
points-to analysis needed first, however.


--Dan

* GCSE store motion
@ 2002-05-14 23:27 David S. Miller
  2002-05-14 23:33 ` Daniel Berlin
  2002-05-15  1:22 ` Mark Mitchell
  0 siblings, 2 replies; 58+ messages in thread
From: David S. Miller @ 2002-05-14 23:27 UTC (permalink / raw)
  To: gcc; +Cc: mark


While walking over diffs between 3.1 and the mainline I came across
PR/5200.  Basically the fix on the branch just disabled store-motion
in GCSE.

This is one of those "fix it properly on the mainline" cases.

So who is going to step up and work on fixing store motion?
Apparently Dan Berlin and Jakub have some idea of the problems.
Refer to the GNATS PR for more information.

end of thread, other threads:[~2002-05-17 21:26 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-05-16  5:30 GCSE store motion Robert Dewar
2002-05-16  7:33 ` -O2 versus -O1 (Was: Re: GCSE store motion) Jan Hubicka
2002-05-16  8:04   ` Mark Mitchell
2002-05-16  8:04   ` Geert Bosch
2002-05-16  8:06     ` Geert Bosch
2002-05-16  7:59 ` GCSE store motion Mark Mitchell
2002-05-16  8:31 ` Daniel Berlin
     [not found] <20020516114838.949B6F28C9@nile.gnat.com.suse.lists.egcs>
     [not found] ` <164620000.1021559673@gandalf.codesourcery.com.suse.lists.egcs>
2002-05-16 11:42   ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2002-05-16  8:29 Robert Dewar
2002-05-16  5:36 Robert Dewar
2002-05-15 13:56 Robert Dewar
2002-05-15 14:06 ` Gabriel Dos Reis
2002-05-15 15:09 ` Toon Moene
2002-05-15 15:20 ` Dale Johannesen
2002-05-15 12:45 Robert Dewar
2002-05-15 12:56 ` Mark Mitchell
2002-05-15 13:29 ` Daniel Berlin
2002-05-15  9:07 Roger Sayle
2002-05-15  9:21 ` Daniel Berlin
2002-05-15  9:53   ` Daniel Berlin
2002-05-15 10:05     ` Roger Sayle
2002-05-15 10:09       ` Daniel Berlin
2002-05-15 10:21         ` Jakub Jelinek
2002-05-15 10:31           ` Mark Mitchell
2002-05-15 11:56         ` law
2002-05-15 11:05     ` Kevin Handy
2002-05-15 11:50       ` Janis Johnson
2002-05-15 11:54     ` law
2002-05-15  9:58   ` Roger Sayle
2002-05-15 10:06     ` Daniel Berlin
2002-05-15 10:15       ` David Edelsohn
2002-05-15 11:56         ` law
2002-05-15 12:07           ` Mark Mitchell
2002-05-15 10:18       ` Roger Sayle
2002-05-15 10:32         ` David Edelsohn
2002-05-16  1:51       ` Jan Hubicka
2002-05-16  9:59         ` Daniel Berlin
2002-05-15 10:02   ` Mark Mitchell
2002-05-15 10:08     ` Roger Sayle
2002-05-15 10:14       ` Mark Mitchell
2002-05-15 10:41         ` Roger Sayle
2002-05-15 11:56         ` law
2002-05-15 12:07           ` Mark Mitchell
2002-05-15 13:48             ` Toon Moene
2002-05-16 13:50               ` law
2002-05-17 15:03               ` Tim Hollebeek
2002-05-16 13:53             ` law
2002-05-16 14:44               ` Daniel Berlin
2002-05-16 21:12                 ` Eric Christopher
2002-05-15 12:32   ` law
2002-05-14 23:27 David S. Miller
2002-05-14 23:33 ` Daniel Berlin
2002-05-15  0:03   ` David S. Miller
2002-05-15  0:41     ` Daniel Berlin
2002-05-15  1:27     ` Jan Hubicka
2002-05-15  1:22 ` Mark Mitchell
2002-05-15  1:29   ` David S. Miller
2002-05-15  1:32     ` Mark Mitchell
