public inbox for gcc@gcc.gnu.org
* [RFH] - Less than optimal code compiling 252.eon -O2 for x86
@ 2005-06-24 22:06 Fariborz Jahanian
  2005-06-24 22:17 ` Andrew Pinski
  2005-06-27 19:20 ` Fariborz Jahanian
  0 siblings, 2 replies; 20+ messages in thread
From: Fariborz Jahanian @ 2005-06-24 22:06 UTC (permalink / raw)
  To: gcc

A source file of 252.eon, mrSurfaceList.cc, produces less efficient
code for initializing instance objects to 0 at -O2 than at -O1. The
behavior is erratic: it does not happen on all x86 platforms, and
making the test smaller makes the problem go away. But here is what I
found to be the cause.

When the source is compiled with -O1 -march=pentium4, the 'cse' phase
sees the following pattern initializing a 'double' with 0:

(insn 18 13 19 0 (set (reg:SF 109)
         (mem/u/i:SF (symbol_ref/u:SI ("*LC11") [flags 0x2]) [0 S4 A32])) -1 (nil)
     (nil))

(insn 19 18 20 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                 (const_int -32 [0xffffffffffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
         (float_extend:DF (reg:SF 109))) 86 {*extendsfdf2_sse} (nil)
     (nil))

The fold_rtx routine then converts it into its reduced form, resulting
in optimal code:

(insn 19 13 21 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                 (const_int -32 [0xffffffffffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
         (const_double:DF 0.0 [0x0.0p+0])) 64 {*movdf_nointeger} (nil)
     (nil))


But when the same source is compiled with -O2 -march=pentium4, the
'cse' phase sees a slightly different pattern (note that
float_extend:DF has moved):

(insn 18 13 19 0 (set (reg:DF 109)
         (float_extend:DF (mem/u/i:SF (symbol_ref/u:SI ("*LC13") [flags 0x2]) [0 S4 A32]))) -1 (nil)
     (nil))

(insn 19 18 20 0 (set (mem/s/j:DF (plus:SI (reg/f:SI 20 frame)
                 (const_int -32 [0xffffffffffffffe0])) [0 objectBox.pmin.e+16 S8 A128])
         (reg:DF 109)) 64 {*movdf_nointeger} (nil)
     (nil))

This cannot be simplified by fold_rtx, resulting in less efficient code.

The change in pattern is most likely due to additional tree
optimization phases running at -O2. If so, should cse be taught to
simplify the new RTL pattern? Or does the tree optimizer phase
responsible for the less-than-optimal tree need to be tweaked to
generate the same tree as with -O1?
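
For anyone trying to reduce this, here is a minimal sketch of the kind
of source that could lead to such RTL. It assumes the doubles are
initialized from a single-precision constant, which is what the
float_extend:DF of an SF constant-pool load suggests. The type and
function names are hypothetical and not taken from 252.eon, and a case
this small may well be simplified in another pass and not show the
problem at all.

```c
#include <assert.h>

/* Hypothetical types; the names only echo objectBox.pmin.e in the dump. */
struct vec3 { double e[3]; };
struct bbox { struct vec3 pmin, pmax; };

/* Assign a float constant to each double member.  Conceptually this
   is a widening conversion from float to double at every store. */
void bbox_clear(struct bbox *b)
{
    for (int i = 0; i < 3; i++) {
        b->pmin.e[i] = 0.0f;
        b->pmax.e[i] = 0.0f;
    }
}

/* Return 1 if all six components compare equal to 0.0, else 0. */
int bbox_is_zero(const struct bbox *b)
{
    for (int i = 0; i < 3; i++)
        if (b->pmin.e[i] != 0.0 || b->pmax.e[i] != 0.0)
            return 0;
    return 1;
}

/* Small driver: clear a box and check the result. */
int bbox_demo(void)
{
    struct bbox objectBox;

    bbox_clear(&objectBox);
    assert(bbox_is_zero(&objectBox));
    return 0;
}
```

Compiling this once with -O1 -march=pentium4 -S and once with -O2
-march=pentium4 -S, then comparing the stores in bbox_clear, is one way
to look for the difference.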

Thanks, fariborz


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-24 22:06 [RFH] - Less than optimal code compiling 252.eon -O2 for x86 Fariborz Jahanian
@ 2005-06-24 22:17 ` Andrew Pinski
  2005-06-24 23:46   ` fjahanian
  2005-06-27 19:20 ` Fariborz Jahanian
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Pinski @ 2005-06-24 22:17 UTC (permalink / raw)
  To: Fariborz Jahanian; +Cc: gcc


On Jun 24, 2005, at 6:07 PM, Fariborz Jahanian wrote:

> A source file mrSurfaceList.cc of 252.eon produces less efficient code 
> initializing instance objects to 0 at -O2 than at -O1. Behavior is 
> random and it does not happen on all x86  platforms and making the 
> test smaller makes the problem go away. But here is what I found out 
> is the cause.
>

> This cannot be simplified by fold_rtx, resulting in less efficient 
> code.

I wonder why combine can do the simplification, though, which is why we
still produce good code for the simple testcase:
void f1(double *d,float *f2)
{
   *f2 = 0.0;
   *d = 0.0;
}

Thanks,
Andrew Pinski


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-24 22:17 ` Andrew Pinski
@ 2005-06-24 23:46   ` fjahanian
  2005-06-25  0:06     ` Steven Bosscher
  0 siblings, 1 reply; 20+ messages in thread
From: fjahanian @ 2005-06-24 23:46 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc


On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:
>
> I wonder why combine can do the simplification though which is why  
> still
> produce good code for the simple testcase:
> void f1(double *d,float *f2)
> {
>   *f2 = 0.0;
>   *d = 0.0;
> }
>
It is hard to produce a simple test case exhibiting the same
problem (-O1 producing better code than -O2). Yes, small test cases
move the desired simplification to other phases.

- fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-24 23:46   ` fjahanian
@ 2005-06-25  0:06     ` Steven Bosscher
  2005-06-30 14:42       ` fjahanian
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Bosscher @ 2005-06-25  0:06 UTC (permalink / raw)
  To: gcc; +Cc: fjahanian, Andrew Pinski

On Saturday 25 June 2005 01:48, fjahanian wrote:
> On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:
> > I wonder why combine can do the simplification though which is why
> > still
> > produce good code for the simple testcase:
> > void f1(double *d,float *f2)
> > {
> >   *f2 = 0.0;
> >   *d = 0.0;
> > }
>
> It is hard to reproduce the simple test case, exhibiting the same
> problem (-O1 producing better code than -O2). Yes, small test cases
> move the desired simplification to other phases.

It often helps if you know what function your poorer code is in.  You
could e.g. try to make the .optimized dump of that function compilable
and see if the problem shows up there again.  Then work your way down
to something small.

Gr.
Steven


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-24 22:06 [RFH] - Less than optimal code compiling 252.eon -O2 for x86 Fariborz Jahanian
  2005-06-24 22:17 ` Andrew Pinski
@ 2005-06-27 19:20 ` Fariborz Jahanian
  2005-06-27 19:56   ` Richard Henderson
  1 sibling, 1 reply; 20+ messages in thread
From: Fariborz Jahanian @ 2005-06-27 19:20 UTC (permalink / raw)
  To: Fariborz Jahanian; +Cc: gcc

FYI, the change to the RTL at -O2 vs. -O1 is that -O2 includes
-fforce-mem, which forces memory operands into registers to make
memory references common subexpressions. In this case, the constant
double float value is assigned to an xmm register, which is then used
where it is needed. So I would say this behavior is as expected but
not ideal for x86, where a couple of 'movl   $0x0, mem' may be
preferred to a single 'movsd   %xmm7, mem' for 252.eon on x86-darwin.
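
The reason two integer stores can stand in for the SSE store is that,
assuming IEEE 754 doubles of 8 bytes, +0.0 is represented by all-zero
bits. A small sketch that checks this; the function names are made up
for illustration, and memcpy merely stands in for the stores:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store 0.0 the way 'movsd %xmmN, mem' would: one 8-byte FP store. */
void store_zero_fp(unsigned char mem[8])
{
    double zero = 0.0;

    memcpy(mem, &zero, sizeof zero);
}

/* Store 0.0 the way two 'movl $0x0, mem' would: two 4-byte stores. */
void store_zero_int(unsigned char mem[8])
{
    uint32_t zero = 0;

    memcpy(mem, &zero, sizeof zero);
    memcpy(mem + 4, &zero, sizeof zero);
}

/* Return 1 if both ways leave identical bytes that read back as 0.0. */
int zero_stores_match(void)
{
    unsigned char a[8], b[8];
    double back;

    assert(sizeof(double) == 8);   /* assumption stated above */
    store_zero_fp(a);
    store_zero_int(b);
    memcpy(&back, b, sizeof back);
    return memcmp(a, b, sizeof a) == 0 && back == 0.0;
}
```

On a target where the assumption holds, zero_stores_match() returns 1.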

- fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-27 19:20 ` Fariborz Jahanian
@ 2005-06-27 19:56   ` Richard Henderson
  2005-06-27 21:52     ` Fariborz Jahanian
  0 siblings, 1 reply; 20+ messages in thread
From: Richard Henderson @ 2005-06-27 19:56 UTC (permalink / raw)
  To: Fariborz Jahanian; +Cc: gcc

On Mon, Jun 27, 2005 at 12:21:01PM -0700, Fariborz Jahanian wrote:
> FYI, the change to rtl  in -O2 vs. -O1 is that -O2 includes
> -fforce-mem which forces memory operands to registers to make memory
> references common sub-expressions.

Hmm.  I would suspect this is obsolete now.  We'll have forced
everything into "registers" (or something equivalent that we
can work with) during tree optimization.  Any CSEs that can be
made should have been made.
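
To illustrate the idea being discussed: forcing a memory operand into
a register lets later uses share one load. Below is a hand-written
sketch of the before and after forms in C. The function names are
hypothetical, and this shows the concept only, not what any GCC pass
actually emits.

```c
#include <assert.h>

/* Before: the memory operand *p appears in two expressions.  Unless
   the compiler can prove that out does not alias p, it has to load
   *p again after the store to out[0]. */
void sums_mem(const double *p, double x, double y, double out[2])
{
    out[0] = *p + x;
    out[1] = *p + y;
}

/* After: *p is loaded once into a temporary (a "register"), and the
   temporary becomes the common subexpression. */
void sums_reg(const double *p, double x, double y, double out[2])
{
    double t = *p;

    out[0] = t + x;
    out[1] = t + y;
}

/* Return 1 if both forms agree when out does not alias p. */
int sums_agree(void)
{
    double v = 1.5, a[2], b[2];

    sums_mem(&v, 2.0, 3.0, a);
    sums_reg(&v, 2.0, 3.0, b);
    assert(a[0] == 3.5 && a[1] == 4.5);
    return a[0] == b[0] && a[1] == b[1];
}
```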


r~


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-27 19:56   ` Richard Henderson
@ 2005-06-27 21:52     ` Fariborz Jahanian
  2005-06-30 16:04       ` fjahanian
  0 siblings, 1 reply; 20+ messages in thread
From: Fariborz Jahanian @ 2005-06-27 21:52 UTC (permalink / raw)
  To: Richard Henderson; +Cc: gcc


On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:

> Hmm.  I would suspect this is obsolete now.  We'll have forced
> everything into "registers" (or something equivalent that we
> can work with) during tree optimization.  Any CSEs that can be
> made should have been made.
>

I will do a sanity check followed by SPEC runs (x86 and ppc darwin)
and see if behavior changes by obsoleting -fforce-mem at -O2 (or
higher).

- Thanks, fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-25  0:06     ` Steven Bosscher
@ 2005-06-30 14:42       ` fjahanian
  2005-06-30 15:03         ` fjahanian
  0 siblings, 1 reply; 20+ messages in thread
From: fjahanian @ 2005-06-30 14:42 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, Andrew Pinski


On Jun 24, 2005, at 5:06 PM, Steven Bosscher wrote:

> On Saturday 25 June 2005 01:48, fjahanian wrote:
>
>> On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:
>>
>>> I wonder why combine can do the simplification though which is why
>>> still
>>> produce good code for the simple testcase:
>>> void f1(double *d,float *f2)
>>> {
>>>   *f2 = 0.0;
>>>   *d = 0.0;
>>> }
>>>
>>
>> It is hard to reproduce the simple test case, exhibiting the same
>> problem (-O1 producing better code than -O2). Yes, small test cases
>> move the desired simplification to other phases.
>>
>
> It often helps if you know what function your poorer code is in.  You
> could e.g. try to make the .optimized dump of that function compilable
> and see if the problem shows up there again.  Then work your way down
> to something small.

Yes, I am planning to do this. My first question, though, was whether
the RTL generated at -O2, which does not get simplified, is correct and
should be optimized in one of the RTL optimizers. If not, then the
focus shifts to the tree optimizers.

- Thanks, fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 14:42       ` fjahanian
@ 2005-06-30 15:03         ` fjahanian
  0 siblings, 0 replies; 20+ messages in thread
From: fjahanian @ 2005-06-30 15:03 UTC (permalink / raw)
  Cc: Steven Bosscher, gcc


On Jun 24, 2005, at 5:20 PM, fjahanian wrote:

>
> On Jun 24, 2005, at 5:06 PM, Steven Bosscher wrote:
>
>
>> On Saturday 25 June 2005 01:48, fjahanian wrote:
>>
>>
>>> On Jun 24, 2005, at 3:16 PM, Andrew Pinski wrote:
>>>
>>>
>>>> I wonder why combine can do the simplification though which is why
>>>> still
>>>> produce good code for the simple testcase:
>>>> void f1(double *d,float *f2)
>>>> {
>>>>   *f2 = 0.0;
>>>>   *d = 0.0;
>>>> }
>>>>
>>>>
>>>
>>> It is hard to reproduce the simple test case, exhibiting the same
>>> problem (-O1 producing better code than -O2). Yes, small test cases
>>> move the desired simplification to other phases.
>>>
>>>
>>
>> It often helps if you know what function your poorer code is in.  You
>> could e.g. try to make the .optimized dump of that function  
>> compilable
>> and see if the problem shows up there again.  Then work your way down
>> to something small.
>>
>
> Yes, I am planning to do this. My first question was though if the  
> RTL generated by -O2, which does not get simplified, is correct and  
> should be optimized in one of the rtl optimizers. If not, then  
> focus shifts to tree optimizers.

This email went through late and has been superseded by earlier
exchanges. It turned out to be all RTL-related issues.

- fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-27 21:52     ` Fariborz Jahanian
@ 2005-06-30 16:04       ` fjahanian
  2005-06-30 16:08         ` Andrew Pinski
  2005-06-30 16:55         ` Steven Bosscher
  0 siblings, 2 replies; 20+ messages in thread
From: fjahanian @ 2005-06-30 16:04 UTC (permalink / raw)
  Cc: Richard Henderson, gcc


On Jun 27, 2005, at 2:50 PM, Fariborz Jahanian wrote:

>
> On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:
>
>
>> Hmm.  I would suspect this is obsolete now.  We'll have forced
>> everything into "registers" (or something equivalent that we
>> can work with) during tree optimization.  Any CSEs that can be
>> made should have been made.
>>
>>
>
> I will do  sanity check followed by SPEC runs (x86 and ppc darwin)  
> and see if behavior changes by obsoleting -fforce-mem  in -O2  (or  
> higher).

Bootstrapped and dejagnu tested on apple-x86-darwin and
apple-ppc-darwin.

We also observed that on ppc, SPEC did not show any performance
change either way. On apple-x86-darwin, 252.eon improved by 7% as
expected, with no noticeable change in other benchmarks. One caveat
to all this is that the change may expose optimization bugs which were
previously hidden by the inclusion of -fforce-mem.

OK for check-in?

- fariborz

ChangeLog:

2005-06-30  Fariborz Jahanian <fjahanian@apple.com>

       * opts.c (decode_options): Don't set -fforce-mem with -O2 and more.


Index: opts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/opts.c,v
retrieving revision 1.114
diff -c -p -r1.114 opts.c
*** opts.c      24 Jun 2005 03:09:45 -0000      1.114
--- opts.c      30 Jun 2005 15:55:15 -0000
*************** decode_options (unsigned int argc, const
*** 559,565 ****
         flag_rerun_cse_after_loop = 1;
         flag_rerun_loop_opt = 1;
         flag_caller_saves = 1;
-       flag_force_mem = 1;
         flag_peephole2 = 1;
   #ifdef INSN_SCHEDULING
         flag_schedule_insns = 1;
--- 559,564 ----



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 16:04       ` fjahanian
@ 2005-06-30 16:08         ` Andrew Pinski
  2005-06-30 16:55         ` Steven Bosscher
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Pinski @ 2005-06-30 16:08 UTC (permalink / raw)
  To: fjahanian; +Cc: gcc, Richard Henderson


On Jun 30, 2005, at 12:05 PM, fjahanian wrote:
> Bootstrapped and dejagnu tested on apple-x86-darwin and 
> apple-ppc-darwin.
>
> We also observed that on ppc, SPEC did not show any performance change 
> either way. On apple-x86-darwin 252.eon improved by 7% as expected, 
> with no noticeable change in other benchmarks. One caveat to all these 
> is that this may expose optimization bugs which were previously hidden 
> by inclusion of -fforce-mem.
>
> OK for check-in?
>
> - fariborz
>
> ChangeLog:
>
> 2005-06-30  Fariborz Jahanian <fjahanian@apple.com>
>
>       * opts.c (decode_options): Don't set -fforce-mem with -O2 and 
> more.

Please also update the docs in invoke.texi.

Thanks,
Andrew Pinski


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 16:04       ` fjahanian
  2005-06-30 16:08         ` Andrew Pinski
@ 2005-06-30 16:55         ` Steven Bosscher
  2005-06-30 17:48           ` Jeffrey A Law
  1 sibling, 1 reply; 20+ messages in thread
From: Steven Bosscher @ 2005-06-30 16:55 UTC (permalink / raw)
  To: gcc; +Cc: fjahanian, Richard Henderson

On Thursday 30 June 2005 18:05, fjahanian wrote:
> On Jun 27, 2005, at 2:50 PM, Fariborz Jahanian wrote:
> > On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:
> >> Hmm.  I would suspect this is obsolete now.  We'll have forced
> >> everything into "registers" (or something equivalent that we
> >> can work with) during tree optimization.  Any CSEs that can be
> >> made should have been made.

(...)

> 2005-06-30  Fariborz Jahanian <fjahanian@apple.com>
>
>        * opts.c (decode_options): Don't set -fforce-mem with -O2 and more.

If the code is obsolete as rth suggested, wouldn't it be even better to
remove all traces of it, i.e. clean up all places where flag_force_mem
is checked?

Gr.
Steven


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 16:55         ` Steven Bosscher
@ 2005-06-30 17:48           ` Jeffrey A Law
  2005-06-30 18:12             ` Bernd Schmidt
  0 siblings, 1 reply; 20+ messages in thread
From: Jeffrey A Law @ 2005-06-30 17:48 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc, fjahanian, Richard Henderson

On Thu, 2005-06-30 at 18:55 +0200, Steven Bosscher wrote:
> On Thursday 30 June 2005 18:05, fjahanian wrote:
> > On Jun 27, 2005, at 2:50 PM, Fariborz Jahanian wrote:
> > > On Jun 27, 2005, at 12:56 PM, Richard Henderson wrote:
> > >> Hmm.  I would suspect this is obsolete now.  We'll have forced
> > >> everything into "registers" (or something equivalent that we
> > >> can work with) during tree optimization.  Any CSEs that can be
> > >> made should have been made.
> 
> (...)
> 
> > 2005-06-30  Fariborz Jahanian <fjahanian@apple.com>
> >
> >        * opts.c (decode_options): Don't set -fforce-mem with -O2 and more.
> 
> If the code is obsolete as rth suggested, wouldn't it be even better to 
> remove all traces of it, i.e. clean up all places where flag_force_mem
> is checked?
I'd tend to agree.  I'd rather see the option go away than linger on
if the option is no longer useful.

Jeff


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 17:48           ` Jeffrey A Law
@ 2005-06-30 18:12             ` Bernd Schmidt
  2005-06-30 18:19               ` Joe Buck
  2005-06-30 18:23               ` Jeffrey A Law
  0 siblings, 2 replies; 20+ messages in thread
From: Bernd Schmidt @ 2005-06-30 18:12 UTC (permalink / raw)
  To: law; +Cc: Steven Bosscher, gcc, fjahanian, Richard Henderson

Jeffrey A Law wrote:
> I'd tend to agree.  I'd rather see the option go away than linger on
> if the option is no longer useful.

I wouldn't mind that, but I'd also like to point out that there are 
Makefiles out there which hard-code things like -fforce-mem.  Do we want 
to keep the option as a stub to avoid breaking them?


Bernd


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 18:12             ` Bernd Schmidt
@ 2005-06-30 18:19               ` Joe Buck
  2005-06-30 18:25                 ` Giovanni Bajo
  2005-06-30 18:23               ` Jeffrey A Law
  1 sibling, 1 reply; 20+ messages in thread
From: Joe Buck @ 2005-06-30 18:19 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: law, Steven Bosscher, gcc, fjahanian, Richard Henderson

On Thu, Jun 30, 2005 at 08:12:14PM +0200, Bernd Schmidt wrote:
> Jeffrey A Law wrote:
> >I'd tend to agree.  I'd rather see the option go away than linger on
> >if the option is no longer useful.
> 
> I wouldn't mind that, but I'd also like to point out that there are 
> Makefiles out there which hard-code things like -fforce-mem.  Do we want 
> to keep the option as a stub to avoid breaking them?

It could produce a "deprecated" or "obsolete" warning for 4.1, and then be
removed for 4.2.


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 18:12             ` Bernd Schmidt
  2005-06-30 18:19               ` Joe Buck
@ 2005-06-30 18:23               ` Jeffrey A Law
  2005-06-30 19:06                 ` Fariborz Jahanian
  1 sibling, 1 reply; 20+ messages in thread
From: Jeffrey A Law @ 2005-06-30 18:23 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Steven Bosscher, gcc, fjahanian, Richard Henderson

On Thu, 2005-06-30 at 20:12 +0200, Bernd Schmidt wrote:
> Jeffrey A Law wrote:
> > I'd tend to agree.  I'd rather see the option go away than linger on
> > if the option is no longer useful.
> 
> I wouldn't mind that, but I'd also like to point out that there are 
> Makefiles out there which hard-code things like -fforce-mem.  Do we want 
> to keep the option as a stub to avoid breaking them?
Excellent point.  I believe in other cases we've kept the option
around for a release, then killed it.

jeff




* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 18:19               ` Joe Buck
@ 2005-06-30 18:25                 ` Giovanni Bajo
  0 siblings, 0 replies; 20+ messages in thread
From: Giovanni Bajo @ 2005-06-30 18:25 UTC (permalink / raw)
  To: Joe Buck; +Cc: Bernd Schmidt, gcc

Joe Buck <Joe.Buck@synopsys.COM> wrote:

>>> I'd tend to agree.  I'd rather see the option go away than linger on
>>> if the option is no longer useful.
>>
>> I wouldn't mind that, but I'd also like to point out that there are
>> Makefiles out there which hard-code things like -fforce-mem.  Do we want
>> to keep the option as a stub to avoid breaking them?
>
> It could produce a "deprecated" or "obsolete" warning for 4.1, and then be
> removed for 4.2.

Personally, I don't see the point. -fforce-mem is just an optimization option
which does not affect the semantics of the program in any way. If we remove
it, people would need to just drop it from their Makefiles. There is no
source code adjustment required (which would justify the deprecation cycle).

Or convert it to a noop as Bernd suggested.
-- 
Giovanni Bajo


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 18:23               ` Jeffrey A Law
@ 2005-06-30 19:06                 ` Fariborz Jahanian
  2005-06-30 19:47                   ` Steven Bosscher
  0 siblings, 1 reply; 20+ messages in thread
From: Fariborz Jahanian @ 2005-06-30 19:06 UTC (permalink / raw)
  To: law; +Cc: Bernd Schmidt, Steven Bosscher, gcc, Richard Henderson


On Jun 30, 2005, at 11:23 AM, Jeffrey A Law wrote:

> On Thu, 2005-06-30 at 20:12 +0200, Bernd Schmidt wrote:
>
>> Jeffrey A Law wrote:
>>
>>> I'd tend to agree.  I'd rather see the option go away than linger on
>>> if the option is no longer useful.
>>>
>>
>> I wouldn't mind that, but I'd also like to point out that there are
>> Makefiles out there which hard-code things like -fforce-mem.  Do  
>> we want
>> to keep the option as a stub to avoid breaking them?
>>
> Excellent point.  I believe in other cases we've kept the option
> around for a release, then killed it.

I would also like to keep this feature around for a while. It is  
possible that setting of this option under -O2/-O3 has masked some  
optimization bugs. In which case, addition of -fforce-mem would be a  
temporary workaround.

- fariborz



* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 19:06                 ` Fariborz Jahanian
@ 2005-06-30 19:47                   ` Steven Bosscher
  2005-06-30 21:30                     ` Fariborz Jahanian
  0 siblings, 1 reply; 20+ messages in thread
From: Steven Bosscher @ 2005-06-30 19:47 UTC (permalink / raw)
  To: Fariborz Jahanian; +Cc: law, Bernd Schmidt, gcc, Richard Henderson

On Thursday 30 June 2005 21:05, Fariborz Jahanian wrote:
> On Jun 30, 2005, at 11:23 AM, Jeffrey A Law wrote:
> > On Thu, 2005-06-30 at 20:12 +0200, Bernd Schmidt wrote:
> >> Jeffrey A Law wrote:
> >>> I'd tend to agree.  I'd rather see the option go away than linger on
> >>> if the option is no longer useful.
> >>
> >> I wouldn't mind that, but I'd also like to point out that there are
> >> Makefiles out there which hard-code things like -fforce-mem.  Do
> >> we want
> >> to keep the option as a stub to avoid breaking them?
> >
> > Excellent point.  I believe in other cases we've kept the option
> > around for a release, then killed it.
>
> I would also like to keep this feature around for a while. It is
> possible that setting of this option under -O2/-O3 has masked some
> optimization bugs. In which case, addition of -fforce-mem would be a
> temporary workaround.

Well, maybe so, but it would be a pretty lame workaround.  Why are you
so worried about bugs?  This flag was always disabled at -O1, and we
have never seen any bug reports that got fixed with -fforce-mem.  And
besides, it is better to fix bugs than to work around them.

Making the option a nop, issuing a warning in 4.1 and removing the
option completely for gcc 4.2 looks like a very reasonable approach to
me.
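
A minimal sketch of that approach, assuming a toy option handler. The
function name, the result codes, and the warning text are hypothetical
and do not reflect GCC's actual option machinery:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum opt_result { OPT_HANDLED, OPT_IGNORED_DEPRECATED, OPT_UNKNOWN };

/* Toy handler: accept the retired flag so old Makefiles keep working,
   warn about it, and leave the optimization settings untouched. */
enum opt_result handle_option(const char *arg, int *optimize)
{
    if (strcmp(arg, "-O1") == 0) {
        *optimize = 1;
        return OPT_HANDLED;
    }
    if (strcmp(arg, "-O2") == 0) {
        *optimize = 2;
        return OPT_HANDLED;
    }
    if (strcmp(arg, "-fforce-mem") == 0) {
        fprintf(stderr, "warning: %s is deprecated and has no effect\n", arg);
        return OPT_IGNORED_DEPRECATED;
    }
    return OPT_UNKNOWN;
}

/* Feed the handler a typical old command line; returns the final level. */
int handle_option_demo(void)
{
    int optimize = 0;
    enum opt_result r1 = handle_option("-O2", &optimize);
    enum opt_result r2 = handle_option("-fforce-mem", &optimize);

    assert(r1 == OPT_HANDLED && r2 == OPT_IGNORED_DEPRECATED);
    return optimize;   /* still 2: the stub changed nothing */
}
```

With a stub like this, a Makefile that hard-codes -fforce-mem still
builds, and the warning tells its owner to drop the flag before the
option is removed.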

Gr.
Steven


* Re: [RFH] - Less than optimal code compiling 252.eon -O2 for x86
  2005-06-30 19:47                   ` Steven Bosscher
@ 2005-06-30 21:30                     ` Fariborz Jahanian
  0 siblings, 0 replies; 20+ messages in thread
From: Fariborz Jahanian @ 2005-06-30 21:30 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: law, Bernd Schmidt, gcc, Richard Henderson


On Jun 30, 2005, at 12:47 PM, Steven Bosscher wrote:

>
> Well, maybe so, but it would be a pretty lame workaround.  Why are you
> so worried about bugs?  This flag was always disabled at -O1, and we
> have never seen any bug reports that got fixed with -fforce-mem.  And
> besides, it is better to fix bugs than to work around them.
>
> Making the option a nop, issuing a warning in 4.1 and removing the
> option completely for gcc 4.2 looks like a very reasonable approach to
> me.
>

OK. This seems to be the consensus, and I will prepare a patch based on
that.

- Thanks, fariborz




Thread overview: 20+ messages
2005-06-24 22:06 [RFH] - Less than optimal code compiling 252.eon -O2 for x86 Fariborz Jahanian
2005-06-24 22:17 ` Andrew Pinski
2005-06-24 23:46   ` fjahanian
2005-06-25  0:06     ` Steven Bosscher
2005-06-30 14:42       ` fjahanian
2005-06-30 15:03         ` fjahanian
2005-06-27 19:20 ` Fariborz Jahanian
2005-06-27 19:56   ` Richard Henderson
2005-06-27 21:52     ` Fariborz Jahanian
2005-06-30 16:04       ` fjahanian
2005-06-30 16:08         ` Andrew Pinski
2005-06-30 16:55         ` Steven Bosscher
2005-06-30 17:48           ` Jeffrey A Law
2005-06-30 18:12             ` Bernd Schmidt
2005-06-30 18:19               ` Joe Buck
2005-06-30 18:25                 ` Giovanni Bajo
2005-06-30 18:23               ` Jeffrey A Law
2005-06-30 19:06                 ` Fariborz Jahanian
2005-06-30 19:47                   ` Steven Bosscher
2005-06-30 21:30                     ` Fariborz Jahanian
