public inbox for gcc@gcc.gnu.org
* IRA undoing scheduling decisions
@ 2009-08-25  9:47 Charles J. Tabony
  2009-08-25 13:18 ` Adam Nemet
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Charles J. Tabony @ 2009-08-25  9:47 UTC (permalink / raw)
  To: gcc

Fellow GCC developers,

I am seeing a performance regression on the port I maintain, and I would appreciate some pointers.

When I compile the following code

void f(int *x, int *y){
  *x = 7;
  *y = 4;
}

with GCC 4.3.2, I get the desired sequence of instructions.  I'll call it sequence A:

r0 = 7
r1 = 4
[x] = r0
[y] = r1

When I compile the same code with GCC 4.4.0, I get a sequence that is lower performance for my target machine.  I'll call it sequence B:

r0 = 7
[x] = r0
r0 = 4
[y] = r0

I see the same difference between GCC 4.3.2 and 4.4.0 when compiling for PowerPC, MIPS, ARM, and FR-V.

When I look at the RTL dumps, I see that the first scheduling pass always produces sequence A, across all targets and GCC versions I tried.  In GCC 4.3.2, sequence A persists throughout the remainder of compilation.  In GCC 4.4.0, for every target, the .ira dump shows that the sequence of instructions has reverted back to sequence B.

Are there any machine-dependent parameters that I can tune to prevent IRA from transforming sequence A into sequence B?  If not, where can I add a hook to allow this decision to be tuned per machine?

Is there any more information you would like me to provide?

Thank you,
Charles J. Tabony


* Re: IRA undoing scheduling decisions
  2009-08-25  9:47 IRA undoing scheduling decisions Charles J. Tabony
@ 2009-08-25 13:18 ` Adam Nemet
  2009-08-26 10:09   ` Charles J. Tabony
  2009-08-25 15:41 ` Bingfeng Mei
  2009-08-27  0:12 ` Peter Bergner
  2 siblings, 1 reply; 17+ messages in thread
From: Adam Nemet @ 2009-08-25 13:18 UTC (permalink / raw)
  To: Charles J. Tabony; +Cc: gcc

"Charles J. Tabony" <tabonyee@austin.rr.com> writes:
> I see the same difference between GCC 4.3.2 and 4.4.0 when compiling
> for PowerPC, MIPS, ARM, and FR-V.

I can confirm this with mainline on MIPS/Octeon.  Can you please file a
bug?

Adam


* RE: IRA undoing scheduling decisions
  2009-08-25  9:47 IRA undoing scheduling decisions Charles J. Tabony
  2009-08-25 13:18 ` Adam Nemet
@ 2009-08-25 15:41 ` Bingfeng Mei
  2009-08-27  0:12 ` Peter Bergner
  2 siblings, 0 replies; 17+ messages in thread
From: Bingfeng Mei @ 2009-08-25 15:41 UTC (permalink / raw)
  To: Charles J. Tabony, gcc

I can confirm this too on our private port, though in a slightly different form.

r2 = 7
[r0] = r2
r0 = 4
[r1] = r0

Bingfeng 



* Re: IRA undoing scheduling decisions
  2009-08-25 13:18 ` Adam Nemet
@ 2009-08-26 10:09   ` Charles J. Tabony
  2009-08-26 10:25     ` Adam Nemet
  0 siblings, 1 reply; 17+ messages in thread
From: Charles J. Tabony @ 2009-08-26 10:09 UTC (permalink / raw)
  To: Adam Nemet; +Cc: gcc

---- Adam Nemet <anemet@caviumnetworks.com> wrote: 
> "Charles J. Tabony" <tabonyee@austin.rr.com> writes:
> > I see the same difference between GCC 4.3.2 and 4.4.0 when compiling
> > for PowerPC, MIPS, ARM, and FR-V.
> 
> I can confirm this with mainline on MIPS/Octeon.  Can you please file a
> bug.

Filed as PR 41171.

Is this actually a performance regression on MIPS?  I suspect either
sequence would be the same performance on most machines.  One machine
where performance appears to suffer is Itanium, since it has EPIC and
two store ports.  What's odd is that for Itanium, I get sequence B
out of both GCC 4.4.0 and 4.3.2.

Thanks,
Charles J. Tabony

P.S.  Sorry about the long lines in my initial email.  I wasn't sure
how my webmail would format long lines.


* Re: IRA undoing scheduling decisions
  2009-08-26 10:09   ` Charles J. Tabony
@ 2009-08-26 10:25     ` Adam Nemet
  0 siblings, 0 replies; 17+ messages in thread
From: Adam Nemet @ 2009-08-26 10:25 UTC (permalink / raw)
  To: Charles J. Tabony; +Cc: gcc

Charles J. Tabony writes:
> Filed as PR 41171.

Thanks.

> Is this actually a performance regression on MIPS?  I suspect either
> sequence would be the same performance on most machines.

Yes it is: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41171#c1

Adam


* Re: IRA undoing scheduling decisions
  2009-08-25  9:47 IRA undoing scheduling decisions Charles J. Tabony
  2009-08-25 13:18 ` Adam Nemet
  2009-08-25 15:41 ` Bingfeng Mei
@ 2009-08-27  0:12 ` Peter Bergner
  2009-08-27  0:58   ` Richard Guenther
                     ` (2 more replies)
  2 siblings, 3 replies; 17+ messages in thread
From: Peter Bergner @ 2009-08-27  0:12 UTC (permalink / raw)
  To: Charles J. Tabony; +Cc: gcc

On Mon, 2009-08-24 at 23:56 +0000, Charles J. Tabony wrote:
> I am seeing a performance regression on the port I maintain, and I would appreciate some pointers.
> 
> When I compile the following code
> 
> void f(int *x, int *y){
>   *x = 7;
>   *y = 4;
> }
> 
> with GCC 4.3.2, I get the desired sequence of instructions.  I'll call it sequence A:
> 
> r0 = 7
> r1 = 4
> [x] = r0
> [y] = r1
> 
> When I compile the same code with GCC 4.4.0, I get a sequence that is lower performance for my target machine.  I'll call it sequence B:
> 
> r0 = 7
> [x] = r0
> r0 = 4
> [y] = r0

This is caused by update_equiv_regs(), which IRA inherited from local-alloc.c.
Although you don't see the problem with GCC 4.3 and earlier, it is still there:
if you look at the 4.3 dumps, you will see that update_equiv_regs() undoes the
sched1 ordering there as well.  What saves us is that sched2 reschedules the
instructions back into the order we want.  With 4.4, IRA happens to reuse the
same register for both pseudos, so sched2's hands are tied and it cannot
schedule them back again for us.
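
Why a shared register ties the scheduler's hands can be illustrated with a
minimal sketch.  This is a toy dependence check, not GCC's scheduler: the
types and function names below are hypothetical, only the two instruction
forms from the example are modelled, and memory dependences are ignored.

```c
#include <stdbool.h>

/* Hypothetical toy model of the two instruction forms in the example:
   "reg = imm" (LOAD_IMM) and "[mem] = reg" (STORE).  */
enum insn_kind { LOAD_IMM, STORE };

struct insn {
  enum insn_kind kind;
  int reg;   /* register written (LOAD_IMM) or read (STORE) */
};

/* Register written by INSN, or -1 if it writes none.  */
int reg_def (const struct insn *insn)
{
  return insn->kind == LOAD_IMM ? insn->reg : -1;
}

/* Register read by INSN, or -1 if it reads none.  */
int reg_use (const struct insn *insn)
{
  return insn->kind == STORE ? insn->reg : -1;
}

/* Return true if LATER must stay after EARLIER because of a register
   dependence: true (read after write), anti (write after read) or
   output (write after write).  */
bool reg_dependent_p (const struct insn *earlier, const struct insn *later)
{
  int d1 = reg_def (earlier), u1 = reg_use (earlier);
  int d2 = reg_def (later), u2 = reg_use (later);
  bool raw = d1 >= 0 && d1 == u2;
  bool war = u1 >= 0 && u1 == d2;
  bool waw = d1 >= 0 && d1 == d2;
  return raw || war || waw;
}
```

In sequence B, "[x] = r0" followed by "r0 = 4" is a write after read on r0,
so in this model the second load cannot be hoisted above the first store.
If the second value lives in r1 instead, the check reports no dependence
and the load is free to move.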

Looking at update_equiv_regs(), if I disable the replacement for regs
that are local to one basic block (patch below) like it existed before
John Wehle's patch way back in Oct 2000:

  http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html

then we get the ordering we want.  Does anyone know why John removed
that part of the test in his patch?  Thoughts anyone?


Peter


Index: ira.c
===================================================================
--- ira.c	(revision 151111)
+++ ira.c	(working copy)
@@ -2510,6 +2510,7 @@ update_equiv_regs (void)
 		     calls.  */
 
 		  if (REG_N_REFS (regno) == 2
+		      && REG_BASIC_BLOCK (regno) < NUM_FIXED_BLOCKS
 		      && (rtx_equal_p (x, src)
 			  || ! equiv_init_varies_p (src))
 		      && NONJUMP_INSN_P (insn)




* Re: IRA undoing scheduling decisions
  2009-08-27  0:12 ` Peter Bergner
@ 2009-08-27  0:58   ` Richard Guenther
  2009-08-27  2:22     ` Peter Bergner
  2009-08-27 13:16   ` Alex Turjan
  2009-09-01 14:38   ` Vladimir Makarov
  2 siblings, 1 reply; 17+ messages in thread
From: Richard Guenther @ 2009-08-27  0:58 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Charles J. Tabony, gcc

On Wed, Aug 26, 2009 at 10:47 PM, Peter Bergner<bergner@vnet.ibm.com> wrote:
> On Mon, 2009-08-24 at 23:56 +0000, Charles J. Tabony wrote:
>> I am seeing a performance regression on the port I maintain, and I would appreciate some pointers.
>>
>> When I compile the following code
>>
>> void f(int *x, int *y){
>>   *x = 7;
>>   *y = 4;
>> }
>>
>> with GCC 4.3.2, I get the desired sequence of instructions.  I'll call it sequence A:
>>
>> r0 = 7
>> r1 = 4
>> [x] = r0
>> [y] = r1
>>
>> When I compile the same code with GCC 4.4.0, I get a sequence that is lower performance for my target machine.  I'll call it sequence B:
>>
>> r0 = 7
>> [x] = r0
>> r0 = 4
>> [y] = r0
>
> This is caused by update_equiv_regs() which IRA inherited from local-alloc.c.
> Although with gcc 4.3 and earlier, you don't see the problem, it is still there,
> because if you look at the 4.3 dumps, you will see that update_equiv_regs()
> unordered them for us.  What is saving us is that sched2 reschedules them
> again for us in the order we want.  With 4.4, IRA happens to reuse the same
> register for both pseudos, so sched2 is hand tied and cannot schedule them
> back again for us.
>
> Looking at update_equiv_regs(), if I disable the replacement for regs
> that are local to one basic block (patch below) like it existed before
> John Wehle's patch way back in Oct 2000:
>
>  http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
>
> then we get the ordering we want.  Does anyone know why John removed
> that part of the test in his patch?  Thoughts anyone?

Hmm.  I suppose if you conditionalize it on flag_schedule_insns it might be
an overall win.  Care to SPEC test that change?

Thanks,
Richard.



* Re: IRA undoing scheduling decisions
  2009-08-27  0:58   ` Richard Guenther
@ 2009-08-27  2:22     ` Peter Bergner
  2009-09-01 20:34       ` Peter Bergner
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Bergner @ 2009-08-27  2:22 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Charles J. Tabony, gcc

On Wed, 2009-08-26 at 23:30 +0200, Richard Guenther wrote:
> On Wed, Aug 26, 2009 at 10:47 PM, Peter Bergner<bergner@vnet.ibm.com> wrote:
> > Looking at update_equiv_regs(), if I disable the replacement for regs
> > that are local to one basic block (patch below) like it existed before
> > John Wehle's patch way back in Oct 2000:
> >
> >  http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
> >
> > then we get the ordering we want.  Does anyone know why John removed
> > that part of the test in his patch?  Thoughts anyone?
> 
> Hmm.  I suppose if you conditionalize it on flag_schedule_insns it might be
> an overall win.  Care to SPEC test that change?

I assume you mean like the change below?  Yeah, I can SPEC test that.

Peter


Index: ira.c
===================================================================
--- ira.c	(revision 151111)
+++ ira.c	(working copy)
@@ -2510,6 +2510,8 @@ update_equiv_regs (void)
 		     calls.  */
 
 		  if (REG_N_REFS (regno) == 2
+		      && (!flag_schedule_insns
+			  || REG_BASIC_BLOCK (regno) < NUM_FIXED_BLOCKS)
 		      && (rtx_equal_p (x, src)
 			  || ! equiv_init_varies_p (src))
 		      && NONJUMP_INSN_P (insn)
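
The added guard can be read in isolation with a small sketch.  It assumes,
as the description "regs that are local to one basic block" implies, that
a block value below the fixed-block count means the pseudo is not confined
to a single block.  The function name and the constant's value of 2 are
stand-ins chosen for illustration, not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for NUM_FIXED_BLOCKS; the value 2 is an
   assumption made for this sketch.  */
#define TOY_NUM_FIXED_BLOCKS 2

/* Toy version of the extra condition in the patch.  SCHEDULE_INSNS
   plays the role of flag_schedule_insns and REG_BASIC_BLOCK the role
   of REG_BASIC_BLOCK (regno).  The replacement stays enabled when
   scheduling is off, or when the pseudo is not local to one block.  */
bool replacement_allowed_p (bool schedule_insns, int reg_basic_block)
{
  return !schedule_insns || reg_basic_block < TOY_NUM_FIXED_BLOCKS;
}
```

With scheduling enabled, a pseudo recorded as living in block 5 would no
longer be replaced, which is the case that matters for f() above; with
scheduling disabled the old behaviour is kept.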



* Re: IRA undoing scheduling decisions
  2009-08-27  0:12 ` Peter Bergner
  2009-08-27  0:58   ` Richard Guenther
@ 2009-08-27 13:16   ` Alex Turjan
  2009-08-29  3:47     ` Jeff Law
  2009-09-01 14:38   ` Vladimir Makarov
  2 siblings, 1 reply; 17+ messages in thread
From: Alex Turjan @ 2009-08-27 13:16 UTC (permalink / raw)
  To: Charles J. Tabony, Peter Bergner; +Cc: gcc

> With 4.4, IRA happens to reuse the same register for both pseudos, so 
> sched2 is hand tied and cannot schedule them back again for us.

I can imagine other programs for which preserving the 4.3 allocation would degrade performance due to spilling.

The register allocator tries to minimize the number of spills without taking the ILP implications into account (i.e., it may create extra false dependencies). Perhaps one way to solve the problem would be to analyze why the register rename phase (which is responsible for spreading the registers) does not produce two registers.
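
What "spreading the registers" would mean for sequence B can be sketched
as follows.  This is a toy renamer under strong assumptions, not how GCC's
register rename pass is implemented: all names are hypothetical, every
register is assumed free on entry to the block and dead at its end,
address registers are not modelled, and a register is never released once
it has been used.

```c
#include <stdbool.h>

#define NUM_REGS 4   /* hypothetical register file size */

/* Hypothetical toy instructions: "reg = imm" and "[mem] = reg".  */
enum insn_kind { LOAD_IMM, STORE };

struct insn {
  enum insn_kind kind;
  int reg;   /* register written (LOAD_IMM) or read (STORE) */
};

/* Rename registers within one block of N instructions.  When a LOAD_IMM
   redefines a register that already holds an earlier value, give the
   new value a free register if one exists and rewrite the uses that
   follow.  Every insns[i].reg must be in [0, NUM_REGS).  Returns the
   number of definitions that were renamed.  */
int rename_block (struct insn *insns, int n)
{
  bool busy[NUM_REGS] = { false };
  int map[NUM_REGS];
  int renamed = 0;

  for (int r = 0; r < NUM_REGS; r++)
    map[r] = r;

  for (int i = 0; i < n; i++)
    {
      int r = insns[i].reg;
      if (insns[i].kind == LOAD_IMM)
        {
          map[r] = r;
          if (busy[r])
            for (int c = 0; c < NUM_REGS; c++)
              if (!busy[c])
                {
                  map[r] = c;   /* move the new value elsewhere */
                  renamed++;
                  break;
                }
          busy[map[r]] = true;
        }
      /* Definitions and uses both go through the current mapping.  */
      insns[i].reg = map[r];
    }
  return renamed;
}
```

Applied to sequence B, the second "r0 = 4" and its store are rewritten to
use r1, which removes the false dependence and would let a later
scheduling pass restore sequence A.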






* Re: IRA undoing scheduling decisions
  2009-08-27 13:16   ` Alex Turjan
@ 2009-08-29  3:47     ` Jeff Law
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff Law @ 2009-08-29  3:47 UTC (permalink / raw)
  To: gcc

On 08/27/09 04:04, Alex Turjan wrote:
>> With 4.4, IRA happens to reuse the same register for both pseudos, so
>> sched2 is hand tied and cannot schedule them back again for us.
>>      
> I can imagine compiling other programs for which preserving the 4.3 allocation will induce performance degradation due to spilling.
>
> The register allocator tries to minimize the number of spills without taking into account the ILP implications (i.e., creating extra false dependencies). Perhaps one possible way to solve the problem would be to analyze why the register rename phase (which is responsible for spreading the registers) does not produces 2 registers.
>    
ISTM the right thing is to go with the more compact allocation, then for 
the register renamer to expand the set of used registers to avoid the 
false dependencies.

jeff


* Re: IRA undoing scheduling decisions
  2009-08-27  0:12 ` Peter Bergner
  2009-08-27  0:58   ` Richard Guenther
  2009-08-27 13:16   ` Alex Turjan
@ 2009-09-01 14:38   ` Vladimir Makarov
  2009-09-01 20:41     ` Peter Bergner
  2 siblings, 1 reply; 17+ messages in thread
From: Vladimir Makarov @ 2009-09-01 14:38 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Charles J. Tabony, gcc

Peter Bergner wrote:
> On Mon, 2009-08-24 at 23:56 +0000, Charles J. Tabony wrote:
>   
>> I am seeing a performance regression on the port I maintain, and I would appreciate some pointers.
>>
>> When I compile the following code
>>
>> void f(int *x, int *y){
>>   *x = 7;
>>   *y = 4;
>> }
>>
>> with GCC 4.3.2, I get the desired sequence of instructions.  I'll call it sequence A:
>>
>> r0 = 7
>> r1 = 4
>> [x] = r0
>> [y] = r1
>>
>> When I compile the same code with GCC 4.4.0, I get a sequence that is lower performance for my target machine.  I'll call it sequence B:
>>
>> r0 = 7
>> [x] = r0
>> r0 = 4
>> [y] = r0
>>     
>
> This is caused by update_equiv_regs() which IRA inherited from local-alloc.c.
> Although with gcc 4.3 and earlier, you don't see the problem, it is still there,
> because if you look at the 4.3 dumps, you will see that update_equiv_regs()
> unordered them for us.  What is saving us is that sched2 reschedules them
> again for us in the order we want.  With 4.4, IRA happens to reuse the same
> register for both pseudos, so sched2 is hand tied and cannot schedule them
> back again for us.
>
>   
Peter, thanks for the investigation.

We could do update_equiv_regs in a separate pass before the 1st insn 
scheduling as it was before IRA.

I'll try this and see how it works for mainstream targets (x86, ppc).

> Looking at update_equiv_regs(), if I disable the replacement for regs
> that are local to one basic block (patch below) like it existed before
> John Wehle's patch way back in Oct 2000:
>
>   http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
>
> then we get the ordering we want.  Does anyone know why John removed
> that part of the test in his patch?  Thoughts anyone?
>
>   
I have no idea.  But if it works well, we could use it.



* Re: IRA undoing scheduling decisions
  2009-08-27  2:22     ` Peter Bergner
@ 2009-09-01 20:34       ` Peter Bergner
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Bergner @ 2009-09-01 20:34 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Charles J. Tabony, gcc, Pat Haugen, Vladimir Makarov

On Wed, 2009-08-26 at 17:12 -0500, Peter Bergner wrote:
> On Wed, 2009-08-26 at 23:30 +0200, Richard Guenther wrote:
> > Hmm.  I suppose if you conditionalize it on flag_schedule_insns it might be
> > an overall win.  Care to SPEC test that change?
> 
> I assume you mean like the change below?  Yeah, I can SPEC test that.
> 
> Peter
> 
> 
> Index: ira.c
> ===================================================================
> --- ira.c	(revision 151111)
> +++ ira.c	(working copy)
> @@ -2510,6 +2510,8 @@ update_equiv_regs (void)
>  		     calls.  */
> 
>  		  if (REG_N_REFS (regno) == 2
> +		      && (!flag_schedule_insns
> +			  || REG_BASIC_BLOCK (regno) < NUM_FIXED_BLOCKS)
>  		      && (rtx_equal_p (x, src)
>  			  || ! equiv_init_varies_p (src))
>  		      && NONJUMP_INSN_P (insn)

Pat ran the patch on SPEC2000 and it was very neutral.  The overall
SPECFP number didn't change and the SPECINT number only improved by
0.2%, which is pretty much in the noise.

I think Vlad's suggestion of moving update_equiv_regs() to its own pass
before sched1 sounds interesting.  If that works, it's probably better
than this patch.

Peter




* Re: IRA undoing scheduling decisions
  2009-09-01 14:38   ` Vladimir Makarov
@ 2009-09-01 20:41     ` Peter Bergner
  2009-09-01 20:45       ` Vladimir Makarov
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Bergner @ 2009-09-01 20:41 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Charles J. Tabony, gcc, Pat Haugen

On Tue, 2009-09-01 at 10:38 -0400, Vladimir Makarov wrote:
> We could do update_equiv_regs in a separate pass before the 1st insn 
> scheduling as it was before IRA.

IIRC, update_equiv_regs() was always called as part of local-alloc,
so it was always after sched1 even before IRA.  That said, moving it
to its own pass before sched1 sounds like an interesting idea.
My patch from the other note basically didn't affect SPEC2000 at all,
and we could use it, but if your idea works, I'm more than happy to
dump my patch. :)

Were you going to whip that patch up or did you want me to?

Peter




* Re: IRA undoing scheduling decisions
  2009-09-01 20:41     ` Peter Bergner
@ 2009-09-01 20:45       ` Vladimir Makarov
  2009-09-01 20:59         ` Peter Bergner
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Makarov @ 2009-09-01 20:45 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Charles J. Tabony, gcc, Pat Haugen

Peter Bergner wrote:
> On Tue, 2009-09-01 at 10:38 -0400, Vladimir Makarov wrote:
>   
>> We could do update_equiv_regs in a separate pass before the 1st insn 
>> scheduling as it was before IRA.
>>     
>
> IIRC, update_equiv_regs() was always called as part of local-alloc,
> so it was always after sched1 even before IRA.  That said, moving it
> to its own pass before sched1 sounds like an interesting idea.
> My patch from the other note basically didn't affect SPEC2000 at all,
> and we could use it, but if your idea works, I'm more than happy to
> dump my patch. :)
>
> Were you going to whip that patch up or did you want me to?
>
>
>   
I am going to do it myself.  Thanks for testing your patch, Peter.





* Re: IRA undoing scheduling decisions
  2009-09-01 20:45       ` Vladimir Makarov
@ 2009-09-01 20:59         ` Peter Bergner
  2009-09-02 15:49           ` Vladimir Makarov
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Bergner @ 2009-09-01 20:59 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Charles J. Tabony, gcc, Pat Haugen

On Tue, 2009-09-01 at 16:46 -0400, Vladimir Makarov wrote:
> Peter Bergner wrote:
> > Were you going to whip that patch up or did you want me to?
> >
> I am going to do it by myself.

Great!  I'd like to see how your patch affects POWER6 performance.
Do you have access to a POWER6 box?  If not, can you send Pat and me
the patch and we'll fire off a run on our POWER6 benchmark system.
Thanks.

Peter




* Re: IRA undoing scheduling decisions
  2009-09-01 20:59         ` Peter Bergner
@ 2009-09-02 15:49           ` Vladimir Makarov
  2009-09-02 17:41             ` Peter Bergner
  0 siblings, 1 reply; 17+ messages in thread
From: Vladimir Makarov @ 2009-09-02 15:49 UTC (permalink / raw)
  To: Peter Bergner; +Cc: Charles J. Tabony, gcc, Pat Haugen

Peter Bergner wrote:
> On Tue, 2009-09-01 at 16:46 -0400, Vladimir Makarov wrote:
>   
>> Peter Bergner wrote:
>>     
>>> Were you going to whip that patch up or did you want me to?
>>>
>>>       
>> I am going to do it by myself.
>>     
>
> Great!  I'd like to see how your patch affects POWER6 performance.
> Do you have access to a POWER6 box?
Yes, I do.

>   If not, can you send Pat and I
> the patch and we'll fire off a run on our POWER6 benchmark system.
>
>   

I've got the results.  SPECFP2000 is 2% better with a separate 
update_equiv_regs pass, but discounting art's volatility (I already wrote 
that art is very volatile on POWER6: its worst and best scores can 
differ by 10-15%), it is about 0.6% better.

As for SPECINT2000, crafty failed with the separate pass.  I need some 
time to investigate this.  Without crafty, the SPECINT2000 improvement is 
about 0.4%.

So it is probably worth doing update_equiv_regs as a separate pass.  
I'll submit a patch next week (sorry, I am a bit busy this week).


* Re: IRA undoing scheduling decisions
  2009-09-02 15:49           ` Vladimir Makarov
@ 2009-09-02 17:41             ` Peter Bergner
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Bergner @ 2009-09-02 17:41 UTC (permalink / raw)
  To: Vladimir Makarov; +Cc: Charles J. Tabony, gcc, Pat Haugen

On Wed, 2009-09-02 at 11:49 -0400, Vladimir Makarov wrote:
> So probably, it is worth to do update_equiv_reg as a separate pass.

Agreed.


> I'll submit a patch on next week (sorry, I am a bit busy this week).

Sounds good.  Thanks for taking care of this!

Peter


