public inbox for gcc@gcc.gnu.org
* Re: x86 inc/dec on core2
@ 2007-04-07  9:30 Uros Bizjak
  2007-04-07 14:10 ` H. J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Uros Bizjak @ 2007-04-07  9:30 UTC (permalink / raw)
  To: GCC; +Cc: H. J. Lu, Mike Stump

Hello!

> > I was wondering, if:
> > 
> >   /* X86_TUNE_USE_INCDEC */
> >   ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),
> > 
> > is correct.  Should it be:
> > 
> >   /* X86_TUNE_USE_INCDEC */
> >   ~(m_PENT4 | m_NOCONA | m_GENERIC),
> > 
> > ?
>
> inc/dec has the same performance as add/sub on Core 2 Duo. But
> inc/dec is shorter.
>   

What about partial flag register dependency of inc/dec?

Uros.


* Re: x86 inc/dec on core2
  2007-04-07  9:30 x86 inc/dec on core2 Uros Bizjak
@ 2007-04-07 14:10 ` H. J. Lu
  2007-04-08 10:13   ` Uros Bizjak
  0 siblings, 1 reply; 16+ messages in thread
From: H. J. Lu @ 2007-04-07 14:10 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC, Mike Stump

On Sat, Apr 07, 2007 at 11:29:46AM +0200, Uros Bizjak wrote:
> Hello!
> 
> >> I was wondering, if:
> >> 
> >>   /* X86_TUNE_USE_INCDEC */
> >>   ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),
> >> 
> >> is correct.  Should it be:
> >> 
> >>   /* X86_TUNE_USE_INCDEC */
> >>   ~(m_PENT4 | m_NOCONA | m_GENERIC),
> >> 
> >> ?
> >
> >inc/dec has the same performance as add/sub on Core 2 Duo. But
> >inc/dec is shorter.
> >  
> 
> What about partial flag register dependency of inc/dec?

There is no partial flag register dependency on inc/dec.


H.J.


* Re: x86 inc/dec on core2
  2007-04-07 14:10 ` H. J. Lu
@ 2007-04-08 10:13   ` Uros Bizjak
  2007-04-08 10:21     ` Robert Dewar
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Uros Bizjak @ 2007-04-08 10:13 UTC (permalink / raw)
  To: H. J. Lu; +Cc: GCC, Mike Stump

H. J. Lu wrote:

>>> inc/dec has the same performance as add/sub on Core 2 Duo. But
>>> inc/dec is shorter.
>>>  
>>>       
>> What about partial flag register dependency of inc/dec?
>>     
>
> There is no partial flag register dependency on inc/dec.
>   

My docs say that "INC/DEC does not change the carry flag". But you have 
better resources than I, so if you think that C2D should be left out of 
X86_TUNE_USE_INCDEC, then the patch is pre-approved for mainline.

Thanks,
Uros.




* Re: x86 inc/dec on core2
  2007-04-08 10:13   ` Uros Bizjak
@ 2007-04-08 10:21     ` Robert Dewar
  2007-04-08 13:34     ` H. J. Lu
  2007-04-08 16:05     ` Mike Stump
  2 siblings, 0 replies; 16+ messages in thread
From: Robert Dewar @ 2007-04-08 10:21 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: H. J. Lu, GCC, Mike Stump

Uros Bizjak wrote:

> My docs say that "INC/DEC does not change the carry flag". But you have 
> better resources than I, so if you think that C2D should be left out of 
> X86_TUNE_USE_INCDEC, then the patch is pre-approved for mainline.

Absolutely, INC/DEC do not change the carry flag. This is an important
part of the architecture; how else would you code ADC loops?
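
To make the ADC-loop point concrete, here is a minimal sketch in portable C rather than real assembly. It models only the carry flag; the names `flags_t`, `model_adc`, `model_dec`, `model_sub1` and `multiword_add` are hypothetical helpers invented for this illustration, not anything from GCC or from the thread. The loop adds two multi-word integers one 32-bit word at a time, and the only difference between the two variants is whether the counter update preserves the carry (as DEC does) or overwrites it (as SUB does).

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: the only flag tracked is the carry flag (CF). */
typedef struct { unsigned cf; } flags_t;

/* ADC: result = a + b + CF; CF is overwritten with the carry out. */
static uint32_t model_adc(flags_t *f, uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a + b + f->cf;
    f->cf = (unsigned)(wide >> 32);
    return (uint32_t)wide;
}

/* DEC: decrements the counter and leaves CF untouched. */
static size_t model_dec(flags_t *f, size_t n)
{
    (void)f;
    return n - 1;
}

/* SUB n, 1: decrements the counter and overwrites CF with the borrow. */
static size_t model_sub1(flags_t *f, size_t n)
{
    f->cf = (n < 1);
    return n - 1;
}

/* Multi-word addition, least significant word first.
   use_dec selects which instruction model updates the loop counter. */
static void multiword_add(uint32_t *dst, const uint32_t *a,
                          const uint32_t *b, size_t words, int use_dec)
{
    flags_t f = { 0 };
    size_t remaining = words;
    size_t i = 0;

    while (remaining != 0) {
        dst[i] = model_adc(&f, a[i], b[i]);
        i++; /* the index update is assumed not to touch the flags */
        remaining = use_dec ? model_dec(&f, remaining)
                            : model_sub1(&f, remaining);
    }
}
```

Adding the two-word values {0xFFFFFFFF, 0} and {1, 0} gives {0, 1} when the counter is updated with the DEC model. With the SUB model, the carry out of the low word is lost before the next ADC and the result is {0, 0}.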


* Re: x86 inc/dec on core2
  2007-04-08 10:13   ` Uros Bizjak
  2007-04-08 10:21     ` Robert Dewar
@ 2007-04-08 13:34     ` H. J. Lu
  2007-04-08 16:05     ` Mike Stump
  2 siblings, 0 replies; 16+ messages in thread
From: H. J. Lu @ 2007-04-08 13:34 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC, Mike Stump

On Sun, Apr 08, 2007 at 11:37:43AM +0200, Uros Bizjak wrote:
> H. J. Lu wrote:
> 
> >>>inc/dec has the same performance as add/sub on Core 2 Duo. But
> >>>inc/dec is shorter.
> >>> 
> >>>      
> >>What about partial flag register dependency of inc/dec?
> >>    
> >
> >There is no partial flag register dependency on inc/dec.
> >  
> 
> My docs say that "INC/DEC does not change the carry flag". But you have 
> better resources than I, so if you think that C2D should be left out of 
> X86_TUNE_USE_INCDEC, then the patch is pre-approved for mainline.

Partial flag register stall only applies to instructions like
shift since they may shift by 0 in which case they won't change
flag register.


H.J.


* Re: x86 inc/dec on core2
  2007-04-08 10:13   ` Uros Bizjak
  2007-04-08 10:21     ` Robert Dewar
  2007-04-08 13:34     ` H. J. Lu
@ 2007-04-08 16:05     ` Mike Stump
  2007-04-08 16:14       ` Robert Dewar
                         ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Mike Stump @ 2007-04-08 16:05 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: H. J. Lu, GCC

On Apr 8, 2007, at 2:37 AM, Uros Bizjak wrote:
> My docs say that "INC/DEC does not change the carry flag".

Personally, I'm having a hard time envisioning how the semantics of  
the instruction are relevant at all.  This is all about instruction  
tuning, so semantics cannot matter; otherwise, it would be wrong to  
make this a tune choice.

> But you have better resources than I, so if you think that C2D  
> should be left out of X86_TUNE_USE_INCDEC, then the patch is
> pre-approved for mainline.

I'm confused again, it isn't that it should be left out, it is that  
it should be included.  My patch adds inc/dec selection for C2D.  I'd  
also like it for generic on darwin, as that makes more sense for us.   
How does the rest of the community feel about inc/dec selection for  
generic?


* Re: x86 inc/dec on core2
  2007-04-08 16:05     ` Mike Stump
@ 2007-04-08 16:14       ` Robert Dewar
  2007-04-08 16:56       ` Uros Bizjak
  2007-04-09  3:52       ` Zuxy Meng
  2 siblings, 0 replies; 16+ messages in thread
From: Robert Dewar @ 2007-04-08 16:14 UTC (permalink / raw)
  To: Mike Stump; +Cc: Uros Bizjak, H. J. Lu, GCC

Mike Stump wrote:
> On Apr 8, 2007, at 2:37 AM, Uros Bizjak wrote:
>> My docs say that "INC/DEC does not change the carry flag".
> 
> Personally, I'm having a hard time envisioning how the semantics of  
> the instruction are relevant at all.  This is all about instruction  
> tuning, so semantics cannot matter; otherwise, it would be wrong to  
> make this a tune choice.

Well for sure INC and ADD have different semantics. INC does not
affect the carry flag, and ADD does, so there are definitely
situations where you want one and the other won't work!


* Re: x86 inc/dec on core2
  2007-04-08 16:05     ` Mike Stump
  2007-04-08 16:14       ` Robert Dewar
@ 2007-04-08 16:56       ` Uros Bizjak
  2007-04-09  3:52       ` Zuxy Meng
  2 siblings, 0 replies; 16+ messages in thread
From: Uros Bizjak @ 2007-04-08 16:56 UTC (permalink / raw)
  To: Mike Stump; +Cc: H. J. Lu, GCC

Mike Stump wrote:

>> But you have better resources than I, so if you think that C2D should 
>> be left out of X86_TUNE_USE_INCDEC, then the patch is pre-approved 
>> for mainline.
>
> I'm confused again, it isn't that it should be left out, it is that it 
> should be included.  My patch adds inc/dec selection for C2D.  I'd 
> also like it for generic on darwin, as that makes more sense for us.  
> How does the rest of the community feel about inc/dec selection for 
> generic?
Just to clear up the mess: yes, C2D should be part of 
X86_TUNE_USE_INCDEC. Sorry for the confusion.

Uros.
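
Because the tuning entry is a complemented mask, "part of X86_TUNE_USE_INCDEC" means the processor's bit is absent from the list inside `~( ... )`. The sketch below shows that reading under stated assumptions: the `m_*` names come from the thread, but the bit positions, the `enum processor` values, `PROC_OTHER` and the helper `tune_enabled` are hypothetical and chosen only for illustration. This is not GCC's actual source.

```c
#include <stdbool.h>

/* Hypothetical processor indices; the real GCC numbering is not shown
   in the thread and is assumed here. */
enum processor {
    PROC_PENT4,
    PROC_NOCONA,
    PROC_CORE2,
    PROC_GENERIC,
    PROC_OTHER   /* stand-in for any processor not named in the mask */
};

#define m_PENT4   (1u << PROC_PENT4)
#define m_NOCONA  (1u << PROC_NOCONA)
#define m_CORE2   (1u << PROC_CORE2)
#define m_GENERIC (1u << PROC_GENERIC)

/* The value as checked in (r118973): inc/dec is disabled for Core 2. */
static const unsigned use_incdec_before =
    ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC);

/* The proposed value: removing m_CORE2 from the exclusion list
   enables inc/dec when tuning for Core 2. */
static const unsigned use_incdec_after =
    ~(m_PENT4 | m_NOCONA | m_GENERIC);

/* A tuning is active when the bit of the tuned-for processor is set. */
static bool tune_enabled(unsigned mask, enum processor tuned_for)
{
    return (mask & (1u << tuned_for)) != 0;
}
```

With these definitions, `tune_enabled(use_incdec_before, PROC_CORE2)` is false and `tune_enabled(use_incdec_after, PROC_CORE2)` is true, while the Pentium 4, Nocona and generic tunings stay disabled in both.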


* Re: x86 inc/dec on core2
  2007-04-08 16:05     ` Mike Stump
  2007-04-08 16:14       ` Robert Dewar
  2007-04-08 16:56       ` Uros Bizjak
@ 2007-04-09  3:52       ` Zuxy Meng
  2007-04-09 17:50         ` Mike Stump
  2007-04-09 19:13         ` Vladimir N. Makarov
  2 siblings, 2 replies; 16+ messages in thread
From: Zuxy Meng @ 2007-04-09  3:52 UTC (permalink / raw)
  To: gcc

"Mike Stump" <mrs@apple.com> 
wrote in message news:2FD44629-CBE1-42FA-88F6-799D9C09BD81@apple.com...
> On Apr 8, 2007, at 2:37 AM, Uros Bizjak wrote:
>> My docs say that "INC/DEC does not change the carry flag".
>
> Personally, I'm having a hard time envisioning how the semantics of  the 
> instruction are relevant at all.  This is all about instruction  tuning, 
> so semantics cannot matter; otherwise, it would be wrong to  make this a 
> tune choice.

Intel's optimization reference manual says that:

3.5.1.1 Use of the INC and DEC Instructions
The INC and DEC instructions modify only a subset of the bits in the flag 
register. This creates a dependence on all previous writes of the flag 
register. This is especially problematic when these instructions are on the 
critical path because they are used to change an address for a load on which 
many other instructions depend.

Assembly/Compiler Coding Rule 32. (M impact, H generality)
INC and DEC instructions should be replaced with ADD or SUB instructions, 
because ADD and SUB overwrite all flags, whereas INC and DEC do not, 
therefore creating false dependencies on earlier instructions that set the 
flags.

>
>> But you have better resources than I, so if you think that C2D  should be 
>> left out of X86_TUNE_USE_INCDEC, then the patch is pre-approved for 
>> mainline.
>
> I'm confused again, it isn't that it should be left out, it is that  it 
> should be included.  My patch adds inc/dec selection for C2D.  I'd  also 
> like it for generic on darwin, as that makes more sense for us.   How does 
> the rest of the community feel about inc/dec selection for  generic?

-- 
Zuxy




* Re: x86 inc/dec on core2
  2007-04-09  3:52       ` Zuxy Meng
@ 2007-04-09 17:50         ` Mike Stump
  2007-04-09 18:13           ` H. J. Lu
  2007-04-09 19:13         ` Vladimir N. Makarov
  1 sibling, 1 reply; 16+ messages in thread
From: Mike Stump @ 2007-04-09 17:50 UTC (permalink / raw)
  To: Zuxy Meng; +Cc: gcc

On Apr 8, 2007, at 8:51 PM, Zuxy Meng wrote:
> Intel's optimization reference manual says that:

I wasn't going off the documentation...  I'd be more interested in  
either benchmarks or in recommendations by Intel people that know the  
details of the core2 and the performance impact of those details.


* Re: x86 inc/dec on core2
  2007-04-09 17:50         ` Mike Stump
@ 2007-04-09 18:13           ` H. J. Lu
  2007-04-09 21:58             ` H. J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: H. J. Lu @ 2007-04-09 18:13 UTC (permalink / raw)
  To: Mike Stump; +Cc: Zuxy Meng, gcc

On Mon, Apr 09, 2007 at 10:51:22AM -0700, Mike Stump wrote:
> On Apr 8, 2007, at 8:51 PM, Zuxy Meng wrote:
> >Intel's optimization reference manual says that:
> 
> I wasn't going off the documentation...  I'd be more interested in  
> either benchmarks or in recommendations by Intel people that know the  
> details of the core2 and the performance impact of those details.

I am double checking it now.

Intel's optimization guide covers both P4 and C2D. You shouldn't
use inc/dec when you are compiling code to run on both P4 and C2D.
But optimizing for C2D only is different.


H.J.


* Re: x86 inc/dec on core2
  2007-04-09  3:52       ` Zuxy Meng
  2007-04-09 17:50         ` Mike Stump
@ 2007-04-09 19:13         ` Vladimir N. Makarov
  1 sibling, 0 replies; 16+ messages in thread
From: Vladimir N. Makarov @ 2007-04-09 19:13 UTC (permalink / raw)
  To: Zuxy Meng; +Cc: gcc

Zuxy Meng wrote:

>"Mike Stump" <mrs@apple.com> 
>wrote in message news:2FD44629-CBE1-42FA-88F6-799D9C09BD81@apple.com...
>  
>
>>On Apr 8, 2007, at 2:37 AM, Uros Bizjak wrote:
>>    
>>
>>>My docs say that "INC/DEC does not change the carry flag".
>>>      
>>>
>>Personally, I'm having a hard time envisioning how the semantics of  the 
>>instruction are relevant at all.  This is all about instruction  tuning, 
>>so semantics cannot matter; otherwise, it would be wrong to  make this a 
>>tune choice.
>>    
>>
>
>Intel's optimization reference manual says that:
>
>3.5.1.1 Use of the INC and DEC Instructions
>The INC and DEC instructions modify only a subset of the bits in the flag 
>register. This creates a dependence on all previous writes of the flag 
>register. This is especially problematic when these instructions are on the 
>critical path because they are used to change an address for a load on which 
>many other instructions depend.
>
>Assembly/Compiler Coding Rule 32. (M impact, H generality)
>INC and DEC instructions should be replaced with ADD or SUB instructions, 
>because ADD and SUB overwrite all flags, whereas INC and DEC do not, 
>therefore creating false dependencies on earlier instructions that set the 
>flags.
>
>  
>
That is probably one part of the truth.  The other part is that using 
inc/dec results in smaller code, better code locality and faster code.  
Which matters more is hard to say.  But I've rechecked SPEC2000 for gcc 
(revision 122995) and see that using inc/dec gives code that is no 
worse (actually it is a bit better: 1939 without inc/dec vs 1944 with 
inc/dec for SPECInt2000, and 1724 vs 1727 for SPECFp2000).  Besides, 
the code using inc/dec is 0.39% smaller for SPECInt and 0.11% smaller 
for SPECFp.

Vlad
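
For scale, the score differences quoted above can be turned into relative gains. The sketch below is plain arithmetic on the numbers in this message; the helper name `percent_gain` is hypothetical, and the thread says nothing about run-to-run noise, so the result is only a rough size of the effect.

```c
/* Relative gain, in percent, of the score with inc/dec over the
   score without it. */
static double percent_gain(double without_incdec, double with_incdec)
{
    return (with_incdec - without_incdec) / without_incdec * 100.0;
}
```

`percent_gain(1939.0, 1944.0)` is about 0.26% for SPECInt2000 and `percent_gain(1724.0, 1727.0)` is about 0.17% for SPECFp2000. Both gains are small and of the same order as the reported code-size reductions.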


* Re: x86 inc/dec on core2
  2007-04-09 18:13           ` H. J. Lu
@ 2007-04-09 21:58             ` H. J. Lu
  0 siblings, 0 replies; 16+ messages in thread
From: H. J. Lu @ 2007-04-09 21:58 UTC (permalink / raw)
  To: Mike Stump; +Cc: Zuxy Meng, gcc

On Mon, Apr 09, 2007 at 11:13:17AM -0700, H. J. Lu wrote:
> On Mon, Apr 09, 2007 at 10:51:22AM -0700, Mike Stump wrote:
> > On Apr 8, 2007, at 8:51 PM, Zuxy Meng wrote:
> > >Intel's optimization reference manual says that:
> > 
> > I wasn't going off the documentation...  I'd be more interested in  
> > either benchmarks or in recommendations by Intel people that know the  
> > details of the core2 and the performance impact of those details.
> 
> I am double checking it now.

I have confirmed that inc/dec is good on Core 2 Duo.


H.J.


* Re: x86 inc/dec on core2
  2007-04-06 20:27 Mike Stump
  2007-04-06 21:17 ` H. J. Lu
@ 2007-04-08 21:37 ` Vladimir N. Makarov
  1 sibling, 0 replies; 16+ messages in thread
From: Vladimir N. Makarov @ 2007-04-08 21:37 UTC (permalink / raw)
  To: Mike Stump; +Cc: GCC Development

Mike Stump wrote:

> I was wondering, if:
>
>   /* X86_TUNE_USE_INCDEC */
>   ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),
>
> is correct.  Should it be:
>
>   /* X86_TUNE_USE_INCDEC */
>   ~(m_PENT4 | m_NOCONA | m_GENERIC),
>
> ?
>
> In the original patch in:
>
> 2006-11-18  Vladimir Makarov  <vmakarov@redhat.com>
>
>         * doc/invoke.texi (core2): Add item.
>
> it wasn't present, but in the checked in patch in r118973, it is:
>
> $ svn diff -r118972:118973 i386.c | grep incdec
> -const int x86_use_incdec = ~(m_PENT4 | m_NOCONA | m_GENERIC);
> +const int x86_use_incdec = ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC);
>
It is probably a typo.  I made several changes after the original patch 
submission because people asked me to benchmark several parameters.

Core2 really should generate INC/DEC because it results in smaller 
programs (as I remember, about 0.4% and 0.1% for SPECINT2000 and 
SPECFP2000 respectively), and I definitely remember slightly better 
code too.

> I looked around for a discussion on this, and didn't find it.
>
> Thanks?



* Re: x86 inc/dec on core2
  2007-04-06 20:27 Mike Stump
@ 2007-04-06 21:17 ` H. J. Lu
  2007-04-08 21:37 ` Vladimir N. Makarov
  1 sibling, 0 replies; 16+ messages in thread
From: H. J. Lu @ 2007-04-06 21:17 UTC (permalink / raw)
  To: Mike Stump; +Cc: GCC Development

On Fri, Apr 06, 2007 at 01:27:09PM -0700, Mike Stump wrote:
> I was wondering, if:
> 
>   /* X86_TUNE_USE_INCDEC */
>   ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),
> 
> is correct.  Should it be:
> 
>   /* X86_TUNE_USE_INCDEC */
>   ~(m_PENT4 | m_NOCONA | m_GENERIC),
> 
> ?

inc/dec has the same performance as add/sub on Core 2 Duo. But
inc/dec is shorter.


H.J.
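
As a rough illustration of "shorter", the table below lists the standard x86 encodings of the register forms, using `%eax` as an example register in AT&T syntax. The struct and helper names (`encoding_t`, `enc32`, `enc64`, `bytes_saved`) are hypothetical. Note the assumption about operating mode: the one-byte `0x40+r` / `0x48+r` forms exist only in 32-bit mode, because in 64-bit mode those bytes are REX prefixes, so the saving shrinks from two bytes to one.

```c
#include <stddef.h>

/* One instruction with its machine-code bytes. */
typedef struct {
    const char *mnemonic;
    unsigned char bytes[3];
    size_t len;
} encoding_t;

/* 32-bit mode: inc/dec have one-byte short forms. */
static const encoding_t enc32[] = {
    { "incl %eax",     { 0x40 },             1 },
    { "addl $1, %eax", { 0x83, 0xC0, 0x01 }, 3 },
    { "decl %eax",     { 0x48 },             1 },
    { "subl $1, %eax", { 0x83, 0xE8, 0x01 }, 3 },
};

/* 64-bit mode: inc/dec need the two-byte FF /0 and FF /1 forms. */
static const encoding_t enc64[] = {
    { "incl %eax",     { 0xFF, 0xC0 },       2 },
    { "addl $1, %eax", { 0x83, 0xC0, 0x01 }, 3 },
    { "decl %eax",     { 0xFF, 0xC8 },       2 },
    { "subl $1, %eax", { 0x83, 0xE8, 0x01 }, 3 },
};

/* Bytes saved by choosing the inc/dec entry over the add/sub entry. */
static size_t bytes_saved(const encoding_t *table,
                          size_t incdec_idx, size_t addsub_idx)
{
    return table[addsub_idx].len - table[incdec_idx].len;
}
```

Here `bytes_saved(enc32, 0, 1)` is 2 and `bytes_saved(enc64, 0, 1)` is 1, which is the per-instruction saving behind the whole-program size reductions reported elsewhere in the thread.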


* x86 inc/dec on core2
@ 2007-04-06 20:27 Mike Stump
  2007-04-06 21:17 ` H. J. Lu
  2007-04-08 21:37 ` Vladimir N. Makarov
  0 siblings, 2 replies; 16+ messages in thread
From: Mike Stump @ 2007-04-06 20:27 UTC (permalink / raw)
  To: GCC Development

I was wondering, if:

   /* X86_TUNE_USE_INCDEC */
   ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC),

is correct.  Should it be:

   /* X86_TUNE_USE_INCDEC */
   ~(m_PENT4 | m_NOCONA | m_GENERIC),

?

In the original patch in:

2006-11-18  Vladimir Makarov  <vmakarov@redhat.com>

         * doc/invoke.texi (core2): Add item.

it wasn't present, but in the checked in patch in r118973, it is:

$ svn diff -r118972:118973 i386.c | grep incdec
-const int x86_use_incdec = ~(m_PENT4 | m_NOCONA | m_GENERIC);
+const int x86_use_incdec = ~(m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC);

I looked around for a discussion on this, and didn't find it.

Thanks?

