public inbox for gcc-help@gcc.gnu.org
* __sync_fetch
@ 2012-11-17  6:34 Hei Chan
  2012-11-18  7:04 ` __sync_fetch Hei Chan
  2012-11-18  8:04 ` __sync_fetch Ian Lance Taylor
  0 siblings, 2 replies; 12+ messages in thread
From: Hei Chan @ 2012-11-17  6:34 UTC (permalink / raw)
  To: gcc-help



Hi,

I am using GCC 4.1.2, and so no __atomic*().

I am looking at http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

I see __sync_fetch_and_*(), but I don't see __sync_fetch().  Is it because the built-in routines support integral scalar or pointer type that is up to 8 bytes in length, and so the read is automatically atomic anyway?

Thanks in advance.


Cheers,
Hei

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-17  6:34 __sync_fetch Hei Chan
@ 2012-11-18  7:04 ` Hei Chan
  2012-11-18  8:07   ` __sync_fetch Ian Lance Taylor
  2012-11-18  8:04 ` __sync_fetch Ian Lance Taylor
  1 sibling, 1 reply; 12+ messages in thread
From: Hei Chan @ 2012-11-18  7:04 UTC (permalink / raw)
  To: Hei Chan, gcc-help

Hi,

After searching more for info, it seems that even though reading a long (i.e. 8 bytes) is one operation on a 64-bit machine, it might not give the "correct" value:
http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html

And so, we have to use __sync_fetch_and_add(&x, 0) to read?

Could someone describe a situation in which reading a long variable won't get the correct value, given that all writes in the application use __sync_fetch_*()?

Thanks in advance.


Cheers,
Hei


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-17  6:34 __sync_fetch Hei Chan
  2012-11-18  7:04 ` __sync_fetch Hei Chan
@ 2012-11-18  8:04 ` Ian Lance Taylor
  1 sibling, 0 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-18  8:04 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Fri, Nov 16, 2012 at 10:34 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> I am using GCC 4.1.2, and so no __atomic*().
>
> I am looking at http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html
>
> I see __sync_fetch_and_*(), but I don't see __sync_fetch().  Is it because the built-in routines support integral scalar or pointer type that is up to 8 bytes in length, and so the read is automatically atomic anyway?

The __sync primitives were designed by Intel.  I believe that they did
not include atomic load or store operators because on x86 processors
all aligned loads and stores are atomic.  Synchronization of loads and
stores with other processors on x86 requires the use of explicit
memory fence instructions.

Since GCC just picked up the Intel designed primitives, they work fine
on x86, but are deficient on other processors.
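
A minimal sketch of what that means in practice on x86, assuming only
the __sync builtins are available: a plain aligned load or store
supplies the atomicity, and __sync_synchronize() supplies the fence.
The helper names load_fenced and store_fenced are hypothetical, and the
volatile casts are only there to make the compiler emit a real access.
Nothing in the C standard promises this; it leans on the hardware
behaviour described above.

```c
/* Sketch only: relies on x86 making aligned loads and stores atomic.
   load_fenced and store_fenced are hypothetical helper names. */

/* Read *p with a single load, then issue a full memory barrier. */
static long load_fenced(long *p)
{
    long v = *(volatile long *)p;   /* volatile: force a real load */
    __sync_synchronize();           /* full memory barrier */
    return v;
}

/* Issue a full barrier, store v to *p, then issue another barrier. */
static void store_fenced(long *p, long v)
{
    __sync_synchronize();
    *(volatile long *)p = v;        /* volatile: force a real store */
    __sync_synchronize();
}
```

On a processor where aligned loads are not atomic, this sketch would
not be enough, which is the deficiency mentioned above.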

Ian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-18  7:04 ` __sync_fetch Hei Chan
@ 2012-11-18  8:07   ` Ian Lance Taylor
  2012-11-18  8:11     ` __sync_fetch Hei Chan
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-18  8:07 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Sat, Nov 17, 2012 at 11:04 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> After searching more for info, it seems like even though on a
>  64-bit machine, reading a long (i.e. 8 bytes) is one operation, it
> might not give the "correct" value:
> http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html
>
> And so, we have to use __sync_fetch_and_add(&x, 0) to read?
>
> Could
>  someone elaborate a situation that reading a long variable won't get
> the correct value given that all writes in the application use
> __sync_fetch_*()?

If you always use __sync_fetch_and_add(&x, 0) to read a value, and you
always use __sync_fetch_and_add to write the value also with some
appropriate delta, then all the accesses to that variable should be
atomic with sequential consistency.  That should be true on any
processor that implements __sync_fetch_and_add in the appropriate
size.
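
As a rough sketch of that pattern (the wrapper names atomic_read_long
and atomic_add_long are hypothetical, not part of GCC), every access
goes through the same builtin:

```c
/* Hypothetical wrappers around the GCC __sync builtin. */

/* Atomic read: adding 0 returns the current value without changing it. */
static long atomic_read_long(long *p)
{
    return __sync_fetch_and_add(p, 0);
}

/* Atomic update: adds delta and returns the value held before the add. */
static long atomic_add_long(long *p, long delta)
{
    return __sync_fetch_and_add(p, delta);
}
```

Note that the read is a full read-modify-write, so it costs more than a
plain load and needs write access to the memory.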

Ian

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-18  8:07   ` __sync_fetch Ian Lance Taylor
@ 2012-11-18  8:11     ` Hei Chan
  2012-11-18  8:18       ` __sync_fetch Ian Lance Taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Hei Chan @ 2012-11-18  8:11 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help

Hi Ian,

Thanks for your reply.

How about on a 64-bit Intel processor, where I use __sync_fetch_and_*() to write to a long variable but never use any __sync_*() to read?  Under what circumstances would I read something invalid?

Thanks in advance.


Cheers,
Hei



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-18  8:11     ` __sync_fetch Hei Chan
@ 2012-11-18  8:18       ` Ian Lance Taylor
  2012-11-18 19:31         ` __sync_fetch Hei Chan
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-18  8:18 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 12:10 AM, Hei Chan <structurechart@yahoo.com> wrote:
>
> How about on a 64-bit Intel processor, I use __sync_fetch_and_*() to write to a long variable, but never use any __sync_*() to read?  Under what situation that I will read something invalid?

On a 64-bit Intel processor, if the 64-bit value is at an aligned
address, then to the best of my knowledge that will always be fine.  If
the 64-bit value is misaligned and crosses a cache line, then if you
are unlucky I believe that a write can occur in between reading the
two different cache lines, causing you to read a value that was never
written.
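
To make the failure case concrete, here is a small sketch that tests
whether an object at a given address is aligned and whether it straddles
a cache line.  The 64-byte line size is an assumption (common on x86-64
but not guaranteed), and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, common on x86-64 but not guaranteed. */
#define CACHE_LINE_SIZE 64u

/* Nonzero if addr is a multiple of alignment (alignment must be nonzero). */
static int addr_is_aligned(uintptr_t addr, size_t alignment)
{
    return addr % alignment == 0;
}

/* Nonzero if the size bytes starting at addr span more than one cache
   line (size must be at least 1). */
static int addr_crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For a real variable x the call would look like
addr_crosses_cache_line((uintptr_t)&x, sizeof x).  An 8-byte value at
an 8-byte-aligned address can never cross a 64-byte line, since 64 is a
multiple of 8.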

I feel compelled to add that attempting to reason about this sort of
thing generally means that you are making a mistake.  Unless you are
writing very low-level code, such as the implementation of mutex, it's
best to avoid trying to think this way.

Ian




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-18  8:18       ` __sync_fetch Ian Lance Taylor
@ 2012-11-18 19:31         ` Hei Chan
  2012-11-19  2:57           ` __sync_fetch Ian Lance Taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Hei Chan @ 2012-11-18 19:31 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help

I just spoke with my coworker about this.  We wonder whether the C++ standard/GCC guarantees that all variables will be aligned if we don't request unaligned layout (e.g. __packed__).

Thanks in advance.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-18 19:31         ` __sync_fetch Hei Chan
@ 2012-11-19  2:57           ` Ian Lance Taylor
       [not found]             ` <1353294140.73855.YahooMailNeo@web165005.mail.bf1.yahoo.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-19  2:57 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 11:31 AM, Hei Chan <structurechart@yahoo.com> wrote:
> I just spoke with my coworker about this.  We just wonder whether C++ standard/GCC guarantees all the variables will be aligned if we don't request for unaligned (e.g. __packed__).

Yes.
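
As an illustration only (the struct and helper names here are
hypothetical), the difference between the default layout and a packed
one can be checked with offsetof and GCC's __alignof__ extension:

```c
#include <stddef.h>

/* Default layout: the compiler pads after tag so that value is aligned. */
struct natural_rec {
    char tag;
    long value;
};

/* GCC extension: packed removes the padding, so value starts at offset 1. */
struct packed_rec {
    char tag;
    long value;
} __attribute__((packed));

/* Nonzero if member offset off is a multiple of the alignment that GCC
   reports for long. */
static int offset_suits_long(size_t off)
{
    return off % __alignof__(long) == 0;
}
```

With these definitions, offsetof(struct natural_rec, value) passes the
check, while offsetof(struct packed_rec, value) is 1 and does not.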

Ian


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
       [not found]                   ` <CAKOQZ8xgf7TTyU_X1oHzVRawYiKT5JK5JHiq__VtB_WUkdKAQQ@mail.gmail.com>
@ 2012-11-19  5:39                     ` Hei Chan
  2012-11-19  5:46                       ` __sync_fetch Ian Lance Taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Hei Chan @ 2012-11-19  5:39 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help

Hi Ian,

Thanks for your reply.

I am not sure why the cast in your example would cause any issue, as I thought that without __packed__ the variable a will be aligned by gcc, no?  And you are taking the address &a[0] (which is fixed) and then casting it to long*, which shouldn't have anything to do with alignment....or do you mean long p = (long)(*a)?

Thanks in advance.



----- Original Message -----
From: Ian Lance Taylor <iant@google.com>
To: Hei Chan <structurechart@yahoo.com>
Cc: 
Sent: Sunday, November 18, 2012 9:14 PM
Subject: Re: __sync_fetch

On Sun, Nov 18, 2012 at 9:03 PM, Hei Chan <structurechart@yahoo.com> wrote:
> You mean:
> 1. the variable I am trying to read doesn't belong to a packed struct; and

Yes.

> 2. I am not casting to something not 8 byte long on a 64 bit machine?  I
> thought casting would happen after the variable is stored in register?

I'm sorry, I don't know what you mean.

By casting I mean something like
  char a[16];
  long *p = (long *) a;
Nothing here makes a aligned.

If you want to continue this discussion, please use the mailing list,
not private mail to me.  Thanks.

Ian




> From: Ian Lance Taylor <iant@google.com>
> To: Hei Chan <structurechart@yahoo.com>
> Sent: Sunday, November 18, 2012 7:26 PM
> Subject: Re: __sync_fetch
>
> On Sun, Nov 18, 2012 at 7:02 PM, Hei Chan <structurechart@yahoo.com> wrote:
>> So the case you mentioned about unaligned cache line shouldn't happen,
>> right?
>
> Unless you are doing something unusual involving casts or packed
> structs, that is correct.
>
> Ian
>
>
>> Just want to check (so that I can shave another few hundreds nano sec in
>> my code).
>>
>> Thanks in advance.
>>
>>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-19  5:39                     ` __sync_fetch Hei Chan
@ 2012-11-19  5:46                       ` Ian Lance Taylor
  2012-11-19  6:07                         ` __sync_fetch Hei Chan
       [not found]                         ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
  0 siblings, 2 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-19  5:46 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 9:38 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> I am not sure why the casting in your example will cause any issue as I thought without __pack__, the variable a will be aligned by gcc, no?  And you are trying to get the address of &a[0] (which is fixed) and then cast to long*, shouldn't have anything do with alignment....or do you mean long p = (long)(*a)?

In my example, a is a char array.  A char array is not required to be
aligned to an 8-byte boundary.  It's true that GCC will align the
variable a, but it may align it to an odd address.  So if I cast the
array a to long*, I may get a long* that is not aligned to an 8-byte
boundary.
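
A sketch of two ways around that, assuming GCC extensions are
acceptable: either request the alignment explicitly, or copy the bytes
out instead of dereferencing a possibly misaligned long*.  The names
raw_buf, aligned_buf, is_8_byte_aligned and read_long_any are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* A plain char array: the language only promises 1-byte alignment. */
static char raw_buf[16];

/* GCC extension: request 8-byte alignment explicitly. */
static char aligned_buf[16] __attribute__((aligned(8)));

/* Nonzero if p is on an 8-byte boundary. */
static int is_8_byte_aligned(const void *p)
{
    return (uintptr_t)p % 8 == 0;
}

/* Read a long from any address without dereferencing a misaligned long*.
   Note: this copy is not atomic; it only avoids the misaligned access. */
static long read_long_any(const char *src)
{
    long v;
    memcpy(&v, src, sizeof v);
    return v;
}
```

Only the aligned_buf variant helps with the atomicity question in this
thread; the memcpy variant just makes the read well defined.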

Ian




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
  2012-11-19  5:46                       ` __sync_fetch Ian Lance Taylor
@ 2012-11-19  6:07                         ` Hei Chan
       [not found]                         ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
  1 sibling, 0 replies; 12+ messages in thread
From: Hei Chan @ 2012-11-19  6:07 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help

Hmm...I think I must be missing something.

Let's say a[0] is at 0x4 (so not an address divisible by 8), and let's say only the first 4 characters are on the cache line.

Then long* p = (long*)a will give 0x4 in any situation, as the address of an array can't change.  So it shouldn't be a problem, no?  Or do you mean that when I dereference p to read the value it points to, there will be a problem because only the first 4 bytes are in the cache line?

Thanks in advance.  I really appreciate your explanation.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: __sync_fetch
       [not found]                         ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
@ 2012-11-19  6:18                           ` Ian Lance Taylor
  0 siblings, 0 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-19  6:18 UTC (permalink / raw)
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 10:04 PM, Hei Chan <structurechart@yahoo.com> wrote:
> Hmm... I think I must be missing something.
>
> Let's say a[0] is at 0x4 (so not an address divisible by 8), and let's say
> only the first 4 characters are on the cache line.
>
> Then long* p = (long*)a will give 0x4 in any situation, as the address of an
> array can't change.  So it shouldn't be a problem, no?  Or do you mean that
> when I dereference p to read the value p points to, there
> will be a problem because only the first 4 bytes are in the cache line?

Yes.

Ian



> From: Ian Lance Taylor <iant@google.com>
> To: Hei Chan <structurechart@yahoo.com>
> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
> Sent: Sunday, November 18, 2012 9:46 PM
> Subject: Re: __sync_fetch
>
> On Sun, Nov 18, 2012 at 9:38 PM, Hei Chan <structurechart@yahoo.com> wrote:
>>
>> I am not sure why the cast in your example would cause any issue, as I
>> thought that without __pack__, the variable a will be aligned by gcc, no?  And
>> you are taking the address &a[0] (which is fixed) and then casting it to
>> long*, which shouldn't have anything to do with alignment... or do you mean
>> long p = (long)(*a)?
>
> In my example, a is a char array.  A char array is not required to be
> aligned to an 8-byte boundary.  It's true that GCC will align the
> variable a, but it may align it to an odd address.  So if I cast the
> array a to long*, I may get a long* that is not aligned to an 8-byte
> boundary.
>
> Ian
>
>
>
>> ----- Original Message -----
>> From: Ian Lance Taylor <iant@google.com>
>> To: Hei Chan <structurechart@yahoo.com>
>> Cc:
>> Sent: Sunday, November 18, 2012 9:14 PM
>> Subject: Re: __sync_fetch
>>
>> On Sun, Nov 18, 2012 at 9:03 PM, Hei Chan <structurechart@yahoo.com>
>> wrote:
>>> You mean:
>>> 1. the variable I am trying to read doesn't belong to a packed struct;
>>> and
>>
>> Yes.
>>
>>> 2. I am not casting to something not 8 byte long on a 64 bit machine?  I
>>> thought casting would happen after the variable is stored in register?
>>
>> I'm sorry, I don't know what you mean.
>>
>> By casting I mean something like
>>  char a[16];
>>  long *p = (long *) a;
>> Nothing here makes a aligned.
>>
>> If you want to continue this discussion, please use the mailing list,
>> not private mail to me.  Thanks.
>>
>> Ian
>>
>>
>>
>>
>>> From: Ian Lance Taylor <iant@google.com>
>>> To: Hei Chan <structurechart@yahoo.com>
>>> Sent: Sunday, November 18, 2012 7:26 PM
>>> Subject: Re: __sync_fetch
>>>
>>> On Sun, Nov 18, 2012 at 7:02 PM, Hei Chan <structurechart@yahoo.com>
>>> wrote:
>>>> So the case you mentioned about unaligned cache line shouldn't happen,
>>>> right?
>>>
>>> Unless you are doing something unusual involving casts or packed
>>> structs, that is correct.
>>>
>>> Ian
>>>
>>>
>>>> Just want to check (so that I can shave another few hundred nanoseconds
>>>> off my code).
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Ian Lance Taylor <iant@google.com>
>>>> To: Hei Chan <structurechart@yahoo.com>
>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>> Sent: Sunday, November 18, 2012 6:57 PM
>>>> Subject: Re: __sync_fetch
>>>>
>>>> On Sun, Nov 18, 2012 at 11:31 AM, Hei Chan <structurechart@yahoo.com>
>>>> wrote:
>>>>> I just spoke with my coworker about this.  We wonder whether the C++
>>>>> standard/GCC guarantees that all variables will be aligned if we don't
>>>>> request otherwise (e.g. with __packed__).
>>>>
>>>> Yes.
>>>>
>>>> Ian
>>>>
>>>>> ----- Original Message -----
>>>>> From: Ian Lance Taylor <iant@google.com>
>>>>> To: Hei Chan <structurechart@yahoo.com>
>>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>>> Sent: Sunday, November 18, 2012 12:18 AM
>>>>> Subject: Re: __sync_fetch
>>>>>
>>>>> On Sun, Nov 18, 2012 at 12:10 AM, Hei Chan <structurechart@yahoo.com>
>>>>> wrote:
>>>>>>
>>>>>> How about on a 64-bit Intel processor, where I use __sync_fetch_and_*()
>>>>>> to write to a long variable, but never use any __sync_*() to read?  In
>>>>>> what situation would I read something invalid?
>>>>>
>>>>> On a 64-bit Intel processor, if the 64-bit value is at an aligned
>>>>> address, then to the best of my knowledge that will always be fine.  If
>>>>> the 64-bit value is misaligned and crosses a cache line, then if you
>>>>> are unlucky I believe that a write can occur in between reading the
>>>>> two different cache lines, causing you to read a value that was never
>>>>> written.
>>>>>
>>>>> I feel compelled to add that attempting to reason about this sort of
>>>>> thing generally means that you are making a mistake.  Unless you are
>>>>> writing very low-level code, such as the implementation of a mutex, it's
>>>>> best to avoid trying to think this way.
>>>>>
>>>>> Ian
>>>>>
>>>>>
>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: Ian Lance Taylor <iant@google.com>
>>>>>> To: Hei Chan <structurechart@yahoo.com>
>>>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>>>> Sent: Sunday, November 18, 2012 12:07 AM
>>>>>> Subject: Re: __sync_fetch
>>>>>>
>>>>>> On Sat, Nov 17, 2012 at 11:04 PM, Hei Chan <structurechart@yahoo.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> After searching for more info, it seems that even though reading a
>>>>>>> long (i.e. 8 bytes) is one operation on a 64-bit machine, it
>>>>>>> might not give the "correct" value:
>>>>>>> http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html
>>>>>>>
>>>>>>> And so, we have to use __sync_fetch_and_add(&x, 0) to read?
>>>>>>>
>>>>>>> Could someone describe a situation in which reading a long variable
>>>>>>> won't get the correct value, given that all writes in the application
>>>>>>> use __sync_fetch_*()?
>>>>>>
>>>>>> If you always use __sync_fetch_and_add(&x, 0) to read a value, and you
>>>>>> always use __sync_fetch_and_add to write the value also with some
>>>>>> appropriate delta, then all the accesses to that variable should be
>>>>>> atomic with sequential consistency.  That should be true on any
>>>>>> processor that implements __sync_fetch_and_add in the appropriate
>>>>>> size.
>>>>>>
>>>>>> Ian
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>


Thread overview: 12+ messages
2012-11-17  6:34 __sync_fetch Hei Chan
2012-11-18  7:04 ` __sync_fetch Hei Chan
2012-11-18  8:07   ` __sync_fetch Ian Lance Taylor
2012-11-18  8:11     ` __sync_fetch Hei Chan
2012-11-18  8:18       ` __sync_fetch Ian Lance Taylor
2012-11-18 19:31         ` __sync_fetch Hei Chan
2012-11-19  2:57           ` __sync_fetch Ian Lance Taylor
     [not found]             ` <1353294140.73855.YahooMailNeo@web165005.mail.bf1.yahoo.com>
     [not found]               ` <CAKOQZ8y2-uP_jQMd+xCtT4Svm121HiJSdz+FGvAW-NSXxM9F+g@mail.gmail.com>
     [not found]                 ` <1353301408.14218.YahooMailNeo@web165002.mail.bf1.yahoo.com>
     [not found]                   ` <CAKOQZ8xgf7TTyU_X1oHzVRawYiKT5JK5JHiq__VtB_WUkdKAQQ@mail.gmail.com>
2012-11-19  5:39                     ` __sync_fetch Hei Chan
2012-11-19  5:46                       ` __sync_fetch Ian Lance Taylor
2012-11-19  6:07                         ` __sync_fetch Hei Chan
     [not found]                         ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
2012-11-19  6:18                           ` __sync_fetch Ian Lance Taylor
2012-11-18  8:04 ` __sync_fetch Ian Lance Taylor
