* __sync_fetch
@ 2012-11-17 6:34 Hei Chan
  2012-11-18 7:04 ` __sync_fetch Hei Chan
  2012-11-18 8:04 ` __sync_fetch Ian Lance Taylor
  0 siblings, 2 replies; 12+ messages in thread
From: Hei Chan @ 2012-11-17 6:34 UTC (permalink / raw)
  To: gcc-help

Hi,

I am using GCC 4.1.2, and so no __atomic*(). I am looking at http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html

I see __sync_fetch_and_*(), but I don't see __sync_fetch(). Is it because the built-in routines support integral scalar or pointer type that is up to 8 bytes in length, and so the read is automatically atomic anyway?

Thanks in advance.

Cheers,
Hei

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: __sync_fetch
From: Hei Chan @ 2012-11-18 7:04 UTC
  To: Hei Chan, gcc-help

Hi,

After searching for more info, it seems that even though reading a long (i.e. 8 bytes) on a 64-bit machine is one operation, it might not give the "correct" value:
http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html

And so, do we have to use __sync_fetch_and_add(&x, 0) to read?

Could someone describe a situation in which reading a long variable won't get the correct value, given that all writes in the application use __sync_fetch_*()?

Thanks in advance.

Cheers,
Hei
* Re: __sync_fetch
From: Ian Lance Taylor @ 2012-11-18 8:07 UTC
  To: Hei Chan; +Cc: gcc-help

On Sat, Nov 17, 2012 at 11:04 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> After searching for more info, it seems that even though reading a long (i.e. 8 bytes) on a 64-bit machine is one operation, it might not give the "correct" value:
> http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html
>
> And so, do we have to use __sync_fetch_and_add(&x, 0) to read?
>
> Could someone describe a situation in which reading a long variable won't get the correct value, given that all writes in the application use __sync_fetch_*()?

If you always use __sync_fetch_and_add(&x, 0) to read a value, and you always use __sync_fetch_and_add to write the value also with some appropriate delta, then all the accesses to that variable should be atomic with sequential consistency. That should be true on any processor that implements __sync_fetch_and_add in the appropriate size.

Ian
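A minimal sketch of the access pattern described in the reply above, assuming a single shared long that every thread touches only through the builtin. The names counter, counter_add, and counter_read are hypothetical and do not come from the thread.

```c
/* Sketch only: counter, counter_add and counter_read are hypothetical names. */

long counter; /* shared between threads; accessed only via the helpers below */

/* Atomically add delta and return the value held before the addition. */
long counter_add(long delta)
{
    return __sync_fetch_and_add(&counter, delta);
}

/* Atomic read: adding 0 leaves the value unchanged, but the access goes
   through the same atomic read-modify-write path that the writers use. */
long counter_read(void)
{
    return __sync_fetch_and_add(&counter, 0);
}
```

Because the read is a full read-modify-write, it is usually more expensive than a plain load, which is the cost the later messages in the thread try to avoid.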
* Re: __sync_fetch
From: Hei Chan @ 2012-11-18 8:11 UTC
  To: Ian Lance Taylor; +Cc: gcc-help

Hi Ian,

Thanks for your reply.

What if, on a 64-bit Intel processor, I use __sync_fetch_and_*() to write to a long variable, but never use any __sync_*() to read? In what situation would I read something invalid?

Thanks in advance.

Cheers,
Hei
* Re: __sync_fetch
From: Ian Lance Taylor @ 2012-11-18 8:18 UTC
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 12:10 AM, Hei Chan <structurechart@yahoo.com> wrote:
>
> What if, on a 64-bit Intel processor, I use __sync_fetch_and_*() to write to a long variable, but never use any __sync_*() to read? In what situation would I read something invalid?

On a 64-bit Intel processor, if the 64-bit value is at an aligned address, then to the best of my knowledge that will always be fine. If the 64-bit value is misaligned and crosses a cache line, then if you are unlucky I believe that a write can occur in between reading the two different cache lines, causing you to read a value that was never written.

I feel compelled to add that attempting to reason about this sort of thing generally means that you are making a mistake. Unless you are writing very low-level code, such as the implementation of a mutex, it's best to avoid trying to think this way.

Ian
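To make the cache-line condition in the reply above concrete, here is a small sketch that tests whether an object of a given size at a given address spans two cache lines. The 64-byte line size and the names ASSUMED_CACHE_LINE and straddles_cache_line are assumptions for illustration; the thread does not state a line size.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, common on x86-64 but not guaranteed. */
#define ASSUMED_CACHE_LINE 64u

/* Returns 1 if an object of `size` bytes (size >= 1) starting at `addr`
   has its first and last byte in different cache lines, 0 otherwise. */
int straddles_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / ASSUMED_CACHE_LINE;
    uintptr_t last_line = (addr + size - 1) / ASSUMED_CACHE_LINE;
    return first_line != last_line;
}
```

Under this assumption, an 8-byte value at an address that is a multiple of 8 never straddles a line, because 64 is itself a multiple of 8. Only a misaligned value, for example one starting at offset 60 within a line, can be split.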
* Re: __sync_fetch
From: Hei Chan @ 2012-11-18 19:31 UTC
  To: Ian Lance Taylor; +Cc: gcc-help

I just spoke with my coworker about this. We wonder whether the C++ standard/GCC guarantees that all variables will be aligned if we don't request unaligned layout (e.g. __packed__).

Thanks in advance.
* Re: __sync_fetch
From: Ian Lance Taylor @ 2012-11-19 2:57 UTC
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 11:31 AM, Hei Chan <structurechart@yahoo.com> wrote:
> I just spoke with my coworker about this. We wonder whether the C++ standard/GCC guarantees that all variables will be aligned if we don't request unaligned layout (e.g. __packed__).

Yes.

Ian
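The difference between default layout and a packed layout can be seen with offsetof. This is a sketch under the assumption of GCC's __attribute__((packed)); the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Default layout: the compiler pads after `tag` so that `value`
   starts at an offset that satisfies the alignment of long. */
struct plain_rec {
    char tag;
    long value;
};

/* Packed layout (GCC extension): no padding, so `value` starts at offset 1. */
struct packed_rec {
    char tag;
    long value;
} __attribute__((packed));

size_t plain_value_offset(void)
{
    return offsetof(struct plain_rec, value);
}

size_t packed_value_offset(void)
{
    return offsetof(struct packed_rec, value);
}
```

Only the packed variant places the long at an offset that is not a multiple of its alignment, which is the kind of explicit request the question refers to.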
[parent not found: <1353294140.73855.YahooMailNeo@web165005.mail.bf1.yahoo.com>]
[parent not found: <CAKOQZ8y2-uP_jQMd+xCtT4Svm121HiJSdz+FGvAW-NSXxM9F+g@mail.gmail.com>]
[parent not found: <1353301408.14218.YahooMailNeo@web165002.mail.bf1.yahoo.com>]
[parent not found: <CAKOQZ8xgf7TTyU_X1oHzVRawYiKT5JK5JHiq__VtB_WUkdKAQQ@mail.gmail.com>]
* Re: __sync_fetch
From: Hei Chan @ 2012-11-19 5:39 UTC
  To: Ian Lance Taylor; +Cc: gcc-help

Hi Ian,

Thanks for your reply.

I am not sure why the cast in your example would cause any issue, as I thought that without __packed__ the variable a will be aligned by gcc, no? And you are taking the address &a[0] (which is fixed) and then casting it to long*, which shouldn't have anything to do with alignment... or do you mean long p = (long)(*a)?

Thanks in advance.

----- Original Message -----
From: Ian Lance Taylor <iant@google.com>
To: Hei Chan <structurechart@yahoo.com>
Cc:
Sent: Sunday, November 18, 2012 9:14 PM
Subject: Re: __sync_fetch

On Sun, Nov 18, 2012 at 9:03 PM, Hei Chan <structurechart@yahoo.com> wrote:
> You mean:
> 1. the variable I am trying to read doesn't belong to a packed struct; and

Yes.

> 2. I am not casting to something not 8 byte long on a 64 bit machine? I thought casting would happen after the variable is stored in register?

I'm sorry, I don't know what you mean.

By casting I mean something like
    char a[16];
    long *p = (long *) a;
Nothing here makes a aligned.

If you want to continue this discussion, please use the mailing list, not private mail to me. Thanks.

Ian

> From: Ian Lance Taylor <iant@google.com>
> To: Hei Chan <structurechart@yahoo.com>
> Sent: Sunday, November 18, 2012 7:26 PM
> Subject: Re: __sync_fetch
>
> On Sun, Nov 18, 2012 at 7:02 PM, Hei Chan <structurechart@yahoo.com> wrote:
>> So the case you mentioned about unaligned cache line shouldn't happen, right?
>
> Unless you are doing something unusual involving casts or packed structs, that is correct.
>
> Ian
>
>> Just want to check (so that I can shave another few hundred nanoseconds in my code).
>>
>> Thanks in advance.
* Re: __sync_fetch
From: Ian Lance Taylor @ 2012-11-19 5:46 UTC
  To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 9:38 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> I am not sure why the cast in your example would cause any issue, as I thought that without __packed__ the variable a will be aligned by gcc, no? And you are taking the address &a[0] (which is fixed) and then casting it to long*, which shouldn't have anything to do with alignment... or do you mean long p = (long)(*a)?

In my example, a is a char array. A char array is not required to be aligned to an 8-byte boundary. It's true that GCC will align the variable a, but it may align it to an odd address. So if I cast the array a to long*, I may get a long* that is not aligned to an 8-byte boundary.

Ian
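The point about the char array can be checked at run time. The sketch below is an illustration, not code from the thread: is_long_aligned, plain_buf, and aligned_buf are hypothetical names, and __attribute__((aligned(...))) and __alignof__ are GCC extensions.

```c
#include <stdint.h>

/* Returns 1 if p is suitably aligned to be accessed through a long*. */
int is_long_aligned(const void *p)
{
    return ((uintptr_t)p % __alignof__(long)) == 0;
}

/* A plain char array is only required to be 1-byte aligned, so a
   long* obtained by casting it may or may not be aligned. */
char plain_buf[16];

/* Requesting the alignment explicitly makes the cast safe with
   respect to alignment. */
char aligned_buf[16] __attribute__((aligned(__alignof__(long))));
```

Checking is_long_aligned(plain_buf) may well return 1 with a given compiler and set of flags, but nothing guarantees it. Only aligned_buf carries the guarantee, and an address such as aligned_buf + 1 is never aligned for long.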
* Re: __sync_fetch
From: Hei Chan @ 2012-11-19 6:07 UTC
  To: Ian Lance Taylor; +Cc: gcc-help

Hmm... I think I must be missing something.

Let's say a[0] is at 0x4 (so not an address divisible by 8), and let's say only the first 4 characters are on the cache line.

Then, long* p = (long*)a will give 0x4 in any situation, as the address of an array can't be changed. So it shouldn't be a problem, no? Or do you mean that when I dereference p to read the value it points to, there will be a problem because only the first 4 bytes are in the cache line?

Thanks in advance. I really appreciate your explanation.
[parent not found: <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>]
* Re: __sync_fetch
 [not found] ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
@ 2012-11-19 6:18 ` Ian Lance Taylor
 0 siblings, 0 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-19 6:18 UTC (permalink / raw)
 To: Hei Chan; +Cc: gcc-help

On Sun, Nov 18, 2012 at 10:04 PM, Hei Chan <structurechart@yahoo.com> wrote:
> Hmm...I think I must miss something.
>
> Let's say a[0] is at 0x4 (so not an address divisible by 8), and let's say
> only the first 4 characters are on the cache line.
>
> Then, long* p = (long*)a will give 0x4 in any situation as the address of an
> array can't be changed. So it shouldn't be a problem, no? Or do you mean
> when I try to de-reference p to read the value p pointing to, then there
> will be a problem as only first 4 bytes are in the cache line?

Yes.

Ian

> From: Ian Lance Taylor <iant@google.com>
> To: Hei Chan <structurechart@yahoo.com>
> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
> Sent: Sunday, November 18, 2012 9:46 PM
> Subject: Re: __sync_fetch
>
> On Sun, Nov 18, 2012 at 9:38 PM, Hei Chan <structurechart@yahoo.com> wrote:
>>
>> I am not sure why the casting in your example will cause any issue as I
>> thought without __pack__, the variable a will be aligned by gcc, no? And
>> you are trying to get the address of &a[0] (which is fixed) and then cast to
>> long*, shouldn't have anything to do with alignment....or do you mean long p =
>> (long)(*a)?
>
> In my example, a is a char array. A char array is not required to be
> aligned to an 8-byte boundary. It's true that GCC will align the
> variable a, but it may align it to an odd address. So if I cast the
> array a to long*, I may get a long* that is not aligned to an 8-byte
> boundary.
>
> Ian
>
>> ----- Original Message -----
>> From: Ian Lance Taylor <iant@google.com>
>> To: Hei Chan <structurechart@yahoo.com>
>> Cc:
>> Sent: Sunday, November 18, 2012 9:14 PM
>> Subject: Re: __sync_fetch
>>
>> On Sun, Nov 18, 2012 at 9:03 PM, Hei Chan <structurechart@yahoo.com>
>> wrote:
>>> You mean:
>>> 1. the variable I am trying to read doesn't belong to a packed struct;
>>> and
>>
>> Yes.
>>
>>> 2. I am not casting to something not 8 byte long on a 64 bit machine? I
>>> thought casting would happen after the variable is stored in register?
>>
>> I'm sorry, I don't know what you mean.
>>
>> By casting I mean something like
>>     char a[16];
>>     long *p = (long *) a;
>> Nothing here makes a aligned.
>>
>> If you want to continue this discussion, please use the mailing list,
>> not private mail to me. Thanks.
>>
>> Ian
>>
>>> From: Ian Lance Taylor <iant@google.com>
>>> To: Hei Chan <structurechart@yahoo.com>
>>> Sent: Sunday, November 18, 2012 7:26 PM
>>> Subject: Re: __sync_fetch
>>>
>>> On Sun, Nov 18, 2012 at 7:02 PM, Hei Chan <structurechart@yahoo.com>
>>> wrote:
>>>> So the case you mentioned about unaligned cache line shouldn't happen,
>>>> right?
>>>
>>> Unless you are doing something unusual involving casts or packed
>>> structs, that is correct.
>>>
>>> Ian
>>>
>>>> Just want to check (so that I can shave another few hundreds nano sec in
>>>> my code).
>>>>
>>>> Thanks in advance.
>>>>
>>>> ----- Original Message -----
>>>> From: Ian Lance Taylor <iant@google.com>
>>>> To: Hei Chan <structurechart@yahoo.com>
>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>> Sent: Sunday, November 18, 2012 6:57 PM
>>>> Subject: Re: __sync_fetch
>>>>
>>>> On Sun, Nov 18, 2012 at 11:31 AM, Hei Chan <structurechart@yahoo.com>
>>>> wrote:
>>>>> I just spoke with my coworker about this. We just wonder whether C++
>>>>> standard/GCC guarantees all the variables will be aligned if we don't
>>>>> request for unaligned (e.g. __packed__).
>>>>
>>>> Yes.
>>>>
>>>> Ian
>>>>
>>>>> ----- Original Message -----
>>>>> From: Ian Lance Taylor <iant@google.com>
>>>>> To: Hei Chan <structurechart@yahoo.com>
>>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>>> Sent: Sunday, November 18, 2012 12:18 AM
>>>>> Subject: Re: __sync_fetch
>>>>>
>>>>> On Sun, Nov 18, 2012 at 12:10 AM, Hei Chan <structurechart@yahoo.com>
>>>>> wrote:
>>>>>>
>>>>>> How about on a 64-bit Intel processor, I use __sync_fetch_and_*() to
>>>>>> write to a long variable, but never use any __sync_*() to read? Under
>>>>>> what situation that I will read something invalid?
>>>>>
>>>>> On a 64-bit Intel processor, if the 64-bit value is at an aligned
>>>>> address, then to the best of my knowledge that will always be fine. If
>>>>> the 64-bit value is misaligned and crosses a cache line, then if you
>>>>> are unlucky I believe that a write can occur in between reading the
>>>>> two different cache lines, causing you to read a value that was never
>>>>> written.
>>>>>
>>>>> I feel compelled to add that attempting to reason about this sort of
>>>>> thing generally means that you are making a mistake. Unless you are
>>>>> writing very low-level code, such as the implementation of mutex, it's
>>>>> best to avoid trying to think this way.
>>>>>
>>>>> Ian
>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: Ian Lance Taylor <iant@google.com>
>>>>>> To: Hei Chan <structurechart@yahoo.com>
>>>>>> Cc: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
>>>>>> Sent: Sunday, November 18, 2012 12:07 AM
>>>>>> Subject: Re: __sync_fetch
>>>>>>
>>>>>> On Sat, Nov 17, 2012 at 11:04 PM, Hei Chan <structurechart@yahoo.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> After searching more for info, it seems like even though on a
>>>>>>> 64-bit machine, reading a long (i.e. 8 bytes) is one operation, it
>>>>>>> might not give the "correct" value:
>>>>>>> http://gcc.gnu.org/ml/gcc/2008-03/msg00201.html
>>>>>>>
>>>>>>> And so, we have to use __sync_fetch_and_add(&x, 0) to read?
>>>>>>>
>>>>>>> Could someone elaborate a situation that reading a long variable won't get
>>>>>>> the correct value given that all writes in the application use
>>>>>>> __sync_fetch_*()?
>>>>>>
>>>>>> If you always use __sync_fetch_and_add(&x, 0) to read a value, and you
>>>>>> always use __sync_fetch_and_add to write the value also with some
>>>>>> appropriate delta, then all the accesses to that variable should be
>>>>>> atomic with sequential consistency. That should be true on any
>>>>>> processor that implements __sync_fetch_and_add in the appropriate
>>>>>> size.
>>>>>>
>>>>>> Ian
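The misalignment case discussed in the message above can be sketched in C as follows. This is only an illustration, assuming GCC (for the `__alignof__` extension); the helper names `is_long_aligned` and `load_long_unaligned` are hypothetical and do not come from the thread. The first helper checks whether a pointer such as `(long *) a` would be suitably aligned, and the second reads a long from an arbitrary byte address without dereferencing a misaligned `long *`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): nonzero if p is suitably
   aligned for a long. __alignof__ is a GCC extension. */
static int is_long_aligned(const void *p)
{
    return ((uintptr_t)p % __alignof__(long)) == 0;
}

/* Hypothetical helper: copy a long out of an arbitrary byte address instead
   of dereferencing a possibly misaligned long*. This avoids the misaligned
   access, but it is NOT atomic with respect to concurrent writers. */
static long load_long_unaligned(const char *bytes)
{
    long v;
    assert(bytes != NULL);
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

For `char a[16];`, `is_long_aligned(a)` may be true or false depending on where the compiler places `a`, which is the point of the `char a[16]` example quoted above. A misaligned value that straddles two cache lines is the case where a plain read may observe a value that was never written.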
* Re: __sync_fetch
 2012-11-17 6:34 __sync_fetch Hei Chan
 2012-11-18 7:04 ` __sync_fetch Hei Chan
@ 2012-11-18 8:04 ` Ian Lance Taylor
 1 sibling, 0 replies; 12+ messages in thread
From: Ian Lance Taylor @ 2012-11-18 8:04 UTC (permalink / raw)
 To: Hei Chan; +Cc: gcc-help

On Fri, Nov 16, 2012 at 10:34 PM, Hei Chan <structurechart@yahoo.com> wrote:
>
> I am using GCC 4.1.2, and so no __atomic*().
>
> I am looking at http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html
>
> I see __sync_fetch_and_*(), but I don't see __sync_fetch(). Is it because
> the built-in routines support integral scalar or pointer type that is up to
> 8 bytes in length, and so the read is automatically atomic anyway?

The __sync primitives were designed by Intel. I believe that they did
not include atomic load or store operators because on x86 processors
all aligned loads and stores are atomic. Synchronization of loads and
stores with other processors on x86 requires the use of explicit
memory fence instructions.

Since GCC just picked up the Intel designed primitives, they work fine
on x86, but are deficient on other processors.

Ian
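Since the `__sync` family has no dedicated load, the read idiom suggested in this thread can be sketched as below. This is a minimal sketch, not a definitive implementation: the variable `counter` and the function names are hypothetical, and `counter_read_x86` rests on the assumption stated in the message above that aligned loads are atomic on x86, so it should not be taken as portable.

```c
#include <assert.h>

/* Hypothetical shared variable; the name is illustrative only. */
static long counter;

/* Atomic read: adding 0 returns the current value and acts as a full barrier. */
static long counter_read(void)
{
    return __sync_fetch_and_add(&counter, 0);
}

/* Atomic update: returns the value held before delta was added. */
static long counter_add(long delta)
{
    return __sync_fetch_and_add(&counter, delta);
}

/* x86-only variant (an assumption based on the message above): a plain load
   of an aligned long, followed by an explicit full memory fence. */
static long counter_read_x86(void)
{
    long v = *(volatile long *)&counter;
    __sync_synchronize();
    return v;
}

/* Single-threaded usage check. */
static void counter_demo(void)
{
    long before = counter_read();
    counter_add(3);
    assert(counter_read() == before + 3);
    assert(counter_read_x86() == before + 3);
}
```

The first two functions follow the pattern recommended in the thread for any processor that implements `__sync_fetch_and_add` at the required size. The plain-load variant trades that portability for a cheaper read.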
end of thread

Thread overview: 12+ messages
2012-11-17 6:34 __sync_fetch Hei Chan
2012-11-18 7:04 ` __sync_fetch Hei Chan
2012-11-18 8:07 ` __sync_fetch Ian Lance Taylor
2012-11-18 8:11 ` __sync_fetch Hei Chan
2012-11-18 8:18 ` __sync_fetch Ian Lance Taylor
2012-11-18 19:31 ` __sync_fetch Hei Chan
2012-11-19 2:57 ` __sync_fetch Ian Lance Taylor
[not found] ` <1353294140.73855.YahooMailNeo@web165005.mail.bf1.yahoo.com>
[not found] ` <CAKOQZ8y2-uP_jQMd+xCtT4Svm121HiJSdz+FGvAW-NSXxM9F+g@mail.gmail.com>
[not found] ` <1353301408.14218.YahooMailNeo@web165002.mail.bf1.yahoo.com>
[not found] ` <CAKOQZ8xgf7TTyU_X1oHzVRawYiKT5JK5JHiq__VtB_WUkdKAQQ@mail.gmail.com>
2012-11-19 5:39 ` __sync_fetch Hei Chan
2012-11-19 5:46 ` __sync_fetch Ian Lance Taylor
2012-11-19 6:07 ` __sync_fetch Hei Chan
[not found] ` <1353305098.17316.YahooMailNeo@web165003.mail.bf1.yahoo.com>
2012-11-19 6:18 ` __sync_fetch Ian Lance Taylor
2012-11-18 8:04 ` __sync_fetch Ian Lance Taylor