* Libatomic 16B
@ 2022-02-23 16:42 Satish Vasudeva
2022-02-24 16:42 ` Satish Vasudeva
2022-02-24 19:09 ` Xi Ruoyao
0 siblings, 2 replies; 19+ messages in thread
From: Satish Vasudeva @ 2022-02-23 16:42 UTC (permalink / raw)
To: gcc-help
Hi Team,
I was looking at the hotspots in our software stack and, interestingly,
libat_load_16_i1 appears near the top of the list.
I am trying to understand why that is the case. My suspicion is some kind
of lock usage for 16B atomic accesses.
I came across this discussion but frankly I am still confused.
https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html
Do you think the overhead of libat_load_16_i1 is due to spinlock usage?
Also, from reading some Intel CPU docs, it seems the CPU does support
loading 16B in a single access. In that case, can we optimize this for
performance?
Thanks and appreciate your help.
Satish
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-02-24 16:42 UTC (permalink / raw)
To: gcc-help
I looked into this further. It seems libat_load_16_i1 implements the
16B load as "lock cmpxchg16b (%rdi)".
This assumes that the CPU doesn't support 16B loads in a single
transaction. How can I compile libatomic to use intrinsics for the 16B load
instead of LOCK cmpxchg?
Appreciate your response.
Satish
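For readers wondering how a compare-and-swap can act as a load, here is a minimal sketch. It uses a 64-bit word and the GCC __atomic builtins so that it builds without -mcx16 or libatomic; the 16B case described above follows the same idea with cmpxchg16b. The helper name load_via_cas is hypothetical and is not libatomic's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read *p using only a compare-and-swap.
   Expected and desired are both 0, so memory never visibly changes. */
uint64_t load_via_cas(uint64_t *p)
{
    uint64_t expected = 0;

    assert(p != 0);
    /* If *p == 0 the CAS succeeds and stores 0 back (same value).
       Otherwise it fails and copies the current value into `expected`. */
    __atomic_compare_exchange_n(p, &expected, 0, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

Either way the returned value is what was in memory, but the operation is a locked read-modify-write rather than a plain read, which is one plausible reason it shows up in profiles.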
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-24 19:09 UTC (permalink / raw)
To: Satish Vasudeva, gcc-help
On Wed, 2022-02-23 at 08:42 -0800, Satish Vasudeva via Gcc-help wrote:
> Hi Team,
>
> I was looking at the hotspots in our software stack and interestingly I see
> libat_load_16_i1 seems to be one of the top in the list.
>
> I am trying to understand why that is the case. My suspicion is some kind
> of lock usage for 16B atomic accesses.
>
> I came across this discussion but frankly I am still confused.
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html
>
> Do you think the overhead of libat_load_16_i1 is due to spinlock usage?
> Also reading some other Intel CPU docs, it seems like the CPU does support
> loading 16B in single access. In that case can we optimize this for
> performance?
Open an issue at https://gcc.gnu.org/bugzilla, with a reference to the
Intel CPU doc proving that some specific models support atomic 128-bit loads.
Don't say "it seems like"; nobody wants to write some nasty SSE code and
then find it doesn't work on any CPU.
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-02-24 19:35 UTC (permalink / raw)
To: Xi Ruoyao; +Cc: gcc-help
Thanks for the response.
Looking further into the libatomic library code, I do see 16B move instructions
being used in the atomic_exchange code, like the one below. I am just wondering
why it does not generate an intrinsic __atomic_load_16 using this instruction.
movdqa 0x0(%rbp),%xmm0
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-24 20:05 UTC (permalink / raw)
To: Satish Vasudeva; +Cc: gcc-help
On Thu, 2022-02-24 at 11:35 -0800, Satish Vasudeva wrote:
> Thanks for the response.
>
> Looking further into libatomic library code, I do see 16B move
> instructions have been used for atomic_exchange code like below. Just
> wondering why it is not generating a intrinsic __atomic_load_16 using
> this instruction.
>
> movdqa 0x0(%rbp),%xmm0
Because neither Intel nor AMD has claimed "this is atomic". In
__atomic_exchange, movdqa is used as a normal data move instruction
(actually, GCC optimized the memcpy calls in the libatomic code into this).
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: Libatomic 16B
From: Segher Boessenkool @ 2022-02-24 20:13 UTC (permalink / raw)
To: Xi Ruoyao; +Cc: Satish Vasudeva, gcc-help
On Fri, Feb 25, 2022 at 04:05:28AM +0800, Xi Ruoyao via Gcc-help wrote:
> On Thu, 2022-02-24 at 11:35 -0800, Satish Vasudeva wrote:
> > Thanks for the response.
> >
> > Looking further into libatomic library code, I do see 16B move
> > instructions have been used for atomic_exchange code like below. Just
> > wondering why it is not generating a intrinsic __atomic_load_16 using
> > this instruction.
> >
> > movdqa 0x0(%rbp),%xmm0
>
> Because both Intel and AMD have not claimed "this is atomic". In
> __atomic_exchange movdqa is used as a normal data move instruction
> (actually, GCC optimized memcpy calls in libatomic code to this).
Yup. Even on cores where this is atomic internally it is not atomic
when used on a system with a 64-bit (or 72-bit) memory bus.
Segher
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-02-24 20:38 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Xi Ruoyao, gcc-help
Thanks for the comments.
Please look at section 8.1.1 of this Intel architecture manual:
https://cdrdv2.intel.com/v1/dl/getContent/671190
I think Intel claims 16B operations are atomic, unless I am missing
something.
* Re: Libatomic 16B
From: Stefan Ring @ 2022-02-25 8:35 UTC (permalink / raw)
To: gcc-help
On Thu, Feb 24, 2022 at 9:39 PM Satish Vasudeva via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
> Please let into this intel architecture manual , section 8.1.1
>
> https://cdrdv2.intel.com/v1/dl/getContent/671190
>
> I think Intel claims 16B operations are atomic , unless I am missing
> something.
Interesting. This seems to be a somewhat recent addition, and the
mailing list discussion linked to above predates it. Coincidentally, I
pulled a copy of the Intel manuals at almost exactly the same time as
this discussion, and sure enough, it does not yet contain the
paragraph about 16 byte operations.
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-25 8:48 UTC (permalink / raw)
To: Stefan Ring, gcc-help
On Fri, 2022-02-25 at 09:35 +0100, Stefan Ring via Gcc-help wrote:
> On Thu, Feb 24, 2022 at 9:39 PM Satish Vasudeva via Gcc-help
> <gcc-help@gcc.gnu.org> wrote:
> >
> > Please let into this intel architecture manual , section 8.1.1
> >
> > https://cdrdv2.intel.com/v1/dl/getContent/671190
> >
> > I think Intel claims 16B operations are atomic , unless I am missing
> > something.
>
> Interesting. This seems to be a somewhat recent addition, and the
> mailing list discussion linked to above predates it. Coincidentally, I
> pulled a copy of the Intel manuals at almost exactly the same time as
> this discussion, and sure enough, it does not yet contain the
> paragraph about 16 byte operations.
It seems to be an addition in the Dec 2021 revision:
https://cdrdv2.intel.com/v1/dl/getContent/671294
Create an issue in bugzilla then?
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: Libatomic 16B
From: Florian Weimer @ 2022-02-25 13:53 UTC (permalink / raw)
To: Satish Vasudeva via Gcc-help; +Cc: Satish Vasudeva
* Satish Vasudeva via Gcc-help:
> I looked into this further. Seems like libat_load_16_i1 is implementing the
> load 16B as "*lock* *cmpxchg16b* (%*rdi*)"
> This is assuming that the CPU doesn't support 16B loads in a single
> transaction. How can I compile libatomics to use intrinsics for load 16B
> instead of LOCK cmpxchg?
As far as I know, it's the only reliable way to implement a 16B load on
x86-64. The Intel SDM explicitly says this:
| An x87 instruction or an SSE instructions that accesses data larger
| than a quadword may be implemented using multiple memory accesses.
(Section 8.1.1 in Volume 3A in my copy.)
I wish we had a plain 128-bit atomic load instruction, but we don't.
Thanks,
Florian
* Re: Libatomic 16B
From: Florian Weimer @ 2022-02-25 14:01 UTC (permalink / raw)
To: Xi Ruoyao via Gcc-help
* Xi Ruoyao via Gcc-help:
> On Fri, 2022-02-25 at 09:35 +0100, Stefan Ring via Gcc-help wrote:
>> On Thu, Feb 24, 2022 at 9:39 PM Satish Vasudeva via Gcc-help
>> <gcc-help@gcc.gnu.org> wrote:
>> >
>> > Please let into this intel architecture manual , section 8.1.1
>> >
>> > https://cdrdv2.intel.com/v1/dl/getContent/671190
>> >
>> > I think Intel claims 16B operations are atomic , unless I am missing
>> > something.
>>
>> Interesting. This seems to be a somewhat recent addition, and the
>> mailing list discussion linked to above predates it. Coincidentally, I
>> pulled a copy of the Intel manuals at almost exactly the same time as
>> this discussion, and sure enough, it does not yet contain the
>> paragraph about 16 byte operations.
>
> It seems an addition in Dec 2021 revision:
> https://cdrdv2.intel.com/v1/dl/getContent/671294
>
> Create an issue in bugzilla then?
Yes please. I should have read the whole thread first. 8-)
The AMD manual doesn't say this yet, so any optimization needs to be
restricted to Intel CPUs for now. I'll reach out to AMD to get
clarification.
Thanks,
Florian
* Re: Libatomic 16B
From: Alexander Monakov @ 2022-02-25 14:10 UTC (permalink / raw)
To: Florian Weimer; +Cc: Xi Ruoyao via Gcc-help
On Fri, 25 Feb 2022, Florian Weimer via Gcc-help wrote:
> * Xi Ruoyao via Gcc-help:
>
> > On Fri, 2022-02-25 at 09:35 +0100, Stefan Ring via Gcc-help wrote:
> >> On Thu, Feb 24, 2022 at 9:39 PM Satish Vasudeva via Gcc-help
> >> <gcc-help@gcc.gnu.org> wrote:
> >> >
> >> > Please let into this intel architecture manual , section 8.1.1
> >> >
> >> > https://cdrdv2.intel.com/v1/dl/getContent/671190
> >> >
> >> > I think Intel claims 16B operations are atomic , unless I am missing
> >> > something.
> >>
> >> Interesting. This seems to be a somewhat recent addition, and the
> >> mailing list discussion linked to above predates it. Coincidentally, I
> >> pulled a copy of the Intel manuals at almost exactly the same time as
> >> this discussion, and sure enough, it does not yet contain the
> >> paragraph about 16 byte operations.
> >
> > It seems an addition in Dec 2021 revision:
> > https://cdrdv2.intel.com/v1/dl/getContent/671294
> >
> > Create an issue in bugzilla then?
>
> Yes please. I should have read the whole thread first. 8-)
>
> The AMD manual doesn't say this yet, so any optimization needs to be
> restricted to Intel CPUs for now. I'll reach out to AMD to get
> clarification.
This StackOverflow question has evidence that both Intel (Core Duo) and
AMD (Opteron 2435) can tear 128-bit loads. So neither manufacturer can
give a retroactive guarantee.
https://stackoverflow.com/questions/7646018/sse-instructions-which-cpus-can-do-atomic-16b-memory-operations
Alexander
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-25 14:16 UTC (permalink / raw)
To: Alexander Monakov, Florian Weimer; +Cc: Xi Ruoyao via Gcc-help
On Fri, 2022-02-25 at 17:10 +0300, Alexander Monakov via Gcc-help wrote:
> > > https://cdrdv2.intel.com/v1/dl/getContent/671294
TL;DR: Intel says that on their CPUs with AVX, 128-bit loads (with movdqa)
are atomic; see page 393 of this doc. This was added in Dec 2021,
so you may need to re-download the Intel SDM to get the latest copy.
> > > Create an issue in bugzilla then?
> >
> > Yes please. I should have read the whole thread first. 8-)
> >
> > The AMD manual doesn't say this yet, so any optimization needs to be
> > restricted to Intel CPUs for now. I'll reach out to AMD to get
> > clarification.
>
> This StackOverflow question has evidence that both Intel (Core Duo)
> and
> AMD (Opteron 2435) can tear 128-bit loads.
Core Duo does not have AVX, and AMD has not made any guarantee about the
atomicity of 128-bit loads. So we can't use movdqa for 128-bit atomics
on those old Intel models or on any (old or new) AMD models.
> So neither manufacturer can
> give a retroactive guarantee.
>
> https://stackoverflow.com/questions/7646018/sse-instructions-which-cpus-can-do-atomic-16b-memory-operations
>
> Alexander
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-25 14:25 UTC (permalink / raw)
To: Florian Weimer; +Cc: Stefan Ring, Satish Vasudeva, Xi Ruoyao via Gcc-help
On Fri, 2022-02-25 at 15:01 +0100, Florian Weimer wrote:
> > It seems an addition in Dec 2021 revision:
> > https://cdrdv2.intel.com/v1/dl/getContent/671294
> >
> > Create an issue in bugzilla then?
>
> Yes please. I should have read the whole thread first. 8-)
Opened as https://gcc.gnu.org/PR104688
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-02-25 17:05 UTC (permalink / raw)
To: Xi Ruoyao; +Cc: Florian Weimer, Stefan Ring, Xi Ruoyao via Gcc-help
Thanks for the quick action on this.
I see that a patch has been posted.
I am new to this. Can you please clarify what the build options are for
newer and older Intel CPUs?
Satish
* Re: Libatomic 16B
From: Xi Ruoyao @ 2022-02-25 17:16 UTC (permalink / raw)
To: Satish Vasudeva; +Cc: Florian Weimer, Stefan Ring, Xi Ruoyao via Gcc-help
On Fri, 2022-02-25 at 09:05 -0800, Satish Vasudeva wrote:
> Thanks for a quick action on this.
>
> I see that a patch has been posted.
>
> I am new to this, can you please clarify what is the build option for
> new and older Intel CPUs?
You don't need to add any build option if you use the posted patch.
The patch uses the ifunc (https://sourceware.org/glibc/wiki/GNU_IFUNC)
feature. This means libatomic will automatically select the best variant of
the 16B atomic load applicable to the CPU when it is loaded at runtime.
> > Opened as https://gcc.gnu.org/PR104688
--
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University
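To illustrate the kind of runtime selection described above, here is a rough sketch that approximates it with a function pointer resolved in a constructor instead of a real ifunc. It is not the posted patch. All names (u128, load16_movdqa, load16_cmpxchg, load16_locked, select_load16, atomic_load16) are hypothetical, and the "Intel with AVX" condition is taken from the discussion in this thread. It assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 16-byte value, 16-byte aligned (MOVDQA and CMPXCHG16B both require it). */
typedef struct { _Alignas(16) uint64_t lo; uint64_t hi; } u128;
typedef u128 (*load16_fn)(u128 *);

#if defined(__x86_64__)
#include <emmintrin.h>

/* Fallback: LOCK CMPXCHG16B with expected = desired = 0 yields the old value. */
u128 load16_cmpxchg(u128 *p)
{
    uint64_t lo = 0, hi = 0;
    __asm__ __volatile__("lock cmpxchg16b %0"
                         : "+m"(*p), "+a"(lo), "+d"(hi)
                         : "b"((uint64_t)0), "c"((uint64_t)0)
                         : "cc", "memory");
    u128 r = { lo, hi };
    return r;
}

/* Fast path: a single aligned 16-byte MOVDQA load. */
u128 load16_movdqa(u128 *p)
{
    __m128i v;
    u128 r;
    __asm__ __volatile__("movdqa %1, %0" : "=x"(v) : "m"(*p) : "memory");
    memcpy(&r, &v, sizeof r);
    return r;
}

/* Resolver: take the fast path only on Intel CPUs that report AVX. */
load16_fn select_load16(void)
{
    __builtin_cpu_init();
    if (__builtin_cpu_is("intel") && __builtin_cpu_supports("avx"))
        return load16_movdqa;
    return load16_cmpxchg;
}
#else
/* Other targets: a lock-protected copy, atomic only with respect to
   writers that take the same lock (an assumption of this sketch). */
static char load16_lock;
u128 load16_locked(u128 *p)
{
    while (__atomic_test_and_set(&load16_lock, __ATOMIC_ACQUIRE)) { }
    u128 r = *p;
    __atomic_clear(&load16_lock, __ATOMIC_RELEASE);
    return r;
}
load16_fn select_load16(void) { return load16_locked; }
#endif

/* Resolved once at load time, roughly where an ifunc would be bound. */
static load16_fn load16_impl;
__attribute__((constructor)) static void resolve_load16(void)
{
    load16_impl = select_load16();
}

u128 atomic_load16(u128 *p)
{
    assert(((uintptr_t)p & 15) == 0);   /* misaligned MOVDQA would fault */
    return (load16_impl ? load16_impl : select_load16())(p);
}
```

A caller would simply write `u128 v = atomic_load16(&shared);` and never name a variant, which is why no build option is involved. A real ifunc differs in that the dynamic linker binds the symbol directly, so there is no pointer check on each call.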
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-02-25 17:25 UTC (permalink / raw)
To: Xi Ruoyao; +Cc: Florian Weimer, Stefan Ring, Xi Ruoyao via Gcc-help
That's a great answer. Thank you.
Have a nice weekend.
* Re: Libatomic 16B
From: Satish Vasudeva @ 2022-03-02 0:16 UTC (permalink / raw)
To: Xi Ruoyao; +Cc: Florian Weimer, Stefan Ring, Xi Ruoyao via Gcc-help
Hi,
Just a quick clarification.
Looking back at the description in
https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html
It sounds like a CAS-based implementation is a problem for volatile atomic
loads. Can anyone please elaborate on what the issue with volatile atomic
loads is? I am trying to do a risk analysis of our code.
Thanks
Satish
* Re: Libatomic 16B
From: Florian Weimer @ 2022-03-02 5:55 UTC (permalink / raw)
To: Satish Vasudeva; +Cc: Xi Ruoyao, Stefan Ring, Xi Ruoyao via Gcc-help
* Satish Vasudeva:
> Looking back at the description in
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html It
> sounds like CAS based implementation is a problem for volatile atomic
> loads. Can any one please elaborate what is the issue with volatile
> atomic loads. I am trying to do risk analysis in our code.
The page could be mapped read-only (say if it's in memory shared across
processes). Reading such values using CAS will fault, so CAS is not a
full replacement.
Thanks,
Florian
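The read-only case can be reproduced with a small sketch, assuming a POSIX system with mmap and fork. It uses a machine word instead of 16 bytes so that it does not depend on cmpxchg16b or libatomic; the helper names cas_load_word and cas_load_faults_on_readonly_page are hypothetical. The CAS runs in a child process so that the expected fault does not kill the caller.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* "Load" implemented as CAS(expected = 0, desired = 0); it needs write access. */
unsigned long cas_load_word(unsigned long *p)
{
    unsigned long expected = 0;
    __atomic_compare_exchange_n(p, &expected, 0, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Returns 1 if the CAS-based load on a read-only page made the child fail,
   0 if the child completed it normally, -1 if the setup itself failed. */
int cas_load_faults_on_readonly_page(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(pagesz > 0);
    unsigned long *p = mmap(0, (size_t)pagesz, PROT_READ,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* A plain atomic load only reads, so it works on the read-only page. */
    unsigned long plain = __atomic_load_n(p, __ATOMIC_SEQ_CST);
    assert(plain == 0);          /* anonymous mappings start zero-filled */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, (size_t)pagesz);
        return -1;
    }
    if (pid == 0) {
        cas_load_word(p);        /* value is 0, so the CAS matches and must store */
        _exit(0);                /* reached only if the CAS did not fault */
    }

    int status = 0;
    waitpid(pid, &status, 0);
    munmap(p, (size_t)pagesz);
    return !(WIFEXITED(status) && WEXITSTATUS(status) == 0);
}
```

On a typical Linux system the child is expected to die with SIGSEGV, so the function returns 1, while the plain load in the parent succeeds on the same page.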