public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
@ 2023-05-09  3:10 Noah Goldstein
  2023-05-09  8:56 ` Florian Weimer
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Noah Goldstein @ 2023-05-09  3:10 UTC (permalink / raw)
  To: libc-alpha; +Cc: goldstein.w.n, hjl.tools, carlos

If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
it almost certainly means we failed to read the systems cache info.

In this case, rather than defaulting the minimum correct value, we
should default to a value that gets at least reasonable
performance. 64MB is chosen conservatively to be at the very high
end. This should never cause non-temporal stores when, if we had read
cache info, we wouldn't have otherwise.
---
 sysdeps/x86/dl-cacheinfo.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index 864b00a521..6225c852f6 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -771,8 +771,16 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
      reflected in the manual.  */
   unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
   unsigned long int minimum_non_temporal_threshold = 0x4040;
+
+  /* If `non_temporal_threshold` less than `minimum_non_temporal_threshold`
+     it most likely means we failed to detect the cache info. We don't want
+     to default to `minimum_non_temporal_threshold` as such a small value,
+     while correct, has bad performance. We default to 64MB as reasonable
+     default bound. 64MB is likely conservative in that most/all systems would
+     choose a lower value so it should never forcing non-temporal stores when
+     they otherwise wouldn't be used.  */
   if (non_temporal_threshold < minimum_non_temporal_threshold)
-    non_temporal_threshold = minimum_non_temporal_threshold;
+    non_temporal_threshold = 64 * 1024 * 1024;
   else if (non_temporal_threshold > maximum_non_temporal_threshold)
     non_temporal_threshold = maximum_non_temporal_threshold;
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09  3:10 [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429] Noah Goldstein
@ 2023-05-09  8:56 ` Florian Weimer
  2023-05-09 16:01   ` Noah Goldstein
  2023-05-25 18:36 ` H.J. Lu
  2023-06-05 17:14 ` Noah Goldstein
  2 siblings, 1 reply; 9+ messages in thread
From: Florian Weimer @ 2023-05-09  8:56 UTC (permalink / raw)
  To: Noah Goldstein via Libc-alpha; +Cc: Noah Goldstein, hjl.tools, carlos

* Noah Goldstein via Libc-alpha:

> If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> it almost certainly means we failed to read the systems cache info.
>
> In this case, rather than defaulting the minimum correct value, we
> should default to a value that gets at least reasonable
> performance. 64MB is chosen conservatively to be at the very high
> end. This should never cause non-temporal stores when, if we had read
> cache info, we wouldn't have otherwise.

I think that's quite surprising for GLIBC_TUNABLES.  Maybe that logic
should only activate if the default was set from cache sizes?

Thanks,
Florian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09  8:56 ` Florian Weimer
@ 2023-05-09 16:01   ` Noah Goldstein
  2023-05-11 18:15     ` Noah Goldstein
  2023-05-25  8:11     ` Florian Weimer
  0 siblings, 2 replies; 9+ messages in thread
From: Noah Goldstein @ 2023-05-09 16:01 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Noah Goldstein via Libc-alpha, hjl.tools, carlos

On Tue, May 9, 2023 at 3:56 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Noah Goldstein via Libc-alpha:
>
> > If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> > it almost certainly means we failed to read the systems cache info.
> >
> > In this case, rather than defaulting the minimum correct value, we
> > should default to a value that gets at least reasonable
> > performance. 64MB is chosen conservatively to be at the very high
> > end. This should never cause non-temporal stores when, if we had read
> > cache info, we wouldn't have otherwise.
>
> I think that's quite surprising for GLIBC_TUNABLES.  Maybe that logic
> should only activate if the default was set from cache sizes?
>
I don't quite understand what you mean by "only active if the default
was set from cache sizes"?

This logic only triggers if total_cachesize / 8 < ~16kb. I think this should
only ever really happen if we failed to read cache info. What is the
surprise?

> Thanks,
> Florian
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09 16:01   ` Noah Goldstein
@ 2023-05-11 18:15     ` Noah Goldstein
  2023-05-25  8:11     ` Florian Weimer
  1 sibling, 0 replies; 9+ messages in thread
From: Noah Goldstein @ 2023-05-11 18:15 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Noah Goldstein via Libc-alpha, hjl.tools, carlos

On Tue, May 9, 2023 at 11:01 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Tue, May 9, 2023 at 3:56 AM Florian Weimer <fweimer@redhat.com> wrote:
> >
> > * Noah Goldstein via Libc-alpha:
> >
> > > If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> > > it almost certainly means we failed to read the systems cache info.
> > >
> > > In this case, rather than defaulting the minimum correct value, we
> > > should default to a value that gets at least reasonable
> > > performance. 64MB is chosen conservatively to be at the very high
> > > end. This should never cause non-temporal stores when, if we had read
> > > cache info, we wouldn't have otherwise.
> >
> > I think that's quite surprising for GLIBC_TUNABLES.  Maybe that logic
> > should only activate if the default was set from cache sizes?
> >
> I don't quite understand what you mean by "only active if the default
> was set from cache sizes"?
>
> This logic only triggers if total_cachesize / 8 < ~16kb. I think this should
> only ever really happen if we failed to read cache info. What is the
> surprise?

florian?
>
> > Thanks,
> > Florian
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09 16:01   ` Noah Goldstein
  2023-05-11 18:15     ` Noah Goldstein
@ 2023-05-25  8:11     ` Florian Weimer
  1 sibling, 0 replies; 9+ messages in thread
From: Florian Weimer @ 2023-05-25  8:11 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: Noah Goldstein via Libc-alpha, hjl.tools, carlos

* Noah Goldstein:

> On Tue, May 9, 2023 at 3:56 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Noah Goldstein via Libc-alpha:
>>
>> > If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
>> > it almost certainly means we failed to read the systems cache info.
>> >
>> > In this case, rather than defaulting the minimum correct value, we
>> > should default to a value that gets at least reasonable
>> > performance. 64MB is chosen conservatively to be at the very high
>> > end. This should never cause non-temporal stores when, if we had read
>> > cache info, we wouldn't have otherwise.
>>
>> I think that's quite surprising for GLIBC_TUNABLES.  Maybe that logic
>> should only activate if the default was set from cache sizes?
>>
> I don't quite understand what you mean by "only active if the default
> was set from cache sizes"?
>
> This logic only triggers if total_cachesize / 8 < ~16kb. I think this should
> only ever really happen if we failed to read cache info. What is the
> surprise?

Right, my mistake.  The patch should work as-is.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Thanks,
Florian


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09  3:10 [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429] Noah Goldstein
  2023-05-09  8:56 ` Florian Weimer
@ 2023-05-25 18:36 ` H.J. Lu
  2023-06-05 17:14 ` Noah Goldstein
  2 siblings, 0 replies; 9+ messages in thread
From: H.J. Lu @ 2023-05-25 18:36 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, carlos

On Mon, May 8, 2023 at 8:10 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> it almost certainly means we failed to read the systems cache info.
>
> In this case, rather than defaulting the minimum correct value, we
> should default to a value that gets at least reasonable
> performance. 64MB is chosen conservatively to be at the very high
> end. This should never cause non-temporal stores when, if we had read
> cache info, we wouldn't have otherwise.
> ---
>  sysdeps/x86/dl-cacheinfo.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> index 864b00a521..6225c852f6 100644
> --- a/sysdeps/x86/dl-cacheinfo.h
> +++ b/sysdeps/x86/dl-cacheinfo.h
> @@ -771,8 +771,16 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
>       reflected in the manual.  */
>    unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
>    unsigned long int minimum_non_temporal_threshold = 0x4040;
> +
> +  /* If `non_temporal_threshold` less than `minimum_non_temporal_threshold`
> +     it most likely means we failed to detect the cache info. We don't want
> +     to default to `minimum_non_temporal_threshold` as such a small value,
> +     while correct, has bad performance. We default to 64MB as reasonable
> +     default bound. 64MB is likely conservative in that most/all systems would
> +     choose a lower value so it should never forcing non-temporal stores when
> +     they otherwise wouldn't be used.  */
>    if (non_temporal_threshold < minimum_non_temporal_threshold)
> -    non_temporal_threshold = minimum_non_temporal_threshold;
> +    non_temporal_threshold = 64 * 1024 * 1024;
>    else if (non_temporal_threshold > maximum_non_temporal_threshold)
>      non_temporal_threshold = maximum_non_temporal_threshold;
>
> --
> 2.34.1
>

LGTM.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-05-09  3:10 [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429] Noah Goldstein
  2023-05-09  8:56 ` Florian Weimer
  2023-05-25 18:36 ` H.J. Lu
@ 2023-06-05 17:14 ` Noah Goldstein
  2023-06-05 17:45   ` H.J. Lu
  2 siblings, 1 reply; 9+ messages in thread
From: Noah Goldstein @ 2023-06-05 17:14 UTC (permalink / raw)
  To: libc-alpha; +Cc: hjl.tools, carlos

On Mon, May 8, 2023 at 10:10 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> it almost certainly means we failed to read the systems cache info.
>
> In this case, rather than defaulting the minimum correct value, we
> should default to a value that gets at least reasonable
> performance. 64MB is chosen conservatively to be at the very high
> end. This should never cause non-temporal stores when, if we had read
> cache info, we wouldn't have otherwise.
> ---
>  sysdeps/x86/dl-cacheinfo.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> index 864b00a521..6225c852f6 100644
> --- a/sysdeps/x86/dl-cacheinfo.h
> +++ b/sysdeps/x86/dl-cacheinfo.h
> @@ -771,8 +771,16 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
>       reflected in the manual.  */
>    unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
>    unsigned long int minimum_non_temporal_threshold = 0x4040;
> +
> +  /* If `non_temporal_threshold` less than `minimum_non_temporal_threshold`
> +     it most likely means we failed to detect the cache info. We don't want
> +     to default to `minimum_non_temporal_threshold` as such a small value,
> +     while correct, has bad performance. We default to 64MB as reasonable
> +     default bound. 64MB is likely conservative in that most/all systems would
> +     choose a lower value so it should never forcing non-temporal stores when
> +     they otherwise wouldn't be used.  */
>    if (non_temporal_threshold < minimum_non_temporal_threshold)
> -    non_temporal_threshold = minimum_non_temporal_threshold;
> +    non_temporal_threshold = 64 * 1024 * 1024;
>    else if (non_temporal_threshold > maximum_non_temporal_threshold)
>      non_temporal_threshold = maximum_non_temporal_threshold;
>
> --
> 2.34.1
>

I want to backport down to 2.28.
Thoughts?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-06-05 17:14 ` Noah Goldstein
@ 2023-06-05 17:45   ` H.J. Lu
  2023-06-05 20:28     ` Noah Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: H.J. Lu @ 2023-06-05 17:45 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, carlos

On Mon, Jun 5, 2023 at 10:15 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Mon, May 8, 2023 at 10:10 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> > it almost certainly means we failed to read the systems cache info.
> >
> > In this case, rather than defaulting the minimum correct value, we
> > should default to a value that gets at least reasonable
> > performance. 64MB is chosen conservatively to be at the very high
> > end. This should never cause non-temporal stores when, if we had read
> > cache info, we wouldn't have otherwise.
> > ---
> >  sysdeps/x86/dl-cacheinfo.h | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> > index 864b00a521..6225c852f6 100644
> > --- a/sysdeps/x86/dl-cacheinfo.h
> > +++ b/sysdeps/x86/dl-cacheinfo.h
> > @@ -771,8 +771,16 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
> >       reflected in the manual.  */
> >    unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
> >    unsigned long int minimum_non_temporal_threshold = 0x4040;
> > +
> > +  /* If `non_temporal_threshold` less than `minimum_non_temporal_threshold`
> > +     it most likely means we failed to detect the cache info. We don't want
> > +     to default to `minimum_non_temporal_threshold` as such a small value,
> > +     while correct, has bad performance. We default to 64MB as reasonable
> > +     default bound. 64MB is likely conservative in that most/all systems would
> > +     choose a lower value so it should never forcing non-temporal stores when
> > +     they otherwise wouldn't be used.  */
> >    if (non_temporal_threshold < minimum_non_temporal_threshold)
> > -    non_temporal_threshold = minimum_non_temporal_threshold;
> > +    non_temporal_threshold = 64 * 1024 * 1024;
> >    else if (non_temporal_threshold > maximum_non_temporal_threshold)
> >      non_temporal_threshold = maximum_non_temporal_threshold;
> >
> > --
> > 2.34.1
> >
>
> I want to backport down to 2.28.
> Thoughts?

Who will use such backport?

-- 
H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429]
  2023-06-05 17:45   ` H.J. Lu
@ 2023-06-05 20:28     ` Noah Goldstein
  0 siblings, 0 replies; 9+ messages in thread
From: Noah Goldstein @ 2023-06-05 20:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, carlos

On Mon, Jun 5, 2023 at 12:46 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Mon, Jun 5, 2023 at 10:15 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Mon, May 8, 2023 at 10:10 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > If `non_temporal_threshold` is below `minimum_non_temporal_threshold`,
> > > it almost certainly means we failed to read the systems cache info.
> > >
> > > In this case, rather than defaulting the minimum correct value, we
> > > should default to a value that gets at least reasonable
> > > performance. 64MB is chosen conservatively to be at the very high
> > > end. This should never cause non-temporal stores when, if we had read
> > > cache info, we wouldn't have otherwise.
> > > ---
> > >  sysdeps/x86/dl-cacheinfo.h | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> > > index 864b00a521..6225c852f6 100644
> > > --- a/sysdeps/x86/dl-cacheinfo.h
> > > +++ b/sysdeps/x86/dl-cacheinfo.h
> > > @@ -771,8 +771,16 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
> > >       reflected in the manual.  */
> > >    unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
> > >    unsigned long int minimum_non_temporal_threshold = 0x4040;
> > > +
> > > +  /* If `non_temporal_threshold` less than `minimum_non_temporal_threshold`
> > > +     it most likely means we failed to detect the cache info. We don't want
> > > +     to default to `minimum_non_temporal_threshold` as such a small value,
> > > +     while correct, has bad performance. We default to 64MB as reasonable
> > > +     default bound. 64MB is likely conservative in that most/all systems would
> > > +     choose a lower value so it should never forcing non-temporal stores when
> > > +     they otherwise wouldn't be used.  */
> > >    if (non_temporal_threshold < minimum_non_temporal_threshold)
> > > -    non_temporal_threshold = minimum_non_temporal_threshold;
> > > +    non_temporal_threshold = 64 * 1024 * 1024;
> > >    else if (non_temporal_threshold > maximum_non_temporal_threshold)
> > >      non_temporal_threshold = maximum_non_temporal_threshold;
> > >
> > > --
> > > 2.34.1
> > >
> >
> > I want to backport down to 2.28.
> > Thoughts?
>
> Who will use such backport?

People using systems where we miscalculate cache size.
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-06-05 20:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-09  3:10 [PATCH v1] x86: Use 64MB as nt-store threshold if no cacheinfo [BZ #30429] Noah Goldstein
2023-05-09  8:56 ` Florian Weimer
2023-05-09 16:01   ` Noah Goldstein
2023-05-11 18:15     ` Noah Goldstein
2023-05-25  8:11     ` Florian Weimer
2023-05-25 18:36 ` H.J. Lu
2023-06-05 17:14 ` Noah Goldstein
2023-06-05 17:45   ` H.J. Lu
2023-06-05 20:28     ` Noah Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).