public inbox for libc-alpha@sourceware.org
* [PATCH v1 1/3] x86: Fix misordered logic for setting `rep_movsb_stop_threshold`
@ 2022-06-15  0:25 Noah Goldstein
  2022-06-15  0:25 ` [PATCH v1 2/3] x86: Cleanup bounds checking in large memcpy case Noah Goldstein
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Noah Goldstein @ 2022-06-15  0:25 UTC
  To: libc-alpha

Move the setting of `rep_movsb_stop_threshold` to after the tunables
have been collected, so that `rep_movsb_stop_threshold` (which is
used to redirect control flow to the non_temporal case) picks up any
user-supplied value for `non_temporal_threshold` (set using
glibc.cpu.x86_non_temporal_threshold).
---
 sysdeps/x86/dl-cacheinfo.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index f64a2fb0ba..cc3b840f9c 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -898,18 +898,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   if (CPU_FEATURE_USABLE_P (cpu_features, FSRM))
     rep_movsb_threshold = 2112;
 
-  unsigned long int rep_movsb_stop_threshold;
-  /* ERMS feature is implemented from AMD Zen3 architecture and it is
-     performing poorly for data above L2 cache size. Henceforth, adding
-     an upper bound threshold parameter to limit the usage of Enhanced
-     REP MOVSB operations and setting its value to L2 cache size.  */
-  if (cpu_features->basic.kind == arch_kind_amd)
-    rep_movsb_stop_threshold = core;
-  /* Setting the upper bound of ERMS to the computed value of
-     non-temporal threshold for architectures other than AMD.  */
-  else
-    rep_movsb_stop_threshold = non_temporal_threshold;
-
   /* The default threshold to use Enhanced REP STOSB.  */
   unsigned long int rep_stosb_threshold = 2048;
 
@@ -951,6 +939,18 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
 			   SIZE_MAX);
 #endif
 
+  unsigned long int rep_movsb_stop_threshold;
+  /* ERMS feature is implemented from AMD Zen3 architecture and it is
+     performing poorly for data above L2 cache size. Henceforth, adding
+     an upper bound threshold parameter to limit the usage of Enhanced
+     REP MOVSB operations and setting its value to L2 cache size.  */
+  if (cpu_features->basic.kind == arch_kind_amd)
+    rep_movsb_stop_threshold = core;
+  /* Setting the upper bound of ERMS to the computed value of
+     non-temporal threshold for architectures other than AMD.  */
+  else
+    rep_movsb_stop_threshold = non_temporal_threshold;
+
   cpu_features->data_cache_size = data;
   cpu_features->shared_cache_size = shared;
   cpu_features->non_temporal_threshold = non_temporal_threshold;
-- 
2.34.1
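
The ordering problem the patch addresses can be illustrated with a
small standalone sketch. This is not the glibc code: `sketch_inputs`,
`pick_stop_threshold`, and the two `stop_threshold_*` functions are
hypothetical names, and modelling an unset tunable as 0 is an
assumption made only for this example. It shows that deriving the stop
threshold before the user override is applied leaves it at the stale
default on non-AMD CPUs.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the values dl_init_cacheinfo works with.  */
struct sketch_inputs
{
  bool is_amd;                            /* Models the arch_kind_amd check.  */
  unsigned long int core;                 /* L2 cache size.  */
  unsigned long int default_non_temporal; /* Computed default threshold.  */
  unsigned long int tunable_non_temporal; /* User override; 0 means unset
                                             (assumption of this sketch).  */
};

/* Mirrors the if/else in the patch: AMD uses the L2 size, everything
   else uses the non-temporal threshold.  */
static unsigned long int
pick_stop_threshold (const struct sketch_inputs *in,
                     unsigned long int non_temporal_threshold)
{
  return in->is_amd ? in->core : non_temporal_threshold;
}

/* Old order: the stop threshold is derived before the tunable is read,
   so a user override never reaches it.  */
unsigned long int
stop_threshold_before_patch (const struct sketch_inputs *in)
{
  unsigned long int non_temporal_threshold = in->default_non_temporal;
  unsigned long int stop = pick_stop_threshold (in, non_temporal_threshold);
  if (in->tunable_non_temporal != 0)
    non_temporal_threshold = in->tunable_non_temporal;
  (void) non_temporal_threshold; /* Updated too late to affect STOP.  */
  return stop;
}

/* New order: apply the tunable first, then derive the stop threshold.  */
unsigned long int
stop_threshold_after_patch (const struct sketch_inputs *in)
{
  unsigned long int non_temporal_threshold = in->default_non_temporal;
  if (in->tunable_non_temporal != 0)
    non_temporal_threshold = in->tunable_non_temporal;
  return pick_stop_threshold (in, non_temporal_threshold);
}
```

In this sketch the AMD branch returns `core` in both orderings, so only
the non-AMD path changes behavior when the override is set.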



Thread overview: 29+ messages
2022-06-15  0:25 [PATCH v1 1/3] x86: Fix misordered logic for setting `rep_movsb_stop_threshold` Noah Goldstein
2022-06-15  0:25 ` [PATCH v1 2/3] x86: Cleanup bounds checking in large memcpy case Noah Goldstein
2022-06-15  1:07   ` H.J. Lu
2022-06-15  3:57     ` Noah Goldstein
2022-06-15  3:57   ` [PATCH v2] " Noah Goldstein
2022-06-15 14:52     ` H.J. Lu
2022-06-15 15:13       ` Noah Goldstein
2022-06-15 15:12   ` [PATCH v3] " Noah Goldstein
2022-06-15 16:48     ` H.J. Lu
2022-06-15 17:44       ` Noah Goldstein
2022-06-15 17:41   ` [PATCH v4 1/2] " Noah Goldstein
2022-06-15 17:41     ` [PATCH v4 2/2] x86: Add bounds `x86_non_temporal_threshold` Noah Goldstein
2022-06-15 18:22       ` H.J. Lu
2022-06-15 18:33         ` Noah Goldstein
2022-06-15 18:32       ` [PATCH v5 " Noah Goldstein
2022-06-15 18:43         ` H.J. Lu
2022-06-15 19:52       ` [PATCH v6 2/3] " Noah Goldstein
2022-06-15 20:27         ` H.J. Lu
2022-06-15 20:35           ` Noah Goldstein
2022-06-15 20:34       ` [PATCH v7 " Noah Goldstein
2022-06-15 20:48         ` H.J. Lu
2022-07-14  2:55           ` Sunil Pandey
2022-06-15 18:22     ` [PATCH v4 1/2] x86: Cleanup bounds checking in large memcpy case H.J. Lu
2022-07-14  2:57       ` Sunil Pandey
2022-06-15  0:25 ` [PATCH v1 3/3] x86: Add sse42 implementation to strcmp's ifunc Noah Goldstein
2022-06-15  1:08   ` H.J. Lu
2022-07-14  2:54     ` Sunil Pandey
2022-06-15  1:02 ` [PATCH v1 1/3] x86: Fix misordered logic for setting `rep_movsb_stop_threshold` H.J. Lu
2022-07-14  2:53   ` Sunil Pandey
