From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 7852)
	id AF00A385829A; Tue, 19 Jul 2022 05:11:51 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AF00A385829A
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Sunil Pandey
To: glibc-cvs@sourceware.org
Subject: [glibc/release/2.34/master] x86: Fix misordered logic for setting `rep_movsb_stop_threshold`
X-Act-Checkin: glibc
X-Git-Author: Noah Goldstein
X-Git-Refname: refs/heads/release/2.34/master
X-Git-Oldrev: fc54e1fae854e6ee6361cd4ddf900c36fce8158e
X-Git-Newrev: 6e008c884dad5a25f91085c68d044bb5e2d63761
Message-Id: <20220719051151.AF00A385829A@sourceware.org>
Date: Tue, 19 Jul 2022 05:11:51 +0000 (GMT)
X-BeenThere: glibc-cvs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Glibc-cvs mailing list
X-List-Received-Date: Tue, 19 Jul 2022 05:11:51 -0000

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6e008c884dad5a25f91085c68d044bb5e2d63761

commit 6e008c884dad5a25f91085c68d044bb5e2d63761
Author: Noah Goldstein
Date:   Tue Jun 14 13:50:11 2022 -0700

    x86: Fix misordered logic for setting `rep_movsb_stop_threshold`

    Move the setting of `rep_movsb_stop_threshold` to after the tunables
    have been collected so that the `rep_movsb_stop_threshold` (which
    is used to redirect control flow to the non_temporal case) will
    use any user value for `non_temporal_threshold` (set using
    glibc.cpu.x86_non_temporal_threshold)

    (cherry picked from commit 035591551400cfc810b07244a015c9411e8bff7c)

Diff:
---
 sysdeps/x86/dl-cacheinfo.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index 2e43e67e4f..560bf260e8 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -898,18 +898,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   if (CPU_FEATURE_USABLE_P (cpu_features, FSRM))
     rep_movsb_threshold = 2112;
 
-  unsigned long int rep_movsb_stop_threshold;
-  /* ERMS feature is implemented from AMD Zen3 architecture and it is
-     performing poorly for data above L2 cache size. Henceforth, adding
-     an upper bound threshold parameter to limit the usage of Enhanced
-     REP MOVSB operations and setting its value to L2 cache size. */
-  if (cpu_features->basic.kind == arch_kind_amd)
-    rep_movsb_stop_threshold = core;
-  /* Setting the upper bound of ERMS to the computed value of
-     non-temporal threshold for architectures other than AMD. */
-  else
-    rep_movsb_stop_threshold = non_temporal_threshold;
-
   /* The default threshold to use Enhanced REP STOSB. */
   unsigned long int rep_stosb_threshold = 2048;
 
@@ -951,6 +939,18 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
                            SIZE_MAX);
 #endif
 
+  unsigned long int rep_movsb_stop_threshold;
+  /* ERMS feature is implemented from AMD Zen3 architecture and it is
+     performing poorly for data above L2 cache size. Henceforth, adding
+     an upper bound threshold parameter to limit the usage of Enhanced
+     REP MOVSB operations and setting its value to L2 cache size. */
+  if (cpu_features->basic.kind == arch_kind_amd)
+    rep_movsb_stop_threshold = core;
+  /* Setting the upper bound of ERMS to the computed value of
+     non-temporal threshold for architectures other than AMD. */
+  else
+    rep_movsb_stop_threshold = non_temporal_threshold;
+
   cpu_features->data_cache_size = data;
   cpu_features->shared_cache_size = shared;
   cpu_features->non_temporal_threshold = non_temporal_threshold;
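
The effect of the reordering can be illustrated with a small standalone
sketch. This is not glibc code: `struct cache_config` and the three
functions below are hypothetical stand-ins for the local state in
`dl_init_cacheinfo`, and the tunable lookup is reduced to a plain field
rather than the real tunables machinery. It only shows why deriving the
stop threshold before the tunable is applied ignores the user's value.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant state; these names are assumptions
   made for illustration and do not exist in glibc.  */
struct cache_config
{
  bool is_amd;                                 /* models kind == arch_kind_amd */
  unsigned long int core;                      /* models the L2 cache size */
  unsigned long int default_non_temporal;      /* computed default threshold */
  bool has_tunable;                            /* user supplied a tunable? */
  unsigned long int tunable_non_temporal;      /* user-supplied threshold */
};

/* Same selection as in the patch: AMD uses the L2 size, everything else
   uses the non-temporal threshold.  */
static unsigned long int
pick_stop_threshold (const struct cache_config *cfg,
                     unsigned long int non_temporal_threshold)
{
  return cfg->is_amd ? cfg->core : non_temporal_threshold;
}

/* Old order: the stop threshold is derived before the tunable is read,
   so a user-supplied value never reaches it.  */
unsigned long int
stop_threshold_old_order (const struct cache_config *cfg)
{
  unsigned long int non_temporal_threshold = cfg->default_non_temporal;
  unsigned long int stop
    = pick_stop_threshold (cfg, non_temporal_threshold);

  if (cfg->has_tunable)
    non_temporal_threshold = cfg->tunable_non_temporal;  /* too late */

  (void) non_temporal_threshold;
  return stop;
}

/* New order: the tunable is applied first, then the stop threshold is
   derived from the final value.  */
unsigned long int
stop_threshold_new_order (const struct cache_config *cfg)
{
  unsigned long int non_temporal_threshold = cfg->default_non_temporal;

  if (cfg->has_tunable)
    non_temporal_threshold = cfg->tunable_non_temporal;

  return pick_stop_threshold (cfg, non_temporal_threshold);
}
```

In this model, a non-AMD configuration with a tunable set gets the
default from `stop_threshold_old_order` and the user's value from
`stop_threshold_new_order`. An AMD configuration gets `core` from both,
since that branch never consults the non-temporal threshold.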