From: Noah Goldstein
Date: Thu, 13 Jul 2023 21:21:54 -0500
Subject: Re:
To: Sajan Karumanchi
Cc: premachandra.mallappa@amd.com, dj@redhat.com, hjl.tools@gmail.com, libc-alpha@sourceware.org, carlos@systemhalted.org
References: <20230527184632.694761-3-goldstein.w.n@gmail.com> <20230710052317.828308-1-sajan.karumanchi@amd.com>

On Mon, Jul 10, 2023 at 10:58 AM Noah Goldstein wrote:
>
> On Mon, Jul 10, 2023 at 12:23 AM Sajan Karumanchi wrote:
> >
> > Noah,
> > I verified your patches on the master branch that impacts the non-threshold
> > parameter on x86 CPUs. This patch modifies the non-temporal threshold value
> > from 24MB(3/4th of L3$) to 8MB(1/4th of L3$) on ZEN4.
> > From the Glibc benchmarks, we saw a significant performance drop ranging
> > from 15% to 99% for size ranges of 8MB to 16MB.
> > I also ran the new tool developed by you on all Zen architectures and the
> > results conclude that 3/4th L3 size holds good on AMD CPUs.
> > Hence the current patch degrades the performance of AMD CPUs.
> > We strongly recommend marking this change to Intel CPUs only.
> >
>
> So it shouldn't actually go down. I think what is missing is:
> ```
> get_common_cache_info (&shared, &shared_per_thread, &threads, core);
> ```
>
> In the AMD case shared == shared_per_thread which shouldn't really
> be the case.
>
> The intended new calculation is: Total_L3_Size / Scale
> as opposed to: (L3_Size / NThread) / Scale
>
> Before just going with default for AMD, maybe try out the following patch?
>
> ```
> ---
>  sysdeps/x86/dl-cacheinfo.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> index c98fa57a7b..c1866ca898 100644
> --- a/sysdeps/x86/dl-cacheinfo.h
> +++ b/sysdeps/x86/dl-cacheinfo.h
> @@ -717,6 +717,7 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
>        level3_cache_assoc = handle_amd (_SC_LEVEL3_CACHE_ASSOC);
>        level3_cache_linesize = handle_amd (_SC_LEVEL3_CACHE_LINESIZE);
>
> +      get_common_cache_info (&shared, &shared_per_thread, &threads, core);
>        if (shared <= 0)
>          /* No shared L3 cache. All we have is the L2 cache. */
>          shared = core;
> --
> 2.34.1
> ```
> > Thanks,
> > Sajan K.
> >

ping. 2.38 is approaching and I expect you want to get any fixes in before that.
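
For anyone following the thread, the two calculations contrasted above (Total_L3_Size / Scale versus (L3_Size / NThread) / Scale) can be sketched as plain arithmetic. This is only an illustration, not the code in dl-cacheinfo.h: the helper names are hypothetical, the scale of 4 is an assumption taken from the "1/4th of L3$" figure quoted above, and the thread count is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a glibc function): threshold derived from
   the whole L3 size, i.e. Total_L3_Size / Scale.  */
size_t
threshold_from_total (size_t total_l3, size_t scale)
{
  assert (scale > 0);
  return total_l3 / scale;
}

/* Hypothetical helper (not a glibc function): threshold derived from
   the per-thread share of L3, i.e. (L3_Size / NThread) / Scale.  */
size_t
threshold_from_per_thread (size_t total_l3, size_t nthreads, size_t scale)
{
  assert (nthreads > 0 && scale > 0);
  return (total_l3 / nthreads) / scale;
}
```

With a 32 MiB L3 (the size implied by the 24MB and 8MB figures quoted above), a scale of 4 and an assumed 16 threads, the first helper returns 8 MiB and the second returns 512 KiB, which shows how far apart the two formulas land for the same cache.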