References: <20211206032303.7159-1-hjl.tools@gmail.com> <3639bca9-e90d-d3ff-c758-d2d5c4c0a3d2@linux.intel.com> <87r1aoqzlz.fsf@oldenburg.str.redhat.com> <878rwwquvk.fsf@oldenburg.str.redhat.com>
From: Sunil Pandey
Date: Fri, 22 Apr 2022 18:51:38 -0700
Subject: Re: [PATCH] x86: Don't set Prefer_No_AVX512 for processors with AVX512 and AVX-VNNI
To: Noah Goldstein, libc-stable@sourceware.org
Cc: "H.J. Lu", Florian Weimer, "H.J.
 Lu via Libc-alpha", Hongyu Wang, Thiago Macieira, liuhongt, Arjan van de Ven

On Tue, Dec 7, 2021 at 11:33 AM Noah Goldstein via Libc-alpha wrote:
>
> On Tue, Dec 7, 2021 at 9:53 AM H.J. Lu via Libc-alpha wrote:
> >
> > On Tue, Dec 7, 2021 at 7:48 AM Florian Weimer wrote:
> > >
> > > * H. J. Lu via Libc-alpha:
> > >
> > > > On Tue, Dec 7, 2021 at 6:05 AM Florian Weimer wrote:
> > > >>
> > > >> * H. J. Lu via Libc-alpha:
> > > >>
> > > >> > Hongtao, Hongyu, can you find a Rocket Lake to test?
> > > >>
> > > >> I've found a lab machine with an i7-11700 CPU.  Is there something I
> > > >> could test for you?
> > > >
> > > > You can enable AVX512 in glibc with:
> > > >
> > > > $ export GLIBC_TUNABLES=glibc.cpu.hwcaps=-Prefer_No_AVX512
> > > >
> > > > While bootstrapping GCC with -j8, track CPU frequency with turbostat.  If
> > > > there is no CPU frequency drop and build time is less comparing against
> > > > without GLIBC_TUNABLES, we can enable AVX512.
> > > >
> > > >> (This could be non-production silicon, though.)
> > > >>
> > > >
> > > > The frequency behavior of non-production silicon can be different.
> > >
> > > With that caveat, it seems that frequencies drop further with
> > > GLIBC_TUNABLES set as above, and the build is also a little bit slower
> > > (5m31s vs 5m23s, the AVX-512 build was run first, and the system was a
> > > little bit warmer for the second run).
> > >
> > > Would it make sense to run more extensive tests, or should we wait for
> > > someone with production silicon to show up?
> >
> > GCC is a heavy user of memcpy/memset, which is a good proxy of
> > ZMM load/store impact on CPU frequency.  We need to run the same
> > test on a production Rocket Lake.
>
> I would think a microbenchmark would be better for determining if
> rocketlake actually has throttling.
>
> Testing the full j8 GCC build will add a bunch of frequency "noise"
> due to thermal throttling.
>
> >
> > --
> > H.J.

I would like to backport this patch to release branches.
Any comments or objections?

--Sunil
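
For anyone repeating the comparison described in the quoted thread (same
workload, with and without the tunable), a minimal sketch of a timing harness
might look like the following. Only the GLIBC_TUNABLES string is taken from the
thread; the function names, the alternating run order, and the "best of N"
policy are assumptions made for illustration. Alternating the two
configurations is meant to reduce the warm-up bias noted above, where one
configuration was always measured first. The sketch does not record CPU
frequency, so turbostat would still need to run alongside it.

```python
import os
import subprocess
import sys
import time

# Tunable string quoted from the thread; everything else is illustrative.
AVX512_TUNABLE = "glibc.cpu.hwcaps=-Prefer_No_AVX512"


def build_env(enable_avx512, base_env=None):
    """Return a copy of the environment with GLIBC_TUNABLES set or removed."""
    env = dict(os.environ if base_env is None else base_env)
    if enable_avx512:
        env["GLIBC_TUNABLES"] = AVX512_TUNABLE
    else:
        env.pop("GLIBC_TUNABLES", None)
    return env


def time_command(cmd, env):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start


def compare(cmd, runs=3):
    """Alternate both configurations; return (best_with_tunable, best_default)."""
    best = {True: float("inf"), False: float("inf")}
    for _ in range(runs):
        for enable in (True, False):
            elapsed = time_command(cmd, build_env(enable))
            best[enable] = min(best[enable], elapsed)
    return best[True], best[False]
```

A hypothetical call would be compare(["make", "-j8"]) from a build directory,
with the caveat that a real build has to be cleaned between runs, which this
sketch does not do.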