From: Thiago Macieira
To: Florian Weimer, "H.J. Lu"
Cc: "H.J. Lu via Libc-alpha", Arjan van de Ven, liuhongt, Hongyu Wang
Subject: Re: [PATCH] x86: Don't set Prefer_No_AVX512 for processors with AVX512 and AVX-VNNI
Date: Tue, 07 Dec 2021 08:22:52 -0800
Message-ID: <52871468.kDZyW8NQCg@tjmaciei-mobl5>
Organization: Intel Corporation
References: <20211206032303.7159-1-hjl.tools@gmail.com> <878rwwquvk.fsf@oldenburg.str.redhat.com>
List-Id: Libc-alpha mailing list

On Tuesday, 7 December 2021 07:52:44 PST H.J. Lu wrote:
> > Would it make sense to run more extensive tests, or should we wait for
> > someone with production silicon to show up?
>
> GCC is a heavy user of memcpy/memset, which is a good proxy of
> ZMM load/store impact on CPU frequency. We need to run the same
> test on a production Rocket Lake.

Can someone run the same test on an Ice Lake? That will also answer whether we
should enable the same thing for ICL / ICX. RKL is a Cypress Cove, so I'd
expect it to have the same performance numbers as ICL's Sunny Cove.

The data I have says that, in theory, we should not see a frequency drop for
512-bit memcpy / memset on ICL or TGL, but I haven't got experimental data
confirming that. And I can't really run the benchmark test on a laptop with
very poor thermal dissipation (freq drops to 1500 MHz all on its own).

If a good ICL has the drop, then I'd assume RKL will too.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering