From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20230809014756.19615-1-hongtao.liu@intel.com>
In-Reply-To: <20230809014756.19615-1-hongtao.liu@intel.com>
From: Uros Bizjak
Date: Wed, 9 Aug 2023 07:48:43 +0200
Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset="UTF-8"

On Wed, Aug 9, 2023 at 3:48 AM liuhongt wrote:
>
> > Please rather do it in a more self-descriptive way, as proposed in the
> > attached patch. You won't need a comment then.
> >
>
> Adjusted in V2 patch.
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max subleaf number.

Probably a documentation bug; even Wikipedia says about CPUID:

EAX=7, ECX=0: Extended Features

This returns extended feature flags in EBX, ECX, and EDX. Returns the
maximum ECX value for EAX=7 in EAX.

> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1.
> Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features): Check
>         EAX for valid subleaf before using CPUID.

OK for mainline and backports.

Thanks,
Uros.

> ---
>  gcc/common/config/i386/cpuinfo.h | 82 +++++++++++++++++---------------
>  1 file changed, 43 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 30ef0d334ca..9fa4dec2a7e 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -663,6 +663,7 @@ get_available_features (struct __processor_model *cpu_model,
>    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
>    unsigned int eax, ebx;
>    unsigned int ext_level;
> +  unsigned int subleaf_level;
>
>    /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */
>  #define XCR_XFEATURE_ENABLED_MASK 0x0
> @@ -762,7 +763,7 @@ get_available_features (struct __processor_model *cpu_model,
>    /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
>    if (max_cpuid_level >= 7)
>      {
> -      __cpuid_count (7, 0, eax, ebx, ecx, edx);
> +      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
>        if (ebx & bit_BMI)
>          set_feature (FEATURE_BMI);
>        if (ebx & bit_SGX)
> @@ -874,45 +875,48 @@ get_available_features (struct __processor_model *cpu_model,
>              set_feature (FEATURE_AVX512FP16);
>          }
>
> -      __cpuid_count (7, 1, eax, ebx, ecx, edx);
> -      if (eax & bit_HRESET)
> -        set_feature (FEATURE_HRESET);
> -      if (eax & bit_CMPCCXADD)
> -        set_feature(FEATURE_CMPCCXADD);
> -      if (edx & bit_PREFETCHI)
> -        set_feature (FEATURE_PREFETCHI);
> -      if (eax & bit_RAOINT)
> -        set_feature (FEATURE_RAOINT);
> -      if (avx_usable)
> -        {
> -          if (eax & bit_AVXVNNI)
> -            set_feature (FEATURE_AVXVNNI);
> -          if (eax & bit_AVXIFMA)
> -            set_feature (FEATURE_AVXIFMA);
> -          if (edx & bit_AVXVNNIINT8)
> -            set_feature (FEATURE_AVXVNNIINT8);
> -          if (edx & bit_AVXNECONVERT)
> -            set_feature (FEATURE_AVXNECONVERT);
> -          if (edx & bit_AVXVNNIINT16)
> -            set_feature (FEATURE_AVXVNNIINT16);
> -          if (eax & bit_SM3)
> -            set_feature (FEATURE_SM3);
> -          if (eax & bit_SHA512)
> -            set_feature (FEATURE_SHA512);
> -          if (eax & bit_SM4)
> -            set_feature (FEATURE_SM4);
> -        }
> -      if (avx512_usable)
> -        {
> -          if (eax & bit_AVX512BF16)
> -            set_feature (FEATURE_AVX512BF16);
> -        }
> -      if (amx_usable)
> +      if (subleaf_level >= 1)
>          {
> -          if (eax & bit_AMX_FP16)
> -            set_feature (FEATURE_AMX_FP16);
> -          if (edx & bit_AMX_COMPLEX)
> -            set_feature (FEATURE_AMX_COMPLEX);
> +          __cpuid_count (7, 1, eax, ebx, ecx, edx);
> +          if (eax & bit_HRESET)
> +            set_feature (FEATURE_HRESET);
> +          if (eax & bit_CMPCCXADD)
> +            set_feature(FEATURE_CMPCCXADD);
> +          if (edx & bit_PREFETCHI)
> +            set_feature (FEATURE_PREFETCHI);
> +          if (eax & bit_RAOINT)
> +            set_feature (FEATURE_RAOINT);
> +          if (avx_usable)
> +            {
> +              if (eax & bit_AVXVNNI)
> +                set_feature (FEATURE_AVXVNNI);
> +              if (eax & bit_AVXIFMA)
> +                set_feature (FEATURE_AVXIFMA);
> +              if (edx & bit_AVXVNNIINT8)
> +                set_feature (FEATURE_AVXVNNIINT8);
> +              if (edx & bit_AVXNECONVERT)
> +                set_feature (FEATURE_AVXNECONVERT);
> +              if (edx & bit_AVXVNNIINT16)
> +                set_feature (FEATURE_AVXVNNIINT16);
> +              if (eax & bit_SM3)
> +                set_feature (FEATURE_SM3);
> +              if (eax & bit_SHA512)
> +                set_feature (FEATURE_SHA512);
> +              if (eax & bit_SM4)
> +                set_feature (FEATURE_SM4);
> +            }
> +          if (avx512_usable)
> +            {
> +              if (eax & bit_AVX512BF16)
> +                set_feature (FEATURE_AVX512BF16);
> +            }
> +          if (amx_usable)
> +            {
> +              if (eax & bit_AMX_FP16)
> +                set_feature (FEATURE_AMX_FP16);
> +              if (edx & bit_AMX_COMPLEX)
> +                set_feature (FEATURE_AMX_COMPLEX);
> +            }
>          }
>      }
>
> -- 
> 2.31.1
>
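
For anyone who wants to see the effect of the guard without the affected hardware, here is a minimal sketch that replaces the real CPUID instruction with a callback so the behavior described in the commit message can be mocked. None of these names come from cpuinfo.h: struct cpuid_regs, cpuid_fn, both mock functions and DEMO_BIT_AVXVNNIINT16 are hypothetical and exist only for this illustration. The bit position is an assumption for the demo; real code should use bit_AVXVNNIINT16 from cpuid.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register bundle and callback type, used only in this
   sketch so that CPUID can be mocked on any host.  */
struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };
typedef struct cpuid_regs (*cpuid_fn) (uint32_t leaf, uint32_t subleaf);

/* Assumed bit position for the demo; real code uses bit_AVXVNNIINT16.  */
#define DEMO_BIT_AVXVNNIINT16 (1u << 10)

/* Old behavior: read leaf 7 subleaf 1 unconditionally and trust that an
   unsupported subleaf returns all zeros.  */
int
has_avxvnniint16_unchecked (cpuid_fn cpuid)
{
  assert (cpuid);
  struct cpuid_regs sub1 = cpuid (7, 1);
  return (sub1.edx & DEMO_BIT_AVXVNNIINT16) != 0;
}

/* Patched behavior: subleaf 0 reports the maximum valid subleaf in EAX,
   so only query subleaf 1 when that value is at least 1.  */
int
has_avxvnniint16_checked (cpuid_fn cpuid)
{
  assert (cpuid);
  struct cpuid_regs sub0 = cpuid (7, 0);
  if (sub0.eax < 1)
    return 0;
  struct cpuid_regs sub1 = cpuid (7, 1);
  return (sub1.edx & DEMO_BIT_AVXVNNIINT16) != 0;
}

/* Mock of the reported misbehavior: maximum subleaf is 0, yet subleaf 1
   echoes the subleaf 0 EDX value instead of returning zeros.  */
struct cpuid_regs
mock_sandy_bridge_cpuid (uint32_t leaf, uint32_t subleaf)
{
  struct cpuid_regs r = { 0, 0, 0, 0 };
  (void) subleaf;               /* Same EDX for every subleaf.  */
  if (leaf == 7)
    r.edx = DEMO_BIT_AVXVNNIINT16;
  return r;
}

/* Mock of a CPU that really implements subleaf 1 and the feature.  */
struct cpuid_regs
mock_newer_cpuid (uint32_t leaf, uint32_t subleaf)
{
  struct cpuid_regs r = { 0, 0, 0, 0 };
  if (leaf == 7 && subleaf == 0)
    r.eax = 1;                  /* Maximum valid subleaf is 1.  */
  else if (leaf == 7 && subleaf == 1)
    r.edx = DEMO_BIT_AVXVNNIINT16;
  return r;
}
```

With the Sandy Bridge mock, the unchecked variant reports the feature while the checked variant does not; with the newer-CPU mock, both report it. That matches the intent of the subleaf_level >= 1 test in the patch, under the assumptions stated above.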