From: Uros Bizjak
Date: Wed, 9 Aug 2023 08:33:22 +0200
Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org
In-Reply-To: <20230809014756.19615-1-hongtao.liu@intel.com>
References: <20230809014756.19615-1-hongtao.liu@intel.com>

On Wed, Aug 9, 2023 at 3:48 AM liuhongt wrote:
>
> > Please rather do it in a more self-descriptive way, as proposed in the
> > attached patch. You won't need a comment then.
> >
>
> Adjusted in V2 patch.
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max subleaf number.
>
> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1. Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features): Check
>         EAX for valid subleaf before using CPUID.
> ---
>  gcc/common/config/i386/cpuinfo.h | 82 +++++++++++++++++---------------
>  1 file changed, 43 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 30ef0d334ca..9fa4dec2a7e 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -663,6 +663,7 @@ get_available_features (struct __processor_model *cpu_model,
>    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
>    unsigned int eax, ebx;
>    unsigned int ext_level;
> +  unsigned int subleaf_level;

Oh, I missed this in my previous review. This variable should be named
max_subleaf_level, as it represents the maximum supported ECX value.

Uros.

>
>    /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
>  #define XCR_XFEATURE_ENABLED_MASK 0x0
> @@ -762,7 +763,7 @@ get_available_features (struct __processor_model *cpu_model,
>    /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
>    if (max_cpuid_level >= 7)
>      {
> -      __cpuid_count (7, 0, eax, ebx, ecx, edx);
> +      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
>        if (ebx & bit_BMI)
>          set_feature (FEATURE_BMI);
>        if (ebx & bit_SGX)
> @@ -874,45 +875,48 @@ get_available_features (struct __processor_model *cpu_model,
>              set_feature (FEATURE_AVX512FP16);
>          }
>
> -      __cpuid_count (7, 1, eax, ebx, ecx, edx);
> -      if (eax & bit_HRESET)
> -        set_feature (FEATURE_HRESET);
> -      if (eax & bit_CMPCCXADD)
> -        set_feature(FEATURE_CMPCCXADD);
> -      if (edx & bit_PREFETCHI)
> -        set_feature (FEATURE_PREFETCHI);
> -      if (eax & bit_RAOINT)
> -        set_feature (FEATURE_RAOINT);
> -      if (avx_usable)
> -        {
> -          if (eax & bit_AVXVNNI)
> -            set_feature (FEATURE_AVXVNNI);
> -          if (eax & bit_AVXIFMA)
> -            set_feature (FEATURE_AVXIFMA);
> -          if (edx & bit_AVXVNNIINT8)
> -            set_feature (FEATURE_AVXVNNIINT8);
> -          if (edx & bit_AVXNECONVERT)
> -            set_feature (FEATURE_AVXNECONVERT);
> -          if (edx & bit_AVXVNNIINT16)
> -            set_feature (FEATURE_AVXVNNIINT16);
> -          if (eax & bit_SM3)
> -            set_feature (FEATURE_SM3);
> -          if (eax & bit_SHA512)
> -            set_feature (FEATURE_SHA512);
> -          if (eax & bit_SM4)
> -            set_feature (FEATURE_SM4);
> -        }
> -      if (avx512_usable)
> -        {
> -          if (eax & bit_AVX512BF16)
> -            set_feature (FEATURE_AVX512BF16);
> -        }
> -      if (amx_usable)
> +      if (subleaf_level >= 1)
>          {
> -          if (eax & bit_AMX_FP16)
> -            set_feature (FEATURE_AMX_FP16);
> -          if (edx & bit_AMX_COMPLEX)
> -            set_feature (FEATURE_AMX_COMPLEX);
> +          __cpuid_count (7, 1, eax, ebx, ecx, edx);
> +          if (eax & bit_HRESET)
> +            set_feature (FEATURE_HRESET);
> +          if (eax & bit_CMPCCXADD)
> +            set_feature(FEATURE_CMPCCXADD);
> +          if (edx & bit_PREFETCHI)
> +            set_feature (FEATURE_PREFETCHI);
> +          if (eax & bit_RAOINT)
> +            set_feature (FEATURE_RAOINT);
> +          if (avx_usable)
> +            {
> +              if (eax & bit_AVXVNNI)
> +                set_feature (FEATURE_AVXVNNI);
> +              if (eax & bit_AVXIFMA)
> +                set_feature (FEATURE_AVXIFMA);
> +              if (edx & bit_AVXVNNIINT8)
> +                set_feature (FEATURE_AVXVNNIINT8);
> +              if (edx & bit_AVXNECONVERT)
> +                set_feature (FEATURE_AVXNECONVERT);
> +              if (edx & bit_AVXVNNIINT16)
> +                set_feature (FEATURE_AVXVNNIINT16);
> +              if (eax & bit_SM3)
> +                set_feature (FEATURE_SM3);
> +              if (eax & bit_SHA512)
> +                set_feature (FEATURE_SHA512);
> +              if (eax & bit_SM4)
> +                set_feature (FEATURE_SM4);
> +            }
> +          if (avx512_usable)
> +            {
> +              if (eax & bit_AVX512BF16)
> +                set_feature (FEATURE_AVX512BF16);
> +            }
> +          if (amx_usable)
> +            {
> +              if (eax & bit_AMX_FP16)
> +                set_feature (FEATURE_AMX_FP16);
> +              if (edx & bit_AMX_COMPLEX)
> +                set_feature (FEATURE_AMX_COMPLEX);
> +            }
>          }
>      }
>
> --
> 2.31.1
>
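
For readers following the thread, here is a rough, portable sketch of why
the guard matters. It does not execute the real CPUID instruction. Instead
it models the suspected misbehaviour with a stand-in function. Everything
named fake_* or DEMO_* below is hypothetical and is not part of
cpuinfo.h; the bit position is only an illustrative value. The guarded
variant uses the max_subleaf_level name suggested in the review.

```c
/* Hypothetical register snapshot for illustration; not a GCC type.  */
struct fake_cpuid_regs
{
  unsigned int eax, ebx, ecx, edx;
};

/* Illustrative EDX bit standing in for a subleaf 1 feature flag.  */
#define DEMO_BIT_FEATURE (1u << 10)

/* Models a CPU that implements only leaf 7 subleaf 0 (so EAX, the
   maximum subleaf, is 0) but returns the same EDX value for every
   subleaf instead of zeros.  This is the behaviour the patch
   description suspects on some Sandy Bridge parts.  */
struct fake_cpuid_regs
fake_cpuid_leaf7 (unsigned int subleaf)
{
  struct fake_cpuid_regs r = { 0, 0, 0, DEMO_BIT_FEATURE };
  (void) subleaf;		/* The modelled bug: subleaf is ignored.  */
  return r;
}

/* Old approach: trust that an invalid subleaf returns all zeros.  */
int
demo_feature_unguarded (void)
{
  struct fake_cpuid_regs r = fake_cpuid_leaf7 (1);
  return (r.edx & DEMO_BIT_FEATURE) != 0;
}

/* Patched approach: read the maximum subleaf from EAX of subleaf 0
   and query subleaf 1 only when it is reported as supported.  */
int
demo_feature_guarded (void)
{
  unsigned int max_subleaf_level = fake_cpuid_leaf7 (0).eax;

  if (max_subleaf_level < 1)
    return 0;

  return (fake_cpuid_leaf7 (1).edx & DEMO_BIT_FEATURE) != 0;
}
```

On the modelled CPU, demo_feature_unguarded () reports the feature as
present while demo_feature_guarded () does not, which mirrors the
avxvnniint16 misdetection described in the patch.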