public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge.
@ 2023-08-08  7:56 liuhongt
  2023-08-08  8:02 ` Jakub Jelinek
  2023-08-08 12:41 ` Uros Bizjak
  0 siblings, 2 replies; 9+ messages in thread
From: liuhongt @ 2023-08-08  7:56 UTC (permalink / raw)
  To: gcc-patches; +Cc: ubizjak, phoebe.wang, craig.topper

Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
supported via EAX.

Intel documentation says invalid subleaves return 0. We had been
relying on that behavior instead of checking the max sublef number.

It appears that some Sandy Bridge CPUs return at least the subleaf 0
EDX value for subleaf 1. Best guess is that this is a bug in a
microcode patch since all of the bits we're seeing set in EDX were
introduced after Sandy Bridge was originally released.

This is causing avxvnniint16 to be incorrectly enabled with
-march=native on these CPUs.

BTW: Thanks for reminder from llvm forks Phoebe and Craig.


Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk and backport?

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features): Check
	EAX for valid subleaf before use CPUID.
---
 gcc/common/config/i386/cpuinfo.h | 84 +++++++++++++++++---------------
 1 file changed, 46 insertions(+), 38 deletions(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 30ef0d334ca..24ab2252eb0 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -874,45 +874,53 @@ get_available_features (struct __processor_model *cpu_model,
 	    set_feature (FEATURE_AVX512FP16);
 	}
 
-      __cpuid_count (7, 1, eax, ebx, ecx, edx);
-      if (eax & bit_HRESET)
-	set_feature (FEATURE_HRESET);
-      if (eax & bit_CMPCCXADD)
-	set_feature(FEATURE_CMPCCXADD);
-      if (edx & bit_PREFETCHI)
-	set_feature (FEATURE_PREFETCHI);
-      if (eax & bit_RAOINT)
-	set_feature (FEATURE_RAOINT);
-      if (avx_usable)
-	{
-	  if (eax & bit_AVXVNNI)
-	    set_feature (FEATURE_AVXVNNI);
-	  if (eax & bit_AVXIFMA)
-	    set_feature (FEATURE_AVXIFMA);
-	  if (edx & bit_AVXVNNIINT8)
-	    set_feature (FEATURE_AVXVNNIINT8);
-	  if (edx & bit_AVXNECONVERT)
-	    set_feature (FEATURE_AVXNECONVERT);
-	  if (edx & bit_AVXVNNIINT16)
-	    set_feature (FEATURE_AVXVNNIINT16);
-	  if (eax & bit_SM3)
-	    set_feature (FEATURE_SM3);
-	  if (eax & bit_SHA512)
-	    set_feature (FEATURE_SHA512);
-	  if (eax & bit_SM4)
-	    set_feature (FEATURE_SM4);
-	}
-      if (avx512_usable)
-	{
-	  if (eax & bit_AVX512BF16)
-	    set_feature (FEATURE_AVX512BF16);
-	}
-      if (amx_usable)
+      /* According to document, when subleaf is invliad, EAX,EBX,ECX,EDX should
+	 return 0 for CPUID (7, 1, EAX, EBX, ECX, EDX).
+	 But looks like it doesn't satisfy the document on some CPU, refer to
+	 https://reviews.llvm.org/D155145.
+	 Manually check valid subleaf here.  */
+      if (eax)
 	{
-	  if (eax & bit_AMX_FP16)
-	    set_feature (FEATURE_AMX_FP16);
-	  if (edx & bit_AMX_COMPLEX)
-	    set_feature (FEATURE_AMX_COMPLEX);
+	  __cpuid_count (7, 1, eax, ebx, ecx, edx);
+	  if (eax & bit_HRESET)
+	    set_feature (FEATURE_HRESET);
+	  if (eax & bit_CMPCCXADD)
+	    set_feature(FEATURE_CMPCCXADD);
+	  if (edx & bit_PREFETCHI)
+	    set_feature (FEATURE_PREFETCHI);
+	  if (eax & bit_RAOINT)
+	    set_feature (FEATURE_RAOINT);
+	  if (avx_usable)
+	    {
+	      if (eax & bit_AVXVNNI)
+		set_feature (FEATURE_AVXVNNI);
+	      if (eax & bit_AVXIFMA)
+		set_feature (FEATURE_AVXIFMA);
+	      if (edx & bit_AVXVNNIINT8)
+		set_feature (FEATURE_AVXVNNIINT8);
+	      if (edx & bit_AVXNECONVERT)
+		set_feature (FEATURE_AVXNECONVERT);
+	      if (edx & bit_AVXVNNIINT16)
+		set_feature (FEATURE_AVXVNNIINT16);
+	      if (eax & bit_SM3)
+		set_feature (FEATURE_SM3);
+	      if (eax & bit_SHA512)
+		set_feature (FEATURE_SHA512);
+	      if (eax & bit_SM4)
+		set_feature (FEATURE_SM4);
+	    }
+	  if (avx512_usable)
+	    {
+	      if (eax & bit_AVX512BF16)
+		set_feature (FEATURE_AVX512BF16);
+	    }
+	  if (amx_usable)
+	    {
+	      if (eax & bit_AMX_FP16)
+		set_feature (FEATURE_AMX_FP16);
+	      if (edx & bit_AMX_COMPLEX)
+		set_feature (FEATURE_AMX_COMPLEX);
+	    }
 	}
     }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-08  7:56 [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge liuhongt
@ 2023-08-08  8:02 ` Jakub Jelinek
  2023-08-08 12:41 ` Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: Jakub Jelinek @ 2023-08-08  8:02 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches, ubizjak, phoebe.wang, craig.topper

On Tue, Aug 08, 2023 at 03:56:00PM +0800, liuhongt via Gcc-patches wrote:
> +      /* According to document, when subleaf is invliad, EAX,EBX,ECX,EDX should

s/invliad/invalid/

	Jakub


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-08  7:56 [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge liuhongt
  2023-08-08  8:02 ` Jakub Jelinek
@ 2023-08-08 12:41 ` Uros Bizjak
  2023-08-09  1:47   ` [PATCH V2] " liuhongt
  1 sibling, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2023-08-08 12:41 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches, phoebe.wang, craig.topper

[-- Attachment #1: Type: text/plain, Size: 1025 bytes --]

On Tue, Aug 8, 2023 at 9:58 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max sublef number.
>
> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1. Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> BTW: Thanks for reminder from llvm forks Phoebe and Craig.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.

Please rather do it in a more self-descriptive way, as proposed in the
attached patch. You won't need a comment then.

(Please note that indentation is wrong in the patch in order to better
see the changes).

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 1216 bytes --]

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 30ef0d334ca..49724d2cba1 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -762,7 +762,9 @@ get_available_features (struct __processor_model *cpu_model,
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
   if (max_cpuid_level >= 7)
     {
-      __cpuid_count (7, 0, eax, ebx, ecx, edx);
+      unsigned subleaf_level;
+
+      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
       if (ebx & bit_BMI)
 	set_feature (FEATURE_BMI);
       if (ebx & bit_SGX)
@@ -873,8 +875,9 @@ get_available_features (struct __processor_model *cpu_model,
 	  if (edx & bit_AVX512FP16)
 	    set_feature (FEATURE_AVX512FP16);
 	}
-
-      __cpuid_count (7, 1, eax, ebx, ecx, edx);
+      if (subleaf_level >= 1)
+	{
+	  __cpuid_count (7, 1, eax, ebx, ecx, edx);
       if (eax & bit_HRESET)
 	set_feature (FEATURE_HRESET);
       if (eax & bit_CMPCCXADD)
@@ -914,6 +917,7 @@ get_available_features (struct __processor_model *cpu_model,
 	  if (edx & bit_AMX_COMPLEX)
 	    set_feature (FEATURE_AMX_COMPLEX);
 	}
+	}
     }
 
   /* Get Advanced Features at level 0xd (eax = 0xd, ecx = 1). */

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-08 12:41 ` Uros Bizjak
@ 2023-08-09  1:47   ` liuhongt
  2023-08-09  5:48     ` Uros Bizjak
  2023-08-09  6:33     ` Uros Bizjak
  0 siblings, 2 replies; 9+ messages in thread
From: liuhongt @ 2023-08-09  1:47 UTC (permalink / raw)
  To: gcc-patches; +Cc: ubizjak

> Please rather do it in a more self-descriptive way, as proposed in the
> attached patch. You won't need a comment then.
>

Adjusted in V2 patch.

Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
supported via EAX.

Intel documentation says invalid subleaves return 0. We had been
relying on that behavior instead of checking the max sublef number.

It appears that some Sandy Bridge CPUs return at least the subleaf 0
EDX value for subleaf 1. Best guess is that this is a bug in a
microcode patch since all of the bits we're seeing set in EDX were
introduced after Sandy Bridge was originally released.

This is causing avxvnniint16 to be incorrectly enabled with
-march=native on these CPUs.

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features): Check
	EAX for valid subleaf before use CPUID.
---
 gcc/common/config/i386/cpuinfo.h | 82 +++++++++++++++++---------------
 1 file changed, 43 insertions(+), 39 deletions(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 30ef0d334ca..9fa4dec2a7e 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -663,6 +663,7 @@ get_available_features (struct __processor_model *cpu_model,
   unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
   unsigned int eax, ebx;
   unsigned int ext_level;
+  unsigned int subleaf_level;
 
   /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
 #define XCR_XFEATURE_ENABLED_MASK	0x0
@@ -762,7 +763,7 @@ get_available_features (struct __processor_model *cpu_model,
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
   if (max_cpuid_level >= 7)
     {
-      __cpuid_count (7, 0, eax, ebx, ecx, edx);
+      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
       if (ebx & bit_BMI)
 	set_feature (FEATURE_BMI);
       if (ebx & bit_SGX)
@@ -874,45 +875,48 @@ get_available_features (struct __processor_model *cpu_model,
 	    set_feature (FEATURE_AVX512FP16);
 	}
 
-      __cpuid_count (7, 1, eax, ebx, ecx, edx);
-      if (eax & bit_HRESET)
-	set_feature (FEATURE_HRESET);
-      if (eax & bit_CMPCCXADD)
-	set_feature(FEATURE_CMPCCXADD);
-      if (edx & bit_PREFETCHI)
-	set_feature (FEATURE_PREFETCHI);
-      if (eax & bit_RAOINT)
-	set_feature (FEATURE_RAOINT);
-      if (avx_usable)
-	{
-	  if (eax & bit_AVXVNNI)
-	    set_feature (FEATURE_AVXVNNI);
-	  if (eax & bit_AVXIFMA)
-	    set_feature (FEATURE_AVXIFMA);
-	  if (edx & bit_AVXVNNIINT8)
-	    set_feature (FEATURE_AVXVNNIINT8);
-	  if (edx & bit_AVXNECONVERT)
-	    set_feature (FEATURE_AVXNECONVERT);
-	  if (edx & bit_AVXVNNIINT16)
-	    set_feature (FEATURE_AVXVNNIINT16);
-	  if (eax & bit_SM3)
-	    set_feature (FEATURE_SM3);
-	  if (eax & bit_SHA512)
-	    set_feature (FEATURE_SHA512);
-	  if (eax & bit_SM4)
-	    set_feature (FEATURE_SM4);
-	}
-      if (avx512_usable)
-	{
-	  if (eax & bit_AVX512BF16)
-	    set_feature (FEATURE_AVX512BF16);
-	}
-      if (amx_usable)
+      if (subleaf_level >= 1)
 	{
-	  if (eax & bit_AMX_FP16)
-	    set_feature (FEATURE_AMX_FP16);
-	  if (edx & bit_AMX_COMPLEX)
-	    set_feature (FEATURE_AMX_COMPLEX);
+	  __cpuid_count (7, 1, eax, ebx, ecx, edx);
+	  if (eax & bit_HRESET)
+	    set_feature (FEATURE_HRESET);
+	  if (eax & bit_CMPCCXADD)
+	    set_feature(FEATURE_CMPCCXADD);
+	  if (edx & bit_PREFETCHI)
+	    set_feature (FEATURE_PREFETCHI);
+	  if (eax & bit_RAOINT)
+	    set_feature (FEATURE_RAOINT);
+	  if (avx_usable)
+	    {
+	      if (eax & bit_AVXVNNI)
+		set_feature (FEATURE_AVXVNNI);
+	      if (eax & bit_AVXIFMA)
+		set_feature (FEATURE_AVXIFMA);
+	      if (edx & bit_AVXVNNIINT8)
+		set_feature (FEATURE_AVXVNNIINT8);
+	      if (edx & bit_AVXNECONVERT)
+		set_feature (FEATURE_AVXNECONVERT);
+	      if (edx & bit_AVXVNNIINT16)
+		set_feature (FEATURE_AVXVNNIINT16);
+	      if (eax & bit_SM3)
+		set_feature (FEATURE_SM3);
+	      if (eax & bit_SHA512)
+		set_feature (FEATURE_SHA512);
+	      if (eax & bit_SM4)
+		set_feature (FEATURE_SM4);
+	    }
+	  if (avx512_usable)
+	    {
+	      if (eax & bit_AVX512BF16)
+		set_feature (FEATURE_AVX512BF16);
+	    }
+	  if (amx_usable)
+	    {
+	      if (eax & bit_AMX_FP16)
+		set_feature (FEATURE_AMX_FP16);
+	      if (edx & bit_AMX_COMPLEX)
+		set_feature (FEATURE_AMX_COMPLEX);
+	    }
 	}
     }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-09  1:47   ` [PATCH V2] " liuhongt
@ 2023-08-09  5:48     ` Uros Bizjak
  2023-08-09  6:33     ` Uros Bizjak
  1 sibling, 0 replies; 9+ messages in thread
From: Uros Bizjak @ 2023-08-09  5:48 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches

On Wed, Aug 9, 2023 at 3:48 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> > Please rather do it in a more self-descriptive way, as proposed in the
> > attached patch. You won't need a comment then.
> >
>
> Adjusted in V2 patch.
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max sublef number.

Probably a documentation bug, even Wikipedia says about CPUID:

EAX=7, ECX=0: Extended Features

This returns extended feature flags in EBX, ECX, and EDX. Returns the
maximum ECX value for EAX=7 in EAX.

> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1. Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features): Check
>         EAX for valid subleaf before use CPUID.

OK for mainline and backports.

Thanks,
Uros.

> ---
>  gcc/common/config/i386/cpuinfo.h | 82 +++++++++++++++++---------------
>  1 file changed, 43 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 30ef0d334ca..9fa4dec2a7e 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -663,6 +663,7 @@ get_available_features (struct __processor_model *cpu_model,
>    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
>    unsigned int eax, ebx;
>    unsigned int ext_level;
> +  unsigned int subleaf_level;
>
>    /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
>  #define XCR_XFEATURE_ENABLED_MASK      0x0
> @@ -762,7 +763,7 @@ get_available_features (struct __processor_model *cpu_model,
>    /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
>    if (max_cpuid_level >= 7)
>      {
> -      __cpuid_count (7, 0, eax, ebx, ecx, edx);
> +      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
>        if (ebx & bit_BMI)
>         set_feature (FEATURE_BMI);
>        if (ebx & bit_SGX)
> @@ -874,45 +875,48 @@ get_available_features (struct __processor_model *cpu_model,
>             set_feature (FEATURE_AVX512FP16);
>         }
>
> -      __cpuid_count (7, 1, eax, ebx, ecx, edx);
> -      if (eax & bit_HRESET)
> -       set_feature (FEATURE_HRESET);
> -      if (eax & bit_CMPCCXADD)
> -       set_feature(FEATURE_CMPCCXADD);
> -      if (edx & bit_PREFETCHI)
> -       set_feature (FEATURE_PREFETCHI);
> -      if (eax & bit_RAOINT)
> -       set_feature (FEATURE_RAOINT);
> -      if (avx_usable)
> -       {
> -         if (eax & bit_AVXVNNI)
> -           set_feature (FEATURE_AVXVNNI);
> -         if (eax & bit_AVXIFMA)
> -           set_feature (FEATURE_AVXIFMA);
> -         if (edx & bit_AVXVNNIINT8)
> -           set_feature (FEATURE_AVXVNNIINT8);
> -         if (edx & bit_AVXNECONVERT)
> -           set_feature (FEATURE_AVXNECONVERT);
> -         if (edx & bit_AVXVNNIINT16)
> -           set_feature (FEATURE_AVXVNNIINT16);
> -         if (eax & bit_SM3)
> -           set_feature (FEATURE_SM3);
> -         if (eax & bit_SHA512)
> -           set_feature (FEATURE_SHA512);
> -         if (eax & bit_SM4)
> -           set_feature (FEATURE_SM4);
> -       }
> -      if (avx512_usable)
> -       {
> -         if (eax & bit_AVX512BF16)
> -           set_feature (FEATURE_AVX512BF16);
> -       }
> -      if (amx_usable)
> +      if (subleaf_level >= 1)
>         {
> -         if (eax & bit_AMX_FP16)
> -           set_feature (FEATURE_AMX_FP16);
> -         if (edx & bit_AMX_COMPLEX)
> -           set_feature (FEATURE_AMX_COMPLEX);
> +         __cpuid_count (7, 1, eax, ebx, ecx, edx);
> +         if (eax & bit_HRESET)
> +           set_feature (FEATURE_HRESET);
> +         if (eax & bit_CMPCCXADD)
> +           set_feature(FEATURE_CMPCCXADD);
> +         if (edx & bit_PREFETCHI)
> +           set_feature (FEATURE_PREFETCHI);
> +         if (eax & bit_RAOINT)
> +           set_feature (FEATURE_RAOINT);
> +         if (avx_usable)
> +           {
> +             if (eax & bit_AVXVNNI)
> +               set_feature (FEATURE_AVXVNNI);
> +             if (eax & bit_AVXIFMA)
> +               set_feature (FEATURE_AVXIFMA);
> +             if (edx & bit_AVXVNNIINT8)
> +               set_feature (FEATURE_AVXVNNIINT8);
> +             if (edx & bit_AVXNECONVERT)
> +               set_feature (FEATURE_AVXNECONVERT);
> +             if (edx & bit_AVXVNNIINT16)
> +               set_feature (FEATURE_AVXVNNIINT16);
> +             if (eax & bit_SM3)
> +               set_feature (FEATURE_SM3);
> +             if (eax & bit_SHA512)
> +               set_feature (FEATURE_SHA512);
> +             if (eax & bit_SM4)
> +               set_feature (FEATURE_SM4);
> +           }
> +         if (avx512_usable)
> +           {
> +             if (eax & bit_AVX512BF16)
> +               set_feature (FEATURE_AVX512BF16);
> +           }
> +         if (amx_usable)
> +           {
> +             if (eax & bit_AMX_FP16)
> +               set_feature (FEATURE_AMX_FP16);
> +             if (edx & bit_AMX_COMPLEX)
> +               set_feature (FEATURE_AMX_COMPLEX);
> +           }
>         }
>      }
>
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-09  1:47   ` [PATCH V2] " liuhongt
  2023-08-09  5:48     ` Uros Bizjak
@ 2023-08-09  6:33     ` Uros Bizjak
  2023-08-09  6:37       ` Liu, Hongtao
  1 sibling, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2023-08-09  6:33 UTC (permalink / raw)
  To: liuhongt; +Cc: gcc-patches

On Wed, Aug 9, 2023 at 3:48 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> > Please rather do it in a more self-descriptive way, as proposed in the
> > attached patch. You won't need a comment then.
> >
>
> Adjusted in V2 patch.
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max sublef number.
>
> It appears that some Sandy Bridge CPUs return at least the subleaf 0
> EDX value for subleaf 1. Best guess is that this is a bug in a
> microcode patch since all of the bits we're seeing set in EDX were
> introduced after Sandy Bridge was originally released.
>
> This is causing avxvnniint16 to be incorrectly enabled with
> -march=native on these CPUs.
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features): Check
>         EAX for valid subleaf before use CPUID.
> ---
>  gcc/common/config/i386/cpuinfo.h | 82 +++++++++++++++++---------------
>  1 file changed, 43 insertions(+), 39 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 30ef0d334ca..9fa4dec2a7e 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -663,6 +663,7 @@ get_available_features (struct __processor_model *cpu_model,
>    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
>    unsigned int eax, ebx;
>    unsigned int ext_level;
> +  unsigned int subleaf_level;

Oh, I failed this in my previous review. This variable should be named
max_subleaf_level, as it represents the maximum supported ECX value.

Uros.

>
>    /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
>  #define XCR_XFEATURE_ENABLED_MASK      0x0
> @@ -762,7 +763,7 @@ get_available_features (struct __processor_model *cpu_model,
>    /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
>    if (max_cpuid_level >= 7)
>      {
> -      __cpuid_count (7, 0, eax, ebx, ecx, edx);
> +      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
>        if (ebx & bit_BMI)
>         set_feature (FEATURE_BMI);
>        if (ebx & bit_SGX)
> @@ -874,45 +875,48 @@ get_available_features (struct __processor_model *cpu_model,
>             set_feature (FEATURE_AVX512FP16);
>         }
>
> -      __cpuid_count (7, 1, eax, ebx, ecx, edx);
> -      if (eax & bit_HRESET)
> -       set_feature (FEATURE_HRESET);
> -      if (eax & bit_CMPCCXADD)
> -       set_feature(FEATURE_CMPCCXADD);
> -      if (edx & bit_PREFETCHI)
> -       set_feature (FEATURE_PREFETCHI);
> -      if (eax & bit_RAOINT)
> -       set_feature (FEATURE_RAOINT);
> -      if (avx_usable)
> -       {
> -         if (eax & bit_AVXVNNI)
> -           set_feature (FEATURE_AVXVNNI);
> -         if (eax & bit_AVXIFMA)
> -           set_feature (FEATURE_AVXIFMA);
> -         if (edx & bit_AVXVNNIINT8)
> -           set_feature (FEATURE_AVXVNNIINT8);
> -         if (edx & bit_AVXNECONVERT)
> -           set_feature (FEATURE_AVXNECONVERT);
> -         if (edx & bit_AVXVNNIINT16)
> -           set_feature (FEATURE_AVXVNNIINT16);
> -         if (eax & bit_SM3)
> -           set_feature (FEATURE_SM3);
> -         if (eax & bit_SHA512)
> -           set_feature (FEATURE_SHA512);
> -         if (eax & bit_SM4)
> -           set_feature (FEATURE_SM4);
> -       }
> -      if (avx512_usable)
> -       {
> -         if (eax & bit_AVX512BF16)
> -           set_feature (FEATURE_AVX512BF16);
> -       }
> -      if (amx_usable)
> +      if (subleaf_level >= 1)
>         {
> -         if (eax & bit_AMX_FP16)
> -           set_feature (FEATURE_AMX_FP16);
> -         if (edx & bit_AMX_COMPLEX)
> -           set_feature (FEATURE_AMX_COMPLEX);
> +         __cpuid_count (7, 1, eax, ebx, ecx, edx);
> +         if (eax & bit_HRESET)
> +           set_feature (FEATURE_HRESET);
> +         if (eax & bit_CMPCCXADD)
> +           set_feature(FEATURE_CMPCCXADD);
> +         if (edx & bit_PREFETCHI)
> +           set_feature (FEATURE_PREFETCHI);
> +         if (eax & bit_RAOINT)
> +           set_feature (FEATURE_RAOINT);
> +         if (avx_usable)
> +           {
> +             if (eax & bit_AVXVNNI)
> +               set_feature (FEATURE_AVXVNNI);
> +             if (eax & bit_AVXIFMA)
> +               set_feature (FEATURE_AVXIFMA);
> +             if (edx & bit_AVXVNNIINT8)
> +               set_feature (FEATURE_AVXVNNIINT8);
> +             if (edx & bit_AVXNECONVERT)
> +               set_feature (FEATURE_AVXNECONVERT);
> +             if (edx & bit_AVXVNNIINT16)
> +               set_feature (FEATURE_AVXVNNIINT16);
> +             if (eax & bit_SM3)
> +               set_feature (FEATURE_SM3);
> +             if (eax & bit_SHA512)
> +               set_feature (FEATURE_SHA512);
> +             if (eax & bit_SM4)
> +               set_feature (FEATURE_SM4);
> +           }
> +         if (avx512_usable)
> +           {
> +             if (eax & bit_AVX512BF16)
> +               set_feature (FEATURE_AVX512BF16);
> +           }
> +         if (amx_usable)
> +           {
> +             if (eax & bit_AMX_FP16)
> +               set_feature (FEATURE_AMX_FP16);
> +             if (edx & bit_AMX_COMPLEX)
> +               set_feature (FEATURE_AMX_COMPLEX);
> +           }
>         }
>      }
>
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-09  6:33     ` Uros Bizjak
@ 2023-08-09  6:37       ` Liu, Hongtao
  2023-08-09  6:38         ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Liu, Hongtao @ 2023-08-09  6:37 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: gcc-patches



> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: Wednesday, August 9, 2023 2:33 PM
> To: Liu, Hongtao <hongtao.liu@intel.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy
> Bridge.
> 
> On Wed, Aug 9, 2023 at 3:48 AM liuhongt <hongtao.liu@intel.com> wrote:
> >
> > > Please rather do it in a more self-descriptive way, as proposed in
> > > the attached patch. You won't need a comment then.
> > >
> >
> > Adjusted in V2 patch.
> >
> > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is supported
> > via EAX.
> >
> > Intel documentation says invalid subleaves return 0. We had been
> > relying on that behavior instead of checking the max sublef number.
> >
> > It appears that some Sandy Bridge CPUs return at least the subleaf 0
> > EDX value for subleaf 1. Best guess is that this is a bug in a
> > microcode patch since all of the bits we're seeing set in EDX were
> > introduced after Sandy Bridge was originally released.
> >
> > This is causing avxvnniint16 to be incorrectly enabled with
> > -march=native on these CPUs.
> >
> > gcc/ChangeLog:
> >
> >         * common/config/i386/cpuinfo.h (get_available_features): Check
> >         EAX for valid subleaf before use CPUID.
> > ---
> >  gcc/common/config/i386/cpuinfo.h | 82
> > +++++++++++++++++---------------
> >  1 file changed, 43 insertions(+), 39 deletions(-)
> >
> > diff --git a/gcc/common/config/i386/cpuinfo.h
> > b/gcc/common/config/i386/cpuinfo.h
> > index 30ef0d334ca..9fa4dec2a7e 100644
> > --- a/gcc/common/config/i386/cpuinfo.h
> > +++ b/gcc/common/config/i386/cpuinfo.h
> > @@ -663,6 +663,7 @@ get_available_features (struct __processor_model
> *cpu_model,
> >    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
> >    unsigned int eax, ebx;
> >    unsigned int ext_level;
> > +  unsigned int subleaf_level;
> 
> Oh, I failed this in my previous review. This variable should be named
> max_subleaf_level, as it represents the maximum supported ECX value.
I've committed previous patch ,but not backport yet.
Guess I can just commit another patch to change the name?
For backport, I'll merge the change together with just 1 commit.
> 
> Uros.
> 
> >
> >    /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
> >  #define XCR_XFEATURE_ENABLED_MASK      0x0
> > @@ -762,7 +763,7 @@ get_available_features (struct __processor_model
> *cpu_model,
> >    /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
> >    if (max_cpuid_level >= 7)
> >      {
> > -      __cpuid_count (7, 0, eax, ebx, ecx, edx);
> > +      __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx);
> >        if (ebx & bit_BMI)
> >         set_feature (FEATURE_BMI);
> >        if (ebx & bit_SGX)
> > @@ -874,45 +875,48 @@ get_available_features (struct
> __processor_model *cpu_model,
> >             set_feature (FEATURE_AVX512FP16);
> >         }
> >
> > -      __cpuid_count (7, 1, eax, ebx, ecx, edx);
> > -      if (eax & bit_HRESET)
> > -       set_feature (FEATURE_HRESET);
> > -      if (eax & bit_CMPCCXADD)
> > -       set_feature(FEATURE_CMPCCXADD);
> > -      if (edx & bit_PREFETCHI)
> > -       set_feature (FEATURE_PREFETCHI);
> > -      if (eax & bit_RAOINT)
> > -       set_feature (FEATURE_RAOINT);
> > -      if (avx_usable)
> > -       {
> > -         if (eax & bit_AVXVNNI)
> > -           set_feature (FEATURE_AVXVNNI);
> > -         if (eax & bit_AVXIFMA)
> > -           set_feature (FEATURE_AVXIFMA);
> > -         if (edx & bit_AVXVNNIINT8)
> > -           set_feature (FEATURE_AVXVNNIINT8);
> > -         if (edx & bit_AVXNECONVERT)
> > -           set_feature (FEATURE_AVXNECONVERT);
> > -         if (edx & bit_AVXVNNIINT16)
> > -           set_feature (FEATURE_AVXVNNIINT16);
> > -         if (eax & bit_SM3)
> > -           set_feature (FEATURE_SM3);
> > -         if (eax & bit_SHA512)
> > -           set_feature (FEATURE_SHA512);
> > -         if (eax & bit_SM4)
> > -           set_feature (FEATURE_SM4);
> > -       }
> > -      if (avx512_usable)
> > -       {
> > -         if (eax & bit_AVX512BF16)
> > -           set_feature (FEATURE_AVX512BF16);
> > -       }
> > -      if (amx_usable)
> > +      if (subleaf_level >= 1)
> >         {
> > -         if (eax & bit_AMX_FP16)
> > -           set_feature (FEATURE_AMX_FP16);
> > -         if (edx & bit_AMX_COMPLEX)
> > -           set_feature (FEATURE_AMX_COMPLEX);
> > +         __cpuid_count (7, 1, eax, ebx, ecx, edx);
> > +         if (eax & bit_HRESET)
> > +           set_feature (FEATURE_HRESET);
> > +         if (eax & bit_CMPCCXADD)
> > +           set_feature(FEATURE_CMPCCXADD);
> > +         if (edx & bit_PREFETCHI)
> > +           set_feature (FEATURE_PREFETCHI);
> > +         if (eax & bit_RAOINT)
> > +           set_feature (FEATURE_RAOINT);
> > +         if (avx_usable)
> > +           {
> > +             if (eax & bit_AVXVNNI)
> > +               set_feature (FEATURE_AVXVNNI);
> > +             if (eax & bit_AVXIFMA)
> > +               set_feature (FEATURE_AVXIFMA);
> > +             if (edx & bit_AVXVNNIINT8)
> > +               set_feature (FEATURE_AVXVNNIINT8);
> > +             if (edx & bit_AVXNECONVERT)
> > +               set_feature (FEATURE_AVXNECONVERT);
> > +             if (edx & bit_AVXVNNIINT16)
> > +               set_feature (FEATURE_AVXVNNIINT16);
> > +             if (eax & bit_SM3)
> > +               set_feature (FEATURE_SM3);
> > +             if (eax & bit_SHA512)
> > +               set_feature (FEATURE_SHA512);
> > +             if (eax & bit_SM4)
> > +               set_feature (FEATURE_SM4);
> > +           }
> > +         if (avx512_usable)
> > +           {
> > +             if (eax & bit_AVX512BF16)
> > +               set_feature (FEATURE_AVX512BF16);
> > +           }
> > +         if (amx_usable)
> > +           {
> > +             if (eax & bit_AMX_FP16)
> > +               set_feature (FEATURE_AMX_FP16);
> > +             if (edx & bit_AMX_COMPLEX)
> > +               set_feature (FEATURE_AMX_COMPLEX);
> > +           }
> >         }
> >      }
> >
> > --
> > 2.31.1
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-09  6:37       ` Liu, Hongtao
@ 2023-08-09  6:38         ` Uros Bizjak
  2023-08-09  6:43           ` Uros Bizjak
  0 siblings, 1 reply; 9+ messages in thread
From: Uros Bizjak @ 2023-08-09  6:38 UTC (permalink / raw)
  To: Liu, Hongtao; +Cc: gcc-patches

On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao <hongtao.liu@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: Wednesday, August 9, 2023 2:33 PM
> > To: Liu, Hongtao <hongtao.liu@intel.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy
> > Bridge.
> >
> > On Wed, Aug 9, 2023 at 3:48 AM liuhongt <hongtao.liu@intel.com> wrote:
> > >
> > > > Please rather do it in a more self-descriptive way, as proposed in
> > > > the attached patch. You won't need a comment then.
> > > >
> > >
> > > Adjusted in V2 patch.
> > >
> > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is supported
> > > via EAX.
> > >
> > > Intel documentation says invalid subleaves return 0. We had been
> > > relying on that behavior instead of checking the max sublef number.
> > >
> > > It appears that some Sandy Bridge CPUs return at least the subleaf 0
> > > EDX value for subleaf 1. Best guess is that this is a bug in a
> > > microcode patch since all of the bits we're seeing set in EDX were
> > > introduced after Sandy Bridge was originally released.
> > >
> > > This is causing avxvnniint16 to be incorrectly enabled with
> > > -march=native on these CPUs.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * common/config/i386/cpuinfo.h (get_available_features): Check
> > >         EAX for valid subleaf before use CPUID.
> > > ---
> > >  gcc/common/config/i386/cpuinfo.h | 82
> > > +++++++++++++++++---------------
> > >  1 file changed, 43 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/gcc/common/config/i386/cpuinfo.h
> > > b/gcc/common/config/i386/cpuinfo.h
> > > index 30ef0d334ca..9fa4dec2a7e 100644
> > > --- a/gcc/common/config/i386/cpuinfo.h
> > > +++ b/gcc/common/config/i386/cpuinfo.h
> > > @@ -663,6 +663,7 @@ get_available_features (struct __processor_model
> > *cpu_model,
> > >    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
> > >    unsigned int eax, ebx;
> > >    unsigned int ext_level;
> > > +  unsigned int subleaf_level;
> >
> > Oh, I failed this in my previous review. This variable should be named
> > max_subleaf_level, as it represents the maximum supported ECX value.
> I've committed previous patch ,but not backport yet.
> Guess I can just commit another patch to change the name?
> For backport, I'll merge the change together with just 1 commit.

Yes. It is a trivial minor change.

Uros.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.
  2023-08-09  6:38         ` Uros Bizjak
@ 2023-08-09  6:43           ` Uros Bizjak
  0 siblings, 0 replies; 9+ messages in thread
From: Uros Bizjak @ 2023-08-09  6:43 UTC (permalink / raw)
  To: Liu, Hongtao; +Cc: gcc-patches

On Wed, Aug 9, 2023 at 8:38 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao <hongtao.liu@intel.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Uros Bizjak <ubizjak@gmail.com>
> > > Sent: Wednesday, August 9, 2023 2:33 PM
> > > To: Liu, Hongtao <hongtao.liu@intel.com>
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy
> > > Bridge.
> > >
> > > On Wed, Aug 9, 2023 at 3:48 AM liuhongt <hongtao.liu@intel.com> wrote:
> > > >
> > > > > Please rather do it in a more self-descriptive way, as proposed in
> > > > > the attached patch. You won't need a comment then.
> > > > >
> > > >
> > > > Adjusted in V2 patch.
> > > >
> > > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is supported
> > > > via EAX.
> > > >
> > > > Intel documentation says invalid subleaves return 0. We had been
> > > > relying on that behavior instead of checking the max sublef number.
> > > >
> > > > It appears that some Sandy Bridge CPUs return at least the subleaf 0
> > > > EDX value for subleaf 1. Best guess is that this is a bug in a
> > > > microcode patch since all of the bits we're seeing set in EDX were
> > > > introduced after Sandy Bridge was originally released.
> > > >
> > > > This is causing avxvnniint16 to be incorrectly enabled with
> > > > -march=native on these CPUs.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * common/config/i386/cpuinfo.h (get_available_features): Check
> > > >         EAX for valid subleaf before use CPUID.
> > > > ---
> > > >  gcc/common/config/i386/cpuinfo.h | 82
> > > > +++++++++++++++++---------------
> > > >  1 file changed, 43 insertions(+), 39 deletions(-)
> > > >
> > > > diff --git a/gcc/common/config/i386/cpuinfo.h
> > > > b/gcc/common/config/i386/cpuinfo.h
> > > > index 30ef0d334ca..9fa4dec2a7e 100644
> > > > --- a/gcc/common/config/i386/cpuinfo.h
> > > > +++ b/gcc/common/config/i386/cpuinfo.h
> > > > @@ -663,6 +663,7 @@ get_available_features (struct __processor_model
> > > *cpu_model,
> > > >    unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
> > > >    unsigned int eax, ebx;
> > > >    unsigned int ext_level;
> > > > +  unsigned int subleaf_level;
> > >
> > > Oh, I failed this in my previous review. This variable should be named
> > > max_subleaf_level, as it represents the maximum supported ECX value.
> > I've committed previous patch ,but not backport yet.
> > Guess I can just commit another patch to change the name?
> > For backport, I'll merge the change together with just 1 commit.
>
> Yes. It is a trivial minor change.

I also think the declaration should go inside (max_cpuid_level >= 7)
block, since it is used only there (and irrelevant outside the block),
but it is your call...

Uros.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-08-09  6:44 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-08  7:56 [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge liuhongt
2023-08-08  8:02 ` Jakub Jelinek
2023-08-08 12:41 ` Uros Bizjak
2023-08-09  1:47   ` [PATCH V2] " liuhongt
2023-08-09  5:48     ` Uros Bizjak
2023-08-09  6:33     ` Uros Bizjak
2023-08-09  6:37       ` Liu, Hongtao
2023-08-09  6:38         ` Uros Bizjak
2023-08-09  6:43           ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).