public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]
@ 2023-02-09 12:11 Jakub Jelinek
  2023-02-09 15:30 ` H.J. Lu
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2023-02-09 12:11 UTC (permalink / raw)
  To: Uros Bizjak, H.J. Lu; +Cc: gcc-patches

Hi!

get_available_features doesn't depend on cpu_model2->__cpu_{family,model}
and just sets stuff up based on CPUID leaf 1, or some extended ones,
so I wonder why are we calling it separately for Intel, AMD and Zhaoxin
and not for all other CPUs too?  I think various programs in the wild
which aren't using __builtin_cpu_{is,supports} just check the various CPUID
leafs and query bits in there, without blacklisting unknown CPU vendors,
so I think even __builtin_cpu_supports ("sse2") etc. should be reliable
if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf
1 or some extended ones.  Calling it for all CPUs also means it can be
inlined because there will be just a single caller.

I will test on Intel but can't test it on non-Intel (or with some extra
effort on AMD; for both of those arches it should be really no change in
behavior).

Thoughts on this?

2023-02-09  Jakub Jelinek  <jakub@redhat.com>

	PR target/100758
	* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Formatting fixes.
	(cpu_indicator_init): Call get_available_features for all CPUs with
	max_level >= 1, rather than just Intel, AMD or Zhaoxin.  Formatting
	fixes.

--- gcc/common/config/i386/cpuinfo.h.jj	2023-01-16 11:52:15.910736614 +0100
+++ gcc/common/config/i386/cpuinfo.h	2023-02-09 12:51:23.539470140 +0100
@@ -601,8 +601,8 @@ get_intel_cpu (struct __processor_model
 
 static inline const char *
 get_zhaoxin_cpu (struct __processor_model *cpu_model,
-		struct __processor_model2 *cpu_model2,
-		unsigned int *cpu_features2)
+		 struct __processor_model2 *cpu_model2,
+		 unsigned int *cpu_features2)
 {
   const char *cpu = NULL;
   unsigned int family = cpu_model2->__cpu_family;
@@ -1016,6 +1016,10 @@ cpu_indicator_init (struct __processor_m
   extended_model = (eax >> 12) & 0xf0;
   extended_family = (eax >> 20) & 0xff;
 
+  /* Find available features. */
+  get_available_features (cpu_model, cpu_model2, cpu_features2,
+			  ecx, edx);
+
   if (vendor == signature_INTEL_ebx)
     {
       /* Adjust model and family for Intel CPUS. */
@@ -1030,9 +1034,6 @@ cpu_indicator_init (struct __processor_m
       cpu_model2->__cpu_family = family;
       cpu_model2->__cpu_model = model;
 
-      /* Find available features. */
-      get_available_features (cpu_model, cpu_model2, cpu_features2,
-			      ecx, edx);
       /* Get CPU type.  */
       get_intel_cpu (cpu_model, cpu_model2, cpu_features2);
       cpu_model->__cpu_vendor = VENDOR_INTEL;
@@ -1049,9 +1050,6 @@ cpu_indicator_init (struct __processor_m
       cpu_model2->__cpu_family = family;
       cpu_model2->__cpu_model = model;
 
-      /* Find available features. */
-      get_available_features (cpu_model, cpu_model2, cpu_features2,
-			      ecx, edx);
       /* Get CPU type.  */
       get_amd_cpu (cpu_model, cpu_model2, cpu_features2);
       cpu_model->__cpu_vendor = VENDOR_AMD;
@@ -1059,22 +1057,17 @@ cpu_indicator_init (struct __processor_m
   else if (vendor == signature_CENTAUR_ebx && family < 0x07)
     cpu_model->__cpu_vendor = VENDOR_CENTAUR;
   else if (vendor == signature_SHANGHAI_ebx
-		|| vendor == signature_CENTAUR_ebx)
+	   || vendor == signature_CENTAUR_ebx)
     {
       /* Adjust model and family for ZHAOXIN CPUS.  */
       if (family == 0x07)
-	{
-	  model += extended_model;
-	}
+	model += extended_model;
 
       cpu_model2->__cpu_family = family;
       cpu_model2->__cpu_model = model;
 
-      /* Find available features.  */
-      get_available_features (cpu_model, cpu_model2, cpu_features2,
-				  ecx, edx);
       /* Get CPU type.  */
-      get_zhaoxin_cpu (cpu_model, cpu_model2,cpu_features2);
+      get_zhaoxin_cpu (cpu_model, cpu_model2, cpu_features2);
       cpu_model->__cpu_vendor = VENDOR_ZHAOXIN;
     }
   else if (vendor == signature_CYRIX_ebx)

	Jakub


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]
  2023-02-09 12:11 [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] Jakub Jelinek
@ 2023-02-09 15:30 ` H.J. Lu
  2023-02-09 15:43   ` Jakub Jelinek
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2023-02-09 15:30 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches

On Thu, Feb 9, 2023 at 4:12 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> get_available_features doesn't depend on cpu_model2->__cpu_{family,model}
> and just sets stuff up based on CPUID leaf 1, or some extended ones,
> so I wonder why are we calling it separately for Intel, AMD and Zhaoxin
> and not for all other CPUs too?  I think various programs in the wild
> which aren't using __builtin_cpu_{is,supports} just check the various CPUID
> leafs and query bits in there, without blacklisting unknown CPU vendors,
> so I think even __builtin_cpu_supports ("sse2") etc. should be reliable
> if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf
> 1 or some extended ones.  Calling it for all CPUs also means it can be
> inlined because there will be just a single caller.
>
> I will test on Intel but can't test it on non-Intel (or with some extra
> effort on AMD; for both of those arches it should be really no change in
> behavior).
>
> Thoughts on this?

No objection here.   It just isn't easy to verify CPUID behavior on
other processors.

Thanks.

> 2023-02-09  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/100758
>         * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Formatting fixes.
>         (cpu_indicator_init): Call get_available_features for all CPUs with
>         max_level >= 1, rather than just Intel, AMD or Zhaoxin.  Formatting
>         fixes.
>
> --- gcc/common/config/i386/cpuinfo.h.jj 2023-01-16 11:52:15.910736614 +0100
> +++ gcc/common/config/i386/cpuinfo.h    2023-02-09 12:51:23.539470140 +0100
> @@ -601,8 +601,8 @@ get_intel_cpu (struct __processor_model
>
>  static inline const char *
>  get_zhaoxin_cpu (struct __processor_model *cpu_model,
> -               struct __processor_model2 *cpu_model2,
> -               unsigned int *cpu_features2)
> +                struct __processor_model2 *cpu_model2,
> +                unsigned int *cpu_features2)
>  {
>    const char *cpu = NULL;
>    unsigned int family = cpu_model2->__cpu_family;
> @@ -1016,6 +1016,10 @@ cpu_indicator_init (struct __processor_m
>    extended_model = (eax >> 12) & 0xf0;
>    extended_family = (eax >> 20) & 0xff;
>
> +  /* Find available features. */
> +  get_available_features (cpu_model, cpu_model2, cpu_features2,
> +                         ecx, edx);
> +
>    if (vendor == signature_INTEL_ebx)
>      {
>        /* Adjust model and family for Intel CPUS. */
> @@ -1030,9 +1034,6 @@ cpu_indicator_init (struct __processor_m
>        cpu_model2->__cpu_family = family;
>        cpu_model2->__cpu_model = model;
>
> -      /* Find available features. */
> -      get_available_features (cpu_model, cpu_model2, cpu_features2,
> -                             ecx, edx);
>        /* Get CPU type.  */
>        get_intel_cpu (cpu_model, cpu_model2, cpu_features2);
>        cpu_model->__cpu_vendor = VENDOR_INTEL;
> @@ -1049,9 +1050,6 @@ cpu_indicator_init (struct __processor_m
>        cpu_model2->__cpu_family = family;
>        cpu_model2->__cpu_model = model;
>
> -      /* Find available features. */
> -      get_available_features (cpu_model, cpu_model2, cpu_features2,
> -                             ecx, edx);
>        /* Get CPU type.  */
>        get_amd_cpu (cpu_model, cpu_model2, cpu_features2);
>        cpu_model->__cpu_vendor = VENDOR_AMD;
> @@ -1059,22 +1057,17 @@ cpu_indicator_init (struct __processor_m
>    else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>      cpu_model->__cpu_vendor = VENDOR_CENTAUR;
>    else if (vendor == signature_SHANGHAI_ebx
> -               || vendor == signature_CENTAUR_ebx)
> +          || vendor == signature_CENTAUR_ebx)
>      {
>        /* Adjust model and family for ZHAOXIN CPUS.  */
>        if (family == 0x07)
> -       {
> -         model += extended_model;
> -       }
> +       model += extended_model;
>
>        cpu_model2->__cpu_family = family;
>        cpu_model2->__cpu_model = model;
>
> -      /* Find available features.  */
> -      get_available_features (cpu_model, cpu_model2, cpu_features2,
> -                                 ecx, edx);
>        /* Get CPU type.  */
> -      get_zhaoxin_cpu (cpu_model, cpu_model2,cpu_features2);
> +      get_zhaoxin_cpu (cpu_model, cpu_model2, cpu_features2);
>        cpu_model->__cpu_vendor = VENDOR_ZHAOXIN;
>      }
>    else if (vendor == signature_CYRIX_ebx)
>
>         Jakub
>


-- 
H.J.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]
  2023-02-09 15:30 ` H.J. Lu
@ 2023-02-09 15:43   ` Jakub Jelinek
  2023-02-09 16:34     ` Uros Bizjak
  0 siblings, 1 reply; 4+ messages in thread
From: Jakub Jelinek @ 2023-02-09 15:43 UTC (permalink / raw)
  To: Uros Bizjak, H.J. Lu; +Cc: gcc-patches

On Thu, Feb 09, 2023 at 07:30:52AM -0800, H.J. Lu wrote:
> On Thu, Feb 9, 2023 at 4:12 AM Jakub Jelinek <jakub@redhat.com> wrote:
> > get_available_features doesn't depend on cpu_model2->__cpu_{family,model}
> > and just sets stuff up based on CPUID leaf 1, or some extended ones,
> > so I wonder why are we calling it separately for Intel, AMD and Zhaoxin
> > and not for all other CPUs too?  I think various programs in the wild
> > which aren't using __builtin_cpu_{is,supports} just check the various CPUID
> > leafs and query bits in there, without blacklisting unknown CPU vendors,
> > so I think even __builtin_cpu_supports ("sse2") etc. should be reliable
> > if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf
> > 1 or some extended ones.  Calling it for all CPUs also means it can be
> > inlined because there will be just a single caller.
> >
> > I will test on Intel but can't test it on non-Intel (or with some extra
> > effort on AMD; for both of those arches it should be really no change in
> > behavior).
> >
> > Thoughts on this?
> 
> No objection here.   It just isn't easy to verify CPUID behavior on
> other processors.

Sure, worst case it can be reverted or exceptions could be added if some
CPUs misbehave but then we'd hopefully have detailed into on how exactly it
behaves.

FYI, I've successfully bootstrapped/regtested this on Intel i9-7960X
and Martin Liska has regtested it with just i386.exp tests on AMD.

Uros, is this ok now?

> > 2023-02-09  Jakub Jelinek  <jakub@redhat.com>
> >
> >         PR target/100758
> >         * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Formatting fixes.
> >         (cpu_indicator_init): Call get_available_features for all CPUs with
> >         max_level >= 1, rather than just Intel, AMD or Zhaoxin.  Formatting
> >         fixes.

	Jakub


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758]
  2023-02-09 15:43   ` Jakub Jelinek
@ 2023-02-09 16:34     ` Uros Bizjak
  0 siblings, 0 replies; 4+ messages in thread
From: Uros Bizjak @ 2023-02-09 16:34 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: H.J. Lu, gcc-patches

On Thu, Feb 9, 2023 at 4:43 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Feb 09, 2023 at 07:30:52AM -0800, H.J. Lu wrote:
> > On Thu, Feb 9, 2023 at 4:12 AM Jakub Jelinek <jakub@redhat.com> wrote:
> > > get_available_features doesn't depend on cpu_model2->__cpu_{family,model}
> > > and just sets stuff up based on CPUID leaf 1, or some extended ones,
> > > so I wonder why are we calling it separately for Intel, AMD and Zhaoxin
> > > and not for all other CPUs too?  I think various programs in the wild
> > > which aren't using __builtin_cpu_{is,supports} just check the various CPUID
> > > leafs and query bits in there, without blacklisting unknown CPU vendors,
> > > so I think even __builtin_cpu_supports ("sse2") etc. should be reliable
> > > if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf
> > > 1 or some extended ones.  Calling it for all CPUs also means it can be
> > > inlined because there will be just a single caller.
> > >
> > > I will test on Intel but can't test it on non-Intel (or with some extra
> > > effort on AMD; for both of those arches it should be really no change in
> > > behavior).
> > >
> > > Thoughts on this?
> >
> > No objection here.   It just isn't easy to verify CPUID behavior on
> > other processors.
>
> Sure, worst case it can be reverted or exceptions could be added if some
> CPUs misbehave but then we'd hopefully have detailed into on how exactly it
> behaves.
>
> FYI, I've successfully bootstrapped/regtested this on Intel i9-7960X
> and Martin Liska has regtested it with just i386.exp tests on AMD.
>
> Uros, is this ok now?

OK. Let's go forward with the patch.

Thanks,
Uros.

>
> > > 2023-02-09  Jakub Jelinek  <jakub@redhat.com>
> > >
> > >         PR target/100758
> > >         * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Formatting fixes.
> > >         (cpu_indicator_init): Call get_available_features for all CPUs with
> > >         max_level >= 1, rather than just Intel, AMD or Zhaoxin.  Formatting
> > >         fixes.
>
>         Jakub
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-02-09 16:34 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09 12:11 [PATCH] i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] Jakub Jelinek
2023-02-09 15:30 ` H.J. Lu
2023-02-09 15:43   ` Jakub Jelinek
2023-02-09 16:34     ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).