public inbox for binutils@sourceware.org
* [PATCH] x86: fix SSE4a dependencies of ".arch .nosse*"
@ 2020-02-12 17:08 Jan Beulich
  2020-02-12 17:19 ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-02-12 17:08 UTC (permalink / raw)
  To: binutils; +Cc: H.J. Lu

Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
disable SSE4a. And as per its name, ".arch .nosse4" should also do so.

gas/
2020-02-XX  Jan Beulich  <jbeulich@suse.com>

	* config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
	"nosse4" entry.

opcodes/
2020-02-XX  Jan Beulich  <jbeulich@suse.com>

	* i386-gen.c (cpu_flag_init): Move CpuSSE4a from
	CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
	CPU_ANY_SSE4_FLAGS entry.
	* i386-init.h: Re-generate.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1180,7 +1180,7 @@ static const noarch_entry cpu_noarch[] =
   { STRING_COMMA_LEN ("nossse3"),  CPU_ANY_SSSE3_FLAGS },
   { STRING_COMMA_LEN ("nosse4.1"),  CPU_ANY_SSE4_1_FLAGS },
   { STRING_COMMA_LEN ("nosse4.2"),  CPU_ANY_SSE4_2_FLAGS },
-  { STRING_COMMA_LEN ("nosse4"),  CPU_ANY_SSE4_1_FLAGS },
+  { STRING_COMMA_LEN ("nosse4"),  CPU_ANY_SSE4_FLAGS },
   { STRING_COMMA_LEN ("noavx"),  CPU_ANY_AVX_FLAGS },
   { STRING_COMMA_LEN ("noavx2"),  CPU_ANY_AVX2_FLAGS },
   { STRING_COMMA_LEN ("noavx512f"), CPU_ANY_AVX512F_FLAGS },
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -322,17 +322,19 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_MMX_FLAGS",
     "CPU_3DNOWA_FLAGS" },
   { "CPU_ANY_SSE_FLAGS",
-    "CPU_ANY_SSE2_FLAGS|CpuSSE|CpuSSE4a" },
+    "CPU_ANY_SSE2_FLAGS|CpuSSE" },
   { "CPU_ANY_SSE2_FLAGS",
     "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
   { "CPU_ANY_SSE3_FLAGS",
-    "CPU_ANY_SSSE3_FLAGS|CpuSSE3" },
+    "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
   { "CPU_ANY_SSSE3_FLAGS",
     "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
   { "CPU_ANY_SSE4_1_FLAGS",
     "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
   { "CPU_ANY_SSE4_2_FLAGS",
     "CpuSSE4_2" },
+  { "CPU_ANY_SSE4_FLAGS",
+    "CPU_ANY_SSE4_1_FLAGS|CpuSSE4a" },
   { "CPU_ANY_AVX_FLAGS",
     "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
   { "CPU_ANY_AVX2_FLAGS",

* Re: [PATCH] x86: fix SSE4a dependencies of ".arch .nosse*"
  2020-02-12 17:08 [PATCH] x86: fix SSE4a dependencies of ".arch .nosse*" Jan Beulich
@ 2020-02-12 17:19 ` H.J. Lu
  2020-02-16 16:48   ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-12 17:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Wed, Feb 12, 2020 at 9:08 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
> disable SSE4a. And as per its name, ".arch .nosse4" should also do so.
>
> gas/
> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
>         "nosse4" entry.
>
> opcodes/
> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
>
>         * i386-gen.c (cpu_flag_init): Move CpuSSE4a from
>         CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
>         CPU_ANY_SSE4_FLAGS entry.
>         * i386-init.h: Re-generate.
>

OK.

Thanks.

-- 
H.J.

* [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-12 17:19 ` H.J. Lu
@ 2020-02-16 16:48   ` H.J. Lu
  2020-02-17  1:06     ` Alan Modra
  2020-02-17 15:27     ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 Jan Beulich
  0 siblings, 2 replies; 16+ messages in thread
From: H.J. Lu @ 2020-02-16 16:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

[-- Attachment #1: Type: text/plain, Size: 1018 bytes --]

On Wed, Feb 12, 2020 at 9:18 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Wed, Feb 12, 2020 at 9:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
> > disable SSE4a. And as per its name, ".arch .nosse4" should also do so.
> >
> > gas/
> > 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >
> >         * config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
> >         "nosse4" entry.
> >
> > opcodes/
> > 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >
> >         * i386-gen.c (cpu_flag_init): Move CpuSSE4a from
> >         CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
> >         CPU_ANY_SSE4_FLAGS entry.
> >         * i386-init.h: Re-generate.
> >
>
> OK.
>
> Thanks.

commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disabling Intel
SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
adds .sse4a and nosse4a.

-- 
H.J.

[-- Attachment #2: 0001-x86-Don-t-disable-SSE4a-when-disabling-SSE4.patch --]
[-- Type: application/x-patch, Size: 5736 bytes --]

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-16 16:48   ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 H.J. Lu
@ 2020-02-17  1:06     ` Alan Modra
  2020-02-17  1:20       ` H.J. Lu
  2020-02-17 15:27     ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 Jan Beulich
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Modra @ 2020-02-17  1:06 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jan Beulich, binutils

On Sun, Feb 16, 2020 at 08:47:56AM -0800, H.J. Lu wrote:
> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> adds .sse4a and nosse4a.

diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 79f4cc9d25..45106bcf6d 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -326,6 +326,8 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_SSE2_FLAGS",
     "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
   { "CPU_ANY_SSE3_FLAGS",
+  { "CPU_ANY_SSE4A_FLAGS",
+    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
     "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
   { "CPU_ANY_SSSE3_FLAGS",
     "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
@@ -333,8 +335,6 @@ static initializer cpu_flag_init[] =
     "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
   { "CPU_ANY_SSE4_2_FLAGS",
     "CpuSSE4_2" },
-  { "CPU_ANY_SSE4_FLAGS",
-    "CPU_ANY_SSE4_1_FLAGS|CpuSSE4a" },
   { "CPU_ANY_AVX_FLAGS",
     "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
   { "CPU_ANY_AVX2_FLAGS",

Merge error?

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17  1:06     ` Alan Modra
@ 2020-02-17  1:20       ` H.J. Lu
  2020-02-17  1:32         ` Alan Modra
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-17  1:20 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jan Beulich, binutils

On Sun, Feb 16, 2020 at 5:06 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Sun, Feb 16, 2020 at 08:47:56AM -0800, H.J. Lu wrote:
> > commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> > a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> > SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> > adds .sse4a and nosse4a.
>
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index 79f4cc9d25..45106bcf6d 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -326,6 +326,8 @@ static initializer cpu_flag_init[] =
>    { "CPU_ANY_SSE2_FLAGS",
>      "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
>    { "CPU_ANY_SSE3_FLAGS",
> +  { "CPU_ANY_SSE4A_FLAGS",
> +    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
>      "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
>    { "CPU_ANY_SSSE3_FLAGS",
>      "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
> @@ -333,8 +335,6 @@ static initializer cpu_flag_init[] =
>      "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
>    { "CPU_ANY_SSE4_2_FLAGS",
>      "CpuSSE4_2" },
> -  { "CPU_ANY_SSE4_FLAGS",
> -    "CPU_ANY_SSE4_1_FLAGS|CpuSSE4a" },
>    { "CPU_ANY_AVX_FLAGS",
>      "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
>    { "CPU_ANY_AVX2_FLAGS",
>
> Merge error?

Is there anything wrong?


-- 
H.J.

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17  1:20       ` H.J. Lu
@ 2020-02-17  1:32         ` Alan Modra
  2020-02-17  3:12           ` Alan Modra
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Modra @ 2020-02-17  1:32 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jan Beulich, binutils

On Sun, Feb 16, 2020 at 05:19:39PM -0800, H.J. Lu wrote:
> On Sun, Feb 16, 2020 at 5:06 PM Alan Modra <amodra@gmail.com> wrote:
> >
> > On Sun, Feb 16, 2020 at 08:47:56AM -0800, H.J. Lu wrote:
> > > commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> > > a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> > > SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> > > adds .sse4a and nosse4a.
> >
> > diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> > index 79f4cc9d25..45106bcf6d 100644
> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -326,6 +326,8 @@ static initializer cpu_flag_init[] =
> >    { "CPU_ANY_SSE2_FLAGS",
> >      "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
> >    { "CPU_ANY_SSE3_FLAGS",
> > +  { "CPU_ANY_SSE4A_FLAGS",
> > +    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
> >      "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
> >    { "CPU_ANY_SSSE3_FLAGS",
> >      "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
> > @@ -333,8 +335,6 @@ static initializer cpu_flag_init[] =
> >      "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
> >    { "CPU_ANY_SSE4_2_FLAGS",
> >      "CpuSSE4_2" },
> > -  { "CPU_ANY_SSE4_FLAGS",
> > -    "CPU_ANY_SSE4_1_FLAGS|CpuSSE4a" },
> >    { "CPU_ANY_AVX_FLAGS",
> >      "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
> >    { "CPU_ANY_AVX2_FLAGS",
> >
> > Merge error?
> 
> Is there anything wrong?

It doesn't compile.  The CPU_ANY_SSE4A_FLAGS entry is added inside the
CPU_ANY_SSE3_FLAGS entry.  Take a look at the diff.

-- 
Alan Modra
Australia Development Lab, IBM

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17  1:32         ` Alan Modra
@ 2020-02-17  3:12           ` Alan Modra
  2020-02-17  4:16             ` [committed, PATCH] Don't disable SSE3 when disabling SSE4a H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Modra @ 2020-02-17  3:12 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jan Beulich, binutils

On Mon, Feb 17, 2020 at 12:01:56PM +1030, Alan Modra wrote:
> It doesn't compile.  The CPU_ANY_SSE4A_FLAGS entry is added inside the
> CPU_ANY_SSE3_FLAGS entry.  Take a look at the diff.

Since it is probably getting late for you, I committed this to fix the
problem.

	* i386-gen.c (cpu_flag_init): Correct last change.

diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 45106bcf6d..407479261c 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -326,8 +326,6 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_SSE2_FLAGS",
     "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
   { "CPU_ANY_SSE3_FLAGS",
-  { "CPU_ANY_SSE4A_FLAGS",
-    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
     "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
   { "CPU_ANY_SSSE3_FLAGS",
     "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
@@ -335,6 +333,8 @@ static initializer cpu_flag_init[] =
     "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
   { "CPU_ANY_SSE4_2_FLAGS",
     "CpuSSE4_2" },
+  { "CPU_ANY_SSE4A_FLAGS",
+    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
   { "CPU_ANY_AVX_FLAGS",
     "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
   { "CPU_ANY_AVX2_FLAGS",

-- 
Alan Modra
Australia Development Lab, IBM

* [committed, PATCH] Don't disable SSE3 when disabling SSE4a
  2020-02-17  3:12           ` Alan Modra
@ 2020-02-17  4:16             ` H.J. Lu
  0 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2020-02-17  4:16 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jan Beulich, binutils

[-- Attachment #1: Type: text/plain, Size: 1342 bytes --]

On Sun, Feb 16, 2020 at 7:12 PM Alan Modra <amodra@gmail.com> wrote:
>
> On Mon, Feb 17, 2020 at 12:01:56PM +1030, Alan Modra wrote:
> > It doesn't compile.  The CPU_ANY_SSE4A_FLAGS entry is added inside the
> > CPU_ANY_SSE3_FLAGS entry.  Take a look at the diff.
>
> Since it is probably getting late for you I committed this to fix the
> problem.
>
>         * i386-gen.c (cpu_flag_init): Correct last change.
>
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index 45106bcf6d..407479261c 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -326,8 +326,6 @@ static initializer cpu_flag_init[] =
>    { "CPU_ANY_SSE2_FLAGS",
>      "CPU_ANY_SSE3_FLAGS|CpuSSE2" },
>    { "CPU_ANY_SSE3_FLAGS",
> -  { "CPU_ANY_SSE4A_FLAGS",
> -    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
>      "CPU_ANY_SSSE3_FLAGS|CpuSSE3|CpuSSE4a" },
>    { "CPU_ANY_SSSE3_FLAGS",
>      "CPU_ANY_SSE4_1_FLAGS|CpuSSSE3" },
> @@ -335,6 +333,8 @@ static initializer cpu_flag_init[] =
>      "CPU_ANY_SSE4_2_FLAGS|CpuSSE4_1" },
>    { "CPU_ANY_SSE4_2_FLAGS",
>      "CpuSSE4_2" },
> +  { "CPU_ANY_SSE4A_FLAGS",
> +    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
>    { "CPU_ANY_AVX_FLAGS",
>      "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
>    { "CPU_ANY_AVX2_FLAGS",
>

I checked in this patch to avoid disabling SSE3 when
disabling SSE4a.


-- 
H.J.

[-- Attachment #2: 0001-x86-Don-t-disable-SSE3-when-disabling-SSE4a.patch --]
[-- Type: text/x-patch, Size: 1972 bytes --]

From ce504911e5c4068a3498eebde4064b24382c7598 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sun, 16 Feb 2020 20:10:20 -0800
Subject: [PATCH] x86: Don't disable SSE3 when disabling SSE4a

Since SSE3 is independent of SSE4a, don't disable SSE3 when disabling
SSE4a.

	* i386-gen.c (cpu_flag_init): Remove CPU_ANY_SSE3_FLAGS from
	CPU_ANY_SSE4A_FLAGS.
---
 opcodes/ChangeLog   | 5 +++++
 opcodes/i386-gen.c  | 2 +-
 opcodes/i386-init.h | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/opcodes/ChangeLog b/opcodes/ChangeLog
index 908afdd9a0..9d02fc45e7 100644
--- a/opcodes/ChangeLog
+++ b/opcodes/ChangeLog
@@ -1,3 +1,8 @@
+2020-02-16  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* i386-gen.c (cpu_flag_init): Remove CPU_ANY_SSE3_FLAGS from
+	CPU_ANY_SSE4A_FLAGS.
+
 2020-02-17  Alan Modra  <amodra@gmail.com>
 
 	* i386-gen.c (cpu_flag_init): Correct last change.
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 407479261c..4d98d31b74 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -334,7 +334,7 @@ static initializer cpu_flag_init[] =
   { "CPU_ANY_SSE4_2_FLAGS",
     "CpuSSE4_2" },
   { "CPU_ANY_SSE4A_FLAGS",
-    "CPU_ANY_SSE3_FLAGS|CpuSSE4a" },
+    "CpuSSE4a" },
   { "CPU_ANY_AVX_FLAGS",
     "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
   { "CPU_ANY_AVX2_FLAGS",
diff --git a/opcodes/i386-init.h b/opcodes/i386-init.h
index d4674fc02a..36660b109b 100644
--- a/opcodes/i386-init.h
+++ b/opcodes/i386-init.h
@@ -1172,7 +1172,7 @@
 
 #define CPU_ANY_SSE4A_FLAGS \
   { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-      0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
+      0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
-- 
2.24.1


* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-16 16:48   ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 H.J. Lu
  2020-02-17  1:06     ` Alan Modra
@ 2020-02-17 15:27     ` Jan Beulich
  2020-02-17 15:30       ` H.J. Lu
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-02-17 15:27 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 16.02.2020 17:47, H.J. Lu wrote:
> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> adds .sse4a and nosse4a.

And where is it said that "nosse4" means only the Intel flavors? As
said in the commit message of said change, to me the clear implication
is that anything called SSE4* will get disabled.

Jan

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 15:27     ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 Jan Beulich
@ 2020-02-17 15:30       ` H.J. Lu
  2020-02-17 15:32         ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-17 15:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.02.2020 17:47, H.J. Lu wrote:
> > commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> > a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> > SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> > adds .sse4a and nosse4a.
>
> And where is it said that "nosse4" means only the Intel flavors? As
> said in the commit message of said change, to me the clear implication
> is that anything called SSE4* will get disabled.
>

SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
SSE4a from AMD is unrelated to Intel SSE4.


-- 
H.J.

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 15:30       ` H.J. Lu
@ 2020-02-17 15:32         ` Jan Beulich
  2020-02-17 15:45           ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-02-17 15:32 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 17.02.2020 16:30, H.J. Lu wrote:
> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 16.02.2020 17:47, H.J. Lu wrote:
>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
>>> adds .sse4a and nosse4a.
>>
>> And where is it said that "nosse4" means only the Intel flavors? As
>> said in the commit message of said change, to me the clear implication
>> is that anything called SSE4* will get disabled.
>>
> 
> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> SSE4a from AMD is unrelated from Intel SSE4.

Repeating my question then: Where is this being said? (Best imo
would be to delete ".arch .nosse4" support then, eliminating
the ambiguity.)

Jan

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 15:32         ` Jan Beulich
@ 2020-02-17 15:45           ` H.J. Lu
  2020-02-17 15:49             ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-17 15:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2020 16:30, H.J. Lu wrote:
> > On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 16.02.2020 17:47, H.J. Lu wrote:
> >>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> >>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> >>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> >>> adds .sse4a and nosse4a.
> >>
> >> And where is it said that "nosse4" means only the Intel flavors? As
> >> said in the commit message of said change, to me the clear implication
> >> is that anything called SSE4* will get disabled.
> >>
> >
> > SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> > SSE4a from AMD is unrelated from Intel SSE4.
>
> Repeating my question then: Where is this being said? (Best imo
> would be to delete ".arch .nosse4" support then, eliminating
> the ambiguity.)

We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
feel free to add documentation.

-- 
H.J.

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 15:45           ` H.J. Lu
@ 2020-02-17 15:49             ` Jan Beulich
  2020-02-17 16:53               ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-02-17 15:49 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 17.02.2020 16:44, H.J. Lu wrote:
> On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 17.02.2020 16:30, H.J. Lu wrote:
>>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 16.02.2020 17:47, H.J. Lu wrote:
>>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
>>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
>>>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
>>>>> adds .sse4a and nosse4a.
>>>>
>>>> And where is it said that "nosse4" means only the Intel flavors? As
>>>> said in the commit message of said change, to me the clear implication
>>>> is that anything called SSE4* will get disabled.
>>>>
>>>
>>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
>>> SSE4a from AMD is unrelated from Intel SSE4.
>>
>> Repeating my question then: Where is this being said? (Best imo
>> would be to delete ".arch .nosse4" support then, eliminating
>> the ambiguity.)
> 
> We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
> feel free to add documentation.

If it's not documented, then it's not clear at all what the intention
is. I'm certainly not going to add documentation saying something that
I don't believe should be said. I.e. if I were to add documentation
here, it'd say .nosse4 covers all three SSE4* variants (and it would
then be a bug of the implementation that this isn't the case).

Just like for the MOVSX/MOVZX issue, I really dislike you making
statements of things that were (apparently) never settled on.

Jan

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 15:49             ` Jan Beulich
@ 2020-02-17 16:53               ` H.J. Lu
  2020-02-17 17:01                 ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: H.J. Lu @ 2020-02-17 16:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

On Mon, Feb 17, 2020 at 7:49 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2020 16:44, H.J. Lu wrote:
> > On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 17.02.2020 16:30, H.J. Lu wrote:
> >>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 16.02.2020 17:47, H.J. Lu wrote:
> >>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> >>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disable Intel
> >>>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> >>>>> adds .sse4a and nosse4a.
> >>>>
> >>>> And where is it said that "nosse4" means only the Intel flavors? As
> >>>> said in the commit message of said change, to me the clear implication
> >>>> is that anything called SSE4* will get disabled.
> >>>>
> >>>
> >>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> >>> SSE4a from AMD is unrelated from Intel SSE4.
> >>
> >> Repeating my question then: Where is this being said? (Best imo
> >> would be to delete ".arch .nosse4" support then, eliminating
> >> the ambiguity.)
> >
> > We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
> > feel free to add documentation.
>
> If it's not documented, then it's not clear at all what the intention
> is. I'm certainly not going to add documentation saying something that
> I don't believe should be said. I.e. if I were to add documentation
> here, it'd say .nosse4 covers all three SSE4* variants (and it would
> then be a bug of the implementation that this isn't the case).

From gcc/config/i386/i386.opt:

msse4.1
Target Report Mask(ISA_SSE4_1) Var(ix86_isa_flags) Save
Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
code generation.

msse4.2
Target Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
functions and code generation.

msse4
Target RejectNegative Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
functions and code generation.

mno-sse4
Target RejectNegative Report InverseMask(ISA_SSE4_1) Var(ix86_isa_flags) Save
Do not support SSE4.1 and SSE4.2 built-in functions and code generation.

SSE4 is for Intel SSE4 only.

> Just like for the MOVSX/MOVZX issue, I really dislike you making
> statements of things that were (apparently) never settled on.
>
> Jan

-- 
H.J.

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 16:53               ` H.J. Lu
@ 2020-02-17 17:01                 ` Jan Beulich
  2020-02-17 17:05                   ` H.J. Lu
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2020-02-17 17:01 UTC (permalink / raw)
  To: H.J. Lu; +Cc: binutils

On 17.02.2020 17:52, H.J. Lu wrote:
> On Mon, Feb 17, 2020 at 7:49 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 17.02.2020 16:44, H.J. Lu wrote:
>>> On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 17.02.2020 16:30, H.J. Lu wrote:
>>>>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 16.02.2020 17:47, H.J. Lu wrote:
>>>>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
>>>>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disabling Intel
>>>>>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
>>>>>>> adds .sse4a and nosse4a.
>>>>>>
>>>>>> And where is it said that "nosse4" means only the Intel flavors? As
>>>>>> said in the commit message of said change, to me the clear implication
>>>>>> is that anything called SSE4* will get disabled.
>>>>>>
>>>>>
>>>>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
>>>>> SSE4a from AMD is unrelated to Intel SSE4.
>>>>
>>>> Repeating my question then: Where is this being said? (Best imo
>>>> would be to delete ".arch .nosse4" support then, eliminating
>>>> the ambiguity.)
>>>
>>> We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
>>> feel free to add documentation.
>>
>> If it's not documented, then it's not clear at all what the intention
>> is. I'm certainly not going to add documentation saying something that
>> I don't believe should be said. I.e. if I were to add documentation
>> here, it'd say .nosse4 covers all three SSE4* variants (and it would
>> then be a bug of the implementation that this isn't the case).
> 
> From gcc/config/i386/i386.opt:
> 
> msse4.1
> Target Report Mask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
> code generation.
> 
> msse4.2
> Target Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> functions and code generation.
> 
> msse4
> Target RejectNegative Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> functions and code generation.
> 
> mno-sse4
> Target RejectNegative Report InverseMask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> Do not support SSE4.1 and SSE4.2 built-in functions and code generation.
> 
> SSE4 is for Intel SSE4 only.

Hmm, okay, that's gcc, not gas, but at least something.

Jan

* Re: [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
  2020-02-17 17:01                 ` Jan Beulich
@ 2020-02-17 17:05                   ` H.J. Lu
  0 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2020-02-17 17:05 UTC
  To: Jan Beulich; +Cc: binutils

On Mon, Feb 17, 2020 at 9:01 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2020 17:52, H.J. Lu wrote:
> > On Mon, Feb 17, 2020 at 7:49 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 17.02.2020 16:44, H.J. Lu wrote:
> >>> On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 17.02.2020 16:30, H.J. Lu wrote:
> >>>>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 16.02.2020 17:47, H.J. Lu wrote:
> >>>>>>> On Wed, Feb 12, 2020 at 9:18 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Feb 12, 2020 at 9:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
> >>>>>>>>> disable SSE4a. And as per its name, ".arch .nosse4" should also do so.
> >>>>>>>>>
> >>>>>>>>> gas/
> >>>>>>>>> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>>         * config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
> >>>>>>>>>         "nosse4" entry.
> >>>>>>>>>
> >>>>>>>>> opcodes/
> >>>>>>>>> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>>         * i386-gen.c (cpu_flag_init): Move CpuSSE4a from
> >>>>>>>>>         CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
> >>>>>>>>>         CPU_ANY_SSE4_FLAGS entry.
> >>>>>>>>>         * i386-init.h: Re-generate.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>
> >>>>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> >>>>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disabling Intel
> >>>>>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> >>>>>>> adds .sse4a and nosse4a.
> >>>>>>
> >>>>>> And where is it said that "nosse4" means only the Intel flavors? As
> >>>>>> said in the commit message of said change, to me the clear implication
> >>>>>> is that anything called SSE4* will get disabled.
> >>>>>>
> >>>>>
> >>>>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> >>>>> SSE4a from AMD is unrelated to Intel SSE4.
> >>>>
> >>>> Repeating my question then: Where is this being said? (Best imo
> >>>> would be to delete ".arch .nosse4" support then, eliminating
> >>>> the ambiguity.)
> >>>
> >>> We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
> >>> feel free to add documentation.
> >>
> >> If it's not documented, then it's not clear at all what the intention
> >> is. I'm certainly not going to add documentation saying something that
> >> I don't believe should be said. I.e. if I were to add documentation
> >> here, it'd say .nosse4 covers all three SSE4* variants (and it would
> >> then be a bug of the implementation that this isn't the case).
> >
> > From gcc/config/i386/i386.opt:
> >
> > msse4.1
> > Target Report Mask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
> > code generation.
> >
> > msse4.2
> > Target Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > msse4
> > Target RejectNegative Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > mno-sse4
> > Target RejectNegative Report InverseMask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Do not support SSE4.1 and SSE4.2 built-in functions and code generation.
> >
> > SSE4 is for Intel SSE4 only.
>
> Hmm, okay, that's gcc, not gas, but at least something.
>

Can you add a sentence about SSE4 to the gas manual?

-- 
H.J.

Thread overview: 16+ messages
2020-02-12 17:08 [PATCH] x86: fix SSE4a dependencies of ".arch .nosse*" Jan Beulich
2020-02-12 17:19 ` H.J. Lu
2020-02-16 16:48   ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 H.J. Lu
2020-02-17  1:06     ` Alan Modra
2020-02-17  1:20       ` H.J. Lu
2020-02-17  1:32         ` Alan Modra
2020-02-17  3:12           ` Alan Modra
2020-02-17  4:16             ` [committed, PATCH] Don't disable SSE3 when disabling SSE4a H.J. Lu
2020-02-17 15:27     ` [committed, PATCH] x86: Don't disable SSE4a when disabling SSE4 Jan Beulich
2020-02-17 15:30       ` H.J. Lu
2020-02-17 15:32         ` Jan Beulich
2020-02-17 15:45           ` H.J. Lu
2020-02-17 15:49             ` Jan Beulich
2020-02-17 16:53               ` H.J. Lu
2020-02-17 17:01                 ` Jan Beulich
2020-02-17 17:05                   ` H.J. Lu