public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] x86: add missing APX logic to cpu_flags_match()
@ 2024-01-05 12:15 Jan Beulich
  2024-01-08  3:17 ` Cui, Lili
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2024-01-05 12:15 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu, Lili Cui

As already indicated during review, we can't get away without certain
adjustments here: Without these, respective {evex}-prefixed insns are
assembled to APX encodings even when APX_F is turned off.

While there also extend the respective comment in the opcode table, to
explain why this construct is used.
---
Strictly speaking we could go with just cpuid|APX_F in the templates,
with the assertions dropped and instead an "else" added. But I think
we're better off this way, for being less prone to introducing mistakes
later on.

The resulting diagnostics aren't quite correct (because of not
mentioning the {evex} prefix, which really is what's the problem there),
but improving diagnostics is a wider topic anyway.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1940,6 +1940,30 @@ cpu_flags_match (const insn_template *t)
 	      any.bitfield.cpuavx512vl = 0;
 	    }
 	}
+
+      /* Dual non-APX/APX templates need massaging from what APX_F() in the
+         opcode table has produced.  While the direct transformation of the
+         incoming cpuid&(cpuid|APX_F) would be to cpuid&(cpuid) / cpuid&(APX_F)
+         respectively, it's cheaper to move to just cpuid / cpuid&APX_F
+         instead.  */
+      if (any.bitfield.cpuapx_f
+	  && (any.bitfield.cpubmi || any.bitfield.cpubmi2
+	      || any.bitfield.cpuavx512f || any.bitfield.cpuavx512bw
+	      || any.bitfield.cpuavx512dq || any.bitfield.cpuamx_tile
+	      || any.bitfield.cpucmpccxadd))
+	{
+	  /* These checks (verifying that APX_F() was properly used in the
+	     opcode table entry) make sure there's no need for an "else" to
+	     the "if()" below.  */
+	  gas_assert (!cpu_flags_all_zero (&all));
+	  cpu = cpu_flags_and (all, any);
+	  gas_assert (cpu_flags_equal (&cpu, &all));
+
+	  if (need_evex_encoding (t))
+	    all = any;
+
+	  memset (&any, 0, sizeof (any));
+	}
     }
 
   if (flag_code != CODE_64BIT)
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -143,7 +143,13 @@
 
 #define DstVVVV VexVVVV=VexVVVV_DST
 
-// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f.
+// The template supports VEX format for cpuid and EVEX format for cpuid & APX_F.
+// While therefore we really mean cpuid|(cpuid&APX_F) here, this can't be
+// expressed in the generated templates.  It's equivalent to just cpuid|APX_F
+// anyway, but that is not what we want (as APX_F alone isn't a sufficient
+// prereq for such insns). Instead the assembler will massage the CPU specifier
+// to the equivalent of either cpuid&(cpuid) or cpuid&(APX_F) (or something
+// substantially similar), depending on what encoding was requested.
 #define APX_F(cpuid) cpuid&(cpuid|APX_F)
 
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
--- a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
@@ -13,6 +13,13 @@
 .*:25: Error: `andn' is not supported on `x86_64.nobmi'
 .*:28: Error: `bzhi' is not supported on `x86_64.nobmi2'
 .*:29: Error: `bzhi' is not supported on `x86_64.nobmi2'
+.*:33: Error: .*`andn'.*
+.*:34: Error: .*`bzhi'.*
+.*:35: Error: .*`kmovw'.*
+.*:36: Error: .*`kmovq'.*
+.*:37: Error: .*`kmovb'.*
+.*:38: Error: .*`ldtilecfg'.*
+.*:39: Error: .*`cmpexadd'.*
 GAS LISTING .*
 #...
 [ 	]*1[ 	]+\# Check illegal 64bit APX EVEX promoted instructions
--- a/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
@@ -27,3 +27,13 @@
 	.arch .nobmi2
 	bzhi %r16,%r15,%r11
 	bzhi %r15,%r15,%r11
+
+	.arch default
+	.arch .noapx_f
+	{evex} andn %r15, %r15, %r11
+	{evex} bzhi %r15, %r15, %r11
+	{evex} kmovw %k1, %r8d
+	{evex} kmovq %k1, %r8
+	{evex} kmovb %k1, %r8d
+	{evex} ldtilecfg (%r8)
+	{evex} cmpexadd %rax, %rcx, (%r8)

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-01-10  1:44 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-05 12:15 [PATCH] x86: add missing APX logic to cpu_flags_match() Jan Beulich
2024-01-08  3:17 ` Cui, Lili
2024-01-08  7:56   ` Jan Beulich
2024-01-08  8:30     ` Cui, Lili
2024-01-08  8:58       ` Jan Beulich
2024-01-08 10:28         ` Cui, Lili
2024-01-08 10:38           ` Jan Beulich
2024-01-09  5:36             ` Cui, Lili
2024-01-09  8:30               ` Jan Beulich
2024-01-09 11:00                 ` Cui, Lili
2024-01-09 11:07                   ` Jan Beulich
2024-01-10  1:44                     ` Cui, Lili

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).