public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v5 0/8] x86: suffix handling changes
@ 2022-10-25  7:23 Jan Beulich
  2022-10-25  7:24 ` [PATCH v5 1/8] x86: constify parse_insn()'s input Jan Beulich
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:23 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

... accompanied by a few other improvements (or so I hope) found
along the way.

The only change in v5 is the re-ordering of patches, such that the
(controversial) change for move-with-sign-extend is last (and hence
other patches can - hopefully - go in while the controversy is being
resolved).

1: constify parse_insn()'s input
2: introduce Pass2 insn attribute
3: re-work insn/suffix recognition
4: don't recognize/derive Q suffix in the common case
5: allow HLE store of accumulator to absolute 32-bit address
6: move bad-use-of-TLS-reloc check
7: drop (now) stray IsString
8: further re-work insn/suffix recognition to also cover MOVSX

Note that patches 4...6 were previously approved (albeit 4 has now
changed because of the re-ordering of the series), but can't be
committed ahead of the earlier ones.

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 1/8] x86: constify parse_insn()'s input
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
@ 2022-10-25  7:24 ` Jan Beulich
  2022-10-25  7:25 ` [PATCH v5 1/8] x86: introduce Pass2 insn attribute Jan Beulich
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:24 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

The function doesn't alter its input buffer: Reflect this in its
prototype. To avoid using any kind of cast, simply calculate the update
of "line" from the function's input and output.
---
Avoiding a cast the way it's done here is going to be useful in
subsequent patches (there'll be code added between the call and the
update of "line"), so switching to a cast is not really an option.
---
v3: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -157,7 +157,7 @@ static int i386_intel_operand (char *, i
 static int i386_intel_simplify (expressionS *);
 static int i386_intel_parse_name (const char *, expressionS *);
 static const reg_entry *parse_register (char *, char **);
-static char *parse_insn (char *, char *);
+static const char *parse_insn (const char *, char *);
 static char *parse_operands (char *, const char *);
 static void swap_operands (void);
 static void swap_2_operands (unsigned int, unsigned int);
@@ -4819,6 +4819,7 @@ md_assemble (char *line)
 {
   unsigned int j;
   char mnemonic[MAX_MNEM_SIZE], mnem_suffix;
+  const char *end;
   const insn_template *t;
 
   /* Initialize globals.  */
@@ -4834,9 +4835,10 @@ md_assemble (char *line)
      We assume that the scrubber has arranged it so that line[0] is the valid
      start of a (possibly prefixed) mnemonic.  */
 
-  line = parse_insn (line, mnemonic);
-  if (line == NULL)
+  end = parse_insn (line, mnemonic);
+  if (end == NULL)
     return;
+  line += end - line;
   mnem_suffix = i.suffix;
 
   line = parse_operands (line, mnemonic);
@@ -5216,11 +5218,10 @@ md_assemble (char *line)
     last_insn.kind = last_insn_other;
 }
 
-static char *
-parse_insn (char *line, char *mnemonic)
+static const char *
+parse_insn (const char *line, char *mnemonic)
 {
-  char *l = line;
-  char *token_start = l;
+  const char *l = line, *token_start = l;
   char *mnem_p;
   int supported;
   const insn_template *t;


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 1/8] x86: introduce Pass2 insn attribute
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
  2022-10-25  7:24 ` [PATCH v5 1/8] x86: constify parse_insn()'s input Jan Beulich
@ 2022-10-25  7:25 ` Jan Beulich
  2022-10-25  7:30   ` [PATCH v5 2/8] " Jan Beulich
  2022-10-25  7:26 ` [PATCH v5 3/8] x86: re-work insn/suffix recognition Jan Beulich
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:25 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Subsequently we will want to mark templates for mnemonics which, due to
their last character also possibly being a suffix, are "shadowing" other
templates (and hence may require a second parsing pass). Introduce this
attribute in a standalone patch, largely to ease looking at the actual
later change. To simplify logic the later change is going to look for
the new attribute only on the first template of a group. Make i386-gen
warn about misplaced Pass2 (redundant ones are okay).

While there convert "active_isstring" to bool, matching the new static
variable.
---
As to redundant attributes: An open question is whether - to avoid
surprises when e.g. templates are re-ordered - it wouldn't be desirable
to add Pass2 on all templates of a group.
---
v3: New.

--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -18,6 +18,7 @@
    MA 02110-1301, USA.  */
 
 #include "sysdep.h"
+#include <stdbool.h>
 #include <stdio.h>
 #include <errno.h>
 #include "getopt.h"
@@ -702,6 +703,7 @@ static bitfield opcode_modifiers[] =
   BITFIELD (FWait),
   BITFIELD (IsString),
   BITFIELD (RegMem),
+  BITFIELD (Pass2),
   BITFIELD (BNDPrefixOk),
   BITFIELD (RegKludge),
   BITFIELD (Implicit1stXmm0),
@@ -800,7 +802,7 @@ static bitfield operand_types[] =
 
 static const char *filename;
 static i386_cpu_flags active_cpu_flags;
-static int active_isstring;
+static bool active_isstring, active_pass2;
 
 struct template_arg {
   const struct template_arg *next;
@@ -1168,7 +1170,8 @@ process_i386_opcode_modifier (FILE *tabl
   char *str, *next, *last;
   bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
 
-  active_isstring = 0;
+  active_isstring = false;
+  active_pass2 = false;
 
   /* Copy the default opcode modifier.  */
   memcpy (modifiers, opcode_modifiers, sizeof (modifiers));
@@ -1191,8 +1194,11 @@ process_i386_opcode_modifier (FILE *tabl
 
 	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
 			    lineno);
+
 	      if (strcasecmp(str, "IsString") == 0)
-		active_isstring = 1;
+		active_isstring = true;
+	      if (strcasecmp(str, "Pass2") == 0)
+		active_pass2 = true;
 
 	      if (strcasecmp(str, "W") == 0)
 		have_w = 1;
@@ -1867,6 +1873,7 @@ process_i386_opcodes (FILE *table)
   for (j = 0; j < i; j++)
     {
       struct opcode_hash_entry *next;
+      bool first_pass2 = false;
 
       for (next = opcode_array[j]; next; next = next->next)
 	{
@@ -1875,6 +1882,14 @@ process_i386_opcodes (FILE *table)
 	  lineno = next->lineno;
 	  last = str + strlen (str);
 	  output_i386_opcode (table, name, str, last, lineno);
+
+	  if (next == opcode_array[j])
+	    first_pass2 = active_pass2;
+	  else if (!first_pass2 && active_pass2)
+	    fprintf (stderr,
+		     "Warning: %s:%d: %s: "
+		     "'Pass2' is recognized only on the first template in a group\n",
+		     filename, lineno, name);
 	}
     }
 
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -504,6 +504,9 @@ enum
      Normally, it will be encoded in the reg field. We add a RegMem
      flag to indicate that it should be encoded in the regmem field.  */
   RegMem,
+  /* The (unsuffixed) mnemonic represents itself the suffixed form of another
+     mnemonic, potentially requiring a second parsing pass.  */
+  Pass2,
   /* quick test if branch instruction is MPX supported */
   BNDPrefixOk,
   /* fake an extra reg operand for clr, imul and special register
@@ -730,6 +733,7 @@ typedef struct i386_opcode_modifier
   unsigned int fwait:1;
   unsigned int isstring:2;
   unsigned int regmem:1;
+  unsigned int pass2:1;
   unsigned int bndprefixok:1;
   unsigned int regkludge:1;
   unsigned int implicit1stxmm0:1;


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
  2022-10-25  7:24 ` [PATCH v5 1/8] x86: constify parse_insn()'s input Jan Beulich
  2022-10-25  7:25 ` [PATCH v5 1/8] x86: introduce Pass2 insn attribute Jan Beulich
@ 2022-10-25  7:26 ` Jan Beulich
  2022-10-27 17:21   ` H.J. Lu
  2022-10-25  7:26 ` [PATCH v5 4/8] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:26 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Having templates with a suffix explicitly present has always been
quirky. Introduce a 2nd matching pass in case the 1st one couldn't find
a suitable template _and_ didn't itself already need to trim off a
suffix to find a match at all. This requires error reporting adjustments
(albeit luckily fewer than I was afraid might be necessary), as errors
previously reported during matching now need deferring until after the
2nd pass (because, obviously, we must not emit any error if the 2nd pass
succeeds). While also related to PR gas/29524, it was requested that
move-with-sign-extend be left as broken as it always was.

PR gas/29525
Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
templates taking operands, mixed IsString/non-IsString template groups
(with memory operands) cannot occur anymore. With that
maybe_adjust_templates() becomes unnecessary (and is hence being
removed).

PR gas/29526
Note further that while the additions to the intel16 testcase aren't
really proper Intel syntax, we've been permitting all of those except
for the MOVD variant. The test therefore is to avoid re-introducing such
an inconsistency.
---
To limit code churn I'm using "goto" for the retry loop, but I'd be
happy to make this a proper loop either right here or in a follow-on
change doing just the necessary re-indentation.

The "too many memory references" errors which are being deleted weren't
fully consistent anyway - even the majority of IsString insns accepts
only a single memory operand. If we wanted to retain that, it would need
re-introducing in md_assemble(), latching the error into i.error just
like match_template() does.

Why is "MOVQ $imm64, %reg64" being optimized but "MOVABS $imm64, %reg64"
is not?
---
v5: Split off move-with-sign-extend changes.
v4: Retain support for operand-less MOVSD and CMPSD.
v3: Limit xstrdup() to just the templates where a 2nd pass actually
    makes sense (new Pass2 attribute).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -297,9 +297,6 @@ struct _i386_insn
        explicit segment overrides are given.  */
     const reg_entry *seg[2];
 
-    /* Copied first memory operand string, for re-checking.  */
-    char *memop1_string;
-
     /* PREFIX holds all the given prefix opcodes (usually null).
        PREFIXES is the number of prefix opcodes.  */
     unsigned int prefixes;
@@ -4272,7 +4269,20 @@ optimize_encoding (void)
 	   movq $imm31, %r64   -> movl $imm31, %r32
 	   movq $imm32, %r64   -> movl $imm32, %r32
         */
-      i.tm.opcode_modifier.norex64 = 1;
+      i.tm.opcode_modifier.size = SIZE32;
+      if (i.imm_operands)
+	{
+	  i.types[0].bitfield.imm32 = 1;
+	  i.types[0].bitfield.imm32s = 0;
+	  i.types[0].bitfield.imm64 = 0;
+	}
+      else
+	{
+	  i.types[0].bitfield.dword = 1;
+	  i.types[0].bitfield.qword = 0;
+	}
+      i.types[1].bitfield.dword = 1;
+      i.types[1].bitfield.qword = 0;
       if (i.tm.base_opcode == 0xb8 || (i.tm.base_opcode | 1) == 0xc7)
 	{
 	  /* Handle
@@ -4282,11 +4292,6 @@ optimize_encoding (void)
 	  i.tm.operand_types[0].bitfield.imm32 = 1;
 	  i.tm.operand_types[0].bitfield.imm32s = 0;
 	  i.tm.operand_types[0].bitfield.imm64 = 0;
-	  i.types[0].bitfield.imm32 = 1;
-	  i.types[0].bitfield.imm32s = 0;
-	  i.types[0].bitfield.imm64 = 0;
-	  i.types[1].bitfield.dword = 1;
-	  i.types[1].bitfield.qword = 0;
 	  if ((i.tm.base_opcode | 1) == 0xc7)
 	    {
 	      /* Handle
@@ -4818,11 +4823,14 @@ void
 md_assemble (char *line)
 {
   unsigned int j;
-  char mnemonic[MAX_MNEM_SIZE], mnem_suffix;
-  const char *end;
+  char mnemonic[MAX_MNEM_SIZE], mnem_suffix, *copy = NULL;
+  const char *end, *pass1_mnem = NULL;
+  enum i386_error pass1_err = 0;
   const insn_template *t;
 
   /* Initialize globals.  */
+  current_templates = NULL;
+ retry:
   memset (&i, '\0', sizeof (i));
   i.rounding.type = rc_none;
   for (j = 0; j < MAX_OPERANDS; j++)
@@ -4837,16 +4845,26 @@ md_assemble (char *line)
 
   end = parse_insn (line, mnemonic);
   if (end == NULL)
-    return;
+    {
+      if (pass1_mnem != NULL)
+	goto match_error;
+      return;
+    }
+  if (current_templates->start->opcode_modifier.pass2)
+    {
+      /* Make a copy of the full line in case we need to retry.  */
+      copy = xstrdup (line);
+    }
   line += end - line;
   mnem_suffix = i.suffix;
 
   line = parse_operands (line, mnemonic);
   this_operand = -1;
-  xfree (i.memop1_string);
-  i.memop1_string = NULL;
   if (line == NULL)
-    return;
+    {
+      free (copy);
+      return;
+    }
 
   /* Now we've parsed the mnemonic into a set of templates, and have the
      operands at hand.  */
@@ -4922,7 +4940,97 @@ md_assemble (char *line)
      with the template operand types.  */
 
   if (!(t = match_template (mnem_suffix)))
-    return;
+    {
+      const char *err_msg;
+
+      if (copy && !mnem_suffix)
+	{
+	  line = copy;
+	  copy = NULL;
+	  pass1_err = i.error;
+	  pass1_mnem = current_templates->start->name;
+	  goto retry;
+	}
+      free (copy);
+  match_error:
+      switch (pass1_mnem ? pass1_err : i.error)
+	{
+	default:
+	  abort ();
+	case operand_size_mismatch:
+	  err_msg = _("operand size mismatch");
+	  break;
+	case operand_type_mismatch:
+	  err_msg = _("operand type mismatch");
+	  break;
+	case register_type_mismatch:
+	  err_msg = _("register type mismatch");
+	  break;
+	case number_of_operands_mismatch:
+	  err_msg = _("number of operands mismatch");
+	  break;
+	case invalid_instruction_suffix:
+	  err_msg = _("invalid instruction suffix");
+	  break;
+	case bad_imm4:
+	  err_msg = _("constant doesn't fit in 4 bits");
+	  break;
+	case unsupported_with_intel_mnemonic:
+	  err_msg = _("unsupported with Intel mnemonic");
+	  break;
+	case unsupported_syntax:
+	  err_msg = _("unsupported syntax");
+	  break;
+	case unsupported:
+	  as_bad (_("unsupported instruction `%s'"),
+		  pass1_mnem ? pass1_mnem : current_templates->start->name);
+	  return;
+	case invalid_sib_address:
+	  err_msg = _("invalid SIB address");
+	  break;
+	case invalid_vsib_address:
+	  err_msg = _("invalid VSIB address");
+	  break;
+	case invalid_vector_register_set:
+	  err_msg = _("mask, index, and destination registers must be distinct");
+	  break;
+	case invalid_tmm_register_set:
+	  err_msg = _("all tmm registers must be distinct");
+	  break;
+	case invalid_dest_and_src_register_set:
+	  err_msg = _("destination and source registers must be distinct");
+	  break;
+	case unsupported_vector_index_register:
+	  err_msg = _("unsupported vector index register");
+	  break;
+	case unsupported_broadcast:
+	  err_msg = _("unsupported broadcast");
+	  break;
+	case broadcast_needed:
+	  err_msg = _("broadcast is needed for operand of such type");
+	  break;
+	case unsupported_masking:
+	  err_msg = _("unsupported masking");
+	  break;
+	case mask_not_on_destination:
+	  err_msg = _("mask not on destination operand");
+	  break;
+	case no_default_mask:
+	  err_msg = _("default mask isn't allowed");
+	  break;
+	case unsupported_rc_sae:
+	  err_msg = _("unsupported static rounding/sae");
+	  break;
+	case invalid_register_operand:
+	  err_msg = _("invalid register operand");
+	  break;
+	}
+      as_bad (_("%s for `%s'"), err_msg,
+	      pass1_mnem ? pass1_mnem : current_templates->start->name);
+      return;
+    }
+
+  free (copy);
 
   if (sse_check != check_none
       /* The opcode space check isn't strictly needed; it's there only to
@@ -5223,6 +5331,7 @@ parse_insn (const char *line, char *mnem
 {
   const char *l = line, *token_start = l;
   char *mnem_p;
+  bool pass1 = !current_templates;
   int supported;
   const insn_template *t;
   char *dot_p = NULL;
@@ -5392,8 +5501,10 @@ parse_insn (const char *line, char *mnem
       current_templates = (const templates *) str_hash_find (op_hash, mnemonic);
     }
 
-  if (!current_templates)
+  if (!current_templates || !pass1)
     {
+      current_templates = NULL;
+
     check_suffix:
       if (mnem_p > mnemonic)
 	{
@@ -5435,13 +5546,39 @@ parse_insn (const char *line, char *mnem
 		  current_templates
 		    = (const templates *) str_hash_find (op_hash, mnemonic);
 		}
+	      /* For compatibility reasons accept MOVSD and CMPSD without
+	         operands even in AT&T mode.  */
+	      else if (*l == END_OF_INSN
+		       || (is_space_char (*l) && l[1] == END_OF_INSN))
+		{
+		  mnem_p[-1] = '\0';
+		  current_templates
+		    = (const templates *) str_hash_find (op_hash, mnemonic);
+		  if (current_templates != NULL
+		      /* MOVS or CMPS */
+		      && (current_templates->start->base_opcode | 2) == 0xa6
+		      && current_templates->start->opcode_modifier.opcodespace
+			 == SPACE_BASE
+		      && mnem_p[-2] == 's')
+		    {
+		      as_warn (_("found `%sd'; assuming `%sl' was meant"),
+			       mnemonic, mnemonic);
+		      i.suffix = LONG_MNEM_SUFFIX;
+		    }
+		  else
+		    {
+		      current_templates = NULL;
+		      mnem_p[-1] = 'd';
+		    }
+		}
 	      break;
 	    }
 	}
 
       if (!current_templates)
 	{
-	  as_bad (_("no such instruction: `%s'"), token_start);
+	  if (pass1)
+	    as_bad (_("no such instruction: `%s'"), token_start);
 	  return NULL;
 	}
     }
@@ -6845,81 +6982,7 @@ match_template (char mnem_suffix)
   if (t == current_templates->end)
     {
       /* We found no match.  */
-      const char *err_msg;
-      switch (specific_error)
-	{
-	default:
-	  abort ();
-	case operand_size_mismatch:
-	  err_msg = _("operand size mismatch");
-	  break;
-	case operand_type_mismatch:
-	  err_msg = _("operand type mismatch");
-	  break;
-	case register_type_mismatch:
-	  err_msg = _("register type mismatch");
-	  break;
-	case number_of_operands_mismatch:
-	  err_msg = _("number of operands mismatch");
-	  break;
-	case invalid_instruction_suffix:
-	  err_msg = _("invalid instruction suffix");
-	  break;
-	case bad_imm4:
-	  err_msg = _("constant doesn't fit in 4 bits");
-	  break;
-	case unsupported_with_intel_mnemonic:
-	  err_msg = _("unsupported with Intel mnemonic");
-	  break;
-	case unsupported_syntax:
-	  err_msg = _("unsupported syntax");
-	  break;
-	case unsupported:
-	  as_bad (_("unsupported instruction `%s'"),
-		  current_templates->start->name);
-	  return NULL;
-	case invalid_sib_address:
-	  err_msg = _("invalid SIB address");
-	  break;
-	case invalid_vsib_address:
-	  err_msg = _("invalid VSIB address");
-	  break;
-	case invalid_vector_register_set:
-	  err_msg = _("mask, index, and destination registers must be distinct");
-	  break;
-	case invalid_tmm_register_set:
-	  err_msg = _("all tmm registers must be distinct");
-	  break;
-	case invalid_dest_and_src_register_set:
-	  err_msg = _("destination and source registers must be distinct");
-	  break;
-	case unsupported_vector_index_register:
-	  err_msg = _("unsupported vector index register");
-	  break;
-	case unsupported_broadcast:
-	  err_msg = _("unsupported broadcast");
-	  break;
-	case broadcast_needed:
-	  err_msg = _("broadcast is needed for operand of such type");
-	  break;
-	case unsupported_masking:
-	  err_msg = _("unsupported masking");
-	  break;
-	case mask_not_on_destination:
-	  err_msg = _("mask not on destination operand");
-	  break;
-	case no_default_mask:
-	  err_msg = _("default mask isn't allowed");
-	  break;
-	case unsupported_rc_sae:
-	  err_msg = _("unsupported static rounding/sae");
-	  break;
-	case invalid_register_operand:
-	  err_msg = _("invalid register operand");
-	  break;
-	}
-      as_bad (_("%s for `%s'"), err_msg,
-	      current_templates->start->name);
+      i.error = specific_error;
       return NULL;
     }
 
@@ -11328,49 +11391,6 @@ RC_SAE_immediate (const char *imm_start)
   return 1;
 }
 
-/* Only string instructions can have a second memory operand, so
-   reduce current_templates to just those if it contains any.  */
-static int
-maybe_adjust_templates (void)
-{
-  const insn_template *t;
-
-  gas_assert (i.mem_operands == 1);
-
-  for (t = current_templates->start; t < current_templates->end; ++t)
-    if (t->opcode_modifier.isstring)
-      break;
-
-  if (t < current_templates->end)
-    {
-      static templates aux_templates;
-      bool recheck;
-
-      aux_templates.start = t;
-      for (; t < current_templates->end; ++t)
-	if (!t->opcode_modifier.isstring)
-	  break;
-      aux_templates.end = t;
-
-      /* Determine whether to re-check the first memory operand.  */
-      recheck = (aux_templates.start != current_templates->start
-		 || t != current_templates->end);
-
-      current_templates = &aux_templates;
-
-      if (recheck)
-	{
-	  i.mem_operands = 0;
-	  if (i.memop1_string != NULL
-	      && i386_index_check (i.memop1_string) == 0)
-	    return 0;
-	  i.mem_operands = 1;
-	}
-    }
-
-  return 1;
-}
-
 static INLINE bool starts_memory_operand (char c)
 {
   return ISDIGIT (c)
@@ -11521,17 +11541,6 @@ i386_att_operand (char *operand_string)
       char *displacement_string_end;
 
     do_memory_reference:
-      if (i.mem_operands == 1 && !maybe_adjust_templates ())
-	return 0;
-      if ((i.mem_operands == 1
-	   && !current_templates->start->opcode_modifier.isstring)
-	  || i.mem_operands == 2)
-	{
-	  as_bad (_("too many memory references for `%s'"),
-		  current_templates->start->name);
-	  return 0;
-	}
-
       /* Check for base index form.  We detect the base index form by
 	 looking for an ')' at the end of the operand, searching
 	 for the '(' matching it, and finding a REGISTER_PREFIX or ','
@@ -11738,8 +11747,6 @@ i386_att_operand (char *operand_string)
       if (i386_index_check (operand_string) == 0)
 	return 0;
       i.flags[this_operand] |= Operand_Mem;
-      if (i.mem_operands == 0)
-	i.memop1_string = xstrdup (operand_string);
       i.mem_operands++;
     }
   else
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -993,10 +993,7 @@ i386_intel_operand (char *operand_string
 	   || intel_state.is_mem)
     {
       /* Memory operand.  */
-      if (i.mem_operands == 1 && !maybe_adjust_templates ())
-	return 0;
-      if ((int) i.mem_operands
-	  >= 2 - !current_templates->start->opcode_modifier.isstring)
+      if (i.mem_operands)
 	{
 	  /* Handle
 
@@ -1041,10 +1038,6 @@ i386_intel_operand (char *operand_string
 		    }
 		}
 	    }
-
-	  as_bad (_("too many memory references for `%s'"),
-		  current_templates->start->name);
-	  return 0;
 	}
 
       /* Swap base and index in 16-bit memory operands like
@@ -1158,8 +1151,6 @@ i386_intel_operand (char *operand_string
 	return 0;
 
       i.flags[this_operand] |= Operand_Mem;
-      if (i.mem_operands == 0)
-	i.memop1_string = xstrdup (operand_string);
       ++i.mem_operands;
     }
   else
--- a/gas/testsuite/gas/i386/code16.d
+++ b/gas/testsuite/gas/i386/code16.d
@@ -1,5 +1,6 @@
 #objdump: -drw -mi8086
 #name: i386 with .code16
+#warning_output: code16.e
 
 .*: +file format .*
 
--- /dev/null
+++ b/gas/testsuite/gas/i386/code16.e
@@ -0,0 +1,3 @@
+.*: Assembler messages:
+.*:3: Warning: .* `movsd'.*`movsl'.*
+.*:4: Warning: .* `cmpsd'.*`cmpsl'.*
--- a/gas/testsuite/gas/i386/code16.s
+++ b/gas/testsuite/gas/i386/code16.s
@@ -2,8 +2,8 @@
 	.code16
 	rep; movsd
 	rep; cmpsd
-	rep movsd %ds:(%si),%es:(%di)
-	rep cmpsd %es:(%di),%ds:(%si)
+	rep movsl %ds:(%si),%es:(%di)
+	rep cmpsl %es:(%di),%ds:(%si)
 
 	mov	%cr2, %ecx
 	mov	%ecx, %cr2
--- a/gas/testsuite/gas/i386/intel16.d
+++ b/gas/testsuite/gas/i386/intel16.d
@@ -20,4 +20,12 @@ Disassembly of section .text:
   2c:	8d 02 [ 	]*lea    \(%bp,%si\),%ax
   2e:	8d 01 [ 	]*lea    \(%bx,%di\),%ax
   30:	8d 03 [ 	]*lea    \(%bp,%di\),%ax
-	...
+[ 	]*[0-9a-f]+:	67 f7 13[ 	]+notw[ 	]+\(%ebx\)
+[ 	]*[0-9a-f]+:	66 f7 17[ 	]+notl[ 	]+\(%bx\)
+[ 	]*[0-9a-f]+:	67 0f 1f 03[ 	]+nopw[ 	]+\(%ebx\)
+[ 	]*[0-9a-f]+:	66 0f 1f 07[ 	]+nopl[ 	]+\(%bx\)
+[ 	]*[0-9a-f]+:	67 83 03 05[ 	]+addw[ 	]+\$0x5,\(%ebx\)
+[ 	]*[0-9a-f]+:	66 83 07 05[ 	]+addl[ 	]+\$0x5,\(%bx\)
+[ 	]*[0-9a-f]+:	67 c7 03 05 00[ 	]+movw[ 	]+\$0x5,\(%ebx\)
+[ 	]*[0-9a-f]+:	66 c7 07 05 00 00 00[ 	]+movl[ 	]+\$0x5,\(%bx\)
+#pass
--- a/gas/testsuite/gas/i386/intel16.s
+++ b/gas/testsuite/gas/i386/intel16.s
@@ -18,4 +18,14 @@
  lea	ax, [di][bx]
  lea	ax, [di][bp]
 
- .p2align 4,0
+ notw	[ebx]
+ notd	[bx]
+
+ nopw	[ebx]
+ nopd	[bx]
+
+ addw	[ebx], 5
+ addd	[bx], 5
+
+ movw	[ebx], 5
+ movd	[bx], 5
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -135,34 +135,27 @@
 mov, 0xa0, None, CpuNo64, D|W|No_sSuf|No_qSuf|No_ldSuf, { Disp16|Disp32|Unspecified|Byte|Word|Dword, Acc|Byte|Word|Dword }
 mov, 0xa0, None, Cpu64, D|W|No_sSuf|No_ldSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
 movabs, 0xa0, None, Cpu64, D|W|No_sSuf|No_ldSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
-movq, 0xa1, None, Cpu64, D|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Disp64|Unspecified|Qword, Acc|Qword }
 mov, 0x88, None, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|HLEPrefixRelease, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-movq, 0x89, None, Cpu64, D|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixRelease, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
 // In the 64bit mode the short form mov immediate is redefined to have
 // 64bit value.
 mov, 0xb0, None, 0, W|No_sSuf|No_qSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32 }
 mov, 0xc6, 0, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixRelease|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-movq, 0xc7, 0, Cpu64, Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixRelease|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
 mov, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
 movabs, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Imm64, Reg64 }
-movq, 0xb8, None, Cpu64, Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
 // The segment register moves accept WordReg so that a segment register
 // can be copied to a 32 bit register, and vice versa, without using a
 // size prefix.  When moving to a 32 bit register, the upper 16 bits
 // are set to an implementation defined value (on the Pentium Pro, the
 // implementation defined value is zero).
-mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
+mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|No_ldSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
 mov, 0x8c, None, 0, D|Modrm|IgnoreSize|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { SReg, Word|Unspecified|BaseIndex }
-movq, 0x8c, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { SReg, Reg64 }
-mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
+mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|No_ldSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
 // Move to/from control debug registers.  In the 16 or 32bit modes
 // they are 32bit.  In the 64bit mode they are 64bit.
 mov, 0xf20, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Control, Reg32 }
 mov, 0xf20, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Control, Reg64 }
-movq, 0xf20, None, Cpu64, D|RegMem|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Control, Reg64 }
 mov, 0xf21, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Debug, Reg32 }
 mov, 0xf21, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Debug, Reg64 }
-movq, 0xf21, None, Cpu64, D|RegMem|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Debug, Reg64 }
 mov, 0xf24, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Test, Reg32 }
 
 // Move after swapping the bytes
@@ -492,9 +485,6 @@ set<cc>, 0xf9<cc:opc>, 0, Cpu386, Modrm|
 // String manipulation.
 cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
 cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-// Intel mode string compare.
-cmpsd, 0xa7, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-cmpsd, 0xa7, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
 scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
 scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
@@ -509,9 +499,6 @@ slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|
 slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
 movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
 movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-// Intel mode string move.
-movsd, 0xa5, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-movsd, 0xa5, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
 smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
 smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
@@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
 // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
 // spec). AMD's spec, having been in existence for much longer, failed to
 // recognize that and specified movd for 32- and 64-bit operations.
-movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
+movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
 movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
 movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
 movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
 movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
 movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
-movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
+movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
 movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
@@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
 cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
 cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
-cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
+cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
 cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
@@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
 movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
-movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
+movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
 movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
 movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 4/8] ix86: don't recognize/derive Q suffix in the common case
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
                   ` (2 preceding siblings ...)
  2022-10-25  7:26 ` [PATCH v5 3/8] x86: re-work insn/suffix recognition Jan Beulich
@ 2022-10-25  7:26 ` Jan Beulich
  2022-10-25  7:27 ` [PATCH v5 5/8] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:26 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Have its use, except where actually legitimate, result in the same "only
supported in 64-bit mode" diagnostic as emitted for other 64-bit only
insns. Also suppress deriving of the suffix in Intel mode except in the
legitimate cases. This in exchange allows dropping the respective code
from match_template().

To maintain reasonable diagnostics (in particular to avoid "`mov' is
only supported in 64-bit mode" on the SIMD forms of MOVQ) we need to
defer parse_insn()'s emitting of errors unrelated to prefix parsing.
Utilize i.error just like match_template() does.

Oddly enough despite gcc's preference towards FILDQ and FIST{,T}Q we
had no testcase whatsoever for these. Therefore such tests are being
added. Note that the removed line in the x86-64-lfence-load testcase
was redundant with the exact same one a few lines up.
---
With gcc's preference towards FILDQ / FIST{,T}Q I wonder whether the
disassembler wouldn't better emit a Q suffix instead of the LL one.
---
v5: Re-base over re-arrangment of patch series, requiring pieces to be
    moved here.
v3: Re-base over changes to earlier patches.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -236,6 +236,8 @@ enum i386_error
     unsupported_with_intel_mnemonic,
     unsupported_syntax,
     unsupported,
+    unsupported_on_arch,
+    unsupported_64bit,
     invalid_sib_address,
     invalid_vsib_address,
     invalid_vector_register_set,
@@ -4823,7 +4825,7 @@ void
 md_assemble (char *line)
 {
   unsigned int j;
-  char mnemonic[MAX_MNEM_SIZE], mnem_suffix, *copy = NULL;
+  char mnemonic[MAX_MNEM_SIZE], mnem_suffix = 0, *copy = NULL;
   const char *end, *pass1_mnem = NULL;
   enum i386_error pass1_err = 0;
   const insn_template *t;
@@ -4848,6 +4850,16 @@ md_assemble (char *line)
     {
       if (pass1_mnem != NULL)
 	goto match_error;
+      if (i.error != no_error)
+	{
+	  gas_assert (current_templates != NULL);
+	  if (current_templates->start->opcode_modifier.pass2 && !i.suffix)
+	    goto no_match;
+	  /* No point in trying a 2nd pass - it'll only find the same suffix
+	     again.  */
+	  mnem_suffix = i.suffix;
+	  goto match_error;
+	}
       return;
     }
   if (current_templates->start->opcode_modifier.pass2)
@@ -4947,12 +4959,21 @@ md_assemble (char *line)
 	{
 	  line = copy;
 	  copy = NULL;
+  no_match:
 	  pass1_err = i.error;
 	  pass1_mnem = current_templates->start->name;
 	  goto retry;
 	}
-      free (copy);
+
+      /* If a non-/only-64bit template (group) was found in pass 1, and if
+	 _some_ template (group) was found in pass 2, squash pass 1's
+	 error.  */
+      if (pass1_err == unsupported_64bit)
+	pass1_mnem = NULL;
+
   match_error:
+      free (copy);
+
       switch (pass1_mnem ? pass1_err : i.error)
 	{
 	default:
@@ -4985,6 +5006,23 @@ md_assemble (char *line)
 	  as_bad (_("unsupported instruction `%s'"),
 		  pass1_mnem ? pass1_mnem : current_templates->start->name);
 	  return;
+	case unsupported_on_arch:
+	  as_bad (_("`%s' is not supported on `%s%s'"),
+		  pass1_mnem ? pass1_mnem : current_templates->start->name,
+		  cpu_arch_name ? cpu_arch_name : default_arch,
+		  cpu_sub_arch_name ? cpu_sub_arch_name : "");
+	  return;
+	case unsupported_64bit:
+	  if (ISLOWER (mnem_suffix))
+	    as_bad (_("`%s%c' is %s supported in 64-bit mode"),
+		    pass1_mnem ? pass1_mnem : current_templates->start->name,
+		    mnem_suffix,
+		    flag_code == CODE_64BIT ? _("not") : _("only"));
+	  else
+	    as_bad (_("`%s' is %s supported in 64-bit mode"),
+		    pass1_mnem ? pass1_mnem : current_templates->start->name,
+		    flag_code == CODE_64BIT ? _("not") : _("only"));
+	  return;
 	case invalid_sib_address:
 	  err_msg = _("invalid SIB address");
 	  break;
@@ -5326,6 +5364,23 @@ md_assemble (char *line)
     last_insn.kind = last_insn_other;
 }
 
+/* The Q suffix is generally valid only in 64-bit mode, with very few
+   exceptions: fild, fistp, fisttp, and cmpxchg8b.  Note that for fild
+   and fisttp only one of their two templates is matched below: That's
+   sufficient since other relevant attributes are the same between both
+   respective templates.  */
+static INLINE bool q_suffix_allowed(const insn_template *t)
+{
+  return flag_code == CODE_64BIT
+	 || (t->opcode_modifier.opcodespace == SPACE_BASE
+	     && t->base_opcode == 0xdf
+	     && (t->extension_opcode & 1)) /* fild / fistp / fisttp */
+	 || (t->opcode_modifier.opcodespace == SPACE_0F
+	     && t->base_opcode == 0xc7
+	     && t->opcode_modifier.opcodeprefix == PREFIX_NONE
+	     && t->extension_opcode == 1) /* cmpxchg8b */;
+}
+
 static const char *
 parse_insn (const char *line, char *mnemonic)
 {
@@ -5621,20 +5676,21 @@ parse_insn (const char *line, char *mnem
   for (t = current_templates->start; t < current_templates->end; ++t)
     {
       supported |= cpu_flags_match (t);
+
+      if (i.suffix == QWORD_MNEM_SUFFIX && !q_suffix_allowed (t))
+	supported &= ~CPU_FLAGS_64BIT_MATCH;
+
       if (supported == CPU_FLAGS_PERFECT_MATCH)
 	return l;
     }
 
-  if (!(supported & CPU_FLAGS_64BIT_MATCH))
-    as_bad (flag_code == CODE_64BIT
-	    ? _("`%s' is not supported in 64-bit mode")
-	    : _("`%s' is only supported in 64-bit mode"),
-	    current_templates->start->name);
-  else
-    as_bad (_("`%s' is not supported on `%s%s'"),
-	    current_templates->start->name,
-	    cpu_arch_name ? cpu_arch_name : default_arch,
-	    cpu_sub_arch_name ? cpu_sub_arch_name : "");
+  if (pass1)
+    {
+      if (supported & CPU_FLAGS_64BIT_MATCH)
+        i.error = unsupported_on_arch;
+      else
+        i.error = unsupported_64bit;
+    }
 
   return NULL;
 }
@@ -6653,20 +6709,12 @@ match_template (char mnem_suffix)
       for (j = 0; j < MAX_OPERANDS; j++)
 	operand_types[j] = t->operand_types[j];
 
-      /* In general, don't allow
-	 - 64-bit operands outside of 64-bit mode,
-	 - 32-bit operands on pre-386.  */
+      /* In general, don't allow 32-bit operands on pre-386.  */
       specific_error = progress (mnem_suffix ? invalid_instruction_suffix
 					     : operand_size_mismatch);
       j = i.imm_operands + (t->operands > i.imm_operands + 1);
-      if (((i.suffix == QWORD_MNEM_SUFFIX
-	    && flag_code != CODE_64BIT
-	    && !(t->opcode_modifier.opcodespace == SPACE_0F
-		 && t->base_opcode == 0xc7
-		 && t->opcode_modifier.opcodeprefix == PREFIX_NONE
-		 && t->extension_opcode == 1) /* cmpxchg8b */)
-	   || (i.suffix == LONG_MNEM_SUFFIX
-	       && !cpu_arch_flags.bitfield.cpui386))
+      if (i.suffix == LONG_MNEM_SUFFIX
+	  && !cpu_arch_flags.bitfield.cpui386
 	  && (intel_syntax
 	      ? (t->opcode_modifier.mnemonicsize != IGNORESIZE
 		 && !intel_float_operand (t->name))
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -824,7 +824,7 @@ i386_intel_operand (char *operand_string
 		    continue;
 		  break;
 		case QWORD_MNEM_SUFFIX:
-		  if (t->opcode_modifier.no_qsuf)
+		  if (t->opcode_modifier.no_qsuf || !q_suffix_allowed (t))
 		    continue;
 		  break;
 		case SHORT_MNEM_SUFFIX:
--- a/gas/testsuite/gas/i386/opcode.d
+++ b/gas/testsuite/gas/i386/opcode.d
@@ -592,6 +592,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%dx
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%dx
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
  +[a-f0-9]+:	82 c3 01             	add    \$0x1,%bl
  +[a-f0-9]+:	82 f3 01             	xor    \$0x1,%bl
  +[a-f0-9]+:	82 d3 01             	adc    \$0x1,%bl
--- a/gas/testsuite/gas/i386/opcode.s
+++ b/gas/testsuite/gas/i386/opcode.s
@@ -592,6 +592,11 @@ foo:
  cmovpe  0x90909090(%eax),%dx
  cmovpo 0x90909090(%eax),%dx
 
+ fildq  (%eax)
+ fildll (%eax)
+ fistpq (%eax)
+ fistpll (%eax)
+
 	.byte 0x82, 0xc3, 0x01
 	.byte 0x82, 0xf3, 0x01
 	.byte 0x82, 0xd3, 0x01
--- a/gas/testsuite/gas/i386/opcode-intel.d
+++ b/gas/testsuite/gas/i386/opcode-intel.d
@@ -593,6 +593,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnp edx,DWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovp  dx,WORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnp dx,WORD PTR \[eax-0x6f6f6f70\]
+[ 	]*[a-f0-9]+:	df 28                	fild   QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 28                	fild   QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 38                	fistp  QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 38                	fistp  QWORD PTR \[eax\]
  +[a-f0-9]+:	82 c3 01             	add    bl,0x1
  +[a-f0-9]+:	82 f3 01             	xor    bl,0x1
  +[a-f0-9]+:	82 d3 01             	adc    bl,0x1
--- a/gas/testsuite/gas/i386/opcode-suffix.d
+++ b/gas/testsuite/gas/i386/opcode-suffix.d
@@ -593,6 +593,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnpl -0x6f6f6f70\(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovpw -0x6f6f6f70\(%eax\),%dx
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnpw -0x6f6f6f70\(%eax\),%dx
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
  +[a-f0-9]+:	82 c3 01             	addb   \$0x1,%bl
  +[a-f0-9]+:	82 f3 01             	xorb   \$0x1,%bl
  +[a-f0-9]+:	82 d3 01             	adcb   \$0x1,%bl
--- a/gas/testsuite/gas/i386/sse3.d
+++ b/gas/testsuite/gas/i386/sse3.d
@@ -13,29 +13,30 @@ Disassembly of section .text:
   10:	df 88 90 90 90 90 [ 	]*fisttps -0x6f6f6f70\(%eax\)
   16:	db 88 90 90 90 90 [ 	]*fisttpl -0x6f6f6f70\(%eax\)
   1c:	dd 88 90 90 90 90 [ 	]*fisttpll -0x6f6f6f70\(%eax\)
-  22:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%ebp\),%xmm4
-  27:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
-  2b:	f2 0f 7c 37 [ 	]*haddps \(%edi\),%xmm6
-  2f:	f2 0f 7c f8 [ 	]*haddps %xmm0,%xmm7
-  33:	66 0f 7d c1 [ 	]*hsubpd %xmm1,%xmm0
-  37:	66 0f 7d 0a [ 	]*hsubpd \(%edx\),%xmm1
-  3b:	f2 0f 7d d2 [ 	]*hsubps %xmm2,%xmm2
-  3f:	f2 0f 7d 1c 24 [ 	]*hsubps \(%esp\),%xmm3
-  44:	f2 0f f0 2e [ 	]*lddqu  \(%esi\),%xmm5
-  48:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
-  4b:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
-  4e:	f2 0f 12 f7 [ 	]*movddup %xmm7,%xmm6
-  52:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
-  56:	f3 0f 16 01 [ 	]*movshdup \(%ecx\),%xmm0
-  5a:	f3 0f 16 ca [ 	]*movshdup %xmm2,%xmm1
-  5e:	f3 0f 12 13 [ 	]*movsldup \(%ebx\),%xmm2
-  62:	f3 0f 12 dc [ 	]*movsldup %xmm4,%xmm3
-  66:	0f 01 c9 [ 	]*mwait  %eax,%ecx
-  69:	0f 01 c9 [ 	]*mwait  %eax,%ecx
-  6c:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
-  70:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
-  74:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
-  78:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	dd 88 90 90 90 90 [ 	]*fisttpll -0x6f6f6f70\(%eax\)
+[ 	]*[0-9a-f]+:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%ebp\),%xmm4
+[ 	]*[0-9a-f]+:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
+[ 	]*[0-9a-f]+:	f2 0f 7c 37 [ 	]*haddps \(%edi\),%xmm6
+[ 	]*[0-9a-f]+:	f2 0f 7c f8 [ 	]*haddps %xmm0,%xmm7
+[ 	]*[0-9a-f]+:	66 0f 7d c1 [ 	]*hsubpd %xmm1,%xmm0
+[ 	]*[0-9a-f]+:	66 0f 7d 0a [ 	]*hsubpd \(%edx\),%xmm1
+[ 	]*[0-9a-f]+:	f2 0f 7d d2 [ 	]*hsubps %xmm2,%xmm2
+[ 	]*[0-9a-f]+:	f2 0f 7d 1c 24 [ 	]*hsubps \(%esp\),%xmm3
+[ 	]*[0-9a-f]+:	f2 0f f0 2e [ 	]*lddqu  \(%esi\),%xmm5
+[ 	]*[0-9a-f]+:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
+[ 	]*[0-9a-f]+:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
+[ 	]*[0-9a-f]+:	f2 0f 12 f7 [ 	]*movddup %xmm7,%xmm6
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	f3 0f 16 01 [ 	]*movshdup \(%ecx\),%xmm0
+[ 	]*[0-9a-f]+:	f3 0f 16 ca [ 	]*movshdup %xmm2,%xmm1
+[ 	]*[0-9a-f]+:	f3 0f 12 13 [ 	]*movsldup \(%ebx\),%xmm2
+[ 	]*[0-9a-f]+:	f3 0f 12 dc [ 	]*movsldup %xmm4,%xmm3
+[ 	]*[0-9a-f]+:	0f 01 c9 [ 	]*mwait  %eax,%ecx
+[ 	]*[0-9a-f]+:	0f 01 c9 [ 	]*mwait  %eax,%ecx
+[ 	]*[0-9a-f]+:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
+[ 	]*[0-9a-f]+:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
 [ 	]*[0-9a-f]+:	0f 01 c8[ 	]+monitor %eax,%ecx,%edx
 [ 	]*[0-9a-f]+:	67 0f 01 c8[ 	]+monitor %ax,%ecx,%edx
 [ 	]*[0-9a-f]+:	0f 01 c9[ 	]+mwait  %eax,%ecx
--- a/gas/testsuite/gas/i386/sse3.s
+++ b/gas/testsuite/gas/i386/sse3.s
@@ -8,6 +8,7 @@ foo:
 	addsubps	%xmm4,%xmm3
 	fisttps		0x90909090(%eax)
 	fisttpl		0x90909090(%eax)
+	fisttpq		0x90909090(%eax)
 	fisttpll	0x90909090(%eax)
 	haddpd		0x0(%ebp),%xmm4
 	haddpd		%xmm6,%xmm5
--- a/gas/testsuite/gas/i386/sse3-intel.d
+++ b/gas/testsuite/gas/i386/sse3-intel.d
@@ -14,6 +14,7 @@ Disassembly of section .text:
 [ 	]*[0-9a-f]+:	df 88 90 90 90 90[ 	]+fisttp WORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	db 88 90 90 90 90[ 	]+fisttp DWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	dd 88 90 90 90 90[ 	]+fisttp QWORD PTR \[eax-0x6f6f6f70\]
+[ 	]*[0-9a-f]+:	dd 88 90 90 90 90[ 	]+fisttp QWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	66 0f 7c 65 00[ 	]+haddpd xmm4,(XMMWORD PTR )?\[ebp(\+0x0)\]
 [ 	]*[0-9a-f]+:	66 0f 7c ee[ 	]+haddpd xmm5,xmm6
 [ 	]*[0-9a-f]+:	f2 0f 7c 37[ 	]+haddps xmm6,(XMMWORD PTR )?\[edi\]
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.d
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.d
@@ -44,16 +44,21 @@ Disassembly of section .text:
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	db 55 00             	fistl  0x0\(%rbp\)
  +[a-f0-9]+:	df 55 00             	fists  0x0\(%rbp\)
+ +[a-f0-9]+:	db 5d 00             	fistpl 0x0\(%rbp\)
+ +[a-f0-9]+:	df 5d 00             	fistps 0x0\(%rbp\)
+ +[a-f0-9]+:	df 7d 00             	fistpll 0x0\(%rbp\)
  +[a-f0-9]+:	db 45 00             	fildl  0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	df 45 00             	filds  0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
+ +[a-f0-9]+:	df 6d 00             	fildll 0x0\(%rbp\)
+ +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	9b dd 75 00          	fsave  0x0\(%rbp\)
  +[a-f0-9]+:	dd 65 00             	frstor 0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
- +[a-f0-9]+:	df 45 00             	filds  0x0\(%rbp\)
- +[a-f0-9]+:	0f ae e8             	lfence
+ +[a-f0-9]+:	db 4d 00             	fisttpl 0x0\(%rbp\)
  +[a-f0-9]+:	df 4d 00             	fisttps 0x0\(%rbp\)
+ +[a-f0-9]+:	dd 4d 00             	fisttpll 0x0\(%rbp\)
  +[a-f0-9]+:	d9 65 00             	fldenv 0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	9b d9 75 00          	fstenv 0x0\(%rbp\)
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.s
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.s
@@ -27,12 +27,17 @@ _start:
 	flds (%rbp)
 	fistl (%rbp)
 	fists (%rbp)
+	fistpl (%rbp)
+	fistps (%rbp)
+	fistpq (%rbp)
 	fildl (%rbp)
 	filds (%rbp)
+	fildq (%rbp)
 	fsave (%rbp)
 	frstor (%rbp)
-	filds (%rbp)
+	fisttpl (%rbp)
 	fisttps (%rbp)
+	fisttpq (%rbp)
 	fldenv (%rbp)
 	fstenv (%rbp)
 	fadds  (%rbp)
--- a/gas/testsuite/gas/i386/x86-64-sse3.d
+++ b/gas/testsuite/gas/i386/x86-64-sse3.d
@@ -13,6 +13,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	df 88 90 90 90 00 [ 	]*fisttps 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	db 88 90 90 90 00 [ 	]*fisttpl 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	dd 88 90 90 90 00 [ 	]*fisttpll 0x909090\(%rax\)
+[ 	]*[a-f0-9]+:	dd 88 90 90 90 00 [ 	]*fisttpll 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%rbp\),%xmm4
 [ 	]*[a-f0-9]+:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
 [ 	]*[a-f0-9]+:	f2 0f 7c 37 [ 	]*haddps \(%rdi\),%xmm6
--- a/gas/testsuite/gas/i386/x86-64-sse3.s
+++ b/gas/testsuite/gas/i386/x86-64-sse3.s
@@ -8,6 +8,7 @@ foo:
 	addsubps	%xmm4,%xmm3
 	fisttps		0x909090(%rax)
 	fisttpl		0x909090(%rax)
+	fisttpq		0x909090(%rax)
 	fisttpll	0x909090(%rax)
 	haddpd		0x0(%rbp),%xmm4
 	haddpd		%xmm6,%xmm5
--- a/gas/testsuite/gas/i386/x86-64-sse3-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-sse3-intel.d
@@ -14,6 +14,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	df 88 90 90 90 00[ 	]+fisttp WORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	db 88 90 90 90 00[ 	]+fisttp DWORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	dd 88 90 90 90 00[ 	]+fisttp QWORD PTR \[rax\+0x909090\]
+[ 	]*[a-f0-9]+:	dd 88 90 90 90 00[ 	]+fisttp QWORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	66 0f 7c 65 00[ 	]+haddpd xmm4,(XMMWORD PTR )?\[rbp(\+0x0)\]
 [ 	]*[a-f0-9]+:	66 0f 7c ee[ 	]+haddpd xmm5,xmm6
 [ 	]*[a-f0-9]+:	f2 0f 7c 37[ 	]+haddps xmm6,(XMMWORD PTR )?\[rdi\]


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 5/8] x86-64: allow HLE store of accumulator to absolute 32-bit address
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
                   ` (3 preceding siblings ...)
  2022-10-25  7:26 ` [PATCH v5 4/8] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
@ 2022-10-25  7:27 ` Jan Beulich
  2022-10-25  7:27 ` [PATCH v5 6/8] x86: move bad-use-of-TLS-reloc check Jan Beulich
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:27 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

In commit 1212781b35c9 ("ix86: allow HLE store of accumulator to
absolute address") I was wrong to exclude 64-bit code. Dropping the
check also leads to better diagnostics in 64-bit code ("MOV", after
all, isn't invalid with "XRELEASE").

While there also limit the amount of further checks done: The operand
type checks that were there were effectively redundant with other ones
anyway, plus it's quite fine to also have "xrelease mov <disp>, %eax"
look for the next MOV template (in fact again also improving
diagnostics).
---
v2: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -6837,12 +6837,9 @@ match_template (char mnem_suffix)
 	    continue;
 	  /* xrelease mov %eax, <disp> is another special case. It must not
 	     match the accumulator-only encoding of mov.  */
-	  if (flag_code != CODE_64BIT
-	      && i.hle_prefix
+	  if (i.hle_prefix
 	      && t->base_opcode == 0xa0
-	      && t->opcode_modifier.opcodespace == SPACE_BASE
-	      && i.types[0].bitfield.instance == Accum
-	      && (i.flags[1] & Operand_Mem))
+	      && t->opcode_modifier.opcodespace == SPACE_BASE)
 	    continue;
 	  /* Fall through.  */
 
--- a/gas/testsuite/gas/i386/x86-64-hle.d
+++ b/gas/testsuite/gas/i386/x86-64-hle.d
@@ -424,6 +424,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 20 01          	lock xacquire and %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 20 01          	lock xrelease and %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 88 01             	xrelease mov %al,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 88 04 25 78 56 34 12 	xrelease mov %al,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 88 04 25 21 43 65 87 	xrelease mov %al,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 08 01          	xrelease lock or %al,\(%rcx\)
@@ -475,6 +477,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 66 21 01       	lock xacquire and %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 66 21 01       	lock xrelease and %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f3 89 01          	xrelease mov %ax,\(%rcx\)
+[ 	]*[a-f0-9]+:	66 f3 89 04 25 78 56 34 12 	xrelease mov %ax,0x12345678
+[ 	]*[a-f0-9]+:	67 66 f3 89 04 25 21 43 65 87 	xrelease mov %ax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f3 f0 09 01       	xrelease lock or %ax,\(%rcx\)
@@ -526,6 +530,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 21 01          	lock xacquire and %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 21 01          	lock xrelease and %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 89 01             	xrelease mov %eax,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 89 04 25 78 56 34 12 	xrelease mov %eax,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 89 04 25 21 43 65 87 	xrelease mov %eax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 09 01          	xrelease lock or %eax,\(%rcx\)
@@ -577,6 +583,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 48 21 01       	lock xacquire and %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 48 21 01       	lock xrelease and %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 48 89 01          	xrelease mov %rax,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 48 89 04 25 78 56 34 12 	xrelease mov %rax,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 48 89 04 25 21 43 65 87 	xrelease mov %rax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 48 09 01       	xrelease lock or %rax,\(%rcx\)
--- a/gas/testsuite/gas/i386/x86-64-hle.s
+++ b/gas/testsuite/gas/i386/x86-64-hle.s
@@ -442,6 +442,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andb %al,(%rcx)
 	.byte 0xf0; .byte 0xf3; andb %al,(%rcx)
 	xrelease movb %al,(%rcx)
+	xrelease movb %al,0x12345678
+	xrelease addr32 movb %al,0x87654321
 	xacquire lock orb %al,(%rcx)
 	lock xacquire orb %al,(%rcx)
 	xrelease lock orb %al,(%rcx)
@@ -496,6 +498,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andw %ax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andw %ax,(%rcx)
 	xrelease movw %ax,(%rcx)
+	xrelease movw %ax,0x12345678
+	xrelease addr32 movw %ax,0x87654321
 	xacquire lock orw %ax,(%rcx)
 	lock xacquire orw %ax,(%rcx)
 	xrelease lock orw %ax,(%rcx)
@@ -550,6 +554,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andl %eax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andl %eax,(%rcx)
 	xrelease movl %eax,(%rcx)
+	xrelease movl %eax,0x12345678
+	xrelease addr32 movl %eax,0x87654321
 	xacquire lock orl %eax,(%rcx)
 	lock xacquire orl %eax,(%rcx)
 	xrelease lock orl %eax,(%rcx)
@@ -604,6 +610,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andq %rax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andq %rax,(%rcx)
 	xrelease movq %rax,(%rcx)
+	xrelease movq %rax,0x12345678
+	xrelease addr32 movq %rax,0x87654321
 	xacquire lock orq %rax,(%rcx)
 	lock xacquire orq %rax,(%rcx)
 	xrelease lock orq %rax,(%rcx)
--- a/gas/testsuite/gas/i386/x86-64-hle-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-hle-intel.d
@@ -425,6 +425,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 20 01          	lock xacquire and BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f0 f3 20 01          	lock xrelease and BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f3 88 01             	xrelease mov BYTE PTR \[rcx\],al
+[ 	]*[a-f0-9]+:	f3 88 04 25 78 56 34 12 	xrelease mov BYTE PTR (ds:)?0x12345678,al
+[ 	]*[a-f0-9]+:	67 f3 88 04 25 21 43 65 87 	xrelease mov BYTE PTR \[eiz\*1\+0x87654321\],al
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f3 f0 08 01          	xrelease lock or BYTE PTR \[rcx\],al
@@ -476,6 +478,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 66 21 01       	lock xacquire and WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	f0 f3 66 21 01       	lock xrelease and WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f3 89 01          	xrelease mov WORD PTR \[rcx\],ax
+[ 	]*[a-f0-9]+:	66 f3 89 04 25 78 56 34 12 	xrelease mov WORD PTR (ds:)?0x12345678,ax
+[ 	]*[a-f0-9]+:	67 66 f3 89 04 25 21 43 65 87 	xrelease mov WORD PTR \[eiz\*1\+0x87654321\],ax
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f3 f0 09 01       	xrelease lock or WORD PTR \[rcx\],ax
@@ -527,6 +531,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 21 01          	lock xacquire and DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f0 f3 21 01          	lock xrelease and DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f3 89 01             	xrelease mov DWORD PTR \[rcx\],eax
+[ 	]*[a-f0-9]+:	f3 89 04 25 78 56 34 12 	xrelease mov DWORD PTR (ds:)?0x12345678,eax
+[ 	]*[a-f0-9]+:	67 f3 89 04 25 21 43 65 87 	xrelease mov DWORD PTR \[eiz\*1\+0x87654321\],eax
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f3 f0 09 01          	xrelease lock or DWORD PTR \[rcx\],eax
@@ -578,6 +584,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 48 21 01       	lock xacquire and QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f0 f3 48 21 01       	lock xrelease and QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f3 48 89 01          	xrelease mov QWORD PTR \[rcx\],rax
+[ 	]*[a-f0-9]+:	f3 48 89 04 25 78 56 34 12 	xrelease mov QWORD PTR (ds:)?0x12345678,rax
+[ 	]*[a-f0-9]+:	67 f3 48 89 04 25 21 43 65 87 	xrelease mov QWORD PTR \[eiz\*1\+0x87654321\],rax
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f3 f0 48 09 01       	xrelease lock or QWORD PTR \[rcx\],rax


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 6/8] x86: move bad-use-of-TLS-reloc check
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
                   ` (4 preceding siblings ...)
  2022-10-25  7:27 ` [PATCH v5 5/8] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
@ 2022-10-25  7:27 ` Jan Beulich
  2022-10-25  7:28 ` [PATCH v5 7/8] x86: drop (now) stray IsString Jan Beulich
  2022-10-25  7:29 ` [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
  7 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:27 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Having it in match_template() is unhelpful. Neither does looking for the
next template to possibly match make any sense in that case, nor is the
resulting diagnostic making clear what the problem is.

While moving the check, also generalize it to include all SIMD and VEX-
encoded insns. This way an existing conditional can be re-used in
md_assemble(). Note though that this still leaves a lof of insns which
are also wrong to use with these relocations.

Further fold the remaining check (BFD_RELOC_386_GOT32) with the XRELEASE
related one a few lines down. This again allows re-using an existing
conditional.
---
Is the set of relocations enumerated here actually complete?
---
v2: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -5118,14 +5118,30 @@ md_assemble (char *line)
       return;
     }
 
-  /* Check for data size prefix on VEX/XOP/EVEX encoded and SIMD insns.  */
-  if (i.prefix[DATA_PREFIX]
-      && (is_any_vex_encoding (&i.tm)
-	  || i.tm.operand_types[i.imm_operands].bitfield.class >= RegMMX
-	  || i.tm.operand_types[i.imm_operands + 1].bitfield.class >= RegMMX))
+  if (is_any_vex_encoding (&i.tm)
+      || i.tm.operand_types[i.imm_operands].bitfield.class >= RegMMX
+      || i.tm.operand_types[i.imm_operands + 1].bitfield.class >= RegMMX)
     {
-      as_bad (_("data size prefix invalid with `%s'"), i.tm.name);
-      return;
+      /* Check for data size prefix on VEX/XOP/EVEX encoded and SIMD insns.  */
+      if (i.prefix[DATA_PREFIX])
+	{
+	  as_bad (_("data size prefix invalid with `%s'"), i.tm.name);
+	  return;
+	}
+
+      /* Don't allow e.g. KMOV in TLS code sequences.  */
+      for (j = i.imm_operands; j < i.operands; ++j)
+	switch (i.reloc[j])
+	  {
+	  case BFD_RELOC_386_TLS_GOTIE:
+	  case BFD_RELOC_386_TLS_LE_32:
+	  case BFD_RELOC_X86_64_GOTTPOFF:
+	  case BFD_RELOC_X86_64_TLSLD:
+	    as_bad (_("TLS relocation cannot be used with `%s'"), i.tm.name);
+	    return;
+	  default:
+	    break;
+	  }
     }
 
   /* Check if HLE prefix is OK.  */
@@ -6785,26 +6801,6 @@ match_template (char mnem_suffix)
 	    }
 	}
 
-      switch (i.reloc[0])
-	{
-	case BFD_RELOC_386_GOT32:
-	  /* Force 0x8b encoding for "mov foo@GOT, %eax".  */
-	  if (t->base_opcode == 0xa0
-	      && t->opcode_modifier.opcodespace == SPACE_BASE)
-	    continue;
-	  break;
-	case BFD_RELOC_386_TLS_GOTIE:
-	case BFD_RELOC_386_TLS_LE_32:
-	case BFD_RELOC_X86_64_GOTTPOFF:
-	case BFD_RELOC_X86_64_TLSLD:
-	  /* Don't allow KMOV in TLS code sequences.  */
-	  if (t->opcode_modifier.vex)
-	    continue;
-	  break;
-	default:
-	  break;
-	}
-
       /* We check register size if needed.  */
       if (t->opcode_modifier.checkregsize)
 	{
@@ -6835,12 +6831,19 @@ match_template (char mnem_suffix)
 	      && i.types[1].bitfield.instance == Accum
 	      && i.types[1].bitfield.dword)
 	    continue;
-	  /* xrelease mov %eax, <disp> is another special case. It must not
-	     match the accumulator-only encoding of mov.  */
-	  if (i.hle_prefix
-	      && t->base_opcode == 0xa0
+
+	  if (t->base_opcode == MOV_AX_DISP32
 	      && t->opcode_modifier.opcodespace == SPACE_BASE)
-	    continue;
+	    {
+	      /* Force 0x8b encoding for "mov foo@GOT, %eax".  */
+	      if (i.reloc[0] == BFD_RELOC_386_GOT32)
+		continue;
+
+	      /* xrelease mov %eax, <disp> is another special case. It must not
+		 match the accumulator-only encoding of mov.  */
+	      if (i.hle_prefix)
+		continue;
+	    }
 	  /* Fall through.  */
 
 	case 3:


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 7/8] x86: drop (now) stray IsString
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
                   ` (5 preceding siblings ...)
  2022-10-25  7:27 ` [PATCH v5 6/8] x86: move bad-use-of-TLS-reloc check Jan Beulich
@ 2022-10-25  7:28 ` Jan Beulich
  2022-10-25  7:29 ` [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
  7 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:28 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

While I question the original need for IsString on the PadLock insns,
the need for them on the operand-less string insns has gone away with
the removal of maybe_adjust_templates() and associated logic. Since
i386_index_check() needs adjustment then anyway, take the opportunity
and also simplify it, possible again as a result of said removal (plus
the opcode template adjustments done here).
---
v4: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -11262,12 +11262,9 @@ i386_index_check (const char *operand_st
 {
   const char *kind = "base/index";
   enum flag_code addr_mode = i386_addressing_mode ();
-  const insn_template *t = current_templates->start;
+  const insn_template *t = current_templates->end - 1;
 
-  if (t->opcode_modifier.isstring
-      && !t->cpu_flags.bitfield.cpupadlock
-      && (current_templates->end[-1].opcode_modifier.isstring
-	  || i.mem_operands))
+  if (t->opcode_modifier.isstring)
     {
       /* Memory operands of string insns are special in that they only allow
 	 a single register (rDI, rSI, or rBX) as their memory address.  */
@@ -11284,14 +11281,12 @@ i386_index_check (const char *operand_st
 
       if (t->opcode_modifier.prefixok == PrefixRep)
 	{
-	  int es_op = current_templates->end[-1].opcode_modifier.isstring
-		      - IS_STRING_ES_OP0;
+	  int es_op = t->opcode_modifier.isstring - IS_STRING_ES_OP0;
 	  int op = 0;
 
-	  if (!current_templates->end[-1].operand_types[0].bitfield.baseindex
+	  if (!t->operand_types[0].bitfield.baseindex
 	      || ((!i.mem_operands != !intel_syntax)
-		  && current_templates->end[-1].operand_types[1]
-		     .bitfield.baseindex))
+		  && t->operand_types[1].bitfield.baseindex))
 	    op = 1;
 	  expected_reg
 	    = (const reg_entry *) str_hash_find (reg_hash,
@@ -11334,6 +11329,8 @@ i386_index_check (const char *operand_st
     }
   else
     {
+      t = current_templates->start;
+
       if (addr_mode != CODE_16BIT)
 	{
 	  /* 32-bit/64-bit checks.  */
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -483,37 +483,37 @@ loopne, 0xe0, None, Cpu64, JumpByte|No_b
 set<cc>, 0xf9<cc:opc>, 0, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex }
 
 // String manipulation.
-cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
+ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
 ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { InOutPortReg, Byte|Word|Dword|Unspecified|BaseIndex }
-outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
+outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
 outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Unspecified|BaseIndex, InOutPortReg }
-lods, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+lods, 0xac, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 lods, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 lods, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-ssca, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+ssca, 0xae, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 ssca, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ssca, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-stos, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+stos, 0xaa, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 stos, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 stos, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Acc|Byte|Word|Dword|Qword, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-ssto, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
+ssto, 0xaa, None, 0, W|No_sSuf|No_ldSuf|RepPrefixOk, {}
 ssto, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ssto, 0xaa, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Acc|Byte|Word|Dword|Qword, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, {}
+xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {}
 xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString, { Byte|Unspecified|BaseIndex }
 
 // Bit manipulation.
@@ -1966,24 +1966,24 @@ lzcnt, 0xf30fbd, None, CpuLZCNT, Modrm|C
 popcnt, 0xf30fb8, None, CpuPOPCNT, Modrm|CheckRegSize|No_bSuf|No_sSuf|No_ldSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // VIA PadLock extensions.
-xstore-rng, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcrypt-ecb, 0xf30fa7c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcrypt-cbc, 0xf30fa7d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcrypt-ctr, 0xf30fa7d8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcrypt-cfb, 0xf30fa7e0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcrypt-ofb, 0xf30fa7e8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-montmul, 0xf30fa6c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xsha1, 0xf30fa6c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xsha256, 0xf30fa6d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
+xstore-rng, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcrypt-ecb, 0xf30fa7c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcrypt-cbc, 0xf30fa7d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcrypt-ctr, 0xf30fa7d8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcrypt-cfb, 0xf30fa7e0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcrypt-ofb, 0xf30fa7e8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+montmul, 0xf30fa6c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xsha1, 0xf30fa6c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xsha256, 0xf30fa6d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
 // Aliases without hyphens.
-xstorerng, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcryptecb, 0xf30fa7c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcryptcbc, 0xf30fa7d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcryptctr, 0xf30fa7d8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcryptcfb, 0xf30fa7e0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
-xcryptofb, 0xf30fa7e8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
+xstorerng, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcryptecb, 0xf30fa7c8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcryptcbc, 0xf30fa7d0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcryptctr, 0xf30fa7d8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcryptcfb, 0xf30fa7e0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
+xcryptofb, 0xf30fa7e8, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
 // Alias for xstore-rng.
-xstore, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
+xstore, 0xfa7c0, None, CpuPadLock, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|RepPrefixOk, {}
 
 // Multy-precision Add Carry, rdseed instructions.
 adcx, 0x660f38f6, None, CpuADX, Modrm|CheckRegSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
                   ` (6 preceding siblings ...)
  2022-10-25  7:28 ` [PATCH v5 7/8] x86: drop (now) stray IsString Jan Beulich
@ 2022-10-25  7:29 ` Jan Beulich
  2022-10-25 17:10   ` H.J. Lu
  7 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:29 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

PR gas/29524

Having templates with a suffix explicitly present has always been
quirky. After prior adjustment all that's left to also eliminate the
anomaly from move-with-sign-extend is to consolidate the insn templates
(and extend testsuite coverage).
---
v5: Split off from earlier patch, merged with prior patch dealing with
    just x86-64's MOVSL.

--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -73,6 +73,7 @@ if [gas_32_check] then {
     run_dump_test "amd"
     run_dump_test "katmai"
     run_dump_test "jump"
+    run_dump_test "movs32"
     run_dump_test "movz32"
     run_dump_test "relax-1"
     run_dump_test "relax-2"
@@ -804,6 +805,7 @@ if [gas_64_check] then {
     run_dump_test "x86-64-segovr"
     run_list_test "x86-64-inval-seg" "-al"
     run_dump_test "x86-64-branch"
+    run_dump_test "movs64"
     run_dump_test "movz64"
     run_dump_test "x86-64-relax-1"
     run_dump_test "svme64"
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs.s
@@ -0,0 +1,39 @@
+	.text
+movs:
+	movsb	%al,%ax
+	movsb	(%eax),%ax
+	movsb	%al,%eax
+	movsb	(%eax),%eax
+.ifdef x86_64
+	movsb	%al,%rax
+	movsb	(%rax),%rax
+.endif
+
+	movsbw	%al,%ax
+	movsbw	(%eax),%ax
+	movsbl	%al,%eax
+	movsbl	(%eax),%eax
+.ifdef x86_64
+	movsbq	%al,%rax
+	movsbq	(%rax),%rax
+.endif
+
+	movsw	%ax,%eax
+	movsw	(%eax),%eax
+.ifdef x86_64
+	movsw	%ax,%rax
+	movsw	(%rax),%rax
+.endif
+
+	movswl	%ax,%eax
+	movswl	(%eax),%eax
+.ifdef x86_64
+	movswq	%ax,%rax
+	movswq	(%rax),%rax
+
+	movsl	%eax,%rax
+	movsl	(%rax),%rax
+
+	movslq	%eax,%rax
+	movslq	(%rax),%rax
+.endif
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs32.d
@@ -0,0 +1,22 @@
+#objdump: -dw
+#source: movs.s
+#name: x86 mov with sign-extend (32-bit object)
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <movs>:
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	0f bf 00 *	movswl \(%eax\),%eax
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs64.d
@@ -0,0 +1,30 @@
+#objdump: -dw
+#source: movs.s
+#name: x86 mov with sign-extend (64-bit object)
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <movs>:
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	67 66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	67 0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f be c0 *	movsbq %al,%rax
+[ 	]*[a-f0-9]+:	48 0f be 00 *	movsbq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	67 66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	67 0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f be c0 *	movsbq %al,%rax
+[ 	]*[a-f0-9]+:	48 0f be 00 *	movsbq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	67 0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f bf c0 *	movswq %ax,%rax
+[ 	]*[a-f0-9]+:	48 0f bf 00 *	movswq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	67 0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f bf c0 *	movswq %ax,%rax
+[ 	]*[a-f0-9]+:	48 0f bf 00 *	movswq \(%rax\),%rax
+#pass
--- a/gas/testsuite/gas/i386/movx16.l
+++ b/gas/testsuite/gas/i386/movx16.l
@@ -41,11 +41,11 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -82,7 +82,7 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movswl	%al, %cl
--- a/gas/testsuite/gas/i386/movx32.l
+++ b/gas/testsuite/gas/i386/movx32.l
@@ -41,11 +41,11 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -82,7 +82,7 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movswl	%al, %cl
--- a/gas/testsuite/gas/i386/movx64.l
+++ b/gas/testsuite/gas/i386/movx64.l
@@ -106,17 +106,17 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 480FBEC8[ 	]+movsb	%al, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %rcx
@@ -192,12 +192,12 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %rcx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 480FBFC8[ 	]+movsw	%ax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %rcx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -241,6 +241,46 @@
 [ 	]*[1-9][0-9]*[ 	]+movswq	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movswq	%rax, %rcx
 [ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %cl
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %cx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %ecx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 4863C8[ 	]+movsl	%eax, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %rcx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %cl
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %cx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %ecx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 4863C8[ 	]+movslq	%eax, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %rcx
+[ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movzx:
 [ 	]*[1-9][0-9]*[ 	]+movzx	%al, %cl
 [ 	]*[1-9][0-9]*[ 	]+movzx	%ax, %cl
--- a/gas/testsuite/gas/i386/movx64.s
+++ b/gas/testsuite/gas/i386/movx64.s
@@ -241,6 +241,46 @@ movsx:
 	movswq	%eax, %rcx
 	movswq	%rax, %rcx
 
+	movsl	%al, %cl
+	movsl	%ax, %cl
+	movsl	%eax, %cl
+	movsl	%rax, %cl
+
+	movsl	%al, %cx
+	movsl	%ax, %cx
+	movsl	%eax, %cx
+	movsl	%rax, %cx
+
+	movsl	%al, %ecx
+	movsl	%ax, %ecx
+	movsl	%eax, %ecx
+	movsl	%rax, %ecx
+
+	movsl	%al, %rcx
+	movsl	%ax, %rcx
+	movsl	%eax, %rcx
+	movsl	%rax, %rcx
+
+	movslq	%al, %cl
+	movslq	%ax, %cl
+	movslq	%eax, %cl
+	movslq	%rax, %cl
+
+	movslq	%al, %cx
+	movslq	%ax, %cx
+	movslq	%eax, %cx
+	movslq	%rax, %cx
+
+	movslq	%al, %ecx
+	movslq	%ax, %ecx
+	movslq	%eax, %ecx
+	movslq	%rax, %ecx
+
+	movslq	%al, %rcx
+	movslq	%ax, %rcx
+	movslq	%eax, %rcx
+	movslq	%rax, %rcx
+
 movzx:
 	movzx	%al, %cl
 	movzx	%ax, %cl
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -162,14 +162,9 @@ mov, 0xf24, None, Cpu386|CpuNo64, D|RegM
 movbe, 0x0f38f0, None, CpuMovbe, D|Modrm|No_bSuf|No_sSuf|No_ldSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Move with sign extend.
-// "movsbl" & "movsbw" must not be unified into "movsb" to avoid
-// conflict with the "movs" string move instruction.
-movsbl, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg32 }
-movsbw, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg16 }
-movswl, 0xfbf, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32 }
-movsbq, 0xfbe, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg8|Byte|Unspecified|BaseIndex, Reg64 }
-movswq, 0xfbf, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
-movslq, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
+movsb, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_sSuf|No_ldSuf|Pass2, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movsw, 0xfbf, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|Pass2, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
+movsl, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Pass2, { Reg32|Unspecified|BaseIndex, Reg64 }
 movsx, 0xfbe, None, Cpu386, W|Modrm|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Reg16|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 movsx, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
 movsxd, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 2/8] x86: introduce Pass2 insn attribute
  2022-10-25  7:25 ` [PATCH v5 1/8] x86: introduce Pass2 insn attribute Jan Beulich
@ 2022-10-25  7:30   ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-25  7:30 UTC (permalink / raw)
  To: Binutils


I'm sorry, I've typo-ed the title of the original submission - this is of
course patch 2/8 (reflected in the subject here).

Jan

On 25.10.2022 09:25, Jan Beulich via Binutils wrote:
> Subsequently we will want to mark templates for mnemonics which, due to
> their last character also possibly being a suffix, are "shadowing" other
> templates (and hence may require a second parsing pass). Introduce this
> attribute in a standalone patch, largely to ease looking at the actual
> later change. To simplify logic the later change is going to look for
> the new attribute only on the first template of a group. Make i386-gen
> warn about misplaced Pass2 (redundant ones are okay).
> 
> While there convert "active_isstring" to bool, matching the new static
> variable.
> ---
> As to redundant attributes: An open question is whether - to avoid
> surprises when e.g. templates are re-ordered - it wouldn't be desirable
> to add Pass2 on all templates of a group.
> ---
> v3: New.
> 
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -18,6 +18,7 @@
>     MA 02110-1301, USA.  */
>  
>  #include "sysdep.h"
> +#include <stdbool.h>
>  #include <stdio.h>
>  #include <errno.h>
>  #include "getopt.h"
> @@ -702,6 +703,7 @@ static bitfield opcode_modifiers[] =
>    BITFIELD (FWait),
>    BITFIELD (IsString),
>    BITFIELD (RegMem),
> +  BITFIELD (Pass2),
>    BITFIELD (BNDPrefixOk),
>    BITFIELD (RegKludge),
>    BITFIELD (Implicit1stXmm0),
> @@ -800,7 +802,7 @@ static bitfield operand_types[] =
>  
>  static const char *filename;
>  static i386_cpu_flags active_cpu_flags;
> -static int active_isstring;
> +static bool active_isstring, active_pass2;
>  
>  struct template_arg {
>    const struct template_arg *next;
> @@ -1168,7 +1170,8 @@ process_i386_opcode_modifier (FILE *tabl
>    char *str, *next, *last;
>    bitfield modifiers [ARRAY_SIZE (opcode_modifiers)];
>  
> -  active_isstring = 0;
> +  active_isstring = false;
> +  active_pass2 = false;
>  
>    /* Copy the default opcode modifier.  */
>    memcpy (modifiers, opcode_modifiers, sizeof (modifiers));
> @@ -1191,8 +1194,11 @@ process_i386_opcode_modifier (FILE *tabl
>  
>  	      set_bitfield (str, modifiers, val, ARRAY_SIZE (modifiers),
>  			    lineno);
> +
>  	      if (strcasecmp(str, "IsString") == 0)
> -		active_isstring = 1;
> +		active_isstring = true;
> +	      if (strcasecmp(str, "Pass2") == 0)
> +		active_pass2 = true;
>  
>  	      if (strcasecmp(str, "W") == 0)
>  		have_w = 1;
> @@ -1867,6 +1873,7 @@ process_i386_opcodes (FILE *table)
>    for (j = 0; j < i; j++)
>      {
>        struct opcode_hash_entry *next;
> +      bool first_pass2 = false;
>  
>        for (next = opcode_array[j]; next; next = next->next)
>  	{
> @@ -1875,6 +1882,14 @@ process_i386_opcodes (FILE *table)
>  	  lineno = next->lineno;
>  	  last = str + strlen (str);
>  	  output_i386_opcode (table, name, str, last, lineno);
> +
> +	  if (next == opcode_array[j])
> +	    first_pass2 = active_pass2;
> +	  else if (!first_pass2 && active_pass2)
> +	    fprintf (stderr,
> +		     "Warning: %s:%d: %s: "
> +		     "'Pass2' is recognized only on the first template in a group\n",
> +		     filename, lineno, name);
>  	}
>      }
>  
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -504,6 +504,9 @@ enum
>       Normally, it will be encoded in the reg field. We add a RegMem
>       flag to indicate that it should be encoded in the regmem field.  */
>    RegMem,
> +  /* The (unsuffixed) mnemonic represents itself the suffixed form of another
> +     mnemonic, potentially requiring a second parsing pass.  */
> +  Pass2,
>    /* quick test if branch instruction is MPX supported */
>    BNDPrefixOk,
>    /* fake an extra reg operand for clr, imul and special register
> @@ -730,6 +733,7 @@ typedef struct i386_opcode_modifier
>    unsigned int fwait:1;
>    unsigned int isstring:2;
>    unsigned int regmem:1;
> +  unsigned int pass2:1;
>    unsigned int bndprefixok:1;
>    unsigned int regkludge:1;
>    unsigned int implicit1stxmm0:1;
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-10-25  7:29 ` [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
@ 2022-10-25 17:10   ` H.J. Lu
  2022-10-26  9:07     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: H.J. Lu @ 2022-10-25 17:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Tue, Oct 25, 2022 at 12:29 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> PR gas/29524
>
> Having templates with a suffix explicitly present has always been
> quirky. After prior adjustment all that's left to also eliminate the

I prefer keeping the current MOVS behavior.  Please submit a new patch
set without MOVS changes.

> anomaly from move-with-sign-extend is to consolidate the insn templates
> (and extend testsuite coverage).
> ---
> v5: Split off from earlier patch, merged with prior patch dealing with
>     just x86-64's MOVSL.
>
> --- a/gas/testsuite/gas/i386/i386.exp
> +++ b/gas/testsuite/gas/i386/i386.exp
> @@ -73,6 +73,7 @@ if [gas_32_check] then {
>      run_dump_test "amd"
>      run_dump_test "katmai"
>      run_dump_test "jump"
> +    run_dump_test "movs32"
>      run_dump_test "movz32"
>      run_dump_test "relax-1"
>      run_dump_test "relax-2"
> @@ -804,6 +805,7 @@ if [gas_64_check] then {
>      run_dump_test "x86-64-segovr"
>      run_list_test "x86-64-inval-seg" "-al"
>      run_dump_test "x86-64-branch"
> +    run_dump_test "movs64"
>      run_dump_test "movz64"
>      run_dump_test "x86-64-relax-1"
>      run_dump_test "svme64"
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/movs.s
> @@ -0,0 +1,39 @@
> +       .text
> +movs:
> +       movsb   %al,%ax
> +       movsb   (%eax),%ax
> +       movsb   %al,%eax
> +       movsb   (%eax),%eax
> +.ifdef x86_64
> +       movsb   %al,%rax
> +       movsb   (%rax),%rax
> +.endif
> +
> +       movsbw  %al,%ax
> +       movsbw  (%eax),%ax
> +       movsbl  %al,%eax
> +       movsbl  (%eax),%eax
> +.ifdef x86_64
> +       movsbq  %al,%rax
> +       movsbq  (%rax),%rax
> +.endif
> +
> +       movsw   %ax,%eax
> +       movsw   (%eax),%eax
> +.ifdef x86_64
> +       movsw   %ax,%rax
> +       movsw   (%rax),%rax
> +.endif
> +
> +       movswl  %ax,%eax
> +       movswl  (%eax),%eax
> +.ifdef x86_64
> +       movswq  %ax,%rax
> +       movswq  (%rax),%rax
> +
> +       movsl   %eax,%rax
> +       movsl   (%rax),%rax
> +
> +       movslq  %eax,%rax
> +       movslq  (%rax),%rax
> +.endif
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/movs32.d
> @@ -0,0 +1,22 @@
> +#objdump: -dw
> +#source: movs.s
> +#name: x86 mov with sign-extend (32-bit object)
> +
> +.*: +file format .*
> +
> +Disassembly of section .text:
> +
> +0+ <movs>:
> +[      ]*[a-f0-9]+:    66 0f be c0 *   movsbw %al,%ax
> +[      ]*[a-f0-9]+:    66 0f be 00 *   movsbw \(%eax\),%ax
> +[      ]*[a-f0-9]+:    0f be c0 *      movsbl %al,%eax
> +[      ]*[a-f0-9]+:    0f be 00 *      movsbl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    66 0f be c0 *   movsbw %al,%ax
> +[      ]*[a-f0-9]+:    66 0f be 00 *   movsbw \(%eax\),%ax
> +[      ]*[a-f0-9]+:    0f be c0 *      movsbl %al,%eax
> +[      ]*[a-f0-9]+:    0f be 00 *      movsbl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    0f bf c0 *      movswl %ax,%eax
> +[      ]*[a-f0-9]+:    0f bf 00 *      movswl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    0f bf c0 *      movswl %ax,%eax
> +[      ]*[a-f0-9]+:    0f bf 00 *      movswl \(%eax\),%eax
> +#pass
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/movs64.d
> @@ -0,0 +1,30 @@
> +#objdump: -dw
> +#source: movs.s
> +#name: x86 mov with sign-extend (64-bit object)
> +
> +.*: +file format .*
> +
> +Disassembly of section .text:
> +
> +0+ <movs>:
> +[      ]*[a-f0-9]+:    66 0f be c0 *   movsbw %al,%ax
> +[      ]*[a-f0-9]+:    67 66 0f be 00 *        movsbw \(%eax\),%ax
> +[      ]*[a-f0-9]+:    0f be c0 *      movsbl %al,%eax
> +[      ]*[a-f0-9]+:    67 0f be 00 *   movsbl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    48 0f be c0 *   movsbq %al,%rax
> +[      ]*[a-f0-9]+:    48 0f be 00 *   movsbq \(%rax\),%rax
> +[      ]*[a-f0-9]+:    66 0f be c0 *   movsbw %al,%ax
> +[      ]*[a-f0-9]+:    67 66 0f be 00 *        movsbw \(%eax\),%ax
> +[      ]*[a-f0-9]+:    0f be c0 *      movsbl %al,%eax
> +[      ]*[a-f0-9]+:    67 0f be 00 *   movsbl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    48 0f be c0 *   movsbq %al,%rax
> +[      ]*[a-f0-9]+:    48 0f be 00 *   movsbq \(%rax\),%rax
> +[      ]*[a-f0-9]+:    0f bf c0 *      movswl %ax,%eax
> +[      ]*[a-f0-9]+:    67 0f bf 00 *   movswl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    48 0f bf c0 *   movswq %ax,%rax
> +[      ]*[a-f0-9]+:    48 0f bf 00 *   movswq \(%rax\),%rax
> +[      ]*[a-f0-9]+:    0f bf c0 *      movswl %ax,%eax
> +[      ]*[a-f0-9]+:    67 0f bf 00 *   movswl \(%eax\),%eax
> +[      ]*[a-f0-9]+:    48 0f bf c0 *   movswq %ax,%rax
> +[      ]*[a-f0-9]+:    48 0f bf 00 *   movswq \(%rax\),%rax
> +#pass
> --- a/gas/testsuite/gas/i386/movx16.l
> +++ b/gas/testsuite/gas/i386/movx16.l
> @@ -41,11 +41,11 @@
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %cl
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cl
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %cx
> +[      ]*[1-9][0-9]* \?\?\?\? 0FBEC8[  ]+movsb %al, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cx
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 660FBEC8[        ]+movsb %al, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
> @@ -82,7 +82,7 @@
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %cx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movsw %al, %ecx
> -[      ]*[1-9][0-9]*[  ]+movsw %ax, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 660FBFC8[        ]+movsw %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movswl        %al, %cl
> --- a/gas/testsuite/gas/i386/movx32.l
> +++ b/gas/testsuite/gas/i386/movx32.l
> @@ -41,11 +41,11 @@
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %cl
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cl
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %cx
> +[      ]*[1-9][0-9]* \?\?\?\? 660FBEC8[        ]+movsb %al, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cx
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 0FBEC8[  ]+movsb %al, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
> @@ -82,7 +82,7 @@
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %cx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movsw %al, %ecx
> -[      ]*[1-9][0-9]*[  ]+movsw %ax, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 0FBFC8[  ]+movsw %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movswl        %al, %cl
> --- a/gas/testsuite/gas/i386/movx64.l
> +++ b/gas/testsuite/gas/i386/movx64.l
> @@ -106,17 +106,17 @@
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cl
>  [      ]*[1-9][0-9]*[  ]+movsb %rax, %cl
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %cx
> +[      ]*[1-9][0-9]* \?\?\?\? 660FBEC8[        ]+movsb %al, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %cx
>  [      ]*[1-9][0-9]*[  ]+movsb %rax, %cx
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 0FBEC8[  ]+movsb %al, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsb %rax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
> -[      ]*[1-9][0-9]*[  ]+movsb %al, %rcx
> +[      ]*[1-9][0-9]* \?\?\?\? 480FBEC8[        ]+movsb %al, %rcx
>  [      ]*[1-9][0-9]*[  ]+movsb %ax, %rcx
>  [      ]*[1-9][0-9]*[  ]+movsb %eax, %rcx
>  [      ]*[1-9][0-9]*[  ]+movsb %rax, %rcx
> @@ -192,12 +192,12 @@
>  [      ]*[1-9][0-9]*[  ]+movsw %rax, %cx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movsw %al, %ecx
> -[      ]*[1-9][0-9]*[  ]+movsw %ax, %ecx
> +[      ]*[1-9][0-9]* \?\?\?\? 0FBFC8[  ]+movsw %ax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %ecx
>  [      ]*[1-9][0-9]*[  ]+movsw %rax, %ecx
>  [      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movsw %al, %rcx
> -[      ]*[1-9][0-9]*[  ]+movsw %ax, %rcx
> +[      ]*[1-9][0-9]* \?\?\?\? 480FBFC8[        ]+movsw %ax, %rcx
>  [      ]*[1-9][0-9]*[  ]+movsw %eax, %rcx
>  [      ]*[1-9][0-9]*[  ]+movsw %rax, %rcx
>  [      ]*[1-9][0-9]*[  ]*
> @@ -241,6 +241,46 @@
>  [      ]*[1-9][0-9]*[  ]+movswq        %eax, %rcx
>  [      ]*[1-9][0-9]*[  ]+movswq        %rax, %rcx
>  [      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movsl %al, %cl
> +[      ]*[1-9][0-9]*[  ]+movsl %ax, %cl
> +[      ]*[1-9][0-9]*[  ]+movsl %eax, %cl
> +[      ]*[1-9][0-9]*[  ]+movsl %rax, %cl
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movsl %al, %cx
> +[      ]*[1-9][0-9]*[  ]+movsl %ax, %cx
> +[      ]*[1-9][0-9]*[  ]+movsl %eax, %cx
> +[      ]*[1-9][0-9]*[  ]+movsl %rax, %cx
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movsl %al, %ecx
> +[      ]*[1-9][0-9]*[  ]+movsl %ax, %ecx
> +[      ]*[1-9][0-9]*[  ]+movsl %eax, %ecx
> +[      ]*[1-9][0-9]*[  ]+movsl %rax, %ecx
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movsl %al, %rcx
> +[      ]*[1-9][0-9]*[  ]+movsl %ax, %rcx
> +[      ]*[1-9][0-9]* \?\?\?\? 4863C8[  ]+movsl %eax, %rcx
> +[      ]*[1-9][0-9]*[  ]+movsl %rax, %rcx
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movslq        %al, %cl
> +[      ]*[1-9][0-9]*[  ]+movslq        %ax, %cl
> +[      ]*[1-9][0-9]*[  ]+movslq        %eax, %cl
> +[      ]*[1-9][0-9]*[  ]+movslq        %rax, %cl
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movslq        %al, %cx
> +[      ]*[1-9][0-9]*[  ]+movslq        %ax, %cx
> +[      ]*[1-9][0-9]*[  ]+movslq        %eax, %cx
> +[      ]*[1-9][0-9]*[  ]+movslq        %rax, %cx
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movslq        %al, %ecx
> +[      ]*[1-9][0-9]*[  ]+movslq        %ax, %ecx
> +[      ]*[1-9][0-9]*[  ]+movslq        %eax, %ecx
> +[      ]*[1-9][0-9]*[  ]+movslq        %rax, %ecx
> +[      ]*[1-9][0-9]*[  ]*
> +[      ]*[1-9][0-9]*[  ]+movslq        %al, %rcx
> +[      ]*[1-9][0-9]*[  ]+movslq        %ax, %rcx
> +[      ]*[1-9][0-9]* \?\?\?\? 4863C8[  ]+movslq        %eax, %rcx
> +[      ]*[1-9][0-9]*[  ]+movslq        %rax, %rcx
> +[      ]*[1-9][0-9]*[  ]*
>  [      ]*[1-9][0-9]*[  ]+movzx:
>  [      ]*[1-9][0-9]*[  ]+movzx %al, %cl
>  [      ]*[1-9][0-9]*[  ]+movzx %ax, %cl
> --- a/gas/testsuite/gas/i386/movx64.s
> +++ b/gas/testsuite/gas/i386/movx64.s
> @@ -241,6 +241,46 @@ movsx:
>         movswq  %eax, %rcx
>         movswq  %rax, %rcx
>
> +       movsl   %al, %cl
> +       movsl   %ax, %cl
> +       movsl   %eax, %cl
> +       movsl   %rax, %cl
> +
> +       movsl   %al, %cx
> +       movsl   %ax, %cx
> +       movsl   %eax, %cx
> +       movsl   %rax, %cx
> +
> +       movsl   %al, %ecx
> +       movsl   %ax, %ecx
> +       movsl   %eax, %ecx
> +       movsl   %rax, %ecx
> +
> +       movsl   %al, %rcx
> +       movsl   %ax, %rcx
> +       movsl   %eax, %rcx
> +       movsl   %rax, %rcx
> +
> +       movslq  %al, %cl
> +       movslq  %ax, %cl
> +       movslq  %eax, %cl
> +       movslq  %rax, %cl
> +
> +       movslq  %al, %cx
> +       movslq  %ax, %cx
> +       movslq  %eax, %cx
> +       movslq  %rax, %cx
> +
> +       movslq  %al, %ecx
> +       movslq  %ax, %ecx
> +       movslq  %eax, %ecx
> +       movslq  %rax, %ecx
> +
> +       movslq  %al, %rcx
> +       movslq  %ax, %rcx
> +       movslq  %eax, %rcx
> +       movslq  %rax, %rcx
> +
>  movzx:
>         movzx   %al, %cl
>         movzx   %ax, %cl
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -162,14 +162,9 @@ mov, 0xf24, None, Cpu386|CpuNo64, D|RegM
>  movbe, 0x0f38f0, None, CpuMovbe, D|Modrm|No_bSuf|No_sSuf|No_ldSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>
>  // Move with sign extend.
> -// "movsbl" & "movsbw" must not be unified into "movsb" to avoid
> -// conflict with the "movs" string move instruction.
> -movsbl, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg32 }
> -movsbw, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg16 }
> -movswl, 0xfbf, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32 }
> -movsbq, 0xfbe, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg8|Byte|Unspecified|BaseIndex, Reg64 }
> -movswq, 0xfbf, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
> -movslq, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
> +movsb, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_sSuf|No_ldSuf|Pass2, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +movsw, 0xfbf, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|Pass2, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
> +movsl, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Pass2, { Reg32|Unspecified|BaseIndex, Reg64 }
>  movsx, 0xfbe, None, Cpu386, W|Modrm|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg8|Reg16|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  movsx, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
>  movsxd, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
>


-- 
H.J.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-10-25 17:10   ` H.J. Lu
@ 2022-10-26  9:07     ` Jan Beulich
  2022-10-27  0:11       ` H.J. Lu
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-26  9:07 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 25.10.2022 19:10, H.J. Lu wrote:
> On Tue, Oct 25, 2022 at 12:29 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> PR gas/29524
>>
>> Having templates with a suffix explicitly present has always been
>> quirky. After prior adjustment all that's left to also eliminate the
> 
> I prefer keeping the current MOVS behavior.  Please submit a new patch
> set without MOVS changes.

You're kidding? If you _still_ think this patch isn't worth it despite
its merits, and despite it now only reducing the set of templates and
adjusting/extending the testsuite, without any changes to tc-i386.c you
can simply give your okay to the first 7 patches. Why would I spam the
list with an identical re-submission of those first 7 patches?

Independent of that I'd like to understand the reasons for your
preference, in particular when considering the benefits as well as the
reported bug which is being fixed (and which you marked as WONTFIX for
a subjective reason). So far your objection was that the change
complicates the code. Now it actually simplifies it slightly by
halving the number of involved templates.

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-10-26  9:07     ` Jan Beulich
@ 2022-10-27  0:11       ` H.J. Lu
  2022-10-27  6:31         ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: H.J. Lu @ 2022-10-27  0:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Wed, Oct 26, 2022 at 2:07 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 25.10.2022 19:10, H.J. Lu wrote:
> > On Tue, Oct 25, 2022 at 12:29 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> PR gas/29524
> >>
> >> Having templates with a suffix explicitly present has always been
> >> quirky. After prior adjustment all that's left to also eliminate the
> >
> > I prefer keeping the current MOVS behavior.  Please submit a new patch
> > set without MOVS changes.
>
> You're kidding? If you _still_ think this patch isn't worth it despite
> its merits, and despite it now only reducing the set of templates and
> adjusting/extending the testsuite, without any changes to tc-i386.c you

Without this change, the Pass2 change can be dropped.

> can simply give your okay to the first 7 patches. Why would I spam the
> list with an identical re-submission of those first 7 patches?
>
> Independent of that I'd like to understand the reasons for your
> preference, in particular when considering the benefits as well as the
> reported bug which is being fixed (and which you marked as WONTFIX for
> a subjective reason). So far your objection was that the change
> complicates the code. Now it actually simplifies it slightly by
> halving the number of involved templates.
>
> Jan



-- 
H.J.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-10-27  0:11       ` H.J. Lu
@ 2022-10-27  6:31         ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2022-10-27  6:31 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 27.10.2022 02:11, H.J. Lu wrote:
> On Wed, Oct 26, 2022 at 2:07 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 25.10.2022 19:10, H.J. Lu wrote:
>>> On Tue, Oct 25, 2022 at 12:29 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> PR gas/29524
>>>>
>>>> Having templates with a suffix explicitly present has always been
>>>> quirky. After prior adjustment all that's left to also eliminate the
>>>
>>> I prefer keeping the current MOVS behavior.  Please submit a new patch
>>> set without MOVS changes.
>>
>> You're kidding? If you _still_ think this patch isn't worth it despite
>> its merits, and despite it now only reducing the set of templates and
>> adjusting/extending the testsuite, without any changes to tc-i386.c you
> 
> Without this change, the Pass2 change can be dropped.

No, it cannot.

Jan

>> can simply give your okay to the first 7 patches. Why would I spam the
>> list with an identical re-submission of those first 7 patches?
>>
>> Independent of that I'd like to understand the reasons for your
>> preference, in particular when considering the benefits as well as the
>> reported bug which is being fixed (and which you marked as WONTFIX for
>> a subjective reason). So far your objection was that the change
>> complicates the code. Now it actually simplifies it slightly by
>> halving the number of involved templates.
>>
>> Jan
> 
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-25  7:26 ` [PATCH v5 3/8] x86: re-work insn/suffix recognition Jan Beulich
@ 2022-10-27 17:21   ` H.J. Lu
  2022-10-28  9:00     ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: H.J. Lu @ 2022-10-27 17:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Tue, Oct 25, 2022 at 12:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Having templates with a suffix explicitly present has always been
> quirky. Introduce a 2nd matching pass in case the 1st one couldn't find
> a suitable template _and_ didn't itself already need to trim off a
> suffix to find a match at all. This requires error reporting adjustments
> (albeit luckily fewer than I was afraid might be necessary), as errors
> previously reported during matching now need deferring until after the
> 2nd pass (because, obviously, we must not emit any error if the 2nd pass
> succeeds). While also related to PR gas/29524, it was requested that
> move-with-sign-extend be left as broken as it always was.
>
> PR gas/29525
> Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
> templates taking operands, mixed IsString/non-IsString template groups
> (with memory operands) cannot occur anymore. With that
> maybe_adjust_templates() becomes unnecessary (and is hence being
> removed).
>
> PR gas/29526
> Note further that while the additions to the intel16 testcase aren't
> really proper Intel syntax, we've been permitting all of those except
> for the MOVD variant. The test therefore is to avoid re-introducing such
> an inconsistency.
> ---
> To limit code churn I'm using "goto" for the retry loop, but I'd be
> happy to make this a proper loop either right here or in a follow-on
> change doing just the necessary re-indentation.
>
> The "too many memory references" errors which are being deleted weren't
> fully consistent anyway - even the majority of IsString insns accepts
> only a single memory operand. If we wanted to retain that, it would need
> re-introducing in md_assemble(), latching the error into i.error just
> like match_template() does.
>
> Why is "MOVQ $imm64, %reg64" being optimized but "MOVABS $imm64, %reg64"
> is not?
> ---
> v5: Split off move-with-sign-extend changes.
> v4: Retain support for operand-less MOVSD and CMPSD.
> v3: Limit xstrdup() to just the templates where a 2nd pass actually
>     makes sense (new Pass2 attribute).
>
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -297,9 +297,6 @@ struct _i386_insn
>         explicit segment overrides are given.  */
>      const reg_entry *seg[2];
>
> -    /* Copied first memory operand string, for re-checking.  */
> -    char *memop1_string;
> -
>      /* PREFIX holds all the given prefix opcodes (usually null).
>         PREFIXES is the number of prefix opcodes.  */
>      unsigned int prefixes;
> @@ -4272,7 +4269,20 @@ optimize_encoding (void)
>            movq $imm31, %r64   -> movl $imm31, %r32
>            movq $imm32, %r64   -> movl $imm32, %r32
>          */
> -      i.tm.opcode_modifier.norex64 = 1;
> +      i.tm.opcode_modifier.size = SIZE32;
> +      if (i.imm_operands)
> +       {
> +         i.types[0].bitfield.imm32 = 1;
> +         i.types[0].bitfield.imm32s = 0;
> +         i.types[0].bitfield.imm64 = 0;
> +       }
> +      else
> +       {
> +         i.types[0].bitfield.dword = 1;
> +         i.types[0].bitfield.qword = 0;
> +       }
> +      i.types[1].bitfield.dword = 1;
> +      i.types[1].bitfield.qword = 0;
>        if (i.tm.base_opcode == 0xb8 || (i.tm.base_opcode | 1) == 0xc7)
>         {
>           /* Handle
> @@ -4282,11 +4292,6 @@ optimize_encoding (void)
>           i.tm.operand_types[0].bitfield.imm32 = 1;
>           i.tm.operand_types[0].bitfield.imm32s = 0;
>           i.tm.operand_types[0].bitfield.imm64 = 0;
> -         i.types[0].bitfield.imm32 = 1;
> -         i.types[0].bitfield.imm32s = 0;
> -         i.types[0].bitfield.imm64 = 0;
> -         i.types[1].bitfield.dword = 1;
> -         i.types[1].bitfield.qword = 0;
>           if ((i.tm.base_opcode | 1) == 0xc7)
>             {
>               /* Handle
> @@ -4818,11 +4823,14 @@ void
>  md_assemble (char *line)
>  {
>    unsigned int j;
> -  char mnemonic[MAX_MNEM_SIZE], mnem_suffix;
> -  const char *end;
> +  char mnemonic[MAX_MNEM_SIZE], mnem_suffix, *copy = NULL;
> +  const char *end, *pass1_mnem = NULL;
> +  enum i386_error pass1_err = 0;
>    const insn_template *t;
>
>    /* Initialize globals.  */
> +  current_templates = NULL;
> + retry:
>    memset (&i, '\0', sizeof (i));
>    i.rounding.type = rc_none;
>    for (j = 0; j < MAX_OPERANDS; j++)
> @@ -4837,16 +4845,26 @@ md_assemble (char *line)
>
>    end = parse_insn (line, mnemonic);
>    if (end == NULL)
> -    return;
> +    {
> +      if (pass1_mnem != NULL)
> +       goto match_error;
> +      return;
> +    }
> +  if (current_templates->start->opcode_modifier.pass2)
> +    {
> +      /* Make a copy of the full line in case we need to retry.  */
> +      copy = xstrdup (line);
> +    }
>    line += end - line;
>    mnem_suffix = i.suffix;
>
>    line = parse_operands (line, mnemonic);
>    this_operand = -1;
> -  xfree (i.memop1_string);
> -  i.memop1_string = NULL;
>    if (line == NULL)
> -    return;
> +    {
> +      free (copy);
> +      return;
> +    }
>
>    /* Now we've parsed the mnemonic into a set of templates, and have the
>       operands at hand.  */
> @@ -4922,7 +4940,97 @@ md_assemble (char *line)
>       with the template operand types.  */
>
>    if (!(t = match_template (mnem_suffix)))
> -    return;
> +    {
> +      const char *err_msg;
> +
> +      if (copy && !mnem_suffix)
> +       {
> +         line = copy;
> +         copy = NULL;
> +         pass1_err = i.error;
> +         pass1_mnem = current_templates->start->name;
> +         goto retry;
> +       }
> +      free (copy);
> +  match_error:
> +      switch (pass1_mnem ? pass1_err : i.error)
> +       {
> +       default:
> +         abort ();
> +       case operand_size_mismatch:
> +         err_msg = _("operand size mismatch");
> +         break;
> +       case operand_type_mismatch:
> +         err_msg = _("operand type mismatch");
> +         break;
> +       case register_type_mismatch:
> +         err_msg = _("register type mismatch");
> +         break;
> +       case number_of_operands_mismatch:
> +         err_msg = _("number of operands mismatch");
> +         break;
> +       case invalid_instruction_suffix:
> +         err_msg = _("invalid instruction suffix");
> +         break;
> +       case bad_imm4:
> +         err_msg = _("constant doesn't fit in 4 bits");
> +         break;
> +       case unsupported_with_intel_mnemonic:
> +         err_msg = _("unsupported with Intel mnemonic");
> +         break;
> +       case unsupported_syntax:
> +         err_msg = _("unsupported syntax");
> +         break;
> +       case unsupported:
> +         as_bad (_("unsupported instruction `%s'"),
> +                 pass1_mnem ? pass1_mnem : current_templates->start->name);
> +         return;
> +       case invalid_sib_address:
> +         err_msg = _("invalid SIB address");
> +         break;
> +       case invalid_vsib_address:
> +         err_msg = _("invalid VSIB address");
> +         break;
> +       case invalid_vector_register_set:
> +         err_msg = _("mask, index, and destination registers must be distinct");
> +         break;
> +       case invalid_tmm_register_set:
> +         err_msg = _("all tmm registers must be distinct");
> +         break;
> +       case invalid_dest_and_src_register_set:
> +         err_msg = _("destination and source registers must be distinct");
> +         break;
> +       case unsupported_vector_index_register:
> +         err_msg = _("unsupported vector index register");
> +         break;
> +       case unsupported_broadcast:
> +         err_msg = _("unsupported broadcast");
> +         break;
> +       case broadcast_needed:
> +         err_msg = _("broadcast is needed for operand of such type");
> +         break;
> +       case unsupported_masking:
> +         err_msg = _("unsupported masking");
> +         break;
> +       case mask_not_on_destination:
> +         err_msg = _("mask not on destination operand");
> +         break;
> +       case no_default_mask:
> +         err_msg = _("default mask isn't allowed");
> +         break;
> +       case unsupported_rc_sae:
> +         err_msg = _("unsupported static rounding/sae");
> +         break;
> +       case invalid_register_operand:
> +         err_msg = _("invalid register operand");
> +         break;
> +       }
> +      as_bad (_("%s for `%s'"), err_msg,
> +             pass1_mnem ? pass1_mnem : current_templates->start->name);
> +      return;
> +    }
> +
> +  free (copy);
>
>    if (sse_check != check_none
>        /* The opcode space check isn't strictly needed; it's there only to
> @@ -5223,6 +5331,7 @@ parse_insn (const char *line, char *mnem
>  {
>    const char *l = line, *token_start = l;
>    char *mnem_p;
> +  bool pass1 = !current_templates;
>    int supported;
>    const insn_template *t;
>    char *dot_p = NULL;
> @@ -5392,8 +5501,10 @@ parse_insn (const char *line, char *mnem
>        current_templates = (const templates *) str_hash_find (op_hash, mnemonic);
>      }
>
> -  if (!current_templates)
> +  if (!current_templates || !pass1)
>      {
> +      current_templates = NULL;
> +
>      check_suffix:
>        if (mnem_p > mnemonic)
>         {
> @@ -5435,13 +5546,39 @@ parse_insn (const char *line, char *mnem
>                   current_templates
>                     = (const templates *) str_hash_find (op_hash, mnemonic);
>                 }
> +             /* For compatibility reasons accept MOVSD and CMPSD without
> +                operands even in AT&T mode.  */
> +             else if (*l == END_OF_INSN
> +                      || (is_space_char (*l) && l[1] == END_OF_INSN))
> +               {
> +                 mnem_p[-1] = '\0';
> +                 current_templates
> +                   = (const templates *) str_hash_find (op_hash, mnemonic);
> +                 if (current_templates != NULL
> +                     /* MOVS or CMPS */
> +                     && (current_templates->start->base_opcode | 2) == 0xa6
> +                     && current_templates->start->opcode_modifier.opcodespace
> +                        == SPACE_BASE
> +                     && mnem_p[-2] == 's')
> +                   {
> +                     as_warn (_("found `%sd'; assuming `%sl' was meant"),
> +                              mnemonic, mnemonic);
> +                     i.suffix = LONG_MNEM_SUFFIX;
> +                   }
> +                 else
> +                   {
> +                     current_templates = NULL;
> +                     mnem_p[-1] = 'd';
> +                   }
> +               }
>               break;
>             }
>         }
>
>        if (!current_templates)
>         {
> -         as_bad (_("no such instruction: `%s'"), token_start);
> +         if (pass1)
> +           as_bad (_("no such instruction: `%s'"), token_start);
>           return NULL;
>         }
>      }
> @@ -6845,81 +6982,7 @@ match_template (char mnem_suffix)
>    if (t == current_templates->end)
>      {
>        /* We found no match.  */
> -      const char *err_msg;
> -      switch (specific_error)
> -       {
> -       default:
> -         abort ();
> -       case operand_size_mismatch:
> -         err_msg = _("operand size mismatch");
> -         break;
> -       case operand_type_mismatch:
> -         err_msg = _("operand type mismatch");
> -         break;
> -       case register_type_mismatch:
> -         err_msg = _("register type mismatch");
> -         break;
> -       case number_of_operands_mismatch:
> -         err_msg = _("number of operands mismatch");
> -         break;
> -       case invalid_instruction_suffix:
> -         err_msg = _("invalid instruction suffix");
> -         break;
> -       case bad_imm4:
> -         err_msg = _("constant doesn't fit in 4 bits");
> -         break;
> -       case unsupported_with_intel_mnemonic:
> -         err_msg = _("unsupported with Intel mnemonic");
> -         break;
> -       case unsupported_syntax:
> -         err_msg = _("unsupported syntax");
> -         break;
> -       case unsupported:
> -         as_bad (_("unsupported instruction `%s'"),
> -                 current_templates->start->name);
> -         return NULL;
> -       case invalid_sib_address:
> -         err_msg = _("invalid SIB address");
> -         break;
> -       case invalid_vsib_address:
> -         err_msg = _("invalid VSIB address");
> -         break;
> -       case invalid_vector_register_set:
> -         err_msg = _("mask, index, and destination registers must be distinct");
> -         break;
> -       case invalid_tmm_register_set:
> -         err_msg = _("all tmm registers must be distinct");
> -         break;
> -       case invalid_dest_and_src_register_set:
> -         err_msg = _("destination and source registers must be distinct");
> -         break;
> -       case unsupported_vector_index_register:
> -         err_msg = _("unsupported vector index register");
> -         break;
> -       case unsupported_broadcast:
> -         err_msg = _("unsupported broadcast");
> -         break;
> -       case broadcast_needed:
> -         err_msg = _("broadcast is needed for operand of such type");
> -         break;
> -       case unsupported_masking:
> -         err_msg = _("unsupported masking");
> -         break;
> -       case mask_not_on_destination:
> -         err_msg = _("mask not on destination operand");
> -         break;
> -       case no_default_mask:
> -         err_msg = _("default mask isn't allowed");
> -         break;
> -       case unsupported_rc_sae:
> -         err_msg = _("unsupported static rounding/sae");
> -         break;
> -       case invalid_register_operand:
> -         err_msg = _("invalid register operand");
> -         break;
> -       }
> -      as_bad (_("%s for `%s'"), err_msg,
> -             current_templates->start->name);
> +      i.error = specific_error;
>        return NULL;
>      }
>
> @@ -11328,49 +11391,6 @@ RC_SAE_immediate (const char *imm_start)
>    return 1;
>  }
>
> -/* Only string instructions can have a second memory operand, so
> -   reduce current_templates to just those if it contains any.  */
> -static int
> -maybe_adjust_templates (void)
> -{
> -  const insn_template *t;
> -
> -  gas_assert (i.mem_operands == 1);
> -
> -  for (t = current_templates->start; t < current_templates->end; ++t)
> -    if (t->opcode_modifier.isstring)
> -      break;
> -
> -  if (t < current_templates->end)
> -    {
> -      static templates aux_templates;
> -      bool recheck;
> -
> -      aux_templates.start = t;
> -      for (; t < current_templates->end; ++t)
> -       if (!t->opcode_modifier.isstring)
> -         break;
> -      aux_templates.end = t;
> -
> -      /* Determine whether to re-check the first memory operand.  */
> -      recheck = (aux_templates.start != current_templates->start
> -                || t != current_templates->end);
> -
> -      current_templates = &aux_templates;
> -
> -      if (recheck)
> -       {
> -         i.mem_operands = 0;
> -         if (i.memop1_string != NULL
> -             && i386_index_check (i.memop1_string) == 0)
> -           return 0;
> -         i.mem_operands = 1;
> -       }
> -    }
> -
> -  return 1;
> -}
> -
>  static INLINE bool starts_memory_operand (char c)
>  {
>    return ISDIGIT (c)
> @@ -11521,17 +11541,6 @@ i386_att_operand (char *operand_string)
>        char *displacement_string_end;
>
>      do_memory_reference:
> -      if (i.mem_operands == 1 && !maybe_adjust_templates ())
> -       return 0;
> -      if ((i.mem_operands == 1
> -          && !current_templates->start->opcode_modifier.isstring)
> -         || i.mem_operands == 2)
> -       {
> -         as_bad (_("too many memory references for `%s'"),
> -                 current_templates->start->name);
> -         return 0;
> -       }
> -
>        /* Check for base index form.  We detect the base index form by
>          looking for an ')' at the end of the operand, searching
>          for the '(' matching it, and finding a REGISTER_PREFIX or ','
> @@ -11738,8 +11747,6 @@ i386_att_operand (char *operand_string)
>        if (i386_index_check (operand_string) == 0)
>         return 0;
>        i.flags[this_operand] |= Operand_Mem;
> -      if (i.mem_operands == 0)
> -       i.memop1_string = xstrdup (operand_string);
>        i.mem_operands++;
>      }
>    else
> --- a/gas/config/tc-i386-intel.c
> +++ b/gas/config/tc-i386-intel.c
> @@ -993,10 +993,7 @@ i386_intel_operand (char *operand_string
>            || intel_state.is_mem)
>      {
>        /* Memory operand.  */
> -      if (i.mem_operands == 1 && !maybe_adjust_templates ())
> -       return 0;
> -      if ((int) i.mem_operands
> -         >= 2 - !current_templates->start->opcode_modifier.isstring)
> +      if (i.mem_operands)
>         {
>           /* Handle
>
> @@ -1041,10 +1038,6 @@ i386_intel_operand (char *operand_string
>                     }
>                 }
>             }
> -
> -         as_bad (_("too many memory references for `%s'"),
> -                 current_templates->start->name);
> -         return 0;
>         }
>
>        /* Swap base and index in 16-bit memory operands like
> @@ -1158,8 +1151,6 @@ i386_intel_operand (char *operand_string
>         return 0;
>
>        i.flags[this_operand] |= Operand_Mem;
> -      if (i.mem_operands == 0)
> -       i.memop1_string = xstrdup (operand_string);
>        ++i.mem_operands;
>      }
>    else
> --- a/gas/testsuite/gas/i386/code16.d
> +++ b/gas/testsuite/gas/i386/code16.d
> @@ -1,5 +1,6 @@
>  #objdump: -drw -mi8086
>  #name: i386 with .code16
> +#warning_output: code16.e
>
>  .*: +file format .*
>
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/code16.e
> @@ -0,0 +1,3 @@
> +.*: Assembler messages:
> +.*:3: Warning: .* `movsd'.*`movsl'.*
> +.*:4: Warning: .* `cmpsd'.*`cmpsl'.*
> --- a/gas/testsuite/gas/i386/code16.s
> +++ b/gas/testsuite/gas/i386/code16.s
> @@ -2,8 +2,8 @@
>         .code16
>         rep; movsd
>         rep; cmpsd
> -       rep movsd %ds:(%si),%es:(%di)
> -       rep cmpsd %es:(%di),%ds:(%si)
> +       rep movsl %ds:(%si),%es:(%di)
> +       rep cmpsl %es:(%di),%ds:(%si)
>
>         mov     %cr2, %ecx
>         mov     %ecx, %cr2
> --- a/gas/testsuite/gas/i386/intel16.d
> +++ b/gas/testsuite/gas/i386/intel16.d
> @@ -20,4 +20,12 @@ Disassembly of section .text:
>    2c:  8d 02 [         ]*lea    \(%bp,%si\),%ax
>    2e:  8d 01 [         ]*lea    \(%bx,%di\),%ax
>    30:  8d 03 [         ]*lea    \(%bp,%di\),%ax
> -       ...
> +[      ]*[0-9a-f]+:    67 f7 13[       ]+notw[         ]+\(%ebx\)
> +[      ]*[0-9a-f]+:    66 f7 17[       ]+notl[         ]+\(%bx\)
> +[      ]*[0-9a-f]+:    67 0f 1f 03[    ]+nopw[         ]+\(%ebx\)
> +[      ]*[0-9a-f]+:    66 0f 1f 07[    ]+nopl[         ]+\(%bx\)
> +[      ]*[0-9a-f]+:    67 83 03 05[    ]+addw[         ]+\$0x5,\(%ebx\)
> +[      ]*[0-9a-f]+:    66 83 07 05[    ]+addl[         ]+\$0x5,\(%bx\)
> +[      ]*[0-9a-f]+:    67 c7 03 05 00[         ]+movw[         ]+\$0x5,\(%ebx\)
> +[      ]*[0-9a-f]+:    66 c7 07 05 00 00 00[   ]+movl[         ]+\$0x5,\(%bx\)
> +#pass
> --- a/gas/testsuite/gas/i386/intel16.s
> +++ b/gas/testsuite/gas/i386/intel16.s
> @@ -18,4 +18,14 @@
>   lea   ax, [di][bx]
>   lea   ax, [di][bp]
>
> - .p2align 4,0
> + notw  [ebx]
> + notd  [bx]
> +
> + nopw  [ebx]
> + nopd  [bx]
> +
> + addw  [ebx], 5
> + addd  [bx], 5
> +
> + movw  [ebx], 5
> + movd  [bx], 5
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -135,34 +135,27 @@
>  mov, 0xa0, None, CpuNo64, D|W|No_sSuf|No_qSuf|No_ldSuf, { Disp16|Disp32|Unspecified|Byte|Word|Dword, Acc|Byte|Word|Dword }
>  mov, 0xa0, None, Cpu64, D|W|No_sSuf|No_ldSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
>  movabs, 0xa0, None, Cpu64, D|W|No_sSuf|No_ldSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
> -movq, 0xa1, None, Cpu64, D|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Disp64|Unspecified|Qword, Acc|Qword }
>  mov, 0x88, None, 0, D|W|CheckRegSize|Modrm|No_sSuf|No_ldSuf|HLEPrefixRelease, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> -movq, 0x89, None, Cpu64, D|Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixRelease, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
>  // In the 64bit mode the short form mov immediate is redefined to have
>  // 64bit value.
>  mov, 0xb0, None, 0, W|No_sSuf|No_qSuf|No_ldSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32 }
>  mov, 0xc6, 0, 0, W|Modrm|No_sSuf|No_ldSuf|HLEPrefixRelease|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> -movq, 0xc7, 0, Cpu64, Modrm|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|HLEPrefixRelease|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
>  mov, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
>  movabs, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf, { Imm64, Reg64 }
> -movq, 0xb8, None, Cpu64, Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { Imm64, Reg64 }
>  // The segment register moves accept WordReg so that a segment register
>  // can be copied to a 32 bit register, and vice versa, without using a
>  // size prefix.  When moving to a 32 bit register, the upper 16 bits
>  // are set to an implementation defined value (on the Pentium Pro, the
>  // implementation defined value is zero).
> -mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
> +mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|No_ldSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
>  mov, 0x8c, None, 0, D|Modrm|IgnoreSize|No_bSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { SReg, Word|Unspecified|BaseIndex }
> -movq, 0x8c, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { SReg, Reg64 }
> -mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
> +mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|No_ldSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
>  // Move to/from control debug registers.  In the 16 or 32bit modes
>  // they are 32bit.  In the 64bit mode they are 64bit.
>  mov, 0xf20, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Control, Reg32 }
>  mov, 0xf20, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Control, Reg64 }
> -movq, 0xf20, None, Cpu64, D|RegMem|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Control, Reg64 }
>  mov, 0xf21, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Debug, Reg32 }
>  mov, 0xf21, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_ldSuf|NoRex64, { Debug, Reg64 }
> -movq, 0xf21, None, Cpu64, D|RegMem|Size64|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoRex64, { Debug, Reg64 }
>  mov, 0xf24, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf|No_ldSuf, { Test, Reg32 }
>
>  // Move after swapping the bytes
> @@ -492,9 +485,6 @@ set<cc>, 0xf9<cc:opc>, 0, Cpu386, Modrm|
>  // String manipulation.
>  cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
>  cmps, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> -// Intel mode string compare.
> -cmpsd, 0xa7, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
> -cmpsd, 0xa7, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
>  scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
>  scmp, 0xa6, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>  ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
> @@ -509,9 +499,6 @@ slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|
>  slod, 0xac, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
>  movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
>  movs, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> -// Intel mode string move.
> -movsd, 0xa5, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsString|RepPrefixOk, {}
> -movsd, 0xa5, None, Cpu386, Size32|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
>  smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
>  smov, 0xa4, None, 0, W|No_sSuf|No_ldSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
>  scas, 0xae, None, 0, W|No_sSuf|No_ldSuf|IsString|RepPrefixOk, {}
> @@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
>  // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
>  // spec). AMD's spec, having been in existence for much longer, failed to
>  // recognize that and specified movd for 32- and 64-bit operations.
> -movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
> +movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
>  movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
>  movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
>  movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
>  movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
>  movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
> -movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> +movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>  movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
>  movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
>  movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
> @@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
>  cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
>  cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
>  cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
> -cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> +cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>  comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>  cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
>  cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> @@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
>  movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
>  movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
>  movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
> -movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> +movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
>  movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
>  movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>  movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
>

All these instructions with Pass2 have a vector register operand.  Can you
use it instead?

-- 
H.J.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-27 17:21   ` H.J. Lu
@ 2022-10-28  9:00     ` Jan Beulich
  2022-10-28 16:12       ` H.J. Lu
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-28  9:00 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 27.10.2022 19:21, H.J. Lu wrote:
> On Tue, Oct 25, 2022 at 12:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>> @@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
>>  // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
>>  // spec). AMD's spec, having been in existence for much longer, failed to
>>  // recognize that and specified movd for 32- and 64-bit operations.
>> -movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
>> +movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
>>  movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
>>  movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
>>  movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
>>  movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
>>  movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
>> -movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>> +movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>  movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
>>  movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
>>  movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
>> @@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
>>  cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
>>  cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
>>  cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
>> -cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>> +cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>  comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>  cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
>>  cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
>> @@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
>>  movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
>>  movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
>>  movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
>> -movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
>> +movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
>>  movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
>>  movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>  movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
>>
> 
> All these instructions with Pass2 have a vector register operand.  Can you
> use it instead?

I probably could, but that would go against your request of limiting as
much as possible the allocation of a copy of the input line: Far more
insns would then also undergo this setup for a possible 2nd parsing pass.
It may be possible to limit the set some by taking into account further
attributes (like, leaving aside move-with-sign-extend which you object
to being corrected, SSE2AVX), but the collection of those checks would
likely be clumsy and harder to maintain than the new attribute. For
example the consistency check in i386-gen would then not be possible
anymore, and hence the whole thing would become _silently_ dependent on
the ordering of templates in the file.

Furthermore keying this condition to insn attributes would be less
future proof. While it may not be very likely for new colliding
mnemonics to appear, if one would which then doesn't fit the chosen
pattern, the tweaking of the conditional would be more fragile than
the mere adding of Pass2 in the appropriate template. (Even if you
continue to object to it, simply consider the last patch of this series
an example.)

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-28  9:00     ` Jan Beulich
@ 2022-10-28 16:12       ` H.J. Lu
  2022-10-31 11:40         ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: H.J. Lu @ 2022-10-28 16:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Fri, Oct 28, 2022 at 2:00 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 27.10.2022 19:21, H.J. Lu wrote:
> > On Tue, Oct 25, 2022 at 12:26 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> @@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
> >>  // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
> >>  // spec). AMD's spec, having been in existence for much longer, failed to
> >>  // recognize that and specified movd for 32- and 64-bit operations.
> >> -movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
> >> +movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
> >>  movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
> >>  movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
> >>  movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
> >>  movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
> >>  movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
> >> -movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >> +movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>  movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
> >>  movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
> >>  movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
> >> @@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
> >>  cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >>  cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
> >>  cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
> >> -cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >> +cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>  comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>  cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
> >>  cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> >> @@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
> >>  movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
> >>  movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
> >>  movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
> >> -movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> >> +movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
> >>  movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
> >>  movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>  movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >>
> >
> > All these instructions with Pass2 have a vector register operand.  Can you
> > use it instead?
>
> I probably could, but that would go against your request of limiting as
> much as possible the allocation of a copy of the input line: Far more
> insns would then also undergo this setup for a possible 2nd parsing pass.
> It may be possible to limit the set some by taking into account further
> attributes (like, leaving aside move-with-sign-extend which you object
> to being corrected, SSE2AVX), but the collection of those checks would
> likely be clumsy and harder to maintain than the new attribute. For
> example the consistency check in i386-gen would then not be possible
> anymore, and hence the whole thing would become _silently_ dependent on
> the ordering of templates in the file.

Order dependency is the part of the implementation.

> Furthermore keying this condition to insn attributes would be less
> future proof. While it may not be very likely for new colliding
> mnemonics to appear, if one would which then doesn't fit the chosen
> pattern, the tweaking of the conditional would be more fragile than
> the mere adding of Pass2 in the appropriate template. (Even if you
> continue to object to it, simply consider the last patch of this series
> an example.)
>

I like to avoid the second pass for many common instructions.
The current assembler may not be ideal.  But it works.

-- 
H.J.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-28 16:12       ` H.J. Lu
@ 2022-10-31 11:40         ` Jan Beulich
  2022-10-31 16:59           ` H.J. Lu
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2022-10-31 11:40 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 28.10.2022 18:12, H.J. Lu wrote:
> On Fri, Oct 28, 2022 at 2:00 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 27.10.2022 19:21, H.J. Lu wrote:
>>> On Tue, Oct 25, 2022 at 12:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> @@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
>>>>  // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
>>>>  // spec). AMD's spec, having been in existence for much longer, failed to
>>>>  // recognize that and specified movd for 32- and 64-bit operations.
>>>> -movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
>>>> +movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
>>>>  movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
>>>>  movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
>>>>  movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
>>>>  movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
>>>>  movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
>>>> -movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>> +movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>>  movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
>>>>  movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
>>>>  movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
>>>> @@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
>>>>  cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
>>>>  cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
>>>>  cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
>>>> -cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>> +cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>>  comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>>  cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
>>>>  cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
>>>> @@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
>>>>  movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
>>>>  movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
>>>>  movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
>>>> -movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
>>>> +movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
>>>>  movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
>>>>  movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
>>>>  movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
>>>>
>>>
>>> All these instructions with Pass2 have a vector register operand.  Can you
>>> use it instead?
>>
>> I probably could, but that would go against your request of limiting as
>> much as possible the allocation of a copy of the input line: Far more
>> insns would then also undergo this setup for a possible 2nd parsing pass.
>> It may be possible to limit the set some by taking into account further
>> attributes (like, leaving aside move-with-sign-extend which you object
>> to being corrected, SSE2AVX), but the collection of those checks would
>> likely be clumsy and harder to maintain than the new attribute. For
>> example the consistency check in i386-gen would then not be possible
>> anymore, and hence the whole thing would become _silently_ dependent on
>> the ordering of templates in the file.
> 
> Order dependency is the part of the implementation.
> 
>> Furthermore keying this condition to insn attributes would be less
>> future proof. While it may not be very likely for new colliding
>> mnemonics to appear, if one would which then doesn't fit the chosen
>> pattern, the tweaking of the conditional would be more fragile than
>> the mere adding of Pass2 in the appropriate template. (Even if you
>> continue to object to it, simply consider the last patch of this series
>> an example.)
>>
> 
> I like to avoid the second pass for many common instructions.
> The current assembler may not be ideal.  But it works.

So am I understanding correctly that you'd like me to drop Pass2 again,
accepting that a few too many insns would have the input buffer cloned
then?

Jan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 3/8] x86: re-work insn/suffix recognition
  2022-10-31 11:40         ` Jan Beulich
@ 2022-10-31 16:59           ` H.J. Lu
  0 siblings, 0 replies; 19+ messages in thread
From: H.J. Lu @ 2022-10-31 16:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Mon, Oct 31, 2022 at 4:40 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 28.10.2022 18:12, H.J. Lu wrote:
> > On Fri, Oct 28, 2022 at 2:00 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 27.10.2022 19:21, H.J. Lu wrote:
> >>> On Tue, Oct 25, 2022 at 12:26 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> @@ -989,13 +976,13 @@ emms, 0xf77, None, CpuMMX, No_bSuf|No_wS
> >>>>  // copying between Reg64/Mem64 and RegXMM/RegMMX, as is mandated by Intel's
> >>>>  // spec). AMD's spec, having been in existence for much longer, failed to
> >>>>  // recognize that and specified movd for 32- and 64-bit operations.
> >>>> -movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Reg32|Unspecified|BaseIndex, RegXMM }
> >>>> +movd, 0x666e, None, CpuAVX, D|Modrm|Vex128|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Reg32|Unspecified|BaseIndex, RegXMM }
> >>>>  movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
> >>>>  movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
> >>>>  movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegXMM }
> >>>>  movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
> >>>>  movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64, { Reg64|BaseIndex, RegMMX }
> >>>> -movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>> +movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>>  movq, 0x66d6, None, CpuAVX, Modrm|Vex=1|Space0F|VexWIG|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
> >>>>  movq, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Size64|SSE2AVX, { Reg64|Unspecified|BaseIndex, RegXMM }
> >>>>  movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
> >>>> @@ -1159,7 +1146,7 @@ andpd<sse2>, 0x660f54, None, <sse2:cpu>,
> >>>>  cmp<frel>pd<sse2>, 0x660fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >>>>  cmp<frel>sd<sse2>, 0xf20fc2, <frel:imm>, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|<frel:comm>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ImmExt, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
> >>>>  cmppd<sse2>, 0x660fc2, None, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
> >>>> -cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>> +cmpsd<sse2>, 0xf20fc2, None, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Pass2, { Imm8, Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>>  comisd<sse2>, 0x660f2f, None, <sse2:cpu>, Modrm|<sse2:scal>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>>  cvtpi2pd, 0x660f2a, None, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegMMX, RegXMM }
> >>>>  cvtpi2pd, 0xf3e6, None, CpuAVX, Modrm|Vex|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> >>>> @@ -1184,7 +1171,7 @@ movlpd, 0x6613, None, CpuAVX, Modrm|Vex|
> >>>>  movlpd, 0x660f12, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex, RegXMM }
> >>>>  movmskpd<sse2>, 0x660f50, None, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_ldSuf|NoRex64, { RegXMM, Reg32|Reg64 }
> >>>>  movntpd<sse2>, 0x660f2b, None, <sse2:cpu>, Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
> >>>> -movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
> >>>> +movsd, 0xf210, None, CpuAVX, D|Modrm|VexLIG|Space0F|VexW0|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX|Pass2, { Qword|Unspecified|BaseIndex, RegXMM }
> >>>>  movsd, 0xf210, None, CpuAVX, D|Modrm|Vex=3|Space0F|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|SSE2AVX, { RegXMM, RegXMM }
> >>>>  movsd, 0xf20f10, None, CpuSSE2, D|Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
> >>>>  movupd<sse2>, 0x660f10, None, <sse2:cpu>, D|Modrm|<sse2:attr>|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> >>>>
> >>>
> >>> All these instructions with Pass2 have a vector register operand.  Can you
> >>> use it instead?
> >>
> >> I probably could, but that would go against your request of limiting as
> >> much as possible the allocation of a copy of the input line: Far more
> >> insns would then also undergo this setup for a possible 2nd parsing pass.
> >> It may be possible to limit the set some by taking into account further
> >> attributes (like, leaving aside move-with-sign-extend which you object
> >> to being corrected, SSE2AVX), but the collection of those checks would
> >> likely be clumsy and harder to maintain than the new attribute. For
> >> example the consistency check in i386-gen would then not be possible
> >> anymore, and hence the whole thing would become _silently_ dependent on
> >> the ordering of templates in the file.
> >
> > Order dependency is the part of the implementation.
> >
> >> Furthermore keying this condition to insn attributes would be less
> >> future proof. While it may not be very likely for new colliding
> >> mnemonics to appear, if one would which then doesn't fit the chosen
> >> pattern, the tweaking of the conditional would be more fragile than
> >> the mere adding of Pass2 in the appropriate template. (Even if you
> >> continue to object to it, simply consider the last patch of this series
> >> an example.)
> >>
> >
> > I like to avoid the second pass for many common instructions.
> > The current assembler may not be ideal.  But it works.
>
> So am I understanding correctly that you'd like me to drop Pass2 again,
> accepting that a few too many insns would have the input buffer cloned
> then?
>

Yes.  It is too much to require the second pass for common instructions.


-- 
H.J.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-10-31 16:59 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-25  7:23 [PATCH v5 0/8] x86: suffix handling changes Jan Beulich
2022-10-25  7:24 ` [PATCH v5 1/8] x86: constify parse_insn()'s input Jan Beulich
2022-10-25  7:25 ` [PATCH v5 1/8] x86: introduce Pass2 insn attribute Jan Beulich
2022-10-25  7:30   ` [PATCH v5 2/8] " Jan Beulich
2022-10-25  7:26 ` [PATCH v5 3/8] x86: re-work insn/suffix recognition Jan Beulich
2022-10-27 17:21   ` H.J. Lu
2022-10-28  9:00     ` Jan Beulich
2022-10-28 16:12       ` H.J. Lu
2022-10-31 11:40         ` Jan Beulich
2022-10-31 16:59           ` H.J. Lu
2022-10-25  7:26 ` [PATCH v5 4/8] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
2022-10-25  7:27 ` [PATCH v5 5/8] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
2022-10-25  7:27 ` [PATCH v5 6/8] x86: move bad-use-of-TLS-reloc check Jan Beulich
2022-10-25  7:28 ` [PATCH v5 7/8] x86: drop (now) stray IsString Jan Beulich
2022-10-25  7:29 ` [PATCH v5 8/8] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
2022-10-25 17:10   ` H.J. Lu
2022-10-26  9:07     ` Jan Beulich
2022-10-27  0:11       ` H.J. Lu
2022-10-27  6:31         ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).