public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v7 0/7] x86: suffix handling changes
@ 2022-12-02 10:13 Jan Beulich
  2022-12-02 10:18 ` [PATCH v7 1/7] x86: constify parse_insn()'s input Jan Beulich
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:13 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

... accompanied by a few other improvements (or so I hope) found
along the way.

Apart from re-basing this is merely a re-submission, in an attempt to
revive the discussion: There are multiple bugs being fixed by this
series, which I'd like to ask to not put off simply with a subjective
qualification of them being minor. Plus even if minor, it's one thing
to not invest time into addressing a minor bug, but here a fix is
being supplied, ready for committing. I'd like to also stress again
that I've taken quite a bit of time to address concerns, and to hence
find a viable compromise.

1: constify parse_insn()'s input
2: re-work insn/suffix recognition
3: don't recognize/derive Q suffix in the common case
4: allow HLE store of accumulator to absolute 32-bit address
5: move bad-use-of-TLS-reloc check
6: drop (now) stray IsString
7: further re-work insn/suffix recognition to also cover MOVSX

Note that patches 3...5 were previously approved (albeit 3 has now
changed because of the re-ordering of the series in v5), but can't be
committed ahead of the earlier ones.

While I don't think there's any collision with the also pending "x86:
break gas dependency on libopcodes" series (apart from generated files),
just in case: The series here is presented on top of that one, but could
be easily re-based ahead.

Jan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 1/7] x86: constify parse_insn()'s input
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
@ 2022-12-02 10:18 ` Jan Beulich
  2022-12-02 10:18 ` [PATCH v7 2/7] x86: re-work insn/suffix recognition Jan Beulich
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:18 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

The function doesn't alter its input buffer: Reflect this in its
prototype. To avoid using any kind of cast, simply calculate the update
of "line" from the function's input and output.
---
Avoiding a cast the way it's done here is going to be useful in
subsequent patches (there'll be code added between the call and the
update of "line"), so switching to a cast is not really an option.
---
v3: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -158,7 +158,7 @@ static int i386_intel_operand (char *, i
 static int i386_intel_simplify (expressionS *);
 static int i386_intel_parse_name (const char *, expressionS *);
 static const reg_entry *parse_register (char *, char **);
-static char *parse_insn (char *, char *);
+static const char *parse_insn (const char *, char *);
 static char *parse_operands (char *, const char *);
 static void swap_operands (void);
 static void swap_2_operands (unsigned int, unsigned int);
@@ -4845,6 +4845,7 @@ md_assemble (char *line)
 {
   unsigned int j;
   char mnemonic[MAX_MNEM_SIZE], mnem_suffix;
+  const char *end;
   const insn_template *t;
 
   /* Initialize globals.  */
@@ -4860,9 +4861,10 @@ md_assemble (char *line)
      We assume that the scrubber has arranged it so that line[0] is the valid
      start of a (possibly prefixed) mnemonic.  */
 
-  line = parse_insn (line, mnemonic);
-  if (line == NULL)
+  end = parse_insn (line, mnemonic);
+  if (end == NULL)
     return;
+  line += end - line;
   mnem_suffix = i.suffix;
 
   line = parse_operands (line, mnemonic);
@@ -5260,11 +5262,10 @@ md_assemble (char *line)
     last_insn.kind = last_insn_other;
 }
 
-static char *
-parse_insn (char *line, char *mnemonic)
+static const char *
+parse_insn (const char *line, char *mnemonic)
 {
-  char *l = line;
-  char *token_start = l;
+  const char *l = line, *token_start = l;
   char *mnem_p;
   int supported;
   const insn_template *t;


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 2/7] x86: re-work insn/suffix recognition
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
  2022-12-02 10:18 ` [PATCH v7 1/7] x86: constify parse_insn()'s input Jan Beulich
@ 2022-12-02 10:18 ` Jan Beulich
  2022-12-02 10:19 ` [PATCH v7 3/7] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:18 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Having templates with a suffix explicitly present has always been
quirky. Introduce a 2nd matching pass in case the 1st one couldn't find
a suitable template _and_ didn't itself already need to trim off a
suffix to find a match at all. This requires error reporting adjustments
(albeit luckily fewer than I was afraid might be necessary), as errors
previously reported during matching now need deferring until after the
2nd pass (because, obviously, we must not emit any error if the 2nd pass
succeeds). While also related to PR gas/29524, it was requested that
move-with-sign-extend be left as broken as it always was.

PR gas/29525
Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
templates taking operands, mixed IsString/non-IsString template groups
(with memory operands) cannot occur anymore. With that
maybe_adjust_templates() becomes unnecessary (and is hence being
removed).

PR gas/29526
Note further that while the additions to the intel16 testcase aren't
really proper Intel syntax, we've been permitting all of those except
for the MOVD variant. The test therefore is to avoid re-introducing such
an inconsistency.
---
To limit code churn I'm using "goto" for the retry loop, but I'd be
happy to make this a proper loop either right here or in a follow-on
change doing just the necessary re-indentation.

The "too many memory references" errors which are being deleted weren't
fully consistent anyway - even the majority of IsString insns accepts
only a single memory operand. If we wanted to retain that, it would need
re-introducing in md_assemble(), latching the error into i.error just
like match_template() does.

As an unrelated observation: Why is "MOVQ $imm64, %reg64" being
optimized but "MOVABS $imm64, %reg64" is not?
---
v7: Re-base.
v6: Re-base over dropping of Pass2 attribute.
v5: Split off move-with-sign-extend changes.
v4: Retain support for operand-less MOVSD and CMPSD.
v3: Limit xstrdup() to just the templates where a 2nd pass actually
    makes sense (new Pass2 attribute).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -298,9 +298,6 @@ struct _i386_insn
        explicit segment overrides are given.  */
     const reg_entry *seg[2];
 
-    /* Copied first memory operand string, for re-checking.  */
-    char *memop1_string;
-
     /* PREFIX holds all the given prefix opcodes (usually null).
        PREFIXES is the number of prefix opcodes.  */
     unsigned int prefixes;
@@ -4299,7 +4296,20 @@ optimize_encoding (void)
 	   movq $imm31, %r64   -> movl $imm31, %r32
 	   movq $imm32, %r64   -> movl $imm32, %r32
         */
-      i.tm.opcode_modifier.norex64 = 1;
+      i.tm.opcode_modifier.size = SIZE32;
+      if (i.imm_operands)
+	{
+	  i.types[0].bitfield.imm32 = 1;
+	  i.types[0].bitfield.imm32s = 0;
+	  i.types[0].bitfield.imm64 = 0;
+	}
+      else
+	{
+	  i.types[0].bitfield.dword = 1;
+	  i.types[0].bitfield.qword = 0;
+	}
+      i.types[1].bitfield.dword = 1;
+      i.types[1].bitfield.qword = 0;
       if (i.tm.base_opcode == 0xb8 || (i.tm.base_opcode | 1) == 0xc7)
 	{
 	  /* Handle
@@ -4309,11 +4319,6 @@ optimize_encoding (void)
 	  i.tm.operand_types[0].bitfield.imm32 = 1;
 	  i.tm.operand_types[0].bitfield.imm32s = 0;
 	  i.tm.operand_types[0].bitfield.imm64 = 0;
-	  i.types[0].bitfield.imm32 = 1;
-	  i.types[0].bitfield.imm32s = 0;
-	  i.types[0].bitfield.imm64 = 0;
-	  i.types[1].bitfield.dword = 1;
-	  i.types[1].bitfield.qword = 0;
 	  if ((i.tm.base_opcode | 1) == 0xc7)
 	    {
 	      /* Handle
@@ -4836,6 +4841,18 @@ insert_lfence_before (void)
     }
 }
 
+/* Helper for md_assemble() to decide whether to prepare for a possible 2nd
+   parsing pass. Instead of introducing a rarely use new insn attribute this
+   utilizes a common pattern between affected templates. It is deemed
+   acceptable that this will lead to unnecessary pass 2 preparations in a
+   limited set of cases.  */
+static INLINE bool may_need_pass2 (const insn_template *t)
+{
+  return t->opcode_modifier.sse2avx
+	 /* Note that all SSE2AVX templates have at least one operand.  */
+	 && t->operand_types[t->operands - 1].bitfield.class == RegSIMD;
+}
+
 /* This is the guts of the machine-dependent assembler.  LINE points to a
    machine dependent instruction.  This function is supposed to emit
    the frags/bytes it assembles to.  */
@@ -4844,11 +4861,14 @@ void
 md_assemble (char *line)
 {
   unsigned int j;
-  char mnemonic[MAX_MNEM_SIZE], mnem_suffix;
-  const char *end;
+  char mnemonic[MAX_MNEM_SIZE], mnem_suffix, *copy = NULL;
+  const char *end, *pass1_mnem = NULL;
+  enum i386_error pass1_err = 0;
   const insn_template *t;
 
   /* Initialize globals.  */
+  current_templates = NULL;
+ retry:
   memset (&i, '\0', sizeof (i));
   i.rounding.type = rc_none;
   for (j = 0; j < MAX_OPERANDS; j++)
@@ -4863,16 +4883,26 @@ md_assemble (char *line)
 
   end = parse_insn (line, mnemonic);
   if (end == NULL)
-    return;
+    {
+      if (pass1_mnem != NULL)
+	goto match_error;
+      return;
+    }
+  if (may_need_pass2 (current_templates->start))
+    {
+      /* Make a copy of the full line in case we need to retry.  */
+      copy = xstrdup (line);
+    }
   line += end - line;
   mnem_suffix = i.suffix;
 
   line = parse_operands (line, mnemonic);
   this_operand = -1;
-  xfree (i.memop1_string);
-  i.memop1_string = NULL;
   if (line == NULL)
-    return;
+    {
+      free (copy);
+      return;
+    }
 
   /* Now we've parsed the mnemonic into a set of templates, and have the
      operands at hand.  */
@@ -4950,7 +4980,97 @@ md_assemble (char *line)
      with the template operand types.  */
 
   if (!(t = match_template (mnem_suffix)))
-    return;
+    {
+      const char *err_msg;
+
+      if (copy && !mnem_suffix)
+	{
+	  line = copy;
+	  copy = NULL;
+	  pass1_err = i.error;
+	  pass1_mnem = current_templates->start->name;
+	  goto retry;
+	}
+      free (copy);
+  match_error:
+      switch (pass1_mnem ? pass1_err : i.error)
+	{
+	default:
+	  abort ();
+	case operand_size_mismatch:
+	  err_msg = _("operand size mismatch");
+	  break;
+	case operand_type_mismatch:
+	  err_msg = _("operand type mismatch");
+	  break;
+	case register_type_mismatch:
+	  err_msg = _("register type mismatch");
+	  break;
+	case number_of_operands_mismatch:
+	  err_msg = _("number of operands mismatch");
+	  break;
+	case invalid_instruction_suffix:
+	  err_msg = _("invalid instruction suffix");
+	  break;
+	case bad_imm4:
+	  err_msg = _("constant doesn't fit in 4 bits");
+	  break;
+	case unsupported_with_intel_mnemonic:
+	  err_msg = _("unsupported with Intel mnemonic");
+	  break;
+	case unsupported_syntax:
+	  err_msg = _("unsupported syntax");
+	  break;
+	case unsupported:
+	  as_bad (_("unsupported instruction `%s'"),
+		  pass1_mnem ? pass1_mnem : current_templates->start->name);
+	  return;
+	case invalid_sib_address:
+	  err_msg = _("invalid SIB address");
+	  break;
+	case invalid_vsib_address:
+	  err_msg = _("invalid VSIB address");
+	  break;
+	case invalid_vector_register_set:
+	  err_msg = _("mask, index, and destination registers must be distinct");
+	  break;
+	case invalid_tmm_register_set:
+	  err_msg = _("all tmm registers must be distinct");
+	  break;
+	case invalid_dest_and_src_register_set:
+	  err_msg = _("destination and source registers must be distinct");
+	  break;
+	case unsupported_vector_index_register:
+	  err_msg = _("unsupported vector index register");
+	  break;
+	case unsupported_broadcast:
+	  err_msg = _("unsupported broadcast");
+	  break;
+	case broadcast_needed:
+	  err_msg = _("broadcast is needed for operand of such type");
+	  break;
+	case unsupported_masking:
+	  err_msg = _("unsupported masking");
+	  break;
+	case mask_not_on_destination:
+	  err_msg = _("mask not on destination operand");
+	  break;
+	case no_default_mask:
+	  err_msg = _("default mask isn't allowed");
+	  break;
+	case unsupported_rc_sae:
+	  err_msg = _("unsupported static rounding/sae");
+	  break;
+	case invalid_register_operand:
+	  err_msg = _("invalid register operand");
+	  break;
+	}
+      as_bad (_("%s for `%s'"), err_msg,
+	      pass1_mnem ? pass1_mnem : current_templates->start->name);
+      return;
+    }
+
+  free (copy);
 
   if (sse_check != check_none
       /* The opcode space check isn't strictly needed; it's there only to
@@ -5267,6 +5387,7 @@ parse_insn (const char *line, char *mnem
 {
   const char *l = line, *token_start = l;
   char *mnem_p;
+  bool pass1 = !current_templates;
   int supported;
   const insn_template *t;
   char *dot_p = NULL;
@@ -5436,8 +5557,10 @@ parse_insn (const char *line, char *mnem
       current_templates = (const templates *) str_hash_find (op_hash, mnemonic);
     }
 
-  if (!current_templates)
+  if (!current_templates || !pass1)
     {
+      current_templates = NULL;
+
     check_suffix:
       if (mnem_p > mnemonic)
 	{
@@ -5479,13 +5602,39 @@ parse_insn (const char *line, char *mnem
 		  current_templates
 		    = (const templates *) str_hash_find (op_hash, mnemonic);
 		}
+	      /* For compatibility reasons accept MOVSD and CMPSD without
+	         operands even in AT&T mode.  */
+	      else if (*l == END_OF_INSN
+		       || (is_space_char (*l) && l[1] == END_OF_INSN))
+		{
+		  mnem_p[-1] = '\0';
+		  current_templates
+		    = (const templates *) str_hash_find (op_hash, mnemonic);
+		  if (current_templates != NULL
+		      /* MOVS or CMPS */
+		      && (current_templates->start->base_opcode | 2) == 0xa6
+		      && current_templates->start->opcode_modifier.opcodespace
+			 == SPACE_BASE
+		      && mnem_p[-2] == 's')
+		    {
+		      as_warn (_("found `%sd'; assuming `%sl' was meant"),
+			       mnemonic, mnemonic);
+		      i.suffix = LONG_MNEM_SUFFIX;
+		    }
+		  else
+		    {
+		      current_templates = NULL;
+		      mnem_p[-1] = 'd';
+		    }
+		}
 	      break;
 	    }
 	}
 
       if (!current_templates)
 	{
-	  as_bad (_("no such instruction: `%s'"), token_start);
+	  if (pass1)
+	    as_bad (_("no such instruction: `%s'"), token_start);
 	  return NULL;
 	}
     }
@@ -6873,81 +7022,7 @@ match_template (char mnem_suffix)
   if (t == current_templates->end)
     {
       /* We found no match.  */
-      const char *err_msg;
-      switch (specific_error)
-	{
-	default:
-	  abort ();
-	case operand_size_mismatch:
-	  err_msg = _("operand size mismatch");
-	  break;
-	case operand_type_mismatch:
-	  err_msg = _("operand type mismatch");
-	  break;
-	case register_type_mismatch:
-	  err_msg = _("register type mismatch");
-	  break;
-	case number_of_operands_mismatch:
-	  err_msg = _("number of operands mismatch");
-	  break;
-	case invalid_instruction_suffix:
-	  err_msg = _("invalid instruction suffix");
-	  break;
-	case bad_imm4:
-	  err_msg = _("constant doesn't fit in 4 bits");
-	  break;
-	case unsupported_with_intel_mnemonic:
-	  err_msg = _("unsupported with Intel mnemonic");
-	  break;
-	case unsupported_syntax:
-	  err_msg = _("unsupported syntax");
-	  break;
-	case unsupported:
-	  as_bad (_("unsupported instruction `%s'"),
-		  current_templates->start->name);
-	  return NULL;
-	case invalid_sib_address:
-	  err_msg = _("invalid SIB address");
-	  break;
-	case invalid_vsib_address:
-	  err_msg = _("invalid VSIB address");
-	  break;
-	case invalid_vector_register_set:
-	  err_msg = _("mask, index, and destination registers must be distinct");
-	  break;
-	case invalid_tmm_register_set:
-	  err_msg = _("all tmm registers must be distinct");
-	  break;
-	case invalid_dest_and_src_register_set:
-	  err_msg = _("destination and source registers must be distinct");
-	  break;
-	case unsupported_vector_index_register:
-	  err_msg = _("unsupported vector index register");
-	  break;
-	case unsupported_broadcast:
-	  err_msg = _("unsupported broadcast");
-	  break;
-	case broadcast_needed:
-	  err_msg = _("broadcast is needed for operand of such type");
-	  break;
-	case unsupported_masking:
-	  err_msg = _("unsupported masking");
-	  break;
-	case mask_not_on_destination:
-	  err_msg = _("mask not on destination operand");
-	  break;
-	case no_default_mask:
-	  err_msg = _("default mask isn't allowed");
-	  break;
-	case unsupported_rc_sae:
-	  err_msg = _("unsupported static rounding/sae");
-	  break;
-	case invalid_register_operand:
-	  err_msg = _("invalid register operand");
-	  break;
-	}
-      as_bad (_("%s for `%s'"), err_msg,
-	      current_templates->start->name);
+      i.error = specific_error;
       return NULL;
     }
 
@@ -11426,49 +11501,6 @@ RC_SAE_immediate (const char *imm_start)
   return 1;
 }
 
-/* Only string instructions can have a second memory operand, so
-   reduce current_templates to just those if it contains any.  */
-static int
-maybe_adjust_templates (void)
-{
-  const insn_template *t;
-
-  gas_assert (i.mem_operands == 1);
-
-  for (t = current_templates->start; t < current_templates->end; ++t)
-    if (t->opcode_modifier.isstring)
-      break;
-
-  if (t < current_templates->end)
-    {
-      static templates aux_templates;
-      bool recheck;
-
-      aux_templates.start = t;
-      for (; t < current_templates->end; ++t)
-	if (!t->opcode_modifier.isstring)
-	  break;
-      aux_templates.end = t;
-
-      /* Determine whether to re-check the first memory operand.  */
-      recheck = (aux_templates.start != current_templates->start
-		 || t != current_templates->end);
-
-      current_templates = &aux_templates;
-
-      if (recheck)
-	{
-	  i.mem_operands = 0;
-	  if (i.memop1_string != NULL
-	      && i386_index_check (i.memop1_string) == 0)
-	    return 0;
-	  i.mem_operands = 1;
-	}
-    }
-
-  return 1;
-}
-
 static INLINE bool starts_memory_operand (char c)
 {
   return ISDIGIT (c)
@@ -11619,17 +11651,6 @@ i386_att_operand (char *operand_string)
       char *displacement_string_end;
 
     do_memory_reference:
-      if (i.mem_operands == 1 && !maybe_adjust_templates ())
-	return 0;
-      if ((i.mem_operands == 1
-	   && !current_templates->start->opcode_modifier.isstring)
-	  || i.mem_operands == 2)
-	{
-	  as_bad (_("too many memory references for `%s'"),
-		  current_templates->start->name);
-	  return 0;
-	}
-
       /* Check for base index form.  We detect the base index form by
 	 looking for an ')' at the end of the operand, searching
 	 for the '(' matching it, and finding a REGISTER_PREFIX or ','
@@ -11837,8 +11858,6 @@ i386_att_operand (char *operand_string)
       if (i386_index_check (operand_string) == 0)
 	return 0;
       i.flags[this_operand] |= Operand_Mem;
-      if (i.mem_operands == 0)
-	i.memop1_string = xstrdup (operand_string);
       i.mem_operands++;
     }
   else
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -1000,10 +1000,7 @@ i386_intel_operand (char *operand_string
 	   || intel_state.is_mem)
     {
       /* Memory operand.  */
-      if (i.mem_operands == 1 && !maybe_adjust_templates ())
-	return 0;
-      if ((int) i.mem_operands
-	  >= 2 - !current_templates->start->opcode_modifier.isstring)
+      if (i.mem_operands)
 	{
 	  /* Handle
 
@@ -1048,10 +1045,6 @@ i386_intel_operand (char *operand_string
 		    }
 		}
 	    }
-
-	  as_bad (_("too many memory references for `%s'"),
-		  current_templates->start->name);
-	  return 0;
 	}
 
       /* Swap base and index in 16-bit memory operands like
@@ -1165,8 +1158,6 @@ i386_intel_operand (char *operand_string
 	return 0;
 
       i.flags[this_operand] |= Operand_Mem;
-      if (i.mem_operands == 0)
-	i.memop1_string = xstrdup (operand_string);
       ++i.mem_operands;
     }
   else
--- a/gas/testsuite/gas/i386/code16.d
+++ b/gas/testsuite/gas/i386/code16.d
@@ -1,5 +1,6 @@
 #objdump: -drw -mi8086
 #name: i386 with .code16
+#warning_output: code16.e
 
 .*: +file format .*
 
--- /dev/null
+++ b/gas/testsuite/gas/i386/code16.e
@@ -0,0 +1,3 @@
+.*: Assembler messages:
+.*:3: Warning: .* `movsd'.*`movsl'.*
+.*:4: Warning: .* `cmpsd'.*`cmpsl'.*
--- a/gas/testsuite/gas/i386/code16.s
+++ b/gas/testsuite/gas/i386/code16.s
@@ -2,8 +2,8 @@
 	.code16
 	rep; movsd
 	rep; cmpsd
-	rep movsd %ds:(%si),%es:(%di)
-	rep cmpsd %es:(%di),%ds:(%si)
+	rep movsl %ds:(%si),%es:(%di)
+	rep cmpsl %es:(%di),%ds:(%si)
 
 	mov	%cr2, %ecx
 	mov	%ecx, %cr2
--- a/gas/testsuite/gas/i386/intel16.d
+++ b/gas/testsuite/gas/i386/intel16.d
@@ -20,4 +20,12 @@ Disassembly of section .text:
   2c:	8d 02 [ 	]*lea    \(%bp,%si\),%ax
   2e:	8d 01 [ 	]*lea    \(%bx,%di\),%ax
   30:	8d 03 [ 	]*lea    \(%bp,%di\),%ax
-	...
+[ 	]*[0-9a-f]+:	67 f7 13[ 	]+notw[ 	]+\(%ebx\)
+[ 	]*[0-9a-f]+:	66 f7 17[ 	]+notl[ 	]+\(%bx\)
+[ 	]*[0-9a-f]+:	67 0f 1f 03[ 	]+nopw[ 	]+\(%ebx\)
+[ 	]*[0-9a-f]+:	66 0f 1f 07[ 	]+nopl[ 	]+\(%bx\)
+[ 	]*[0-9a-f]+:	67 83 03 05[ 	]+addw[ 	]+\$0x5,\(%ebx\)
+[ 	]*[0-9a-f]+:	66 83 07 05[ 	]+addl[ 	]+\$0x5,\(%bx\)
+[ 	]*[0-9a-f]+:	67 c7 03 05 00[ 	]+movw[ 	]+\$0x5,\(%ebx\)
+[ 	]*[0-9a-f]+:	66 c7 07 05 00 00 00[ 	]+movl[ 	]+\$0x5,\(%bx\)
+#pass
--- a/gas/testsuite/gas/i386/intel16.s
+++ b/gas/testsuite/gas/i386/intel16.s
@@ -18,4 +18,14 @@
  lea	ax, [di][bx]
  lea	ax, [di][bp]
 
- .p2align 4,0
+ notw	[ebx]
+ notd	[bx]
+
+ nopw	[ebx]
+ nopd	[bx]
+
+ addw	[ebx], 5
+ addd	[bx], 5
+
+ movw	[ebx], 5
+ movd	[bx], 5
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -145,34 +145,27 @@
 mov, 0xa0, None, CpuNo64, D|W|CheckRegSize|No_sSuf|No_qSuf, { Disp16|Disp32|Unspecified|Byte|Word|Dword, Acc|Byte|Word|Dword }
 mov, 0xa0, None, Cpu64, D|W|CheckRegSize|No_sSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
 movabs, 0xa0, None, Cpu64, D|W|CheckRegSize|No_sSuf, { Disp64|Unspecified|Byte|Word|Dword|Qword, Acc|Byte|Word|Dword|Qword }
-movq, 0xa1, None, Cpu64, D|Size64|NoSuf, { Disp64|Unspecified|Qword, Acc|Qword }
 mov, 0x88, None, 0, D|W|CheckRegSize|Modrm|No_sSuf|HLEPrefixRelease, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-movq, 0x89, None, Cpu64, D|Modrm|Size64|NoSuf|HLEPrefixRelease, { Reg64, Reg64|Unspecified|Qword|BaseIndex }
 // In the 64bit mode the short form mov immediate is redefined to have
 // 64bit value.
 mov, 0xb0, None, 0, W|No_sSuf|No_qSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32 }
 mov, 0xc6, 0, 0, W|Modrm|No_sSuf|HLEPrefixRelease|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-movq, 0xc7, 0, Cpu64, Modrm|Size64|NoSuf|HLEPrefixRelease|Optimize, { Imm32S, Reg64|Qword|Unspecified|BaseIndex }
 mov, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|Optimize, { Imm64, Reg64 }
 movabs, 0xb8, None, Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Imm64, Reg64 }
-movq, 0xb8, None, Cpu64, Size64|NoSuf|Optimize, { Imm64, Reg64 }
 // The segment register moves accept WordReg so that a segment register
 // can be copied to a 32 bit register, and vice versa, without using a
 // size prefix.  When moving to a 32 bit register, the upper 16 bits
 // are set to an implementation defined value (on the Pentium Pro, the
 // implementation defined value is zero).
-mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|No_qSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
+mov, 0x8c, None, 0, RegMem|No_bSuf|No_sSuf|NoRex64, { SReg, Reg16|Reg32|Reg64 }
 mov, 0x8c, None, 0, D|Modrm|IgnoreSize|No_bSuf|No_lSuf|No_sSuf|No_qSuf, { SReg, Word|Unspecified|BaseIndex }
-movq, 0x8c, None, Cpu64, D|RegMem|NoSuf|NoRex64, { SReg, Reg64 }
-mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|No_qSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
+mov, 0x8e, None, 0, Modrm|IgnoreSize|No_bSuf|No_sSuf|NoRex64, { Reg16|Reg32|Reg64, SReg }
 // Move to/from control debug registers.  In the 16 or 32bit modes
 // they are 32bit.  In the 64bit mode they are 64bit.
 mov, 0xf20, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Control, Reg32 }
 mov, 0xf20, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Control, Reg64 }
-movq, 0xf20, None, Cpu64, D|RegMem|Size64|NoSuf|NoRex64, { Control, Reg64 }
 mov, 0xf21, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Debug, Reg32 }
 mov, 0xf21, None, Cpu64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug, Reg64 }
-movq, 0xf21, None, Cpu64, D|RegMem|Size64|NoSuf|NoRex64, { Debug, Reg64 }
 mov, 0xf24, None, Cpu386|CpuNo64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test, Reg32 }
 
 // Move after swapping the bytes
@@ -499,9 +492,6 @@ set<cc>, 0xf9<cc:opc>, 0, Cpu386, Modrm|
 // String manipulation.
 cmps, 0xa6, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
 cmps, 0xa6, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-// Intel mode string compare.
-cmpsd, 0xa7, None, Cpu386, Size32|NoSuf|IsString|RepPrefixOk, {}
-cmpsd, 0xa7, None, Cpu386, Size32|NoSuf|IsStringEsOp0|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
 scmp, 0xa6, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
 scmp, 0xa6, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|IsString|RepPrefixOk, {}
@@ -516,9 +506,6 @@ slod, 0xac, None, 0, W|No_sSuf|IsString|
 slod, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
 movs, 0xa4, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
 movs, 0xa4, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-// Intel mode string move.
-movsd, 0xa5, None, Cpu386, Size32|NoSuf|IsString|RepPrefixOk, {}
-movsd, 0xa5, None, Cpu386, Size32|NoSuf|IsStringEsOp1|RepPrefixOk, { Dword|Unspecified|BaseIndex, Dword|Unspecified|BaseIndex }
 smov, 0xa4, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
 smov, 0xa4, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 scas, 0xae, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
@@ -997,6 +984,7 @@ movd, 0x666e, None, CpuAVX, D|Modrm|Vex1
 movd, 0x666e, None, CpuAVX|Cpu64, D|Modrm|Vex=1|Space0F|VexW1|NoSuf|Size64|SSE2AVX, { Reg64|BaseIndex, RegXMM }
 movd, 0x660f6e, None, CpuSSE2, D|Modrm|IgnoreSize|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
 movd, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|NoSuf|Size64, { Reg64|BaseIndex, RegXMM }
+// The MMX templates have to remain after at least the SSE2AVX ones.
 movd, 0xf6e, None, CpuMMX, D|Modrm|IgnoreSize|NoSuf, { Reg32|Unspecified|BaseIndex, RegMMX }
 movd, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|NoSuf|Size64, { Reg64|BaseIndex, RegMMX }
 movq, 0xf37e, None, CpuAVX, Load|Modrm|Vex=1|Space0F|VexWIG|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
@@ -1005,6 +993,7 @@ movq, 0x666e, None, CpuAVX|Cpu64, D|Modr
 movq, 0xf30f7e, None, CpuSSE2, Load|Modrm|NoSuf, { Unspecified|Qword|BaseIndex|RegXMM, RegXMM }
 movq, 0x660fd6, None, CpuSSE2, Modrm|NoSuf, { RegXMM, Unspecified|Qword|BaseIndex|RegXMM }
 movq, 0x660f6e, None, CpuSSE2|Cpu64, D|Modrm|NoSuf|Size64, { Reg64|Unspecified|BaseIndex, RegXMM }
+// The MMX templates have to remain after at least the SSE2AVX ones.
 movq, 0xf6f, None, CpuMMX, D|Modrm|NoSuf, { Unspecified|Qword|BaseIndex|RegMMX, RegMMX }
 movq, 0xf6e, None, CpuMMX|Cpu64, D|Modrm|NoSuf|Size64, { Reg64|Unspecified|BaseIndex, RegMMX }
 packssdw<mmx>, 0x<mmx:pfx>0f6b, None, <mmx:cpu>, Modrm|<mmx:attr>|NoSuf, { <mmx:reg>|<mmx:mem>|Unspecified|BaseIndex, <mmx:reg> }


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 3/7] ix86: don't recognize/derive Q suffix in the common case
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
  2022-12-02 10:18 ` [PATCH v7 1/7] x86: constify parse_insn()'s input Jan Beulich
  2022-12-02 10:18 ` [PATCH v7 2/7] x86: re-work insn/suffix recognition Jan Beulich
@ 2022-12-02 10:19 ` Jan Beulich
  2022-12-02 10:20 ` [PATCH v7 4/7] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:19 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Have its use, except where actually legitimate, result in the same "only
supported in 64-bit mode" diagnostic as emitted for other 64-bit only
insns. Also suppress deriving of the suffix in Intel mode except in the
legitimate cases. This in exchange allows dropping the respective code
from match_template().

To maintain reasonable diagnostics (in particular to avoid "`mov' is
only supported in 64-bit mode" on the SIMD forms of MOVQ) we need to
defer parse_insn()'s emitting of errors unrelated to prefix parsing.
Utilize i.error just like match_template() does.

Oddly enough despite gcc's preference towards FILDQ and FIST{,T}Q we
had no testcase whatsoever for these. Therefore such tests are being
added. Note that the removed line in the x86-64-lfence-load testcase
was redundant with the exact same one a few lines up.
---
With gcc's preference towards FILDQ / FIST{,T}Q I wonder whether the
disassembler wouldn't better emit a Q suffix instead of the LL one.
---
v6: Re-base over dropping of Pass2 attribute.
v5: Re-base over re-arrangment of patch series, requiring pieces to be
    moved here.
v3: Re-base over changes to earlier patches.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -237,6 +237,8 @@ enum i386_error
     unsupported_with_intel_mnemonic,
     unsupported_syntax,
     unsupported,
+    unsupported_on_arch,
+    unsupported_64bit,
     invalid_sib_address,
     invalid_vsib_address,
     invalid_vector_register_set,
@@ -4861,7 +4863,7 @@ void
 md_assemble (char *line)
 {
   unsigned int j;
-  char mnemonic[MAX_MNEM_SIZE], mnem_suffix, *copy = NULL;
+  char mnemonic[MAX_MNEM_SIZE], mnem_suffix = 0, *copy = NULL;
   const char *end, *pass1_mnem = NULL;
   enum i386_error pass1_err = 0;
   const insn_template *t;
@@ -4886,6 +4888,16 @@ md_assemble (char *line)
     {
       if (pass1_mnem != NULL)
 	goto match_error;
+      if (i.error != no_error)
+	{
+	  gas_assert (current_templates != NULL);
+	  if (may_need_pass2 (current_templates->start) && !i.suffix)
+	    goto no_match;
+	  /* No point in trying a 2nd pass - it'll only find the same suffix
+	     again.  */
+	  mnem_suffix = i.suffix;
+	  goto match_error;
+	}
       return;
     }
   if (may_need_pass2 (current_templates->start))
@@ -4987,12 +4999,21 @@ md_assemble (char *line)
 	{
 	  line = copy;
 	  copy = NULL;
+  no_match:
 	  pass1_err = i.error;
 	  pass1_mnem = current_templates->start->name;
 	  goto retry;
 	}
-      free (copy);
+
+      /* If a non-/only-64bit template (group) was found in pass 1, and if
+	 _some_ template (group) was found in pass 2, squash pass 1's
+	 error.  */
+      if (pass1_err == unsupported_64bit)
+	pass1_mnem = NULL;
+
   match_error:
+      free (copy);
+
       switch (pass1_mnem ? pass1_err : i.error)
 	{
 	default:
@@ -5025,6 +5046,23 @@ md_assemble (char *line)
 	  as_bad (_("unsupported instruction `%s'"),
 		  pass1_mnem ? pass1_mnem : current_templates->start->name);
 	  return;
+	case unsupported_on_arch:
+	  as_bad (_("`%s' is not supported on `%s%s'"),
+		  pass1_mnem ? pass1_mnem : current_templates->start->name,
+		  cpu_arch_name ? cpu_arch_name : default_arch,
+		  cpu_sub_arch_name ? cpu_sub_arch_name : "");
+	  return;
+	case unsupported_64bit:
+	  if (ISLOWER (mnem_suffix))
+	    as_bad (_("`%s%c' is %s supported in 64-bit mode"),
+		    pass1_mnem ? pass1_mnem : current_templates->start->name,
+		    mnem_suffix,
+		    flag_code == CODE_64BIT ? _("not") : _("only"));
+	  else
+	    as_bad (_("`%s' is %s supported in 64-bit mode"),
+		    pass1_mnem ? pass1_mnem : current_templates->start->name,
+		    flag_code == CODE_64BIT ? _("not") : _("only"));
+	  return;
 	case invalid_sib_address:
 	  err_msg = _("invalid SIB address");
 	  break;
@@ -5382,6 +5420,23 @@ md_assemble (char *line)
     last_insn.kind = last_insn_other;
 }
 
+/* The Q suffix is generally valid only in 64-bit mode, with very few
+   exceptions: fild, fistp, fisttp, and cmpxchg8b.  Note that for fild
+   and fisttp only one of their two templates is matched below: That's
+   sufficient since other relevant attributes are the same between both
+   respective templates.  */
+static INLINE bool q_suffix_allowed(const insn_template *t)
+{
+  return flag_code == CODE_64BIT
+	 || (t->opcode_modifier.opcodespace == SPACE_BASE
+	     && t->base_opcode == 0xdf
+	     && (t->extension_opcode & 1)) /* fild / fistp / fisttp */
+	 || (t->opcode_modifier.opcodespace == SPACE_0F
+	     && t->base_opcode == 0xc7
+	     && t->opcode_modifier.opcodeprefix == PREFIX_NONE
+	     && t->extension_opcode == 1) /* cmpxchg8b */;
+}
+
 static const char *
 parse_insn (const char *line, char *mnemonic)
 {
@@ -5677,20 +5732,21 @@ parse_insn (const char *line, char *mnem
   for (t = current_templates->start; t < current_templates->end; ++t)
     {
       supported |= cpu_flags_match (t);
+
+      if (i.suffix == QWORD_MNEM_SUFFIX && !q_suffix_allowed (t))
+	supported &= ~CPU_FLAGS_64BIT_MATCH;
+
       if (supported == CPU_FLAGS_PERFECT_MATCH)
 	return l;
     }
 
-  if (!(supported & CPU_FLAGS_64BIT_MATCH))
-    as_bad (flag_code == CODE_64BIT
-	    ? _("`%s' is not supported in 64-bit mode")
-	    : _("`%s' is only supported in 64-bit mode"),
-	    current_templates->start->name);
-  else
-    as_bad (_("`%s' is not supported on `%s%s'"),
-	    current_templates->start->name,
-	    cpu_arch_name ? cpu_arch_name : default_arch,
-	    cpu_sub_arch_name ? cpu_sub_arch_name : "");
+  if (pass1)
+    {
+      if (supported & CPU_FLAGS_64BIT_MATCH)
+        i.error = unsupported_on_arch;
+      else
+        i.error = unsupported_64bit;
+    }
 
   return NULL;
 }
@@ -6687,20 +6743,12 @@ match_template (char mnem_suffix)
       for (j = 0; j < MAX_OPERANDS; j++)
 	operand_types[j] = t->operand_types[j];
 
-      /* In general, don't allow
-	 - 64-bit operands outside of 64-bit mode,
-	 - 32-bit operands on pre-386.  */
+      /* In general, don't allow 32-bit operands on pre-386.  */
       specific_error = progress (mnem_suffix ? invalid_instruction_suffix
 					     : operand_size_mismatch);
       j = i.imm_operands + (t->operands > i.imm_operands + 1);
-      if (((i.suffix == QWORD_MNEM_SUFFIX
-	    && flag_code != CODE_64BIT
-	    && !(t->opcode_modifier.opcodespace == SPACE_0F
-		 && t->base_opcode == 0xc7
-		 && t->opcode_modifier.opcodeprefix == PREFIX_NONE
-		 && t->extension_opcode == 1) /* cmpxchg8b */)
-	   || (i.suffix == LONG_MNEM_SUFFIX
-	       && !cpu_arch_flags.bitfield.cpui386))
+      if (i.suffix == LONG_MNEM_SUFFIX
+	  && !cpu_arch_flags.bitfield.cpui386
 	  && (intel_syntax
 	      ? (t->opcode_modifier.mnemonicsize != IGNORESIZE
 		 && !intel_float_operand (t->name))
--- a/gas/config/tc-i386-intel.c
+++ b/gas/config/tc-i386-intel.c
@@ -828,7 +828,7 @@ i386_intel_operand (char *operand_string
 		    continue;
 		  break;
 		case QWORD_MNEM_SUFFIX:
-		  if (t->opcode_modifier.no_qsuf)
+		  if (t->opcode_modifier.no_qsuf || !q_suffix_allowed (t))
 		    continue;
 		  break;
 		case SHORT_MNEM_SUFFIX:
--- a/gas/testsuite/gas/i386/opcode.d
+++ b/gas/testsuite/gas/i386/opcode.d
@@ -592,6 +592,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%dx
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%dx
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
  +[a-f0-9]+:	82 c3 01             	add    \$0x1,%bl
  +[a-f0-9]+:	82 f3 01             	xor    \$0x1,%bl
  +[a-f0-9]+:	82 d3 01             	adc    \$0x1,%bl
--- a/gas/testsuite/gas/i386/opcode.s
+++ b/gas/testsuite/gas/i386/opcode.s
@@ -592,6 +592,11 @@ foo:
  cmovpe  0x90909090(%eax),%dx
  cmovpo 0x90909090(%eax),%dx
 
+ fildq  (%eax)
+ fildll (%eax)
+ fistpq (%eax)
+ fistpll (%eax)
+
 	.byte 0x82, 0xc3, 0x01
 	.byte 0x82, 0xf3, 0x01
 	.byte 0x82, 0xd3, 0x01
--- a/gas/testsuite/gas/i386/opcode-intel.d
+++ b/gas/testsuite/gas/i386/opcode-intel.d
@@ -593,6 +593,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnp edx,DWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovp  dx,WORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnp dx,WORD PTR \[eax-0x6f6f6f70\]
+[ 	]*[a-f0-9]+:	df 28                	fild   QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 28                	fild   QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 38                	fistp  QWORD PTR \[eax\]
+[ 	]*[a-f0-9]+:	df 38                	fistp  QWORD PTR \[eax\]
  +[a-f0-9]+:	82 c3 01             	add    bl,0x1
  +[a-f0-9]+:	82 f3 01             	xor    bl,0x1
  +[a-f0-9]+:	82 d3 01             	adc    bl,0x1
--- a/gas/testsuite/gas/i386/opcode-suffix.d
+++ b/gas/testsuite/gas/i386/opcode-suffix.d
@@ -593,6 +593,10 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	0f 4b 90 90 90 90 90 	cmovnpl -0x6f6f6f70\(%eax\),%edx
 [ 	]*[a-f0-9]+:	66 0f 4a 90 90 90 90 90 	cmovpw -0x6f6f6f70\(%eax\),%dx
 [ 	]*[a-f0-9]+:	66 0f 4b 90 90 90 90 90 	cmovnpw -0x6f6f6f70\(%eax\),%dx
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 28                	fildll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
+[ 	]*[a-f0-9]+:	df 38                	fistpll \(%eax\)
  +[a-f0-9]+:	82 c3 01             	addb   \$0x1,%bl
  +[a-f0-9]+:	82 f3 01             	xorb   \$0x1,%bl
  +[a-f0-9]+:	82 d3 01             	adcb   \$0x1,%bl
--- a/gas/testsuite/gas/i386/sse3.d
+++ b/gas/testsuite/gas/i386/sse3.d
@@ -13,29 +13,30 @@ Disassembly of section .text:
   10:	df 88 90 90 90 90 [ 	]*fisttps -0x6f6f6f70\(%eax\)
   16:	db 88 90 90 90 90 [ 	]*fisttpl -0x6f6f6f70\(%eax\)
   1c:	dd 88 90 90 90 90 [ 	]*fisttpll -0x6f6f6f70\(%eax\)
-  22:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%ebp\),%xmm4
-  27:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
-  2b:	f2 0f 7c 37 [ 	]*haddps \(%edi\),%xmm6
-  2f:	f2 0f 7c f8 [ 	]*haddps %xmm0,%xmm7
-  33:	66 0f 7d c1 [ 	]*hsubpd %xmm1,%xmm0
-  37:	66 0f 7d 0a [ 	]*hsubpd \(%edx\),%xmm1
-  3b:	f2 0f 7d d2 [ 	]*hsubps %xmm2,%xmm2
-  3f:	f2 0f 7d 1c 24 [ 	]*hsubps \(%esp\),%xmm3
-  44:	f2 0f f0 2e [ 	]*lddqu  \(%esi\),%xmm5
-  48:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
-  4b:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
-  4e:	f2 0f 12 f7 [ 	]*movddup %xmm7,%xmm6
-  52:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
-  56:	f3 0f 16 01 [ 	]*movshdup \(%ecx\),%xmm0
-  5a:	f3 0f 16 ca [ 	]*movshdup %xmm2,%xmm1
-  5e:	f3 0f 12 13 [ 	]*movsldup \(%ebx\),%xmm2
-  62:	f3 0f 12 dc [ 	]*movsldup %xmm4,%xmm3
-  66:	0f 01 c9 [ 	]*mwait  %eax,%ecx
-  69:	0f 01 c9 [ 	]*mwait  %eax,%ecx
-  6c:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
-  70:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
-  74:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
-  78:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	dd 88 90 90 90 90 [ 	]*fisttpll -0x6f6f6f70\(%eax\)
+[ 	]*[0-9a-f]+:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%ebp\),%xmm4
+[ 	]*[0-9a-f]+:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
+[ 	]*[0-9a-f]+:	f2 0f 7c 37 [ 	]*haddps \(%edi\),%xmm6
+[ 	]*[0-9a-f]+:	f2 0f 7c f8 [ 	]*haddps %xmm0,%xmm7
+[ 	]*[0-9a-f]+:	66 0f 7d c1 [ 	]*hsubpd %xmm1,%xmm0
+[ 	]*[0-9a-f]+:	66 0f 7d 0a [ 	]*hsubpd \(%edx\),%xmm1
+[ 	]*[0-9a-f]+:	f2 0f 7d d2 [ 	]*hsubps %xmm2,%xmm2
+[ 	]*[0-9a-f]+:	f2 0f 7d 1c 24 [ 	]*hsubps \(%esp\),%xmm3
+[ 	]*[0-9a-f]+:	f2 0f f0 2e [ 	]*lddqu  \(%esi\),%xmm5
+[ 	]*[0-9a-f]+:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
+[ 	]*[0-9a-f]+:	0f 01 c8 [ 	]*monitor %eax,%ecx,%edx
+[ 	]*[0-9a-f]+:	f2 0f 12 f7 [ 	]*movddup %xmm7,%xmm6
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	f3 0f 16 01 [ 	]*movshdup \(%ecx\),%xmm0
+[ 	]*[0-9a-f]+:	f3 0f 16 ca [ 	]*movshdup %xmm2,%xmm1
+[ 	]*[0-9a-f]+:	f3 0f 12 13 [ 	]*movsldup \(%ebx\),%xmm2
+[ 	]*[0-9a-f]+:	f3 0f 12 dc [ 	]*movsldup %xmm4,%xmm3
+[ 	]*[0-9a-f]+:	0f 01 c9 [ 	]*mwait  %eax,%ecx
+[ 	]*[0-9a-f]+:	0f 01 c9 [ 	]*mwait  %eax,%ecx
+[ 	]*[0-9a-f]+:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
+[ 	]*[0-9a-f]+:	67 0f 01 c8 [ 	]*monitor %ax,%ecx,%edx
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
+[ 	]*[0-9a-f]+:	f2 0f 12 38 [ 	]*movddup \(%eax\),%xmm7
 [ 	]*[0-9a-f]+:	0f 01 c8[ 	]+monitor %eax,%ecx,%edx
 [ 	]*[0-9a-f]+:	67 0f 01 c8[ 	]+monitor %ax,%ecx,%edx
 [ 	]*[0-9a-f]+:	0f 01 c9[ 	]+mwait  %eax,%ecx
--- a/gas/testsuite/gas/i386/sse3.s
+++ b/gas/testsuite/gas/i386/sse3.s
@@ -8,6 +8,7 @@ foo:
 	addsubps	%xmm4,%xmm3
 	fisttps		0x90909090(%eax)
 	fisttpl		0x90909090(%eax)
+	fisttpq		0x90909090(%eax)
 	fisttpll	0x90909090(%eax)
 	haddpd		0x0(%ebp),%xmm4
 	haddpd		%xmm6,%xmm5
--- a/gas/testsuite/gas/i386/sse3-intel.d
+++ b/gas/testsuite/gas/i386/sse3-intel.d
@@ -14,6 +14,7 @@ Disassembly of section .text:
 [ 	]*[0-9a-f]+:	df 88 90 90 90 90[ 	]+fisttp WORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	db 88 90 90 90 90[ 	]+fisttp DWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	dd 88 90 90 90 90[ 	]+fisttp QWORD PTR \[eax-0x6f6f6f70\]
+[ 	]*[0-9a-f]+:	dd 88 90 90 90 90[ 	]+fisttp QWORD PTR \[eax-0x6f6f6f70\]
 [ 	]*[0-9a-f]+:	66 0f 7c 65 00[ 	]+haddpd xmm4,(XMMWORD PTR )?\[ebp(\+0x0)\]
 [ 	]*[0-9a-f]+:	66 0f 7c ee[ 	]+haddpd xmm5,xmm6
 [ 	]*[0-9a-f]+:	f2 0f 7c 37[ 	]+haddps xmm6,(XMMWORD PTR )?\[edi\]
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.d
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.d
@@ -46,16 +46,21 @@ Disassembly of section .text:
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	db 55 00             	fistl  0x0\(%rbp\)
  +[a-f0-9]+:	df 55 00             	fists  0x0\(%rbp\)
+ +[a-f0-9]+:	db 5d 00             	fistpl 0x0\(%rbp\)
+ +[a-f0-9]+:	df 5d 00             	fistps 0x0\(%rbp\)
+ +[a-f0-9]+:	df 7d 00             	fistpll 0x0\(%rbp\)
  +[a-f0-9]+:	db 45 00             	fildl  0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	df 45 00             	filds  0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
+ +[a-f0-9]+:	df 6d 00             	fildll 0x0\(%rbp\)
+ +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	9b dd 75 00          	fsave  0x0\(%rbp\)
  +[a-f0-9]+:	dd 65 00             	frstor 0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
- +[a-f0-9]+:	df 45 00             	filds  0x0\(%rbp\)
- +[a-f0-9]+:	0f ae e8             	lfence
+ +[a-f0-9]+:	db 4d 00             	fisttpl 0x0\(%rbp\)
  +[a-f0-9]+:	df 4d 00             	fisttps 0x0\(%rbp\)
+ +[a-f0-9]+:	dd 4d 00             	fisttpll 0x0\(%rbp\)
  +[a-f0-9]+:	d9 65 00             	fldenv 0x0\(%rbp\)
  +[a-f0-9]+:	0f ae e8             	lfence
  +[a-f0-9]+:	9b d9 75 00          	fstenv 0x0\(%rbp\)
--- a/gas/testsuite/gas/i386/x86-64-lfence-load.s
+++ b/gas/testsuite/gas/i386/x86-64-lfence-load.s
@@ -29,12 +29,17 @@ _start:
 	flds (%rbp)
 	fistl (%rbp)
 	fists (%rbp)
+	fistpl (%rbp)
+	fistps (%rbp)
+	fistpq (%rbp)
 	fildl (%rbp)
 	filds (%rbp)
+	fildq (%rbp)
 	fsave (%rbp)
 	frstor (%rbp)
-	filds (%rbp)
+	fisttpl (%rbp)
 	fisttps (%rbp)
+	fisttpq (%rbp)
 	fldenv (%rbp)
 	fstenv (%rbp)
 	fadds  (%rbp)
--- a/gas/testsuite/gas/i386/x86-64-sse3.d
+++ b/gas/testsuite/gas/i386/x86-64-sse3.d
@@ -13,6 +13,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	df 88 90 90 90 00 [ 	]*fisttps 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	db 88 90 90 90 00 [ 	]*fisttpl 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	dd 88 90 90 90 00 [ 	]*fisttpll 0x909090\(%rax\)
+[ 	]*[a-f0-9]+:	dd 88 90 90 90 00 [ 	]*fisttpll 0x909090\(%rax\)
 [ 	]*[a-f0-9]+:	66 0f 7c 65 00 [ 	]*haddpd 0x0\(%rbp\),%xmm4
 [ 	]*[a-f0-9]+:	66 0f 7c ee [ 	]*haddpd %xmm6,%xmm5
 [ 	]*[a-f0-9]+:	f2 0f 7c 37 [ 	]*haddps \(%rdi\),%xmm6
--- a/gas/testsuite/gas/i386/x86-64-sse3.s
+++ b/gas/testsuite/gas/i386/x86-64-sse3.s
@@ -8,6 +8,7 @@ foo:
 	addsubps	%xmm4,%xmm3
 	fisttps		0x909090(%rax)
 	fisttpl		0x909090(%rax)
+	fisttpq		0x909090(%rax)
 	fisttpll	0x909090(%rax)
 	haddpd		0x0(%rbp),%xmm4
 	haddpd		%xmm6,%xmm5
--- a/gas/testsuite/gas/i386/x86-64-sse3-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-sse3-intel.d
@@ -14,6 +14,7 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	df 88 90 90 90 00[ 	]+fisttp WORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	db 88 90 90 90 00[ 	]+fisttp DWORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	dd 88 90 90 90 00[ 	]+fisttp QWORD PTR \[rax\+0x909090\]
+[ 	]*[a-f0-9]+:	dd 88 90 90 90 00[ 	]+fisttp QWORD PTR \[rax\+0x909090\]
 [ 	]*[a-f0-9]+:	66 0f 7c 65 00[ 	]+haddpd xmm4,(XMMWORD PTR )?\[rbp(\+0x0)\]
 [ 	]*[a-f0-9]+:	66 0f 7c ee[ 	]+haddpd xmm5,xmm6
 [ 	]*[a-f0-9]+:	f2 0f 7c 37[ 	]+haddps xmm6,(XMMWORD PTR )?\[rdi\]


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 4/7] x86-64: allow HLE store of accumulator to absolute 32-bit address
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
                   ` (2 preceding siblings ...)
  2022-12-02 10:19 ` [PATCH v7 3/7] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
@ 2022-12-02 10:20 ` Jan Beulich
  2022-12-02 10:20 ` [PATCH v7 5/7] x86: move bad-use-of-TLS-reloc check Jan Beulich
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:20 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

In commit 1212781b35c9 ("ix86: allow HLE store of accumulator to
absolute address") I was wrong to exclude 64-bit code. Dropping the
check also leads to better diagnostics in 64-bit code ("MOV", after
all, isn't invalid with "XRELEASE").

While there also limit the amount of further checks done: The operand
type checks that were there were effectively redundant with other ones
anyway, plus it's quite fine to also have "xrelease mov <disp>, %eax"
look for the next MOV template (in fact again also improving
diagnostics).
---
v2: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -6870,12 +6870,9 @@ match_template (char mnem_suffix)
 	    continue;
 	  /* xrelease mov %eax, <disp> is another special case. It must not
 	     match the accumulator-only encoding of mov.  */
-	  if (flag_code != CODE_64BIT
-	      && i.hle_prefix
+	  if (i.hle_prefix
 	      && t->base_opcode == 0xa0
-	      && t->opcode_modifier.opcodespace == SPACE_BASE
-	      && i.types[0].bitfield.instance == Accum
-	      && (i.flags[1] & Operand_Mem))
+	      && t->opcode_modifier.opcodespace == SPACE_BASE)
 	    continue;
 	  /* Fall through.  */
 
--- a/gas/testsuite/gas/i386/x86-64-hle.d
+++ b/gas/testsuite/gas/i386/x86-64-hle.d
@@ -424,6 +424,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 20 01          	lock xacquire and %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 20 01          	lock xrelease and %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 88 01             	xrelease mov %al,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 88 04 25 78 56 34 12 	xrelease mov %al,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 88 04 25 21 43 65 87 	xrelease mov %al,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or %al,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 08 01          	xrelease lock or %al,\(%rcx\)
@@ -475,6 +477,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 66 21 01       	lock xacquire and %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 66 21 01       	lock xrelease and %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f3 89 01          	xrelease mov %ax,\(%rcx\)
+[ 	]*[a-f0-9]+:	66 f3 89 04 25 78 56 34 12 	xrelease mov %ax,0x12345678
+[ 	]*[a-f0-9]+:	67 66 f3 89 04 25 21 43 65 87 	xrelease mov %ax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or %ax,\(%rcx\)
 [ 	]*[a-f0-9]+:	66 f3 f0 09 01       	xrelease lock or %ax,\(%rcx\)
@@ -526,6 +530,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 21 01          	lock xacquire and %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 21 01          	lock xrelease and %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 89 01             	xrelease mov %eax,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 89 04 25 78 56 34 12 	xrelease mov %eax,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 89 04 25 21 43 65 87 	xrelease mov %eax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or %eax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 09 01          	xrelease lock or %eax,\(%rcx\)
@@ -577,6 +583,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 48 21 01       	lock xacquire and %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f0 f3 48 21 01       	lock xrelease and %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 48 89 01          	xrelease mov %rax,\(%rcx\)
+[ 	]*[a-f0-9]+:	f3 48 89 04 25 78 56 34 12 	xrelease mov %rax,0x12345678
+[ 	]*[a-f0-9]+:	67 f3 48 89 04 25 21 43 65 87 	xrelease mov %rax,0x87654321\(,%eiz,1\)
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or %rax,\(%rcx\)
 [ 	]*[a-f0-9]+:	f3 f0 48 09 01       	xrelease lock or %rax,\(%rcx\)
--- a/gas/testsuite/gas/i386/x86-64-hle.s
+++ b/gas/testsuite/gas/i386/x86-64-hle.s
@@ -442,6 +442,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andb %al,(%rcx)
 	.byte 0xf0; .byte 0xf3; andb %al,(%rcx)
 	xrelease movb %al,(%rcx)
+	xrelease movb %al,0x12345678
+	xrelease addr32 movb %al,0x87654321
 	xacquire lock orb %al,(%rcx)
 	lock xacquire orb %al,(%rcx)
 	xrelease lock orb %al,(%rcx)
@@ -496,6 +498,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andw %ax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andw %ax,(%rcx)
 	xrelease movw %ax,(%rcx)
+	xrelease movw %ax,0x12345678
+	xrelease addr32 movw %ax,0x87654321
 	xacquire lock orw %ax,(%rcx)
 	lock xacquire orw %ax,(%rcx)
 	xrelease lock orw %ax,(%rcx)
@@ -550,6 +554,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andl %eax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andl %eax,(%rcx)
 	xrelease movl %eax,(%rcx)
+	xrelease movl %eax,0x12345678
+	xrelease addr32 movl %eax,0x87654321
 	xacquire lock orl %eax,(%rcx)
 	lock xacquire orl %eax,(%rcx)
 	xrelease lock orl %eax,(%rcx)
@@ -604,6 +610,8 @@ _start:
 	.byte 0xf0; .byte 0xf2; andq %rax,(%rcx)
 	.byte 0xf0; .byte 0xf3; andq %rax,(%rcx)
 	xrelease movq %rax,(%rcx)
+	xrelease movq %rax,0x12345678
+	xrelease addr32 movq %rax,0x87654321
 	xacquire lock orq %rax,(%rcx)
 	lock xacquire orq %rax,(%rcx)
 	xrelease lock orq %rax,(%rcx)
--- a/gas/testsuite/gas/i386/x86-64-hle-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-hle-intel.d
@@ -425,6 +425,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 20 01          	lock xacquire and BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f0 f3 20 01          	lock xrelease and BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f3 88 01             	xrelease mov BYTE PTR \[rcx\],al
+[ 	]*[a-f0-9]+:	f3 88 04 25 78 56 34 12 	xrelease mov BYTE PTR (ds:)?0x12345678,al
+[ 	]*[a-f0-9]+:	67 f3 88 04 25 21 43 65 87 	xrelease mov BYTE PTR \[eiz\*1\+0x87654321\],al
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f2 f0 08 01          	xacquire lock or BYTE PTR \[rcx\],al
 [ 	]*[a-f0-9]+:	f3 f0 08 01          	xrelease lock or BYTE PTR \[rcx\],al
@@ -476,6 +478,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 66 21 01       	lock xacquire and WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	f0 f3 66 21 01       	lock xrelease and WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f3 89 01          	xrelease mov WORD PTR \[rcx\],ax
+[ 	]*[a-f0-9]+:	66 f3 89 04 25 78 56 34 12 	xrelease mov WORD PTR (ds:)?0x12345678,ax
+[ 	]*[a-f0-9]+:	67 66 f3 89 04 25 21 43 65 87 	xrelease mov WORD PTR \[eiz\*1\+0x87654321\],ax
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f2 f0 09 01       	xacquire lock or WORD PTR \[rcx\],ax
 [ 	]*[a-f0-9]+:	66 f3 f0 09 01       	xrelease lock or WORD PTR \[rcx\],ax
@@ -527,6 +531,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 21 01          	lock xacquire and DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f0 f3 21 01          	lock xrelease and DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f3 89 01             	xrelease mov DWORD PTR \[rcx\],eax
+[ 	]*[a-f0-9]+:	f3 89 04 25 78 56 34 12 	xrelease mov DWORD PTR (ds:)?0x12345678,eax
+[ 	]*[a-f0-9]+:	67 f3 89 04 25 21 43 65 87 	xrelease mov DWORD PTR \[eiz\*1\+0x87654321\],eax
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f2 f0 09 01          	xacquire lock or DWORD PTR \[rcx\],eax
 [ 	]*[a-f0-9]+:	f3 f0 09 01          	xrelease lock or DWORD PTR \[rcx\],eax
@@ -578,6 +584,8 @@ Disassembly of section .text:
 [ 	]*[a-f0-9]+:	f0 f2 48 21 01       	lock xacquire and QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f0 f3 48 21 01       	lock xrelease and QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f3 48 89 01          	xrelease mov QWORD PTR \[rcx\],rax
+[ 	]*[a-f0-9]+:	f3 48 89 04 25 78 56 34 12 	xrelease mov QWORD PTR (ds:)?0x12345678,rax
+[ 	]*[a-f0-9]+:	67 f3 48 89 04 25 21 43 65 87 	xrelease mov QWORD PTR \[eiz\*1\+0x87654321\],rax
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f2 f0 48 09 01       	xacquire lock or QWORD PTR \[rcx\],rax
 [ 	]*[a-f0-9]+:	f3 f0 48 09 01       	xrelease lock or QWORD PTR \[rcx\],rax


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 5/7] x86: move bad-use-of-TLS-reloc check
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
                   ` (3 preceding siblings ...)
  2022-12-02 10:20 ` [PATCH v7 4/7] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
@ 2022-12-02 10:20 ` Jan Beulich
  2022-12-02 10:21 ` [PATCH v7 6/7] x86: drop (now) stray IsString Jan Beulich
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:20 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Having it in match_template() is unhelpful. Neither does looking for the
next template to possibly match make any sense in that case, nor is the
resulting diagnostic making clear what the problem is.

While moving the check, also generalize it to include all SIMD and VEX-
encoded insns. This way an existing conditional can be re-used in
md_assemble(). Note though that this still leaves a lof of insns which
are also wrong to use with these relocations.

Further fold the remaining check (BFD_RELOC_386_GOT32) with the XRELEASE
related one a few lines down. This again allows re-using an existing
conditional.
---
Is the set of relocations enumerated here actually complete?
---
v2: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -5158,14 +5158,30 @@ md_assemble (char *line)
       return;
     }
 
-  /* Check for data size prefix on VEX/XOP/EVEX encoded and SIMD insns.  */
-  if (i.prefix[DATA_PREFIX]
-      && (is_any_vex_encoding (&i.tm)
-	  || i.tm.operand_types[i.imm_operands].bitfield.class >= RegMMX
-	  || i.tm.operand_types[i.imm_operands + 1].bitfield.class >= RegMMX))
+  if (is_any_vex_encoding (&i.tm)
+      || i.tm.operand_types[i.imm_operands].bitfield.class >= RegMMX
+      || i.tm.operand_types[i.imm_operands + 1].bitfield.class >= RegMMX)
     {
-      as_bad (_("data size prefix invalid with `%s'"), i.tm.name);
-      return;
+      /* Check for data size prefix on VEX/XOP/EVEX encoded and SIMD insns.  */
+      if (i.prefix[DATA_PREFIX])
+	{
+	  as_bad (_("data size prefix invalid with `%s'"), i.tm.name);
+	  return;
+	}
+
+      /* Don't allow e.g. KMOV in TLS code sequences.  */
+      for (j = i.imm_operands; j < i.operands; ++j)
+	switch (i.reloc[j])
+	  {
+	  case BFD_RELOC_386_TLS_GOTIE:
+	  case BFD_RELOC_386_TLS_LE_32:
+	  case BFD_RELOC_X86_64_GOTTPOFF:
+	  case BFD_RELOC_X86_64_TLSLD:
+	    as_bad (_("TLS relocation cannot be used with `%s'"), i.tm.name);
+	    return;
+	  default:
+	    break;
+	  }
     }
 
   /* Check if HLE prefix is OK.  */
@@ -6819,26 +6835,6 @@ match_template (char mnem_suffix)
 	    }
 	}
 
-      switch (i.reloc[0])
-	{
-	case BFD_RELOC_386_GOT32:
-	  /* Force 0x8b encoding for "mov foo@GOT, %eax".  */
-	  if (t->base_opcode == 0xa0
-	      && t->opcode_modifier.opcodespace == SPACE_BASE)
-	    continue;
-	  break;
-	case BFD_RELOC_386_TLS_GOTIE:
-	case BFD_RELOC_386_TLS_LE_32:
-	case BFD_RELOC_X86_64_GOTTPOFF:
-	case BFD_RELOC_X86_64_TLSLD:
-	  /* Don't allow KMOV in TLS code sequences.  */
-	  if (t->opcode_modifier.vex)
-	    continue;
-	  break;
-	default:
-	  break;
-	}
-
       /* We check register size if needed.  */
       if (t->opcode_modifier.checkregsize)
 	{
@@ -6868,12 +6864,19 @@ match_template (char mnem_suffix)
 	      && i.types[0].bitfield.dword
 	      && i.types[1].bitfield.instance == Accum)
 	    continue;
-	  /* xrelease mov %eax, <disp> is another special case. It must not
-	     match the accumulator-only encoding of mov.  */
-	  if (i.hle_prefix
-	      && t->base_opcode == 0xa0
+
+	  if (t->base_opcode == MOV_AX_DISP32
 	      && t->opcode_modifier.opcodespace == SPACE_BASE)
-	    continue;
+	    {
+	      /* Force 0x8b encoding for "mov foo@GOT, %eax".  */
+	      if (i.reloc[0] == BFD_RELOC_386_GOT32)
+		continue;
+
+	      /* xrelease mov %eax, <disp> is another special case. It must not
+		 match the accumulator-only encoding of mov.  */
+	      if (i.hle_prefix)
+		continue;
+	    }
 	  /* Fall through.  */
 
 	case 3:


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 6/7] x86: drop (now) stray IsString
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
                   ` (4 preceding siblings ...)
  2022-12-02 10:20 ` [PATCH v7 5/7] x86: move bad-use-of-TLS-reloc check Jan Beulich
@ 2022-12-02 10:21 ` Jan Beulich
  2022-12-02 10:21 ` [PATCH v7 7/7] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
  2022-12-09 10:56 ` [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:21 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

The need for them on the operand-less string insns has gone away with
the removal of maybe_adjust_templates() and associated logic. Since
i386_index_check() needs adjustment then anyway, take the opportunity
and also simplify it, possible again as a result of said removal (plus
the opcode template adjustments done here).
---
v7: PadLock parts split off.
v4: New.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -11373,11 +11373,9 @@ i386_index_check (const char *operand_st
 {
   const char *kind = "base/index";
   enum flag_code addr_mode = i386_addressing_mode ();
-  const insn_template *t = current_templates->start;
+  const insn_template *t = current_templates->end - 1;
 
-  if (t->opcode_modifier.isstring
-      && (current_templates->end[-1].opcode_modifier.isstring
-	  || i.mem_operands))
+  if (t->opcode_modifier.isstring)
     {
       /* Memory operands of string insns are special in that they only allow
 	 a single register (rDI, rSI, or rBX) as their memory address.  */
@@ -11394,14 +11392,12 @@ i386_index_check (const char *operand_st
 
       if (t->opcode_modifier.prefixok == PrefixRep)
 	{
-	  int es_op = current_templates->end[-1].opcode_modifier.isstring
-		      - IS_STRING_ES_OP0;
+	  int es_op = t->opcode_modifier.isstring - IS_STRING_ES_OP0;
 	  int op = 0;
 
-	  if (!current_templates->end[-1].operand_types[0].bitfield.baseindex
+	  if (!t->operand_types[0].bitfield.baseindex
 	      || ((!i.mem_operands != !intel_syntax)
-		  && current_templates->end[-1].operand_types[1]
-		     .bitfield.baseindex))
+		  && t->operand_types[1].bitfield.baseindex))
 	    op = 1;
 	  expected_reg
 	    = (const reg_entry *) str_hash_find (reg_hash,
@@ -11444,6 +11440,8 @@ i386_index_check (const char *operand_st
     }
   else
     {
+      t = current_templates->start;
+
       if (addr_mode != CODE_16BIT)
 	{
 	  /* 32-bit/64-bit checks.  */
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -490,37 +490,37 @@ loopne, 0xe0, None, Cpu64, JumpByte|No_b
 set<cc>, 0xf9<cc:opc>, 0, Cpu386, Modrm|No_wSuf|No_lSuf|No_sSuf|No_qSuf, { Reg8|Byte|Unspecified|BaseIndex }
 
 // String manipulation.
-cmps, 0xa6, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+cmps, 0xa6, None, 0, W|No_sSuf|RepPrefixOk, {}
 cmps, 0xa6, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-scmp, 0xa6, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+scmp, 0xa6, None, 0, W|No_sSuf|RepPrefixOk, {}
 scmp, 0xa6, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|IsString|RepPrefixOk, {}
+ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|RepPrefixOk, {}
 ins, 0x6c, None, Cpu186, W|No_sSuf|No_qSuf|IsStringEsOp1|RepPrefixOk, { InOutPortReg, Byte|Word|Dword|Unspecified|BaseIndex }
-outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|IsString|RepPrefixOk, {}
+outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|RepPrefixOk, {}
 outs, 0x6e, None, Cpu186, W|No_sSuf|No_qSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Unspecified|BaseIndex, InOutPortReg }
-lods, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+lods, 0xac, None, 0, W|No_sSuf|RepPrefixOk, {}
 lods, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 lods, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-slod, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+slod, 0xac, None, 0, W|No_sSuf|RepPrefixOk, {}
 slod, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 slod, 0xac, None, 0, W|No_sSuf|IsString|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-movs, 0xa4, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+movs, 0xa4, None, 0, W|No_sSuf|RepPrefixOk, {}
 movs, 0xa4, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-smov, 0xa4, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+smov, 0xa4, None, 0, W|No_sSuf|RepPrefixOk, {}
 smov, 0xa4, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-scas, 0xae, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+scas, 0xae, None, 0, W|No_sSuf|RepPrefixOk, {}
 scas, 0xae, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 scas, 0xae, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-ssca, 0xae, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+ssca, 0xae, None, 0, W|No_sSuf|RepPrefixOk, {}
 ssca, 0xae, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ssca, 0xae, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex, Acc|Byte|Word|Dword|Qword }
-stos, 0xaa, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+stos, 0xaa, None, 0, W|No_sSuf|RepPrefixOk, {}
 stos, 0xaa, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 stos, 0xaa, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Acc|Byte|Word|Dword|Qword, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-ssto, 0xaa, None, 0, W|No_sSuf|IsString|RepPrefixOk, {}
+ssto, 0xaa, None, 0, W|No_sSuf|RepPrefixOk, {}
 ssto, 0xaa, None, 0, W|No_sSuf|IsStringEsOp0|RepPrefixOk, { Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ssto, 0xaa, None, 0, W|No_sSuf|IsStringEsOp1|RepPrefixOk, { Acc|Byte|Word|Dword|Qword, Byte|Word|Dword|Qword|Unspecified|BaseIndex }
-xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|IsString, {}
+xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf, {}
 xlat, 0xd7, None, 0, No_wSuf|No_lSuf|No_sSuf|No_qSuf|IsString, { Byte|Unspecified|BaseIndex }
 
 // Bit manipulation.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 7/7] x86: further re-work insn/suffix recognition to also cover MOVSX
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
                   ` (5 preceding siblings ...)
  2022-12-02 10:21 ` [PATCH v7 6/7] x86: drop (now) stray IsString Jan Beulich
@ 2022-12-02 10:21 ` Jan Beulich
  2022-12-09 10:56 ` [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-02 10:21 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

PR gas/29524

Having templates with a suffix explicitly present has always been
quirky. After prior adjustment all that's left to also eliminate the
anomaly from move-with-sign-extend is to consolidate the insn templates
and to make may_need_pass2() cope (plus extend testsuite coverage).
---
v7: Re-base.
v6: Re-base over dropping of Pass2 attribute.
v5: Split off from earlier patch, merged with prior patch dealing with
    just x86-64's MOVSL.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4852,7 +4852,11 @@ static INLINE bool may_need_pass2 (const
 {
   return t->opcode_modifier.sse2avx
 	 /* Note that all SSE2AVX templates have at least one operand.  */
-	 && t->operand_types[t->operands - 1].bitfield.class == RegSIMD;
+	 ? t->operand_types[t->operands - 1].bitfield.class == RegSIMD
+	 : (t->opcode_modifier.opcodespace == SPACE_0F
+	    && (t->base_opcode | 1) == 0xbf)
+	   || (t->opcode_modifier.opcodespace == SPACE_BASE
+	       && t->base_opcode == 0x63);
 }
 
 /* This is the guts of the machine-dependent assembler.  LINE points to a
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -73,6 +73,7 @@ if [gas_32_check] then {
     run_dump_test "amd"
     run_dump_test "katmai"
     run_dump_test "jump"
+    run_dump_test "movs32"
     run_dump_test "movz32"
     run_dump_test "relax-1"
     run_dump_test "relax-2"
@@ -820,6 +821,7 @@ if [gas_64_check] then {
     run_dump_test "x86-64-segovr"
     run_list_test "x86-64-inval-seg" "-al"
     run_dump_test "x86-64-branch"
+    run_dump_test "movs64"
     run_dump_test "movz64"
     run_dump_test "x86-64-relax-1"
     run_dump_test "svme64"
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs.s
@@ -0,0 +1,39 @@
+	.text
+movs:
+	movsb	%al,%ax
+	movsb	(%eax),%ax
+	movsb	%al,%eax
+	movsb	(%eax),%eax
+.ifdef x86_64
+	movsb	%al,%rax
+	movsb	(%rax),%rax
+.endif
+
+	movsbw	%al,%ax
+	movsbw	(%eax),%ax
+	movsbl	%al,%eax
+	movsbl	(%eax),%eax
+.ifdef x86_64
+	movsbq	%al,%rax
+	movsbq	(%rax),%rax
+.endif
+
+	movsw	%ax,%eax
+	movsw	(%eax),%eax
+.ifdef x86_64
+	movsw	%ax,%rax
+	movsw	(%rax),%rax
+.endif
+
+	movswl	%ax,%eax
+	movswl	(%eax),%eax
+.ifdef x86_64
+	movswq	%ax,%rax
+	movswq	(%rax),%rax
+
+	movsl	%eax,%rax
+	movsl	(%rax),%rax
+
+	movslq	%eax,%rax
+	movslq	(%rax),%rax
+.endif
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs32.d
@@ -0,0 +1,22 @@
+#objdump: -dw
+#source: movs.s
+#name: x86 mov with sign-extend (32-bit object)
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <movs>:
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	0f bf 00 *	movswl \(%eax\),%eax
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/movs64.d
@@ -0,0 +1,30 @@
+#objdump: -dw
+#source: movs.s
+#name: x86 mov with sign-extend (64-bit object)
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <movs>:
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	67 66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	67 0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f be c0 *	movsbq %al,%rax
+[ 	]*[a-f0-9]+:	48 0f be 00 *	movsbq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	66 0f be c0 *	movsbw %al,%ax
+[ 	]*[a-f0-9]+:	67 66 0f be 00 *	movsbw \(%eax\),%ax
+[ 	]*[a-f0-9]+:	0f be c0 *	movsbl %al,%eax
+[ 	]*[a-f0-9]+:	67 0f be 00 *	movsbl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f be c0 *	movsbq %al,%rax
+[ 	]*[a-f0-9]+:	48 0f be 00 *	movsbq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	67 0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f bf c0 *	movswq %ax,%rax
+[ 	]*[a-f0-9]+:	48 0f bf 00 *	movswq \(%rax\),%rax
+[ 	]*[a-f0-9]+:	0f bf c0 *	movswl %ax,%eax
+[ 	]*[a-f0-9]+:	67 0f bf 00 *	movswl \(%eax\),%eax
+[ 	]*[a-f0-9]+:	48 0f bf c0 *	movswq %ax,%rax
+[ 	]*[a-f0-9]+:	48 0f bf 00 *	movswq \(%rax\),%rax
+#pass
--- a/gas/testsuite/gas/i386/movx16.l
+++ b/gas/testsuite/gas/i386/movx16.l
@@ -41,11 +41,11 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -82,7 +82,7 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movswl	%al, %cl
--- a/gas/testsuite/gas/i386/movx32.l
+++ b/gas/testsuite/gas/i386/movx32.l
@@ -41,11 +41,11 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -82,7 +82,7 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movswl	%al, %cl
--- a/gas/testsuite/gas/i386/movx64.l
+++ b/gas/testsuite/gas/i386/movx64.l
@@ -106,17 +106,17 @@
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cl
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %cl
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %cx
+[ 	]*[1-9][0-9]* \?\?\?\? 660FBEC8[ 	]+movsb	%al, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %cx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBEC8[ 	]+movsb	%al, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
-[ 	]*[1-9][0-9]*[ 	]+movsb	%al, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 480FBEC8[ 	]+movsb	%al, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%ax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsb	%rax, %rcx
@@ -192,12 +192,12 @@
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %cx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %ecx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %ecx
+[ 	]*[1-9][0-9]* \?\?\?\? 0FBFC8[ 	]+movsw	%ax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %ecx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %ecx
 [ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movsw	%al, %rcx
-[ 	]*[1-9][0-9]*[ 	]+movsw	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 480FBFC8[ 	]+movsw	%ax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movsw	%rax, %rcx
 [ 	]*[1-9][0-9]*[ 	]*
@@ -241,6 +241,46 @@
 [ 	]*[1-9][0-9]*[ 	]+movswq	%eax, %rcx
 [ 	]*[1-9][0-9]*[ 	]+movswq	%rax, %rcx
 [ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %cl
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %cx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%eax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %ecx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movsl	%al, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 4863C8[ 	]+movsl	%eax, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movsl	%rax, %rcx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %cl
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %cl
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %cx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %cx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%eax, %ecx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %ecx
+[ 	]*[1-9][0-9]*[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+movslq	%al, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%ax, %rcx
+[ 	]*[1-9][0-9]* \?\?\?\? 4863C8[ 	]+movslq	%eax, %rcx
+[ 	]*[1-9][0-9]*[ 	]+movslq	%rax, %rcx
+[ 	]*[1-9][0-9]*[ 	]*
 [ 	]*[1-9][0-9]*[ 	]+movzx:
 [ 	]*[1-9][0-9]*[ 	]+movzx	%al, %cl
 [ 	]*[1-9][0-9]*[ 	]+movzx	%ax, %cl
--- a/gas/testsuite/gas/i386/movx64.s
+++ b/gas/testsuite/gas/i386/movx64.s
@@ -241,6 +241,46 @@ movsx:
 	movswq	%eax, %rcx
 	movswq	%rax, %rcx
 
+	movsl	%al, %cl
+	movsl	%ax, %cl
+	movsl	%eax, %cl
+	movsl	%rax, %cl
+
+	movsl	%al, %cx
+	movsl	%ax, %cx
+	movsl	%eax, %cx
+	movsl	%rax, %cx
+
+	movsl	%al, %ecx
+	movsl	%ax, %ecx
+	movsl	%eax, %ecx
+	movsl	%rax, %ecx
+
+	movsl	%al, %rcx
+	movsl	%ax, %rcx
+	movsl	%eax, %rcx
+	movsl	%rax, %rcx
+
+	movslq	%al, %cl
+	movslq	%ax, %cl
+	movslq	%eax, %cl
+	movslq	%rax, %cl
+
+	movslq	%al, %cx
+	movslq	%ax, %cx
+	movslq	%eax, %cx
+	movslq	%rax, %cx
+
+	movslq	%al, %ecx
+	movslq	%ax, %ecx
+	movslq	%eax, %ecx
+	movslq	%rax, %ecx
+
+	movslq	%al, %rcx
+	movslq	%ax, %rcx
+	movslq	%eax, %rcx
+	movslq	%rax, %rcx
+
 movzx:
 	movzx	%al, %cl
 	movzx	%ax, %cl
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -172,14 +172,9 @@ mov, 0xf24, None, Cpu386|CpuNo64, D|RegM
 movbe, 0x0f38f0, None, CpuMovbe, D|Modrm|CheckRegSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Move with sign extend.
-// "movsbl" & "movsbw" must not be unified into "movsb" to avoid
-// conflict with the "movs" string move instruction.
-movsbl, 0xfbe, None, Cpu386, Modrm|NoSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg32 }
-movsbw, 0xfbe, None, Cpu386, Modrm|NoSuf, { Reg8|Byte|Unspecified|BaseIndex, Reg16 }
-movswl, 0xfbf, None, Cpu386, Modrm|NoSuf, { Reg16|Word|Unspecified|BaseIndex, Reg32 }
-movsbq, 0xfbe, None, Cpu64, Modrm|NoSuf|Size64, { Reg8|Byte|Unspecified|BaseIndex, Reg64 }
-movswq, 0xfbf, None, Cpu64, Modrm|NoSuf|Size64, { Reg16|Word|Unspecified|BaseIndex, Reg64 }
-movslq, 0x63, None, Cpu64, Modrm|NoSuf|Size64, { Reg32|Dword|Unspecified|BaseIndex, Reg64 }
+movsb, 0xfbe, None, Cpu386, Modrm|No_bSuf|No_sSuf, { Reg8|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+movsw, 0xfbf, None, Cpu386, Modrm|No_bSuf|No_wSuf|No_sSuf, { Reg16|Unspecified|BaseIndex, Reg32|Reg64 }
+movsl, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf, { Reg32|Unspecified|BaseIndex, Reg64 }
 movsx, 0xfbe, None, Cpu386, W|Modrm|No_lSuf|No_sSuf|No_qSuf, { Reg8|Reg16|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 movsx, 0x63, None, Cpu64, Modrm|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
 movsxd, 0x63, None, Cpu64, Modrm|NoSuf, { Reg32|Unspecified|BaseIndex, Reg32|Reg64 }


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 0/7] x86: suffix handling changes
  2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
                   ` (6 preceding siblings ...)
  2022-12-02 10:21 ` [PATCH v7 7/7] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
@ 2022-12-09 10:56 ` Jan Beulich
  7 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2022-12-09 10:56 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

On 02.12.2022 11:13, Jan Beulich via Binutils wrote:
> ... accompanied by a few other improvements (or so I hope) found
> along the way.
> 
> Apart from re-basing this is merely a re-submission, in an attempt to
> revive the discussion: There are multiple bugs being fixed by this
> series, which I'd like to ask to not put off simply with a subjective
> qualification of them being minor. Plus even if minor, it's one thing
> to not invest time into addressing a minor bug, but here a fix is
> being supplied, ready for committing. I'd like to also stress again
> that I've taken quite a bit of time to address concerns, and to hence
> find a viable compromise.
> 
> 1: constify parse_insn()'s input
> 2: re-work insn/suffix recognition
> 3: don't recognize/derive Q suffix in the common case
> 4: allow HLE store of accumulator to absolute 32-bit address
> 5: move bad-use-of-TLS-reloc check
> 6: drop (now) stray IsString
> 7: further re-work insn/suffix recognition to also cover MOVSX

In the absence of alternative proposals on how to address the several
bugs addressed within this series I'm intending to commit this perhaps
early next week, ...

> Note that patches 3...5 were previously approved (albeit 3 has now
> changed because of the re-ordering of the series in v5), but can't be
> committed ahead of the earlier ones.
> 
> While I don't think there's any collision with the also pending "x86:
> break gas dependency on libopcodes" series (apart from generated files),
> just in case: The series here is presented on top of that one, but could
> be easily re-based ahead.

... on top of this one (and finally unblocking further work I have
pending on top).

Jan

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-12-09 10:57 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-02 10:13 [PATCH v7 0/7] x86: suffix handling changes Jan Beulich
2022-12-02 10:18 ` [PATCH v7 1/7] x86: constify parse_insn()'s input Jan Beulich
2022-12-02 10:18 ` [PATCH v7 2/7] x86: re-work insn/suffix recognition Jan Beulich
2022-12-02 10:19 ` [PATCH v7 3/7] ix86: don't recognize/derive Q suffix in the common case Jan Beulich
2022-12-02 10:20 ` [PATCH v7 4/7] x86-64: allow HLE store of accumulator to absolute 32-bit address Jan Beulich
2022-12-02 10:20 ` [PATCH v7 5/7] x86: move bad-use-of-TLS-reloc check Jan Beulich
2022-12-02 10:21 ` [PATCH v7 6/7] x86: drop (now) stray IsString Jan Beulich
2022-12-02 10:21 ` [PATCH v7 7/7] x86: further re-work insn/suffix recognition to also cover MOVSX Jan Beulich
2022-12-09 10:56 ` [PATCH v7 0/7] x86: suffix handling changes Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).