public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 00/11] gas: scrubber adjustments
@ 2024-06-28 13:15 Jan Beulich
  2024-06-28 13:17 ` [PATCH 01/11] gas: there's no scrubber state 12 Jan Beulich
                   ` (10 more replies)
  0 siblings, 11 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:15 UTC (permalink / raw)
  To: Binutils; +Cc: Anthony Green, Alan Modra

The end goal is to finally deal with long standing issues in, in
particular, macro handling (see e.g. b3446f947bd1 ["gas: retain
whitespace between strings"]). That's the final, still work-in-
progress patch (posting for early commenting). I've come across
various other issues though, which some of the earlier patches aim
at addressing. Imo those are (in principle, subject to feedback of
course) fine to go in for 2.43. Whereas I'd prefer the final change
to go in relatively early in a release cycle, to allow time to deal
with real or perceived regressions (see patch 8 for an example
thereof).

One of the issues I'm going to face when getting farther in dealing
with target specific adjustments is that there are 5 gas targets
which don't even have a testsuite subdir:
- m32c (no maintainer)
- moxie
- ns32k (no maintainer)
- spu
- tic30 (no maintainer)
How is one supposed to have even the slightest idea whether a common
code change breaks such a target (Cc-ing the two maintainers of the
named targets which have one)? I'm already worried enough by targets
having only a pretty "slim" set of test cases.

01: there's no scrubber state 12
02: drop scrubber state -2
03: multi-byte warning adjustments
04: don't open-code IS_WHITESPACE() / IS_NEWLINE()
05: pre-init the scrubber's lex[]
06: re-work tic6x scrubber special case
07: consistently drop trailing whitespace when scrubbing
08: Arm: correct macro use in gas testsuite
09: Arm: relax gas testsuite whitespace expectations
10: aarch64: relax gas testsuite whitespace expectations
11: have scrubber retain more whitespace

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 01/11] gas: there's no scrubber state 12
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
@ 2024-06-28 13:17 ` Jan Beulich
  2024-06-28 13:17 ` [PATCH 02/11] gas: drop scrubber state -2 Jan Beulich
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:17 UTC (permalink / raw)
  To: Binutils

Apparently (beyond what's [easily] visible in git history) when this was
added there was confusion about scrubber states vs lex[] contents. For
the purposes here LEX_IS_DOUBLEDASH_1ST (which happens to also resolve
to 12) alone is sufficient. "state" is never set to 12, and it being 12
also isn't handled anywhere.

--- a/gas/app.c
+++ b/gas/app.c
@@ -436,10 +436,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	 11: After seeing a symbol character in state 0 (eg a label definition)
 	 -1: output string in out_string and go to the state in old_state
 	 -2: flush text until a '*' '/' is seen, then go to state old_state
-#ifdef TC_V850
-	 12: After seeing a dash, looking for a second dash as a start
-	     of comment.
-#endif
+	 12: no longer used
 #ifdef DOUBLEBAR_PARALLEL
 	 13: After seeing a vertical bar, looking for a second
 	     vertical bar as a parallel expression separator.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 02/11] gas: drop scrubber state -2
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
  2024-06-28 13:17 ` [PATCH 01/11] gas: there's no scrubber state 12 Jan Beulich
@ 2024-06-28 13:17 ` Jan Beulich
  2024-06-28 13:18 ` [PATCH 03/11] gas: multi-byte warning adjustments Jan Beulich
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:17 UTC (permalink / raw)
  To: Binutils

Instead re-use code handling LEX_IS_TWOCHAR_COMMENT_1ST, thus ensuring
that we wouldn't get bogus state transitions: For example, when we're
in states 0 or 1, a comment should be no different from whitespace
encountered in those states. Plus for e.g. x86 this results in such
comments now truly being converted to a blank, as mandated by
documentation. Both aspects apparently were a result of blindly (and
wrongly) moving to state 3 _before_ consuming the "ungot" blank.

Also amend a related comment elsewhere.

In the new testcase the .irp is to make visible in the listing all the
whitespace that the scrubber inserts / leaves in place.

--- a/gas/app.c
+++ b/gas/app.c
@@ -172,7 +172,8 @@ do_scrub_begin (int m68k_mri ATTRIBUTE_U
     lex[(unsigned char) *p] = LEX_IS_PARALLEL_SEPARATOR;
 #endif
 
-  /* Only allow slash-star comments if slash is not in use.
+  /* Only allow slash-star comments if slash is not in use.  Certain
+     other cases are dealt with in LEX_IS_LINE_COMMENT_START handling.
      FIXME: This isn't right.  We should always permit them.  */
   if (lex['/'] == 0)
     lex['/'] = LEX_IS_TWOCHAR_COMMENT_1ST;
@@ -435,7 +436,6 @@ do_scrub_chars (size_t (*get) (char *, s
 	 10: After seeing whitespace in state 9 (keep white before symchar)
 	 11: After seeing a symbol character in state 0 (eg a label definition)
 	 -1: output string in out_string and go to the state in old_state
-	 -2: flush text until a '*' '/' is seen, then go to state old_state
 	 12: no longer used
 #ifdef DOUBLEBAR_PARALLEL
 	 13: After seeing a vertical bar, looking for a second
@@ -538,43 +538,6 @@ do_scrub_chars (size_t (*get) (char *, s
 	  PUT (ch);
 	  continue;
 
-	case -2:
-	  for (;;)
-	    {
-	      do
-		{
-		  ch = GET ();
-
-		  if (ch == EOF)
-		    {
-		      as_warn (_("end of file in comment"));
-		      goto fromeof;
-		    }
-
-		  if (ch == '\n')
-		    PUT ('\n');
-		}
-	      while (ch != '*');
-
-	      while ((ch = GET ()) == '*')
-		;
-
-	      if (ch == EOF)
-		{
-		  as_warn (_("end of file in comment"));
-		  goto fromeof;
-		}
-
-	      if (ch == '/')
-		break;
-
-	      UNGET (ch);
-	    }
-
-	  state = old_state;
-	  UNGET (' ');
-	  continue;
-
 	case 4:
 	  ch = GET ();
 	  if (ch == EOF)
@@ -1031,6 +994,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	  ch2 = GET ();
 	  if (ch2 == '*')
 	    {
+	twochar_comment:
 	      for (;;)
 		{
 		  do
@@ -1245,15 +1209,9 @@ do_scrub_chars (size_t (*get) (char *, s
 	    {
 	      ch2 = GET ();
 	      if (ch2 == '*')
-		{
-		  old_state = 3;
-		  state = -2;
-		  break;
-		}
-	      else if (ch2 != EOF)
-		{
-		  UNGET (ch2);
-		}
+		goto twochar_comment;
+	      if (ch2 != EOF)
+		UNGET (ch2);
 	    }
 
 	  if (state == 0 || state == 1)	/* Only comment at start of line.  */
--- /dev/null
+++ b/gas/testsuite/gas/i386/comments.l
@@ -0,0 +1,30 @@
+.*: Assembler messages:
+.*:6: Error: .*
+.*:7: Error: .*
+[ 	]*[0-9]+[ 	]+scrubber:
+[ 	]*[0-9]+[ 	]+\.irp x,""
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+/\*    \*/vaddps	%zmm0, %zmm1, %zmm2
+[ 	]*[0-9]+[ 	]+vaddps/\*\*/%zmm0, %zmm1, %zmm2
+[ 	]*[0-9]+[ 	]+vadd/\*\*/ps %zmm0, %zmm1, %zmm2
+[ 	]*[0-9]+[ 	]+vaddps	%zmm/\*\*/0, %zmm1, %zmm2
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+vaddps	%zmm0, %zmm1, %zmm2# ...
+[ 	]*[0-9]+[ 	]+vaddps	%zmm0, %zmm1, %zmm2	# ...
+[ 	]*[0-9]+[ 	]+vaddps	%zmm0, %zmm1, %zmm2/\*    \*/# ...
+[ 	]*[0-9]+[ 	]*
+[ 	]*[0-9]+[ 	]+\.endr
+[ 	]*[0-9]+[ 	]+> +
+[ 	]*[0-9]+ \?\?\?\? [0-9A-F]+[ 	]+>  vaddps %zmm0,%zmm1,%zmm2
+#...
+[ 	]*[0-9]+ \?\?\?\? [0-9A-F]+[ 	]+>  vaddps %zmm0,%zmm1,%zmm2
+#...
+[ 	]*[0-9]+[ 	]+>  vadd ps %zmm0,%zmm1,%zmm2
+[ 	]*[0-9]+[ 	]+>  vaddps %zmm 0,%zmm1,%zmm2
+[ 	]*[0-9]+[ 	]+> +
+[ 	]*[0-9]+ \?\?\?\? [0-9A-F]+[ 	]+>  vaddps %zmm0,%zmm1,%zmm2
+#...
+[ 	]*[0-9]+ \?\?\?\? [0-9A-F]+[ 	]+>  vaddps %zmm0,%zmm1,%zmm2
+#...
+[ 	]*[0-9]+ \?\?\?\? [0-9A-F]+[ 	]+>  vaddps %zmm0,%zmm1,%zmm2
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/comments.s
@@ -0,0 +1,13 @@
+scrubber:
+	.irp x,""
+
+/*    */vaddps	%zmm0, %zmm1, %zmm2
+	vaddps/**/%zmm0, %zmm1, %zmm2
+	vadd/**/ps %zmm0, %zmm1, %zmm2
+	vaddps	%zmm/**/0, %zmm1, %zmm2
+
+	vaddps	%zmm0, %zmm1, %zmm2# ...
+	vaddps	%zmm0, %zmm1, %zmm2	# ...
+	vaddps	%zmm0, %zmm1, %zmm2/*    */# ...
+
+	.endr
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -105,6 +105,7 @@ if [gas_32_check] then {
     run_dump_test "equ"
     run_list_test "equ-2" "-al"
     run_list_test "equ-bad"
+    run_list_test "comments" "-almn"
     run_dump_test "divide"
     run_dump_test "quoted"
     run_dump_test "quoted2"


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 03/11] gas: multi-byte warning adjustments
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
  2024-06-28 13:17 ` [PATCH 01/11] gas: there's no scrubber state 12 Jan Beulich
  2024-06-28 13:17 ` [PATCH 02/11] gas: drop scrubber state -2 Jan Beulich
@ 2024-06-28 13:18 ` Jan Beulich
  2024-06-28 13:18 ` [PATCH 04/11] gas: don't open-code IS_WHITESPACE() / IS_NEWLINE() Jan Beulich
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:18 UTC (permalink / raw)
  To: Binutils

First input_scrub_next_buffer()'s invocation was wrong, leading to input
only being checked from the last newline till the end of the current
buffer. Correcting the invocation, however, leads to duplicate checking
unless -f (or the #NO_APP equivalent thereof) is in effect. Move the
invocation to input_file_give_next_buffer(), to restrict it accordingly.

Then, when macros contain multi-byte characters, warning about them
again in every expansion isn't useful. Suppress such warnings from
sb_scrub_and_add_sb().
---
The latest with this change the 3-way as_warn() in
scan_for_multibyte_characters() becomes questionable: as_warn() is going
to prefix the warning with data obtained from as_where() anyway. This
means the 2nd one really is no more useful than the 1st one. Whereas the
3rd one now comes into play only for sufficiently large inputs, where
the line number would already be non-zero when fetching subsequent parts
of the input. Saying "at or near line" only then is bogus, though: When
the line number is still zero, we could as well say "at or near line 1".
Bumping by 1 the line number reported is going to get closer even in the
common case, as on average we're in the middle of a buffer rather than
at its start. Alternatively / additionally we'd also get things more
correct if we didn't uniformly bump the number, but said "at or after
line" instead: The report is never going to be for a past line.

Why is input re-scrubbed from input_scrub_include_sb() (via
sb_scrub_and_add_sb())? I can kind of see that there may be a need to
collapse successive whitespace, which may have formed in the course of
macro expansion. But I'm pretty sure comments which may have "formed"
should be left alone, for whatever meaning that may have. Other more
special handling may be equally inapplicable; the handling of double
quoted strings in particular may cause issues here as well, considering
especially that elsewhere in gas we have two different ways of escaping
double quotes in such strings.

--- a/gas/app.c
+++ b/gas/app.c
@@ -412,7 +412,8 @@ scan_for_multibyte_characters (const uns
    This is the way the old code used to work.  */
 
 size_t
-do_scrub_chars (size_t (*get) (char *, size_t), char *tostart, size_t tolen)
+do_scrub_chars (size_t (*get) (char *, size_t), char *tostart, size_t tolen,
+		bool check_multibyte)
 {
   char *to = tostart;
   char *toend = tostart + tolen;
@@ -515,7 +516,7 @@ do_scrub_chars (size_t (*get) (char *, s
       from = input_buffer;
       fromend = from + fromlen;
 
-      if (multibyte_handling == multibyte_warn)
+      if (check_multibyte)
 	(void) scan_for_multibyte_characters ((const unsigned char *) from,
 					      (const unsigned char* ) fromend,
 					      true /* Generate warnings.  */);
--- a/gas/as.h
+++ b/gas/as.h
@@ -496,7 +496,7 @@ void   input_scrub_insert_line (const ch
 void   input_scrub_insert_file (char *);
 char * input_scrub_new_file (const char *);
 char * input_scrub_next_buffer (char **bufp);
-size_t do_scrub_chars (size_t (*get) (char *, size_t), char *, size_t);
+size_t do_scrub_chars (size_t (*get) (char *, size_t), char *, size_t, bool);
 size_t do_scrub_pending (void);
 bool   scan_for_multibyte_characters (const unsigned char *, const unsigned char *, bool);
 int    gen_to_words (LITTLENUM_TYPE *, int, long);
--- a/gas/input-file.c
+++ b/gas/input-file.c
@@ -240,9 +240,20 @@ input_file_give_next_buffer (char *where
      Since the assembler shouldn't do any output to stdout, we
      don't bother to synch output and input.  */
   if (preprocess)
-    size = do_scrub_chars (input_file_get, where, BUFFER_SIZE);
+    size = do_scrub_chars (input_file_get, where, BUFFER_SIZE,
+                           multibyte_handling == multibyte_warn);
   else
-    size = input_file_get (where, BUFFER_SIZE);
+    {
+      size = input_file_get (where, BUFFER_SIZE);
+
+      if (multibyte_handling == multibyte_warn)
+	{
+	  const unsigned char *start = (const unsigned char *) where;
+
+	  (void) scan_for_multibyte_characters (start, start + size,
+						true /* Generate warnings */);
+	}
+    }
 
   if (size)
     return_value = where + size;
--- a/gas/input-scrub.c
+++ b/gas/input-scrub.c
@@ -386,11 +386,6 @@ input_scrub_next_buffer (char **bufp)
 	  ++p;
 	}
 
-      if (multibyte_handling == multibyte_warn)
-	(void) scan_for_multibyte_characters ((const unsigned char *) p,
-					      (const unsigned char *) limit,
-					      true /* Generate warnings */);
-
       /* We found a newline in the newly read chars.  */
       partial_where = p;
       partial_size = limit - p;
--- a/gas/sb.c
+++ b/gas/sb.c
@@ -124,7 +124,7 @@ sb_scrub_and_add_sb (sb *ptr, sb *s)
 	break;
       sb_check (ptr, copy);
       ptr->len += do_scrub_chars (scrub_from_sb, ptr->ptr + ptr->len,
-				  ptr->max - ptr->len);
+				  ptr->max - ptr->len, false);
     }
 
   sb_to_scrub = 0;
--- a/gas/testsuite/gas/all/gas.exp
+++ b/gas/testsuite/gas/all/gas.exp
@@ -532,3 +532,5 @@ run_dump_test "pr27384"
 run_dump_test "pr27381"
 run_dump_test "multibyte1"
 run_dump_test "multibyte2"
+run_list_test "multibyte3" "--multibyte-handling=warn"
+run_list_test "multibyte3" "-f --multibyte-handling=warn"
--- /dev/null
+++ b/gas/testsuite/gas/all/multibyte3.l
@@ -0,0 +1,10 @@
+[^:]*: Assembler messages:
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*: Warning: multibyte character \(.*\) encountered in .*
+[^:]*:[0-9]+: Warning: end of input
--- /dev/null
+++ b/gas/testsuite/gas/all/multibyte3.s
@@ -0,0 +1,11 @@
+	.macro m
+UmlautÜ\@:
+	.endm
+
+UmlautÄ:
+	.irpc c,szß
+UmlautÖ\@\c\():
+	m
+	.endr
+
+	.warning "end of input"


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 04/11] gas: don't open-code IS_WHITESPACE() / IS_NEWLINE()
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2024-06-28 13:18 ` [PATCH 03/11] gas: multi-byte warning adjustments Jan Beulich
@ 2024-06-28 13:18 ` Jan Beulich
  2024-06-28 13:19 ` [PATCH 05/11] gas: pre-init the scrubber's lex[] Jan Beulich
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:18 UTC (permalink / raw)
  To: Binutils

Better be consistent in use of the wrapper macros, which imo also helps
readability.
---
I'm wondering about LEX_IS_COLON, which has no corresponding IS_COLON():
Do we really need this? IOW do we really expect architectures may be
using other than a colon (and other than nothing at all) to "terminate"
a label definition?

--- a/gas/app.c
+++ b/gas/app.c
@@ -819,7 +819,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	  if (ch != '\0'
 	      && (*mri_state == ch
 		  || (*mri_state == ' '
-		      && lex[ch] == LEX_IS_WHITESPACE)
+		      && IS_WHITESPACE (ch))
 		  || (*mri_state == '0'
 		      && ch == '1')))
 	    {
@@ -827,8 +827,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	      ++mri_state;
 	    }
 	  else if (*mri_state != '\0'
-		   || (lex[ch] != LEX_IS_WHITESPACE
-		       && lex[ch] != LEX_IS_NEWLINE))
+		   || (!IS_WHITESPACE (ch) && !IS_NEWLINE (ch)))
 	    {
 	      /* We did not get the expected character, or we didn't
 		 get a valid terminating character after seeing the


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 05/11] gas: pre-init the scrubber's lex[]
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (3 preceding siblings ...)
  2024-06-28 13:18 ` [PATCH 04/11] gas: don't open-code IS_WHITESPACE() / IS_NEWLINE() Jan Beulich
@ 2024-06-28 13:19 ` Jan Beulich
  2024-07-01  1:45   ` Hans-Peter Nilsson
  2024-06-28 13:20 ` [PATCH 06/11] gas: re-work tic6x scrubber special case Jan Beulich
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:19 UTC (permalink / raw)
  To: Binutils; +Cc: Anthony Green, Alan Modra

While we can't - unlike an old comment suggests - do this fully, we can
certainly do part of this at compile time.

Since it's adjacent, also drop the unnecessary forward declaration of
process_escape().
---
Assumptions on certain ordering between characters exists elsewhere
already, so I assume leveraging this here as well isn't going to be a
problem.

--- a/gas/app.c
+++ b/gas/app.c
@@ -58,10 +58,6 @@ static const char * symver_state;
 
 static char last_char;
 
-static char lex[256];
-static const char symbol_chars[] =
-"$._ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
-
 #define LEX_IS_SYMBOL_COMPONENT		1
 #define LEX_IS_WHITESPACE		2
 #define LEX_IS_LINE_SEPARATOR		3
@@ -93,23 +89,25 @@ static const char symbol_chars[] =
 #define IS_LINE_COMMENT(c)		(lex[c] == LEX_IS_LINE_COMMENT_START)
 #define	IS_NEWLINE(c)			(lex[c] == LEX_IS_NEWLINE)
 
-static int process_escape (int);
-
-/* FIXME-soon: The entire lexer/parser thingy should be
-   built statically at compile time rather than dynamically
-   each and every time the assembler is run.  xoxorich.  */
+static char lex[256] = {
+  [' ']  = LEX_IS_WHITESPACE,
+  ['\t'] = LEX_IS_WHITESPACE,
+  ['\r'] = LEX_IS_WHITESPACE,
+  ['\n'] = LEX_IS_NEWLINE,
+  [':']  = LEX_IS_COLON,
+  ['$']  = LEX_IS_SYMBOL_COMPONENT,
+  ['.']  = LEX_IS_SYMBOL_COMPONENT,
+  ['_']  = LEX_IS_SYMBOL_COMPONENT,
+  ['A' ... 'Z'] = LEX_IS_SYMBOL_COMPONENT,
+  ['a' ... 'z'] = LEX_IS_SYMBOL_COMPONENT,
+  ['0' ... '9'] = LEX_IS_SYMBOL_COMPONENT,
+  [128 ... 255] = LEX_IS_SYMBOL_COMPONENT,
+};
 
 void
 do_scrub_begin (int m68k_mri ATTRIBUTE_UNUSED)
 {
   const char *p;
-  int c;
-
-  lex[' '] = LEX_IS_WHITESPACE;
-  lex['\t'] = LEX_IS_WHITESPACE;
-  lex['\r'] = LEX_IS_WHITESPACE;
-  lex['\n'] = LEX_IS_NEWLINE;
-  lex[':'] = LEX_IS_COLON;
 
 #ifdef TC_M68K
   scrub_m68k_mri = m68k_mri;
@@ -133,11 +131,6 @@ do_scrub_begin (int m68k_mri ATTRIBUTE_U
 
   /* Note that these override the previous defaults, e.g. if ';' is a
      comment char, then it isn't a line separator.  */
-  for (p = symbol_chars; *p; ++p)
-    lex[(unsigned char) *p] = LEX_IS_SYMBOL_COMPONENT;
-
-  for (c = 128; c < 256; ++c)
-    lex[c] = LEX_IS_SYMBOL_COMPONENT;
 
 #ifdef tc_symbol_chars
   /* This macro permits the processor to specify all characters which


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 06/11] gas: re-work tic6x scrubber special case
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (4 preceding siblings ...)
  2024-06-28 13:19 ` [PATCH 05/11] gas: pre-init the scrubber's lex[] Jan Beulich
@ 2024-06-28 13:20 ` Jan Beulich
  2024-06-28 13:22 ` [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing Jan Beulich
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:20 UTC (permalink / raw)
  To: Binutils; +Cc: Joseph Myers

Two successive PUT() without a state change in between can't be right:
The first PUT() may take the "goto tofull" path, leading to the
subsequent character being processed later in the previously set state
(1 in this case), rather than the state we were in upon entry to the
switch() (13 in this case). Move to state 2 instead, thus forcing the
retaining of possible whitespace between "||" and a subsequent '>',
making sure not to mistake that for "||>".
---
While running the testsuite suggests this is okay a change to make just
on its own, if this was assumed to cause issues by the tic6x maintainer
merely because of issues with whitespace between what the states
description calls "opcode" and "operands" right now, this would need
folding into the subsequent "gas: have scrubber retain more whitespace".

The assumption here is (as the new comment says) that no label may
follow the "||" or "||>". In fact, if that assumption was true for all
targets setting DOUBLEBAR_PARALLEL, the #ifdef could be avoided
altogether, the latest with said subsequent "gas: have scrubber retain
more whitespace".

--- a/gas/app.c
+++ b/gas/app.c
@@ -685,20 +685,16 @@ do_scrub_chars (size_t (*get) (char *, s
 	  if (ch != '|')
 	    abort ();
 
+#ifndef TC_TIC6X
 	  /* Reset back to state 1 and pretend that we are parsing a
 	     line from just after the first white space.  */
 	  state = 1;
-	  PUT ('|');
-#ifdef TC_TIC6X
-	  /* "||^" is used for SPMASKed instructions.  */
-	  ch = GET ();
-	  if (ch == EOF)
-	    goto fromeof;
-	  else if (ch == '^')
-	    PUT ('^');
-	  else
-	    UNGET (ch);
+#else
+	  /* Don't permit white space between "||" and a possible '>'.  There
+	     shouldn't be any labels subsequent to that anyway.  */
+	  state = 2;
 #endif
+	  PUT ('|');
 	  continue;
 #endif
 #ifdef TC_Z80


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (5 preceding siblings ...)
  2024-06-28 13:20 ` [PATCH 06/11] gas: re-work tic6x scrubber special case Jan Beulich
@ 2024-06-28 13:22 ` Jan Beulich
  2024-06-28 13:53   ` Maciej W. Rozycki
  2024-06-28 13:23 ` [PATCH 08/11] Arm: correct macro use in gas testsuite Jan Beulich
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:22 UTC (permalink / raw)
  To: Binutils; +Cc: Chenghua Xu, Maciej W. Rozycki

From especially the checks for the two separator forms it appears to
follow that the construct being touched is about trailing whitespace. In
such a case, considering that for many targets ordinary and line comment
chars overlap, take into account that line comment chars override
ordinary ones in lex[] (logic elsewhere in do_scrub_chars() actually
depends on that ordering, and also accounts for this overriding).

Plus of course IS_NEWLINE() would better also be consulted. Note also
that the DOUBLESLASH_LINE_COMMENTS change should generally have no
effect just yet; it's a prereq for a later change but better fits here.

Leave respective comments as well, and update documentation to correct
which comment form is actually replaced by a single blank (i.e. neither
the ones starting with what {,tc_}comment_chars[] has nor the ones
starting with what line_comment_chars[] has).
---
In the MIPS testsuite adjustments I wasn't sure how exactly to make the
necessary changes: I could also have zapped the two blanks each, but
then I wondered why they were part of the expectations in the first
place.

Why are there both comment_chars[] and tc_comment_chars[]? The former is
per-arch anyway, so why would there be a need for an arch override? Is
this perhaps merely a historical leftover, which could be eliminated?

--- a/gas/app.c
+++ b/gas/app.c
@@ -87,6 +87,7 @@ static char last_char;
 #define IS_PARALLEL_SEPARATOR(c)	(lex[c] == LEX_IS_PARALLEL_SEPARATOR)
 #define IS_COMMENT(c)			(lex[c] == LEX_IS_COMMENT_START)
 #define IS_LINE_COMMENT(c)		(lex[c] == LEX_IS_LINE_COMMENT_START)
+#define IS_TWOCHAR_COMMENT_1ST(c)	(lex[c] == LEX_IS_TWOCHAR_COMMENT_1ST)
 #define	IS_NEWLINE(c)			(lex[c] == LEX_IS_NEWLINE)
 
 static char lex[256] = {
@@ -149,6 +150,9 @@ do_scrub_begin (int m68k_mri ATTRIBUTE_U
   for (p = tc_comment_chars; *p; p++)
     lex[(unsigned char) *p] = LEX_IS_COMMENT_START;
 
+  /* While counter intuitive to have more special purpose line comment chars
+     override more general purpose ordinary ones, logic in do_scrub_chars()
+     depends on this ordering.   */
   for (p = line_comment_chars; *p; p++)
     lex[(unsigned char) *p] = LEX_IS_LINE_COMMENT_START;
 
@@ -888,7 +892,12 @@ do_scrub_chars (size_t (*get) (char *, s
 		}
 	    }
 #endif
+
+	  /* Prune trailing whitespace.  */
 	  if (IS_COMMENT (ch)
+	      || (IS_LINE_COMMENT (ch)
+	          && (state < 1 || strchr (tc_comment_chars, ch)))
+	      || IS_NEWLINE (ch)
 	      || IS_LINE_SEPARATOR (ch)
 	      || IS_PARALLEL_SEPARATOR (ch))
 	    {
@@ -901,6 +910,16 @@ do_scrub_chars (size_t (*get) (char *, s
 		}
 	      goto recycle;
 	    }
+#ifdef DOUBLESLASH_LINE_COMMENTS
+	  if (IS_TWOCHAR_COMMENT_1ST (ch))
+	    {
+	      ch2 = GET ();
+	      if (ch2 != EOF)
+	        UNGET (ch2);
+	      if (ch2 == '/')
+		goto recycle;
+	    }
+#endif
 
 	  /* If we're in state 2 or 11, we've seen a non-white
 	     character followed by whitespace.  If the next character
--- a/gas/doc/as.texi
+++ b/gas/doc/as.texi
@@ -2987,11 +2987,11 @@ as exactly one space.
 @section Comments
 
 @cindex comments
-There are two ways of rendering comments to @command{@value{AS}}.  In both
-cases the comment is equivalent to one space.
+There are two ways of rendering comments to @command{@value{AS}}.
 
 Anything from @samp{/*} through the next @samp{*/} is a comment.
-This means you may not nest these comments.
+This means you may not nest these comments.  Such a comment is equivalent to
+one space, plus bumping the line counter accordingly.
 
 @smallexample
 /*
--- a/gas/testsuite/gas/mips/mips16-32@mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16-32@mips16-insn-e.l
@@ -25,7 +25,7 @@
 .*:50: Warning: extended operand requested but not required
 .*:51: Error: opcode not supported on this processor: mips1 \(mips1\) `restore\.e 128'
 .*:52: Error: opcode not supported on this processor: mips1 \(mips1\) `save\.e 128'
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -71,7 +71,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Error: opcode not supported on this processor: mips1 \(mips1\) `dsrl\.e \$16,8'
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'
--- a/gas/testsuite/gas/mips/mips16-64@mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16-64@mips16-insn-e.l
@@ -25,7 +25,7 @@
 .*:50: Warning: extended operand requested but not required
 .*:51: Error: opcode not supported on this processor: mips3 \(mips3\) `restore\.e 128'
 .*:52: Error: opcode not supported on this processor: mips3 \(mips3\) `save\.e 128'
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -71,7 +71,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Warning: extended operand requested but not required
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'
--- a/gas/testsuite/gas/mips/mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16-insn-e.l
@@ -23,7 +23,7 @@
 .*:48: Warning: extended operand requested but not required
 .*:49: Warning: extended operand requested but not required
 .*:50: Warning: extended operand requested but not required
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -69,7 +69,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Warning: extended operand requested but not required
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'
--- a/gas/testsuite/gas/mips/mips16e-32@mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16e-32@mips16-insn-e.l
@@ -23,7 +23,7 @@
 .*:48: Warning: extended operand requested but not required
 .*:49: Warning: extended operand requested but not required
 .*:50: Warning: extended operand requested but not required
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -69,7 +69,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Error: opcode not supported on this processor: mips32 \(mips32\) `dsrl\.e \$16,8'
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'
--- a/gas/testsuite/gas/mips/mips16e2-32@mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16e2-32@mips16-insn-e.l
@@ -23,7 +23,7 @@
 .*:48: Warning: extended operand requested but not required
 .*:49: Warning: extended operand requested but not required
 .*:50: Warning: extended operand requested but not required
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -69,7 +69,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Error: opcode not supported on this processor: mips32r2 \(mips32r2\) `dsrl\.e \$16,8'
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'
--- a/gas/testsuite/gas/mips/mips16e2-interaptiv-mr2@mips16-insn-e.l
+++ b/gas/testsuite/gas/mips/mips16e2-interaptiv-mr2@mips16-insn-e.l
@@ -23,7 +23,7 @@
 .*:48: Warning: extended operand requested but not required
 .*:49: Warning: extended operand requested but not required
 .*:50: Warning: extended operand requested but not required
-.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e '
+.*:53: Error: unrecognized extended version of MIPS16 opcode `nop\.e ?'
 .*:54: Error: unrecognized extended version of MIPS16 opcode `move\.e \$0,\$16'
 .*:55: Error: unrecognized extended version of MIPS16 opcode `move\.e \$16,\$0'
 .*:57: Warning: extended operand requested but not required
@@ -69,7 +69,7 @@
 .*:123: Error: unrecognized extended version of MIPS16 opcode `srav\.e \$16,\$16'
 .*:124: Error: operand 2 must be an immediate expression `sra\.e \$16,\$16'
 .*:125: Error: opcode not supported on this processor: interaptiv-mr2 \(mips32r3\) `dsrl\.e \$16,8'
-.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e '
+.*:126: Error: unrecognized extended version of MIPS16 opcode `entry\.e ?'
 .*:127: Error: unrecognized extended version of MIPS16 opcode `entry\.e \$31'
 .*:128: Error: unrecognized extended version of MIPS16 opcode `exit\.e \$f0'
 .*:129: Error: unrecognized extended version of MIPS16 opcode `exit\.e'


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 08/11] Arm: correct macro use in gas testsuite
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (6 preceding siblings ...)
  2024-06-28 13:22 ` [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing Jan Beulich
@ 2024-06-28 13:23 ` Jan Beulich
  2024-06-28 13:24 ` [PATCH 09/11] Arm: relax gas testsuite whitespace expectations Jan Beulich
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:23 UTC (permalink / raw)
  To: Binutils; +Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw

The way the inner macro invocations are written doesn't quite work as
expected (and would actually break subsequently): Due to overly
aggressive removal of whitespace by the scrubber, the incoming \sym and
\offset arguments actually get concatenated; an empty 3rd argument is
being passed to ldrtest2. That just so happened to work as intended; any
use of \offset alone would have exposed the problem. Quote the 3rd
argument, thus retaining enough whitespace to be independent of scrubber
internals.

--- a/gas/testsuite/gas/arm/group-reloc-ldrs-encoding-bad.s
+++ b/gas/testsuite/gas/arm/group-reloc-ldrs-encoding-bad.s
@@ -14,7 +14,7 @@
 
 	.macro ldrtest load store sym offset
 
-	ldrtest2 \load \sym \offset
+	ldrtest2 \load \sym "\offset"
 
 	\store	r0, [r0, #:pc_g1:(\sym \offset)]
 	\store	r0, [r0, #:pc_g2:(\sym \offset)]
--- a/gas/testsuite/gas/arm/group-reloc-ldrs.s
+++ b/gas/testsuite/gas/arm/group-reloc-ldrs.s
@@ -14,7 +14,7 @@
 
 	.macro ldrtest load store sym offset
 
-	ldrtest2 \load \sym \offset
+	ldrtest2 \load \sym "\offset"
 
 	\store	r0, [r0, #:pc_g1:(\sym \offset)]
 	\store	r0, [r0, #:pc_g2:(\sym \offset)]


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 09/11] Arm: relax gas testsuite whitespace expectations
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (7 preceding siblings ...)
  2024-06-28 13:23 ` [PATCH 08/11] Arm: correct macro use in gas testsuite Jan Beulich
@ 2024-06-28 13:24 ` Jan Beulich
  2024-06-28 13:26 ` [PATCH 10/11] aarch64: " Jan Beulich
  2024-06-28 13:31 ` [PATCH WIP/RFC 11/11] gas: have scrubber retain more whitespace Jan Beulich
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:24 UTC (permalink / raw)
  To: Binutils; +Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw

In a subsequent change the scrubber is going to be changed to retain
further whitespace. Test case expectations generally would better not
depend on the specific whitespace treatment by the scrubber, unless of
course a test is specifically about it. Adjust relevant test cases to
permit blanks where those will subsequently appear.
---
This is adding just the blanks that are going to be needed; imo it would
generally be better if test case expectations were, from the very
beginning, written to focus on what is being tested, rather than taking
verbatim copies of the respective tool's output.

--- a/gas/testsuite/gas/arm/addthumb2err.l
+++ b/gas/testsuite/gas/arm/addthumb2err.l
@@ -1,21 +1,21 @@
 [^:]*: Assembler messages:
-[^:]*:9: Error: shift value over 3 not allowed in thumb mode -- `add sp,sp,r0,LSL#4'
-[^:]*:10: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,LSR#3'
-[^:]*:11: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,ASR#3'
-[^:]*:12: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,ROR#3'
+[^:]*:9: Error: shift value over 3 not allowed in thumb mode -- `add sp,sp,r0,LSL ?#4'
+[^:]*:10: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,LSR ?#3'
+[^:]*:11: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,ASR ?#3'
+[^:]*:12: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,ROR ?#3'
 [^:]*:13: Error: only LSL shift allowed in thumb mode -- `add sp,sp,r0,RRX'
-[^:]*:14: Error: shift value over 3 not allowed in thumb mode -- `adds sp,sp,r0,LSL#4'
-[^:]*:15: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,LSR#3'
-[^:]*:16: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,ASR#3'
-[^:]*:17: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,ROR#3'
+[^:]*:14: Error: shift value over 3 not allowed in thumb mode -- `adds sp,sp,r0,LSL ?#4'
+[^:]*:15: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,LSR ?#3'
+[^:]*:16: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,ASR ?#3'
+[^:]*:17: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,ROR ?#3'
 [^:]*:18: Error: only LSL shift allowed in thumb mode -- `adds sp,sp,r0,RRX'
-[^:]*:19: Error: shift value over 3 not allowed in thumb mode -- `sub sp,sp,r0,LSL#4'
-[^:]*:20: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,LSR#3'
-[^:]*:21: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,ASR#3'
-[^:]*:22: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,ROR#3'
+[^:]*:19: Error: shift value over 3 not allowed in thumb mode -- `sub sp,sp,r0,LSL ?#4'
+[^:]*:20: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,LSR ?#3'
+[^:]*:21: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,ASR ?#3'
+[^:]*:22: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,ROR ?#3'
 [^:]*:23: Error: only LSL shift allowed in thumb mode -- `sub sp,sp,r0,RRX'
-[^:]*:24: Error: shift value over 3 not allowed in thumb mode -- `subs sp,sp,r0,LSL#4'
-[^:]*:25: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,LSR#3'
-[^:]*:26: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,ASR#3'
-[^:]*:27: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,ROR#3'
+[^:]*:24: Error: shift value over 3 not allowed in thumb mode -- `subs sp,sp,r0,LSL ?#4'
+[^:]*:25: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,LSR ?#3'
+[^:]*:26: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,ASR ?#3'
+[^:]*:27: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,ROR ?#3'
 [^:]*:28: Error: only LSL shift allowed in thumb mode -- `subs sp,sp,r0,RRX'
--- a/gas/testsuite/gas/arm/arch7em-bad.l
+++ b/gas/testsuite/gas/arm/arch7em-bad.l
@@ -3,10 +3,10 @@
 [^:]*:9: Error: selected processor does not support `pkhbt r9,r0,r0' in Thumb mode
 [^:]*:10: Error: selected processor does not support `pkhbt r0,r9,r0' in Thumb mode
 [^:]*:11: Error: selected processor does not support `pkhbt r0,r0,r9' in Thumb mode
-[^:]*:12: Error: selected processor does not support `pkhbt r0,r0,r0,lsl#0x14' in Thumb mode
-[^:]*:13: Error: selected processor does not support `pkhbt r0,r0,r0,lsl#3' in Thumb mode
+[^:]*:12: Error: selected processor does not support `pkhbt r0,r0,r0,lsl ?#0x14' in Thumb mode
+[^:]*:13: Error: selected processor does not support `pkhbt r0,r0,r0,lsl ?#3' in Thumb mode
 [^:]*:14: Error: selected processor does not support `pkhtb r1,r2,r3' in Thumb mode
-[^:]*:15: Error: selected processor does not support `pkhtb r1,r2,r3,asr#0x11' in Thumb mode
+[^:]*:15: Error: selected processor does not support `pkhtb r1,r2,r3,asr ?#0x11' in Thumb mode
 [^:]*:18: Error: selected processor does not support `qadd r1,r2,r3' in Thumb mode
 [^:]*:19: Error: selected processor does not support `qadd16 r1,r2,r3' in Thumb mode
 [^:]*:20: Error: selected processor does not support `qadd8 r1,r2,r3' in Thumb mode
@@ -121,10 +121,10 @@
 [^:]*:143: Error: selected processor does not support `uxtb16 r1,r2' in Thumb mode
 [^:]*:144: Error: selected processor does not support `uxtb16 r8,r9' in Thumb mode
 [^:]*:147: Error: selected processor does not support `sxtab r0,r0,r0' in Thumb mode
-[^:]*:148: Error: selected processor does not support `sxtab r0,r0,r0,ror#0' in Thumb mode
-[^:]*:149: Error: selected processor does not support `sxtab r9,r0,r0,ror#8' in Thumb mode
-[^:]*:150: Error: selected processor does not support `sxtab r0,r9,r0,ror#16' in Thumb mode
-[^:]*:151: Error: selected processor does not support `sxtab r0,r0,r9,ror#24' in Thumb mode
+[^:]*:148: Error: selected processor does not support `sxtab r0,r0,r0,ror ?#0' in Thumb mode
+[^:]*:149: Error: selected processor does not support `sxtab r9,r0,r0,ror ?#8' in Thumb mode
+[^:]*:150: Error: selected processor does not support `sxtab r0,r9,r0,ror ?#16' in Thumb mode
+[^:]*:151: Error: selected processor does not support `sxtab r0,r0,r9,ror ?#24' in Thumb mode
 [^:]*:153: Error: selected processor does not support `sxtab16 r1,r2,r3' in Thumb mode
 [^:]*:154: Error: selected processor does not support `sxtah r1,r2,r3' in Thumb mode
 [^:]*:155: Error: selected processor does not support `uxtab r1,r2,r3' in Thumb mode
--- a/gas/testsuite/gas/arm/mve-vldr-bad-1.l
+++ b/gas/testsuite/gas/arm/mve-vldr-bad-1.l
@@ -103,20 +103,20 @@
 [^:]*:88: Error: vector predicated instruction should be in VPT/VPST block -- `vldrdt.u64 q0,\[r0,q1\]'
 [^:]*:90: Error: instruction missing MVE vector predication code -- `vldrd.u64 q0,\[r0,q1\]'
 [^:]*:92: Error: shift expression expected -- `vldrb.u8 q0,\[r0,q1,#0\]'
-[^:]*:93: Error: can not shift offsets when accessing less than half-word -- `vldrb.u8 q0,\[r0,q1,UXTW#1\]'
-[^:]*:94: Error: can not shift offsets when accessing less than half-word -- `vldrb.u16 q0,\[r0,q1,UXTW#1\]'
-[^:]*:95: Error: can not shift offsets when accessing less than half-word -- `vldrb.u32 q0,\[r0,q1,UXTW#1\]'
+[^:]*:93: Error: can not shift offsets when accessing less than half-word -- `vldrb.u8 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:94: Error: can not shift offsets when accessing less than half-word -- `vldrb.u16 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:95: Error: can not shift offsets when accessing less than half-word -- `vldrb.u32 q0,\[r0,q1,UXTW ?#1\]'
 [^:]*:96: Error: shift expression expected -- `vldrh.u16 q0,\[r0,q1,#1\]'
-[^:]*:97: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u16 q0,\[r0,q1,UXTW#2\]'
-[^:]*:98: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u32 q0,\[r0,q1,UXTW#2\]'
-[^:]*:99: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u16 q0,\[r0,q1,UXTW#3\]'
-[^:]*:100: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u32 q0,\[r0,q1,UXTW#3\]'
+[^:]*:97: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u16 q0,\[r0,q1,UXTW ?#2\]'
+[^:]*:98: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u32 q0,\[r0,q1,UXTW ?#2\]'
+[^:]*:99: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u16 q0,\[r0,q1,UXTW ?#3\]'
+[^:]*:100: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrh.u32 q0,\[r0,q1,UXTW ?#3\]'
 [^:]*:101: Error: shift expression expected -- `vldrw.u32 q0,\[r0,q1,#2\]'
-[^:]*:102: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrw.u32 q0,\[r0,q1,UXTW#1\]'
-[^:]*:103: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrw.u32 q0,\[r0,q1,UXTW#3\]'
+[^:]*:102: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrw.u32 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:103: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrw.u32 q0,\[r0,q1,UXTW ?#3\]'
 [^:]*:104: Error: shift expression expected -- `vldrd.u64 q0,\[r0,q1,#3\]'
-[^:]*:105: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW#1\]'
-[^:]*:106: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW#2\]'
-[^:]*:107: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW#4\]'
+[^:]*:105: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:106: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW ?#2\]'
+[^:]*:107: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vldrd.u64 q0,\[r0,q1,UXTW ?#4\]'
 
 
--- a/gas/testsuite/gas/arm/mve-vldr-bad-3.l
+++ b/gas/testsuite/gas/arm/mve-vldr-bad-3.l
@@ -161,27 +161,27 @@
 [^:]*:139: Error: bad element type for instruction -- `vldrb.p32 q0,\[r2,q3\]'
 [^:]*:139: Error: bad element type for instruction -- `vldrb.p64 q0,\[r2,q3\]'
 [^:]*:139: Error: bad element type for instruction -- `vldrb.s8 q0,\[r2,q3\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.8 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.32 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.64 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.f32 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.f64 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.p32 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.p64 q0,\[r2,q3,uxtw#1\]'
-[^:]*:142: Error: bad element type for instruction -- `vldrh.s16 q0,\[r2,q3,uxtw#1\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.8 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.16 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.64 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.f16 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.f64 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.p16 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.p64 q0,\[r2,q3,uxtw#2\]'
-[^:]*:145: Error: bad element type for instruction -- `vldrw.s32 q0,\[r2,q3,uxtw#2\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.8 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.16 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.32 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.f16 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.f32 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.p16 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.p32 q0,\[r2,q3,uxtw#3\]'
-[^:]*:148: Error: bad element type for instruction -- `vldrd.s64 q0,\[r2,q3,uxtw#3\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.8 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.32 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.64 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.f32 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.f64 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.p32 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.p64 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:142: Error: bad element type for instruction -- `vldrh.s16 q0,\[r2,q3,uxtw ?#1\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.8 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.16 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.64 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.f16 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.f64 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.p16 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.p64 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:145: Error: bad element type for instruction -- `vldrw.s32 q0,\[r2,q3,uxtw ?#2\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.8 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.16 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.32 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.f16 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.f32 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.p16 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.p32 q0,\[r2,q3,uxtw ?#3\]'
+[^:]*:148: Error: bad element type for instruction -- `vldrd.s64 q0,\[r2,q3,uxtw ?#3\]'
--- a/gas/testsuite/gas/arm/mve-vstr-bad-1.l
+++ b/gas/testsuite/gas/arm/mve-vstr-bad-1.l
@@ -45,7 +45,7 @@
 [^:]*:4: Warning: instruction is UNPREDICTABLE in an IT block
 [^:]*:36: *Info: macro .*
 [^:]*:39: Error: shift expression expected -- `vstrh.16 q0,\[r0,q1,#1\]'
-[^:]*:40: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrh.16 q0,\[r0,q1,UXTW#2\]'
+[^:]*:40: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrh.16 q0,\[r0,q1,UXTW ?#2\]'
 [^:]*:41: Error: bad element type for instruction -- `vstrw.8 q0,\[r0,q1\]'
 [^:]*:42: Error: bad element type for instruction -- `vstrw.u8 q0,\[r0,q1\]'
 [^:]*:43: Error: bad element type for instruction -- `vstrw.s8 q0,\[r0,q1\]'
@@ -71,8 +71,8 @@
 [^:]*:4: Warning: instruction is UNPREDICTABLE in an IT block
 [^:]*:53: *Info: macro .*
 [^:]*:56: Error: shift expression expected -- `vstrw.32 q0,\[r0,q1,#2\]'
-[^:]*:57: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrw.32 q0,\[r0,q1,UXTW#1\]'
-[^:]*:58: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrw.32 q0,\[r0,q1,UXTW#3\]'
+[^:]*:57: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrw.32 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:58: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrw.32 q0,\[r0,q1,UXTW ?#3\]'
 [^:]*:59: Error: bad element type for instruction -- `vstrd.8 q0,\[r0,q1\]'
 [^:]*:60: Error: bad element type for instruction -- `vstrd.u8 q0,\[r0,q1\]'
 [^:]*:61: Error: bad element type for instruction -- `vstrd.s8 q0,\[r0,q1\]'
@@ -100,9 +100,9 @@
 [^:]*:77: Warning: instruction is UNPREDICTABLE in an IT block
 [^:]*:83: *Info: macro .*
 [^:]*:84: Error: shift expression expected -- `vstrd.64 q0,\[r0,q1,#3\]'
-[^:]*:85: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW#1\]'
-[^:]*:86: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW#2\]'
-[^:]*:87: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW#4\]'
+[^:]*:85: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW ?#1\]'
+[^:]*:86: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW ?#2\]'
+[^:]*:87: Error: shift immediate must be 1, 2 or 3 for half-word, word or double-word accesses respectively -- `vstrd.64 q0,\[r0,q1,UXTW ?#4\]'
 [^:]*:90: Error: syntax error -- `vstrbeq.32 q0,\[r0,q1\]'
 [^:]*:91: Error: syntax error -- `vstrbeq.32 q0,\[r0,q1\]'
 [^:]*:93: Error: syntax error -- `vstrbeq.32 q0,\[r0,q1\]'
--- a/gas/testsuite/gas/arm/neon-ldst-align-bad.l
+++ b/gas/testsuite/gas/arm/neon-ldst-align-bad.l
@@ -1,3 +1,3 @@
 [^:]*: Assembler messages:
-[^:]*:1: Error: bad alignment -- `vld1.8 {d0},\[r0:128\]'
-[^:]*:2: Error: bad alignment -- `vld1.8 {q0},\[r0:256\]'
+[^:]*:1: Error: bad alignment -- `vld1.8 {d0},\[r0 ?:128\]'
+[^:]*:2: Error: bad alignment -- `vld1.8 {q0},\[r0 ?:256\]'
--- a/gas/testsuite/gas/arm/shift-bad.l
+++ b/gas/testsuite/gas/arm/shift-bad.l
@@ -1,9 +1,9 @@
 .*shift-bad.s: Assembler messages:
-.*shift-bad.s:2: Error: extraneous shift as part of operand to shift insn -- `asr r0,r1,r2,ror#5'
+.*shift-bad.s:2: Error: extraneous shift as part of operand to shift insn -- `asr r0,r1,r2,ror ?#5'
 .*shift-bad.s:3: Error: extraneous shift as part of operand to shift insn -- `ror r0,r1,r2,lsl r3'
-.*shift-bad.s:7: Error: extraneous shift as part of operand to shift insn -- `ror r0,r0,r2,lsl#1'
-.*shift-bad.s:8: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r0,r2,lsl#1'
+.*shift-bad.s:7: Error: extraneous shift as part of operand to shift insn -- `ror r0,r0,r2,lsl ?#1'
+.*shift-bad.s:8: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r0,r2,lsl ?#1'
 .*shift-bad.s:9: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r0,r2,asr r0'
-.*shift-bad.s:13: Error: extraneous shift as part of operand to shift insn -- `ror r0,r1,r2,lsl#1'
-.*shift-bad.s:14: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r1,r2,lsl#1'
+.*shift-bad.s:13: Error: extraneous shift as part of operand to shift insn -- `ror r0,r1,r2,lsl ?#1'
+.*shift-bad.s:14: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r1,r2,lsl ?#1'
 .*shift-bad.s:15: Error: extraneous shift as part of operand to shift insn -- `lsl r0,r1,r2,asr r0'
--- a/gas/testsuite/gas/arm/sp-pc-validations-bad-t-v8a.l
+++ b/gas/testsuite/gas/arm/sp-pc-validations-bad-t-v8a.l
@@ -131,7 +131,7 @@
 [^:]*:12: IT blocks containing more than one conditional instruction are performance deprecated in ARMv8-A and ARMv8-R
 [^:]*:21: *Info: macro .*
 [^:]*:45: *Info: macro .*
-[^:]*:12: Error: branch must be last instruction in IT block -- `ldreq.w r15,\[r0,r1,LSL#2\]'
+[^:]*:12: Error: branch must be last instruction in IT block -- `ldreq.w r15,\[r0,r1,LSL ?#2\]'
 [^:]*:21: *Info: macro .*
 [^:]*:45: *Info: macro .*
 [^:]*:48: Error: r15 not allowed here -- `ldrb pc,\[r0,#4\]'
@@ -144,8 +144,8 @@
 [^:]*:66: Error: r15 not allowed here -- `ldrb pc,\[r0,r1\]'
 [^:]*:67: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc,r1\]'
 [^:]*:68: Error: r15 not allowed here -- `ldrb r0,\[r1,pc\]'
-[^:]*:69: Error: r15 not allowed here -- `ldrb.w pc,\[r0,r1,LSL#1\]'
-[^:]*:71: Error: r15 not allowed here -- `ldrb.w r2,\[r0,pc,LSL#2\]'
+[^:]*:69: Error: r15 not allowed here -- `ldrb.w pc,\[r0,r1,LSL ?#1\]'
+[^:]*:71: Error: r15 not allowed here -- `ldrb.w r2,\[r0,pc,LSL ?#2\]'
 [^:]*:75: Error: r15 not allowed here -- `ldrbt pc,\[r0,#4\]'
 [^:]*:79: Error: r15 not allowed here -- `ldrd pc,r0,\[r1\]'
 [^:]*:81: Error: r12 not allowed here -- `ldrd r12,\[r1\]'
@@ -184,8 +184,8 @@
 [^:]*:149: Error: r15 not allowed here -- `ldrh pc,\[r0,r1\]'
 [^:]*:150: Error: cannot use register index with PC-relative addressing -- `ldrh r0,\[pc,r1\]'
 [^:]*:151: Error: r15 not allowed here -- `ldrh r0,\[r1,pc\]'
-[^:]*:152: Error: r15 not allowed here -- `ldrh.w pc,\[r0,r1,LSL#1\]'
-[^:]*:154: Error: r15 not allowed here -- `ldrh.w r2,\[r0,pc,LSL#1\]'
+[^:]*:152: Error: r15 not allowed here -- `ldrh.w pc,\[r0,r1,LSL ?#1\]'
+[^:]*:154: Error: r15 not allowed here -- `ldrh.w r2,\[r0,pc,LSL ?#1\]'
 [^:]*:158: Error: r15 not allowed here -- `ldrht pc,\[r0,#4\]'
 [^:]*:162: Error: r15 not allowed here -- `ldrsb pc,\[r0,#4\]'
 [^:]*:165: Error: r15 not allowed here -- `ldrsb pc,\[r0,#-4\]'
@@ -196,8 +196,8 @@
 [^:]*:179: Error: r15 not allowed here -- `ldrsb pc,\[r0,r1\]'
 [^:]*:180: Error: cannot use register index with PC-relative addressing -- `ldrsb r0,\[pc,r1\]'
 [^:]*:181: Error: r15 not allowed here -- `ldrsb r0,\[r1,pc\]'
-[^:]*:182: Error: r15 not allowed here -- `ldrsb.w pc,\[r0,r1,LSL#2\]'
-[^:]*:185: Error: r15 not allowed here -- `ldrsb.w r2,\[r0,pc,LSL#2\]'
+[^:]*:182: Error: r15 not allowed here -- `ldrsb.w pc,\[r0,r1,LSL ?#2\]'
+[^:]*:185: Error: r15 not allowed here -- `ldrsb.w r2,\[r0,pc,LSL ?#2\]'
 [^:]*:190: Error: r15 not allowed here -- `ldrsbt pc,\[r0,#4\]'
 [^:]*:195: Error: r15 not allowed here -- `ldrsh pc,\[r0,#4\]'
 [^:]*:197: Error: r15 not allowed here -- `ldrsh pc,\[r0,#-4\]'
@@ -207,8 +207,8 @@
 [^:]*:210: Error: r15 not allowed here -- `ldrsh pc,\[r0,r1\]'
 [^:]*:211: Error: cannot use register index with PC-relative addressing -- `ldrsh r0,\[pc,r1\]'
 [^:]*:212: Error: r15 not allowed here -- `ldrsh r0,\[r1,pc\]'
-[^:]*:214: Error: r15 not allowed here -- `ldrsh.w pc,\[r0,r1,LSL#3\]'
-[^:]*:217: Error: r15 not allowed here -- `ldrsh.w r0,\[r1,pc,LSL#3\]'
+[^:]*:214: Error: r15 not allowed here -- `ldrsh.w pc,\[r0,r1,LSL ?#3\]'
+[^:]*:217: Error: r15 not allowed here -- `ldrsh.w r0,\[r1,pc,LSL ?#3\]'
 [^:]*:221: Error: r15 not allowed here -- `ldrsht pc,\[r0,#4\]'
 [^:]*:226: Error: r15 not allowed here -- `ldrt pc,\[r0,#4\]'
 [^:]*:232: Error: r15 not allowed here -- `str pc,\[r0,#4\]'
@@ -217,7 +217,7 @@
 [^:]*:235: Error: cannot use post-indexing with PC-relative addressing -- `str r0,\[pc\],#4'
 [^:]*:236: Error: cannot use writeback with PC-relative addressing -- `str r0,\[pc,#4\]!'
 [^:]*:239: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1\]'
-[^:]*:240: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1,LSL#2\]'
+[^:]*:240: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:246: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,#4\]'
 [^:]*:247: Error: r15 not allowed here -- `strb.w pc,\[r0,#4\]'
 [^:]*:249: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc,#-4\]'
@@ -227,11 +227,11 @@
 [^:]*:253: Error: r15 not allowed here -- `strb pc,\[r0\],#4'
 [^:]*:254: Error: r15 not allowed here -- `strb pc,\[r0,#4\]!'
 [^:]*:260: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1\]'
-[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1,LSL#2\]'
+[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:262: Error: r15 not allowed here -- `strb.w pc,\[r0,r1\]'
-[^:]*:263: Error: r15 not allowed here -- `strb.w pc,\[r0,r1,LSL#2\]'
+[^:]*:263: Error: r15 not allowed here -- `strb.w pc,\[r0,r1,LSL ?#2\]'
 [^:]*:266: Error: r15 not allowed here -- `strb.w r0,\[r1,pc\]'
-[^:]*:267: Error: r15 not allowed here -- `strb.w r0,\[r1,pc,LSL#2\]'
+[^:]*:267: Error: r15 not allowed here -- `strb.w r0,\[r1,pc,LSL ?#2\]'
 [^:]*:272: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[pc,#4\]'
 [^:]*:273: Error: r15 not allowed here -- `strbt pc,\[r0,#4\]'
 [^:]*:277: Error: cannot use register index with PC-relative addressing -- `strd r0,r1,\[pc,#4\]'
@@ -265,7 +265,7 @@
 [^:]*:335: Error: cannot use post-indexing with PC-relative addressing -- `strh r0,\[pc\],#4'
 [^:]*:336: Error: cannot use writeback with PC-relative addressing -- `strh r0,\[pc,#4\]!'
 [^:]*:339: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1\]'
-[^:]*:340: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1,LSL#2\]'
+[^:]*:340: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:341: Error: r15 not allowed here -- `strh.w pc,\[r0,#4\]'
 [^:]*:342: Error: r15 not allowed here -- `strh.w pc,\[r0\]'
 [^:]*:345: Error: r15 not allowed here -- `strh pc,\[r0,#-4\]'
@@ -273,8 +273,8 @@
 [^:]*:347: Error: r15 not allowed here -- `strh pc,\[r0,#4\]!'
 [^:]*:351: Error: r15 not allowed here -- `strh.w pc,\[r0,r1\]'
 [^:]*:353: Error: r15 not allowed here -- `strh.w r0,\[r1,pc\]'
-[^:]*:355: Error: r15 not allowed here -- `strh.w pc,\[r0,r1,LSL#2\]'
-[^:]*:357: Error: r15 not allowed here -- `strh.w r0,\[r1,pc,LSL#2\]'
+[^:]*:355: Error: r15 not allowed here -- `strh.w pc,\[r0,r1,LSL ?#2\]'
+[^:]*:357: Error: r15 not allowed here -- `strh.w r0,\[r1,pc,LSL ?#2\]'
 [^:]*:361: Error: cannot use register index with PC-relative addressing -- `strht r0,\[pc,#4\]'
 [^:]*:362: Error: r15 not allowed here -- `strht pc,\[r0,#4\]'
 [^:]*:363: Error: cannot use register index with PC-relative addressing -- `strht sp,\[pc,#4\]'
--- a/gas/testsuite/gas/arm/sp-pc-validations-bad-t.l
+++ b/gas/testsuite/gas/arm/sp-pc-validations-bad-t.l
@@ -41,7 +41,7 @@
 [^:]*:12: Error: branch must be last instruction in IT block -- `ldreq.w r15,\[r0,r1\]'
 [^:]*:21: *Info: macro .*
 [^:]*:44: *Info: macro .*
-[^:]*:12: Error: branch must be last instruction in IT block -- `ldreq.w r15,\[r0,r1,LSL#2\]'
+[^:]*:12: Error: branch must be last instruction in IT block -- `ldreq.w r15,\[r0,r1,LSL ?#2\]'
 [^:]*:21: *Info: macro .*
 [^:]*:45: *Info: macro .*
 [^:]*:48: Error: r15 not allowed here -- `ldrb pc,\[r0,#4\]'
@@ -59,10 +59,10 @@
 [^:]*:66: Error: r15 not allowed here -- `ldrb pc,\[r0,r1\]'
 [^:]*:67: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc,r1\]'
 [^:]*:68: Error: r15 not allowed here -- `ldrb r0,\[r1,pc\]'
-[^:]*:69: Error: r15 not allowed here -- `ldrb.w pc,\[r0,r1,LSL#1\]'
+[^:]*:69: Error: r15 not allowed here -- `ldrb.w pc,\[r0,r1,LSL ?#1\]'
 [^:]*:70: Error: r13 not allowed here -- `ldrb.w sp,\[r0,r1\]'
-[^:]*:71: Error: r15 not allowed here -- `ldrb.w r2,\[r0,pc,LSL#2\]'
-[^:]*:72: Error: r13 not allowed here -- `ldrb.w r2,\[r0,sp,LSL#2\]'
+[^:]*:71: Error: r15 not allowed here -- `ldrb.w r2,\[r0,pc,LSL ?#2\]'
+[^:]*:72: Error: r13 not allowed here -- `ldrb.w r2,\[r0,sp,LSL ?#2\]'
 [^:]*:75: Error: r15 not allowed here -- `ldrbt pc,\[r0,#4\]'
 [^:]*:76: Error: r13 not allowed here -- `ldrbt sp,\[r0,#4\]'
 [^:]*:79: Error: r15 not allowed here -- `ldrd pc,r0,\[r1\]'
@@ -123,10 +123,10 @@
 [^:]*:149: Error: r15 not allowed here -- `ldrh pc,\[r0,r1\]'
 [^:]*:150: Error: cannot use register index with PC-relative addressing -- `ldrh r0,\[pc,r1\]'
 [^:]*:151: Error: r15 not allowed here -- `ldrh r0,\[r1,pc\]'
-[^:]*:152: Error: r15 not allowed here -- `ldrh.w pc,\[r0,r1,LSL#1\]'
-[^:]*:153: Error: r13 not allowed here -- `ldrh.w sp,\[r0,r1,LSL#1\]'
-[^:]*:154: Error: r15 not allowed here -- `ldrh.w r2,\[r0,pc,LSL#1\]'
-[^:]*:155: Error: r13 not allowed here -- `ldrh.w r2,\[r0,sp,LSL#1\]'
+[^:]*:152: Error: r15 not allowed here -- `ldrh.w pc,\[r0,r1,LSL ?#1\]'
+[^:]*:153: Error: r13 not allowed here -- `ldrh.w sp,\[r0,r1,LSL ?#1\]'
+[^:]*:154: Error: r15 not allowed here -- `ldrh.w r2,\[r0,pc,LSL ?#1\]'
+[^:]*:155: Error: r13 not allowed here -- `ldrh.w r2,\[r0,sp,LSL ?#1\]'
 [^:]*:158: Error: r15 not allowed here -- `ldrht pc,\[r0,#4\]'
 [^:]*:159: Error: r13 not allowed here -- `ldrht sp,\[r0,#4\]'
 [^:]*:162: Error: r15 not allowed here -- `ldrsb pc,\[r0,#4\]'
@@ -144,10 +144,10 @@
 [^:]*:179: Error: r15 not allowed here -- `ldrsb pc,\[r0,r1\]'
 [^:]*:180: Error: cannot use register index with PC-relative addressing -- `ldrsb r0,\[pc,r1\]'
 [^:]*:181: Error: r15 not allowed here -- `ldrsb r0,\[r1,pc\]'
-[^:]*:182: Error: r15 not allowed here -- `ldrsb.w pc,\[r0,r1,LSL#2\]'
-[^:]*:184: Error: r13 not allowed here -- `ldrsb.w sp,\[r0,r1,LSL#2\]'
-[^:]*:185: Error: r15 not allowed here -- `ldrsb.w r2,\[r0,pc,LSL#2\]'
-[^:]*:186: Error: r13 not allowed here -- `ldrsb.w r2,\[r0,sp,LSL#2\]'
+[^:]*:182: Error: r15 not allowed here -- `ldrsb.w pc,\[r0,r1,LSL ?#2\]'
+[^:]*:184: Error: r13 not allowed here -- `ldrsb.w sp,\[r0,r1,LSL ?#2\]'
+[^:]*:185: Error: r15 not allowed here -- `ldrsb.w r2,\[r0,pc,LSL ?#2\]'
+[^:]*:186: Error: r13 not allowed here -- `ldrsb.w r2,\[r0,sp,LSL ?#2\]'
 [^:]*:190: Error: r15 not allowed here -- `ldrsbt pc,\[r0,#4\]'
 [^:]*:191: Error: r13 not allowed here -- `ldrsbt sp,\[r0,#4\]'
 [^:]*:195: Error: r15 not allowed here -- `ldrsh pc,\[r0,#4\]'
@@ -164,10 +164,10 @@
 [^:]*:210: Error: r15 not allowed here -- `ldrsh pc,\[r0,r1\]'
 [^:]*:211: Error: cannot use register index with PC-relative addressing -- `ldrsh r0,\[pc,r1\]'
 [^:]*:212: Error: r15 not allowed here -- `ldrsh r0,\[r1,pc\]'
-[^:]*:214: Error: r15 not allowed here -- `ldrsh.w pc,\[r0,r1,LSL#3\]'
-[^:]*:215: Error: r13 not allowed here -- `ldrsh.w sp,\[r0,r1,LSL#3\]'
-[^:]*:216: Error: r13 not allowed here -- `ldrsh.w r0,\[r1,sp,LSL#3\]'
-[^:]*:217: Error: r15 not allowed here -- `ldrsh.w r0,\[r1,pc,LSL#3\]'
+[^:]*:214: Error: r15 not allowed here -- `ldrsh.w pc,\[r0,r1,LSL ?#3\]'
+[^:]*:215: Error: r13 not allowed here -- `ldrsh.w sp,\[r0,r1,LSL ?#3\]'
+[^:]*:216: Error: r13 not allowed here -- `ldrsh.w r0,\[r1,sp,LSL ?#3\]'
+[^:]*:217: Error: r15 not allowed here -- `ldrsh.w r0,\[r1,pc,LSL ?#3\]'
 [^:]*:221: Error: r15 not allowed here -- `ldrsht pc,\[r0,#4\]'
 [^:]*:222: Error: r13 not allowed here -- `ldrsht sp,\[r0,#4\]'
 [^:]*:226: Error: r15 not allowed here -- `ldrt pc,\[r0,#4\]'
@@ -178,7 +178,7 @@
 [^:]*:235: Error: cannot use post-indexing with PC-relative addressing -- `str r0,\[pc\],#4'
 [^:]*:236: Error: cannot use writeback with PC-relative addressing -- `str r0,\[pc,#4\]!'
 [^:]*:239: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1\]'
-[^:]*:240: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1,LSL#2\]'
+[^:]*:240: Error: cannot use register index with PC-relative addressing -- `str.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:246: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,#4\]'
 [^:]*:247: Error: r15 not allowed here -- `strb.w pc,\[r0,#4\]'
 [^:]*:248: Error: r13 not allowed here -- `strb.w sp,\[r0,#4\]'
@@ -192,15 +192,15 @@
 [^:]*:256: Error: r13 not allowed here -- `strb sp,\[r0\],#4'
 [^:]*:257: Error: r13 not allowed here -- `strb sp,\[r0,#4\]!'
 [^:]*:260: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1\]'
-[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1,LSL#2\]'
+[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strb.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:262: Error: r15 not allowed here -- `strb.w pc,\[r0,r1\]'
-[^:]*:263: Error: r15 not allowed here -- `strb.w pc,\[r0,r1,LSL#2\]'
+[^:]*:263: Error: r15 not allowed here -- `strb.w pc,\[r0,r1,LSL ?#2\]'
 [^:]*:264: Error: r13 not allowed here -- `strb.w sp,\[r0,r1\]'
-[^:]*:265: Error: r13 not allowed here -- `strb.w sp,\[r0,r1,LSL#2\]'
+[^:]*:265: Error: r13 not allowed here -- `strb.w sp,\[r0,r1,LSL ?#2\]'
 [^:]*:266: Error: r15 not allowed here -- `strb.w r0,\[r1,pc\]'
-[^:]*:267: Error: r15 not allowed here -- `strb.w r0,\[r1,pc,LSL#2\]'
+[^:]*:267: Error: r15 not allowed here -- `strb.w r0,\[r1,pc,LSL ?#2\]'
 [^:]*:268: Error: r13 not allowed here -- `strb.w r0,\[r1,sp\]'
-[^:]*:269: Error: r13 not allowed here -- `strb.w r0,\[r1,sp,LSL#2\]'
+[^:]*:269: Error: r13 not allowed here -- `strb.w r0,\[r1,sp,LSL ?#2\]'
 [^:]*:272: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[pc,#4\]'
 [^:]*:273: Error: r15 not allowed here -- `strbt pc,\[r0,#4\]'
 [^:]*:274: Error: r13 not allowed here -- `strbt sp,\[r0,#4\]'
@@ -252,7 +252,7 @@
 [^:]*:335: Error: cannot use post-indexing with PC-relative addressing -- `strh r0,\[pc\],#4'
 [^:]*:336: Error: cannot use writeback with PC-relative addressing -- `strh r0,\[pc,#4\]!'
 [^:]*:339: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1\]'
-[^:]*:340: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1,LSL#2\]'
+[^:]*:340: Error: cannot use register index with PC-relative addressing -- `strh.w r0,\[pc,r1,LSL ?#2\]'
 [^:]*:341: Error: r15 not allowed here -- `strh.w pc,\[r0,#4\]'
 [^:]*:342: Error: r15 not allowed here -- `strh.w pc,\[r0\]'
 [^:]*:343: Error: r13 not allowed here -- `strh.w sp,\[r0,#4\]'
@@ -267,10 +267,10 @@
 [^:]*:352: Error: r13 not allowed here -- `strh.w sp,\[r0,r1\]'
 [^:]*:353: Error: r15 not allowed here -- `strh.w r0,\[r1,pc\]'
 [^:]*:354: Error: r13 not allowed here -- `strh.w r0,\[r1,sp\]'
-[^:]*:355: Error: r15 not allowed here -- `strh.w pc,\[r0,r1,LSL#2\]'
-[^:]*:356: Error: r13 not allowed here -- `strh.w sp,\[r0,r1,LSL#2\]'
-[^:]*:357: Error: r15 not allowed here -- `strh.w r0,\[r1,pc,LSL#2\]'
-[^:]*:358: Error: r13 not allowed here -- `strh.w r0,\[r1,sp,LSL#2\]'
+[^:]*:355: Error: r15 not allowed here -- `strh.w pc,\[r0,r1,LSL ?#2\]'
+[^:]*:356: Error: r13 not allowed here -- `strh.w sp,\[r0,r1,LSL ?#2\]'
+[^:]*:357: Error: r15 not allowed here -- `strh.w r0,\[r1,pc,LSL ?#2\]'
+[^:]*:358: Error: r13 not allowed here -- `strh.w r0,\[r1,sp,LSL ?#2\]'
 [^:]*:361: Error: cannot use register index with PC-relative addressing -- `strht r0,\[pc,#4\]'
 [^:]*:362: Error: r15 not allowed here -- `strht pc,\[r0,#4\]'
 [^:]*:363: Error: r13 not allowed here -- `strht sp,\[pc,#4\]'
--- a/gas/testsuite/gas/arm/sp-pc-validations-bad.l
+++ b/gas/testsuite/gas/arm/sp-pc-validations-bad.l
@@ -1,27 +1,27 @@
 [^:]*: Assembler messages:
-[^:]*:11: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1,pc,LSL#2\]'
-[^:]*:12: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1,pc,LSL#2\]!'
-[^:]*:13: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1\],pc,LSL#2'
-[^:]*:14: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[pc,r1,LSL#2\]!'
-[^:]*:15: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[pc\],r1,LSL#2'
+[^:]*:11: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1,pc,LSL ?#2\]'
+[^:]*:12: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1,pc,LSL ?#2\]!'
+[^:]*:13: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[r1\],pc,LSL ?#2'
+[^:]*:14: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[pc,r1,LSL ?#2\]!'
+[^:]*:15: Error: cannot use register index with PC-relative addressing -- `ldr r0,\[pc\],r1,LSL ?#2'
 [^:]*:18: Error: r15 not allowed here -- `ldrb pc,\[r0,#4\]'
 [^:]*:19: Error: r15 not allowed here -- `ldrb pc,\[r0\],#4'
 [^:]*:20: Error: r15 not allowed here -- `ldrb pc,\[r0,#4\]!'
 [^:]*:23: Error: r15 not allowed here -- `ldrb pc,label'
 [^:]*:24: Error: r15 not allowed here -- `ldrb pc,\[pc,#-0\]'
-[^:]*:27: Error: r15 not allowed here -- `ldrb pc,\[r0,r1,LSL#2\]'
-[^:]*:28: Error: r15 not allowed here -- `ldrb pc,\[r0,r1,LSL#2\]!'
-[^:]*:29: Error: r15 not allowed here -- `ldrb pc,\[r0\],r1,LSL#2'
-[^:]*:30: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1,pc,LSL#2\]'
-[^:]*:31: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1,pc,LSL#2\]!'
-[^:]*:32: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1\],pc,LSL#2'
-[^:]*:33: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc,r1,LSL#2\]!'
-[^:]*:34: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc\],r1,LSL#2'
+[^:]*:27: Error: r15 not allowed here -- `ldrb pc,\[r0,r1,LSL ?#2\]'
+[^:]*:28: Error: r15 not allowed here -- `ldrb pc,\[r0,r1,LSL ?#2\]!'
+[^:]*:29: Error: r15 not allowed here -- `ldrb pc,\[r0\],r1,LSL ?#2'
+[^:]*:30: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1,pc,LSL ?#2\]'
+[^:]*:31: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1,pc,LSL ?#2\]!'
+[^:]*:32: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[r1\],pc,LSL ?#2'
+[^:]*:33: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc,r1,LSL ?#2\]!'
+[^:]*:34: Error: cannot use register index with PC-relative addressing -- `ldrb r0,\[pc\],r1,LSL ?#2'
 [^:]*:37: Error: r15 not allowed here -- `ldrbt pc,\[r0\],#4'
 [^:]*:38: Error: cannot use register index with PC-relative addressing -- `ldrbt r0,\[pc\],#4'
-[^:]*:39: Error: r15 not allowed here -- `ldrbt pc,\[r0\],r1,LSL#4'
-[^:]*:40: Error: cannot use register index with PC-relative addressing -- `ldrbt r0,\[pc\],r1,LSL#4'
-[^:]*:41: Error: cannot use register index with PC-relative addressing -- `ldrbt r0,\[r1\],pc,LSL#4'
+[^:]*:39: Error: r15 not allowed here -- `ldrbt pc,\[r0\],r1,LSL ?#4'
+[^:]*:40: Error: cannot use register index with PC-relative addressing -- `ldrbt r0,\[pc\],r1,LSL ?#4'
+[^:]*:41: Error: cannot use register index with PC-relative addressing -- `ldrbt r0,\[r1\],pc,LSL ?#4'
 [^:]*:44: Error: r15 not allowed here -- `ldrd r0,pc,\[r1,#4\]'
 [^:]*:45: Error: r15 not allowed here -- `ldrd r0,pc,\[r1\],#4'
 [^:]*:46: Error: r15 not allowed here -- `ldrd r0,pc,\[r1,#4\]!'
@@ -98,32 +98,32 @@
 [^:]*:153: Error: cannot use register index with PC-relative addressing -- `ldrsht r0,\[r1\],pc'
 [^:]*:156: Error: r15 not allowed here -- `ldrt pc,\[r0\],#4'
 [^:]*:157: Error: cannot use register index with PC-relative addressing -- `ldrt r0,\[pc\],#4'
-[^:]*:158: Error: r15 not allowed here -- `ldrt pc,\[r0\],r1,LSL#4'
-[^:]*:159: Error: cannot use register index with PC-relative addressing -- `ldrt r0,\[pc\],r1,LSL#4'
-[^:]*:160: Error: cannot use register index with PC-relative addressing -- `ldrt r0,\[r1\],pc,LSL#4'
+[^:]*:158: Error: r15 not allowed here -- `ldrt pc,\[r0\],r1,LSL ?#4'
+[^:]*:159: Error: cannot use register index with PC-relative addressing -- `ldrt r0,\[pc\],r1,LSL ?#4'
+[^:]*:160: Error: cannot use register index with PC-relative addressing -- `ldrt r0,\[r1\],pc,LSL ?#4'
 [^:]*:166: Error: cannot use register index with PC-relative addressing -- `str r0,\[pc\],#4'
 [^:]*:167: Error: cannot use register index with PC-relative addressing -- `str r0,\[pc,#4\]!'
-[^:]*:170: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1,pc,LSL#4\]'
-[^:]*:171: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1,pc,LSL#4\]!'
-[^:]*:172: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1\],pc,LSL#4'
+[^:]*:170: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1,pc,LSL ?#4\]'
+[^:]*:171: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1,pc,LSL ?#4\]!'
+[^:]*:172: Error: cannot use register index with PC-relative addressing -- `str r0,\[r1\],pc,LSL ?#4'
 [^:]*:175: Error: r15 not allowed here -- `strb pc,\[r0,#4\]'
 [^:]*:176: Error: r15 not allowed here -- `strb pc,\[r0\],#4'
 [^:]*:177: Error: r15 not allowed here -- `strb pc,\[r0,#4\]!'
 [^:]*:178: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc\],#4'
 [^:]*:179: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc,#4\]!'
-[^:]*:182: Error: r15 not allowed here -- `strb pc,\[r0,r1,LSL#4\]'
-[^:]*:183: Error: r15 not allowed here -- `strb pc,\[r0,r1,LSL#4\]!'
-[^:]*:184: Error: r15 not allowed here -- `strb pc,\[r0\],r1,LSL#4'
-[^:]*:185: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0,pc,LSL#4\]'
-[^:]*:186: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0,pc,LSL#4\]!'
-[^:]*:187: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0\],pc,LSL#4'
-[^:]*:188: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc,r1,LSL#4\]!'
-[^:]*:189: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc\],r1,LSL#4'
+[^:]*:182: Error: r15 not allowed here -- `strb pc,\[r0,r1,LSL ?#4\]'
+[^:]*:183: Error: r15 not allowed here -- `strb pc,\[r0,r1,LSL ?#4\]!'
+[^:]*:184: Error: r15 not allowed here -- `strb pc,\[r0\],r1,LSL ?#4'
+[^:]*:185: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0,pc,LSL ?#4\]'
+[^:]*:186: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0,pc,LSL ?#4\]!'
+[^:]*:187: Error: cannot use register index with PC-relative addressing -- `strb r1,\[r0\],pc,LSL ?#4'
+[^:]*:188: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc,r1,LSL ?#4\]!'
+[^:]*:189: Error: cannot use register index with PC-relative addressing -- `strb r0,\[pc\],r1,LSL ?#4'
 [^:]*:192: Error: r15 not allowed here -- `strbt pc,\[r0\],#4'
 [^:]*:193: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[pc\],#4'
-[^:]*:194: Error: r15 not allowed here -- `strbt pc,\[r0\],r1,LSL#4'
-[^:]*:195: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[pc\],r1,LSL#4'
-[^:]*:196: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[r1\],pc,LSL#4'
+[^:]*:194: Error: r15 not allowed here -- `strbt pc,\[r0\],r1,LSL ?#4'
+[^:]*:195: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[pc\],r1,LSL ?#4'
+[^:]*:196: Error: cannot use register index with PC-relative addressing -- `strbt r0,\[r1\],pc,LSL ?#4'
 [^:]*:199: Error: r15 not allowed here -- `strd r0,pc,\[r1,#4\]'
 [^:]*:200: Error: r15 not allowed here -- `strd r0,pc,\[r1\],#4'
 [^:]*:201: Error: r15 not allowed here -- `strd r0,pc,\[r1,#4\]!'
@@ -167,5 +167,5 @@
 [^:]*:255: Error: cannot use register index with PC-relative addressing -- `strht r0,\[pc\],r1'
 [^:]*:256: Error: cannot use register index with PC-relative addressing -- `strht r0,\[r1\],pc'
 [^:]*:259: Error: cannot use register index with PC-relative addressing -- `strt r0,\[pc\],#4'
-[^:]*:260: Error: cannot use register index with PC-relative addressing -- `strt r0,\[pc\],r1,LSL#4'
-[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strt r0,\[r1\],pc,LSL#4'
+[^:]*:260: Error: cannot use register index with PC-relative addressing -- `strt r0,\[pc\],r1,LSL ?#4'
+[^:]*:261: Error: cannot use register index with PC-relative addressing -- `strt r0,\[r1\],pc,LSL ?#4'
--- a/gas/testsuite/gas/arm/t16-bad.l
+++ b/gas/testsuite/gas/arm/t16-bad.l
@@ -7,7 +7,7 @@
 [^:]*:36: *Info: macro .*
 [^:]*:16: Error: unshifted register required -- `tst r0,#12'
 [^:]*:36: *Info: macro .*
-[^:]*:17: Error: unshifted register required -- `tst r0,r1,lsl#2'
+[^:]*:17: Error: unshifted register required -- `tst r0,r1,lsl ?#2'
 [^:]*:36: *Info: macro .*
 [^:]*:18: Error: unshifted register required -- `tst r0,r1,lsl r3'
 [^:]*:36: *Info: macro .*
@@ -19,7 +19,7 @@
 [^:]*:37: *Info: macro .*
 [^:]*:16: Error: unshifted register required -- `cmn r0,#12'
 [^:]*:37: *Info: macro .*
-[^:]*:17: Error: unshifted register required -- `cmn r0,r1,lsl#2'
+[^:]*:17: Error: unshifted register required -- `cmn r0,r1,lsl ?#2'
 [^:]*:37: *Info: macro .*
 [^:]*:18: Error: unshifted register required -- `cmn r0,r1,lsl r3'
 [^:]*:37: *Info: macro .*
@@ -31,7 +31,7 @@
 [^:]*:38: *Info: macro .*
 [^:]*:16: Error: unshifted register required -- `mvn r0,#12'
 [^:]*:38: *Info: macro .*
-[^:]*:17: Error: unshifted register required -- `mvn r0,r1,lsl#2'
+[^:]*:17: Error: unshifted register required -- `mvn r0,r1,lsl ?#2'
 [^:]*:38: *Info: macro .*
 [^:]*:18: Error: unshifted register required -- `mvn r0,r1,lsl r3'
 [^:]*:38: *Info: macro .*
@@ -57,7 +57,7 @@
 [^:]*:12: Error: lo register required -- `sxtb r0,r8'
 [^:]*:21: *Info: macro .*
 [^:]*:43: *Info: macro .*
-[^:]*:22: Error: Thumb encoding does not support rotation -- `sxtb r0,r1,ror#8'
+[^:]*:22: Error: Thumb encoding does not support rotation -- `sxtb r0,r1,ror ?#8'
 [^:]*:43: *Info: macro .*
 [^:]*:11: Error: lo register required -- `sxth r8,r0'
 [^:]*:21: *Info: macro .*
@@ -65,7 +65,7 @@
 [^:]*:12: Error: lo register required -- `sxth r0,r8'
 [^:]*:21: *Info: macro .*
 [^:]*:44: *Info: macro .*
-[^:]*:22: Error: Thumb encoding does not support rotation -- `sxth r0,r1,ror#8'
+[^:]*:22: Error: Thumb encoding does not support rotation -- `sxth r0,r1,ror ?#8'
 [^:]*:44: *Info: macro .*
 [^:]*:11: Error: lo register required -- `uxtb r8,r0'
 [^:]*:21: *Info: macro .*
@@ -73,7 +73,7 @@
 [^:]*:12: Error: lo register required -- `uxtb r0,r8'
 [^:]*:21: *Info: macro .*
 [^:]*:45: *Info: macro .*
-[^:]*:22: Error: Thumb encoding does not support rotation -- `uxtb r0,r1,ror#8'
+[^:]*:22: Error: Thumb encoding does not support rotation -- `uxtb r0,r1,ror ?#8'
 [^:]*:45: *Info: macro .*
 [^:]*:11: Error: lo register required -- `uxth r8,r0'
 [^:]*:21: *Info: macro .*
@@ -81,7 +81,7 @@
 [^:]*:12: Error: lo register required -- `uxth r0,r8'
 [^:]*:21: *Info: macro .*
 [^:]*:46: *Info: macro .*
-[^:]*:22: Error: Thumb encoding does not support rotation -- `uxth r0,r1,ror#8'
+[^:]*:22: Error: Thumb encoding does not support rotation -- `uxth r0,r1,ror ?#8'
 [^:]*:46: *Info: macro .*
 [^:]*:25: Error: dest must overlap one source register -- `adc r1,r2,r3'
 [^:]*:30: *Info: macro .*
@@ -94,7 +94,7 @@
 [^:]*:48: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `adc r0,#12'
 [^:]*:48: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `adc r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `adc r0,r1,lsl ?#2'
 [^:]*:48: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `adc r0,r1,lsl r3'
 [^:]*:48: *Info: macro .*
@@ -109,7 +109,7 @@
 [^:]*:49: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `and r0,#12'
 [^:]*:49: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `and r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `and r0,r1,lsl ?#2'
 [^:]*:49: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `and r0,r1,lsl r3'
 [^:]*:49: *Info: macro .*
@@ -124,7 +124,7 @@
 [^:]*:50: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `bic r0,#12'
 [^:]*:50: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `bic r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `bic r0,r1,lsl ?#2'
 [^:]*:50: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `bic r0,r1,lsl r3'
 [^:]*:50: *Info: macro .*
@@ -139,7 +139,7 @@
 [^:]*:51: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `eor r0,#12'
 [^:]*:51: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `eor r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `eor r0,r1,lsl ?#2'
 [^:]*:51: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `eor r0,r1,lsl r3'
 [^:]*:51: *Info: macro .*
@@ -154,7 +154,7 @@
 [^:]*:52: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `orr r0,#12'
 [^:]*:52: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `orr r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `orr r0,r1,lsl ?#2'
 [^:]*:52: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `orr r0,r1,lsl r3'
 [^:]*:52: *Info: macro .*
@@ -169,7 +169,7 @@
 [^:]*:53: *Info: macro .*
 [^:]*:31: Error: unshifted register required -- `sbc r0,#12'
 [^:]*:53: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `sbc r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `sbc r0,r1,lsl ?#2'
 [^:]*:53: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `sbc r0,r1,lsl r3'
 [^:]*:53: *Info: macro .*
@@ -220,7 +220,7 @@
 [^:]*:60: *Info: macro .*
 [^:]*:65: *Info: macro .*
 [^:]*:66: Error: ror #imm not supported -- `ror r0,r1,#12'
-[^:]*:69: Error: unshifted register required -- `add r0,r1,lsl#2'
+[^:]*:69: Error: unshifted register required -- `add r0,r1,lsl ?#2'
 [^:]*:70: Error: unshifted register required -- `add r0,r1,lsl r3'
 [^:]*:71: Error: lo register required -- `add r8,r0,#1'
 [^:]*:72: Error: lo register required -- `add r0,r8,#1'
@@ -236,7 +236,7 @@
 [^:]*:27: Error: lo register required -- `sub r0,r8'
 [^:]*:30: *Info: macro .*
 [^:]*:80: *Info: macro .*
-[^:]*:32: Error: unshifted register required -- `sub r0,r1,lsl#2'
+[^:]*:32: Error: unshifted register required -- `sub r0,r1,lsl ?#2'
 [^:]*:80: *Info: macro .*
 [^:]*:33: Error: unshifted register required -- `sub r0,r1,lsl r3'
 [^:]*:80: *Info: macro .*
@@ -246,10 +246,10 @@
 [^:]*:84: Error: lo register required -- `sub r8,r1,r2'
 [^:]*:85: Error: lo register required -- `sub r1,r8,r2'
 [^:]*:86: Error: lo register required -- `sub r1,r2,r8'
-[^:]*:88: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `cmp r0,r1,lsl#2'
+[^:]*:88: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `cmp r0,r1,lsl ?#2'
 [^:]*:89: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `cmp r0,r1,lsl r3'
 [^:]*:90: Error: only lo regs allowed with immediate -- `cmp r8,#255'
-[^:]*:92: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `mov r0,r1,lsl#2'
+[^:]*:92: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `mov r0,r1,lsl ?#2'
 [^:]*:93: Error: shifts in CMP/MOV instructions are only supported in unified syntax -- `mov r0,r1,lsl r3'
 [^:]*:94: Error: only lo regs allowed with immediate -- `mov r8,#255'
 [^:]*:98: Error: lo register required -- `ldr r8,\[r0\]'
@@ -364,8 +364,8 @@
 [^:]*:113: *Info: macro .*
 [^:]*:104: Error: Thumb does not support this addressing mode -- `strh r0,\[r1\],r2'
 [^:]*:113: *Info: macro .*
-[^:]*:115: Error: Thumb does not support this addressing mode -- `ldr r0,\[r1,r2,lsl#1\]'
-[^:]*:116: Error: Thumb does not support this addressing mode -- `str r0,\[r1,r2,lsl#1\]'
+[^:]*:115: Error: Thumb does not support this addressing mode -- `ldr r0,\[r1,r2,lsl ?#1\]'
+[^:]*:116: Error: Thumb does not support this addressing mode -- `str r0,\[r1,r2,lsl ?#1\]'
 [^:]*:119: Error: lo register required -- `ldmia r8!,{r1,r2}'
 [^:]*:120: Error: lo register required -- `ldmia r7!,{r8}'
 [^:]*:121: Warning: this instruction will write back the base register
--- a/gas/testsuite/gas/arm/thumb2_bad_reg.l
+++ b/gas/testsuite/gas/arm/thumb2_bad_reg.l
@@ -514,7 +514,7 @@
 [^:]*:[0-9]+: Error: r15 not allowed here -- `ssat r15,#1,r0'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `ssat r0,#1,r13'
 [^:]*:[0-9]+: Error: r15 not allowed here -- `ssat r0,#1,r15'
-[^:]*:[0-9]+: Error: shift expression is too large -- `ssat r1,#1,r3,asr#32'
+[^:]*:[0-9]+: Error: shift expression is too large -- `ssat r1,#1,r3,asr ?#32'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `ssat16 r13,#1,r0'
 [^:]*:[0-9]+: Error: r15 not allowed here -- `ssat16 r15,#1,r0'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `ssat16 r0,#1,r13'
@@ -742,7 +742,7 @@
 [^:]*:[0-9]+: Error: r15 not allowed here -- `usat r15,#1,r0'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `usat r0,#1,r13'
 [^:]*:[0-9]+: Error: r15 not allowed here -- `usat r0,#1,r15'
-[^:]*:[0-9]+: Error: shift expression is too large -- `usat r1,#1,r3,asr#32'
+[^:]*:[0-9]+: Error: shift expression is too large -- `usat r1,#1,r3,asr ?#32'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `usat16 r13,#1,r0'
 [^:]*:[0-9]+: Error: r15 not allowed here -- `usat16 r15,#1,r0'
 [^:]*:[0-9]+: Error: r13 not allowed here -- `usat16 r0,#1,r13'


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 10/11] aarch64: relax gas testsuite whitespace expectations
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (8 preceding siblings ...)
  2024-06-28 13:24 ` [PATCH 09/11] Arm: relax gas testsuite whitespace expectations Jan Beulich
@ 2024-06-28 13:26 ` Jan Beulich
  2024-06-28 13:31 ` [PATCH WIP/RFC 11/11] gas: have scrubber retain more whitespace Jan Beulich
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:26 UTC (permalink / raw)
  To: Binutils; +Cc: Richard Earnshaw, Marcus Shawcroft

[-- Attachment #1: Type: text/plain, Size: 638 bytes --]

In a subsequent change the scrubber is going to be changed to retain
further whitespace. Test case expectations generally would better not
depend on the specific whitespace treatment by the scrubber, unless of
course a test is specifically about it. Adjust relevant test cases to
permit blanks where those will subsequently appear.
---
This is adding just the blanks that are going to be needed; imo it would
generally be better if test case expectations were, from the very
beginning, written to focus on what is being tested, rather than taking
verbatim copies of the respective tool's output.

[actual patch attached, for size reasons]

[-- Attachment #2: gas-testsuite-relax-Arm64.patch.gz --]
[-- Type: application/x-gzip, Size: 117830 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH WIP/RFC 11/11] gas: have scrubber retain more whitespace
  2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
                   ` (9 preceding siblings ...)
  2024-06-28 13:26 ` [PATCH 10/11] aarch64: " Jan Beulich
@ 2024-06-28 13:31 ` Jan Beulich
  10 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-06-28 13:31 UTC (permalink / raw)
  To: Binutils
  Cc: Nick Clifton, ramana.radhakrishnan, Richard Earnshaw,
	Marcus Shawcroft, H.J. Lu, Joseph Myers

According to the description os the state machine, the expectation
appears to be that (leaving aside labels) any insn mnemonic or
directive would be followed by a comma separated list of operands. That
may have been true very long ago, but the latest with the advent of more
elaborate macros this isn't rhe case anymore. Neither macro parameters
in macro definitions nor macro arguments in macro invocations are
required to be separated by commas. Hence whitespace serves a crucial
role there. Plus even without "real" macros issues exist, in e.g.

	.irp n, ...
	insn\n\(suffix)	operand1, operand2
	.endr

Whitespace following the closing parenthesis would have been removed
(ahead of even processing the .irp), as the "opcode" was deemed to have
ended earlier already.

Therefore, squash the distinction between "opcode" and operands, i.e.
fold state 10 back into state 3. Also drop most of the distinction
between "symbol chars" and "relatively normal" ones. Not entirely
unexpectedly this results in the need to skip whitespace in a few more
places in arch-specific code.
---
The comma related special casing could of course be extended to adjacent
two-char comments; I simply didn't deem this as important.

WIP: likely there's also fallout for further architectures
WIP: see about dropping the D10V special case (not helped by there not
     being a dedicated maintainer)
WIP: revisit the TIC6X special case
WIP: testcase(s)
WIP: NEWS entry (perhaps also to warn people of posible perceived
     regressions, along the lines of what "Arm: correct macro use in gas
     testsuite" addresses)

--- a/gas/app.c
+++ b/gas/app.c
@@ -423,16 +423,18 @@ do_scrub_chars (size_t (*get) (char *, s
 
   /*State 0: beginning of normal line
 	  1: After first whitespace on line (flush more white)
-	  2: After first non-white (opcode) on line (keep 1white)
-	  3: after second white on line (into operands) (flush white)
+	  2: After first non-white (opcode or maybe label when they're followed
+	     by colons) on line (keep 1white)
+	  3: after subsequent white on line (typically into operands)
+	     (flush more white)
 	  4: after putting out a .linefile, put out digits
 	  5: parsing a string, then go to old-state
 	  6: putting out \ escape in a "d string.
 	  7: no longer used
 	  8: no longer used
-	  9: After seeing symbol char in state 3 (keep 1white after symchar)
-	 10: After seeing whitespace in state 9 (keep white before symchar)
-	 11: After seeing a symbol character in state 0 (eg a label definition)
+	  9: After seeing non-white in state 3 (keep 1white)
+	 10: no longer used
+	 11: After seeing a non-white character in state 0 (eg a label definition)
 	 -1: output string in out_string and go to the state in old_state
 	 12: no longer used
 #ifdef DOUBLEBAR_PARALLEL
@@ -899,7 +901,11 @@ do_scrub_chars (size_t (*get) (char *, s
 	          && (state < 1 || strchr (tc_comment_chars, ch)))
 	      || IS_NEWLINE (ch)
 	      || IS_LINE_SEPARATOR (ch)
-	      || IS_PARALLEL_SEPARATOR (ch))
+	      || IS_PARALLEL_SEPARATOR (ch)
+	      /* See comma related comment near the bottom of the function.
+		 Reasoning equally applies to whitespace preceding a comma in
+		 most cases.  */
+	      || (ch == ',' && state > 2))
 	    {
 	      if (scrub_m68k_mri)
 		{
@@ -942,6 +948,7 @@ do_scrub_chars (size_t (*get) (char *, s
 		 character at the beginning of a line.  */
 	      goto recycle;
 	    case 2:
+	    case 9:
 	      state = 3;
 	      if (to + 1 < toend)
 		{
@@ -965,20 +972,6 @@ do_scrub_chars (size_t (*get) (char *, s
 		  break;
 		}
 	      goto recycle;	/* Sp in operands */
-	    case 9:
-	    case 10:
-#ifndef TC_KEEP_OPERAND_SPACES
-	      if (scrub_m68k_mri)
-#endif
-		{
-		  /* In MRI mode, we keep these spaces.  */
-		  state = 3;
-		  UNGET (ch);
-		  PUT (' ');
-		  break;
-		}
-	      state = 10;	/* Sp after symbol char */
-	      goto recycle;
 	    case 11:
 	      if (LABELS_WITHOUT_COLONS || flag_m68k_mri)
 		state = 1;
@@ -1049,27 +1042,17 @@ do_scrub_chars (size_t (*get) (char *, s
 	    {
 	      if (ch2 != EOF)
 		UNGET (ch2);
-	      if (state == 9 || state == 10)
-		state = 3;
+	      if (state == 1)
+		state = 2;
+	      else if (state == 3)
+		state = 9;
 	      PUT (ch);
 	    }
 	  break;
 
 	case LEX_IS_STRINGQUOTE:
 	  quotechar = ch;
-	  if (state == 10)
-	    {
-	      /* Preserve the whitespace in foo "bar".  */
-	      UNGET (ch);
-	      state = 3;
-	      PUT (' ');
-
-	      /* PUT didn't jump out.  We could just break, but we
-		 know what will happen, so optimize a bit.  */
-	      ch = GET ();
-	      old_state = 9;
-	    }
-	  else if (state == 3)
+	  if (state == 3)
 	    old_state = 9;
 	  else
 	    old_state = state;
@@ -1088,14 +1071,6 @@ do_scrub_chars (size_t (*get) (char *, s
 	      UNGET (c);
 	    }
 #endif
-	  if (state == 10)
-	    {
-	      /* Preserve the whitespace in foo 'b'.  */
-	      UNGET (ch);
-	      state = 3;
-	      PUT (' ');
-	      break;
-	    }
 	  ch = GET ();
 	  if (ch == EOF)
 	    {
@@ -1130,10 +1105,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	      PUT (out_buf[0]);
 	      break;
 	    }
-	  if (state == 9)
-	    old_state = 3;
-	  else
-	    old_state = state;
+	  old_state = state;
 	  state = -1;
 	  out_string = out_buf;
 	  PUT (*out_string++);
@@ -1143,10 +1115,10 @@ do_scrub_chars (size_t (*get) (char *, s
 #ifdef KEEP_WHITE_AROUND_COLON
 	  state = 9;
 #else
-	  if (state == 9 || state == 10)
-	    state = 3;
-	  else if (state != 3)
+	  if (state == 2 || state == 11)
 	    state = 1;
+	  else
+	    state = 9;
 #endif
 	  PUT (ch);
 	  break;
@@ -1271,7 +1243,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	      break;
 	    }
 
-#ifdef TC_D10V
+#ifdef TC_D10V//todo drop?
 	  /* All insns end in a char for which LEX_IS_SYMBOL_COMPONENT is true.
 	     Trap is the only short insn that has a first operand that is
 	     neither register nor label.
@@ -1282,7 +1254,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	     can recognize it as such.  */
 	  /* An alternative approach would be to reset the state to 1 when
 	     we see '||', '<'- or '->', but that seems to be overkill.  */
-	  if (state == 10)
+	  if (state == 3)
 	    PUT (' ');
 #endif
 	  /* We have a line comment character which is not at the
@@ -1296,7 +1268,7 @@ do_scrub_chars (size_t (*get) (char *, s
 	  if (scrub_m68k_mri
 	      && (ch == '!' || ch == '*' || ch == '#')
 	      && state != 1
-	      && state != 10)
+	      && state != 3)
 	    goto de_fault;
 	  /* Fall through.  */
 	case LEX_IS_COMMENT_START:
@@ -1352,17 +1324,6 @@ do_scrub_chars (size_t (*get) (char *, s
 	  /* Fall through.  */
 
 	case LEX_IS_SYMBOL_COMPONENT:
-	  if (state == 10)
-	    {
-	      /* This is a symbol character following another symbol
-		 character, with whitespace in between.  We skipped
-		 the whitespace earlier, so output it now.  */
-	      UNGET (ch);
-	      state = 3;
-	      PUT (' ');
-	      break;
-	    }
-
 #ifdef TC_Z80
 	  /* "af'" is a symbol containing '\''.  */
 	  if (state == 3 && (ch == 'a' || ch == 'A'))
@@ -1388,7 +1349,16 @@ do_scrub_chars (size_t (*get) (char *, s
 		}
 	    }
 #endif
-	  if (state == 3)
+
+	  /* Fall through.  */
+	default:
+	de_fault:
+	  /* Some relatively `normal' character.  */
+	  if (state == 0)
+	    state = 11;	/* Now seeing label definition.  */
+	  else if (state == 1)
+	    state = 2;	/* Ditto.  */
+	  else if (state == 3)
 	    state = 9;
 
 	  /* This is a common case.  Quickly copy CH and all the
@@ -1436,52 +1406,15 @@ do_scrub_chars (size_t (*get) (char *, s
 		}
 	    }
 
-	  /* Fall through.  */
-	default:
-	de_fault:
-	  /* Some relatively `normal' character.  */
-	  if (state == 0)
-	    {
-	      state = 11;	/* Now seeing label definition.  */
-	    }
-	  else if (state == 1)
-	    {
-	      state = 2;	/* Ditto.  */
-	    }
-	  else if (state == 9)
-	    {
-	      if (!IS_SYMBOL_COMPONENT (ch))
-		state = 3;
-	    }
-	  else if (state == 10)
-	    {
-	      if (ch == '\\')
-		{
-		  /* Special handling for backslash: a backslash may
-		     be the beginning of a formal parameter (of a
-		     macro) following another symbol character, with
-		     whitespace in between.  If that is the case, we
-		     output a space before the parameter.  Strictly
-		     speaking, correct handling depends upon what the
-		     macro parameter expands into; if the parameter
-		     expands into something which does not start with
-		     an operand character, then we don't want to keep
-		     the space.  We don't have enough information to
-		     make the right choice, so here we are making the
-		     choice which is more likely to be correct.  */
-		  if (to + 1 >= toend)
-		    {
-		      /* If we're near the end of the buffer, save the
-		         character for the next time round.  Otherwise
-		         we'll lose our state.  */
-		      UNGET (ch);
-		      goto tofull;
-		    }
-		  *to++ = ' ';
-		}
+	  /* As a special case, to limit the delta to previous behavior, e.g.
+	     also affecting listings, go straight to state 3 when seeing a
+	     comma. Commas are special: While they can be used to separate
+	     macro parameters or arguments, they cannot (on their own, i.e.
+	     without quoting) be arguments (or parameter default values).
+	     Hence successive whitespace is not meaningful there.  */
+	  if (ch == ',' && state == 9)
+	    state = 3;
 
-	      state = 3;
-	    }
 	  PUT (ch);
 	  break;
 	}
--- a/gas/config/tc-aarch64.c
+++ b/gas/config/tc-aarch64.c
@@ -641,6 +641,7 @@ const char FLT_CHARS[] = "rRsSfFdDxXeEpP
 static inline bool
 skip_past_char (char **str, char c)
 {
+  skip_whitespace (*str);
   if (**str == c)
     {
       (*str)++;
@@ -891,6 +892,7 @@ parse_reg (char **ccp)
   start++;
 #endif
 
+  skip_whitespace (start);
   p = start;
   if (!ISALPHA (*p) || !is_name_beginner (*p))
     return NULL;
@@ -1196,13 +1198,17 @@ parse_typed_reg (char **ccp, aarch64_reg
 		 struct vector_type_el *typeinfo, unsigned int flags)
 {
   char *str = *ccp;
-  bool is_alpha = ISALPHA (*str);
-  const reg_entry *reg = parse_reg (&str);
+  bool is_alpha;
+  const reg_entry *reg;
   struct vector_type_el atype;
   struct vector_type_el parsetype;
   bool is_typed_vecreg = false;
   unsigned int err_flags = (flags & PTR_IN_REGLIST) ? SEF_IN_REGLIST : 0;
 
+  skip_whitespace (str);
+  is_alpha = ISALPHA (*str);
+  reg = parse_reg (&str);
+
   atype.defined = 0;
   atype.type = NT_invtype;
   atype.width = -1;
@@ -1420,10 +1426,7 @@ parse_vector_reg_list (char **ccp, aarch
   do
     {
       if (in_range)
-	{
-	  str++;		/* skip over '-' */
-	  val_range = val;
-	}
+	val_range = val;
 
       const reg_entry *reg;
       if (has_qualifier)
@@ -1485,7 +1488,8 @@ parse_vector_reg_list (char **ccp, aarch
       in_range = 0;
       ptr_flags |= PTR_GOOD_MATCH;
     }
-  while (skip_past_comma (&str) || (in_range = 1, *str == '-'));
+  while (skip_past_comma (&str)
+	 || (in_range = 1, skip_past_char (&str, '-')));
 
   skip_whitespace (str);
   if (*str != '}')
@@ -8265,6 +8269,7 @@ parse_operands (char *str, const aarch64
     }
 
   /* Check if we have parsed all the operands.  */
+  skip_whitespace (str);
   if (*str != '\0' && ! error_p ())
     {
       /* Set I to the index of the last present operand; this is
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -1148,6 +1148,8 @@ my_get_expression (expressionS * ep, cha
     prefix_mode = (prefix_mode == GE_OPT_PREFIX_BIG) ? prefix_mode
 		  : GE_OPT_PREFIX;
 
+  skip_whitespace (*str);
+
   switch (prefix_mode)
     {
     case GE_NO_PREFIX: break;
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -15306,6 +15306,8 @@ RC_SAE_immediate (const char *imm_start)
       as_bad (_("Missing '}': '%s'"), imm_start);
       return 0;
     }
+  if (is_space_char (*pstr))
+    ++pstr;
   /* RC/SAE immediate string should contain nothing more.  */;
   if (*pstr != 0)
     {


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing
  2024-06-28 13:22 ` [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing Jan Beulich
@ 2024-06-28 13:53   ` Maciej W. Rozycki
  2024-07-01  5:58     ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Maciej W. Rozycki @ 2024-06-28 13:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Chenghua Xu

On Fri, 28 Jun 2024, Jan Beulich wrote:

> >From especially the checks for the two separator forms it appears to

 Accidental `>' here?

> In the MIPS testsuite adjustments I wasn't sure how exactly to make the
> necessary changes: I could also have zapped the two blanks each, but
> then I wondered why they were part of the expectations in the first
> place.

 Please do.

 The only reason I can think of why the patterns are such as they are is 
that they need to match the output as it has been produced.  It made no 
sense to make whitespace matching optional when said whitespace is always 
there.

 Of course I could have investigated why this extraneous whitespace is 
present there in the first place, but either I chose it wasn't worth it or 
I missed it in the eyeball review of the output produced; there are only 
two among 133 lines and not key to the purpose of the test case.  It's 
been a while, I genuinely do not remember.

 Thank you for making this clean-up, and taking care of the MIPS port in 
particular.

  Maciej

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/11] gas: pre-init the scrubber's lex[]
  2024-06-28 13:19 ` [PATCH 05/11] gas: pre-init the scrubber's lex[] Jan Beulich
@ 2024-07-01  1:45   ` Hans-Peter Nilsson
  2024-07-01  6:02     ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Hans-Peter Nilsson @ 2024-07-01  1:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Anthony Green, Alan Modra

On Fri, 28 Jun 2024, Jan Beulich wrote:
> +static char lex[256] = {
> +  [' ']  = LEX_IS_WHITESPACE,
> +  ['\t'] = LEX_IS_WHITESPACE,
> +  ['\r'] = LEX_IS_WHITESPACE,
> +  ['\n'] = LEX_IS_NEWLINE,
> +  [':']  = LEX_IS_COLON,
> +  ['$']  = LEX_IS_SYMBOL_COMPONENT,
> +  ['.']  = LEX_IS_SYMBOL_COMPONENT,
> +  ['_']  = LEX_IS_SYMBOL_COMPONENT,
> +  ['A' ... 'Z'] = LEX_IS_SYMBOL_COMPONENT,
> +  ['a' ... 'z'] = LEX_IS_SYMBOL_COMPONENT,
> +  ['0' ... '9'] = LEX_IS_SYMBOL_COMPONENT,
> +  [128 ... 255] = LEX_IS_SYMBOL_COMPONENT,

That "..." initializer thingy is a GCC GNU extension.
Cf. GCC docs.  You probably don't want to do that.
Spelling it out won't hurt that much.

brgds, H-P

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing
  2024-06-28 13:53   ` Maciej W. Rozycki
@ 2024-07-01  5:58     ` Jan Beulich
  2024-07-01 11:05       ` Maciej W. Rozycki
  2024-07-03 13:42       ` Michael Matz
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Beulich @ 2024-07-01  5:58 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Binutils, Chenghua Xu

On 28.06.2024 15:53, Maciej W. Rozycki wrote:
> On Fri, 28 Jun 2024, Jan Beulich wrote:
> 
>> >From especially the checks for the two separator forms it appears to
> 
>  Accidental `>' here?

Indeed. Don't have that locally, so not really sure where it came from.

>> In the MIPS testsuite adjustments I wasn't sure how exactly to make the
>> necessary changes: I could also have zapped the two blanks each, but
>> then I wondered why they were part of the expectations in the first
>> place.
> 
>  Please do.

Done locally; I won't post a v2 just for this, though.

>  The only reason I can think of why the patterns are such as they are is 
> that they need to match the output as it has been produced.  It made no 
> sense to make whitespace matching optional when said whitespace is always 
> there.

But the purpose of those tests isn't the _exact_ output that gas would
produce, is it? The presence / absence of whitespace in most tests
really is entirely unrelated to what is being tested. Just to mention
it ...

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 05/11] gas: pre-init the scrubber's lex[]
  2024-07-01  1:45   ` Hans-Peter Nilsson
@ 2024-07-01  6:02     ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2024-07-01  6:02 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Binutils, Anthony Green, Alan Modra

On 01.07.2024 03:45, Hans-Peter Nilsson wrote:
> On Fri, 28 Jun 2024, Jan Beulich wrote:
>> +static char lex[256] = {
>> +  [' ']  = LEX_IS_WHITESPACE,
>> +  ['\t'] = LEX_IS_WHITESPACE,
>> +  ['\r'] = LEX_IS_WHITESPACE,
>> +  ['\n'] = LEX_IS_NEWLINE,
>> +  [':']  = LEX_IS_COLON,
>> +  ['$']  = LEX_IS_SYMBOL_COMPONENT,
>> +  ['.']  = LEX_IS_SYMBOL_COMPONENT,
>> +  ['_']  = LEX_IS_SYMBOL_COMPONENT,
>> +  ['A' ... 'Z'] = LEX_IS_SYMBOL_COMPONENT,
>> +  ['a' ... 'z'] = LEX_IS_SYMBOL_COMPONENT,
>> +  ['0' ... '9'] = LEX_IS_SYMBOL_COMPONENT,
>> +  [128 ... 255] = LEX_IS_SYMBOL_COMPONENT,
> 
> That "..." initializer thingy is a GCC GNU extension.
> Cf. GCC docs.  You probably don't want to do that.

Oh, you're right. I got so used to using it that I've been under the wrong
impression that only the case label use of ... is an extension. Will undo
of course (I had it spelled out first); thanks for pointing out.

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing
  2024-07-01  5:58     ` Jan Beulich
@ 2024-07-01 11:05       ` Maciej W. Rozycki
  2024-07-03 13:42       ` Michael Matz
  1 sibling, 0 replies; 18+ messages in thread
From: Maciej W. Rozycki @ 2024-07-01 11:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils, Chenghua Xu

On Mon, 1 Jul 2024, Jan Beulich wrote:

> >> In the MIPS testsuite adjustments I wasn't sure how exactly to make the
> >> necessary changes: I could also have zapped the two blanks each, but
> >> then I wondered why they were part of the expectations in the first
> >> place.
> > 
> >  Please do.
> 
> Done locally; I won't post a v2 just for this, though.

 Thank you, and agreed.

> >  The only reason I can think of why the patterns are such as they are is 
> > that they need to match the output as it has been produced.  It made no 
> > sense to make whitespace matching optional when said whitespace is always 
> > there.
> 
> But the purpose of those tests isn't the _exact_ output that gas would
> produce, is it? The presence / absence of whitespace in most tests
> really is entirely unrelated to what is being tested. Just to mention
> it ...

 Of course, but something has to be placed there as the pattern to match 
against and taking the message produced literally, whitespace or not, is 
the simplest and most obvious choice.

 Also think of the MIPS tests as coverage for your good work; is there any 
other?

  Maciej

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing
  2024-07-01  5:58     ` Jan Beulich
  2024-07-01 11:05       ` Maciej W. Rozycki
@ 2024-07-03 13:42       ` Michael Matz
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Matz @ 2024-07-03 13:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Mon, 1 Jul 2024, Jan Beulich wrote:

> On 28.06.2024 15:53, Maciej W. Rozycki wrote:
> > On Fri, 28 Jun 2024, Jan Beulich wrote:
> > 
> >> >From especially the checks for the two separator forms it appears to
> > 
> >  Accidental `>' here?
> 
> Indeed. Don't have that locally, so not really sure where it came from.

That happens if anything in the mailpipe doesn't properly deal 
with from-escaping in mbox files:

  https://doc.dovecot.org/admin_manual/mailbox_formats/mbox/#from-escaping

(either too triger-happy escaping on the sender side, or not enough 
de-escaping on the store/forward/retrieve side)


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-07-03 13:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-28 13:15 [PATCH 00/11] gas: scrubber adjustments Jan Beulich
2024-06-28 13:17 ` [PATCH 01/11] gas: there's no scrubber state 12 Jan Beulich
2024-06-28 13:17 ` [PATCH 02/11] gas: drop scrubber state -2 Jan Beulich
2024-06-28 13:18 ` [PATCH 03/11] gas: multi-byte warning adjustments Jan Beulich
2024-06-28 13:18 ` [PATCH 04/11] gas: don't open-code IS_WHITESPACE() / IS_NEWLINE() Jan Beulich
2024-06-28 13:19 ` [PATCH 05/11] gas: pre-init the scrubber's lex[] Jan Beulich
2024-07-01  1:45   ` Hans-Peter Nilsson
2024-07-01  6:02     ` Jan Beulich
2024-06-28 13:20 ` [PATCH 06/11] gas: re-work tic6x scrubber special case Jan Beulich
2024-06-28 13:22 ` [PATCH 07/11] gas: consistently drop trailing whitespace when scrubbing Jan Beulich
2024-06-28 13:53   ` Maciej W. Rozycki
2024-07-01  5:58     ` Jan Beulich
2024-07-01 11:05       ` Maciej W. Rozycki
2024-07-03 13:42       ` Michael Matz
2024-06-28 13:23 ` [PATCH 08/11] Arm: correct macro use in gas testsuite Jan Beulich
2024-06-28 13:24 ` [PATCH 09/11] Arm: relax gas testsuite whitespace expectations Jan Beulich
2024-06-28 13:26 ` [PATCH 10/11] aarch64: " Jan Beulich
2024-06-28 13:31 ` [PATCH WIP/RFC 11/11] gas: have scrubber retain more whitespace Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).