public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/3] better fix for PR gas/30248
@ 2023-03-31 10:03 Jan Beulich
  2023-03-31 10:04 ` [PATCH 1/3] x86: parse_real_register() does not alter the parsed string Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jan Beulich @ 2023-03-31 10:03 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

As said in reply to the original attempt - that was wrong along with
the diagnosis of the root cause.

1: x86: parse_real_register() does not alter the parsed string
2: x86: parse_register() must not alter the parsed string
3: gas: document that get_symbol_name() can clobber the input buffer

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] x86: parse_real_register() does not alter the parsed string
  2023-03-31 10:03 [PATCH 0/3] better fix for PR gas/30248 Jan Beulich
@ 2023-03-31 10:04 ` Jan Beulich
  2023-03-31 10:05 ` [PATCH 2/3] x86: parse_register() must " Jan Beulich
  2023-03-31 10:07 ` [PATCH 3/3] gas: document that get_symbol_name() can clobber the input buffer Jan Beulich
  2 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2023-03-31 10:04 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

Follow the model of strtol() et al - input string is const-qualified to
signal that the string isn't altered, but the returned "end" pointer is
not const-qualified, requiring const to be cast away (which generally is
a bad idea, but the alternative would be more convoluted code).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -13749,9 +13749,9 @@ static bool check_register (const reg_en
 /* REG_STRING starts *before* REGISTER_PREFIX.  */
 
 static const reg_entry *
-parse_real_register (char *reg_string, char **end_op)
+parse_real_register (const char *reg_string, char **end_op)
 {
-  char *s = reg_string;
+  const char *s = reg_string;
   char *p;
   char reg_name_given[MAX_REG_NAME_SIZE + 1];
   const reg_entry *r;
@@ -13774,7 +13774,7 @@ parse_real_register (char *reg_string, c
   if (is_part_of_name (*s))
     return (const reg_entry *) NULL;
 
-  *end_op = s;
+  *end_op = (char *) s;
 
   r = (const reg_entry *) str_hash_find (reg_hash, reg_name_given);
 
@@ -13802,7 +13802,7 @@ parse_real_register (char *reg_string, c
 		++s;
 	      if (*s == ')')
 		{
-		  *end_op = s + 1;
+		  *end_op = (char *) s + 1;
 		  know (r[fpr].reg_num == fpr);
 		  return r + fpr;
 		}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 2/3] x86: parse_register() must not alter the parsed string
  2023-03-31 10:03 [PATCH 0/3] better fix for PR gas/30248 Jan Beulich
  2023-03-31 10:04 ` [PATCH 1/3] x86: parse_real_register() does not alter the parsed string Jan Beulich
@ 2023-03-31 10:05 ` Jan Beulich
  2023-03-31 10:07 ` [PATCH 3/3] gas: document that get_symbol_name() can clobber the input buffer Jan Beulich
  2 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2023-03-31 10:05 UTC (permalink / raw)
  To: Binutils; +Cc: H.J. Lu

This reverts the code change done by 100f993c53a5 ("x86: Check
unbalanced braces in memory reference"), which wrongly identified
e87fb6a6d0cd ("x86/gas: support quoted address scale factor in AT&T
syntax") as the root cause of PR gas/30248. (The testcase is left in
place, no matter that it's at best marginally useful in that shape.)

The problem instead is that parse_register() alters the string handed to
it, thus breaking valid assumptions in subsequent parsing code. Since
the function's behavior is a result of get_symbol_name()'s, make a copy
of the incoming string before invoking that function.

Like for parse_real_register() follow the model of strtol() et al: input
string is const-qualified to signal that the string isn't altered, but
the returned "end" pointer is not const-qualified, requiring const to be
cast away (which generally is a bad idea, but the alternative would
again be more convoluted code).

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -159,7 +159,7 @@ static int i386_att_operand (char *);
 static int i386_intel_operand (char *, int);
 static int i386_intel_simplify (expressionS *);
 static int i386_intel_parse_name (const char *, expressionS *);
-static const reg_entry *parse_register (char *, char **);
+static const reg_entry *parse_register (const char *, char **);
 static const char *parse_insn (const char *, char *, bool);
 static char *parse_operands (char *, const char *);
 static void swap_operands (void);
@@ -12497,11 +12497,7 @@ i386_att_operand (char *operand_string)
 	  temp_string = base_string;
 
 	  /* Skip past '(' and whitespace.  */
-	  if (*base_string != '(')
-	    {
-	      as_bad (_("unbalanced braces"));
-	      return 0;
-	    }
+	  gas_assert (*base_string == '(');
 	  ++base_string;
 	  if (is_space_char (*base_string))
 	    ++base_string;
@@ -13818,7 +13814,7 @@ parse_real_register (const char *reg_str
 /* REG_STRING starts *before* REGISTER_PREFIX.  */
 
 static const reg_entry *
-parse_register (char *reg_string, char **end_op)
+parse_register (const char *reg_string, char **end_op)
 {
   const reg_entry *r;
 
@@ -13829,12 +13825,12 @@ parse_register (char *reg_string, char *
   if (!r)
     {
       char *save = input_line_pointer;
-      char c;
+      char *buf = xstrdup (reg_string), *name;
       symbolS *symbolP;
 
-      input_line_pointer = reg_string;
-      c = get_symbol_name (&reg_string);
-      symbolP = symbol_find (reg_string);
+      input_line_pointer = buf;
+      get_symbol_name (&name);
+      symbolP = symbol_find (name);
       while (symbolP && S_GET_SEGMENT (symbolP) != reg_section)
 	{
 	  const expressionS *e = symbol_get_value_expression(symbolP);
@@ -13852,7 +13848,7 @@ parse_register (char *reg_string, char *
 	      know (e->X_add_number >= 0
 		    && (valueT) e->X_add_number < i386_regtab_size);
 	      r = i386_regtab + e->X_add_number;
-	      *end_op = input_line_pointer;
+	      *end_op = (char *) reg_string + (input_line_pointer - buf);
 	    }
 	  if (r && !check_register (r))
 	    {
@@ -13861,8 +13857,8 @@ parse_register (char *reg_string, char *
 	      r = &bad_reg;
 	    }
 	}
-      *input_line_pointer = c;
       input_line_pointer = save;
+      free (buf);
     }
   return r;
 }


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 3/3] gas: document that get_symbol_name() can clobber the input buffer
  2023-03-31 10:03 [PATCH 0/3] better fix for PR gas/30248 Jan Beulich
  2023-03-31 10:04 ` [PATCH 1/3] x86: parse_real_register() does not alter the parsed string Jan Beulich
  2023-03-31 10:05 ` [PATCH 2/3] x86: parse_register() must " Jan Beulich
@ 2023-03-31 10:07 ` Jan Beulich
  2 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2023-03-31 10:07 UTC (permalink / raw)
  To: Binutils; +Cc: Nick Clifton, Alan Modra

Callers which want to make further parsing attempts at the buffer passed
to the function need to be aware that due to the potential of string
concatenation the input buffer may be altered in ways beyond what can be
undone by putting back at *input_line_pointer the character that the
function returns.

--- a/gas/expr.c
+++ b/gas/expr.c
@@ -2388,12 +2388,17 @@ resolve_expression (expressionS *express
    here lessens the crowd at read.c.
 
    Assume input_line_pointer is at start of symbol name, or the
-    start of a double quote enclosed symbol name.
-   Advance input_line_pointer past symbol name.
-   Turn that character into a '\0', returning its former value,
-    which may be the closing double quote.
+   start of a double quote enclosed symbol name.  Advance
+   input_line_pointer past symbol name.  Turn that character into a '\0',
+   returning its former value, which may be the closing double quote.
+
    This allows a string compare (RMS wants symbol names to be strings)
-    of the symbol name.
+   of the symbol name.
+
+   NOTE: The input buffer is further altered when adjacent strings are
+   concatenated by the function.  Callers caring about the original buffer
+   contents will need to make a copy before calling here.
+
    There will always be a char following symbol name, because all good
    lines end in end-of-line.  */
 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-03-31 10:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-31 10:03 [PATCH 0/3] better fix for PR gas/30248 Jan Beulich
2023-03-31 10:04 ` [PATCH 1/3] x86: parse_real_register() does not alter the parsed string Jan Beulich
2023-03-31 10:05 ` [PATCH 2/3] x86: parse_register() must " Jan Beulich
2023-03-31 10:07 ` [PATCH 3/3] gas: document that get_symbol_name() can clobber the input buffer Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).