public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] x86: Check unbalanced braces in memory reference
@ 2023-03-20 17:03 H.J. Lu
  2023-03-30 14:54 ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2023-03-20 17:03 UTC (permalink / raw)
  To: binutils

Check unbalanced braces in memory reference to avoid assembler crash
caused by

commit e87fb6a6d0cdfc0e9c471b7825c20c238c2cf506
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed Oct 5 09:16:24 2022 +0200

    x86/gas: support quoted address scale factor in AT&T syntax

	PR gas/30248
	* config/tc-i386.c (i386_att_operand): Check unbalanced braces
	in memory reference.
	* testsuite/gas/i386/i386.exp: Run pr30248.
	* testsuite/gas/i386/pr30248.d: New file.
	* testsuite/gas/i386/pr30248.err: Likewise.
	* testsuite/gas/i386/pr30248.s: Likewise.
---
 gas/config/tc-i386.c               | 6 +++++-
 gas/testsuite/gas/i386/i386.exp    | 1 +
 gas/testsuite/gas/i386/pr30248.d   | 2 ++
 gas/testsuite/gas/i386/pr30248.err | 5 +++++
 gas/testsuite/gas/i386/pr30248.s   | 2 ++
 5 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gas/testsuite/gas/i386/pr30248.d
 create mode 100644 gas/testsuite/gas/i386/pr30248.err
 create mode 100644 gas/testsuite/gas/i386/pr30248.s

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index ed8329f25d8..44efad73e5d 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -11613,7 +11613,11 @@ i386_att_operand (char *operand_string)
 	  temp_string = base_string;
 
 	  /* Skip past '(' and whitespace.  */
-	  gas_assert (*base_string == '(');
+	  if (*base_string != '(')
+	    {
+	      as_bad (_("unbalanced braces"));
+	      return 0;
+	    }
 	  ++base_string;
 	  if (is_space_char (*base_string))
 	    ++base_string;
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 590cd783efe..4d2150f9c68 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -752,6 +752,7 @@ run_dump_test pr19498
 run_list_test "nop-bad-1" ""
 run_list_test "unspec" ""
 run_dump_test "fp"
+run_dump_test pr30248
 if {[is_elf_format] || [istarget "*-*-vxworks*"]} then {
     run_list_test_stdin "list-1" "-al"
     run_list_test_stdin "list-2" "-al"
diff --git a/gas/testsuite/gas/i386/pr30248.d b/gas/testsuite/gas/i386/pr30248.d
new file mode 100644
index 00000000000..a6efd26e2c5
--- /dev/null
+++ b/gas/testsuite/gas/i386/pr30248.d
@@ -0,0 +1,2 @@
+#as:
+#error_output: pr30248.err
diff --git a/gas/testsuite/gas/i386/pr30248.err b/gas/testsuite/gas/i386/pr30248.err
new file mode 100644
index 00000000000..1f71543e1d4
--- /dev/null
+++ b/gas/testsuite/gas/i386/pr30248.err
@@ -0,0 +1,5 @@
+#failif
+
+#...
+.*Internal error.*
+#pass
diff --git a/gas/testsuite/gas/i386/pr30248.s b/gas/testsuite/gas/i386/pr30248.s
new file mode 100644
index 00000000000..ab875008f3d
--- /dev/null
+++ b/gas/testsuite/gas/i386/pr30248.s
@@ -0,0 +1,2 @@
+	.text
+	lgs ")"""
-- 
2.39.2


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: Check unbalanced braces in memory reference
  2023-03-20 17:03 [PATCH] x86: Check unbalanced braces in memory reference H.J. Lu
@ 2023-03-30 14:54 ` Jan Beulich
  2023-03-30 16:49   ` H.J. Lu
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2023-03-30 14:54 UTC (permalink / raw)
  To: H.J. Lu, Nick Clifton, Alan Modra; +Cc: binutils

On 20.03.2023 18:03, H.J. Lu via Binutils wrote:
> Check unbalanced braces in memory reference to avoid assembler crash
> caused by
> 
> commit e87fb6a6d0cdfc0e9c471b7825c20c238c2cf506
> Author: Jan Beulich <jbeulich@suse.com>
> Date:   Wed Oct 5 09:16:24 2022 +0200
> 
>     x86/gas: support quoted address scale factor in AT&T syntax

This claim is wrong, and the "fix" is wrong as well. The assertion is
correct, and it triggering correctly points out a problem, but elsewhere
(which makes me suspect you didn't take the time to understand what it
actually is that is going wrong): The parse_register() call from
i386_att_operand() ends up zapping the trailing three quotes from the
example operand in the testcase ('")"""'). Which renders invalid the
checking done earlier in parse_operands().

This behavior of parse_register() in turn is because of bogus behavior
in get_symbol_name(): It consumes all pairs of quotes (i.e. the trailing
three ones) with the apparent goal of concatenating adjacent strings.
But in this case the function stores two nul characters at different
positions, yet the caller cannot possibly restore more than one of the
original characters. Hence the previously properly balanced quoted
string becomes unbalanced. _This_ is what causes the assertion to
trigger.

Please revert. I'll see to get to fixing this where it needs fixing,
unless someone else gets to it earlier. For now it isn't really clear
to me what the best approach is going to be: Having all callers of
get_symbol_name() deal with the situation isn't nice. But dealing with
this in get_symbol_name() isn't nice either, as we'd need to replace
the "excess" characters by e.g. blanks. Yet code elsewhere often enough
assumes that adjacent blanks were collapsed by the scrubber. IOW even
then many/most(/all?) callers may need adjustment.

Possibly get_symbol_name() simply isn't intended for cases where the
original buffer contents is to remain usable for further processing.
If so,
- this property should be called out in the comment ahead of the
  function,
- we'd simply need to make a copy before calling the function in
  parse_register() (or in callers where retaining the original buffer
  contents matters).

Nick, Alan - do you have any thoughts here?

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: Check unbalanced braces in memory reference
  2023-03-30 14:54 ` Jan Beulich
@ 2023-03-30 16:49   ` H.J. Lu
  2023-03-31  5:50     ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: H.J. Lu @ 2023-03-30 16:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Nick Clifton, Alan Modra, binutils

On Thu, Mar 30, 2023 at 7:54 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 20.03.2023 18:03, H.J. Lu via Binutils wrote:
> > Check unbalanced braces in memory reference to avoid assembler crash
> > caused by
> >
> > commit e87fb6a6d0cdfc0e9c471b7825c20c238c2cf506
> > Author: Jan Beulich <jbeulich@suse.com>
> > Date:   Wed Oct 5 09:16:24 2022 +0200
> >
> >     x86/gas: support quoted address scale factor in AT&T syntax
>
> This claim is wrong, and the "fix" is wrong as well. The assertion is
> correct, and it triggering correctly points out a problem, but elsewhere
> (which makes me suspect you didn't take the time to understand what it
> actually is that is going wrong): The parse_register() call from
> i386_att_operand() ends up zapping the trailing three quotes from the
> example operand in the testcase ('")"""'). Which renders invalid the
> checking done earlier in parse_operands().
>
> This behavior of parse_register() in turn is because of bogus behavior
> in get_symbol_name(): It consumes all pairs of quotes (i.e. the trailing
> three ones) with the apparent goal of concatenating adjacent strings.
> But in this case the function stores two nul characters at different
> positions, yet the caller cannot possibly restore more than one of the
> original characters. Hence the previously properly balanced quoted
> string becomes unbalanced. _This_ is what causes the assertion to
> trigger.
>
> Please revert. I'll see to get to fixing this where it needs fixing,
> unless someone else gets to it earlier. For now it isn't really clear
> to me what the best approach is going to be: Having all callers of
> get_symbol_name() deal with the situation isn't nice. But dealing with
> this in get_symbol_name() isn't nice either, as we'd need to replace
> the "excess" characters by e.g. blanks. Yet code elsewhere often enough
> assumes that adjacent blanks were collapsed by the scrubber. IOW even
> then many/most(/all?) callers may need adjustment.
>
> Possibly get_symbol_name() simply isn't intended for cases where the
> original buffer contents is to remain usable for further processing.
> If so,
> - this property should be called out in the comment ahead of the
>   function,
> - we'd simply need to make a copy before calling the function in
>   parse_register() (or in callers where retaining the original buffer
>   contents matters).

What is the end goal of this complexity?  Assembler is changed
to accept more obscure syntaxes which aren't expected by the
original framework.   I think assembler should be as simple as
possible without those obscure syntaxes

> Nick, Alan - do you have any thoughts here?
>
> Jan



-- 
H.J.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86: Check unbalanced braces in memory reference
  2023-03-30 16:49   ` H.J. Lu
@ 2023-03-31  5:50     ` Jan Beulich
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2023-03-31  5:50 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Nick Clifton, Alan Modra, binutils

On 30.03.2023 18:49, H.J. Lu wrote:
> On Thu, Mar 30, 2023 at 7:54 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 20.03.2023 18:03, H.J. Lu via Binutils wrote:
>>> Check unbalanced braces in memory reference to avoid assembler crash
>>> caused by
>>>
>>> commit e87fb6a6d0cdfc0e9c471b7825c20c238c2cf506
>>> Author: Jan Beulich <jbeulich@suse.com>
>>> Date:   Wed Oct 5 09:16:24 2022 +0200
>>>
>>>     x86/gas: support quoted address scale factor in AT&T syntax
>>
>> This claim is wrong, and the "fix" is wrong as well. The assertion is
>> correct, and it triggering correctly points out a problem, but elsewhere
>> (which makes me suspect you didn't take the time to understand what it
>> actually is that is going wrong): The parse_register() call from
>> i386_att_operand() ends up zapping the trailing three quotes from the
>> example operand in the testcase ('")"""'). Which renders invalid the
>> checking done earlier in parse_operands().
>>
>> This behavior of parse_register() in turn is because of bogus behavior
>> in get_symbol_name(): It consumes all pairs of quotes (i.e. the trailing
>> three ones) with the apparent goal of concatenating adjacent strings.
>> But in this case the function stores two nul characters at different
>> positions, yet the caller cannot possibly restore more than one of the
>> original characters. Hence the previously properly balanced quoted
>> string becomes unbalanced. _This_ is what causes the assertion to
>> trigger.
>>
>> Please revert. I'll see to get to fixing this where it needs fixing,
>> unless someone else gets to it earlier. For now it isn't really clear
>> to me what the best approach is going to be: Having all callers of
>> get_symbol_name() deal with the situation isn't nice. But dealing with
>> this in get_symbol_name() isn't nice either, as we'd need to replace
>> the "excess" characters by e.g. blanks. Yet code elsewhere often enough
>> assumes that adjacent blanks were collapsed by the scrubber. IOW even
>> then many/most(/all?) callers may need adjustment.
>>
>> Possibly get_symbol_name() simply isn't intended for cases where the
>> original buffer contents is to remain usable for further processing.
>> If so,
>> - this property should be called out in the comment ahead of the
>>   function,
>> - we'd simply need to make a copy before calling the function in
>>   parse_register() (or in callers where retaining the original buffer
>>   contents matters).
> 
> What is the end goal of this complexity?  Assembler is changed
> to accept more obscure syntaxes which aren't expected by the
> original framework.   I think assembler should be as simple as
> possible without those obscure syntaxes

Quoted symbol names are a general feature of gas; they should work
everywhere. Or else that capability would want to removing again
completely (which I'm far from suggesting we do).

What you've done in your "fix" is paper over one corner aspect of a
much wider problem. IOW even without the change you falsely blamed
there would be problems from the underlying cause that I've
identified (and which wants fixing there instead of curing symptoms
all over the place).

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-03-31  5:50 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-20 17:03 [PATCH] x86: Check unbalanced braces in memory reference H.J. Lu
2023-03-30 14:54 ` Jan Beulich
2023-03-30 16:49   ` H.J. Lu
2023-03-31  5:50     ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).