public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: Nick Clifton <nickc@redhat.com>, Alan Modra <amodra@gmail.com>,
	binutils@sourceware.org
Subject: Re: [PATCH] x86: Check unbalanced braces in memory reference
Date: Fri, 31 Mar 2023 07:50:13 +0200	[thread overview]
Message-ID: <83a4bd07-fa95-fd39-0b0b-64a895a54211@suse.com> (raw)
In-Reply-To: <CAMe9rOq+E7kKEMpxEvw6XknD6uYmoJR6AoQCncbeyk-jX7fSZg@mail.gmail.com>

On 30.03.2023 18:49, H.J. Lu wrote:
> On Thu, Mar 30, 2023 at 7:54 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 20.03.2023 18:03, H.J. Lu via Binutils wrote:
>>> Check unbalanced braces in memory reference to avoid assembler crash
>>> caused by
>>>
>>> commit e87fb6a6d0cdfc0e9c471b7825c20c238c2cf506
>>> Author: Jan Beulich <jbeulich@suse.com>
>>> Date:   Wed Oct 5 09:16:24 2022 +0200
>>>
>>>     x86/gas: support quoted address scale factor in AT&T syntax
>>
>> This claim is wrong, and the "fix" is wrong as well. The assertion is
>> correct, and it triggering correctly points out a problem, but elsewhere
>> (which makes me suspect you didn't take the time to understand what it
>> actually is that is going wrong): The parse_register() call from
>> i386_att_operand() ends up zapping the trailing three quotes from the
>> example operand in the testcase ('")"""'). Which renders invalid the
>> checking done earlier in parse_operands().
>>
>> This behavior of parse_register() in turn is because of bogus behavior
>> in get_symbol_name(): It consumes all pairs of quotes (i.e. the trailing
>> three ones) with the apparent goal of concatenating adjacent strings.
>> But in this case the function stores two nul characters at different
>> positions, yet the caller cannot possibly restore more than one of the
>> original characters. Hence the previously properly balanced quoted
>> string becomes unbalanced. _This_ is what causes the assertion to
>> trigger.
>>
>> Please revert. I'll see to get to fixing this where it needs fixing,
>> unless someone else gets to it earlier. For now it isn't really clear
>> to me what the best approach is going to be: Having all callers of
>> get_symbol_name() deal with the situation isn't nice. But dealing with
>> this in get_symbol_name() isn't nice either, as we'd need to replace
>> the "excess" characters by e.g. blanks. Yet code elsewhere often enough
>> assumes that adjacent blanks were collapsed by the scrubber. IOW even
>> then many/most(/all?) callers may need adjustment.
>>
>> Possibly get_symbol_name() simply isn't intended for cases where the
>> original buffer contents is to remain usable for further processing.
>> If so,
>> - this property should be called out in the comment ahead of the
>>   function,
>> - we'd simply need to make a copy before calling the function in
>>   parse_register() (or in callers where retaining the original buffer
>>   contents matters).
> 
> What is the end goal of this complexity?  Assembler is changed
> to accept more obscure syntaxes which aren't expected by the
> original framework.   I think assembler should be as simple as
> possible without those obscure syntaxes

Quoted symbol names are a general feature of gas; they should work
everywhere. Or else that capability would want to removing again
completely (which I'm far from suggesting we do).

What you've done in your "fix" is paper over one corner aspect of a
much wider problem. IOW even without the change you falsely blamed
there would be problems from the underlying cause that I've
identified (and which wants fixing there instead of curing symptoms
all over the place).

Jan

      reply	other threads:[~2023-03-31  5:50 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-20 17:03 H.J. Lu
2023-03-30 14:54 ` Jan Beulich
2023-03-30 16:49   ` H.J. Lu
2023-03-31  5:50     ` Jan Beulich [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83a4bd07-fa95-fd39-0b0b-64a895a54211@suse.com \
    --to=jbeulich@suse.com \
    --cc=amodra@gmail.com \
    --cc=binutils@sourceware.org \
    --cc=hjl.tools@gmail.com \
    --cc=nickc@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).