public inbox for binutils@sourceware.org
From: Jan Beulich <jbeulich@suse.com>
To: Andrew Burgess <aburgess@redhat.com>
Cc: binutils@sourceware.org
Subject: Re: [PATCH 3/3] opcodes/i386: partially implement disassembler style support
Date: Mon, 21 Feb 2022 14:08:44 +0100
Message-ID: <6ad59383-d742-a217-a0e6-09d32fa4e900@suse.com>
In-Reply-To: <87sfsfceju.fsf@redhat.com>

On 19.02.2022 11:54, Andrew Burgess wrote:
> Jan Beulich via Binutils <binutils@sourceware.org> writes:
> 
>> On 17.02.2022 23:37, Andrew Burgess wrote:
>>> Jan Beulich via Binutils <binutils@sourceware.org> writes:
>>>> On 17.02.2022 17:15, Andrew Burgess wrote:
>>>>> Jan Beulich via Binutils <binutils@sourceware.org> writes:
>>>>>> On 16.02.2022 21:53, Andrew Burgess via Binutils wrote:
>>>>>>> +	      (*ins->info->fprintf_styled_func)
>>>>>>> +		(ins->info->stream, dis_style_text, " ");
>>>>>>> +	      (*ins->info->fprintf_styled_func)
>>>>>>> +		(ins->info->stream, dis_style_immediate, "0x%x",
>>>>>>> +		 (unsigned int) priv.the_buffer[0]);
>>>>>>
>>>>>> I wonder if the naming (dis_style_immediate) isn't misleading. As per
>>>>>> the comment next to its definition it really appears to mean any kind
>>>>>> of number (like is the case here), not just immediate operands of
>>>>>> instructions. Hence maybe dis_style_number (as replacement for or in
>>>>>> addition to dis_style_immediate)?
>>>>>
>>>>> You mentioned this before in the previous thread, and I didn't really
>>>>> understand then either.
>>>>>
>>>>> Can you give an example of something that's a number, but not an
>>>>> immediate?  E.g. (given the instruction/directive distinction you
>>>>> draw above) I wonder if you're concerned about '.byte 0x4'; maybe
>>>>> you don't like referring to this 0x4 here as an immediate?
>>>>
>>>> Well, an operand to a directive for example is not an immediate imo,
>>>> yes. A "load offset" (as your comment calls it) may also not be an
>>>> immediate. E.g. in x86 memory access instructions:
>>>>
>>>> 	mov	0x10(%rbx), %eax
>>>>
>>>> the 0x10 isn't an immediate, but a displacement. The difference may
>>>> be more relevant in something like
>>>>
>>>> 	mov	$0, 0x10(%rbx)
>>>>
>>>> where the $0 is an immediate operand, but the 0x10 isn't (and you
>>>> wouldn't want to mix the two).
>>>>
>>>> From that comment it's not clear to me where else you would think
>>>> "immediate" applies (or not), but in RISC-V's
>>>>
>>>> 	lw	x0, 0x10(x0)
>>>>
>>>> I wouldn't consider the 0x10 an immediate either, albeit this may
>>>> be a result of my x86 bias.
>>>
>>> I wonder if there's a name we could come up with that would allow me to
>>> classify the '$0' and '0x10' (in your example above) as the same style?
>>>
>>> I've kind-of lost the thread a bit, but maybe that's what the 'number'
>>> you suggested originally was for?  If I replaced dis_style_immediate with
>>> dis_style_number, and just replaced throughout, would that be less
>>> problematic?
>>
>> Yes, "number" was meant to be a possible replacement. Whether it's
>> helpful to style all forms of numbers the same is questionable
>> though. Note that to me in "$0" the '$' would not be covered by
>> "number" then, while it would be covered by "immediate".
>>
>>> Another possibility would be to have some aliases either in the original
>>> enum, as in:
>>>
>>>   dis_style_displacement = dis_style_immediate,
>>>
>>> or even at the top of i386-dis.c, as in:
>>>
>>>   #define dis_style_displacement dis_style_immediate
>>
>> But that's wrong. I'm not primarily after the naming in the sources,
>> but after the output not showing distinct things as similar.
>>
>>> I really think we should avoid adding too many distinct styles if we
>>> can.  My concern is less about disassembler users handling the different
>>> styles, and more about consistency between the disassemblers.  I figure
>>> it's easier to be consistent if we only have a small number of styles.
>>> If displacement is a different style to other immediates (yes, I'm still
>>> going to call numbers in instructions immediates!) then we end up with
>>> some architectures going one way, and others another.
>>
>> I agree with the goal of limiting the number of styles. But instead
>> of marking distinct items with similar non-basic (text) styles, I'd
>> rather see items left alone then. IOW with dis_style_immediate I'd
>> prefer (in the x86 example) to see only true immediate insn operands
>> be tagged that way, and all other numbers to remain dis_style_text.
> 
> Having reviewed the thread, I've tried to come up with the complete list
> of styles that I believe you are arguing for.  These include a new
> directive style, and an address_offset style.  I've extended the text in
> several places to cover how to handle prefixes, as well as how styles
> should be used for directives.
> 
> I'd be grateful if you could read through this list, and give any
> examples of things which, if styled using the rules below, you feel
> would be unacceptable.

I think this all looks good (but of course once this can actually
be seen in use, more things may pop up), provided ...

>   /* This is the default style, use this for any additional syntax
>      (e.g. commas between operands, brackets, etc), or just as a default if
>      no other style seems appropriate.  */
>   dis_style_text,
> 
>   /* Use this for all instruction mnemonics, or aliases for mnemonics.
>      These should be things that correspond to real machine
>      instructions.  */
>   dis_style_mnemonic,
> 
>   /* For things that aren't real machine instructions, but rather
>      assembler directives, e.g. .byte, etc.  */
>   dis_style_assembler_directive,
> 
>   /* Use this for any register names.  This may or may not include any
>      register prefix, e.g. '$', '%', at the discretion of the target,
>      though within each target the choice to include prefixes or not
>      should be kept consistent.  If the prefix is not printed with this
>      style, then dis_style_text should be used.  */
>   dis_style_register,
> 
>   /* Use this for any constant values used within instructions or
>      directives, unless the value is an absolute address, or an offset
>      that will be added to an address (no matter where the address comes
>      from) before use.  This style may or may not be used for any
>      prefix to the immediate value, e.g. '$', at the discretion of the
>      target, though within each target the choice to include these
>      prefixes should be kept consistent.  */
>   dis_style_immediate,
> 
>   /* The style for the numerical representation of an absolute address.
>      Anything that is an address offset should use the immediate style.
>      This style may or may not be used for any prefix to the immediate
>      value, e.g. '$', at the discretion of the target, though within
>      each target the choice to include these prefixes should be kept
>      consistent.  */
>   dis_style_address,
> 
>   /* The style for any constant value within an instruction or directive
>      that represents an offset that will be added to an address before
>      use.  This style may or may not be used for any prefix to the
>      immediate value, e.g. '$', at the discretion of the target, though
>      within each target the choice to include these prefixes should be
>      kept consistent.  */
>   dis_style_address_offset,

... it was more a copy-n-paste mistake to repeat the reference to $ etc
in these latter two? Or is this to cover e.g. Arm prefixing numbers by
# in a wider fashion than x86's use of $? In this case, using # as the
example character may avoid some confusion.
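
To make the proposed split concrete, here is a rough C sketch of how I
read the rules for the store example from earlier in the thread. It is
not the patch's code: the sketch_* names and the one-letter tags are
made up for illustration, the enum only mirrors the style names proposed
above, and putting the '$' inside the immediate style is just one of the
choices the comment leaves to the target.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of some of the style names proposed in this thread.
   Illustration only; the real enum may end up looking different.  */
enum sketch_style
{
  sketch_style_text,
  sketch_style_mnemonic,
  sketch_style_register,
  sketch_style_immediate,
  sketch_style_address_offset
};

/* Hypothetical one-letter tags, indexed by enum sketch_style, so the
   chosen style is visible in plain text.  */
static const char sketch_tag[] = { 'T', 'M', 'R', 'I', 'O' };

/* Append "[tag:text]" to the NUL-terminated string in BUF, where text
   is FMT expanded printf-style.  Output is truncated to fit SIZE.  */
static void
sketch_emit (char *buf, size_t size, enum sketch_style style,
	     const char *fmt, ...)
{
  char piece[64];
  va_list ap;
  size_t len = strlen (buf);

  va_start (ap, fmt);
  vsnprintf (piece, sizeof piece, fmt, ap);
  va_end (ap);

  if (len < size)
    snprintf (buf + len, size - len, "[%c:%s]", sketch_tag[style], piece);
}

/* Style "mov $0x0,0x10(%rbx)": the $0x0 is an immediate operand, the
   0x10 is a displacement and hence an address offset.  SIZE must be at
   least 1.  */
static void
sketch_render_store (char *buf, size_t size)
{
  buf[0] = '\0';
  sketch_emit (buf, size, sketch_style_mnemonic, "mov");
  sketch_emit (buf, size, sketch_style_text, " ");
  sketch_emit (buf, size, sketch_style_immediate, "$0x%x", 0u);
  sketch_emit (buf, size, sketch_style_text, ",");
  sketch_emit (buf, size, sketch_style_address_offset, "0x%x", 0x10u);
  sketch_emit (buf, size, sketch_style_text, "(");
  sketch_emit (buf, size, sketch_style_register, "%%rbx");
  sketch_emit (buf, size, sketch_style_text, ")");
}
```

Calling sketch_render_store () on a 128-byte buffer yields
[M:mov][T: ][I:$0x0][T:,][O:0x10][T:(][R:%rbx][T:)], i.e. the immediate
and the displacement end up with distinct tags, which is the property I
care about in the output.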

>   /* The style for a symbol's name.  The numerical address of a symbol
>      should use the address style above, this style is reserved for the
>      name.  */
>   dis_style_symbol,

There may be a remaining ambiguity here: What is the intended style to
be used for <symbol>+<offset>? Just dis_style_symbol or first
dis_style_symbol and then dis_style_address_offset?
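
Spelled out as a small C sketch, the two readings would produce the
following. All symoff_* names are hypothetical, the bracketed tags just
stand in for the styles, and treating '<', '>' and '+' as plain text is
my assumption rather than anything the proposed comments say.

```c
#include <stdio.h>

/* The two possible readings of the symbol style for <symbol>+<offset>.  */
enum symoff_policy
{
  SYMOFF_ALL_SYMBOL,	/* Name and offset both use the symbol style.  */
  SYMOFF_SPLIT		/* Name is symbol, offset is address_offset.  */
};

/* Render "<NAME+0xOFFSET>" into BUF (at most SIZE bytes), wrapping each
   piece in a "[style:...]" tag according to POLICY.  */
static void
symoff_render (char *buf, size_t size, enum symoff_policy policy,
	       const char *name, unsigned long offset)
{
  if (policy == SYMOFF_ALL_SYMBOL)
    snprintf (buf, size, "[text:<][symbol:%s+0x%lx][text:>]",
	      name, offset);
  else
    snprintf (buf, size,
	      "[text:<][symbol:%s][text:+][address_offset:0x%lx][text:>]",
	      name, offset);
}
```

For "main" and an offset of 0x10, SYMOFF_ALL_SYMBOL gives
[text:<][symbol:main+0x10][text:>] while SYMOFF_SPLIT gives
[text:<][symbol:main][text:+][address_offset:0x10][text:>]. Either is
workable, but the comment should say which one is meant.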

Jan


