public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Andrew Burgess <aburgess@redhat.com>
Cc: Binutils <binutils@sourceware.org>
Subject: Re: [PATCHv2] libopcodes: extend the styling within the i386 disassembler
Date: Wed, 18 May 2022 14:23:36 -0700	[thread overview]
Message-ID: <CAMe9rOoZHm97UQyaNDXYsnj0cftLCYhZCoRtsVKJkRrOHsEH1A@mail.gmail.com> (raw)
In-Reply-To: <20220509125414.3526554-1-aburgess@redhat.com>

On Mon, May 9, 2022 at 5:54 AM Andrew Burgess via Binutils
<binutils@sourceware.org> wrote:
>
> In patch v2:
>
>   - Addressed all minor feedback items from Vladimir, H.J. and Jan,
>
>   - Switched to using \002 as the styling escape character,
>
>   - Escape character is defined once near the top of i386-dis.c making
>     it easy to switch to a different character if needed,
>
>   - Detection of the style escape character is stricter in
>     i386_dis_printf,
>
>   - Proper error handling in i386_dis_printf, though I can't imagine
>     when this would actually trigger.
>
> ---
>
> The i386 disassembler is pretty complex.  Most disassembly is done
> indirectly; operands are built into buffers within a struct instr_info
> instance, before finally being printed later in the disassembly
> process.
>
> Sometimes the operand buffers are built in a different order to the
> order in which they will eventually be printed.
>
> Each operand can contain multiple components, e.g. multiple registers,
> immediates, other textual elements (commas, brackets, etc).
>
> When looking for how to apply styling I guess the ideal solution would
> be to move away from the operands being a single string that is built
> up, and instead have each operand be a list of "parts", where each
> part is some text and a style.  Then, when we eventually print the
> operand we would loop over the parts and print each part with the
> correct style.
>
> But it feels like a huge amount of work to move from where we are
> now to that potentially ideal solution.  Plus, the above solution
> would be pretty complex.
>
> So, instead I propose a .... different solution here, one that works
> with the existing infrastructure.
>
> As each operand is built up, piece be piece, we pass through style
> information.  This style information is then encoded into the operand
> buffer (see below for details).  After this the code can continue to
> operate as it does right now in order to manage the set of operand
> buffers.
>
> Then, as each operand is printed we can split the operand buffer into
> chunks at the style marker boundaries, with each chunk being printed
> with the correct style.
>
> For encoding the style information I use a single character, currently
> \002, followed by the style encoded as a single hex digit, followed
> again by the \002 character.
>
> This of course relies on there not being more than 16 styles, but that
> is currently true, and hopefully will remain true for the foreseeable
> future.
>
> The other major concern that has arisen around this work is whether
> the escape character could ever be encountered in output naturally
> generated by the disassembler.  If this did happen then the escape
> characters would be stripped from the output, and the wrong styling
> would be applied.
>
> However, I don't believe that this is currently a problem.
> Disassembler content comes from a number of sources.  First there's
> content that copied directly from the i386-dis.c file, this is things
> like register names, and other syntax elements (brackets, commas,
> etc).  We can easily check that the i386-dis.c file doesn't contain
> our special character.
>
> The next source of content are immediate operands.  The text for these
> operands is generated by calls into libc.  By selecting a
> non-printable character we can be confident that this is not something
> that libc will generate as part of an immediate representation.
>
> The other output that appears to be from the disassembler is operands
> that contain addresses and (possibly) symbol names.  It is quite
> possible that a symbol name might contain any special character we
> could imagine, so is this a problem?
>
> I don't think it is, we don't actually print address and symbol
> operands through the disassembler, instead, the disassembler calls
> back to the user (objdump, gdb, etc) to print the address and symbol
> on its behalf.  This content is printed directly to the output stream,
> it does not pass through the i386 disassembler output buffers.  As a
> result, we never check this particular output for styling escape
> characters.
>
> In some (not very scientific) benchmarking on my machine,
> disassembling a reasonably large (142M) shared library, I'm not seeing
> any significant slow down in disassembler speed with this change.
>
> Most instructions are now being fully syntax highlighted when I
> disassemble using the --disassembler-color=extended-color option.  I'm
> sure that there are probably still a few corner cases that need fixing
> up, but we can come back to them later I think.
>
> When disassembler syntax highlighting is not being used, then there
> should be no user visible changes after this commit.
> ---
>  opcodes/i386-dis.c | 405 +++++++++++++++++++++++++++++++--------------
>  1 file changed, 278 insertions(+), 127 deletions(-)
>
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index 6ef091ea7d7..28834e4650b 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -47,6 +47,8 @@ static void dofloat (instr_info *, int);
>  static void OP_ST (instr_info *, int, int);
>  static void OP_STi (instr_info *, int, int);
>  static int putop (instr_info *, const char *, int);
> +static void oappend_with_style (instr_info *, const char *,
> +                               enum disassembler_style);
>  static void oappend (instr_info *, const char *);
>  static void append_seg (instr_info *);
>  static void OP_indirE (instr_info *, int, int);
> @@ -116,6 +118,10 @@ static void FXSAVE_Fixup (instr_info *, int, int);
>  static void MOVSXD_Fixup (instr_info *, int, int);
>  static void DistinctDest_Fixup (instr_info *, int, int);
>
> +/* This character is used to encode style information within the output
> +   buffers.  See oappend_insert_style for more details.  */
> +#define STYLE_MARKER_CHAR '\002'
> +
>  struct dis_private {
>    /* Points to first byte not fetched.  */
>    bfd_byte *max_fetched;
> @@ -248,6 +254,8 @@ struct instr_info
>
>    enum x86_64_isa isa64;
>
> +  int (*printf) (instr_info *ins, enum disassembler_style style,
> +                const char *fmt, ...) ATTRIBUTE_FPTR_PRINTF_3;
>  };
>
>  /* Mark parts used in the REX prefix.  When we are testing for
> @@ -9298,11 +9306,103 @@ get_sib (instr_info *ins, int sizeflag)
>  }
>
>  /* Like oappend (below), but S is a string starting with '%'.
> -   In Intel syntax, the '%' is elided.  */
> +   In Intel syntax, the '%' is elided.  STYLE is used when displaying this
> +   part of the output in the disassembler.  */
> +
> +static void
> +oappend_maybe_intel_with_style (instr_info *ins, const char *s,
> +                               enum disassembler_style style)
> +{
> +  oappend_with_style (ins, s + ins->intel_syntax, style);
> +}
> +
> +/* Like oappend_maybe_intel_with_style, but always uses text style.  */
> +
>  static void
>  oappend_maybe_intel (instr_info *ins, const char *s)
>  {
> -  oappend (ins, s + ins->intel_syntax);
> +  oappend_maybe_intel_with_style (ins, s, dis_style_text);
> +}
> +
> +/* Wrap around a call to INS->info->fprintf_styled_func, printing FMT.
> +   STYLE is the default style to use in the fprintf_styled_func calls,
> +   however, FMT might include embedded style markers (see oappend_style),
> +   these embedded markers are not printed, but instead change the style
> +   used in the next fprintf_styled_func call.
> +
> +   Return non-zero to indicate the print call was a success.  */
> +
> +static int ATTRIBUTE_PRINTF_3
> +i386_dis_printf (instr_info *ins, enum disassembler_style style,
> +                const char *fmt, ...)
> +{
> +  va_list ap;
> +  enum disassembler_style curr_style = style;
> +  char *start, *curr;
> +  char staging_area[100];
> +  int res;
> +
> +  va_start (ap, fmt);
> +  res = vsnprintf (staging_area, sizeof (staging_area), fmt, ap);
> +  va_end (ap);
> +
> +  if (res < 0)
> +    return res;
> +
> +  start = curr = staging_area;
> +
> +  do
> +    {
> +      if (*curr == '\0'
> +         || (*curr == STYLE_MARKER_CHAR
> +             && ISXDIGIT (*(curr + 1))
> +             && *(curr + 2) == STYLE_MARKER_CHAR))
> +       {
> +         /* Output content between our START position and CURR.  */
> +         int len = curr - start;
> +         int n = (*ins->info->fprintf_styled_func) (ins->info->stream,
> +                                                    curr_style,
> +                                                    "%.*s", len, start);
> +         if (n < 0)
> +           {
> +             res = n;
> +             break;
> +           }
> +
> +         if (*curr == '\0')
> +           break;
> +
> +         /* Skip over the initial STYLE_MARKER_CHAR.  */
> +         ++curr;
> +
> +         /* Update the CURR_STYLE.  As there are less than 16 styles, it
> +            is possible, that if the input is corrupted in some way, that
> +            we might set CURR_STYLE to an invalid value.  Don't worry
> +            though, we check for this situation.  */
> +         if (*curr >= '0' && *curr <= '9')
> +           curr_style = (enum disassembler_style) (*curr - '0');
> +         else if (*curr >= 'a' && *curr <= 'f')
> +           curr_style = (enum disassembler_style) (*curr - 'a' + 10);
> +         else
> +           curr_style = dis_style_text;
> +
> +         /* Check for an invalid style having been selected.  This should
> +            never happen, but it doesn't hurt to be a little paranoid.  */
> +         if (curr_style > dis_style_comment_start)
> +           curr_style = dis_style_text;
> +
> +         /* Skip the hex character, and the closing STYLE_MARKER_CHAR.  */
> +         curr += 2;
> +
> +         /* Reset the START to after the style marker.  */
> +         start = curr;
> +       }
> +      else
> +       ++curr;
> +    }
> +  while (true);
> +
> +  return res;
>  }
>
>  static int
> @@ -9317,6 +9417,7 @@ print_insn (bfd_vma pc, instr_info *ins)
>    struct dis_private priv;
>    int prefix_length;
>
> +  ins->printf = i386_dis_printf;
>    ins->isa64 = 0;
>    ins->intel_mnemonic = !SYSV386_COMPAT;
>    ins->op_is_jump = false;
> @@ -9401,8 +9502,7 @@ print_insn (bfd_vma pc, instr_info *ins)
>
>    if (ins->address_mode == mode_64bit && sizeof (bfd_vma) < 8)
>      {
> -      (*ins->info->fprintf_styled_func) (ins->info->stream, dis_style_text,
> -                                        _("64-bit address is disabled"));
> +      ins->printf (ins, dis_style_text, _("64-bit address is disabled"));
>        return -1;
>      }
>
> @@ -9451,16 +9551,14 @@ print_insn (bfd_vma pc, instr_info *ins)
>         {
>           name = prefix_name (ins, priv.the_buffer[0], priv.orig_sizeflag);
>           if (name != NULL)
> -           (*ins->info->fprintf_styled_func)
> -             (ins->info->stream, dis_style_mnemonic, "%s", name);
> +           ins->printf (ins, dis_style_mnemonic, "%s", name);
>           else
>             {
>               /* Just print the first byte as a .byte instruction.  */
> -             (*ins->info->fprintf_styled_func)
> -               (ins->info->stream, dis_style_assembler_directive, ".byte ");
> -             (*ins->info->fprintf_styled_func)
> -               (ins->info->stream, dis_style_immediate, "0x%x",
> -                (unsigned int) priv.the_buffer[0]);
> +             ins->printf (ins, dis_style_assembler_directive,
> +                          ".byte ");
> +             ins->printf (ins, dis_style_immediate, "0x%x",
> +                          (unsigned int) priv.the_buffer[0]);
>             }
>
>           return 1;
> @@ -9478,10 +9576,9 @@ print_insn (bfd_vma pc, instr_info *ins)
>        for (i = 0;
>            i < (int) ARRAY_SIZE (ins->all_prefixes) && ins->all_prefixes[i];
>            i++)
> -       (*ins->info->fprintf_styled_func)
> -         (ins->info->stream, dis_style_mnemonic, "%s%s",
> -          (i == 0 ? "" : " "), prefix_name (ins, ins->all_prefixes[i],
> -                                            sizeflag));
> +       ins->printf (ins, dis_style_mnemonic, "%s%s",
> +                    (i == 0 ? "" : " "),
> +                    prefix_name (ins, ins->all_prefixes[i], sizeflag));
>        return i;
>      }
>
> @@ -9496,11 +9593,9 @@ print_insn (bfd_vma pc, instr_info *ins)
>        /* Handle ins->prefixes before fwait.  */
>        for (i = 0; i < ins->fwait_prefix && ins->all_prefixes[i];
>            i++)
> -       (*ins->info->fprintf_styled_func)
> -         (ins->info->stream, dis_style_mnemonic, "%s ",
> -          prefix_name (ins, ins->all_prefixes[i], sizeflag));
> -      (*ins->info->fprintf_styled_func)
> -       (ins->info->stream, dis_style_mnemonic, "fwait");
> +       ins->printf (ins, dis_style_mnemonic, "%s ",
> +                    prefix_name (ins, ins->all_prefixes[i], sizeflag));
> +      ins->printf (ins, dis_style_mnemonic, "fwait");
>        return i + 1;
>      }
>
> @@ -9649,16 +9744,14 @@ print_insn (bfd_vma pc, instr_info *ins)
>       are all 0s in inverted form.  */
>    if (ins->need_vex && ins->vex.register_specifier != 0)
>      {
> -      (*ins->info->fprintf_styled_func) (ins->info->stream, dis_style_text,
> -                                        "(bad)");
> +      ins->printf (ins, dis_style_text, "(bad)");
>        return ins->end_codep - priv.the_buffer;
>      }
>
>    /* If EVEX.z is set, there must be an actual mask register in use.  */
>    if (ins->vex.zeroing && ins->vex.mask_register_specifier == 0)
>      {
> -      (*ins->info->fprintf_styled_func) (ins->info->stream, dis_style_text,
> -                                        "(bad)");
> +      ins->printf (ins, dis_style_text, "(bad)");
>        return ins->end_codep - priv.the_buffer;
>      }
>
> @@ -9669,8 +9762,7 @@ print_insn (bfd_vma pc, instr_info *ins)
>          the encoding invalid.  Most other PREFIX_OPCODE rules still apply.  */
>        if (ins->need_vex ? !ins->vex.prefix : !(ins->prefixes & PREFIX_DATA))
>         {
> -         (*ins->info->fprintf_styled_func) (ins->info->stream,
> -                                            dis_style_text, "(bad)");
> +         ins->printf (ins, dis_style_text, "(bad)");
>           return ins->end_codep - priv.the_buffer;
>         }
>        ins->used_prefixes |= PREFIX_DATA;
> @@ -9697,8 +9789,7 @@ print_insn (bfd_vma pc, instr_info *ins)
>           || (ins->vex.evex && dp->prefix_requirement != PREFIX_DATA
>               && !ins->vex.w != !(ins->used_prefixes & PREFIX_DATA)))
>         {
> -         (*ins->info->fprintf_styled_func) (ins->info->stream,
> -                                            dis_style_text, "(bad)");
> +         ins->printf (ins, dis_style_text, "(bad)");
>           return ins->end_codep - priv.the_buffer;
>         }
>        break;
> @@ -9748,24 +9839,28 @@ print_insn (bfd_vma pc, instr_info *ins)
>         if (name == NULL)
>           abort ();
>         prefix_length += strlen (name) + 1;
> -       (*ins->info->fprintf_styled_func)
> -         (ins->info->stream, dis_style_mnemonic, "%s ", name);
> +       ins->printf (ins, dis_style_mnemonic, "%s ", name);
>        }
>
>    /* Check maximum code length.  */
>    if ((ins->codep - ins->start_codep) > MAX_CODE_LENGTH)
>      {
> -      (*ins->info->fprintf_styled_func)
> -       (ins->info->stream, dis_style_text, "(bad)");
> +      ins->printf (ins, dis_style_text, "(bad)");
>        return MAX_CODE_LENGTH;
>      }
>
> -  ins->obufp = ins->mnemonicendp;
> -  for (i = strlen (ins->obuf) + prefix_length; i < 6; i++)
> -    oappend (ins, " ");
> -  oappend (ins, " ");
> -  (*ins->info->fprintf_styled_func)
> -    (ins->info->stream, dis_style_mnemonic, "%s", ins->obuf);
> +  i = strlen (ins->obuf);
> +  if (ins->mnemonicendp == ins->obuf + i)
> +    {
> +      i += prefix_length;
> +      if (i < 6)
> +       i = 6 - i + 1;
> +      else
> +       i = 1;
> +    }
> +  else
> +    i = 0;
> +  ins->printf (ins, dis_style_mnemonic, "%s%*s", ins->obuf, i, "");
>
>    /* The enter and bound instructions are printed with operands in the same
>       order as the intel book; everything else is printed in reverse order.  */
> @@ -9804,8 +9899,7 @@ print_insn (bfd_vma pc, instr_info *ins)
>      if (*op_txt[i])
>        {
>         if (needcomma)
> -         (*ins->info->fprintf_styled_func) (ins->info->stream,
> -                                            dis_style_text, ",");
> +         ins->printf (ins, dis_style_text, ",");
>         if (ins->op_index[i] != -1 && !ins->op_riprel[i])
>           {
>             bfd_vma target = (bfd_vma) ins->op_address[ins->op_index[i]];
> @@ -9821,18 +9915,14 @@ print_insn (bfd_vma pc, instr_info *ins)
>             (*ins->info->print_address_func) (target, ins->info);
>           }
>         else
> -         (*ins->info->fprintf_styled_func) (ins->info->stream,
> -                                            dis_style_text, "%s",
> -                                            op_txt[i]);
> +         ins->printf (ins, dis_style_text, "%s", op_txt[i]);
>         needcomma = 1;
>        }
>
>    for (i = 0; i < MAX_OPERANDS; i++)
>      if (ins->op_index[i] != -1 && ins->op_riprel[i])
>        {
> -       (*ins->info->fprintf_styled_func) (ins->info->stream,
> -                                          dis_style_comment_start,
> -                                          "        # ");
> +       ins->printf (ins, dis_style_comment_start, "        # ");
>         (*ins->info->print_address_func) ((bfd_vma)
>                         (ins->start_pc + (ins->codep - ins->start_codep)
>                          + ins->op_address[ins->op_index[i]]), ins->info);
> @@ -10224,8 +10314,11 @@ static void
>  OP_STi (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>         int sizeflag ATTRIBUTE_UNUSED)
>  {
> -  sprintf (ins->scratchbuf, "%%st(%d)", ins->modrm.rm);
> -  oappend_maybe_intel (ins, ins->scratchbuf);
> +  oappend_maybe_intel (ins, "%st");
> +  oappend (ins, "(");
> +  sprintf (ins->scratchbuf, "%d", ins->modrm.rm);
> +  oappend_with_style (ins, ins->scratchbuf, dis_style_immediate);
> +  oappend (ins, ")");
>  }
>
>  /* Capital letters in template are macros.  */
> @@ -10772,12 +10865,64 @@ putop (instr_info *ins, const char *in_template, int sizeflag)
>    return 0;
>  }
>
> +/* Add a style marker to *INS->obufp that encodes STYLE.  This assumes that
> +   the buffer pointed to by INS->obufp has space.  A style marker is made
> +   from the STYLE_MARKER_CHAR followed by STYLE converted to a single hex
> +   digit, followed by another STYLE_MARKER_CHAR.  This function assumes
> +   that the number of styles is not greater than 16.  */
> +
>  static void
> -oappend (instr_info *ins, const char *s)
> +oappend_insert_style (instr_info *ins, enum disassembler_style style)
>  {
> +  int num = (int) style;
> +
> +  /* We currently assume that STYLE can be encoded as a single hex
> +     character.  If more styles are added then this might start to fail,
> +     and we'll need to expand this code.  */
> +  if (num > 0xf)
> +    abort ();
> +
> +  *ins->obufp++ = STYLE_MARKER_CHAR;
> +  *ins->obufp++ = (num < 10 ? ('0' + num)
> +                  : ((num < 16) ? ('a' + (num - 10)) : '0'));
> +  *ins->obufp++ = STYLE_MARKER_CHAR;
> +
> +  /* This final null character is not strictly necessary, after inserting a
> +     style marker we should always be inserting some additional content.
> +     However, having the buffer null terminated doesn't cost much, and make
> +     it easier to debug what's going on.  Also, if we do ever forget to add
> +     any additional content after this style marker, then the buffer will
> +     still be well formed.  */
> +  *ins->obufp = '\0';
> +}
> +
> +static void
> +oappend_with_style (instr_info *ins, const char *s,
> +                   enum disassembler_style style)
> +{
> +  oappend_insert_style (ins, style);
>    ins->obufp = stpcpy (ins->obufp, s);
>  }
>
> +/* Like oappend_with_style but always with text style.  */
> +
> +static void
> +oappend (instr_info *ins, const char *s)
> +{
> +  oappend_with_style (ins, s, dis_style_text);
> +}
> +
> +/* Add a single character C to the buffer pointer to by INS->obufp, marking
> +   the style for the character as STYLE.  */
> +
> +static void
> +oappend_char (instr_info *ins, const char c, enum disassembler_style style)
> +{
> +  oappend_insert_style (ins, style);
> +  *ins->obufp++ = c;
> +  *ins->obufp = '\0';
> +}
> +
>  static void
>  append_seg (instr_info *ins)
>  {
> @@ -10789,26 +10934,27 @@ append_seg (instr_info *ins)
>    switch (ins->active_seg_prefix)
>      {
>      case PREFIX_CS:
> -      oappend_maybe_intel (ins, "%cs:");
> +      oappend_maybe_intel_with_style (ins, "%cs", dis_style_register);
>        break;
>      case PREFIX_DS:
> -      oappend_maybe_intel (ins, "%ds:");
> +      oappend_maybe_intel_with_style (ins, "%ds", dis_style_register);
>        break;
>      case PREFIX_SS:
> -      oappend_maybe_intel (ins, "%ss:");
> +      oappend_maybe_intel_with_style (ins, "%ss", dis_style_register);
>        break;
>      case PREFIX_ES:
> -      oappend_maybe_intel (ins, "%es:");
> +      oappend_maybe_intel_with_style (ins, "%es", dis_style_register);
>        break;
>      case PREFIX_FS:
> -      oappend_maybe_intel (ins, "%fs:");
> +      oappend_maybe_intel_with_style (ins, "%fs", dis_style_register);
>        break;
>      case PREFIX_GS:
> -      oappend_maybe_intel (ins, "%gs:");
> +      oappend_maybe_intel_with_style (ins, "%gs", dis_style_register);
>        break;
>      default:
>        break;
>      }
> +  oappend_char (ins, ':', dis_style_text);
>  }
>
>  static void
> @@ -11296,7 +11442,7 @@ print_register (instr_info *ins, unsigned int reg, unsigned int rexmask,
>        oappend (ins, INTERNAL_DISASSEMBLER_ERROR);
>        return;
>      }
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>  }
>
>  static void
> @@ -11560,11 +11706,15 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>               print_displacement (ins, ins->scratchbuf, disp);
>             else
>               print_operand_value (ins, ins->scratchbuf, 1, disp);
> -           oappend (ins, ins->scratchbuf);
> +           oappend_with_style (ins, ins->scratchbuf,
> +                               dis_style_address_offset);
>             if (riprel)
>               {
>                 set_op (ins, disp, 1);
> -               oappend (ins, !addr32flag ? "(%rip)" : "(%eip)");
> +               oappend_char (ins, '(', dis_style_text);
> +               oappend_with_style (ins, !addr32flag ? "%rip" : "%eip",
> +                                   dis_style_register);
> +               oappend_char (ins, ')', dis_style_text);
>               }
>           }
>
> @@ -11578,17 +11728,19 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>
>        if (havedisp || (ins->intel_syntax && riprel))
>         {
> -         *ins->obufp++ = ins->open_char;
> +         oappend_char (ins, ins->open_char, dis_style_text);
>           if (ins->intel_syntax && riprel)
>             {
>               set_op (ins, disp, 1);
> -             oappend (ins, !addr32flag ? "rip" : "eip");
> +             oappend_with_style (ins, !addr32flag ? "rip" : "eip",
> +                                 dis_style_register);
>             }
> -         *ins->obufp = '\0';
>           if (havebase)
> -           oappend_maybe_intel (ins,
> -                                (ins->address_mode == mode_64bit && !addr32flag
> -                                 ? att_names64 : att_names32)[rbase]);
> +           oappend_maybe_intel_with_style
> +             (ins,
> +              (ins->address_mode == mode_64bit && !addr32flag
> +               ? att_names64 : att_names32)[rbase],
> +              dis_style_register);
>           if (ins->has_sib)
>             {
>               /* ESP/RSP won't allow index.  If base isn't ESP/RSP,
> @@ -11599,14 +11751,12 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>                   || (havebase && base != ESP_REG_NUM))
>                 {
>                   if (!ins->intel_syntax || havebase)
> -                   {
> -                     *ins->obufp++ = ins->separator_char;
> -                     *ins->obufp = '\0';
> -                   }
> +                   oappend_char (ins, ins->separator_char, dis_style_text);
>                   if (indexes)
>                     {
>                       if (ins->address_mode == mode_64bit || vindex < 16)
> -                       oappend_maybe_intel (ins, indexes[vindex]);
> +                       oappend_maybe_intel_with_style (ins, indexes[vindex],
> +                                                       dis_style_register);
>                       else
>                         oappend (ins, "(bad)");
>                     }
> @@ -11614,26 +11764,22 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>                     oappend_maybe_intel (ins,
>                                          ins->address_mode == mode_64bit
>                                          && !addr32flag ? att_index64
> -                                                       : att_index32);
> +                                        : att_index32);
>
> -                 *ins->obufp++ = ins->scale_char;
> -                 *ins->obufp = '\0';
> +                 oappend_char (ins, ins->scale_char, dis_style_text);
>                   sprintf (ins->scratchbuf, "%d", 1 << scale);
> -                 oappend (ins, ins->scratchbuf);
> +                 oappend_with_style (ins, ins->scratchbuf,
> +                                     dis_style_immediate);
>                 }
>             }
>           if (ins->intel_syntax
>               && (disp || ins->modrm.mod != 0 || base == 5))
>             {
>               if (!havedisp || (bfd_signed_vma) disp >= 0)
> -               {
> -                 *ins->obufp++ = '+';
> -                 *ins->obufp = '\0';
> -               }
> +                 oappend_char (ins, '+', dis_style_text);
>               else if (ins->modrm.mod != 1 && disp != -disp)
>                 {
> -                 *ins->obufp++ = '-';
> -                 *ins->obufp = '\0';
> +                 oappend_char (ins, '-', dis_style_text);
>                   disp = -disp;
>                 }
>
> @@ -11644,8 +11790,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>               oappend (ins, ins->scratchbuf);
>             }
>
> -         *ins->obufp++ = ins->close_char;
> -         *ins->obufp = '\0';
> +         oappend_char (ins, ins->close_char, dis_style_text);
>
>           if (check_gather)
>             {
> @@ -11666,7 +11811,8 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>             {
>               if (!ins->active_seg_prefix)
>                 {
> -                 oappend_maybe_intel (ins, att_names_seg[ds_reg - es_reg]);
> +                 oappend_maybe_intel_with_style
> +                   (ins, att_names_seg[ds_reg - es_reg], dis_style_register);
>                   oappend (ins, ":");
>                 }
>               print_operand_value (ins, ins->scratchbuf, 1, disp);
> @@ -11722,23 +11868,17 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>
>        if (ins->modrm.mod != 0 || ins->modrm.rm != 6)
>         {
> -         *ins->obufp++ = ins->open_char;
> -         *ins->obufp = '\0';
> -         oappend (ins,
> -                  (ins->intel_syntax ? intel_index16
> -                                     : att_index16)[ins->modrm.rm]);
> +         oappend_char (ins, ins->open_char, dis_style_text);
> +         oappend (ins, (ins->intel_syntax ? intel_index16
> +                        : att_index16)[ins->modrm.rm]);
>           if (ins->intel_syntax
>               && (disp || ins->modrm.mod != 0 || ins->modrm.rm == 6))
>             {
>               if ((bfd_signed_vma) disp >= 0)
> -               {
> -                 *ins->obufp++ = '+';
> -                 *ins->obufp = '\0';
> -               }
> +               oappend_char (ins, '+', dis_style_text);
>               else if (ins->modrm.mod != 1)
>                 {
> -                 *ins->obufp++ = '-';
> -                 *ins->obufp = '\0';
> +                 oappend_char (ins, '-', dis_style_text);
>                   disp = -disp;
>                 }
>
> @@ -11746,14 +11886,14 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>               oappend (ins, ins->scratchbuf);
>             }
>
> -         *ins->obufp++ = ins->close_char;
> -         *ins->obufp = '\0';
> +         oappend_char (ins, ins->close_char, dis_style_text);
>         }
>        else if (ins->intel_syntax)
>         {
>           if (!ins->active_seg_prefix)
>             {
> -             oappend_maybe_intel (ins, att_names_seg[ds_reg - es_reg]);
> +             oappend_maybe_intel_with_style
> +               (ins, att_names_seg[ds_reg - es_reg], dis_style_register);
>               oappend (ins, ":");
>             }
>           print_operand_value (ins, ins->scratchbuf, 1, disp & 0xffff);
> @@ -11969,7 +12109,8 @@ OP_REG (instr_info *ins, int code, int sizeflag)
>      {
>      case es_reg: case ss_reg: case cs_reg:
>      case ds_reg: case fs_reg: case gs_reg:
> -      oappend_maybe_intel (ins, att_names_seg[code - es_reg]);
> +      oappend_maybe_intel_with_style
> +       (ins, att_names_seg[code - es_reg], dis_style_register);
>        return;
>      }
>
> @@ -12022,7 +12163,7 @@ OP_REG (instr_info *ins, int code, int sizeflag)
>        oappend (ins, INTERNAL_DISASSEMBLER_ERROR);
>        return;
>      }
> -  oappend_maybe_intel (ins, s);
> +  oappend_maybe_intel_with_style (ins, s, dis_style_register);
>  }
>
>  static void
> @@ -12063,7 +12204,7 @@ OP_IMREG (instr_info *ins, int code, int sizeflag)
>        oappend (ins, INTERNAL_DISASSEMBLER_ERROR);
>        return;
>      }
> -  oappend_maybe_intel (ins, s);
> +  oappend_maybe_intel_with_style (ins, s, dis_style_register);
>  }
>
>  static void
> @@ -12118,7 +12259,7 @@ OP_I (instr_info *ins, int bytemode, int sizeflag)
>    op &= mask;
>    ins->scratchbuf[0] = '$';
>    print_operand_value (ins, ins->scratchbuf + 1, 1, op);
> -  oappend_maybe_intel (ins, ins->scratchbuf);
> +  oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_immediate);
>    ins->scratchbuf[0] = '\0';
>  }
>
> @@ -12136,7 +12277,7 @@ OP_I64 (instr_info *ins, int bytemode, int sizeflag)
>
>    ins->scratchbuf[0] = '$';
>    print_operand_value (ins, ins->scratchbuf + 1, 1, get64 (ins));
> -  oappend_maybe_intel (ins, ins->scratchbuf);
> +  oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_immediate);
>    ins->scratchbuf[0] = '\0';
>  }
>
> @@ -12190,7 +12331,7 @@ OP_sI (instr_info *ins, int bytemode, int sizeflag)
>
>    ins->scratchbuf[0] = '$';
>    print_operand_value (ins, ins->scratchbuf + 1, 1, op);
> -  oappend_maybe_intel (ins, ins->scratchbuf);
> +  oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_immediate);
>  }
>
>  static void
> @@ -12248,7 +12389,8 @@ static void
>  OP_SEG (instr_info *ins, int bytemode, int sizeflag)
>  {
>    if (bytemode == w_mode)
> -    oappend_maybe_intel (ins, att_names_seg[ins->modrm.reg]);
> +    oappend_maybe_intel_with_style
> +      (ins, att_names_seg[ins->modrm.reg], dis_style_register);
>    else
>      OP_E (ins, ins->modrm.mod == 3 ? bytemode : w_mode, sizeflag);
>  }
> @@ -12294,12 +12436,13 @@ OP_OFF (instr_info *ins, int bytemode, int sizeflag)
>      {
>        if (!ins->active_seg_prefix)
>         {
> -         oappend_maybe_intel (ins, att_names_seg[ds_reg - es_reg]);
> +         oappend_maybe_intel_with_style (ins, att_names_seg[ds_reg - es_reg],
> +                                         dis_style_register);
>           oappend (ins, ":");
>         }
>      }
>    print_operand_value (ins, ins->scratchbuf, 1, off);
> -  oappend (ins, ins->scratchbuf);
> +  oappend_with_style (ins, ins->scratchbuf, dis_style_address_offset);
>  }
>
>  static void
> @@ -12324,12 +12467,14 @@ OP_OFF64 (instr_info *ins, int bytemode, int sizeflag)
>      {
>        if (!ins->active_seg_prefix)
>         {
> -         oappend_maybe_intel (ins, att_names_seg[ds_reg - es_reg]);
> +         oappend_maybe_intel_with_style (ins,
> +                                         att_names_seg[ds_reg - es_reg],
> +                                         dis_style_register);
>           oappend (ins, ":");
>         }
>      }
>    print_operand_value (ins, ins->scratchbuf, 1, off);
> -  oappend (ins, ins->scratchbuf);
> +  oappend_with_style (ins, ins->scratchbuf, dis_style_address_offset);
>  }
>
>  static void
> @@ -12350,9 +12495,8 @@ ptr_reg (instr_info *ins, int code, int sizeflag)
>      s = att_names32[code - eAX_reg];
>    else
>      s = att_names16[code - eAX_reg];
> -  oappend_maybe_intel (ins, s);
> -  *ins->obufp++ = ins->close_char;
> -  *ins->obufp = 0;
> +  oappend_maybe_intel_with_style (ins, s, dis_style_register);
> +  oappend_char (ins, ins->close_char, dis_style_text);
>  }
>
>  static void
> @@ -12375,7 +12519,8 @@ OP_ESreg (instr_info *ins, int code, int sizeflag)
>           intel_operand_size (ins, b_mode, sizeflag);
>         }
>      }
> -  oappend_maybe_intel (ins, "%es:");
> +  oappend_maybe_intel_with_style (ins, "%es", dis_style_register);
> +  oappend_char (ins, ':', dis_style_text);
>    ptr_reg (ins, code, sizeflag);
>  }
>
> @@ -12470,7 +12615,7 @@ OP_MMX (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>      }
>    else
>      names = att_names_mm;
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>  }
>
>  static void
> @@ -12543,7 +12688,7 @@ print_vector_reg (instr_info *ins, unsigned int reg, int bytemode)
>      }
>    else
>      names = att_names_xmm;
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>  }
>
>  static void
> @@ -12603,7 +12748,7 @@ OP_EM (instr_info *ins, int bytemode, int sizeflag)
>      }
>    else
>      names = att_names_mm;
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>  }
>
>  /* cvt* are the only instructions in sse2 which have
> @@ -12629,7 +12774,8 @@ OP_EMC (instr_info *ins, int bytemode, int sizeflag)
>    MODRM_CHECK;
>    ins->codep++;
>    ins->used_prefixes |= (ins->prefixes & PREFIX_DATA);
> -  oappend_maybe_intel (ins, att_names_mm[ins->modrm.rm]);
> +  oappend_maybe_intel_with_style (ins, att_names_mm[ins->modrm.rm],
> +                                 dis_style_register);
>  }
>
>  static void
> @@ -12813,7 +12959,7 @@ OP_3DNowSuffix (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>    ins->obufp = ins->mnemonicendp;
>    mnemonic = Suffix3DNow[*ins->codep++ & 0xff];
>    if (mnemonic)
> -    oappend (ins, mnemonic);
> +    ins->obufp = stpcpy (ins->obufp, mnemonic);
>    else
>      {
>        /* Since a variable sized ins->modrm/ins->sib chunk is between the start
> @@ -12959,7 +13105,7 @@ BadOp (instr_info *ins)
>  {
>    /* Throw away prefixes and 1st. opcode byte.  */
>    ins->codep = ins->insn_codep + 1;
> -  oappend (ins, "(bad)");
> +  ins->obufp = stpcpy (ins->obufp, "(bad)");
>  }
>
>  static void
> @@ -13172,7 +13318,8 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>    switch (bytemode)
>      {
>      case scalar_mode:
> -      oappend_maybe_intel (ins, att_names_xmm[reg]);
> +      oappend_maybe_intel_with_style (ins, att_names_xmm[reg],
> +                                     dis_style_register);
>        return;
>
>      case vex_vsib_d_w_dq_mode:
> @@ -13183,9 +13330,11 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>        if (ins->vex.length == 128
>           || (bytemode != vex_vsib_d_w_dq_mode
>               && !ins->vex.w))
> -       oappend_maybe_intel (ins, att_names_xmm[reg]);
> +       oappend_maybe_intel_with_style (ins, att_names_xmm[reg],
> +                                       dis_style_register);
>        else
> -       oappend_maybe_intel (ins, att_names_ymm[reg]);
> +       oappend_maybe_intel_with_style (ins, att_names_ymm[reg],
> +                                       dis_style_register);
>
>        /* All 3 XMM/YMM registers must be distinct.  */
>        modrm_reg = ins->modrm.reg;
> @@ -13217,7 +13366,8 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>           /* This must be the 3rd operand.  */
>           if (ins->obufp != ins->op_out[2])
>             abort ();
> -         oappend_maybe_intel (ins, att_names_tmm[reg]);
> +         oappend_maybe_intel_with_style (ins, att_names_tmm[reg],
> +                                         dis_style_register);
>           if (reg == ins->modrm.reg || reg == ins->modrm.rm)
>             strcpy (ins->obufp, "/(bad)");
>         }
> @@ -13292,7 +13442,7 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>        abort ();
>        break;
>      }
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>  }
>
>  static void
> @@ -13335,7 +13485,7 @@ OP_REG_VexI4 (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>    if (bytemode == x_mode && ins->vex.length == 256)
>      names = att_names_ymm;
>
> -  oappend_maybe_intel (ins, names[reg]);
> +  oappend_maybe_intel_with_style (ins, names[reg], dis_style_register);
>
>    if (ins->vex.w)
>      {
> @@ -13352,7 +13502,7 @@ OP_VexI4 (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>  {
>    ins->scratchbuf[0] = '$';
>    print_operand_value (ins, ins->scratchbuf + 1, 1, ins->codep[-1] & 0xf);
> -  oappend_maybe_intel (ins, ins->scratchbuf);
> +  oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_text);

This change isn't needed.

>  }
>
>  static void
> @@ -13397,7 +13547,7 @@ VPCMP_Fixup (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>        /* We have a reserved extension byte.  Output it directly.  */
>        ins->scratchbuf[0] = '$';
>        print_operand_value (ins, ins->scratchbuf + 1, 1, cmp_type);
> -      oappend_maybe_intel (ins, ins->scratchbuf);
> +      oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_text);

This change isn't needed.

>        ins->scratchbuf[0] = '\0';
>      }
>  }
> @@ -13449,7 +13599,7 @@ VPCOM_Fixup (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>        /* We have a reserved extension byte.  Output it directly.  */
>        ins->scratchbuf[0] = '$';
>        print_operand_value (ins, ins->scratchbuf + 1, 1, cmp_type);
> -      oappend_maybe_intel (ins, ins->scratchbuf);
> +      oappend_maybe_intel_with_style (ins, ins->scratchbuf, dis_style_text);

This change isn't needed.

>        ins->scratchbuf[0] = '\0';
>      }
>  }
> @@ -13497,7 +13647,8 @@ PCLMUL_Fixup (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
>        /* We have a reserved extension byte.  Output it directly.  */
>        ins->scratchbuf[0] = '$';
>        print_operand_value (ins, ins->scratchbuf + 1, 1, pclmul_type);
> -      oappend_maybe_intel (ins, ins->scratchbuf);
> +      oappend_maybe_intel_with_style (ins, ins->scratchbuf,
> +                                     dis_style_immediate);
>        ins->scratchbuf[0] = '\0';
>      }
>  }
> --
> 2.25.4
>


-- 
H.J.

  parent reply	other threads:[~2022-05-18 21:24 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-29 13:42 [PATCH 0/2] Disassembler styling for i386-dis.c Andrew Burgess
2022-04-29 13:42 ` [PATCH 1/2] objdump: fix styled printing of addresses Andrew Burgess
2022-05-02  7:14   ` Jan Beulich
2022-05-03  9:52     ` Andrew Burgess
2022-04-29 13:42 ` [PATCH 2/2] libopcodes: extend the styling within the i386 disassembler Andrew Burgess
2022-04-29 18:16   ` Vladimir Mezentsev
2022-05-03 13:15     ` Andrew Burgess
2022-04-29 18:57   ` H.J. Lu
2022-05-03 13:14     ` Andrew Burgess
2022-05-02  7:28   ` Jan Beulich
2022-05-03 13:12     ` Andrew Burgess
2022-05-03 15:47       ` H.J. Lu
2022-05-04  7:58       ` Jan Beulich
2022-05-09  9:48         ` Andrew Burgess
2022-05-09 12:54           ` [PATCHv2] " Andrew Burgess
2022-05-18 12:27             ` Jan Beulich
2022-05-26 12:48               ` Andrew Burgess
2022-05-18 21:23             ` H.J. Lu [this message]
2022-05-27 17:44             ` [PATCHv3] " Andrew Burgess
2022-05-30  8:19               ` Jan Beulich
2022-05-31 17:20                 ` Andrew Burgess
2022-06-01  5:59                   ` Jan Beulich
2022-06-01 15:56                     ` H.J. Lu
2022-06-08 16:03                       ` Andrew Burgess
2022-06-10 10:56               ` Jan Beulich
2022-06-10 13:01                 ` Andrew Burgess
2022-05-18  7:06           ` [PATCH 2/2] " Jan Beulich
2022-05-18 10:41             ` Andrew Burgess
2022-05-18 10:46               ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMe9rOoZHm97UQyaNDXYsnj0cftLCYhZCoRtsVKJkRrOHsEH1A@mail.gmail.com \
    --to=hjl.tools@gmail.com \
    --cc=aburgess@redhat.com \
    --cc=binutils@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).