* [PATCH 0/4] x86: follow-on to making disassembler thread-safe
@ 2022-06-10 13:33 Jan Beulich
2022-06-10 13:35 ` [PATCH 1/4] x86: properly initialize struct instr_info instance(s) Jan Beulich
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Jan Beulich @ 2022-06-10 13:33 UTC (permalink / raw)
To: Binutils
Alan had noticed ubsan fallout from that combined with my conversion
of op_riprel to bool[]. In the course of addressing that I did notice
further things which could do with improving (or so I hope).
1: properly initialize struct instr_info instance(s)
2: shrink prefix related disassembler state fields
3: avoid string copy when swapping Vex.W controlled operands
4: replace global scratch buffer
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] x86: properly initialize struct instr_info instance(s)
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
@ 2022-06-10 13:35 ` Jan Beulich
2022-06-10 13:35 ` [PATCH 2/4] x86: shrink prefix related disassembler state fields Jan Beulich
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2022-06-10 13:35 UTC (permalink / raw)
To: Binutils
Commit 39fb369834a3 ("opcodes: Make i386-dis.c thread-safe") introduced
a lot of uninitialized data. Alan has in particular observed ubsan
taking issue with the loop inverting the order of operands, where
op_riprel[] - an array of bool - can hold values other than 0 or 1.
Move instantiation of struct instr_info into print_insn() (thus having
just a single central point), and make use of C99 dedicated initializers
to fill fields right in the initializer where possible. This way all
fields not explicitly initialized will be zero-filled, which in turn
allows dropping of some other explicit initialization later in the
function or in ckprefix(). Additionally this removes a lot of
indirection, as all "ins->info" uses can simply become "info".
Make one further arrangement though, to limit the amount of data needing
(zero)initializing on every invocation: Convert the op_out structure
member to just an array of pointers, with the actual arrays living
inside print_insn() (and, as befoe, having just their 1st char filled
with nul).
While there, instead of adjusting print_insn()'s forward declaration,
arrange for no such declaration to be needed in the first place.
---
Since I'm touching quite a few instances of indirect function calls, I
was tempted to drop the explicit indirection. With C99 (and even with
earlier ANSI C) this shouldn't be needed and imo makes the code harder
to read. But then I hesitated since maybe someone really wants this
written this way?
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -42,7 +42,6 @@
#include <setjmp.h>
typedef struct instr_info instr_info;
-static int print_insn (bfd_vma, instr_info *);
static void dofloat (instr_info *, int);
static void OP_ST (instr_info *, int, int);
static void OP_STi (instr_info *, int, int);
@@ -234,7 +233,7 @@ struct instr_info
unsigned char op_ad;
signed char op_index[MAX_OPERANDS];
bool op_riprel[MAX_OPERANDS];
- char op_out[MAX_OPERANDS][100];
+ char *op_out[MAX_OPERANDS];
bfd_vma op_address[MAX_OPERANDS];
bfd_vma start_pc;
@@ -8545,22 +8544,7 @@ static int
ckprefix (instr_info *ins)
{
int newrex, i, length;
- ins->rex = 0;
- ins->prefixes = 0;
- ins->used_prefixes = 0;
- ins->rex_used = 0;
- ins->evex_used = 0;
- ins->last_lock_prefix = -1;
- ins->last_repz_prefix = -1;
- ins->last_repnz_prefix = -1;
- ins->last_data_prefix = -1;
- ins->last_addr_prefix = -1;
- ins->last_rex_prefix = -1;
- ins->last_seg_prefix = -1;
- ins->fwait_prefix = -1;
- ins->active_seg_prefix = 0;
- for (i = 0; i < (int) ARRAY_SIZE (ins->all_prefixes); i++)
- ins->all_prefixes[i] = 0;
+
i = 0;
length = 0;
/* The maximum instruction length is 15bytes. */
@@ -8769,39 +8753,6 @@ prefix_name (instr_info *ins, int pref,
}
}
-/* Here for backwards compatibility. When gdb stops using
- print_insn_i386_att and print_insn_i386_intel these functions can
- disappear, and print_insn_i386 be merged into print_insn. */
-int
-print_insn_i386_att (bfd_vma pc, disassemble_info *info)
-{
- instr_info ins;
- ins.info = info;
- ins.intel_syntax = 0;
-
- return print_insn (pc, &ins);
-}
-
-int
-print_insn_i386_intel (bfd_vma pc, disassemble_info *info)
-{
- instr_info ins;
- ins.info = info;
- ins.intel_syntax = 1;
-
- return print_insn (pc, &ins);
-}
-
-int
-print_insn_i386 (bfd_vma pc, disassemble_info *info)
-{
- instr_info ins;
- ins.info = info;
- ins.intel_syntax = -1;
-
- return print_insn (pc, &ins);
-}
-
void
print_i386_disassembler_options (FILE *stream)
{
@@ -9421,7 +9372,7 @@ i386_dis_printf (instr_info *ins, enum d
}
static int
-print_insn (bfd_vma pc, instr_info *ins)
+print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
{
const struct dis386 *dp;
int i;
@@ -9433,60 +9384,75 @@ print_insn (bfd_vma pc, instr_info *ins)
struct dis_private priv;
int prefix_length;
int op_count;
+ instr_info ins = {
+ .info = info,
+ .intel_syntax = intel_syntax >= 0
+ ? intel_syntax
+ : (info->mach & bfd_mach_i386_intel_syntax) != 0,
+ .intel_mnemonic = !SYSV386_COMPAT,
+ .op_index[0 ... MAX_OPERANDS - 1] = -1,
+ .start_pc = pc,
+ .start_codep = priv.the_buffer,
+ .codep = priv.the_buffer,
+ .obufp = ins.obuf,
+ .last_lock_prefix = -1,
+ .last_repz_prefix = -1,
+ .last_repnz_prefix = -1,
+ .last_data_prefix = -1,
+ .last_addr_prefix = -1,
+ .last_rex_prefix = -1,
+ .last_seg_prefix = -1,
+ .fwait_prefix = -1,
+ };
+ char op_out[MAX_OPERANDS][100];
- ins->isa64 = 0;
- ins->intel_mnemonic = !SYSV386_COMPAT;
- ins->op_is_jump = false;
priv.orig_sizeflag = AFLAG | DFLAG;
- if ((ins->info->mach & bfd_mach_i386_i386) != 0)
- ins->address_mode = mode_32bit;
- else if (ins->info->mach == bfd_mach_i386_i8086)
+ if ((info->mach & bfd_mach_i386_i386) != 0)
+ ins.address_mode = mode_32bit;
+ else if (info->mach == bfd_mach_i386_i8086)
{
- ins->address_mode = mode_16bit;
+ ins.address_mode = mode_16bit;
priv.orig_sizeflag = 0;
}
else
- ins->address_mode = mode_64bit;
-
- if (ins->intel_syntax == (char) -1)
- ins->intel_syntax = (ins->info->mach & bfd_mach_i386_intel_syntax) != 0;
+ ins.address_mode = mode_64bit;
- for (p = ins->info->disassembler_options; p != NULL;)
+ for (p = info->disassembler_options; p != NULL;)
{
if (startswith (p, "amd64"))
- ins->isa64 = amd64;
+ ins.isa64 = amd64;
else if (startswith (p, "intel64"))
- ins->isa64 = intel64;
+ ins.isa64 = intel64;
else if (startswith (p, "x86-64"))
{
- ins->address_mode = mode_64bit;
+ ins.address_mode = mode_64bit;
priv.orig_sizeflag |= AFLAG | DFLAG;
}
else if (startswith (p, "i386"))
{
- ins->address_mode = mode_32bit;
+ ins.address_mode = mode_32bit;
priv.orig_sizeflag |= AFLAG | DFLAG;
}
else if (startswith (p, "i8086"))
{
- ins->address_mode = mode_16bit;
+ ins.address_mode = mode_16bit;
priv.orig_sizeflag &= ~(AFLAG | DFLAG);
}
else if (startswith (p, "intel"))
{
- ins->intel_syntax = 1;
+ ins.intel_syntax = 1;
if (startswith (p + 5, "-mnemonic"))
- ins->intel_mnemonic = true;
+ ins.intel_mnemonic = true;
}
else if (startswith (p, "att"))
{
- ins->intel_syntax = 0;
+ ins.intel_syntax = 0;
if (startswith (p + 3, "-mnemonic"))
- ins->intel_mnemonic = false;
+ ins.intel_mnemonic = false;
}
else if (startswith (p, "addr"))
{
- if (ins->address_mode == mode_64bit)
+ if (ins.address_mode == mode_64bit)
{
if (p[4] == '3' && p[5] == '2')
priv.orig_sizeflag &= ~AFLAG;
@@ -9516,46 +9482,41 @@ print_insn (bfd_vma pc, instr_info *ins)
p++;
}
- if (ins->address_mode == mode_64bit && sizeof (bfd_vma) < 8)
+ if (ins.address_mode == mode_64bit && sizeof (bfd_vma) < 8)
{
- i386_dis_printf (ins, dis_style_text, _("64-bit address is disabled"));
+ i386_dis_printf (&ins, dis_style_text, _("64-bit address is disabled"));
return -1;
}
- if (ins->intel_syntax)
+ if (ins.intel_syntax)
{
- ins->open_char = '[';
- ins->close_char = ']';
- ins->separator_char = '+';
- ins->scale_char = '*';
+ ins.open_char = '[';
+ ins.close_char = ']';
+ ins.separator_char = '+';
+ ins.scale_char = '*';
}
else
{
- ins->open_char = '(';
- ins->close_char = ')';
- ins->separator_char = ',';
- ins->scale_char = ',';
+ ins.open_char = '(';
+ ins.close_char = ')';
+ ins.separator_char = ',';
+ ins.scale_char = ',';
}
/* The output looks better if we put 7 bytes on a line, since that
puts most long word instructions on a single line. */
- ins->info->bytes_per_line = 7;
+ info->bytes_per_line = 7;
- ins->info->private_data = &priv;
+ info->private_data = &priv;
priv.max_fetched = priv.the_buffer;
priv.insn_start = pc;
- ins->obuf[0] = 0;
for (i = 0; i < MAX_OPERANDS; ++i)
{
- ins->op_out[i][0] = 0;
- ins->op_index[i] = -1;
+ op_out[i][0] = 0;
+ ins.op_out[i] = op_out[i];
}
- ins->start_pc = pc;
- ins->start_codep = priv.the_buffer;
- ins->codep = priv.the_buffer;
-
if (OPCODES_SIGSETJMP (priv.bailout) != 0)
{
const char *name;
@@ -9563,17 +9524,17 @@ print_insn (bfd_vma pc, instr_info *ins)
/* Getting here means we tried for data but didn't get it. That
means we have an incomplete instruction of some sort. Just
print the first byte as a prefix or a .byte pseudo-op. */
- if (ins->codep > priv.the_buffer)
+ if (ins.codep > priv.the_buffer)
{
- name = prefix_name (ins, priv.the_buffer[0], priv.orig_sizeflag);
+ name = prefix_name (&ins, priv.the_buffer[0], priv.orig_sizeflag);
if (name != NULL)
- i386_dis_printf (ins, dis_style_mnemonic, "%s", name);
+ i386_dis_printf (&ins, dis_style_mnemonic, "%s", name);
else
{
/* Just print the first byte as a .byte instruction. */
- i386_dis_printf (ins, dis_style_assembler_directive,
+ i386_dis_printf (&ins, dis_style_assembler_directive,
".byte ");
- i386_dis_printf (ins, dis_style_immediate, "0x%x",
+ i386_dis_printf (&ins, dis_style_immediate, "0x%x",
(unsigned int) priv.the_buffer[0]);
}
@@ -9583,134 +9544,129 @@ print_insn (bfd_vma pc, instr_info *ins)
return -1;
}
- ins->obufp = ins->obuf;
sizeflag = priv.orig_sizeflag;
- if (!ckprefix (ins) || ins->rex_used)
+ if (!ckprefix (&ins) || ins.rex_used)
{
- /* Too many ins->prefixes or unused REX ins->prefixes. */
+ /* Too many prefixes or unused REX prefixes. */
for (i = 0;
- i < (int) ARRAY_SIZE (ins->all_prefixes) && ins->all_prefixes[i];
+ i < (int) ARRAY_SIZE (ins.all_prefixes) && ins.all_prefixes[i];
i++)
- i386_dis_printf (ins, dis_style_mnemonic, "%s%s",
+ i386_dis_printf (&ins, dis_style_mnemonic, "%s%s",
(i == 0 ? "" : " "),
- prefix_name (ins, ins->all_prefixes[i], sizeflag));
+ prefix_name (&ins, ins.all_prefixes[i], sizeflag));
return i;
}
- ins->insn_codep = ins->codep;
+ ins.insn_codep = ins.codep;
- FETCH_DATA (ins->info, ins->codep + 1);
- ins->two_source_ops = (*ins->codep == 0x62) || (*ins->codep == 0xc8);
+ FETCH_DATA (info, ins.codep + 1);
+ ins.two_source_ops = (*ins.codep == 0x62) || (*ins.codep == 0xc8);
- if (((ins->prefixes & PREFIX_FWAIT)
- && ((*ins->codep < 0xd8) || (*ins->codep > 0xdf))))
+ if (((ins.prefixes & PREFIX_FWAIT)
+ && ((*ins.codep < 0xd8) || (*ins.codep > 0xdf))))
{
- /* Handle ins->prefixes before fwait. */
- for (i = 0; i < ins->fwait_prefix && ins->all_prefixes[i];
+ /* Handle ins.prefixes before fwait. */
+ for (i = 0; i < ins.fwait_prefix && ins.all_prefixes[i];
i++)
- i386_dis_printf (ins, dis_style_mnemonic, "%s ",
- prefix_name (ins, ins->all_prefixes[i], sizeflag));
- i386_dis_printf (ins, dis_style_mnemonic, "fwait");
+ i386_dis_printf (&ins, dis_style_mnemonic, "%s ",
+ prefix_name (&ins, ins.all_prefixes[i], sizeflag));
+ i386_dis_printf (&ins, dis_style_mnemonic, "fwait");
return i + 1;
}
- if (*ins->codep == 0x0f)
+ if (*ins.codep == 0x0f)
{
unsigned char threebyte;
- ins->codep++;
- FETCH_DATA (ins->info, ins->codep + 1);
- threebyte = *ins->codep;
+ ins.codep++;
+ FETCH_DATA (info, ins.codep + 1);
+ threebyte = *ins.codep;
dp = &dis386_twobyte[threebyte];
- ins->need_modrm = twobyte_has_modrm[threebyte];
- ins->codep++;
+ ins.need_modrm = twobyte_has_modrm[threebyte];
+ ins.codep++;
}
else
{
- dp = &dis386[*ins->codep];
- ins->need_modrm = onebyte_has_modrm[*ins->codep];
- ins->codep++;
+ dp = &dis386[*ins.codep];
+ ins.need_modrm = onebyte_has_modrm[*ins.codep];
+ ins.codep++;
}
- /* Save sizeflag for printing the extra ins->prefixes later before updating
+ /* Save sizeflag for printing the extra ins.prefixes later before updating
it for mnemonic and operand processing. The prefix names depend
only on the address mode. */
orig_sizeflag = sizeflag;
- if (ins->prefixes & PREFIX_ADDR)
+ if (ins.prefixes & PREFIX_ADDR)
sizeflag ^= AFLAG;
- if ((ins->prefixes & PREFIX_DATA))
+ if ((ins.prefixes & PREFIX_DATA))
sizeflag ^= DFLAG;
- ins->end_codep = ins->codep;
- if (ins->need_modrm)
+ ins.end_codep = ins.codep;
+ if (ins.need_modrm)
{
- FETCH_DATA (ins->info, ins->codep + 1);
- ins->modrm.mod = (*ins->codep >> 6) & 3;
- ins->modrm.reg = (*ins->codep >> 3) & 7;
- ins->modrm.rm = *ins->codep & 7;
+ FETCH_DATA (info, ins.codep + 1);
+ ins.modrm.mod = (*ins.codep >> 6) & 3;
+ ins.modrm.reg = (*ins.codep >> 3) & 7;
+ ins.modrm.rm = *ins.codep & 7;
}
- else
- memset (&ins->modrm, 0, sizeof (ins->modrm));
-
- ins->need_vex = false;
- memset (&ins->vex, 0, sizeof (ins->vex));
if (dp->name == NULL && dp->op[0].bytemode == FLOATCODE)
{
- get_sib (ins, sizeflag);
- dofloat (ins, sizeflag);
+ get_sib (&ins, sizeflag);
+ dofloat (&ins, sizeflag);
}
else
{
- dp = get_valid_dis386 (dp, ins);
- if (dp != NULL && putop (ins, dp->name, sizeflag) == 0)
+ dp = get_valid_dis386 (dp, &ins);
+ if (dp != NULL && putop (&ins, dp->name, sizeflag) == 0)
{
- get_sib (ins, sizeflag);
+ get_sib (&ins, sizeflag);
for (i = 0; i < MAX_OPERANDS; ++i)
{
- ins->obufp = ins->op_out[i];
- ins->op_ad = MAX_OPERANDS - 1 - i;
+ ins.obufp = ins.op_out[i];
+ ins.op_ad = MAX_OPERANDS - 1 - i;
if (dp->op[i].rtn)
- (*dp->op[i].rtn) (ins, dp->op[i].bytemode, sizeflag);
+ (*dp->op[i].rtn) (&ins, dp->op[i].bytemode, sizeflag);
/* For EVEX instruction after the last operand masking
should be printed. */
- if (i == 0 && ins->vex.evex)
+ if (i == 0 && ins.vex.evex)
{
/* Don't print {%k0}. */
- if (ins->vex.mask_register_specifier)
+ if (ins.vex.mask_register_specifier)
{
const char *reg_name
- = att_names_mask[ins->vex.mask_register_specifier];
- oappend (ins, "{");
- oappend_register (ins, reg_name);
- oappend (ins, "}");
+ = att_names_mask[ins.vex.mask_register_specifier];
+
+ oappend (&ins, "{");
+ oappend_register (&ins, reg_name);
+ oappend (&ins, "}");
}
- if (ins->vex.zeroing)
- oappend (ins, "{z}");
+ if (ins.vex.zeroing)
+ oappend (&ins, "{z}");
/* S/G insns require a mask and don't allow
zeroing-masking. */
if ((dp->op[0].bytemode == vex_vsib_d_w_dq_mode
|| dp->op[0].bytemode == vex_vsib_q_w_dq_mode)
- && (ins->vex.mask_register_specifier == 0
- || ins->vex.zeroing))
- oappend (ins, "/(bad)");
+ && (ins.vex.mask_register_specifier == 0
+ || ins.vex.zeroing))
+ oappend (&ins, "/(bad)");
}
}
/* Check whether rounding control was enabled for an insn not
supporting it. */
- if (ins->modrm.mod == 3 && ins->vex.b
- && !(ins->evex_used & EVEX_b_used))
+ if (ins.modrm.mod == 3 && ins.vex.b
+ && !(ins.evex_used & EVEX_b_used))
{
for (i = 0; i < MAX_OPERANDS; ++i)
{
- ins->obufp = ins->op_out[i];
- if (*ins->obufp)
+ ins.obufp = ins.op_out[i];
+ if (*ins.obufp)
continue;
- oappend (ins, names_rounding[ins->vex.ll]);
- oappend (ins, "bad}");
+ oappend (&ins, names_rounding[ins.vex.ll]);
+ oappend (&ins, "bad}");
break;
}
}
@@ -9718,15 +9674,15 @@ print_insn (bfd_vma pc, instr_info *ins)
}
/* Clear instruction information. */
- ins->info->insn_info_valid = 0;
- ins->info->branch_delay_insns = 0;
- ins->info->data_size = 0;
- ins->info->insn_type = dis_noninsn;
- ins->info->target = 0;
- ins->info->target2 = 0;
+ info->insn_info_valid = 0;
+ info->branch_delay_insns = 0;
+ info->data_size = 0;
+ info->insn_type = dis_noninsn;
+ info->target = 0;
+ info->target2 = 0;
/* Reset jump operation indicator. */
- ins->op_is_jump = false;
+ ins.op_is_jump = false;
{
int jump_detection = 0;
@@ -9747,28 +9703,28 @@ print_insn (bfd_vma pc, instr_info *ins)
/* Determine if this is a jump or branch. */
if ((jump_detection & 0x3) == 0x3)
{
- ins->op_is_jump = true;
+ ins.op_is_jump = true;
if (jump_detection & 0x4)
- ins->info->insn_type = dis_condbranch;
+ info->insn_type = dis_condbranch;
else
- ins->info->insn_type = (dp->name && !strncmp (dp->name, "call", 4))
+ info->insn_type = (dp->name && !strncmp (dp->name, "call", 4))
? dis_jsr : dis_branch;
}
}
/* If VEX.vvvv and EVEX.vvvv are unused, they must be all 1s, which
are all 0s in inverted form. */
- if (ins->need_vex && ins->vex.register_specifier != 0)
+ if (ins.need_vex && ins.vex.register_specifier != 0)
{
- i386_dis_printf (ins, dis_style_text, "(bad)");
- return ins->end_codep - priv.the_buffer;
+ i386_dis_printf (&ins, dis_style_text, "(bad)");
+ return ins.end_codep - priv.the_buffer;
}
/* If EVEX.z is set, there must be an actual mask register in use. */
- if (ins->vex.zeroing && ins->vex.mask_register_specifier == 0)
+ if (ins.vex.zeroing && ins.vex.mask_register_specifier == 0)
{
- i386_dis_printf (ins, dis_style_text, "(bad)");
- return ins->end_codep - priv.the_buffer;
+ i386_dis_printf (&ins, dis_style_text, "(bad)");
+ return ins.end_codep - priv.the_buffer;
}
switch (dp->prefix_requirement)
@@ -9776,12 +9732,12 @@ print_insn (bfd_vma pc, instr_info *ins)
case PREFIX_DATA:
/* If only the data prefix is marked as mandatory, its absence renders
the encoding invalid. Most other PREFIX_OPCODE rules still apply. */
- if (ins->need_vex ? !ins->vex.prefix : !(ins->prefixes & PREFIX_DATA))
+ if (ins.need_vex ? !ins.vex.prefix : !(ins.prefixes & PREFIX_DATA))
{
- i386_dis_printf (ins, dis_style_text, "(bad)");
- return ins->end_codep - priv.the_buffer;
+ i386_dis_printf (&ins, dis_style_text, "(bad)");
+ return ins.end_codep - priv.the_buffer;
}
- ins->used_prefixes |= PREFIX_DATA;
+ ins.used_prefixes |= PREFIX_DATA;
/* Fall through. */
case PREFIX_OPCODE:
/* If the mandatory PREFIX_REPZ/PREFIX_REPNZ/PREFIX_DATA prefix is
@@ -9789,79 +9745,79 @@ print_insn (bfd_vma pc, instr_info *ins)
used by putop and MMX/SSE operand and may be overridden by the
PREFIX_REPZ/PREFIX_REPNZ fix, we check the PREFIX_DATA prefix
separately. */
- if (((ins->need_vex
- ? ins->vex.prefix == REPE_PREFIX_OPCODE
- || ins->vex.prefix == REPNE_PREFIX_OPCODE
- : (ins->prefixes
+ if (((ins.need_vex
+ ? ins.vex.prefix == REPE_PREFIX_OPCODE
+ || ins.vex.prefix == REPNE_PREFIX_OPCODE
+ : (ins.prefixes
& (PREFIX_REPZ | PREFIX_REPNZ)) != 0)
- && (ins->used_prefixes
+ && (ins.used_prefixes
& (PREFIX_REPZ | PREFIX_REPNZ)) == 0)
- || (((ins->need_vex
- ? ins->vex.prefix == DATA_PREFIX_OPCODE
- : ((ins->prefixes
+ || (((ins.need_vex
+ ? ins.vex.prefix == DATA_PREFIX_OPCODE
+ : ((ins.prefixes
& (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))
== PREFIX_DATA))
- && (ins->used_prefixes & PREFIX_DATA) == 0))
- || (ins->vex.evex && dp->prefix_requirement != PREFIX_DATA
- && !ins->vex.w != !(ins->used_prefixes & PREFIX_DATA)))
+ && (ins.used_prefixes & PREFIX_DATA) == 0))
+ || (ins.vex.evex && dp->prefix_requirement != PREFIX_DATA
+ && !ins.vex.w != !(ins.used_prefixes & PREFIX_DATA)))
{
- i386_dis_printf (ins, dis_style_text, "(bad)");
- return ins->end_codep - priv.the_buffer;
+ i386_dis_printf (&ins, dis_style_text, "(bad)");
+ return ins.end_codep - priv.the_buffer;
}
break;
case PREFIX_IGNORED:
/* Zap data size and rep prefixes from used_prefixes and reinstate their
origins in all_prefixes. */
- ins->used_prefixes &= ~PREFIX_OPCODE;
- if (ins->last_data_prefix >= 0)
- ins->all_prefixes[ins->last_data_prefix] = 0x66;
- if (ins->last_repz_prefix >= 0)
- ins->all_prefixes[ins->last_repz_prefix] = 0xf3;
- if (ins->last_repnz_prefix >= 0)
- ins->all_prefixes[ins->last_repnz_prefix] = 0xf2;
+ ins.used_prefixes &= ~PREFIX_OPCODE;
+ if (ins.last_data_prefix >= 0)
+ ins.all_prefixes[ins.last_data_prefix] = 0x66;
+ if (ins.last_repz_prefix >= 0)
+ ins.all_prefixes[ins.last_repz_prefix] = 0xf3;
+ if (ins.last_repnz_prefix >= 0)
+ ins.all_prefixes[ins.last_repnz_prefix] = 0xf2;
break;
}
/* Check if the REX prefix is used. */
- if ((ins->rex ^ ins->rex_used) == 0
- && !ins->need_vex && ins->last_rex_prefix >= 0)
- ins->all_prefixes[ins->last_rex_prefix] = 0;
+ if ((ins.rex ^ ins.rex_used) == 0
+ && !ins.need_vex && ins.last_rex_prefix >= 0)
+ ins.all_prefixes[ins.last_rex_prefix] = 0;
/* Check if the SEG prefix is used. */
- if ((ins->prefixes & (PREFIX_CS | PREFIX_SS | PREFIX_DS | PREFIX_ES
- | PREFIX_FS | PREFIX_GS)) != 0
- && (ins->used_prefixes & ins->active_seg_prefix) != 0)
- ins->all_prefixes[ins->last_seg_prefix] = 0;
+ if ((ins.prefixes & (PREFIX_CS | PREFIX_SS | PREFIX_DS | PREFIX_ES
+ | PREFIX_FS | PREFIX_GS)) != 0
+ && (ins.used_prefixes & ins.active_seg_prefix) != 0)
+ ins.all_prefixes[ins.last_seg_prefix] = 0;
/* Check if the ADDR prefix is used. */
- if ((ins->prefixes & PREFIX_ADDR) != 0
- && (ins->used_prefixes & PREFIX_ADDR) != 0)
- ins->all_prefixes[ins->last_addr_prefix] = 0;
+ if ((ins.prefixes & PREFIX_ADDR) != 0
+ && (ins.used_prefixes & PREFIX_ADDR) != 0)
+ ins.all_prefixes[ins.last_addr_prefix] = 0;
/* Check if the DATA prefix is used. */
- if ((ins->prefixes & PREFIX_DATA) != 0
- && (ins->used_prefixes & PREFIX_DATA) != 0
- && !ins->need_vex)
- ins->all_prefixes[ins->last_data_prefix] = 0;
+ if ((ins.prefixes & PREFIX_DATA) != 0
+ && (ins.used_prefixes & PREFIX_DATA) != 0
+ && !ins.need_vex)
+ ins.all_prefixes[ins.last_data_prefix] = 0;
- /* Print the extra ins->prefixes. */
+ /* Print the extra ins.prefixes. */
prefix_length = 0;
- for (i = 0; i < (int) ARRAY_SIZE (ins->all_prefixes); i++)
- if (ins->all_prefixes[i])
+ for (i = 0; i < (int) ARRAY_SIZE (ins.all_prefixes); i++)
+ if (ins.all_prefixes[i])
{
const char *name;
- name = prefix_name (ins, ins->all_prefixes[i], orig_sizeflag);
+ name = prefix_name (&ins, ins.all_prefixes[i], orig_sizeflag);
if (name == NULL)
abort ();
prefix_length += strlen (name) + 1;
- i386_dis_printf (ins, dis_style_mnemonic, "%s ", name);
+ i386_dis_printf (&ins, dis_style_mnemonic, "%s ", name);
}
/* Check maximum code length. */
- if ((ins->codep - ins->start_codep) > MAX_CODE_LENGTH)
+ if ((ins.codep - ins.start_codep) > MAX_CODE_LENGTH)
{
- i386_dis_printf (ins, dis_style_text, "(bad)");
+ i386_dis_printf (&ins, dis_style_text, "(bad)");
return MAX_CODE_LENGTH;
}
@@ -9872,10 +9828,10 @@ print_insn (bfd_vma pc, instr_info *ins)
++op_count;
/* Calculate the number of spaces to print after the mnemonic. */
- ins->obufp = ins->mnemonicendp;
+ ins.obufp = ins.mnemonicendp;
if (op_count > 0)
{
- i = strlen (ins->obuf) + prefix_length;
+ i = strlen (ins.obuf) + prefix_length;
if (i < 7)
i = 7 - i;
else
@@ -9885,21 +9841,21 @@ print_insn (bfd_vma pc, instr_info *ins)
i = 0;
/* Print the instruction mnemonic along with any trailing whitespace. */
- i386_dis_printf (ins, dis_style_mnemonic, "%s%*s", ins->obuf, i, "");
+ i386_dis_printf (&ins, dis_style_mnemonic, "%s%*s", ins.obuf, i, "");
/* The enter and bound instructions are printed with operands in the same
order as the intel book; everything else is printed in reverse order. */
intel_swap_2_3 = false;
- if (ins->intel_syntax || ins->two_source_ops)
+ if (ins.intel_syntax || ins.two_source_ops)
{
for (i = 0; i < MAX_OPERANDS; ++i)
- op_txt[i] = ins->op_out[i];
+ op_txt[i] = ins.op_out[i];
- if (ins->intel_syntax && dp && dp->op[2].rtn == OP_Rounding
+ if (ins.intel_syntax && dp && dp->op[2].rtn == OP_Rounding
&& dp->op[3].rtn == OP_E && dp->op[4].rtn == NULL)
{
- op_txt[2] = ins->op_out[3];
- op_txt[3] = ins->op_out[2];
+ op_txt[2] = ins.op_out[3];
+ op_txt[3] = ins.op_out[2];
intel_swap_2_3 = true;
}
@@ -9907,18 +9863,18 @@ print_insn (bfd_vma pc, instr_info *ins)
{
bool riprel;
- ins->op_ad = ins->op_index[i];
- ins->op_index[i] = ins->op_index[MAX_OPERANDS - 1 - i];
- ins->op_index[MAX_OPERANDS - 1 - i] = ins->op_ad;
- riprel = ins->op_riprel[i];
- ins->op_riprel[i] = ins->op_riprel[MAX_OPERANDS - 1 - i];
- ins->op_riprel[MAX_OPERANDS - 1 - i] = riprel;
+ ins.op_ad = ins.op_index[i];
+ ins.op_index[i] = ins.op_index[MAX_OPERANDS - 1 - i];
+ ins.op_index[MAX_OPERANDS - 1 - i] = ins.op_ad;
+ riprel = ins.op_riprel[i];
+ ins.op_riprel[i] = ins.op_riprel[MAX_OPERANDS - 1 - i];
+ ins.op_riprel[MAX_OPERANDS - 1 - i] = riprel;
}
}
else
{
for (i = 0; i < MAX_OPERANDS; ++i)
- op_txt[MAX_OPERANDS - 1 - i] = ins->op_out[i];
+ op_txt[MAX_OPERANDS - 1 - i] = ins.op_out[i];
}
needcomma = 0;
@@ -9928,7 +9884,7 @@ print_insn (bfd_vma pc, instr_info *ins)
/* In Intel syntax embedded rounding / SAE are not separate operands.
Instead they're attached to the prior register operand. Simply
suppress emission of the comma to achieve that effect. */
- switch (i & -(ins->intel_syntax && dp))
+ switch (i & -(ins.intel_syntax && dp))
{
case 2:
if (dp->op[2].rtn == OP_Rounding && !intel_swap_2_3)
@@ -9940,36 +9896,58 @@ print_insn (bfd_vma pc, instr_info *ins)
break;
}
if (needcomma)
- i386_dis_printf (ins, dis_style_text, ",");
- if (ins->op_index[i] != -1 && !ins->op_riprel[i])
+ i386_dis_printf (&ins, dis_style_text, ",");
+ if (ins.op_index[i] != -1 && !ins.op_riprel[i])
{
- bfd_vma target = (bfd_vma) ins->op_address[ins->op_index[i]];
+ bfd_vma target = (bfd_vma) ins.op_address[ins.op_index[i]];
- if (ins->op_is_jump)
+ if (ins.op_is_jump)
{
- ins->info->insn_info_valid = 1;
- ins->info->branch_delay_insns = 0;
- ins->info->data_size = 0;
- ins->info->target = target;
- ins->info->target2 = 0;
+ info->insn_info_valid = 1;
+ info->branch_delay_insns = 0;
+ info->data_size = 0;
+ info->target = target;
+ info->target2 = 0;
}
- (*ins->info->print_address_func) (target, ins->info);
+ (*info->print_address_func) (target, info);
}
else
- i386_dis_printf (ins, dis_style_text, "%s", op_txt[i]);
+ i386_dis_printf (&ins, dis_style_text, "%s", op_txt[i]);
needcomma = 1;
}
for (i = 0; i < MAX_OPERANDS; i++)
- if (ins->op_index[i] != -1 && ins->op_riprel[i])
+ if (ins.op_index[i] != -1 && ins.op_riprel[i])
{
- i386_dis_printf (ins, dis_style_comment_start, " # ");
- (*ins->info->print_address_func) ((bfd_vma)
- (ins->start_pc + (ins->codep - ins->start_codep)
- + ins->op_address[ins->op_index[i]]), ins->info);
+ i386_dis_printf (&ins, dis_style_comment_start, " # ");
+ (*info->print_address_func)
+ ((bfd_vma)(ins.start_pc + (ins.codep - ins.start_codep)
+ + ins.op_address[ins.op_index[i]]),
+ info);
break;
}
- return ins->codep - priv.the_buffer;
+ return ins.codep - priv.the_buffer;
+}
+
+/* Here for backwards compatibility. When gdb stops using
+ print_insn_i386_att and print_insn_i386_intel these functions can
+ disappear, and print_insn_i386 be merged into print_insn. */
+int
+print_insn_i386_att (bfd_vma pc, disassemble_info *info)
+{
+ return print_insn (pc, info, 0);
+}
+
+int
+print_insn_i386_intel (bfd_vma pc, disassemble_info *info)
+{
+ return print_insn (pc, info, 1);
+}
+
+int
+print_insn_i386 (bfd_vma pc, disassemble_info *info)
+{
+ return print_insn (pc, info, -1);
}
static const char *float_mem[] = {
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/4] x86: shrink prefix related disassembler state fields
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
2022-06-10 13:35 ` [PATCH 1/4] x86: properly initialize struct instr_info instance(s) Jan Beulich
@ 2022-06-10 13:35 ` Jan Beulich
2022-06-10 13:35 ` [PATCH 3/4] x86: avoid string copy when swapping Vex.W controlled operands Jan Beulich
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2022-06-10 13:35 UTC (permalink / raw)
To: Binutils
By changing the values used for "artificial" prefix values,
all_prefixes[] can be shrunk to array of unsigned char. All that
additionally needs adjusting is the printing of possible apparently
standalone prefixes when recovering from longjmp(): Simply check
whether any prefixes were successfully decoded, to avoid converting
opcode bytes matching the "artificial" values to prefix mnemonics.
Similarly by re-arranging the bits assigned to PREFIX_* mask values
we can fit all segment register masks in a byte and hence shrink
active_seg_prefix to unsigned char.
Somewhat similarly with last_*_prefix representing offsets into the
opcode being disassembled, signed char is sufficient to hold all possible
values.
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -175,21 +175,21 @@ struct instr_info
unsigned char *insn_codep;
unsigned char *codep;
unsigned char *end_codep;
- int last_lock_prefix;
- int last_repz_prefix;
- int last_repnz_prefix;
- int last_data_prefix;
- int last_addr_prefix;
- int last_rex_prefix;
- int last_seg_prefix;
- int fwait_prefix;
+ signed char last_lock_prefix;
+ signed char last_repz_prefix;
+ signed char last_repnz_prefix;
+ signed char last_data_prefix;
+ signed char last_addr_prefix;
+ signed char last_rex_prefix;
+ signed char last_seg_prefix;
+ signed char fwait_prefix;
/* The active segment register prefix. */
- int active_seg_prefix;
+ unsigned char active_seg_prefix;
#define MAX_CODE_LENGTH 15
/* We can up to 14 ins->prefixes since the maximum instruction length is
15bytes. */
- int all_prefixes[MAX_CODE_LENGTH - 1];
+ unsigned char all_prefixes[MAX_CODE_LENGTH - 1];
disassemble_info *info;
struct
@@ -276,13 +276,13 @@ struct instr_info
/* Flags stored in PREFIXES. */
#define PREFIX_REPZ 1
#define PREFIX_REPNZ 2
-#define PREFIX_LOCK 4
-#define PREFIX_CS 8
-#define PREFIX_SS 0x10
-#define PREFIX_DS 0x20
-#define PREFIX_ES 0x40
-#define PREFIX_FS 0x80
-#define PREFIX_GS 0x100
+#define PREFIX_CS 4
+#define PREFIX_SS 8
+#define PREFIX_DS 0x10
+#define PREFIX_ES 0x20
+#define PREFIX_FS 0x40
+#define PREFIX_GS 0x80
+#define PREFIX_LOCK 0x100
#define PREFIX_DATA 0x200
#define PREFIX_ADDR 0x400
#define PREFIX_FWAIT 0x800
@@ -8532,13 +8532,13 @@ static const struct dis386 rm_table[][8]
#define INTERNAL_DISASSEMBLER_ERROR _("<internal disassembler error>")
-/* We use the high bit to indicate different name for the same
- prefix. */
-#define REP_PREFIX (0xf3 | 0x100)
-#define XACQUIRE_PREFIX (0xf2 | 0x200)
-#define XRELEASE_PREFIX (0xf3 | 0x400)
-#define BND_PREFIX (0xf2 | 0x400)
-#define NOTRACK_PREFIX (0x3e | 0x100)
+/* The values used here must be non-zero, fit in 'unsigned char', and not be
+ in conflict with actual prefix opcodes. */
+#define REP_PREFIX 0x01
+#define XACQUIRE_PREFIX 0x02
+#define XRELEASE_PREFIX 0x03
+#define BND_PREFIX 0x04
+#define NOTRACK_PREFIX 0x05
static int
ckprefix (instr_info *ins)
@@ -9519,14 +9519,15 @@ print_insn (bfd_vma pc, disassemble_info
if (OPCODES_SIGSETJMP (priv.bailout) != 0)
{
- const char *name;
-
/* Getting here means we tried for data but didn't get it. That
means we have an incomplete instruction of some sort. Just
print the first byte as a prefix or a .byte pseudo-op. */
if (ins.codep > priv.the_buffer)
{
- name = prefix_name (&ins, priv.the_buffer[0], priv.orig_sizeflag);
+ const char *name = NULL;
+
+ if (ins.prefixes || ins.fwait_prefix >= 0 || (ins.rex & REX_OPCODE))
+ name = prefix_name (&ins, priv.the_buffer[0], priv.orig_sizeflag);
if (name != NULL)
i386_dis_printf (&ins, dis_style_mnemonic, "%s", name);
else
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/4] x86: avoid string copy when swapping Vex.W controlled operands
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
2022-06-10 13:35 ` [PATCH 1/4] x86: properly initialize struct instr_info instance(s) Jan Beulich
2022-06-10 13:35 ` [PATCH 2/4] x86: shrink prefix related disassembler state fields Jan Beulich
@ 2022-06-10 13:35 ` Jan Beulich
2022-06-10 13:36 ` [PATCH 4/4] x86: replace global scratch buffer Jan Beulich
2022-06-10 19:43 ` [PATCH 0/4] x86: follow-on to making disassembler thread-safe H.J. Lu
4 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2022-06-10 13:35 UTC (permalink / raw)
To: Binutils
Now that op_out[] is an array of pointers, there's no need anymore to
copy strings. Simply swap the pointers.
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -13473,9 +13473,10 @@ OP_VexW (instr_info *ins, int bytemode,
if (ins->vex.w)
{
/* Swap 2nd and 3rd operands. */
- strcpy (ins->scratchbuf, ins->op_out[2]);
- strcpy (ins->op_out[2], ins->op_out[1]);
- strcpy (ins->op_out[1], ins->scratchbuf);
+ char *tmp = ins->op_out[2];
+
+ ins->op_out[2] = ins->op_out[1];
+ ins->op_out[1] = tmp;
}
}
@@ -13503,9 +13504,10 @@ OP_REG_VexI4 (instr_info *ins, int bytem
if (ins->vex.w)
{
/* Swap 3rd and 4th operands. */
- strcpy (ins->scratchbuf, ins->op_out[3]);
- strcpy (ins->op_out[3], ins->op_out[2]);
- strcpy (ins->op_out[2], ins->scratchbuf);
+ char *tmp = ins->op_out[3];
+
+ ins->op_out[3] = ins->op_out[2];
+ ins->op_out[2] = tmp;
}
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 4/4] x86: replace global scratch buffer
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
` (2 preceding siblings ...)
2022-06-10 13:35 ` [PATCH 3/4] x86: avoid string copy when swapping Vex.W controlled operands Jan Beulich
@ 2022-06-10 13:36 ` Jan Beulich
2022-06-13 13:59 ` H.J. Lu
2022-06-10 19:43 ` [PATCH 0/4] x86: follow-on to making disassembler thread-safe H.J. Lu
4 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2022-06-10 13:36 UTC (permalink / raw)
To: Binutils; +Cc: H.J. Lu, Andrew Burgess
With its movement to the stack, and with the subsequent desire to
initialize the entire instr_info instances, this has become doubly
inefficient. Individual users have better knowledge of how big a buffer
they need, and in a number of cases going through an intermediate buffer
can be avoided altogether.
Having got confirmation that it wasn't intentional to print memory
operand displacements with inconsistent style, print_displacement() is
now using dis_style_address_offset consistently (eliminating the need
for callers to pass in a style).
While touching print_operand_value() also convert its "hex" parameter to
bool. And while altering (and moving) oappend_immediate(), fold
oappend_maybe_intel_with_style() into its only remaining caller. Finally
where doing adjustments, use snprintf() in favor of sprintf().
---
While doing the conversion I came to notice that print_operand_value()'s
"hex" parameter has only ever passed "true" to it. I wonder why this
parameter still exists.
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -51,9 +51,7 @@ static void oappend_with_style (instr_in
static void oappend (instr_info *, const char *);
static void append_seg (instr_info *);
static void OP_indirE (instr_info *, int, int);
-static void print_operand_value (instr_info *, char *, int, bfd_vma);
static void OP_E_memory (instr_info *, int, int);
-static void print_displacement (instr_info *, char *, bfd_vma);
static void OP_E (instr_info *, int, int);
static void OP_G (instr_info *, int, int);
static bfd_vma get64 (instr_info *);
@@ -170,7 +168,6 @@ struct instr_info
char obuf[100];
char *obufp;
char *mnemonicendp;
- char scratchbuf[100];
unsigned char *start_codep;
unsigned char *insn_codep;
unsigned char *codep;
@@ -9254,37 +9251,13 @@ get_sib (instr_info *ins, int sizeflag)
ins->has_sib = false;
}
-/* Like oappend (below), but S is a string starting with '%' or '$'. In
- Intel syntax, the '%' or '$' is elided. STYLE is used when displaying
- this part of the output in the disassembler.
-
- This function should not be used directly from the general disassembler
- code, instead the helpers oappend_register and oappend_immediate should
- be called as appropriate. */
-
-static void
-oappend_maybe_intel_with_style (instr_info *ins, const char *s,
- enum disassembler_style style)
-{
- oappend_with_style (ins, s + ins->intel_syntax, style);
-}
-
-/* Like oappend_maybe_intel_with_style above, but called when S is the
- name of a register. */
+/* Like oappend (below), but S is a string starting with '%'. In
+ Intel syntax, the '%' is elided. */
static void
oappend_register (instr_info *ins, const char *s)
{
- oappend_maybe_intel_with_style (ins, s, dis_style_register);
-}
-
-/* Like oappend_maybe_intel_with_style above, but called when S represents
- an immediate. */
-
-static void
-oappend_immediate (instr_info *ins, const char *s)
-{
- oappend_maybe_intel_with_style (ins, s, dis_style_immediate);
+ oappend_with_style (ins, s + ins->intel_syntax, dis_style_register);
}
/* Wrap around a call to INS->info->fprintf_styled_func, printing FMT.
@@ -10334,8 +10307,12 @@ static void
OP_STi (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
int sizeflag ATTRIBUTE_UNUSED)
{
- sprintf (ins->scratchbuf, "%%st(%d)", ins->modrm.rm);
- oappend_register (ins, ins->scratchbuf);
+ char scratch[8];
+ int res = snprintf (scratch, ARRAY_SIZE (scratch), "%%st(%d)", ins->modrm.rm);
+
+ if (res < 0 || (size_t) res >= ARRAY_SIZE (scratch))
+ abort ();
+ oappend_register (ins, scratch);
}
/* Capital letters in template are macros. */
@@ -10992,39 +10969,39 @@ OP_indirE (instr_info *ins, int bytemode
}
static void
-print_operand_value (instr_info *ins, char *buf, int hex, bfd_vma disp)
+print_operand_value (instr_info *ins, bool hex, bfd_vma disp,
+ enum disassembler_style style)
{
+ char tmp[30];
+
if (ins->address_mode == mode_64bit)
{
if (hex)
{
- char tmp[30];
int i;
- buf[0] = '0';
- buf[1] = 'x';
+ oappend_with_style (ins, "0x", style);
sprintf_vma (tmp, disp);
for (i = 0; tmp[i] == '0' && tmp[i + 1]; i++);
- strcpy (buf + 2, tmp + i);
+ oappend_with_style (ins, tmp + i, style);
}
else
{
bfd_signed_vma v = disp;
- char tmp[30];
int i;
if (v < 0)
{
- *(buf++) = '-';
+ oappend_char_with_style (ins, '-', style);
v = -disp;
/* Check for possible overflow on 0x8000000000000000. */
if (v < 0)
{
- strcpy (buf, "9223372036854775808");
+ oappend_with_style (ins, "9223372036854775808", style);
return;
}
}
if (!v)
{
- strcpy (buf, "0");
+ oappend_char_with_style (ins, '0', style);
return;
}
@@ -11036,30 +11013,41 @@ print_operand_value (instr_info *ins, ch
v /= 10;
i++;
}
- strcpy (buf, tmp + 29 - i);
+ oappend_with_style (ins, tmp + 29 - i, style);
}
}
else
{
if (hex)
- sprintf (buf, "0x%x", (unsigned int) disp);
+ sprintf (tmp, "0x%x", (unsigned int) disp);
else
- sprintf (buf, "%d", (int) disp);
+ sprintf (tmp, "%d", (int) disp);
+ oappend_with_style (ins, tmp, style);
}
}
+/* Like oappend, but called for immediate operands. */
+
+static void
+oappend_immediate (instr_info *ins, bfd_vma imm)
+{
+ if (!ins->intel_syntax)
+ oappend_char_with_style (ins, '$', dis_style_immediate);
+ print_operand_value (ins, true, imm, dis_style_immediate);
+}
+
/* Put DISP in BUF as signed hex number. */
static void
-print_displacement (instr_info *ins, char *buf, bfd_vma disp)
+print_displacement (instr_info *ins, bfd_vma disp)
{
bfd_signed_vma val = disp;
char tmp[30];
- int i, j = 0;
+ unsigned int i;
if (val < 0)
{
- buf[j++] = '-';
+ oappend_char_with_style (ins, '-', dis_style_address_offset);
val = -disp;
/* Check for possible overflow. */
@@ -11068,28 +11056,30 @@ print_displacement (instr_info *ins, cha
switch (ins->address_mode)
{
case mode_64bit:
- strcpy (buf + j, "0x8000000000000000");
+ oappend_with_style (ins, "0x8000000000000000",
+ dis_style_address_offset);
break;
case mode_32bit:
- strcpy (buf + j, "0x80000000");
+ oappend_with_style (ins, "0x80000000",
+ dis_style_address_offset);
break;
case mode_16bit:
- strcpy (buf + j, "0x8000");
+ oappend_with_style (ins, "0x8000",
+ dis_style_address_offset);
break;
}
return;
}
}
- buf[j++] = '0';
- buf[j++] = 'x';
+ oappend_with_style (ins, "0x", dis_style_address_offset);
sprintf_vma (tmp, (bfd_vma) val);
for (i = 0; tmp[i] == '0'; i++)
continue;
if (tmp[i] == '\0')
i--;
- strcpy (buf + j, tmp + i);
+ oappend_with_style (ins, tmp + i, dis_style_address_offset);
}
static void
@@ -11729,11 +11719,9 @@ OP_E_memory (instr_info *ins, int bytemo
if (ins->modrm.mod != 0 || base == 5)
{
if (havedisp || riprel)
- print_displacement (ins, ins->scratchbuf, disp);
+ print_displacement (ins, disp);
else
- print_operand_value (ins, ins->scratchbuf, 1, disp);
- oappend_with_style (ins, ins->scratchbuf,
- dis_style_address_offset);
+ print_operand_value (ins, true, disp, dis_style_address_offset);
if (riprel)
{
set_op (ins, disp, true);
@@ -11792,9 +11780,8 @@ OP_E_memory (instr_info *ins, int bytemo
: att_index32);
oappend_char (ins, ins->scale_char);
- sprintf (ins->scratchbuf, "%d", 1 << scale);
- oappend_with_style (ins, ins->scratchbuf,
- dis_style_immediate);
+ oappend_char_with_style (ins, '0' + (1 << scale),
+ dis_style_immediate);
}
}
if (ins->intel_syntax
@@ -11809,10 +11796,9 @@ OP_E_memory (instr_info *ins, int bytemo
}
if (havedisp)
- print_displacement (ins, ins->scratchbuf, disp);
+ print_displacement (ins, disp);
else
- print_operand_value (ins, ins->scratchbuf, 1, disp);
- oappend (ins, ins->scratchbuf);
+ print_operand_value (ins, true, disp, dis_style_address_offset);
}
oappend_char (ins, ins->close_char);
@@ -11839,8 +11825,7 @@ OP_E_memory (instr_info *ins, int bytemo
oappend_register (ins, att_names_seg[ds_reg - es_reg]);
oappend (ins, ":");
}
- print_operand_value (ins, ins->scratchbuf, 1, disp);
- oappend (ins, ins->scratchbuf);
+ print_operand_value (ins, true, disp, dis_style_text);
}
}
}
@@ -11885,10 +11870,7 @@ OP_E_memory (instr_info *ins, int bytemo
if (!ins->intel_syntax)
if (ins->modrm.mod != 0 || ins->modrm.rm == 6)
- {
- print_displacement (ins, ins->scratchbuf, disp);
- oappend (ins, ins->scratchbuf);
- }
+ print_displacement (ins, disp);
if (ins->modrm.mod != 0 || ins->modrm.rm != 6)
{
@@ -11906,8 +11888,7 @@ OP_E_memory (instr_info *ins, int bytemo
disp = -disp;
}
- print_displacement (ins, ins->scratchbuf, disp);
- oappend (ins, ins->scratchbuf);
+ print_displacement (ins, disp);
}
oappend_char (ins, ins->close_char);
@@ -11919,8 +11900,7 @@ OP_E_memory (instr_info *ins, int bytemo
oappend_register (ins, att_names_seg[ds_reg - es_reg]);
oappend (ins, ":");
}
- print_operand_value (ins, ins->scratchbuf, 1, disp & 0xffff);
- oappend (ins, ins->scratchbuf);
+ print_operand_value (ins, true, disp & 0xffff, dis_style_text);
}
}
if (ins->vex.b)
@@ -12274,10 +12254,7 @@ OP_I (instr_info *ins, int bytemode, int
}
op &= mask;
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, op);
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, op);
}
static void
@@ -12292,10 +12269,7 @@ OP_I64 (instr_info *ins, int bytemode, i
USED_REX (REX_W);
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, get64 (ins));
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, get64 (ins));
}
static void
@@ -12346,9 +12320,7 @@ OP_sI (instr_info *ins, int bytemode, in
return;
}
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, op);
- oappend_immediate (ins, ins->scratchbuf);
+ oappend_immediate (ins, op);
}
static void
@@ -12398,8 +12370,7 @@ OP_J (instr_info *ins, int bytemode, int
disp = ((ins->start_pc + (ins->codep - ins->start_codep) + disp) & mask)
| segment;
set_op (ins, disp, false);
- print_operand_value (ins, ins->scratchbuf, 1, disp);
- oappend (ins, ins->scratchbuf);
+ print_operand_value (ins, true, disp, dis_style_text);
}
static void
@@ -12414,7 +12385,8 @@ OP_SEG (instr_info *ins, int bytemode, i
static void
OP_DIR (instr_info *ins, int dummy ATTRIBUTE_UNUSED, int sizeflag)
{
- int seg, offset;
+ int seg, offset, res;
+ char scratch[24];
if (sizeflag & DFLAG)
{
@@ -12427,11 +12399,13 @@ OP_DIR (instr_info *ins, int dummy ATTRI
seg = get16 (ins);
}
ins->used_prefixes |= (ins->prefixes & PREFIX_DATA);
- if (ins->intel_syntax)
- sprintf (ins->scratchbuf, "0x%x:0x%x", seg, offset);
- else
- sprintf (ins->scratchbuf, "$0x%x,$0x%x", seg, offset);
- oappend (ins, ins->scratchbuf);
+
+ res = snprintf (scratch, ARRAY_SIZE (scratch),
+ ins->intel_syntax ? "0x%x:0x%x" : "$0x%x,$0x%x",
+ seg, offset);
+ if (res < 0 || (size_t) res >= ARRAY_SIZE (scratch))
+ abort ();
+ oappend (ins, scratch);
}
static void
@@ -12456,8 +12430,7 @@ OP_OFF (instr_info *ins, int bytemode, i
oappend (ins, ":");
}
}
- print_operand_value (ins, ins->scratchbuf, 1, off);
- oappend_with_style (ins, ins->scratchbuf, dis_style_address_offset);
+ print_operand_value (ins, true, off, dis_style_address_offset);
}
static void
@@ -12486,8 +12459,7 @@ OP_OFF64 (instr_info *ins, int bytemode,
oappend (ins, ":");
}
}
- print_operand_value (ins, ins->scratchbuf, 1, off);
- oappend_with_style (ins, ins->scratchbuf, dis_style_address_offset);
+ print_operand_value (ins, true, off, dis_style_address_offset);
}
static void
@@ -12568,7 +12540,9 @@ static void
OP_C (instr_info *ins, int dummy ATTRIBUTE_UNUSED,
int sizeflag ATTRIBUTE_UNUSED)
{
- int add;
+ int add, res;
+ char scratch[8];
+
if (ins->rex & REX_R)
{
USED_REX (REX_R);
@@ -12582,33 +12556,44 @@ OP_C (instr_info *ins, int dummy ATTRIBU
}
else
add = 0;
- sprintf (ins->scratchbuf, "%%cr%d", ins->modrm.reg + add);
- oappend_register (ins, ins->scratchbuf);
+ res = snprintf (scratch, ARRAY_SIZE (scratch), "%%cr%d",
+ ins->modrm.reg + add);
+ if (res < 0 || (size_t) res >= ARRAY_SIZE (scratch))
+ abort ();
+ oappend_register (ins, scratch);
}
static void
OP_D (instr_info *ins, int dummy ATTRIBUTE_UNUSED,
int sizeflag ATTRIBUTE_UNUSED)
{
- int add;
+ int add, res;
+ char scratch[8];
+
USED_REX (REX_R);
if (ins->rex & REX_R)
add = 8;
else
add = 0;
- if (ins->intel_syntax)
- sprintf (ins->scratchbuf, "dr%d", ins->modrm.reg + add);
- else
- sprintf (ins->scratchbuf, "%%db%d", ins->modrm.reg + add);
- oappend (ins, ins->scratchbuf);
+ res = snprintf (scratch, ARRAY_SIZE (scratch),
+ ins->intel_syntax ? "dr%d" : "%%db%d",
+ ins->modrm.reg + add);
+ if (res < 0 || (size_t) res >= ARRAY_SIZE (scratch))
+ abort ();
+ oappend (ins, scratch);
}
static void
OP_T (instr_info *ins, int dummy ATTRIBUTE_UNUSED,
int sizeflag ATTRIBUTE_UNUSED)
{
- sprintf (ins->scratchbuf, "%%tr%d", ins->modrm.reg);
- oappend_register (ins, ins->scratchbuf);
+ int res;
+ char scratch[8];
+
+ res = snprintf (scratch, ARRAY_SIZE (scratch), "%%tr%d", ins->modrm.reg);
+ if (res < 0 || (size_t) res >= ARRAY_SIZE (scratch))
+ abort ();
+ oappend_register (ins, scratch);
}
static void
@@ -13060,10 +13045,7 @@ CMP_Fixup (instr_info *ins, int bytemode
else
{
/* We have a reserved extension byte. Output it directly. */
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, cmp_type);
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, cmp_type);
}
}
@@ -13515,9 +13497,7 @@ static void
OP_VexI4 (instr_info *ins, int bytemode ATTRIBUTE_UNUSED,
int sizeflag ATTRIBUTE_UNUSED)
{
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, ins->codep[-1] & 0xf);
- oappend_immediate (ins, ins->scratchbuf);
+ oappend_immediate (ins, ins->codep[-1] & 0xf);
}
static void
@@ -13560,10 +13540,7 @@ VPCMP_Fixup (instr_info *ins, int bytemo
else
{
/* We have a reserved extension byte. Output it directly. */
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, cmp_type);
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, cmp_type);
}
}
@@ -13612,10 +13589,7 @@ VPCOM_Fixup (instr_info *ins, int bytemo
else
{
/* We have a reserved extension byte. Output it directly. */
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, cmp_type);
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, cmp_type);
}
}
@@ -13660,10 +13634,7 @@ PCLMUL_Fixup (instr_info *ins, int bytem
else
{
/* We have a reserved extension byte. Output it directly. */
- ins->scratchbuf[0] = '$';
- print_operand_value (ins, ins->scratchbuf + 1, 1, pclmul_type);
- oappend_immediate (ins, ins->scratchbuf);
- ins->scratchbuf[0] = '\0';
+ oappend_immediate (ins, pclmul_type);
}
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86: follow-on to making disassembler thread-safe
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
` (3 preceding siblings ...)
2022-06-10 13:36 ` [PATCH 4/4] x86: replace global scratch buffer Jan Beulich
@ 2022-06-10 19:43 ` H.J. Lu
2022-06-13 7:54 ` Jan Beulich
4 siblings, 1 reply; 10+ messages in thread
From: H.J. Lu @ 2022-06-10 19:43 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Alan Modra
On Fri, Jun 10, 2022 at 6:33 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Alan had noticed ubsan fallout from that combined with my conversion
> of op_riprel to bool[]. In the course of addressing that I did notice
> further things which could do with improving (or so I hope).
>
> 1: properly initialize struct instr_info instance(s)
> 2: shrink prefix related disassembler state fields
> 3: avoid string copy when swapping Vex.W controlled operands
> 4: replace global scratch buffer
>
> Jan
OK for all.
Thanks.
--
H.J.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86: follow-on to making disassembler thread-safe
2022-06-10 19:43 ` [PATCH 0/4] x86: follow-on to making disassembler thread-safe H.J. Lu
@ 2022-06-13 7:54 ` Jan Beulich
0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2022-06-13 7:54 UTC (permalink / raw)
To: H.J. Lu; +Cc: Binutils, Alan Modra
On 10.06.2022 21:43, H.J. Lu wrote:
> On Fri, Jun 10, 2022 at 6:33 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Alan had noticed ubsan fallout from that combined with my conversion
>> of op_riprel to bool[]. In the course of addressing that I did notice
>> further things which could do with improving (or so I hope).
>>
>> 1: properly initialize struct instr_info instance(s)
>> 2: shrink prefix related disassembler state fields
>> 3: avoid string copy when swapping Vex.W controlled operands
>> 4: replace global scratch buffer
>>
>> Jan
>
> OK for all.
Thanks. Any thoughts on the post-commit-message remark in patch 4?
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] x86: replace global scratch buffer
2022-06-10 13:36 ` [PATCH 4/4] x86: replace global scratch buffer Jan Beulich
@ 2022-06-13 13:59 ` H.J. Lu
2022-06-13 14:55 ` Jan Beulich
0 siblings, 1 reply; 10+ messages in thread
From: H.J. Lu @ 2022-06-13 13:59 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Andrew Burgess
On Fri, Jun 10, 2022 at 6:36 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> With its movement to the stack, and with the subsequent desire to
> initialize the entire instr_info instances, this has become doubly
> inefficient. Individual users have better knowledge of how big a buffer
> they need, and in a number of cases going through an intermediate buffer
> can be avoided altogether.
>
> Having got confirmation that it wasn't intentional to print memory
> operand displacements with inconsistent style, print_displacement() is
> now using dis_style_address_offset consistently (eliminating the need
> for callers to pass in a style).
>
> While touching print_operand_value() also convert its "hex" parameter to
> bool. And while altering (and moving) oappend_immediate(), fold
> oappend_maybe_intel_with_style() into its only remaining caller. Finally
> where doing adjustments, use snprintf() in favor of sprintf().
> ---
> While doing the conversion I came to notice that print_operand_value()'s
> "hex" parameter has only ever passed "true" to it. I wonder why this
> parameter still exists.
>
Andrew,
Do you have follow-up patches to pass "false" to print_operand_value?
--
H.J.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] x86: replace global scratch buffer
2022-06-13 13:59 ` H.J. Lu
@ 2022-06-13 14:55 ` Jan Beulich
2022-06-13 20:58 ` H.J. Lu
0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2022-06-13 14:55 UTC (permalink / raw)
To: H.J. Lu; +Cc: Binutils, Andrew Burgess
On 13.06.2022 15:59, H.J. Lu wrote:
> On Fri, Jun 10, 2022 at 6:36 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> With its movement to the stack, and with the subsequent desire to
>> initialize the entire instr_info instances, this has become doubly
>> inefficient. Individual users have better knowledge of how big a buffer
>> they need, and in a number of cases going through an intermediate buffer
>> can be avoided altogether.
>>
>> Having got confirmation that it wasn't intentional to print memory
>> operand displacements with inconsistent style, print_displacement() is
>> now using dis_style_address_offset consistently (eliminating the need
>> for callers to pass in a style).
>>
>> While touching print_operand_value() also convert its "hex" parameter to
>> bool. And while altering (and moving) oappend_immediate(), fold
>> oappend_maybe_intel_with_style() into its only remaining caller. Finally
>> where doing adjustments, use snprintf() in favor of sprintf().
>> ---
>> While doing the conversion I came to notice that print_operand_value()'s
>> "hex" parameter has only ever passed "true" to it. I wonder why this
>> parameter still exists.
>>
>
> Andrew,
>
> Do you have follow-up patches to pass "false" to print_operand_value?
I don't think that's related to Andrew's work. Even when looking at plain
2.37 (which I'm sure predates any of his work) I see only 1 ever passed
as the argument. Going very far back (2.16) I see varying arguments. I
therefore wonder whether the parameter simply wasn't cleaned up when the
last party (possibly) passing 0 / false was removed / replaced.
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] x86: replace global scratch buffer
2022-06-13 14:55 ` Jan Beulich
@ 2022-06-13 20:58 ` H.J. Lu
0 siblings, 0 replies; 10+ messages in thread
From: H.J. Lu @ 2022-06-13 20:58 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils, Andrew Burgess
On Mon, Jun 13, 2022 at 7:55 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 13.06.2022 15:59, H.J. Lu wrote:
> > On Fri, Jun 10, 2022 at 6:36 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> With its movement to the stack, and with the subsequent desire to
> >> initialize the entire instr_info instances, this has become doubly
> >> inefficient. Individual users have better knowledge of how big a buffer
> >> they need, and in a number of cases going through an intermediate buffer
> >> can be avoided altogether.
> >>
> >> Having got confirmation that it wasn't intentional to print memory
> >> operand displacements with inconsistent style, print_displacement() is
> >> now using dis_style_address_offset consistently (eliminating the need
> >> for callers to pass in a style).
> >>
> >> While touching print_operand_value() also convert its "hex" parameter to
> >> bool. And while altering (and moving) oappend_immediate(), fold
> >> oappend_maybe_intel_with_style() into its only remaining caller. Finally
> >> where doing adjustments, use snprintf() in favor of sprintf().
> >> ---
> >> While doing the conversion I came to notice that print_operand_value()'s
> >> "hex" parameter has only ever passed "true" to it. I wonder why this
> >> parameter still exists.
> >>
> >
> > Andrew,
> >
> > Do you have follow-up patches to pass "false" to print_operand_value?
>
> I don't think that's related to Andrew's work. Even when looking at plain
> 2.37 (which I'm sure predates any of his work) I see only 1 ever passed
> as the argument. Going very far back (2.16) I see varying arguments. I
> therefore wonder whether the parameter simply wasn't cleaned up when the
> last party (possibly) passing 0 / false was removed / replaced.
>
It was changed to always use hexadecimal for
https://sourceware.org/bugzilla/show_bug.cgi?id=4430
We can remove the argument.
--
H.J.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2022-06-13 20:59 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-10 13:33 [PATCH 0/4] x86: follow-on to making disassembler thread-safe Jan Beulich
2022-06-10 13:35 ` [PATCH 1/4] x86: properly initialize struct instr_info instance(s) Jan Beulich
2022-06-10 13:35 ` [PATCH 2/4] x86: shrink prefix related disassembler state fields Jan Beulich
2022-06-10 13:35 ` [PATCH 3/4] x86: avoid string copy when swapping Vex.W controlled operands Jan Beulich
2022-06-10 13:36 ` [PATCH 4/4] x86: replace global scratch buffer Jan Beulich
2022-06-13 13:59 ` H.J. Lu
2022-06-13 14:55 ` Jan Beulich
2022-06-13 20:58 ` H.J. Lu
2022-06-10 19:43 ` [PATCH 0/4] x86: follow-on to making disassembler thread-safe H.J. Lu
2022-06-13 7:54 ` Jan Beulich
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).