From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-sender-0.a4lg.com (mail-sender.a4lg.com [153.120.152.154]) by sourceware.org (Postfix) with ESMTPS id CAAF9382CDEF for ; Wed, 24 Aug 2022 12:06:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CAAF9382CDEF Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=irq.a4lg.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=irq.a4lg.com Received: from [127.0.0.1] (localhost [127.0.0.1]) by mail-sender-0.a4lg.com (Postfix) with ESMTPSA id 94C20300089; Wed, 24 Aug 2022 12:06:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irq.a4lg.com; s=2017s01; t=1661342815; bh=5tpZphQqIMMAjvKWBF7gLsUEPHHxHBkjk1aoVdWfDRs=; h=Message-ID:Date:Mime-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=MNPckxteH9boLARMoObLV2PpXlf0+a9KePxXEjMM2XxX8oVW89oEL4ZSuzHq6sL2A mnzZ9HSbfqR8aJzxKoI/HSy4GMUVjVaKAcoRcDJUZqEs34SXp94dFk1sMsFMpsp/58 GzewyYlw1lkVTSIQxIeHTpsd6xv5fn1I4+iExLRQ= Message-ID: Date: Wed, 24 Aug 2022 21:06:50 +0900 Mime-Version: 1.0 Subject: Re: [PATCH v7 1/5] RISC-V: Print highest address on disassembler Content-Language: en-US To: Nelson Chu Cc: binutils@sourceware.org References: <38e0a159b32579371835fbdddaf2578bae970076.1661304407.git.research_trasio@irq.a4lg.com> From: Tsukasa OI In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Nelson, good to hear you again. I see why you were so busy. On 2022/08/24 20:22, Nelson Chu wrote: > Hi Tsukasa, > > I think there are three dis-assembler issues here, > > 1. PR29342 > 2. The target address of jump > 3. Should we show the target address when it is -1 > > Therefore, three separate patches looks reasonable.  But I would suggest > that we should add test cases for each patch as possible as we can, > rather than add them together at last.  That is because it is really > hard to make sure the correctness of the separate patches. I apologize about that. I am making a series of patches to make the disassembler faster and more reliable and... I have following branch graph. And... that's the only a part of it (that requires riscv-dis-fix-addr). Aside from the following branches, I have... eleven... branches that can be directly applied to master. 1. riscv-dis-fix-addr (This patchset) 2. riscv-dis-opts-batch-1 (Roughly 25% performance improvements [more general]) 3. riscv-dis-rv32e (disassembler support for RV32E) 4. riscv-dis-data-large (highly debatable: .long dumping on RV64) 5. riscv-dis-generics (fix disassembling some zext.h/pack/packw) 6. riscv-dis-arch-priv-spec (sometimes "real" arch and priv doesn't match with ELF attrs [e.g. OpenSBI] and JTAG-based debugging may require ability to override some disassembler parameters on the fly) 7. riscv-dis-opts-batch-2 (Not general as batch 1 but very effective on libraries; I observed nearly 10x improvements on disassembling libcrypto.so) 8. riscv-dis-reduce-fp-on-addresses (reduce false positives on address printing) Yes, clearly I was impatient. Not splitting up the problem and effectively forcing all or nothing to you is a rude behavior and I sincerely apologize about that. So, let me split this patchset based on your comments. > > This patch refers to issue 3.  So according to the source code, we won't > show the final target address when pd->print_addr is -1, which means > the address is exactly -1 or it is the default value.  I think it is > really rare to jump or refer to the symbol whose value is -1.  Besides > that, not showing the target address when it's value is -1 doesn't look > wrong to me, so personally I would like to keep the original behavior.  Still I think... if we have a chance to "fix" this (without any significant performance penalty), why not? So I stick to my original opinion (except I will split this change into a separate patchset for separate review). Thanks, Tsukasa > Unless there are other users who really want to change the behavior, or > llvm are doing something different.  However, the change of gp makes > sense, so it looks to me. > > Thanks > Nelson > > On Wed, Aug 24, 2022 at 9:28 AM Tsukasa OI via Binutils > > wrote: > > This patch makes possible to print the highest address (0xffffffff > on RV32, > 0xffffffff_ffffffff on RV64).  This is particularly useful if the > highest > address space is used for I/O registers and corresponding symbols > are defined. > > opcodes/ChangeLog: > >         * riscv-dis.c (struct riscv_private_data): Add > `to_print_addr' and >         `has_gp' to enable printing the highest address. >         (maybe_print_address): Utilize `to_print_addr' and `has_gp'. >         (riscv_disassemble_insn): Likewise. > --- >  opcodes/riscv-dis.c | 22 ++++++++++++++++------ >  1 file changed, 16 insertions(+), 6 deletions(-) > > diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c > index 164fd209dbd..c6d80c3ba49 100644 > --- a/opcodes/riscv-dis.c > +++ b/opcodes/riscv-dis.c > @@ -52,6 +52,8 @@ struct riscv_private_data >    bfd_vma gp; >    bfd_vma print_addr; >    bfd_vma hi_addr[OP_MASK_RD + 1]; > +  bool to_print_addr; > +  bool has_gp; >  }; > >  /* Used for mapping symbols.  */ > @@ -177,10 +179,13 @@ maybe_print_address (struct riscv_private_data > *pd, int base_reg, int offset, >        pd->print_addr = (base_reg != 0 ? pd->hi_addr[base_reg] : 0) > + offset; >        pd->hi_addr[base_reg] = -1; >      } > -  else if (base_reg == X_GP && pd->gp != (bfd_vma)-1) > +  else if (base_reg == X_GP && pd->has_gp) >      pd->print_addr = pd->gp + offset; >    else if (base_reg == X_TP || base_reg == 0) >      pd->print_addr = offset; > +  else > +    return; > +  pd->to_print_addr = true; > >    /* Sign-extend a 32-bit value to a 64-bit value.  */ >    if (wide) > @@ -595,14 +600,19 @@ riscv_disassemble_insn (bfd_vma memaddr, > insn_t word, disassemble_info *info) >        int i; > >        pd = info->private_data = xcalloc (1, sizeof (struct > riscv_private_data)); > -      pd->gp = -1; > -      pd->print_addr = -1; > +      pd->gp = 0; > +      pd->print_addr = 0; >        for (i = 0; i < (int)ARRAY_SIZE (pd->hi_addr); i++) >         pd->hi_addr[i] = -1; > +      pd->to_print_addr = false; > +      pd->has_gp = false; > >        for (i = 0; i < info->symtab_size; i++) >         if (strcmp (bfd_asymbol_name (info->symtab[i]), > RISCV_GP_SYMBOL) == 0) > -         pd->gp = bfd_asymbol_value (info->symtab[i]); > +         { > +           pd->gp = bfd_asymbol_value (info->symtab[i]); > +           pd->has_gp = true; > +         } >      } >    else >      pd = info->private_data; > @@ -662,13 +672,13 @@ riscv_disassemble_insn (bfd_vma memaddr, > insn_t word, disassemble_info *info) >           print_insn_args (op->args, word, memaddr, info); > >           /* Try to disassemble multi-instruction addressing > sequences.  */ > -         if (pd->print_addr != (bfd_vma)-1) > +         if (pd->to_print_addr) >             { >               info->target = pd->print_addr; >               (*info->fprintf_styled_func) >                 (info->stream, dis_style_comment_start, " # "); >               (*info->print_address_func) (info->target, info); > -             pd->print_addr = -1; > +             pd->to_print_addr = false; >             } > >           /* Finish filling out insn_info fields.  */ > -- > 2.34.1 >