From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oa1-x2a.google.com (mail-oa1-x2a.google.com [IPv6:2001:4860:4864:20::2a]) by sourceware.org (Postfix) with ESMTPS id C7B4A3858C36 for ; Tue, 28 Nov 2023 01:39:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C7B4A3858C36 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C7B4A3858C36 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:4860:4864:20::2a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701135556; cv=none; b=wh1JM4BRhjH29JB/xf8ucOTNuwkxMMYZ5dnB5T1dzY0WX2fIpi123dP6PytNA4DZB4mVOcIzIxoC3SpEXaD8dh/RQemoCRfP0Yf5ZA6gcHQDrp9aAs0XhqPNCS/KQBkEGkYDrmmntUrZBL4Fjm8qDuh5uYoMvdKzjvIvJF8LBXs= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701135556; c=relaxed/simple; bh=kaocAOKUjEjBT+dFAhy+p/JYNL/z1YOXoMOIT82g4f4=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=QjDOahIbkvTCyAkQC09RHzvHFBAxuhSYAAjvxiGJX1eBsaqBUKkOLH/n411/S2AahpZeSHlat0n640nl4XawXhosSGGb1wn9ZemYYtgz6bJOS3/1hhiexCU5EsxX/JOt8zbcUp0LBR9HRvzxhHOX3royiVQ9xpvhPZCv8JMmOYI= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oa1-x2a.google.com with SMTP id 586e51a60fabf-1fa21f561a1so1325071fac.3 for ; Mon, 27 Nov 2023 17:39:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1701135552; x=1701740352; darn=sourceware.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=xsrkbbj/kyi6bG6zGvMH/O6OJ+FRUkc0IMNM4ynBz8A=; b=xcYsvWf82FRyQLTWC4MDsf9DXM9DQ4itaDpzHb0mhZseodvoKPuXLJCHm/HgGTAXqA /uFJ7dRdhpisp89mZv+ls8qEmC+m7qY4KixcfUZqPJ+3aJ/U1sVWQDgK+z1a0k6NvXyD SwUOXb4qcLxLf03eUY6BW/tXeYm9esEKbU5F3oNnbqHIWoFwovIV0Zw8HQc/w3KoTeav c6Cad/+CBIx68YTVmwAJDd8iAvuP8L5zhvqzMLw8GxDObE+BIHpVIVhkIJbJGv9b/KKK HTkoGFqO7sW+WyXgXftIPPdBF0TdcfhTxanvH5lCX6sw8Lw4jLm+3sWmNN9bQJz36/V+ 7XeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701135552; x=1701740352; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=xsrkbbj/kyi6bG6zGvMH/O6OJ+FRUkc0IMNM4ynBz8A=; b=agAzL0KDoxpllmbCko9JjvsEUH5MBxwUjyS7M3jigpmqk2ek+F5rmdXm9jKeAE65Y/ uOtxt4blOWWxc0NhknsAikDqhxkNpNdBLSQRurxX7eVeaRzQ27tRm+LxBwtt2PkLldWj H4wnZfuXTQ8uFJIh5HuAtDwPSI2KIV1FHjt/cPkZtaIMpnT8p6dslNILT0eEdjIc1ILs 6hkCBla/JxsaepL6xYzQ+3AmdJP3CVC1l02NjXvEeWD6RnFkCtPjAf2vFcnHLbjXNHJ+ sAgx/4jV/xM8r+kTfx+I+Xi7j/nWixuqHwGwfdqODFnH9u3X91g2jtua113gOHuHZ/ar YHrQ== X-Gm-Message-State: AOJu0Yztrye7/w9uj/8Q2X1sFnCDvpClIleJ9Ld3fBDF8OLtvV3x6p+I +0Lj8ATQisUZXtGqffaLNukV58C8stD4IV0H9Aty5gFh+p9MNiClHIkghQ== X-Google-Smtp-Source: AGHT+IGCq1hVyvAtIh77IE++eaTg8f78RKsNLy0aX6TSzD7vRyQvfCGT0keEPvFy0r9mdXHs8ZL3rDiQM8D/+0WRErQ= X-Received: by 2002:a05:6870:6107:b0:1fa:2b51:cc7a with SMTP id s7-20020a056870610700b001fa2b51cc7amr9173148oae.25.1701135552037; Mon, 27 Nov 2023 17:39:12 -0800 (PST) MIME-Version: 1.0 References: <20231122013511.46088-1-patrick@rivosinc.com> In-Reply-To: <20231122013511.46088-1-patrick@rivosinc.com> From: Nelson Chu Date: Tue, 28 Nov 2023 09:39:01 +0800 Message-ID: Subject: Re: [PATCH] RISC-V: Avoid updating state until symbol is found To: "Patrick O'Neill" Cc: binutils@sourceware.org, kito.cheng@sifive.com Content-Type: multipart/alternative; boundary="0000000000004b5942060b2c7cbe" X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,GIT_PATCH_0,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: --0000000000004b5942060b2c7cbe Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Nov 22, 2023 at 9:35=E2=80=AFAM Patrick O'Neill wrote: > Currently objdump gets and updates the map state once per symbol. Updating > the > state (partiularly riscv_parse_subset) is expensive and grows quadratical= ly > since we iterate over all symbols. By deferring this until once we've > found the > symbol of interest, we can reduce the time to dump a 4k insn file of > .norvc and > .rvc insns from ~47 seconds to ~0.13 seconds. > > opcodes/ChangeLog: > > * riscv-dis.c (riscv_get_map_state): Remove state updating logic. > (riscv_update_map_state): Add state updating logic to seperate > function. > (riscv_search_mapping_symbol): Use new riscv_update_map_state. > (riscv_data_length): Ditto. > > Signed-off-by: Patrick O'Neill > --- > Somewhat related to: > > https://sourceware.org/git/gitweb.cgi?p=3Dbinutils-gdb.git;h=3Dc2f60ac565= f1d369fde98146a16f1d3ef79e1000 > Sequences of compressed/uncompressed insns have different isa strings, > so the cache in that patch is not triggered. Thankfully we can just > reduce the number of calls to fix this issue rather than create a new > cache. > --- > opcodes/riscv-dis.c | 47 ++++++++++++++++++++++++++++++++++++--------- > 1 file changed, 38 insertions(+), 9 deletions(-) > > diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c > index ca328b4c997..4ada1487732 100644 > --- a/opcodes/riscv-dis.c > +++ b/opcodes/riscv-dis.c > @@ -860,20 +860,20 @@ riscv_disassemble_insn (bfd_vma memaddr, > return insnlen; > } > > -/* Return true if we find the suitable mapping symbol, > - and also update the STATE. Otherwise, return false. */ > +/* If we find the suitable mapping symbol update the STATE. > + Otherwise, do nothing. */ > > -static bool > -riscv_get_map_state (int n, > - enum riscv_seg_mstate *state, > - struct disassemble_info *info) > +static void > +riscv_update_map_state (int n, > + enum riscv_seg_mstate *state, > + struct disassemble_info *info) > { > const char *name; > > /* If the symbol is in a different section, ignore it. */ > if (info->section !=3D NULL > && info->section !=3D info->symtab[n]->section) > - return false; > + return; > > name =3D bfd_asymbol_name(info->symtab[n]); > if (strcmp (name, "$x") =3D=3D 0) > @@ -900,10 +900,27 @@ riscv_get_map_state (int n, > else > riscv_parse_subset (&riscv_rps_dis, name + 2); > } > - else > +} > + > +/* Return true if we find the suitable mapping symbol. > + Otherwise, return false. */ > + > +static bool > +riscv_get_map_state (int n, > + enum riscv_seg_mstate *state, + struct disassemble_info *info) > +{ > + const char *name; > + > + /* If the symbol is in a different section, ignore it. */ > + if (info->section !=3D NULL > + && info->section !=3D info->symtab[n]->section) > return false; > > - return true; > + name =3D bfd_asymbol_name(info->symtab[n]); > + return (strcmp (name, "$x") =3D=3D 0 > + || strcmp (name, "$d") =3D=3D 0 > + || strncmp (name, "$xrv", 4) =3D=3D 0); > riscv_elf_is_mapping_symbols? > } > option 1: remove the unused input *state here, and then maybe change the function name to riscv_valid_mapping_symbols? or any other names without "state". option 2: keep the old riscv_get_map_state, keep updating map_state there, but don't call riscv_release_subset_list and riscv_parse_subset for MAP_INSN until riscv_update_map_state (so that we can changed the name to riscv_update_arch_from_map_state?). > /* Check the sorted symbol table (sorted by the symbol value), find the > @@ -972,6 +989,11 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > } > } > > + if (found) > + riscv_update_map_state (symbol, &mstate, info); > + else > + riscv_update_map_state (info->symtab_size - 1, &mstate, info); > Do we still need to call riscv_update_map_state even if we cannot find any suitable mapping symbols? > + > /* We can not find the suitable mapping symbol above. Therefore, we > look forwards and try to find it again, but don't go past the start > of the section. Otherwise a data section without mapping symbols > @@ -996,6 +1018,10 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > break; > } > } > + if (found) > + riscv_update_map_state (symbol, &mstate, info); > + else > + riscv_update_map_state (0, &mstate, info); > Likewise. > } > > if (found) > Can we call "riscv_update_map_state (symbol, ...), or riscv_update_arch_from_map_state" once here? > @@ -1060,9 +1086,12 @@ riscv_data_length (bfd_vma memaddr, > if (addr - memaddr < length) > length =3D addr - memaddr; > found =3D true; > + riscv_update_map_state (n, &m, info); break; > } > } > + if (!found) > + riscv_update_map_state (0, &m, info); > We should already know the map_state is MAP_DATA, and then call into here, so don't need to update the map_state again? Thanks Nelson --0000000000004b5942060b2c7cbe--