From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x741.google.com (mail-qk1-x741.google.com [IPv6:2607:f8b0:4864:20::741]) by sourceware.org (Postfix) with ESMTPS id 83453385E824 for ; Thu, 19 Mar 2020 10:46:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 83453385E824 Received: by mail-qk1-x741.google.com with SMTP id f3so2475289qkh.1 for ; Thu, 19 Mar 2020 03:46:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=kNVyejlRhqoCNQ7kx0qVgZtR8SUsGklbRel6gPDiN9k=; b=cJfFnruhFuCydjzbmPilLyx8HitS2y7uJjzQaCu2zcqtFyz47tR4GGSPIm0XDppukS RiCsTGIdH2qDYNyLtdoqCw+RcNfBc0ixcbf0oPY7PSV60u3f7p4hTYfuk95md3KZmFwj 0vfymLg5y3IAOXubFyoL1tmXCSpHEsJUGdN7ihUN7BeYQsm98Ek0PEacWXa/JlhVf6v1 pu20J/0T+3GsT1AAvpSx1t6wz3W4UOV1nYDdYNUMrxntv7BDekz15zDEBSwbbNHRHY4I R3NSzl16Bx8iwW1Y7WzMlFR9Fyw2wZCjkpwy0hp6fIDD4JGvutlMDYsB7aw8c/en6C5N 7kvg== X-Gm-Message-State: ANhLgQ1bWIBvajMilLyMa/0eROO409f9TRB12VS9yg9lgVqWnkZjjZPE H5ZYYIjMtSd+81IHLW+Y+t0l1rbitJ4nKfqsAtJAMg== X-Google-Smtp-Source: ADFU+vuNhswzbpUi5oZgo2gti3gDrmOxqsWyem3A5C5suhiGYJhcuvQyVPKed7kB00gTal2R6/wMVL1Nn9QAlW9bFyQ= X-Received: by 2002:a5b:9c2:: with SMTP id y2mr3402003ybq.15.1584614789604; Thu, 19 Mar 2020 03:46:29 -0700 (PDT) MIME-Version: 1.0 References: <20200318113754.GI211970@google.com> <20200318121241.146259-1-gprocida@google.com> <877dzgtzlg.fsf@seketeli.org> In-Reply-To: <877dzgtzlg.fsf@seketeli.org> From: Giuliano Procida Date: Thu, 19 Mar 2020 10:46:12 +0000 Message-ID: Subject: Re: [PATCH v2] dwarf-reader: Use all bits of Bloom filter words. To: Dodji Seketeli Cc: libabigail@sourceware.org, kernel-team@android.com, =?UTF-8?Q?Matthias_M=C3=A4nnich?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-40.7 required=5.0 tests=DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libabigail mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Mar 2020 10:46:31 -0000 Hi Dodji. On Thu, 19 Mar 2020 at 10:21, Dodji Seketeli wrote: > > Hello Giuliano, > > Giuliano Procida a =C3=A9crit: > > [...] > > > diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc > > index 3454fcf5..5556bde5 100644 > > --- a/src/abg-dwarf-reader.cc > > +++ b/src/abg-dwarf-reader.cc > > @@ -2025,7 +2025,7 @@ lookup_symbol_from_gnu_hash_tab(const environment= * env, > > // filter, in bits. > > int c =3D get_elf_class_size_in_bytes(elf_handle) * 8; > > int n =3D (h1 / c) % ht.bf_nwords; > > - unsigned char bitmask =3D (1ul << (h1 % c)) | (1ul << (h2 % c)); > > + GElf_Word bitmask =3D (1ul << (h1 % c)) | (1ul << (h2 % c)); > > Good catch! I wonder what made me chose a byte-wide bitmask to begin > with. The docmentation of the bloom filter of the GNU Hash table at > https://blogs.oracle.com/solaris/gnu-hash-elf-sections-v2 says: > > "The filter consists of maskwords words, each of which is 32-bits > (ELFCLASS32) or 64-bits (ELFCLASS64) depending on the class of object= " > > So it's clear, the bitmask must be of type GElf_Word. > > Have you seen any practical speed increase in the loading of ELF > binaries due to this change? I spotted the bug when reviewing maennich's fix for undefined behaviour (integer overflow). I'm afraid I didn't go as far as testing on a real workload as I don't think we have one that exercises this code. It might help someone else though. The whole of make check (excluding fedabipkgdiff tests) only made 3 calls to the function. I think the efficiency was 100% both before and after (2/2 absent symbols excluded by the Bloom filter). :-) > [...] > > > Most of the bit values used for GNU hash ELF section Bloom filtering > > were being ignored due to integer narrowing, reducing missing symbol > > filtering efficiency considerably. > > > > This patch fixes this. > > > > * src/abg-dwarf-reader.cc (lookup_symbol_from_gnu_hash_tab): > > Don't narrow calculated Bloom word to 8 bits before using it > > to mask the fetched Bloom word. > > I have amended the patch to have the ChangeLog section above come > last in the commit log message. > > It really needs to come last because we have a script that build a > ChangeLog file from all the change log parts of the commit log and that > one is shipped in the released tarball, making it compliant with the GNU > tarball standards. That way, users who consume the tarball can access > some history without needing to have (internet) access to the full Git > history. I was happy for the note on testing to be excluded from the commit message. Would it be better to just reply to the patch to add this sort of context, or would you like to see these notes in the commit messages? Regards, Giuliano. > [...] > > > Note on testing. > > > > The .gnu.hash section seems to be present in all the .so but none of > > the .o test files. abisym doesn't appear to find dynamic symbols (nm > > -D), only normal ones, so it was a little tricky to test this. > > > > I found a Debian .so (libpthread) with both the .gnu.hash section and > > normal symbols. abisym behaves identically with my change, looking up > > lots of present and non-present (as far as it's concerned) symbols. I > > just extracted a full list with nm/sed and looked up each one. > > > > 389 symbols looked up, 241 present, 148 absent > > 8-bit filter: 336 maybe, 53 no (53/148 filtering efficiency) > > 64-bit filter: 255 maybe, 134 no (134/148 filtering efficiency) > > Thanks. > > So I am applying this to master. > > Below is the actual patch that got applied. > > Cheers, > > > -- > Dodji