From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qv1-xf43.google.com (mail-qv1-xf43.google.com [IPv6:2607:f8b0:4864:20::f43]) by sourceware.org (Postfix) with ESMTPS id 9867D3985423 for ; Tue, 3 Nov 2020 13:05:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 9867D3985423 Received: by mail-qv1-xf43.google.com with SMTP id i17so5607133qvp.11 for ; Tue, 03 Nov 2020 05:05:26 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:autocrypt:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=xJ0oLqFWzs6cwp5qJP/ouqbI3NJbfk6vfTIvmC8D5oo=; b=Nh7IlhtK9iAZvAoxHFJ/6LQJzOQ7q2AJu7iXfYyviwKeM4xXM7a4IGuTn0SFgi9pri nDa+KiRzvT3EhNVIbtU2DVszoAea7h7aaroOpU4chB0gHGExXA+5td1dHV9mmFTyOU8+ Ix1Z1OfRRCLaefsP8a+ASP4kCisGEeEPCt3ljEioCqPBJz0Af9cadYylqhZ+Tt8YC+gI wFF7lu/vJ9txO2LIe34/wZEuA3Vhb+zfTweGQ/TOpTw9exL5pkCSmWhzjGnK3tbsUJQM Ily2txr1eXsMXRdTWxoYvoquQGUssCcQnA+8wAzvzgDTJeuSeaaWoblnLmQan5q2HdbI tDmA== X-Gm-Message-State: AOAM533kXxXuvHyxsuLHfjb+LRmuS7ZgPgZ9jF33sumkkW23iMVrWskE H34ce8Nhth1i5mr+axzH3OBzwba7pYDoLA== X-Google-Smtp-Source: ABdhPJyejVLhfOSCFEZFRI770CuJkEMPZf6ZzuiqJ1GVW7h+G1+XXsP7VwSXo0mGw5cNxVpF7fQh/Q== X-Received: by 2002:a0c:9004:: with SMTP id o4mr27114242qvo.17.1604408725883; Tue, 03 Nov 2020 05:05:25 -0800 (PST) Received: from [192.168.1.4] ([177.194.48.209]) by smtp.googlemail.com with ESMTPSA id n6sm9578178qkk.6.2020.11.03.05.05.23 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 03 Nov 2020 05:05:24 -0800 (PST) Subject: Re: [PATCH 24/28] elf: Implement a string table for ldconfig, with tail merging To: Florian Weimer , Adhemerval Zanella via Libc-alpha References: <2fa948f4-4cc9-750d-3282-7df1d4dd4498@linaro.org> <87v9erk5fx.fsf@oldenburg2.str.redhat.com> From: Adhemerval Zanella Autocrypt: addr=adhemerval.zanella@linaro.org; prefer-encrypt=mutual; keydata= mQINBFcVGkoBEADiQU2x/cBBmAVf5C2d1xgz6zCnlCefbqaflUBw4hB/bEME40QsrVzWZ5Nq 8kxkEczZzAOKkkvv4pRVLlLn/zDtFXhlcvQRJ3yFMGqzBjofucOrmdYkOGo0uCaoJKPT186L NWp53SACXguFJpnw4ODI64ziInzXQs/rUJqrFoVIlrPDmNv/LUv1OVPKz20ETjgfpg8MNwG6 iMizMefCl+RbtXbIEZ3TE/IaDT/jcOirjv96lBKrc/pAL0h/O71Kwbbp43fimW80GhjiaN2y WGByepnkAVP7FyNarhdDpJhoDmUk9yfwNuIuESaCQtfd3vgKKuo6grcKZ8bHy7IXX1XJj2X/ BgRVhVgMHAnDPFIkXtP+SiarkUaLjGzCz7XkUn4XAGDskBNfbizFqYUQCaL2FdbW3DeZqNIa nSzKAZK7Dm9+0VVSRZXP89w71Y7JUV56xL/PlOE+YKKFdEw+gQjQi0e+DZILAtFjJLoCrkEX w4LluMhYX/X8XP6/C3xW0yOZhvHYyn72sV4yJ1uyc/qz3OY32CRy+bwPzAMAkhdwcORA3JPb kPTlimhQqVgvca8m+MQ/JFZ6D+K7QPyvEv7bQ7M+IzFmTkOCwCJ3xqOD6GjX3aphk8Sr0dq3 4Awlf5xFDAG8dn8Uuutb7naGBd/fEv6t8dfkNyzj6yvc4jpVxwARAQABtElBZGhlbWVydmFs IFphbmVsbGEgTmV0dG8gKExpbmFybyBWUE4gS2V5KSA8YWRoZW1lcnZhbC56YW5lbGxhQGxp bmFyby5vcmc+iQI3BBMBCAAhBQJXFRpKAhsDBQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJ EKqx7BSnlIjv0e8P/1YOYoNkvJ+AJcNUaM5a2SA9oAKjSJ/M/EN4Id5Ow41ZJS4lUA0apSXW NjQg3VeVc2RiHab2LIB4MxdJhaWTuzfLkYnBeoy4u6njYcaoSwf3g9dSsvsl3mhtuzm6aXFH /Qsauav77enJh99tI4T+58rp0EuLhDsQbnBic/ukYNv7sQV8dy9KxA54yLnYUFqH6pfH8Lly sTVAMyi5Fg5O5/hVV+Z0Kpr+ZocC1YFJkTsNLAW5EIYSP9ftniqaVsim7MNmodv/zqK0IyDB GLLH1kjhvb5+6ySGlWbMTomt/or/uvMgulz0bRS+LUyOmlfXDdT+t38VPKBBVwFMarNuREU2 69M3a3jdTfScboDd2ck1u7l+QbaGoHZQ8ZNUrzgObltjohiIsazqkgYDQzXIMrD9H19E+8fw kCNUlXxjEgH/Kg8DlpoYJXSJCX0fjMWfXywL6ZXc2xyG/hbl5hvsLNmqDpLpc1CfKcA0BkK+ k8R57fr91mTCppSwwKJYO9T+8J+o4ho/CJnK/jBy1pWKMYJPvvrpdBCWq3MfzVpXYdahRKHI ypk8m4QlRlbOXWJ3TDd/SKNfSSrWgwRSg7XCjSlR7PNzNFXTULLB34sZhjrN6Q8NQZsZnMNs TX8nlGOVrKolnQPjKCLwCyu8PhllU8OwbSMKskcD1PSkG6h3r0AquQINBFcVGkoBEACgAdbR Ck+fsfOVwT8zowMiL3l9a2DP3Eeak23ifdZG+8Avb/SImpv0UMSbRfnw/N81IWwlbjkjbGTu oT37iZHLRwYUFmA8fZX0wNDNKQUUTjN6XalJmvhdz9l71H3WnE0wneEM5ahu5V1L1utUWTyh VUwzX1lwJeV3vyrNgI1kYOaeuNVvq7npNR6t6XxEpqPsNc6O77I12XELic2+36YibyqlTJIQ V1SZEbIy26AbC2zH9WqaKyGyQnr/IPbTJ2Lv0dM3RaXoVf+CeK7gB2B+w1hZummD21c1Laua +VIMPCUQ+EM8W9EtX+0iJXxI+wsztLT6vltQcm+5Q7tY+HFUucizJkAOAz98YFucwKefbkTp eKvCfCwiM1bGatZEFFKIlvJ2QNMQNiUrqJBlW9nZp/k7pbG3oStOjvawD9ZbP9e0fnlWJIsj 6c7pX354Yi7kxIk/6gREidHLLqEb/otuwt1aoMPg97iUgDV5mlNef77lWE8vxmlY0FBWIXuZ yv0XYxf1WF6dRizwFFbxvUZzIJp3spAao7jLsQj1DbD2s5+S1BW09A0mI/1DjB6EhNN+4bDB SJCOv/ReK3tFJXuj/HbyDrOdoMt8aIFbe7YFLEExHpSk+HgN05Lg5TyTro8oW7TSMTk+8a5M kzaH4UGXTTBDP/g5cfL3RFPl79ubXwARAQABiQIfBBgBCAAJBQJXFRpKAhsMAAoJEKqx7BSn lIjvI/8P/jg0jl4Tbvg3B5kT6PxJOXHYu9OoyaHLcay6Cd+ZrOd1VQQCbOcgLFbf4Yr+rE9l mYsY67AUgq2QKmVVbn9pjvGsEaz8UmfDnz5epUhDxC6yRRvY4hreMXZhPZ1pbMa6A0a/WOSt AgFj5V6Z4dXGTM/lNManr0HjXxbUYv2WfbNt3/07Db9T+GZkpUotC6iknsTA4rJi6u2ls0W9 1UIvW4o01vb4nZRCj4rni0g6eWoQCGoVDk/xFfy7ZliR5B+3Z3EWRJcQskip/QAHjbLa3pml xAZ484fVxgeESOoaeC9TiBIp0NfH8akWOI0HpBCiBD5xaCTvR7ujUWMvhsX2n881r/hNlR9g fcE6q00qHSPAEgGr1bnFv74/1vbKtjeXLCcRKk3Ulw0bY1OoDxWQr86T2fZGJ/HIZuVVBf3+ gaYJF92GXFynHnea14nFFuFgOni0Mi1zDxYH/8yGGBXvo14KWd8JOW0NJPaCDFJkdS5hu0VY 7vJwKcyHJGxsCLU+Et0mryX8qZwqibJIzu7kUJQdQDljbRPDFd/xmGUFCQiQAncSilYOcxNU EMVCXPAQTteqkvA+gNqSaK1NM9tY0eQ4iJpo+aoX8HAcn4sZzt2pfUB9vQMTBJ2d4+m/qO6+ cFTAceXmIoFsN8+gFN3i8Is3u12u8xGudcBPvpoy4OoG Message-ID: <21aa83aa-ad91-41dc-cdd5-666226fa9779@linaro.org> Date: Tue, 3 Nov 2020 10:05:22 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <87v9erk5fx.fsf@oldenburg2.str.redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Nov 2020 13:05:28 -0000 On 30/10/2020 14:08, Florian Weimer wrote: > * Adhemerval Zanella via Libc-alpha: > >> On 01/10/2020 13:34, Florian Weimer via Libc-alpha wrote: >>> This will be used in ldconfig to reduce the ld.so.cache size slightly. >> >> Could you extend the commit message to explain why the 32-bit FNV-1a hash >> is used and how the hash table is organized (how collisions are handled, >> expected memory usage, strategy used to increase/decrease the bucket size)? >> >> It could help also to explain the usercase a bit more, the 'tail merging' >> is not straightforward to understand without dig deep in the code. Also >> why kind of cache size decrease do you expect by using strategy? > > That depends whether cached DSOs use sonames as file names (so that no > symbolic links are needed). Typically that's not the case today, so the > savings are really small. glibc-hwcaps may change that. > > The repost will have an expanded commit message: > > elf: Implement a string table for ldconfig, with tail merging > > This will be used in ldconfig to reduce the ld.so.cache size slightly. > > Tail merging is an optimization where a pointer points into another > string if the first string is a suffix of the second string. > > The hash function FNV-1a was chosen because it is simple and achieves > good dispersion even for short strings (so that the hash table bucket > count can be a power of two). It is clearly superior to the hsearch > hash and the ELF hash in this regard. > > The hash table uses chaining for collision resolution. Looks better, thanks. > >>> diff --git a/elf/stringtable.c b/elf/stringtable.c >>> new file mode 100644 >>> index 0000000000..f9ade50249 >>> --- /dev/null >>> +++ b/elf/stringtable.c >>> @@ -0,0 +1,201 @@ >>> +/* String tables for ld.so.cache construction. Implementation. >> >> This file misses the Copyright year. > > Fixed throughout. > >>> +static void >>> +stringtable_init (struct stringtable *table) >>> +{ >>> + table->count = 0; >>> + table->allocated = 16; >>> + table->entries = xcalloc (table->allocated, sizeof (table->entries[0])); >>> +} >>> + >> >> Why 16 elements as initial size? > > I'm increasing it to 128 with a comment. 128 is based on the number of > DSOs within glibc itself. > >>> +struct stringtable_entry * >>> +stringtable_intern (struct stringtable *table, const char *string) >>> +{ >>> + if (table->allocated == 0) >>> + stringtable_init (table); >> >> How this could happen? Is it expect the caller to set 'allocated' >> explicitly? > > Zero-initialization is valid. stringtable_free also leaves the table > ready for re-use. Wouldn't both usages make the above check unnecessary? > >>> + /* Copy the strings. */ >>> + for (uint32_t i = 0; i < table->allocated; ++i) >>> + for (struct stringtable_entry *e = table->entries[i]; e != NULL; >>> + e = e->next) >>> + if (result->strings[e->offset] == '\0') >>> + memcpy (&result->strings[e->offset], e->string, e->length + 1); >>> +} >> >> Ok, I guess allocating a new stringtable_finalized should be simpler than >> operating the table itself. > > Sorry, I don't understand. Do you mean reusing the table allocation in > some way? Yes, that would be fairly complicated. Yes, to avoid allocation memory on both new item insertion and on the finalize operation itself. But I think such optimization does not really matter for the ldconfig usage. > >>> +/* Adds STRING to TABLE. May return the address of an existing entry. */ >>> +struct stringtable_entry *stringtable_intern (struct stringtable *table, >>> + const char *string); >> >> I think this name is confusing, why not just 'stringtable_add' or >> 'stringtable_add_element'? > > I'm changing it to stringtable_add. Didn't realize that interning is > obscure terminology. > > Thanls, > Florian >