From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id 43F843858006 for ; Mon, 18 Jan 2021 20:28:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 43F843858006 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-156-TDgriMXfNWCWF3Yx0zNXWA-1; Mon, 18 Jan 2021 15:28:35 -0500 X-MC-Unique: TDgriMXfNWCWF3Yx0zNXWA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1089C800D53; Mon, 18 Jan 2021 20:28:34 +0000 (UTC) Received: from oldenburg.str.redhat.com (ovpn-112-110.ams2.redhat.com [10.36.112.110]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B5EB75D9D3; Mon, 18 Jan 2021 20:28:29 +0000 (UTC) From: Florian Weimer To: Chung-Lin Tang Cc: GNU C Library , Carlos O'Donell Subject: Re: [PATCH v4 2/2] BZ #17645, fix slow DSO sorting behavior in dynamic loader -- Algorithm changes References: <85ab188b-64c2-d1fd-33fa-d7d2e6fb2d97@codesourcery.com> Date: Mon, 18 Jan 2021 21:28:25 +0100 In-Reply-To: <85ab188b-64c2-d1fd-33fa-d7d2e6fb2d97@codesourcery.com> (Chung-Lin Tang's message of "Sun, 3 Jan 2021 19:22:48 +0800") Message-ID: <87o8hmkn12.fsf@oldenburg.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-6.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jan 2021 20:28:42 -0000 * Chung-Lin Tang: > +/* We use a recursive function due to its better clarity and ease of > + implementation, as well as faster execution speed. We already use > + alloca() for list allocation during the breadth-first search of > + dependencies in _dl_map_object_deps(), and this should be on the > + same order of worst-case stack usage. */ > + > +static void > +dfs_traversal (struct link_map ***rpo, struct link_map *map, > +=09 bool *do_reldeps) > +{ Could you please add a comment describing the rpo, that it points past the end of the destination array? That's not obvious based on the interface. Although I'm quite happy how simple the code is to read. 8-) > + if (map->l_visited) > + return; > + > + map->l_visited =3D 1; > + > + if (map->l_initfini) > + { > + for (int i =3D 0; map->l_initfini[i] !=3D NULL; i++) > +=09{ > +=09 struct link_map *dep =3D map->l_initfini[i]; > +=09 if (dep->l_visited =3D=3D 0) > +=09 dfs_traversal (rpo, dep, do_reldeps); > +=09} > + } First DT_NEEDED dependency is added first =E2=80=A6 > + > + if (__glibc_unlikely (do_reldeps !=3D NULL && map->l_reldeps !=3D NULL= )) > + { > + /* Indicate that we encountered relocation dependencies during > +=09 traversal. */ > + *do_reldeps =3D true; > + > + for (int m =3D map->l_reldeps->act - 1; m >=3D 0; m--) > +=09{ > +=09 struct link_map *dep =3D map->l_reldeps->list[m]; > +=09 if (dep->l_visited =3D=3D 0) > +=09 dfs_traversal (rpo, dep, do_reldeps); > +=09} > + } > + > + *rpo -=3D 1; > + **rpo =3D map; > +} =E2=80=A6 and the depending object is added last. The l_visited check at t= he start (and in the loops) ensures that this doesn't happen before. > +/* Topologically sort array MAPS according to dependencies of the contai= ned > + objects. */ > + > +static void > +_dl_sort_maps_dfs (struct link_map **maps, unsigned int nmaps, > +=09=09 unsigned int skip __attribute__ ((unused)), bool for_fini) > +{ > + for (int i =3D nmaps - 1; i >=3D 0; i--) > + { > + dfs_traversal (&rpo_head, maps[i], do_reldeps_ref); > + > + /* We can break early if all objects are already placed. */ > + if (rpo_head =3D=3D rpo) > +=09goto end; > + } > + assert (rpo_head =3D=3D rpo); Should =E2=80=9Cgoto end=E2=80=9D be a =E2=80=9Cbreak=E2=80=9D? The compil= er elides the assert for that case, I expect. > + /* Here we may do a second pass of sorting, using only l_initfini[] > + static dependency links. This is avoided if !FOR_FINI or if we didn= 't > + find any reldeps in the first DFS traversal. > + > + The reason we do this is: while it is unspecified how circular > + dependencies should be handled, the presumed reasonable behavior is= to > + have destructors to respect static dependency links as much as poss= ible, > + overriding reldeps if needed. And the first sorting pass, which tak= es > + l_initfini/l_reldeps links equally, may not preserve this priority. > + > + Hence we do a 2nd sorting pass, taking only DT_NEEDED links into ac= count > + (see how the do_reldeps argument to dfs_traversal() is NULL below).= */ > + if (do_reldeps) > + { > + for (int i =3D nmaps - 1; i >=3D 0; i--) > +=09rpo[i]->l_visited =3D 0; > + > + struct link_map **maps_head =3D &maps[nmaps]; > + for (int i =3D nmaps - 1; i >=3D 0; i--) > +=09{ > +=09 dfs_traversal (&maps_head, rpo[i], NULL); > + > +=09 /* We can break early if all objects are already placed. > +=09 The below memcpy is not needed in the do_reldeps case here, > +=09 since we wrote back to maps[] during DFS traversal. */ > +=09 if (maps_head =3D=3D maps) > +=09 return; > +=09} > + assert (maps_head =3D=3D maps); > + return; > + } About that second sort: I *suspect* the original goal was to use reldeps for constructor ordering as well. There were actually two sorting calls in the original implementation of shared object loading, but the second happend *before* relocation processing, so it never encountered any collected relocations. They only mattered on dlclose/process termination, and this is not actually what users want. At this point, I am quite convinced that we should get rid of reldeps sorting completely. If we reorder the link map chaining at load time, I think we won't need to have to do *any* sorting at closing time. This means that we also don't have to allocate any memory for it. Does anyone have concerns about moving in that direction? I do believe that we should make one dependency sorting change in one release, and not do this transition piecewise. > +void > +_dl_sort_maps (struct link_map **maps, unsigned int nmaps, > +=09 unsigned int skip, bool for_fini) > +{ > + /* Index code for sorting algorithm currently in use. */ > + static int32_t algorithm =3D 0; > + if (__glibc_unlikely (algorithm =3D=3D 0)) > + algorithm =3D TUNABLE_GET (glibc, rtld, dynamic_sort, > +=09=09=09 int32_t, NULL); > + > + /* It can be tempting to use a static function pointer to store and ca= ll > + the current selected sorting algorithm routine, but experimentation > + shows that current processors still do not handle indirect branches > + that efficiently, plus a static function pointer will involve > + PTR_MANGLE/DEMANGLE, further impairing performance of small, common > + input cases. A simple if-case with direct function calls appears to > + be the fastest. */ > + if (__glibc_likely (algorithm =3D=3D 1)) > + _dl_sort_maps_original (maps, nmaps, skip, for_fini); > + else if (algorithm =3D=3D 2) > + _dl_sort_maps_dfs (maps, nmaps, skip, for_fini); > + else > + __builtin_unreachable (); > +} Some nitpicking: We could declare the function pointer attribute_relro, and since this function is necessarily called before final relocation, it will be set before RELRO because active, so writing to it still permitted. The flag-based solution is of course fine, so I suggest just to remove the comment. Thanks, Florian --=20 Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'N= eill