From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 27 May 2022 13:32:02 -0700
From: Fangrui Song
To: Florian Weimer
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] dlsym: Make RTLD_NEXT prefer default version definition [#BZ #14932]
Message-ID: <20220527203202.r5vmk5ssufem4jga@google.com>
References: <20220520083507.2368165-1-maskray@google.com>
 <87k0a7fd9n.fsf@oldenburg.str.redhat.com>
 <20220527192448.n7bny5g3eyxou5xo@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <20220527192448.n7bny5g3eyxou5xo@google.com>
List-Id: Libc-alpha mailing list

On 2022-05-27, Fangrui Song wrote:
>On 2022-05-27, Florian Weimer wrote:
>>I've been looking at this for a while.
>>
>>We currently have this code in elf/dl-lookup.c:
>>
>>      /* No specific version is selected.  There are two ways we
>>         can got here:
>>
>>         - a binary which does not include versioning information
>>         is loaded
>>
>>         - dlsym() instead of dlvsym() is used to get a symbol which
>>         might exist in more than one form
>>
>>         If the library does not provide symbol version information
>>         there is no problem at all: we simply use the symbol if it
>>         is defined.
>>
>>         These two lookups need to be handled differently if the
>>         library defines versions.  In the case of the old
>>         unversioned application the oldest (default) version
>>         should be used.  In case of a dlsym() call the latest and
>>         public interface should be returned.  */
>>      if (verstab != NULL)
>>        {
>>          if ((verstab[symidx] & 0x7fff)
>>              >= ((flags & DL_LOOKUP_RETURN_NEWEST) ? 2 : 3))
>>            {
>>              /* Don't accept hidden symbols.  */
>>              if ((verstab[symidx] & 0x8000) == 0
>>                  && (*num_versions)++ == 0)
>>                /* No version so far.  */
>>                *versioned_sym = sym;
>>
>>              return NULL;
>>            }
>>        }
>>
>>The numbers 2 and 3 look suspicious.  The condition involving
>>num_versions ensures that we only store the first matching symbol in
>>*versioned_sym.  But the skipped version indices are not special.
>>Indices are not specific to individual symbols, so it is no clear that
>>index 2 is special and contains the oldest possible version for that
>>particular symbol, and the next index will have the latest version.
>>
>>Aligning dlvsym with dlsym still makes sense, so I suggest to proceed
>>with this patch.  But I think there are more bugs in this area.

I suspect the code (from 199x) has a hidden assumption that for a stem
symbol (say foo), the .dynsym entries (say foo@v1, foo@v2, foo@@v3) are
ordered by the version node indices (in .gnu.version_d). If so:

* (flags & DL_LOOKUP_RETURN_NEWEST) == 0 => the symbol with verstab[symidx]==2 will skip the check and be returned
* (flags & DL_LOOKUP_RETURN_NEWEST) != 0 => only the default version symbol gets assigned in `*versioned_sym = sym`

Unfortunately, this assumption holds for none of GNU ld, gold, and ld.lld
(it seems unrelated to --hash-style=):

```
% ld.bfd @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
     5: 0000000000001100     6 FUNC    GLOBAL DEFAULT   13 foo@v1
     8: 0000000000001120     6 FUNC    GLOBAL DEFAULT   13 foo@@v3
     9: 0000000000001110     6 FUNC    GLOBAL DEFAULT   13 foo@v2
% gold @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
     7: 00000000000007c0     6 FUNC    GLOBAL DEFAULT   14 foo@@v3
     9: 00000000000007a0     6 FUNC    GLOBAL DEFAULT   14 foo@v1
    10: 00000000000007b0     6 FUNC    GLOBAL DEFAULT   14 foo@v2
% ld.lld @response.txt && readelf -W --dyn-syms nextmod3.so | grep foo@
     7: 0000000000001740     6 FUNC    GLOBAL DEFAULT   13 foo@@v3
     8: 0000000000001720     6 FUNC    GLOBAL DEFAULT   13 foo@v1
     9: 0000000000001730     6 FUNC    GLOBAL DEFAULT   13 foo@v2
```

I think the simplification
https://sourceware.org/pipermail/libc-alpha/2022-May/138304.html
([PATCH] elf: Remove one-default-version check when
searching an unversioned symbol) can proceed, but a bug fix will still
be needed.

>>>diff --git a/elf/nextmod3.c b/elf/nextmod3.c
>>>new file mode 100644
>>>index 0000000000..96608a65c0
>>>--- /dev/null
>>>+++ b/elf/nextmod3.c
>>>@@ -0,0 +1,19 @@
>>>+int
>>>+foo_v1 (int a)
>>>+{
>>>+  return 1;
>>>+}
>>>+asm (".symver foo_v1, foo@v1");
>>>+
>>>+int
>>>+foo_v2 (int a)
>>>+{
>>>+  return 2;
>>>+}
>>>+asm (".symver foo_v2, foo@v2");
>>>+
>>>+int
>>>+foo (int a)
>>>+{
>>>+  return 3;
>>>+}
>>
>>Please set foo@@v3 explicitly.
>
>Ack. Having asm (".symver foo, foo@@@v3") is clearer, though it is
>redundant since the version script specifies foo in the v3 version node.
>
>>>diff --git a/elf/nextmod3.map b/elf/nextmod3.map
>>>new file mode 100644
>>>index 0000000000..0a8e4e4ee3
>>>--- /dev/null
>>>+++ b/elf/nextmod3.map
>>>@@ -0,0 +1,3 @@
>>>+v1 { };
>>>+v2 { };
>>>+v3 { foo; };
>>
>>These versions are not ordered.  Maybe use this instead?
>>
>>v1 { };
>>v2 { } v1;
>>v3 { foo; } v2;
>
>The dependencies are a no-op in lld but tracks version definition
>dependencies in GNU ld and gold (see readelf -V output: `Parent 1`).
>
>The vd_cnt member is ignored in glibc and FreeBSD rtld, so omitting it
>should be fine. That said, I'll add it...