From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug sanitizer/99814] regexec fails with -fsanitize=address
Date: Tue, 30 Mar 2021 12:39:04 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99814

--- Comment #7 from Jakub Jelinek ---
(In reply to Alex Richardson from comment #5)
> Does the sanitizer runtime library include the
> https://reviews.llvm.org/D96348 patch?
>
> IMO the real issue is that dlsym() with RTLD_NEXT selects the oldest
> versioned symbol. Not sure why that behaviour was chosen.
> I'm sure there are lots of other sanitizer interceptors that are also
> affected by https://sourceware.org/bugzilla/show_bug.cgi?id=1319.

dlsym behavior matches the behavior of normal symbol lookup resolution.
When glibc (or some other libraries) started, it was unversioned and later
symbol versions were added to it.
When linking against the very old glibc, libraries or binaries would use
unversioned symbols, and so for ABI compatibility those naturally need to be
resolved against the oldest symbol version.
Libraries/binaries linked against newer glibc versions then have versioned
symbols and use both the symbol name and symbol version in symbol lookup
(i.e. as dlvsym).

For dlsym, one doesn't really know in which era the library or binary was
linked and what it expects; it could be a very old binary, a newer one, or
the most recent, and if the same symbol has multiple symbol versions, which
one to choose is unknown.
So, for symbols with more than one symbol version one should use dlvsym
instead of dlsym.

Ideally, libsanitizer shared libraries would be symbol versioned: for their
own APIs with some sanitizer-specific symbol version(s), and for the symbols
they intercept from glibc with the symbol versions from the glibc they were
configured against. For symbols with multiple symbol versions one should
have multiple interceptors, which, if they call the intercepted function,
should use dlvsym.
That would mean scanning glibc symbol versions at library configure time and
deciding on the *san version scripts and predefined macros based on that.
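
To illustrate the dlsym vs. dlvsym distinction described above, here is a
minimal sketch, assuming glibc (dlvsym is a GNU extension). The helper name
next_symbol is hypothetical and is not taken from libsanitizer. Version
strings are architecture specific: on x86_64 glibc, regexec is believed to be
exported as both GLIBC_2.2.5 and GLIBC_2.3.4, but check your own libc (for
example with "objdump -T") rather than relying on that.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* exposes dlvsym() and RTLD_NEXT on glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: look up NAME in the objects that follow the caller
 * in the search order, as an interceptor would.
 *
 * With VERSION == NULL this is a plain dlsym() call, so which symbol
 * version is returned is left to the dynamic loader.
 * With a non-NULL VERSION, dlvsym() is used and only that exact symbol
 * version can match; NULL is returned if it does not exist.
 */
void *next_symbol(const char *name, const char *version)
{
    assert(name != NULL);
    dlerror();  /* clear any stale error state before the lookup */
    if (version != NULL)
        return dlvsym(RTLD_NEXT, name, version);
    return dlsym(RTLD_NEXT, name);
}
```

An interceptor for one particular version could then call, say,
next_symbol("regexec", "GLIBC_2.3.4") (an x86_64-specific assumption) and
cast the result to the matching function pointer type, while a second
interceptor would ask for the older version. On glibc releases before 2.34
the program must also be linked with -ldl.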