From: "i at maskray dot me"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/100593] New: [ELF] -fno-pic: Use GOT to take address of an external default visibility function
Date: Fri, 14 May 2021 00:44:37 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593

            Bug ID: 100593
           Summary: [ELF] -fno-pic: Use GOT to take address of an external
                    default visibility function
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: i at maskray dot me
  Target Milestone: ---

Most ELF targets use an absolute relocation (e.g. R_X86_64_32) to take the address of a default visibility non-definition function declaration.
The absolute relocation can cause a canonical PLT entry (st_shndx=0, st_value!=0; the term is used by a few LLD developers but is not broadly adopted). If the defining DSO is linked with -Bsymbolic-functions (or -Bsymbolic), the address taken inside the DSO differs from the address taken outside it. Since C++ requires function addresses to be unique, this violates the language standard.

Outside of the GNU ELF world, many dynamic linking implementations have shifted to direct binding and non-interposition by default. People have been complaining about shared object performance, e.g.

https://lore.kernel.org/lkml/CAHk-=whs8QZf3YnifdLv57+FhBi5_WeNTG1B-suOES=RcUSmQg@mail.gmail.com/
"Re: Very slow clang kernel config .."

https://www.facebook.com/dan.colascione/posts/10107358290728348
"Python is 1.3x faster when compiled in a way that re-examines shitty technical decisions from the 1990s."

I believe ld -Bsymbolic-functions can deliver most of the savings that other implementations provide, without introducing complex things to ELF. However, since -Bsymbolic-functions doesn't play well with -fno-pic's canonical PLT entries, we should fix -fno-pic.

Converting a direct access to a GOT access for a function symbol should not affect any performance-critical path, so let's just do it. Static linking is happy, too: the linker can either optimize out the GOT (x86-64 GOTPCRELX, PPC64 TOC) or prefill the GOT entry with a constant.

Once -fno-pic has the sane behavior (GOT by default), more and more shared objects that don't intend to support interposition can be built with -Bsymbolic-functions while remaining compatible with -fno-pic executables.

How effective is -Bsymbolic-functions? As a data point, my x86_64 Linux kernel defconfig build is 15% faster with a Clang linked with -Bsymbolic-functions. (83% of JUMP_SLOT relocations are eliminated!)

% cat a.c
extern void fun();
void *get() { return (void*)fun; }
% gcc -fno-pic -S a.c -O2 -o -
get:
.LFB0:
        .cfi_startproc
        movl    $fun, %eax
        ret
% aarch64-linux-gnu-gcc -fno-pic -S a.c -O2 -o -
...
        adrp    x0, fun
        add     x0, x0, :lo12:fun

# good, ppc64 elfv2 always uses TOC
% powerpc64le-linux-gnu-gcc -fno-pic -S a.c -O2 -o -
...
        addis 3,2,.LC0@toc@ha
        ld 3,.LC0@toc@l(3)
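
To make the requested codegen change concrete, here is a rough source-level model. It is only a sketch, not what GCC emits. The name got_slot_fun is hypothetical and stands in for a GOT entry; fun is defined locally (in the report it is external) so that the file links on its own. get_direct corresponds to today's absolute-relocation form, and get_via_slot corresponds to the proposed load through a pointer-sized slot that the linker or loader fills in.

```c
#include <assert.h>

/* In the bug report `fun` is an external declaration; it is defined here
   only so that this sketch links as a single file. */
void fun(void) {}

/* Hypothetical stand-in for a GOT entry: one pointer-sized slot holding the
   resolved address of `fun`. `volatile` keeps the compiler from folding the
   load away, so the access stays an actual memory read. */
static void (*volatile got_slot_fun)(void) = fun;

/* Models the current -fno-pic form: the address is taken directly
   (e.g. `movl $fun, %eax` on x86-64). */
void *get_direct(void) { return (void *)fun; }

/* Models the proposed form: the address is loaded from the slot. */
void *get_via_slot(void) { return (void *)got_slot_fun; }

/* Both forms must yield the same address for pointer equality to hold. */
void check_unique_address(void) {
    assert(get_direct() == get_via_slot());
}
```

Within one translation unit the two forms trivially agree. The point of the model is that the slot form leaves the final address to whoever fills the slot, so an executable and a DSO linked with -Bsymbolic-functions can agree on it without a canonical PLT entry.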