From: Fangrui Song
Date: Fri, 6 Jan 2023 15:52:30 -0800
Subject: Re: [PATCH] ld: Allow R_X86_64_GOTPCREL for call *__tls_get_addr@GOTPCREL(%rip)
To: "H.J. Lu"
Cc: binutils@sourceware.org
References: <20230105210542.3573076-1-maskray@google.com>

On Fri, Jan 6, 2023 at 3:20 PM H.J. Lu wrote:
>
> On Fri, Jan 6, 2023 at 3:02 PM Fangrui Song wrote:
> >
> > On Fri, Jan 6, 2023 at 2:42 PM H.J. Lu wrote:
> > >
> > > On Fri, Jan 6, 2023 at 1:44 PM Fangrui Song wrote:
> > > >
> > > > On Fri, Jan 6, 2023 at 1:27 PM H.J. Lu wrote:
> > > > >
> > > > > On Fri, Jan 6, 2023 at 1:25 PM Fangrui Song wrote:
> > > > > >
> > > > > > On Fri, Jan 6, 2023 at 1:14 PM H.J. Lu wrote:
> > > > > > >
> > > > > > > On Fri, Jan 6, 2023 at 10:48 AM Fangrui Song wrote:
> > > > > > > >
> > > > > > > > On Fri, Jan 6, 2023 at 9:04 AM H.J. Lu wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Jan 5, 2023 at 1:06 PM Fangrui Song via Binutils wrote:
> > > > > > > > > >
> > > > > > > > > > _Thread_local int a;
> > > > > > > > > > int main() { return a; }
> > > > > > > > > >
> > > > > > > > > > % gcc -fno-plt -fpic a.c -fuse-ld=bfd -Wa,-mrelax-relocations=no
> > > > > > > > > > /usr/bin/ld.bfd: /tmp/ccSSBgrg.o: TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against `a' at 0xd in section `.text' failed
> > > > > > > > > > /usr/bin/ld.bfd: failed to set dynamic section sizes: bad value
> > > > > > > > > > collect2: error: ld returned 1 exit status
> > > > > > > > > >
> > > > > > > > > > This commit fixes the issue.
> > > > > > > > > >
> > > > > > > > > >     PR ld/24784
> > > > > > > > > >     * bfd/elf64-x86-64.c (elf_x86_64_check_tls_transition): Allow
> > > > > > > > > >     R_X86_64_GOTPCREL.
> > > > > > > > > > ---
> > > > > > > > > >  bfd/elf64-x86-64.c | 2 +-
> > > > > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/bfd/elf64-x86-64.c b/bfd/elf64-x86-64.c
> > > > > > > > > > index 914f82d0151..095fe2e0fe6 100644
> > > > > > > > > > --- a/bfd/elf64-x86-64.c
> > > > > > > > > > +++ b/bfd/elf64-x86-64.c
> > > > > > > > > > @@ -1241,7 +1241,7 @@ elf_x86_64_check_tls_transition (bfd *abfd,
> > > > > > > > > >           if (largepic)
> > > > > > > > > >             return r_type == R_X86_64_PLTOFF64;
> > > > > > > > > >           else if (indirect_call)
> > > > > > > > > > -           return r_type == R_X86_64_GOTPCRELX;
> > > > > > > > > > +           return (r_type == R_X86_64_GOTPCRELX || r_type == R_X86_64_GOTPCREL);
> > > > > > > > > >           else
> > > > > > > > > >             return (r_type == R_X86_64_PC32 || r_type == R_X86_64_PLT32);
> > > > > > > > > >         }
> > > > > > > > > > --
> > > > > > > > > > 2.39.0.314.g84b9a713c41-goog
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Since the new TLS sequence was added after R_X86_64_GOTPCRELX was
> > > > > > > > > required for call, R_X86_64_GOTPCREL should be invalid in this TLS sequence.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > H.J.
> > > > > > > >
> > > > > > > > I have multiple arguments (albeit no single one is very strong) that
> > > > > > > > this 1-deletion-1-addition change provides benefits for users (IMHO
> > > > > > > > with no burden to binutils at all).
> > > > > > > >
> > > > > > > > Some projects may add -Wa,-mrelax-relocations=no to work around older
> > > > > > > > GNU ld. Later, the project's toolchain requirement may increase so that
> > > > > > > > it no longer needs to work around older GNU ld.
> > > > > > > > But a distribution may for some reason use a global -fno-plt (e.g.
> > > > > > > > Arch Linux) and then run into this TLS GD/LD->IE/LE optimization
> > > > > > > > issue.
> > > > > > > >
> > > > > > > > rust src/ci/docker/host-x86_64/*musl/Dockerfile
> > > > > > > > openjdk/jdk19u make/autoconf/flags-cflags.m4 (this file appears to be
> > > > > > > > copied into quite a few projects)
> > > > > > > > Linux kernel arch/x86/boot/compressed/Makefile (not a good example as
> > > > > > > > it doesn't use TLS AFAICT)
> > > > > > > >
> > > > > > > > R_X86_64_GOTPCREL isn't purely useless. It may help linker design: for
> > > > > > > > R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX, the linker can make a
> > > > > > > > decision upfront whether a GOT entry is needed
> > > > > > > > (this affects the size of .got, which may affect section layout and
> > > > > > > > whether other relocations may overflow).
> > > > > > > > This may increase the risk of 32-bit relocation overflow.
> > > > > > > > R_X86_64_GOTPCREL can mitigate the risk, with the user making that
> > > > > > > > choice consciously.
> > > > > > > >
> > > > > > > > rustc somehow disables x86 relaxed relocations and defaults to `-Z
> > > > > > >
> > > > > > > Why is that?
> > > > > >
> > > > > > It's assuredly rust's problem and I am trying to fix that in
> > > > > > https://github.com/rust-lang/rust/pull/106511
> > > > > >
> > > > > > The -Wa,-mrelax-relocations=no problem may affect more packages.
> > > > >
> > > > > -mrelax-relocations=no should be a workaround for the older linker. It
> > > > > shouldn't be used with the current linker.
> > > >
> > > > A project may choose to work with many linker versions.
> > > > For simplicity, before it decides to drop compatibility with GNU
> > > > ld<2.26 (AIUI GOTPCRELX was supported in 2.26),
> > > > it may unconditionally add -Wa,-mrelax-relocations=no, instead of
> > >
> > > -mrelax-relocations=no is only supported with the newer binutils.
> >
> > The relocatable object file producer and the consumer may be on
> > different machines and use different binutils versions.
> > https://github.com/rust-lang/rust/commit/305aca86f9d8d132650b495f610f9abe5239fec6
> > added -Wa,-mrelax-relocations=no so that the relocatable object files
> > can be used on a user machine with an old ld.
>
> But -fno-plt may not work with the old linker. For this matter,
> -Wa,-mrelax-relocations=no doesn't work with ld today. When
> -Wa,-mrelax-relocations=no is used, no features from newer linkers
> can be used.

A project may add -Wa,-mrelax-relocations=no so that its prebuilt
relocatable object files can be linked on a machine with an old ld.
A distribution (like Arch Linux and rust) using a new ld may decide to
use -fno-plt globally.
If they don't strip or override the project's -Wa,-mrelax-relocations=no,
they may run into the -fpic -fno-plt -Wa,-mrelax-relocations=no
incompatibility.

In the GCC/gas model, GCC doesn't know whether its emitted assembly
will be used with -mrelax-relocations=no. GCC just emits the -fno-plt
form of the TLS GD/LD code sequence.
If the assembly is fed into gas with -mrelax-relocations=no, the
output will be broken with current ld.
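For reference, the check that the patch relaxes can be modelled in
isolation. This is a minimal standalone sketch, not the bfd code: the
function names tls_get_addr_reloc_ok and self_check are made up for
illustration, the `patched` parameter is an assumption added so that
both behaviours can be compared, and the enum values are the x86-64
psABI relocation numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* x86-64 psABI relocation numbers used by the check.  */
enum reloc
{
  R_X86_64_PC32 = 2,
  R_X86_64_PLT32 = 4,
  R_X86_64_GOTPCREL = 9,
  R_X86_64_PLTOFF64 = 31,
  R_X86_64_GOTPCRELX = 41
};

/* Hypothetical stand-in for the tail of elf_x86_64_check_tls_transition:
   decide whether the relocation on the __tls_get_addr call is acceptable
   for a GD/LD transition.  `patched` selects the behaviour proposed in
   this thread; no such parameter exists in bfd.  */
bool
tls_get_addr_reloc_ok (enum reloc r_type, bool largepic, bool indirect_call,
                       bool patched)
{
  if (largepic)
    return r_type == R_X86_64_PLTOFF64;
  else if (indirect_call)
    return (r_type == R_X86_64_GOTPCRELX
            || (patched && r_type == R_X86_64_GOTPCREL));
  else
    return r_type == R_X86_64_PC32 || r_type == R_X86_64_PLT32;
}

/* Exercise the sketch; call this from main() when trying it out.  */
void
self_check (void)
{
  /* call *__tls_get_addr@GOTPCREL(%rip) assembled with
     -mrelax-relocations=no carries R_X86_64_GOTPCREL.  */
  assert (!tls_get_addr_reloc_ok (R_X86_64_GOTPCREL, false, true, false));
  assert (tls_get_addr_reloc_ok (R_X86_64_GOTPCREL, false, true, true));

  /* The relaxable form is accepted either way.  */
  assert (tls_get_addr_reloc_ok (R_X86_64_GOTPCRELX, false, true, false));
  assert (tls_get_addr_reloc_ok (R_X86_64_GOTPCRELX, false, true, true));

  /* Direct calls are unaffected by the change.  */
  assert (tls_get_addr_reloc_ok (R_X86_64_PLT32, false, false, true));
  assert (!tls_get_addr_reloc_ok (R_X86_64_GOTPCREL, false, false, true));
}
```

The first assertion corresponds to the failure in the original report:
the indirect call carries R_X86_64_GOTPCREL and the unpatched check
rejects it.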
I have tested some TLS examples and this simple patch works.

One can argue that GNU ld should either suppress the GD/LD=>IE/LE
optimization or support this old relocation type.
It appears that supporting the old relocation type is the simplest approach.

> > https://github.com/IHaskell/IHaskell/issues/636 and
> > https://github.com/dcos/dcos/commit/facda25019e07051f501b39720b4e71049bd0030
> > likely use the same argument.
> >
> > In other cases, the project may use -Wa,-mrelax-relocations=no with
> > Clang (where they assume a not-too-old version), but need to work with
> > system ld (which may be old).
>
> This shouldn't happen with as and ld from the same binutils.

I agree.

> > > > doing configure work to check linker support.
> > >
> > > The TLS sequence from -fno-plt doesn't work for the older linker.
> > > The older linker support should be dropped for -fno-plt.
> > >
> > > > Now a user may use -fno-plt (Arch Linux, rustc, maybe Alpine) and run
> > > > into the aforementioned TLS problem.
> > > >
> > > > This 1-deletion-1-addition change can address this issue with no
> > > > maintenance burden on binutils side in my opinion,
> > > > so I made this patch.
> > > >
> > > > The linker design I described is true as well. Whether GOTPCRELX leads
> > > > to a GOT entry can be decided at relocation scanning time, before the
> > > > section layout is decided.
> > > > Users may make a conscious decision to use GOTPCREL to avoid potential
> > > > relocation overflow risk.
> > > >
> > > > GOTPCREL isn't really dead. It can be used with Intel LAM and tagged
> > > > global variables (with non-zero high address bits)
> > > > https://reviews.llvm.org/D111343
> > > > GOTPCREL instead of GOTPCRELX makes it clear an instruction
> > > > referencing the variable isn't supposed to be relaxed.
> > >
> > > The address of the local symbol, foo, in
> > >
> > > movq foo@GOTPCREL(%rip), %rax
> > >
> > > is assigned by the linker. I am not sure how the tag is involved here.
> > > Besides, it is the call instruction here.
> >
> > This is an auxiliary argument. I wanted to emphasize that GOTPCREL
> > isn't dead and did not intend to use it with this call instruction.
> > If GOTPCRELX is used and the distance between the current location and
> > the symbol is larger than 2**31, this will trigger a relocation
> > overflow.
> >
> > This happens with tagged globals with non-zero high address bits.
>
> This sounds like it needs some linker changes to add tags to data variables.
> I am not sure if GOTPCREL alone is sufficient.
>
> > A linker can fix the problem by avoiding relaxation, increasing the
> > size of .got. This requires that it scans relocations more than once.
> > If whether a GOTPCRELX needs relaxation is decided upfront, then on
> > an arch which doesn't use range extension thunks, like x86, technically
> > relocations can just be scanned once.
> >
> > > > > > > > plt=no` and now relies on llvm-project to work around the GNU ld
> > > > > > > > compatibility issue.

--
宋方睿