From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-x42b.google.com (mail-pf1-x42b.google.com [IPv6:2607:f8b0:4864:20::42b]) by sourceware.org (Postfix) with ESMTPS id 8335A3858C41 for ; Thu, 28 Dec 2023 14:55:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8335A3858C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8335A3858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::42b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703775321; cv=none; b=nt7uI04s3a7bFvb7Dj+S1NaWCzJ/XDg654vtR3Jv6hNtjtyi4cA0JDJg5rWQNs3+QpfXpYWoj44+sdQB8LczwB331SvVuBIhZI5YQDveU3GWN3Ee4e8Kx9GW47mZghrBAK47EN0X+IYFFpBIu2fRvUg5QwpCJPvNKaU2JvSo11A= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703775321; c=relaxed/simple; bh=1D46BwirTZxF3VdQavLiIKP4O1LPLXpMrevR1jj3hN4=; h=DKIM-Signature:Mime-Version:Subject:From:Date:Message-Id:To; b=FvyrJv8pnGfAxLF27nIUrVNtmCXidRl90pm8L7TKt1FbAq1cjpXIZ0x8pA/TLSNPaQG/1me1LuS9dXGIz+Gbr7WMcXn36fU/ROz3b8PR+9ivB6t0k1f6UDFKLTnoCXnpfo5ztjVJ55dvuUYx/UdasylxSLIFiabltWUq48rauQc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pf1-x42b.google.com with SMTP id d2e1a72fcca58-6d98e7baad4so285746b3a.1 for ; Thu, 28 Dec 2023 06:55:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1703775315; x=1704380115; darn=sourceware.org; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=RrzyBeTfgFWJcJt7nQjsXINqNcUVPcSzBuhjaQNtFDw=; b=SRPdo8qJ/DdKbLMCLs6xFu1JxFZHefsYws/+ftzEuhInUrT+GHl3PCaRlhf080TN0R 79AlUMBuoewWUuXrj8VGn6JqKZP0mJjkJDUGhKTss6G85c+n/jl9NySJnmUNsbo7OBQo 32DTVQOGC398/I2+/zIOljBwOe3vsEfKCy+ajdTRTp03yZc7DHUk+83V2sUCkuy/eXZD sWPQXEg/Fg33pPs64HFtLzNpFLkPxWZItYcjoseIF/QIX04eFki7JkJKoM/OnYZ9S8Ij UrlddMLVOnmY9AoLg76aQqIQ1Wres5fWm1IzMADcveYeSpY7j7u0Gk20fcp/Q+vw7574 XDJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703775315; x=1704380115; h=to:references:message-id:content-transfer-encoding:cc:date :in-reply-to:from:subject:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RrzyBeTfgFWJcJt7nQjsXINqNcUVPcSzBuhjaQNtFDw=; b=LbBFcSEJf3kwtU86AhnlT9BJYZ7O4dfxMEpzAsEbsGV/4o9eyP8rec+eUcwL7Nwh1S chLS4QOks3DNqZu6LZCTPaP1GGYfrfYU3UOxpYicRVY+A9YI1dzxzjfJPNapHbiEYz6V UGCC50ahefCGLy3JAZWzyrBiyQ7bdrsOx65UIlO/A/zyandeA3kJiDVcE3XSOTqFrGkp IH2OOqCtLydBUH5ZTb1wLeG3ROmUSVo6vCb+q2YFREdKEJTt32ZGAd2QYqv/+6rTt8OO Zs/i7kYuLnnk72D1oO+tl6EB9DL4/Fd/jOT5BoOUodYtEBJ34scF5jI8zxlB7ITZR977 9pxg== X-Gm-Message-State: AOJu0YxGfZUTSWxW8FwURG2ullo0DKxxN7xCfwRgcq/NglB02zST33mB C3d5VZt88HUSwEPDZM0dtAg= X-Google-Smtp-Source: AGHT+IGYmlHMj9Cdc59uUUUZE+hWF5xoSRSzb+4arx/0QAQHSt2CBnFnD8H8OyqRvujZ4/dnGkZaog== X-Received: by 2002:a05:6a00:188d:b0:6d9:6ab8:e40f with SMTP id x13-20020a056a00188d00b006d96ab8e40fmr21134732pfh.0.1703775315245; Thu, 28 Dec 2023 06:55:15 -0800 (PST) Received: from smtpclient.apple (zz20184013906F627101.userreverse.dion.ne.jp. [111.98.113.1]) by smtp.gmail.com with ESMTPSA id v29-20020aa799dd000000b006d40f44dc03sm11722782pfi.11.2023.12.28.06.55.11 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 28 Dec 2023 06:55:14 -0800 (PST) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.300.61.1.2\)) Subject: Re: [PATCH v5 2/5] LoongArch: Add support for TLSDESC in ld. From: Tatsuyuki Ishi In-Reply-To: <20231222114243.1836112-3-cailulu@loongson.cn> Date: Thu, 28 Dec 2023 23:54:52 +0900 Cc: binutils@sourceware.org, xuchenghua@loongson.cn, chenglulu@loongson.cn, liuzhensong@loongson.cn, mengqinggang@loongson.cn, xry111@xry111.site, i.swmail@xen0n.name, maskray@google.com, luweining@loongson.cn, wanglei@loongson.cn, hejinyang@loongson.cn Content-Transfer-Encoding: quoted-printable Message-Id: <7DE21EC8-207D-41AA-B82E-0950D482F07B@gmail.com> References: <20231222114243.1836112-1-cailulu@loongson.cn> <20231222114243.1836112-3-cailulu@loongson.cn> To: Lulu Cai X-Mailer: Apple Mail (2.3774.300.61.1.2) X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi, author of RISC-V TLSDESC patches here. I recently learned that = LoongArch is working on a similar TLSDESC extension too. I=E2=80=99ll = leave some of my thoughts on the patch set. > On Dec 22, 2023, at 20:42, Lulu Cai wrote: >=20 > 1.The linker for each DESC generates a R_LARCH_TLS_DESC64 dynamic > relocation, which relocation is placed at .rela.dyn. > TLSDESC always allocates two GOT slots and one dynamic relocation > space to TLSDESC. > 2. When using multiple ways to access the same TLS variable, a > maximum of 5 GOT slots are used. For example, using GD, TLSDESC, > and IE to access the same TLS variable, GD always uses the first > two of the five GOT, TLSDESC uses the third and fourth, and IE > uses the last. > --- > bfd/elfnn-loongarch.c | 168 ++++++++++++++++++++++++++++++++++++------ > 1 file changed, 146 insertions(+), 22 deletions(-) >=20 > diff --git a/bfd/elfnn-loongarch.c b/bfd/elfnn-loongarch.c > index faffe27294a..95a39148f73 100644 > --- a/bfd/elfnn-loongarch.c > +++ b/bfd/elfnn-loongarch.c > @@ -48,6 +48,12 @@ struct loongarch_elf_link_hash_entry > #define GOT_TLS_GD 2 > #define GOT_TLS_IE 4 > #define GOT_TLS_LE 8 > +#define GOT_TLS_GDESC 16 > + > +#define GOT_TLS_GD_BOTH_P(tls_type) \ > + ((tls_type & GOT_TLS_GD) && (tls_type & GOT_TLS_GDESC)) > +#define GOT_TLS_GD_ANY_P(tls_type) \ > + ((tls_type & GOT_TLS_GD) || (tls_type & GOT_TLS_GDESC)) > char tls_type; > }; >=20 > @@ -563,6 +569,7 @@ loongarch_elf_record_tls_and_got_reference (bfd = *abfd, > case GOT_NORMAL: > case GOT_TLS_GD: > case GOT_TLS_IE: > + case GOT_TLS_GDESC: > /* Need GOT. */ > if (htab->elf.sgot =3D=3D NULL > && !loongarch_elf_create_got_section (htab->elf.dynobj, info)) > @@ -750,6 +757,14 @@ loongarch_elf_check_relocs (bfd *abfd, struct = bfd_link_info *info, > return false; > break; >=20 > + case R_LARCH_TLS_DESC_PC_HI20: > + case R_LARCH_TLS_DESC_HI20: > + if (!loongarch_elf_record_tls_and_got_reference (abfd, info, = h, > + r_symndx, > + = GOT_TLS_GDESC)) > + return false; > + break; > + > case R_LARCH_ABS_HI20: > case R_LARCH_SOP_PUSH_ABSOLUTE: > if (h !=3D NULL) > @@ -1130,7 +1145,7 @@ allocate_dynrelocs (struct elf_link_hash_entry = *h, void *inf) >=20 > s =3D htab->elf.sgot; > h->got.offset =3D s->size; > - if (tls_type & (GOT_TLS_GD | GOT_TLS_IE)) > + if (tls_type & (GOT_TLS_GD | GOT_TLS_IE | GOT_TLS_GDESC)) > { > /* TLS_GD needs two dynamic relocs and two GOT slots. */ > if (tls_type & GOT_TLS_GD) > @@ -1167,7 +1182,15 @@ allocate_dynrelocs (struct elf_link_hash_entry = *h, void *inf) > htab->elf.srelgot->size +=3D sizeof = (ElfNN_External_Rela); > } > } > + > + /* TLS_DESC needs one dynamic reloc and two GOT slot. */ > + if (tls_type & GOT_TLS_GDESC) > + { > + s->size +=3D GOT_ENTRY_SIZE * 2; > + htab->elf.srelgot->size +=3D sizeof (ElfNN_External_Rela); > + } > } > + > else > { > s->size +=3D GOT_ENTRY_SIZE; > @@ -1670,19 +1693,34 @@ loongarch_elf_size_dynamic_sections (bfd = *output_bfd, > if (0 < *local_got) > { > *local_got =3D s->size; > + if (*local_tls_type & (GOT_TLS_GD | GOT_TLS_IE | = GOT_TLS_GDESC)) > + { > + /* TLS gd use two got. */ > + if (*local_tls_type & GOT_TLS_GD) > + { > + s->size +=3D 2 * GOT_ENTRY_SIZE; > + if (!bfd_link_executable (info)) > + srel->size +=3D sizeof (ElfNN_External_Rela); > + } >=20 > - /* TLS gd use two got. */ > - if (*local_tls_type & GOT_TLS_GD) > - s->size +=3D GOT_ENTRY_SIZE * 2; > - else > - /* Normal got, tls ie/ld use one got. */ > - s->size +=3D GOT_ENTRY_SIZE; > + /* TLS_DESC use two got. */ > + if (*local_tls_type & GOT_TLS_GDESC) > + { > + s->size +=3D 2 * GOT_ENTRY_SIZE; > + srel->size +=3D sizeof (ElfNN_External_Rela); > + } >=20 > - if (bfd_link_executable (info) > - && (*local_tls_type & (GOT_TLS_GD| GOT_TLS_IE))) > - ;/* Do nothing. */ > + /* TLS ie and use one got. */ > + if (*local_tls_type & GOT_TLS_IE) > + { > + s->size +=3D GOT_ENTRY_SIZE; > + if (!bfd_link_executable (info)) > + srel->size +=3D sizeof (ElfNN_External_Rela); > + } > + } > else > { > + s->size +=3D GOT_ENTRY_SIZE; > srel->size +=3D sizeof (ElfNN_External_Rela); > } > } > @@ -2126,6 +2164,15 @@ perform_relocation (const Elf_Internal_Rela = *rel, asection *input_section, > case R_LARCH_TLS_GD_HI20: > case R_LARCH_PCREL20_S2: > case R_LARCH_CALL36: > + case R_LARCH_TLS_DESC_PC_HI20: > + case R_LARCH_TLS_DESC_PC_LO12: > + case R_LARCH_TLS_DESC64_PC_LO20: > + case R_LARCH_TLS_DESC64_PC_HI12: > + case R_LARCH_TLS_DESC_HI20: > + case R_LARCH_TLS_DESC_LO12: > + case R_LARCH_TLS_DESC64_LO20: > + case R_LARCH_TLS_DESC64_HI12: > + > r =3D loongarch_check_offset (rel, input_section); > if (r !=3D bfd_reloc_ok) > break; > @@ -2135,6 +2182,11 @@ perform_relocation (const Elf_Internal_Rela = *rel, asection *input_section, > contents, value); > break; >=20 > + case R_LARCH_TLS_DESC_LD: > + case R_LARCH_TLS_DESC_CALL: > + r =3D bfd_reloc_ok; > + break; > + > case R_LARCH_RELAX: > break; >=20 > @@ -2383,10 +2435,10 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, > struct elf_link_hash_entry *h =3D NULL; > const char *name; > bfd_reloc_status_type r =3D bfd_reloc_ok; > - bool is_ie, is_undefweak, unresolved_reloc, defined_local; > + bool is_ie, is_desc, is_undefweak, unresolved_reloc, = defined_local; > bool resolved_local, resolved_dynly, resolved_to_const; > char tls_type; > - bfd_vma relocation, off, ie_off; > + bfd_vma relocation, off, ie_off, desc_off; > int i, j; >=20 > howto =3D loongarch_elf_rtype_to_howto (input_bfd, r_type); > @@ -2515,6 +2567,7 @@ loongarch_elf_relocate_section (bfd *output_bfd, = struct bfd_link_info *info, >=20 > BFD_ASSERT (!resolved_local || defined_local); >=20 > + is_desc =3D false; > is_ie =3D false; > switch (r_type) > { > @@ -3405,6 +3458,8 @@ loongarch_elf_relocate_section (bfd *output_bfd, = struct bfd_link_info *info, > case R_LARCH_TLS_LD_HI20: > case R_LARCH_TLS_GD_PC_HI20: > case R_LARCH_TLS_GD_HI20: > + case R_LARCH_TLS_DESC_PC_HI20: > + case R_LARCH_TLS_DESC_HI20: > BFD_ASSERT (rel->r_addend =3D=3D 0); > unresolved_reloc =3D false; >=20 > @@ -3412,6 +3467,10 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, > || r_type =3D=3D R_LARCH_TLS_IE_HI20) > is_ie =3D true; >=20 > + if (r_type =3D=3D R_LARCH_TLS_DESC_PC_HI20 > + || r_type =3D=3D R_LARCH_TLS_DESC_HI20) > + is_desc =3D true; > + I see that the got_off assignment logic is duplicated for = R_LARCH_SOP_PUSH_TLS_{GOT, GD}, but that path was not modified in this = patch. Is this OK? Ideally, the logic should not be duplicated at all. > bfd_vma got_off =3D 0; > if (h !=3D NULL) > { > @@ -3426,9 +3485,19 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, >=20 > BFD_ASSERT (got_off !=3D MINUS_ONE); >=20 > - ie_off =3D 0; > tls_type =3D _bfd_loongarch_elf_tls_type (input_bfd, h, = r_symndx); > - if ((tls_type & GOT_TLS_GD) && (tls_type & GOT_TLS_IE)) > + > + /* If a tls variable is accessed in multiple ways, GD uses > + the first two slots of GOT, desc follows with two slots, > + and IE uses one slot at the end. */ > + desc_off =3D 0; > + if (GOT_TLS_GD_BOTH_P (tls_type)) > + desc_off =3D 2 * GOT_ENTRY_SIZE; > + > + ie_off =3D 0; > + if (GOT_TLS_GD_BOTH_P (tls_type) && (tls_type & GOT_TLS_IE)) > + ie_off =3D 4 * GOT_ENTRY_SIZE; > + else if (GOT_TLS_GD_ANY_P (tls_type) && (tls_type & = GOT_TLS_IE)) I think the calculation should be done additively, instead of checking = for every possible combination. That is, off =3D 0; if (tls gd) off +=3D gd size; desc_off =3D off; if (tls ie) off +=3D desc size; ie_off =3D off; would be much simpler. > ie_off =3D 2 * GOT_ENTRY_SIZE; >=20 > if ((got_off & 1) =3D=3D 0) > @@ -3477,6 +3546,21 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, > loongarch_elf_append_rela (output_bfd, relgot, = &rela); > } > } > + if (tls_type & GOT_TLS_GDESC) > + { > + /* Unless it is a static link, DESC always emits a > + dynamic relocation. */ > + int indx =3D h && h->dynindx !=3D -1 ? h->dynindx : 0; > + rela.r_offset =3D sec_addr (got) + got_off + desc_off; > + rela.r_addend =3D 0; > + if (indx =3D=3D 0) > + rela.r_addend =3D relocation - elf_hash_table = (info)->tls_sec->vma; > + > + rela.r_info =3D ELFNN_R_INFO (indx, = R_LARCH_TLS_DESCNN); > + loongarch_elf_append_rela (output_bfd, relgot, &rela); > + bfd_put_NN (output_bfd, 0, > + got->contents + got_off + desc_off); > + } > if (tls_type & GOT_TLS_IE) > { > rela.r_offset =3D sec_addr (got) + got_off + ie_off; > @@ -3504,16 +3588,52 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, > } > } > } > - relocation =3D (got_off & (~(bfd_vma)1)) + sec_addr (got) > - + (is_ie ? ie_off : 0); > + relocation =3D (got_off & (~(bfd_vma)1)) + sec_addr (got); > + if (is_desc) > + relocation +=3D desc_off; > + else if (is_ie) > + relocation +=3D ie_off; >=20 > if (r_type =3D=3D R_LARCH_TLS_LD_PC_HI20 > || r_type =3D=3D R_LARCH_TLS_GD_PC_HI20 > - || r_type =3D=3D R_LARCH_TLS_IE_PC_HI20) > + || r_type =3D=3D R_LARCH_TLS_IE_PC_HI20 > + || r_type =3D=3D R_LARCH_TLS_DESC_PC_HI20) > RELOCATE_CALC_PC32_HI20 (relocation, pc); >=20 > break; >=20 > + case R_LARCH_TLS_DESC_PC_LO12: > + case R_LARCH_TLS_DESC64_PC_LO20: > + case R_LARCH_TLS_DESC64_PC_HI12: > + case R_LARCH_TLS_DESC_LO12: > + case R_LARCH_TLS_DESC64_LO20: > + case R_LARCH_TLS_DESC64_HI12: > + { > + unresolved_reloc =3D false; > + > + if (h) > + relocation =3D sec_addr (got) + (h->got.offset & = (~(bfd_vma)1)); > + else > + relocation =3D sec_addr (got) > + + (local_got_offsets[r_symndx] & = (~(bfd_vma)1)); > + > + tls_type =3D _bfd_loongarch_elf_tls_type (input_bfd, h, = r_symndx); > + /* Use both TLS_GD and TLS_DESC. */ > + if ((tls_type & GOT_TLS_GD) && (tls_type & GOT_TLS_GDESC)) > + relocation +=3D 2 * GOT_ENTRY_SIZE; For simplicity, I think the ie_off and desc_off calculation should be = moved out of the switch. Then you can just reuse the desc/ie_off = calculated there. > + } > + > + if (r_type =3D=3D R_LARCH_TLS_DESC64_PC_LO20 > + || r_type =3D=3D R_LARCH_TLS_DESC64_PC_HI12) > + RELOCATE_CALC_PC64_HI32 (relocation, pc); > + > + break; > + > + case R_LARCH_TLS_DESC_LD: > + case R_LARCH_TLS_DESC_CALL: > + unresolved_reloc =3D false; > + break; > + > case R_LARCH_TLS_IE_PC_LO12: > case R_LARCH_TLS_IE64_PC_LO20: > case R_LARCH_TLS_IE64_PC_HI12: > @@ -3523,14 +3643,17 @@ loongarch_elf_relocate_section (bfd = *output_bfd, struct bfd_link_info *info, > unresolved_reloc =3D false; >=20 > if (h) > - relocation =3D sec_addr (got) + (h->got.offset & = (~(bfd_vma)3)); > + relocation =3D sec_addr (got) + (h->got.offset & = (~(bfd_vma)1)); > else > relocation =3D sec_addr (got) > - + (local_got_offsets[r_symndx] & (~(bfd_vma)3)); > + + (local_got_offsets[r_symndx] & (~(bfd_vma)1)); >=20 > tls_type =3D _bfd_loongarch_elf_tls_type (input_bfd, h, = r_symndx); > - /* Use both TLS_GD and TLS_IE. */ > - if ((tls_type & GOT_TLS_GD) && (tls_type & GOT_TLS_IE)) > + /* Use TLS_GD TLS_DESC and TLS_IE. */ > + if (GOT_TLS_GD_BOTH_P (tls_type) && (tls_type & GOT_TLS_IE)) > + relocation +=3D 4 * GOT_ENTRY_SIZE; > + /* Use GOT_TLS_GD_ANY_P (tls_type) and TLS_IE. */ > + else if (GOT_TLS_GD_ANY_P (tls_type) && (tls_type & = GOT_TLS_IE)) > relocation +=3D 2 * GOT_ENTRY_SIZE; >=20 > if (r_type =3D=3D R_LARCH_TLS_IE64_PC_LO20 > @@ -4174,7 +4297,8 @@ loongarch_elf_finish_dynamic_symbol (bfd = *output_bfd, >=20 > if (h->got.offset !=3D MINUS_ONE > /* TLS got entry have been handled in elf_relocate_section. */ > - && !(loongarch_elf_hash_entry (h)->tls_type & (GOT_TLS_GD | = GOT_TLS_IE)) > + && !(loongarch_elf_hash_entry (h)->tls_type > + & (GOT_TLS_GD | GOT_TLS_IE | GOT_TLS_GDESC)) > /* Have allocated got entry but not allocated rela before. */ > && !UNDEFWEAK_NO_DYNAMIC_RELOC (info, h)) > { > --=20 > 2.43.0 >=20 >=20