From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x22c.google.com (mail-oi1-x22c.google.com [IPv6:2607:f8b0:4864:20::22c]) by sourceware.org (Postfix) with ESMTPS id 2C2EB3852203 for ; Fri, 18 Nov 2022 08:18:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2C2EB3852203 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=adacore.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=adacore.com Received: by mail-oi1-x22c.google.com with SMTP id n205so4628959oib.1 for ; Fri, 18 Nov 2022 00:18:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=adacore.com; s=google; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=H9ymhw1KSHkTPxOOw+zRfR8Ee6N8h60XfcV6kUId188=; b=eGitumeteTSEjXvmHMT7uSktpb99wd/G30kX70XdSAAeQfsJykvbBe837dcfrSj8WO SzuuytKCAfjWmKJ62UbytnO7U3ErGC55pcBxbq8UDyAZPBKZ5CfEwfEDiM3BNx3xhuvG 0P2LE22G3sU8cL34Oczg5vky4KsNKMrIJtg/X/9EOepESTRRsgspIyf/xeTBLhqONCdI MTGKokfbZkPrblGs/0Tjw2OVD1uwkHMojZM38oh8fl0PTqSuZn57SZmBiXchBfY8H4r/ 7gxu0FiP8DMotne1yDcvDC3nw05h/y7507BxR2yh/DX7t98b1AMXjzNOPEUlOJb2T7JU DOnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H9ymhw1KSHkTPxOOw+zRfR8Ee6N8h60XfcV6kUId188=; b=3smB5VEnmAajCEd7d64tbsHu3cqbMncXljTG6Cqt8qQA1jGQyQCmmqJ12Afsgb217W T64YUMiqSpDsrFUeqpg3VpIjjRpw7IaHapU11aCf6MM5FZ5DmmJNpt+cgaYhdbdZGXOg U2RFhgAi9/XP/5d8odL1KhPa2FGzt/fo20u+//rKQzDUjwsaBtOg69R6niFnniLqrzUA Y8K3kzHMmstbCm3UxwIOERf39k7zmxpjwY8GrQtZEI/OXfVQ5f+KDY/Lnz2vfctwlGPf vvnupDgdcQcx5+lECarmsrVlgHGIlT76qvPLjDn6SrbJvoIirJuiOiwRxFmSuUJKsrRk mTNw== X-Gm-Message-State: ANoB5plrvoTD5DTkAk4qE3pGwlMv5fpT5P6MMSPykkQFzQ1KdMiowkgw d26IXtXOvFpEP/ayrJKcVMbe/oWF7RBnvn/xe0r0Hv28qgeSVQ== X-Google-Smtp-Source: AA0mqf4naFlR2Z6+MUQJdmRpr0Y3JlDsRQ/3CxI7ZJnKENaGJ1QUhPkp+oS46u+S9PunL5bcaH3flLr/UJeKvaD/+y4= X-Received: by 2002:a05:6808:5c4:b0:35a:4aed:5904 with SMTP id d4-20020a05680805c400b0035a4aed5904mr5716479oij.198.1668759490494; Fri, 18 Nov 2022 00:18:10 -0800 (PST) MIME-Version: 1.0 References: <20221006114628.304185-1-chigot@adacore.com> In-Reply-To: From: =?UTF-8?Q?Cl=C3=A9ment_Chigot?= Date: Fri, 18 Nov 2022 09:17:59 +0100 Message-ID: Subject: Re: [PATCH] RISC-V: fix linker message when relaxation deletes bytes To: Nelson Chu Cc: binutils@sourceware.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Nov 18, 2022 at 2:43 AM Nelson Chu wrote: > > On Fri, Oct 14, 2022 at 3:24 PM Cl=C3=A9ment Chigot = wrote: > > > > On Fri, Oct 14, 2022 at 7:08 AM Nelson Chu wrote: > > > > > > Hi, > > > > > > On Thu, Oct 13, 2022 at 8:06 PM Cl=C3=A9ment Chigot via Binutils > > > wrote: > > > > > > > > Gentle ping > > > > > > > > Thanks, > > > > Cl=C3=A9ment > > > > > > > > On Thu, Oct 6, 2022 at 1:46 PM Cl=C3=A9ment Chigot wrote: > > > > > > > > > > The section relaxation can delete some bytes resulting in the sym= bols' > > > > > value being modified. > > > > > As the linker messages retrieve a symbol information using the > > > > > outsymbols field of abfd, it must be updated as well. > > > > > > Interesting, I never noticed this. If this is necessary, then other > > > targets should also update the output symbols when relaxing (nds32 an= d > > > sh), but it seems like they don't do that as well. > > > > Yeah, I agree there are probably other targets having a similar issue. > > I don't know at what extend outsymbols is being used for. But the fact > > that is being filled with "untouched" symbols in object files looks > > weird to me. Maybe there is something to look into this direction. > > For now, this patch aims just to fix ld-undefined on Risc-V targets. > > > > > > > bfd/ChangeLog: > > > > > > > > > > * elfnn-riscv.c (riscv_relax_delete_bytes): Update > > > > > abfd->outsymbols. > > > > > > > > > > ld/ChangeLog: > > > > > > > > > > * testsuite/ld-riscv-elf/ld-riscv-elf.exp: New test. > > > > > * testsuite/ld-riscv-elf/undefined-align.d: New test. > > > > > * testsuite/ld-riscv-elf/undefined-align.s: New test. > > > > > --- > > > > > bfd/elfnn-riscv.c | 23 +++++++++++++++= ++++++ > > > > > ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp | 1 + > > > > > ld/testsuite/ld-riscv-elf/undefined-align.d | 5 +++++ > > > > > ld/testsuite/ld-riscv-elf/undefined-align.s | 10 +++++++++ > > > > > 4 files changed, 39 insertions(+) > > > > > create mode 100644 ld/testsuite/ld-riscv-elf/undefined-align.d > > > > > create mode 100644 ld/testsuite/ld-riscv-elf/undefined-align.s > > > > > > > > > > diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c > > > > > index 3d2ddf4e651..7a6b66dcd8a 100644 > > > > > --- a/bfd/elfnn-riscv.c > > > > > +++ b/bfd/elfnn-riscv.c > > > > > @@ -4060,6 +4060,7 @@ riscv_relax_delete_bytes (bfd *abfd, > > > > > unsigned int sec_shndx =3D _bfd_elf_section_from_bfd_section (= abfd, sec); > > > > > struct bfd_elf_section_data *data =3D elf_section_data (sec); > > > > > bfd_byte *contents =3D data->this_hdr.contents; > > > > > + asymbol **outsyms =3D bfd_get_outsymbols (abfd); > > > > > > > > > > /* Actually delete the bytes. */ > > > > > sec->size -=3D count; > > > > > @@ -4158,6 +4159,28 @@ riscv_relax_delete_bytes (bfd *abfd, > > > > > } > > > > > } > > > > > > > > > > + /* As linker messages are getting symbols through outsymbols f= ield of abfd, > > > > > + it must be adjusted too. */ > > > > > + if (outsyms =3D=3D NULL) > > > > > + { > > > > > + if (!bfd_generic_link_read_symbols (abfd)) > > > > > + link_info->callbacks->einfo (_("%F%P: %pB: could not read= symbols: %E\n"), abfd); > > > > > + outsyms =3D bfd_get_outsymbols (abfd); > > > > > + } > > > > > + > > > > > + for (i =3D 0; i < bfd_get_symcount (abfd); i++) > > > > > + { > > > > > + asymbol *sym =3D outsyms[i]; > > > > > + > > > > > + if (sym->section !=3D sec) > > > > > + continue; > > > > > + > > > > > + /* If the symbol is in the range of memory we just moved, = we > > > > > + have to adjust its value. */ > > > > > + if (sym->value > addr && sym->value <=3D toaddr) > > > > > + sym->value -=3D count; > > > > > + } > > > > > + > > > > > return true; > > > > > } > > > > > > > > > > diff --git a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp b/ld/test= suite/ld-riscv-elf/ld-riscv-elf.exp > > > > > index df89e0ee68b..86f1d05bc08 100644 > > > > > --- a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp > > > > > +++ b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp > > > > > @@ -176,6 +176,7 @@ if [istarget "riscv*-*-*"] { > > > > > [list "Weak reference 64" "-T weakref.ld -m[riscv_choose_= lp64_emul]" "" \ > > > > > "-march=3Drv64i -mabi=3Dlp64" {weakref64.s} \ > > > > > {{objdump -d weakref64.d}} "weakref64"]] > > > > > + run_dump_test "undefined-align" > > > > > > > > > > # The following tests require shared library support. > > > > > if ![check_shared_lib_support] { > > > > > diff --git a/ld/testsuite/ld-riscv-elf/undefined-align.d b/ld/tes= tsuite/ld-riscv-elf/undefined-align.d > > > > > new file mode 100644 > > > > > index 00000000000..c8cbb84ac7c > > > > > --- /dev/null > > > > > +++ b/ld/testsuite/ld-riscv-elf/undefined-align.d > > > > > @@ -0,0 +1,5 @@ > > > > > +#name: undefined with alignment > > > > > +#source: undefined-align.s > > > > > +#as: > > > > > +#ld: --relax > > > > > +#error: \A[^\n]*\.o: in function `_start':\n\(\.text\+0x0\): und= efined reference to `foo'\Z > > > > > diff --git a/ld/testsuite/ld-riscv-elf/undefined-align.s b/ld/tes= tsuite/ld-riscv-elf/undefined-align.s > > > > > new file mode 100644 > > > > > index 00000000000..577557c663a > > > > > --- /dev/null > > > > > +++ b/ld/testsuite/ld-riscv-elf/undefined-align.s > > > > > @@ -0,0 +1,10 @@ > > > > > +# Make sure that the linker messages take into account the modif= ication > > > > > +# of a symbol's value made during the section relaxation. > > > > > +# Here, "_start" will have an offset made by ".align 4" which wi= ll be > > > > > +# removed during the relaxation of R_RISCV_ALIGN. > > > > > + .text > > > > > + .align 4 > > > > > + .globl _start > > > > > + .type _start, @function > > > > > +_start: > > > > > + call foo > > > > > > Tested with undefined-align.s, I get the following error message > > > before applying this patch, > > > > > > % riscv64-unknown-elf-ld tmp.o > > > > > > riscv64-unknown-elf-ld: tmp.o:(.text+0x0): undefined reference to `fo= o' > > > > > > And get the error after applying this patch, > > > > > > % riscv64-unknown-elf-ld tmp.o > > > > > > riscv64-unknown-elf-ld: tmp.o: in function `_start': > > > > > > (.text+0x0): undefined reference to `foo' > > > > > > It reports more detail, but I cannot see if the removed offset may > > > cause the wrong message? > > > > I'm not sure I understand what you mean. > > As in the object file, _start starts at 0xc, you mean that the error > > message should be something like: > > | riscv64-unknown-elf-ld: tmp.o: in function `_start': > > | (.text+0xc): undefined reference to `foo' > > ? > > The error message doesn't seem wrong without your patch, just don't > report "in function `_start':". The comment in your testcase said the > offset should be removed after handling alignment, but I get the same > zero offset with or without your patch. It would be great if you can > provide other cases to show how the offset reported by error message > will be wrong. AFAICT, the offset will never be wrong because it's correctly updated during the relax in riscv_relax_delete_bytes: | data->relocs[i].r_offset -=3D count; However, the outsymbols, which are used inside ldmisc.c to find the function, aren't. Maybe my message is misleading, would something like this be clearer ? | # Make sure that the linker messages take into account the modification | # of a symbol's value made during the section relaxation when looking | # for the function name. | # Here, "_start" will have an offset made by ".align 4" which will be | # removed during the relaxation of R_RISCV_ALIGN. Ensure that | # that the error has both the new offset (0x0 instead of 0xc) and it's | # correctly showing "in function `_start`". Thanks, Cl=C3=A9ment