public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Palmer Dabbelt <palmer@dabbelt.com>
To: chigot@adacore.com
Cc: binutils@sourceware.org
Subject: Re: Risc - V failures inside ld-undefined
Date: Mon, 03 Oct 2022 10:10:41 -0700 (PDT)	[thread overview]
Message-ID: <mhng-2152513e-ffd9-46f7-bdcd-e3adb98690e9@palmer-ri-x1c9> (raw)
In-Reply-To: <CAJ307EjHyWU7AgHPTtFH2ySjT3-fph52A5Lwmk2vmnKiBd3YTQ@mail.gmail.com>

On Mon, 03 Oct 2022 05:09:42 PDT (-0700), chigot@adacore.com wrote:
> On Fri, Sep 30, 2022 at 11:40 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>>
>> On Fri, 30 Sep 2022 05:29:36 PDT (-0700), binutils@sourceware.org wrote:
>> > Hi all,
>> >
>> > I've been running the ld testsuite for Risc-V targets using a cross
>> > compiler and I'm seeing some failures in "undefined".
>> > PASS: undefined
>> > FAIL: undefined function
>> > FAIL: undefined line
>> >
>> > As I'm using a custom gcc, I just want confirmation from the Risc-V
>> > maintainers that these two tests are either failing or being skipped
>> > on their side.
>>
>> They're not failing for me in a local run, unless I'm somehow
>> misunderstanding the test results.  My "make check-ld" shows
>>
>>                 === ld Summary ===
>>
>> # of expected passes            797
>> # of expected failures          8
>> # of untested testcases         26
>> # of unsupported tests          175
>> ./ld-new 2.39.50.20220930
>
> I have less untested testcases but less tests overall.
>
>                 === ld Summary ===
>
> # of expected passes            493
> # of unexpected failures        5       (3 expected + 2 with undefined)
> # of unexpected successes       9
> # of expected failures          7
> # of untested testcases         7
> # of unsupported tests          193
>
> You can probably check through ld.sum what the status of "undefined",
> "undefined function" and "undefined line".

Yep, untested on my end

Running /home/palmer/life/binutils-gdb/ld/testsuite/ld-undefined/undefined.exp ...
UNTESTED: undefined
UNTESTED: undefined function
UNTESTED: undefined line
PASS: undefined symbols in shared lib (dyn sym)
PASS: undefined symbols in shared lib (dyn reloc)

>> > I've a patch fixing the testsuite for cross-compiler which might
>> > explain why it has always been skipped until now and why this error
>> > has been masked.
>>
>> OK, probably best to send the patch, then?  There's certainly been cases
>> before where we've missed failures because they're being skipped due to
>> environment issues, if it's possible to get the test running for more
>> people then it'll be more likely to get fixed -- at least with it
>> showing up in the tests as an unexpected failure it'll be harder to
>> forget about.
>
> Yes, this is done (see "[PATCH] ld/testsuite: consistently add
> board_ldflags when linking with GCC").
> It aims to resolve some incoherency between "check_compiler_available"
> and how the compiler is later used.

Thanks, I replied over there.

>> > Note that I've investigated a bit and it seems to be related to
>> > R_RISCV_ALIGN and the associated linker relaxation.
>> > Instead of showing
>> >   | ld: tmpdir/undefined.o: in function `function':
>> >   | .../undefined.c:9: undefined reference to `this_function_is_not_defined'
>> > the error is:
>> >   | riscv64-elf-ld: tmpdir/undefined.o: in function `.Ltext0':
>> >   | undefined.c:(.text+0x0): undefined reference to
>> > `this_function_is_not_defined'
>> >
>> > The idea is that "function" has an alignment of 2 and thus gas will
>> > add NOP instructions and a R_RISCV_ALIGN relocation before it (within
>> > tc-riscv.c:riscv_frag_align_code).
>> >   | $ riscv64-elf-objdump -D tmpdir/undefined.o
>> >   | 0000000000000000 <.Ltext0>:
>> >   |   0:   0001                    nop
>> >   |
>> >   | 0000000000000002 <function>:
>> >   |   2:   00000317                auipc   t1,0x0
>> >   |   6:   00030067                jr      t1 # 2 <function>
>> >
>> > When the linker will relax these NOPs (in
>> > elfnn-riscv.c:riscv_relax_delete_bytes), it will correctly change the
>> > relocation offset but not the DWARF information. Thus, when the error
>> > is trying to get the nearest line using _bfd_dwarf2_find_nearest_line,
>> > it ends up using a modified offset with non-modified DWARF information
>> > and targets the wrong symbol.
>> > I guess riscv_relax_delete_bytes can be improved to adjust the DWARF
>> > part too. But I'm not familiar with this part of bfd and I'm not sure
>> > it's even possible.
>> > Otherwise, I guess XFAIL these tests should be fine as they are
>> > similar to other XFAILs related to DWARF-2 (on arm-elf). Even if not
>> > having a "function" being shown in the error might be problematic.
>>
>> That seems like a plausible issue, and it's certainly not surprising
>> that there's a bug in debug info handling related to relaxation -- both
>> because we've had those before, and because it's the combination of some
>> complex and hard to test topics.  Probably best to just start with
>> getting this running for everyone so we can chew on it a bit, if it is a
>> XFAIL then we should at least have some sort of error here rather than
>> just producing bad binaries.
>
> IIUC, the binaries are fined at the end. At least, I've quickly
> checked the DWARF infos and they don't look wrong to me.
> The problem is more that when we are raising the issue for the
> "undefined" symbol we are in the middle of the relocation phase. The
> DWARF infos are then not yet corrected, resulting in the error message
> being wrong.

OK, that's a lot less scary.

>> > Anyway, the fact that I'm using a custom gcc shouldn't matter here but
>> > I want to confirm that first.
>>
>> That also seems reasonable, but there's quite a bit of coupling between
>> the compiler, assembler, and linker when it comes to DWARF on RISC-V as
>> we don't support all the flavors of DWARF-relate directives.  We should
>> have errors for all those, but it's entirely possible something has been
>> missed.  That's even more likely if your custom compiler that's doing
>> something different than we normally test against, but IMO that's still
>> a bug that we should fix (at least with an error message).
>
> I shouldn't assume things without checking.
> This seems indeed related to a change in our gcc. We are adding a
> ".align 2" before "function" which results in the relaxation part
> being triggered. But as you said and I agree with that, there is still
> a bug behind it.

You can probably trigger that with a regular GCC, the non-C targets all 
align functions to 4 bytes (as the non-C ISAs don't allow 2-byte aligned 
instructions).

> I'll see what I can do. Either add a warning/error or try to fix it.

Thanks.  Sounds like this one would be simple to produce a directed 
assembly test for, so if you don't want to sort that out then just 
sending along a patch that adds the failing test case would be great.  
That makes these way easier to reproduce, as rather than relying on a 
specific flavor of toolchain we can just sort out the bug directly.

      reply	other threads:[~2022-10-03 17:10 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-30 12:29 Clément Chigot
2022-09-30 21:40 ` Palmer Dabbelt
2022-10-03 12:09   ` Clément Chigot
2022-10-03 17:10     ` Palmer Dabbelt [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mhng-2152513e-ffd9-46f7-bdcd-e3adb98690e9@palmer-ri-x1c9 \
    --to=palmer@dabbelt.com \
    --cc=binutils@sourceware.org \
    --cc=chigot@adacore.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).