public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: "Clément Chigot" <chigot@adacore.com>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: binutils@sourceware.org
Subject: Re: Risc - V failures inside ld-undefined
Date: Mon, 3 Oct 2022 14:09:42 +0200	[thread overview]
Message-ID: <CAJ307EjHyWU7AgHPTtFH2ySjT3-fph52A5Lwmk2vmnKiBd3YTQ@mail.gmail.com> (raw)
In-Reply-To: <mhng-b17d9bae-597f-4e27-bd45-2be9e4338034@palmer-ri-x1c9>

On Fri, Sep 30, 2022 at 11:40 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Fri, 30 Sep 2022 05:29:36 PDT (-0700), binutils@sourceware.org wrote:
> > Hi all,
> >
> > I've been running the ld testsuite for Risc-V targets using a cross
> > compiler and I'm seeing some failures in "undefined".
> > PASS: undefined
> > FAIL: undefined function
> > FAIL: undefined line
> >
> > As I'm using a custom gcc, I just want confirmation from the Risc-V
> > maintainers that these two tests are either failing or being skipped
> > on their side.
>
> They're not failing for me in a local run, unless I'm somehow
> misunderstanding the test results.  My "make check-ld" shows
>
>                 === ld Summary ===
>
> # of expected passes            797
> # of expected failures          8
> # of untested testcases         26
> # of unsupported tests          175
> ./ld-new 2.39.50.20220930

I have less untested testcases but less tests overall.

                === ld Summary ===

# of expected passes            493
# of unexpected failures        5       (3 expected + 2 with undefined)
# of unexpected successes       9
# of expected failures          7
# of untested testcases         7
# of unsupported tests          193

You can probably check through ld.sum what the status of "undefined",
"undefined function" and "undefined line".

> > I've a patch fixing the testsuite for cross-compiler which might
> > explain why it has always been skipped until now and why this error
> > has been masked.
>
> OK, probably best to send the patch, then?  There's certainly been cases
> before where we've missed failures because they're being skipped due to
> environment issues, if it's possible to get the test running for more
> people then it'll be more likely to get fixed -- at least with it
> showing up in the tests as an unexpected failure it'll be harder to
> forget about.

Yes, this is done (see "[PATCH] ld/testsuite: consistently add
board_ldflags when linking with GCC").
It aims to resolve some incoherency between "check_compiler_available"
and how the compiler is later used.

> > Note that I've investigated a bit and it seems to be related to
> > R_RISCV_ALIGN and the associated linker relaxation.
> > Instead of showing
> >   | ld: tmpdir/undefined.o: in function `function':
> >   | .../undefined.c:9: undefined reference to `this_function_is_not_defined'
> > the error is:
> >   | riscv64-elf-ld: tmpdir/undefined.o: in function `.Ltext0':
> >   | undefined.c:(.text+0x0): undefined reference to
> > `this_function_is_not_defined'
> >
> > The idea is that "function" has an alignment of 2 and thus gas will
> > add NOP instructions and a R_RISCV_ALIGN relocation before it (within
> > tc-riscv.c:riscv_frag_align_code).
> >   | $ riscv64-elf-objdump -D tmpdir/undefined.o
> >   | 0000000000000000 <.Ltext0>:
> >   |   0:   0001                    nop
> >   |
> >   | 0000000000000002 <function>:
> >   |   2:   00000317                auipc   t1,0x0
> >   |   6:   00030067                jr      t1 # 2 <function>
> >
> > When the linker will relax these NOPs (in
> > elfnn-riscv.c:riscv_relax_delete_bytes), it will correctly change the
> > relocation offset but not the DWARF information. Thus, when the error
> > is trying to get the nearest line using _bfd_dwarf2_find_nearest_line,
> > it ends up using a modified offset with non-modified DWARF information
> > and targets the wrong symbol.
> > I guess riscv_relax_delete_bytes can be improved to adjust the DWARF
> > part too. But I'm not familiar with this part of bfd and I'm not sure
> > it's even possible.
> > Otherwise, I guess XFAIL these tests should be fine as they are
> > similar to other XFAILs related to DWARF-2 (on arm-elf). Even if not
> > having a "function" being shown in the error might be problematic.
>
> That seems like a plausible issue, and it's certainly not surprising
> that there's a bug in debug info handling related to relaxation -- both
> because we've had those before, and because it's the combination of some
> complex and hard to test topics.  Probably best to just start with
> getting this running for everyone so we can chew on it a bit, if it is a
> XFAIL then we should at least have some sort of error here rather than
> just producing bad binaries.

IIUC, the binaries are fined at the end. At least, I've quickly
checked the DWARF infos and they don't look wrong to me.
The problem is more that when we are raising the issue for the
"undefined" symbol we are in the middle of the relocation phase. The
DWARF infos are then not yet corrected, resulting in the error message
being wrong.

> > Anyway, the fact that I'm using a custom gcc shouldn't matter here but
> > I want to confirm that first.
>
> That also seems reasonable, but there's quite a bit of coupling between
> the compiler, assembler, and linker when it comes to DWARF on RISC-V as
> we don't support all the flavors of DWARF-relate directives.  We should
> have errors for all those, but it's entirely possible something has been
> missed.  That's even more likely if your custom compiler that's doing
> something different than we normally test against, but IMO that's still
> a bug that we should fix (at least with an error message).

I shouldn't assume things without checking.
This seems indeed related to a change in our gcc. We are adding a
".align 2" before "function" which results in the relaxation part
being triggered. But as you said and I agree with that, there is still
a bug behind it.
I'll see what I can do. Either add a warning/error or try to fix it.

Thanks,
Clément

  reply	other threads:[~2022-10-03 12:09 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-30 12:29 Clément Chigot
2022-09-30 21:40 ` Palmer Dabbelt
2022-10-03 12:09   ` Clément Chigot [this message]
2022-10-03 17:10     ` Palmer Dabbelt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJ307EjHyWU7AgHPTtFH2ySjT3-fph52A5Lwmk2vmnKiBd3YTQ@mail.gmail.com \
    --to=chigot@adacore.com \
    --cc=binutils@sourceware.org \
    --cc=palmer@dabbelt.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).