public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Joseph Faulls <Joseph.Faulls@imgtec.com>
To: Palmer Dabbelt <palmer@rivosinc.com>
Cc: "nelson@rivosinc.com" <nelson@rivosinc.com>,
	"binutils@sourceware.org" <binutils@sourceware.org>
Subject: RE: [EXTERNAL] Re: [PATCH] RISC-V: Do not gp relax against an ABS symbol if it is far away.
Date: Fri, 28 Jul 2023 15:52:41 +0000	[thread overview]
Message-ID: <LO4P265MB591421EBA3B59627721B032B8006A@LO4P265MB5914.GBRP265.PROD.OUTLOOK.COM> (raw)
In-Reply-To: <mhng-01e99499-c5c5-4184-a9d9-e110964d26fb@palmer-ri-x1c9>

> I think Nelson's point about cross-section GP relaxation is probably the root cause here, we've had some other bugs like that crop up recently.  
Assuming that's the case then SHN_ABS is just a red herring: maybe it makes triggering the bug easier, but we can probably come up with a similar failure in other ways.

Ah, I think you're right. My thought process was that any data not in the ABS section would move position if the sections before it shrunk / changed size, so that the offset to the gp would remain the same as instructions were relaxed. But you could place sections at particular addresses, so this bug would reveal itself just the same. I was able to hit the truncation issue with a very contrived test case.

> Can you post a reduced test case with a custom linker script that triggers the problem?  That should help us a bit more concrete.

Sure, here's an example I managed to create. I'm sure there's a way to reduce it further, without the weird C things to prevent optimising the buffer away. (etiquette question: should I be attaching these or just pasting them?)

Command: riscv64-unknown-elf-gcc test.c -c -o test.o; riscv64-unknown-elf-ld -melf64lriscv test.o -o test.elf -Ttest.ld

test.c:
extern char __bss_end[];
static char data[0x800];

void _start() {
  *__bss_end = 1;
}

char foo(int i) {
  data[i+1] = 1;
  return data[i];
}

test.ld:
ENTRY(_start)
MEMORY
{
        ram (!rx) : ORIGIN = 0x800000000, LENGTH = 0x1000
}
SECTIONS {
        .text : {
                *(.text*)
        } >ram

        .data : {
                *(.data*)
        } >ram

        __global_pointer$ = .;

        .bss : {
                *(.bss*)
        } >ram
        PROVIDE(__bss_end = .);
}

-----Original Message-----
From: Palmer Dabbelt <palmer@rivosinc.com> 
Sent: Friday, July 28, 2023 4:03 PM
To: Joseph Faulls <Joseph.Faulls@imgtec.com>
Cc: nelson@rivosinc.com; binutils@sourceware.org
Subject: RE: [EXTERNAL] Re: [PATCH] RISC-V: Do not gp relax against an ABS symbol if it is far away.

*** CAUTION: This email originates from a source not known to Imagination Technologies. Think before you click a link or open an attachment ***

On Fri, 28 Jul 2023 02:18:49 PDT (-0700), Joseph.Faulls@imgtec.com wrote:
> Thanks for the replies!
> 
> 
>   *   Do you have a reduced test that shows we still have a chance to get the relax error by using the default linker script?
> Yes, you can re-produce this with the default linkerscript, but it’s a little harder. If you take the same assembly files as the test cases in this patch (gp-relax-abs-sym.s and gp-relax-abs.s), assemble and link an elf and see the value __global_pointer$. Then adjust the absolute symbol to be this + 0x800. Re-assemble + link, and you should see a truncation.
> 
> I came across this issue when using a custom linker script that provided some symbols (which are absolute) and linking a program with quite a few relaxations.
> 
>   *   I'm not quite sure what you mean by GP moving -- GP's the register, so it doesn't have a value until runtime.  GP is meant to equal __global_pointer$, but that's usually a data symbol and thus doesn't move as a result of relaxation.
> Perhaps I’m using the terms incorrectly, but if the __global_pointer is set in the linkerscript at the end of some section (I think this is most cases), then this value will change if the size of any section before it changes, as would be in the case of relaxation. Could we not describe this as the GP moving?

I wouldn't, but I think it's just different meanings of the words here
-- it shouldn't really matter for the purposes of fixing the bug, I think I understand what you're saying ;)

I think Nelson's point about cross-section GP relaxation is probably the root cause here, we've had some other bugs like that crop up recently.  
Assuming that's the case then SHN_ABS is just a red herring: maybe it makes triggering the bug easier, but we can probably come up with a similar failure in other ways.

Assuming that's the case, both SHN_ABS and /2 are just a hueristic here.  
We've got some tricky cases floating around for the cross-section stuff where we don't have a great answer yet, I wouldn't be surprised if this is another case where we just can't relax safely.

Can you post a reduced test case with a custom linker script that triggers the problem?  That should help us a bit more concrete.

> From: Nelson Chu <nelson@rivosinc.com>
> Sent: Friday, July 28, 2023 2:16 AM
> To: Palmer Dabbelt <palmer@rivosinc.com>
> Cc: Joseph Faulls <Joseph.Faulls@imgtec.com>; binutils@sourceware.org
> Subject: [EXTERNAL] Re: [PATCH] RISC-V: Do not gp relax against an ABS symbol if it is far away.
> 
> Maybe, if the shrink code is more than the data segment alignment in the default linker script (0x1000), then probably still have a chance to get the truncated gp by using default linker script...  I will suggest if that is so, then we just have a linker option to disallow/allow the gp relax for any abs symbol, that should be more intuitive.
> 
> On Fri, Jul 28, 2023 at 9:02 AM Nelson Chu <nelson@rivosinc.com<mailto:nelson@rivosinc.com>> wrote:
> Relaxation should shrink the text, but since the default linker script has huge data segment alignment, data after the segment alignment shouldn't be affected and shrinked, so ideally  it should be safe by using default linker script.
> 
> Do you have a reduced test that shows we still have a chance to get the relax error by using the default linker script?
> 
> On Fri, Jul 28, 2023 at 5:08 AM Palmer Dabbelt <palmer@rivosinc.com<mailto:palmer@rivosinc.com>> wrote:
> On Thu, 27 Jul 2023 09:48:24 PDT (-0700), Joseph.Faulls@imgtec.com<mailto:Joseph.Faulls@imgtec.com> wrote:
>> Relaxations can cause the gp to move after it has been decided to gp 
>> relax. Against an absolute symbol, the distance may change such that 
>> the offset can no longer fit in the 12-bit immediate field.
>>
>> bfd/
>>         * elfnn-riscv.c (_bfd_riscv_relax_pc) Do not gp relax against and ABS
>>           symbol if it is far away.
>> ld/
>>         * ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp: Updated.
>>         * ld/testsuite/ld-riscv-elf/gp-relax-abs*: New testcases.
>> ---
>> bfd/elfnn-riscv.c                            | 12 ++++++++++++
>> ld/testsuite/ld-riscv-elf/gp-relax-abs-sym.s |  3 +++
>> ld/testsuite/ld-riscv-elf/gp-relax-abs.d     | 14 ++++++++++++++
>> ld/testsuite/ld-riscv-elf/gp-relax-abs.s     |  5 +++++
>> ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp   |  1 +
>> 5 files changed, 35 insertions(+)
>> create mode 100644 ld/testsuite/ld-riscv-elf/gp-relax-abs-sym.s
>> create mode 100644 ld/testsuite/ld-riscv-elf/gp-relax-abs.d
>> create mode 100644 ld/testsuite/ld-riscv-elf/gp-relax-abs.s
>>
>> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c index 
>> 09aa7be225e..79e29e8b272 100644
>> --- a/bfd/elfnn-riscv.c
>> +++ b/bfd/elfnn-riscv.c
>> @@ -4885,6 +4885,18 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>>               max_alignment = _bfd_riscv_get_max_alignment (sec, gp);
>>               htab->max_alignment_for_gp = max_alignment;
>>             }
>> +         /* If the symbol is in the abs section, relaxation could cause the gp
>> +          * to move such that the gp relocation is no longer possible.
> 
> I'm not quite sure what you mean by GP moving -- GP's the register, so 
> it doesn't have a value until runtime.  GP is meant to equal 
> __global_pointer$, but that's usually a data symbol and thus doesn't 
> move as a result of relaxation.
> 
> We do have some other issues related to SHN_ABS (for example position 
> independent vs PC relative stuff), so it's possible there's some other 
> bug here?
> 
>> +          * Conservatively half the allowed distance, as it cannot be that
>> +          * gp moves more than this, i.e. more than half the instructions be
>> +          * deleted due to relaxation.  Do this by adjusting reserve_size.  */
>> +         if (sym_sec->output_section == bfd_abs_section_ptr)
>> +           {
>> +             if (symval >= gp)
>> +               reserve_size += (symval - gp) / 2;
>> +             else
>> +               reserve_size += (gp - symval) / 2;
> 
> We can have more than a factor of two from relaxation, for example 
> from alignment.  If a hueristic is all we can do then I guess we'll 
> have to live with it, but I think I'd want to understand the movement
> 
> An extreme case is that all the instruction can be relaxed without any huge alignments between text and data, so I guess that's probably why the half of distance comes from.  If the default linker script may cause the problem, then we will need to know more details and try to resolve it.  But if this only occurs without the default linker script, then I will suggest users disable that relax patterns by the .norelax directive, or related options of compiler/assembler/linker.  Or maybe we can have an option for users to reject gp relax when meeting abs symbols, if without using default linker script.  However, please give a reduced case by using default linker script that will be broken in detailed, thanks a lot.
> 
> Nelson
> 
>> +           }
>>         }
>>      }
>>
>> diff --git a/ld/testsuite/ld-riscv-elf/gp-relax-abs-sym.s 
>> b/ld/testsuite/ld-riscv-elf/gp-relax-abs-sym.s
>> new file mode 100644
>> index 00000000000..a018bb3a50a
>> --- /dev/null
>> +++ b/ld/testsuite/ld-riscv-elf/gp-relax-abs-sym.s
>> @@ -0,0 +1,3 @@
>> +.section .data
>> +.globl sym
>> +.set sym,0x10804
>> diff --git a/ld/testsuite/ld-riscv-elf/gp-relax-abs.d 
>> b/ld/testsuite/ld-riscv-elf/gp-relax-abs.d
>> new file mode 100644
>> index 00000000000..2c7ab3a2579
>> --- /dev/null
>> +++ b/ld/testsuite/ld-riscv-elf/gp-relax-abs.d
>> @@ -0,0 +1,14 @@
>> +#source: gp-relax-abs.s
>> +#source: gp-relax-abs-sym.s
>> +#as: -march=rv64ic -mabi=lp64
>> +#ld: -Tcode-model-01.ld -melf64lriscv
>> +#objdump: -d
>> +
>> +.*:[   ]+file format .*
>> +
>> +Disassembly of section \.text:
>> +
>> +[0-9a-f]+ <_start>:
>> +.*auipc.*
>> +.*lw.*# [0-9a-f]* <sym>
>> +#pass
>> diff --git a/ld/testsuite/ld-riscv-elf/gp-relax-abs.s 
>> b/ld/testsuite/ld-riscv-elf/gp-relax-abs.s
>> new file mode 100644
>> index 00000000000..db2103bafd1
>> --- /dev/null
>> +++ b/ld/testsuite/ld-riscv-elf/gp-relax-abs.s
>> @@ -0,0 +1,5 @@
>> +.text
>> +.global _start
>> +_start:
>> +    auipc t0, %pcrel_hi(sym)
>> +    lw t0, %pcrel_lo(_start)(t0)
>> diff --git a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp 
>> b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
>> index 947a266ba72..a53a2758991 100644
>> --- a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
>> +++ b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
>> @@ -172,6 +172,7 @@ if [istarget "riscv*-*-*"] {
>>      run_dump_test "attr-merge-priv-spec-failed-06"
>>      run_dump_test "attr-phdr"
>>      run_dump_test "relax-max-align-gp"
>> +    run_dump_test "gp-relax-abs"
>>      run_dump_test "uleb128"
>>      run_ld_link_tests [list \
>>         [list "Weak reference 32" "-T weakref.ld 
>> -m[riscv_choose_ilp32_emul]" "" \

  reply	other threads:[~2023-07-28 15:52 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-27 16:48 Joseph Faulls
2023-07-27 17:02 ` Andreas Schwab
2023-07-27 21:08 ` Palmer Dabbelt
2023-07-28  1:02   ` Nelson Chu
2023-07-28  1:15     ` Nelson Chu
2023-07-28  9:18       ` [EXTERNAL] " Joseph Faulls
2023-07-28 15:03         ` Palmer Dabbelt
2023-07-28 15:52           ` Joseph Faulls [this message]
2023-07-28 16:12             ` Palmer Dabbelt
2023-08-08  9:49 ` Maciej W. Rozycki
2023-08-08 10:38   ` [EXTERNAL] " Joseph Faulls
2023-08-08 12:04     ` Maciej W. Rozycki
2023-08-23 11:16     ` Joseph Faulls

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=LO4P265MB591421EBA3B59627721B032B8006A@LO4P265MB5914.GBRP265.PROD.OUTLOOK.COM \
    --to=joseph.faulls@imgtec.com \
    --cc=binutils@sourceware.org \
    --cc=nelson@rivosinc.com \
    --cc=palmer@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).