public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* RISC-V sibcall optimization with save-restore
@ 2019-03-20 12:25 Paulo Matos
  2019-03-21  1:20 ` Jim Wilson
  0 siblings, 1 reply; 2+ messages in thread
From: Paulo Matos @ 2019-03-20 12:25 UTC (permalink / raw)
  To: gcc

Hi,

I am working on trying to get RISC-V 32 emitting sibcalls even in the
present of `-msave-restore`, for a client concerned with generated code
size.

Take a look at what current gcc generates for:
  int __attribute__ ((noinline))
  bar ()
  {
    return 3;
  }

  int  __attribute__ ((noinline))
  foo ()
  {
    return bar ();
  }

  int
  main ()
  {
    return foo ();
  }

$ install-riscv/bin/riscv32-unknown-elf-gcc -msave-restore -Os -o- -S test.c
        .file   "test.c"
        .option nopic
        .attribute arch, "rv32i2p0_m2p0_c2p0"
        .attribute unaligned_access, 0
        .attribute stack_align, 16
        .text
        .align  2
        .globl  bar
        .type   bar, @function
bar:
        li      a0,3
        ret
        .size   bar, .-bar
        .align  2
        .globl  foo
        .type   foo, @function
foo:
        call    t0,__riscv_save_0
        call    bar
        tail    __riscv_restore_0
        .size   foo, .-foo
        .section        .text.startup,"ax",@progbits
        .align  2
        .globl  main
        .type   main, @function
main:
        call    t0,__riscv_save_0
        call    foo
        tail    __riscv_restore_0
        .size   main, .-main
        .ident  "GCC: (GNU) 9.0.1 20190315 (experimental)"

This could clearly be:

bar:
        li      a0,3
        ret
        .size   bar, .-bar
        .align  2
        .globl  foo
        .type   foo, @function
foo:
        tail    bar
        .size   foo, .-foo
        .section        .text.startup,"ax",@progbits
        .align  2
        .globl  main
        .type   main, @function
main:
        tail    foo
        .size   main, .-main


which is what I obtain if I remove the lines 4615-4616 from riscv.c:
4610 static bool
4611 riscv_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED,
4612                                tree exp ATTRIBUTE_UNUSED)
4613 {
4614   /* Don't use sibcalls when use save-restore routine.  */
4615   if (TARGET_SAVE_RESTORE)
4616     return false;

This causes, of course, test save-restore-1.c to break because in the
case of the test, we need to save/restore s0 and therefore require the
prologue and epilogue to do its job properly.

AFAICT in cases where s0-s11 do not need to be saved, we could
potentially sibcalls in the presence of save-restore.

The problem I am facing while attempting to implement this is that the
decision to do a sibcall is done at expand time, and I only get register
liveness information much later which means I can't decide to sibcall or
not depending on which registers need to be saved.

I attempted then to always enable sibcall (by removing lines 4615-4616
above) and fixup the epilogue in `riscv_expand_epilogue' to enable a
libcall generation even if sibcall was requested.

The start of `riscv_expand_epilogue' would look like:

void
riscv_expand_epilogue (int style)
{
  struct riscv_frame_info *frame = &cfun->machine->frame;
  unsigned mask = frame->mask;
  HOST_WIDE_INT step1 = frame->total_size;
  HOST_WIDE_INT step2 = 0;
  bool use_restore_libcall = riscv_use_save_libcall (frame);
  rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
  rtx insn;

  if (use_restore_libcall)
    style = NORMAL_RETURN;
...


So, use_restore_libcall does not depend on style but only if we should
use save_libcall and then if use_restore_libcall then we move the style
to NORMAL_RETURN.

I thought I was on the right path until I noticed that the CFG is messed
up because of assumptions related to emission of sibcall instead of a
libcall until the epilogue is expanded. During the pro_and_epilogue pass
I get an emergency dump and a segfault:
gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: in
basic block 2:
gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: flow
control insn inside a basic block
(jump_insn 24 23 6 2 (parallel [
            (return)
            (use (reg:SI 1 ra))
            (const_int 0 [0])
        ]) "gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c":11:1 -1
     (nil))
during RTL pass: pro_and_epilogue
dump file: save-restore-1.c.285r.pro_and_epilogue
gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: internal
compiler error: in rtl_verify_bb_insns, at cfgrtl.c:2705
0xf86661 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
        gcc/gcc/rtl-error.c:108
0x9f0e7a rtl_verify_bb_insns
        gcc/gcc/cfgrtl.c:2705
0x9f117e rtl_verify_flow_info_1
        gcc/gcc/cfgrtl.c:2791
0x9f19b2 rtl_verify_flow_info
        gcc/gcc/cfgrtl.c:3034
0x9d6e7c verify_flow_info()
        gcc/gcc/cfghooks.c:263
0xed72e2 execute_function_todo
        gcc/gcc/passes.c:1989
0xed626f do_per_function
        gcc/gcc/passes.c:1638
0xed7467 execute_todo
        gcc/gcc/passes.c:2031
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


Is there a way around this issue and allow the libcall to be emitted,
even if the sibcall was enabled? Or emit a sibcall even if it had been
disabled? Since the problem stems that at sibcall_ok_for_function_p I
don't have enough information to know what to do, is there a way to
decide this later on?

Thanks,

-- 
Paulo Matos

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: RISC-V sibcall optimization with save-restore
  2019-03-20 12:25 RISC-V sibcall optimization with save-restore Paulo Matos
@ 2019-03-21  1:20 ` Jim Wilson
  0 siblings, 0 replies; 2+ messages in thread
From: Jim Wilson @ 2019-03-21  1:20 UTC (permalink / raw)
  To: Paulo Matos, gcc

On 3/20/19 5:25 AM, Paulo Matos wrote:
> I am working on trying to get RISC-V 32 emitting sibcalls even in the
> present of `-msave-restore`, for a client concerned with generated code
> size.

This won't work unless you define a new set of restore functions.  The 
current ones restore the return address from the stack and return, which 
is wrong if you want to do a sibcall.  This is why we tail call (jump 
to) the restore functions, because the actual function return is in the 
restore functions.  You will need a new set of restore functions that 
restore regs without restoring the ra.  You then probably also then need 
other cascading changes to make this work.

The new set of restore functions will then increase code size a bit 
offsetting the gain you get from using them.  You would have to have 
enough sibling calls that can use -msave-restore to make this 
worthwhile.  It isn't clear if this would be a win or not.

> I thought I was on the right path until I noticed that the CFG is messed
> up because of assumptions related to emission of sibcall instead of a
> libcall until the epilogue is expanded. During the pro_and_epilogue pass
> I get an emergency dump and a segfault:
> gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: in
> basic block 2:
> gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c:11:1: error: flow
> control insn inside a basic block
> (jump_insn 24 23 6 2 (parallel [
>              (return)
>              (use (reg:SI 1 ra))
>              (const_int 0 [0])
>          ]) "gcc/gcc/testsuite/gcc.target/riscv/save-restore-1.c":11:1 -1
>       (nil))

If you look at the epilogue code, you will see that it emits a regular 
instruction which hides the call to the restore routine, and then it 
emits a special fake return insn that doesn't do anything.  You can just 
stop emitting the special fake return insn in this case.  This of course 
assumes that you have a new set of restore functions that actually 
return the caller, instead of the caller's parent.

One of the issues with -msave-restore is that the limited offset ranges 
of calls and branches means that if you don't have a tiny program then 
each save/restore call/jump is probably an auipc/lui plus the call/tail, 
which limits the code size reduction you get from using it.  If you can 
control where the -msave-restore routines are placed in memory, then 
putting them near address 0, or near the global pointer address, will 
allow linker relaxation to optimize these calls/jumps to a single 
instruction.  This will probably help more than trying to get it to work 
with sibling calls.

If you can modify the hardware, you might try adding load/store multiple 
instructions and using that instead of the -msave-restore option.  I 
don't know if anyone has tried this yet, but it would be an interesting 
experiment that might result in smaller code size.

Jim

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-03-21  1:20 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-20 12:25 RISC-V sibcall optimization with save-restore Paulo Matos
2019-03-21  1:20 ` Jim Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).