public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged
@ 2024-06-27 17:51 Explorer09 at gmail dot com
  2024-06-27 18:17 ` [Bug target/115687] " palmer at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Explorer09 at gmail dot com @ 2024-06-27 17:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

            Bug ID: 115687
           Summary: RISC-V optimization when "lui" instructions can be
                    merged
           Product: gcc
           Version: 14.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

```c
#include <stdio.h>
int main() {
    unsigned int a = 0x4010; // (0x4 << 12) + 0x10
    unsigned int b = 0x4020; // (0x4 << 12) + 0x20
    unsigned int c = 0x3FF0; // (0x4 << 12) - 0x10
    printf("0x%08x 0x%08x 0x%08x\n", a, b, c);
    return 0;
}
```

RISC-V (64-bits) GCC 14.1 with -Os option produces:

```text
    li a3,16384
    li a2,16384
    li a1,16384
    lui a0,%hi(.LC0)
    addi sp,sp,-16
    addi a3,a3,-16
    addi a2,a2,32
    addi a1,a1,16
    addi a0,a0,%lo(.LC0)
    sd ra,8(sp)
    call printf
    ld ra,8(sp)
    li a0,0
    addi sp,sp,16
    jr ra
```

There are three "li ...,16384" instructions generated, while it is possible to
reduce them to one. For example, by using the same "a1" register for the
addends:

```text
    li a1,16384
    ...
    addi a3,a1,-16
    addi a2,a1,32
    addi a1,a1,16
```

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
@ 2024-06-27 18:17 ` palmer at gcc dot gnu.org
  2024-06-27 18:30 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: palmer at gcc dot gnu.org @ 2024-06-27 18:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

palmer at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-06-27
     Ever confirmed|0                           |1
                 CC|                            |palmer at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from palmer at gcc dot gnu.org ---
Looks reasonable to me.  Here's a slightly smaller test case

long bar(unsigned int a, unsigned int b, unsigned int c);

long foo(void) {
    unsigned int a = 0x4010; // (0x4 << 12) + 0x10
    unsigned int b = 0x4020; // (0x4 << 12) + 0x20
    unsigned int c = 0x3FF0; // (0x4 << 12) - 0x10
    return bar(a, b, c);
}

which under -O2 produces

        li      a2,16384
        li      a1,16384
        li      a0,16384
        addi    a2,a2,-16
        addi    a1,a1,32
        addi    a0,a0,16
        tail    bar

Jeff had been doing a bunch of constant generation stuff, not sure if he's got
a fix for this one in the works?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
  2024-06-27 18:17 ` [Bug target/115687] " palmer at gcc dot gnu.org
@ 2024-06-27 18:30 ` pinskia at gcc dot gnu.org
  2024-06-27 18:32 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-06-27 18:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There is some code in cse.cc which does handle this.
See
https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-TARGET_005fCONST_005fANCHOR
also.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
  2024-06-27 18:17 ` [Bug target/115687] " palmer at gcc dot gnu.org
  2024-06-27 18:30 ` pinskia at gcc dot gnu.org
@ 2024-06-27 18:32 ` pinskia at gcc dot gnu.org
  2024-06-27 18:49 ` palmer at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-06-27 18:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> There is some code in cse.cc which does handle this.
> See
> https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-
> TARGET_005fCONST_005fANCHOR also.

MIPS, aarch64 and rs6000 all define TARGET_CONST_ANCHOR (well MIPS sets
targetm.const_anchor depending on if it is mips16/micromips or mips32/64).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
                   ` (2 preceding siblings ...)
  2024-06-27 18:32 ` pinskia at gcc dot gnu.org
@ 2024-06-27 18:49 ` palmer at gcc dot gnu.org
  2024-06-27 18:54 ` palmer at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: palmer at gcc dot gnu.org @ 2024-06-27 18:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

--- Comment #4 from palmer at gcc dot gnu.org ---
Just poking around a bit: I think this is coming from CSE, which is replacing

(insn 5 2 6 2 (set (reg:DI 135)
        (const_int 16384 [0x4000])) "pr115687.c":7:12 275 {*movdi_64bit}
     (nil))
(insn 6 5 7 2 (set (reg:DI 12 a2)
        (plus:DI (reg:DI 135)
            (const_int -16 [0xfffffffffffffff0]))) "pr115687.c":7:12 5 {adddi3}
     (expr_list:REG_EQUAL (const_int 16368 [0x3ff0])
        (nil)))
(insn 7 6 8 2 (set (reg:DI 136)
        (const_int 16384 [0x4000])) "pr115687.c":7:12 275 {*movdi_64bit}
     (nil))
(insn 8 7 9 2 (set (reg:DI 11 a1)
        (plus:DI (reg:DI 136)
            (const_int 32 [0x20]))) "pr115687.c":7:12 5 {adddi3}
     (expr_list:REG_EQUAL (const_int 16416 [0x4020])
        (nil)))
(insn 9 8 10 2 (set (reg:DI 137)
        (const_int 16384 [0x4000])) "pr115687.c":7:12 275 {*movdi_64bit}
     (nil))

with

(insn 5 2 6 2 (set (reg:DI 135)
        (const_int 16384 [0x4000])) "pr115687.c":7:12 275 {*movdi_64bit}
     (nil))
(insn 6 5 7 2 (set (reg:DI 12 a2)
        (const_int 16368 [0x3ff0])) "pr115687.c":7:12 273 {*mvconst_internal}
     (expr_list:REG_DEAD (reg:DI 135)
        (expr_list:REG_EQUAL (const_int 16368 [0x3ff0])
            (nil))))
(insn 7 6 8 2 (set (reg:DI 136)
        (reg:DI 135)) "pr115687.c":7:12 275 {*movdi_64bit}
     (expr_list:REG_EQUAL (const_int 16384 [0x4000])
        (nil)))
(insn 8 7 9 2 (set (reg:DI 11 a1)
        (const_int 16416 [0x4020])) "pr115687.c":7:12 273 {*mvconst_internal}
     (expr_list:REG_DEAD (reg:DI 136)
        (expr_list:REG_EQUAL (const_int 16416 [0x4020])
            (nil))))
(insn 9 8 10 2 (set (reg:DI 137)
        (reg:DI 135)) "pr115687.c":7:12 275 {*movdi_64bit}
     (expr_list:REG_EQUAL (const_int 16384 [0x4000])
        (nil)))
(insn 10 9 11 2 (set (reg:DI 10 a0)
        (const_int 16400 [0x4010])) "pr115687.c":7:12 273 {*mvconst_internal}
     (expr_list:REG_DEAD (reg:DI 137)
        (expr_list:REG_EQUAL (const_int 16400 [0x4010])
            (nil))))

that seems to be as-designed -- or at least as this comment in cse.cc seems to
be describing

          /* Find cheapest and skip it for the next time.   For items
             of equal cost, use this order:
             src_folded, src, src_eqv, src_related and hash table entry.  */

That seems like a bit of a heuristic, but I haven't poked around this stuff to
really understand how it's handling multiple uses of the incoming constant
anchor.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
                   ` (3 preceding siblings ...)
  2024-06-27 18:49 ` palmer at gcc dot gnu.org
@ 2024-06-27 18:54 ` palmer at gcc dot gnu.org
  2024-06-27 20:39 ` andrew at sifive dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: palmer at gcc dot gnu.org @ 2024-06-27 18:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

--- Comment #5 from palmer at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Andrew Pinski from comment #2)
> > There is some code in cse.cc which does handle this.
> > See
> > https://gcc.gnu.org/onlinedocs/gccint/Misc.html#index-
> > TARGET_005fCONST_005fANCHOR also.
> 
> MIPS, aarch64 and rs6000 all define TARGET_CONST_ANCHOR (well MIPS sets
> targetm.const_anchor depending on if it is mips16/micromips or mips32/64).

Oh, thanks, I didn't know about that.  It looks like just adding 

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9bba5da016e..6080298c36c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12019,6 +12019,9 @@ riscv_c_mode_for_floating_type (enum tree_index ti)
 #undef TARGET_C_MODE_FOR_FLOATING_TYPE
 #define TARGET_C_MODE_FOR_FLOATING_TYPE riscv_c_mode_for_floating_type

+#undef TARGET_CONST_ANCHOR
+#define TARGET_CONST_ANCHOR 0x4000
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"

isn't enough for us here, but it seems like we should have something along
those lines.  2d7c73ee5ea ("AArch64: Enable TARGET_CONST_ANCHOR") has a test
case, so hopefully it's not that tricky to get something that exposes the issue
on RISC-V as well...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
                   ` (4 preceding siblings ...)
  2024-06-27 18:54 ` palmer at gcc dot gnu.org
@ 2024-06-27 20:39 ` andrew at sifive dot com
  2024-06-27 21:02 ` vineetg at gcc dot gnu.org
  2024-06-27 21:57 ` palmer at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: andrew at sifive dot com @ 2024-06-27 20:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

Andrew Waterman <andrew at sifive dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew at sifive dot com

--- Comment #6 from Andrew Waterman <andrew at sifive dot com> ---
I note MIPS sets TARGET_CONST_ANCHOR to 0x8000, and that architecture's ADDIU
instruction has a 16-bit immediate.  RISC-V's ADDI instruction has a 12-bit
immediate, so presumably we should be setting it to 0x800.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
                   ` (5 preceding siblings ...)
  2024-06-27 20:39 ` andrew at sifive dot com
@ 2024-06-27 21:02 ` vineetg at gcc dot gnu.org
  2024-06-27 21:57 ` palmer at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: vineetg at gcc dot gnu.org @ 2024-06-27 21:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

Vineet Gupta <vineetg at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vineetg at gcc dot gnu.org

--- Comment #7 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
Not sure if the anchor fixes the issue.

The deeper issue with CSE1 tripping up is the unfortunate side-effect of
mvconst_internal splitter, added in gcc-13 cycle.

commit 2e886eef7f2b5aadb00171af868f0895b647c3a4
Author: Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
Date:   Tue Dec 27 18:29:25 2022 -0500

    RISC-V: Produce better code with complex constants [PR95632] [PR106602]

    gcc/Changelog:
            PR target/95632
            PR target/106602
            * config/riscv/riscv.md: New pattern to simulate complex
            const_int loads.

It helped fix a bunch of issues but with time both Jeff and I agree that it
needs to be reverted. Its on our list of TODO - RISE CT_00_029.

With that reverted we only get single copy of LUI.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/115687] RISC-V optimization when "lui" instructions can be merged
  2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
                   ` (6 preceding siblings ...)
  2024-06-27 21:02 ` vineetg at gcc dot gnu.org
@ 2024-06-27 21:57 ` palmer at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: palmer at gcc dot gnu.org @ 2024-06-27 21:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115687

--- Comment #8 from Palmer Dabbelt <palmer at gcc dot gnu.org> ---
(In reply to Andrew Waterman from comment #6)
> I note MIPS sets TARGET_CONST_ANCHOR to 0x8000, and that architecture's
> ADDIU instruction has a 16-bit immediate.  RISC-V's ADDI instruction has a
> 12-bit immediate, so presumably we should be setting it to 0x800.

Ya, sorry, I wasn't paying attention -- regardless I think Vineet's on the
right track here with the splitter messing with us here, the incoming code has
a constant anchor already so dealing with those is sort of a different problem.

If removing that splitter is on the TODO list that seems reasonable, though I'd
taken a very different approach and just hacked up a post-split CSE as it seems
like we could end up in more situations like this.  I have no idea if that's a
sane idea, I sent an RFC to the lists to see what people think...

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-06-27 21:57 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-27 17:51 [Bug other/115687] New: RISC-V optimization when "lui" instructions can be merged Explorer09 at gmail dot com
2024-06-27 18:17 ` [Bug target/115687] " palmer at gcc dot gnu.org
2024-06-27 18:30 ` pinskia at gcc dot gnu.org
2024-06-27 18:32 ` pinskia at gcc dot gnu.org
2024-06-27 18:49 ` palmer at gcc dot gnu.org
2024-06-27 18:54 ` palmer at gcc dot gnu.org
2024-06-27 20:39 ` andrew at sifive dot com
2024-06-27 21:02 ` vineetg at gcc dot gnu.org
2024-06-27 21:57 ` palmer at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).