public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
@ 2023-03-24 22:04 vineet.gupta at linux dot dev
  2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

            Bug ID: 109279
           Summary: [13 Regression] RISC-V: complex constants synthesized
                    vs. fetching from constant pool
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vineet.gupta at linux dot dev
  Target Milestone: ---

This is code bloat regression since gcc 12.1, seen yet again in SPEC2017
deepsjeng. After 2e886eef7f2b5a ("RISC-V: Produce better code with complex
constants [PR95632] [PR106602]").

unsigned long long FileAttacks(unsigned long long occ, const unsigned int sq) {
    unsigned int o;
    unsigned int f = sq & 7;

    occ   =   0x0101010101010101ULL & (occ   >> f);
    o     = ( 0x0080402010080400ULL *  occ ) >> 58;
    return  ( aFileAttacks[o][sq>>3]    ) << f;
}

cc1 -O2 -march=rv64gc_zba_zbb_zbc_zbs -mabi=lp64d   # stage1 is enough

Before above commit
-------------------
        lui     a4,%hi(.LC0)
        ld      a4,%lo(.LC0)(a4)
        andi    a3,a1,7
        srl     a5,a0,a3
        and     a5,a5,a4
        lui     a4,%hi(.LC1)
        ld      a4,%lo(.LC1)(a4)
        srliw   a1,a1,3
        mul     a5,a5,a4
        lui     a4,%hi(aFileAttacks)
        addi    a4,a4,%lo(aFileAttacks)
        srli    a5,a5,58
        sh3add  a5,a5,a1
        sh3add  a5,a5,a4
        ld      a0,0(a5)
        sll     a0,a0,a3
        ret

        .section        .srodata.cst8,"aM",@progbits,8
        .align  3
.LC0:
        .dword  0x0101010101010101
        .align  3
.LC1:
        .dword  0x0080402010080400


With commit
-----------

       li      a5,16842752
        addi    a5,a5,257
        slli    a5,a5,16
        addi    a5,a5,257
        andi    a3,a1,7
        slli    a5,a5,16
        srl     a4,a0,a3
        addi    a5,a5,257
        and     a4,a4,a5
        slli    a5,a4,9
        add     a5,a5,a4
        slli    a5,a5,9
        add     a5,a5,a4
        slli    a4,a5,27
        add     a5,a5,a4
        srli    a5,a5,45
        srliw   a1,a1,3
        andi    a5,a5,504
        lui     a4,%hi(aFileAttacks)
        add     a5,a5,a1
        addi    a4,a4,%lo(aFileAttacks)
        sh3add  a5,a5,a4
        ld      a0,0(a5)
        sll     a0,a0,a3
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
@ 2023-03-24 22:08 ` pinskia at gcc dot gnu.org
  2023-03-24 22:08 ` pinskia at gcc dot gnu.org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note I think this really depends on the micro-arch if loading from memory is
faster than doing instructions for constant forming.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
  2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
@ 2023-03-24 22:08 ` pinskia at gcc dot gnu.org
  2023-03-24 22:13 ` vineet.gupta at linux dot dev
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
If this was about -Os, then I would say yes this is a big code bloat but this
is about -O2.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
  2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
  2023-03-24 22:08 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:13 ` vineet.gupta at linux dot dev
  2023-03-24 22:15 ` vineet.gupta at linux dot dev
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #3 from Vineet Gupta <vineet.gupta at linux dot dev> ---
We start off with following:

  (insn 18 17 19 2 (set (reg:DI 154)
     (mem/u/c:DI (reg/f:DI 155) [0  S8 A64])) "...":9:8 179 {*movdi_64bit}
     (expr_list:REG_DEAD (reg/f:DI 155)
        (expr_list:REG_EQUAL (const_int [0x101010101010101])

cse1 matches the new pattern and converts mem to const_int.

   (insn 18 17 19 2 (set (reg:DI 154)
        (const_int [0x101010101010101])) "...":9:8 177 {*mvconst_internal}
     (expr_list:REG_DEAD (reg/f:DI 155)
        (expr_list:REG_EQUAL (const_int [0x101010101010101])

split1 then does its job, again using the introduced define_insn_and_split
"*mvconst_internal"

Splitting with gen_split_15 (riscv.md:1677)
deleting insn with uid = 18.

   (insn 47 15 48 2 (set (reg:DI 154)
        (const_int 16842752 [0x1010000])) "...":9:8 -1

   (insn 48 47 49 2 (set (reg:DI 154)
        (plus:DI (reg:DI 154)
            (const_int 257 [0x101]))) "...":9:8 -1

   (insn 49 48 50 2 (set (reg:DI 154)
        (ashift:DI (reg:DI 154)
            (const_int 16 [0x10]))) "...":9:8 -1

   (insn 50 49 51 2 (set (reg:DI 154)
        (plus:DI (reg:DI 154)
            (const_int 257 [0x101]))) "...":9:8 -1

   (insn 51 50 52 2 (set (reg:DI 154)
        (ashift:DI (reg:DI 154)
            (const_int 16 [0x10]))) "...":9:8 -1

   (insn 52 51 19 2 (set (reg:DI 154)
        (plus:DI (reg:DI 154)
            (const_int 257 [0x101]))) "...":9:8 -1
     (expr_list:REG_EQUAL (const_int [0x101010101010101])

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (2 preceding siblings ...)
  2023-03-24 22:13 ` vineet.gupta at linux dot dev
@ 2023-03-24 22:15 ` vineet.gupta at linux dot dev
  2023-03-24 22:17 ` pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #4 from Vineet Gupta <vineet.gupta at linux dot dev> ---
(In reply to Andrew Pinski from comment #2)
> If this was about -Os, then I would say yes this is a big code bloat but
> this is about -O2.

But this is not so much about code bloat, we see 3.5% additional dynamic icount
on qemu which will affect perf even if we didn't care about code size at all.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (3 preceding siblings ...)
  2023-03-24 22:15 ` vineet.gupta at linux dot dev
@ 2023-03-24 22:17 ` pinskia at gcc dot gnu.org
  2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #4)
> But this is not so much about code bloat, we see 3.5% additional dynamic
> icount on qemu which will affect perf even if we didn't care about code size
> at all.

Again dynamic icount increase does not always equals worse performance;
especially when it comes to loading from memory. It is a much more complex
equation depending on the cache and such.
Now we should be able to form these constants better in the first place too.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (4 preceding siblings ...)
  2023-03-24 22:17 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:29 ` pinskia at gcc dot gnu.org
  2023-03-24 22:33 ` pinskia at gcc dot gnu.org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-03-24
             Status|UNCONFIRMED                 |NEW
            Summary|[13 Regression] RISC-V:     |RISC-V: complex constants
                   |complex constants           |synthesized should be
                   |synthesized vs. fetching    |improved
                   |from constant pool          |
     Ever confirmed|0                           |1

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Take:
long long f(void)
{
  return 0x0101010101010101ull;
}

---- CUT ----
This should be done as:
        li      a0,16842752
        addi    a0,a0,257
        slli    a1,a0,32
        or      a0,a0,a1
Which is what this produces:
```
long long f(void)
{
  unsigned t = 16843009;
  long long t1 = t;
  long long t2 = ((unsigned long long )t) << 32;
  asm("":"+r"(t1));
  return t1 | t2;
}
```
I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
added too. Will definitely improve complex constants forming too.

Right now the backend does (const<<16+const)<<16+const... which is just so bad.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (5 preceding siblings ...)
  2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
@ 2023-03-24 22:33 ` pinskia at gcc dot gnu.org
  2023-03-24 22:34 ` pinskia at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
riscv_split_integer does something similarlly to what I was suggesting but for
some reason it is not kicking in for this case ...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (6 preceding siblings ...)
  2023-03-24 22:33 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:34 ` pinskia at gcc dot gnu.org
  2023-03-24 23:33 ` pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
oh right this is because can_create_pseudo is false ....

This code is just so broken ...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (7 preceding siblings ...)
  2023-03-24 22:34 ` pinskia at gcc dot gnu.org
@ 2023-03-24 23:33 ` pinskia at gcc dot gnu.org
  2023-03-25  0:44 ` vineet.gupta at linux dot dev
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 23:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The comment in riscv.cc:
  /* We can't call gen_reg_rtx from a splitter, because this might realloc
     the regno_reg_rtx array, which would invalidate reg rtx pointers in the
     combine undo buffer.  */
  bool can_create_pseudo = can_create_pseudo_p () && ! in_splitter;

is no longer true after r12-8030-g61bee6aed26eb3

so you should be able to get rid of the `&& ! in_splitter` part.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (8 preceding siblings ...)
  2023-03-24 23:33 ` pinskia at gcc dot gnu.org
@ 2023-03-25  0:44 ` vineet.gupta at linux dot dev
  2023-03-25  0:48 ` vineet.gupta at linux dot dev
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25  0:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #10 from Vineet Gupta <vineet.gupta at linux dot dev> ---
I tried removing the in_splitter  check (in 2 places), but no change in
results.

@@ -1313,7 +1313,7 @@ riscv_force_temporary (rtx dest, rtx value, bool
in_splitter)

-  if (can_create_pseudo_p () && !in_splitter)
+  if (can_create_pseudo_p ())


@@ -1669,7 +1669,7 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT
value,

-  bool can_create_pseudo = can_create_pseudo_p () && ! in_splitter;
+  bool can_create_pseudo = can_create_pseudo_p ();

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (9 preceding siblings ...)
  2023-03-25  0:44 ` vineet.gupta at linux dot dev
@ 2023-03-25  0:48 ` vineet.gupta at linux dot dev
  2023-03-25  0:56 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25  0:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #11 from Vineet Gupta <vineet.gupta at linux dot dev> ---
With change suggested by @pinksia, I do see that in split1, 
riscv_move_integer() -> riscv_split_integer() is now called.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (10 preceding siblings ...)
  2023-03-25  0:48 ` vineet.gupta at linux dot dev
@ 2023-03-25  0:56 ` pinskia at gcc dot gnu.org
  2023-03-25  1:12 ` vineet.gupta at linux dot dev
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-25  0:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is something to look into:
#define const1 0x0101010101010101ULL 
#define const2 0x0080402010080400ULL 
#define const0 const1
unsigned long long g(unsigned long long occ, const unsigned int sq) {
  return const0 ;
}
unsigned long long f(unsigned long long occ, const unsigned int sq) {
  unsigned long long t= (const0)>>32<<32 ;
  unsigned long long t1= (unsigned int)(const0) ;
  asm("":"+r"(t));
  return t | t1;
}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (11 preceding siblings ...)
  2023-03-25  0:56 ` pinskia at gcc dot gnu.org
@ 2023-03-25  1:12 ` vineet.gupta at linux dot dev
  2023-03-30 18:40 ` vineet.gupta at linux dot dev
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25  1:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #13 from Vineet Gupta <vineet.gupta at linux dot dev> ---
Ok it seems I missed _some_ improvement with prev change, although not ideal
still.

With 2e886eef7f2b

        li      a0,0x0101_0000
        addi    a0,a0,0x0101
        slli    a0,a0,16
        addi    a0,a0,0x0101
        slli    a0,a0,16
        addi    a0,a0,0x0101
        ret

Allow can_create_pseudo() in splitter

        li      a0,0x0101_0000
        addi    a5,a5,0x0101
        mv      a0,a5
        slli    a5,a5,32
        add     a0,a5,a0
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (12 preceding siblings ...)
  2023-03-25  1:12 ` vineet.gupta at linux dot dev
@ 2023-03-30 18:40 ` vineet.gupta at linux dot dev
  2023-05-19 17:31 ` vineetg at gcc dot gnu.org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-30 18:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #14 from Vineet Gupta <vineet.gupta at linux dot dev> ---
(In reply to Andrew Pinski from comment #12)
> Here is something to look into:

> #define const1 0x0101010101010101ULL 
> #define const0 const1

> unsigned long long f(unsigned long long occ, const unsigned int sq) {
>   unsigned long long t= (const0)>>32<<32 ;
>   unsigned long long t1= (unsigned int)(const0) ;
>   asm("":"+r"(t));
>   return t | t1;
> }

Before commit 2e886eef7f2

        li      a5,16842752
        addi    a5,a5,257
        slli    a0,a5,32
        or      a0,a0,a5
        ret

With commit 2e886eef7f2

        li      a0,16842752
        addi    a0,a0,257
        mv      a5,a0
        slli    a0,a0,32
        or      a0,a0,a5
        ret

With commit 2e886eef7f2 + in_splitter check removed from riscv_move_integer()

        li      a0,16842752
        li      a5,16842752
        addi    a5,a5,257
        addi    a0,a0,257
        slli    a0,a0,32
        or      a0,a0,a5
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (13 preceding siblings ...)
  2023-03-30 18:40 ` vineet.gupta at linux dot dev
@ 2023-05-19 17:31 ` vineetg at gcc dot gnu.org
  2023-05-19 18:13 ` vineetg at gcc dot gnu.org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-05-19 17:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

Vineet Gupta <vineetg at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vineetg at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #15 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #6)
> Take:
> long long f(void)
> {
>   return 0x0101010101010101ull;
> }
> 
> ---- CUT ----
> This should be done as:
>         li      a0,16842752
>         addi    a0,a0,257
>         slli    a1,a0,32
>         or      a0,a0,a1

I committed a fix today which gets us to exactly that [1]

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618948.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (14 preceding siblings ...)
  2023-05-19 17:31 ` vineetg at gcc dot gnu.org
@ 2023-05-19 18:13 ` vineetg at gcc dot gnu.org
  2023-08-15 17:37 ` vineetg at gcc dot gnu.org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-05-19 18:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #16 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
> Which is what this produces:
> ```
> long long f(void)
> {
>   unsigned t = 16843009;
>   long long t1 = t;
>   long long t2 = ((unsigned long long )t) << 32;
>   asm("":"+r"(t1));
>   return t1 | t2;
> }
> ```
> I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
> added too. Will definitely improve complex constants forming too.
> 
> Right now the backend does (const<<16+const)<<16+const... which is just so
> bad.

Umm this testcase is a different problem. It used to generate the same output
but no longer after g2e886eef7f2b5a and the other related updates:
g0530254413f8 and gc104ef4b5eb1.

For the test above, the low and high words are created independently and then
stitched.

260r.dfinit

# lower word

(insn 6 2 7 2 (set (reg:DI 138)
        (const_int [0x1010000]))  {*movdi_64bit}
(insn 7 6 8 2 (set (reg:DI 137)
        (plus:DI (reg:DI 138)
            (const_int [0x101]))) {adddi3}
     (expr_list:REG_EQUAL (const_int [0x1010101]) )
(insn 5 8 9 2 (set (reg/v:DI 134 [ t1 ])
        (reg:DI 136 [ t1 ])) {*movdi_64bit}

# upper word created independently, no reuse from prior values)

(insn 9 5 10 2 (set (reg:DI 141)
        (const_int [0x1010000]))  {*movdi_64bit}
(insn 10 9 11 2 (set (reg:DI 142)
        (plus:DI (reg:DI 141)
            (const_int [0x101]))) {adddi3}
(insn 11 10 12 2 (set (reg:DI 140)
        (ashift:DI (reg:DI 142)
            (const_int 32 [0x20]))) {ashldi3}
        (expr_list:REG_EQUAL (const_int [0x101010100000000]))

# stitch them
(insn 12 11 13 2 (set (reg:DI 139)
        (ior:DI (reg/v:DI 134 [ t1 ])
            (reg:DI 140))) "const2.c":7:13 99 {iordi3}


cse1 matches the new "*mvconst_internal" pattern independently on each of them 

(insn 7 6 8 2 (set (reg:DI 137)
        (const_int [0x1010101])) {*mvconst_internal}
        (expr_list:REG_EQUAL (const_int [0x1010101])))

(insn 11 10 12 2 (set (reg:DI 140)
        (const_int [0x1010101_00000000])) {*mvconst_internal}
        (expr_list:REG_EQUAL (const_int   
                [0x1010101_00000000]) ))

This ultimately gets in the way, as otherwise it would find the equivalent reg
across the 2 snippets and reuse reg.

It is interesting that due to same pattern, split1 undoes what cse1 did so in
theory cse2 ? could redo it it. Anyhow needs to be investigated. But ATM we
have the following codegen for the aforementioned test which clearly needs more
work.

        li      a0,16842752
        addi    a0,a0,257
        li      a5,16842752
        slli    a0,a0,32
        addi    a5,a5,257
        or      a0,a5,a0
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (15 preceding siblings ...)
  2023-05-19 18:13 ` vineetg at gcc dot gnu.org
@ 2023-08-15 17:37 ` vineetg at gcc dot gnu.org
  2023-10-06 19:37 ` vineetg at gcc dot gnu.org
  2023-10-06 19:38 ` vineetg at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-08-15 17:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #17 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #16)
> > Which is what this produces:
> > ```
> > long long f(void)
> > {
> >   unsigned t = 16843009;
> >   long long t1 = t;
> >   long long t2 = ((unsigned long long )t) << 32;
> >   asm("":"+r"(t1));
> >   return t1 | t2;
> > }
> > ```

> 
> 	li	a0,16842752
> 	addi	a0,a0,257
> 	li	a5,16842752
> 	slli	a0,a0,32
> 	addi	a5,a5,257
> 	or	a0,a5,a0
> 	ret

This is again IRA inflicted pain (similar to [PR110748]). 
IRA seems to be undoing split1 since we have 2 insn sequences to synthesize the
constant pieces. This explains why the problem got exacerbated with commit
0530254413f8 ("riscv: relax splitter restrictions for creating pseudos") since
now different regs are used to create parts of const, vs 1 reg being repeatedly
used for assembling a const (fooling IRA's equivalent replacement logic).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (16 preceding siblings ...)
  2023-08-15 17:37 ` vineetg at gcc dot gnu.org
@ 2023-10-06 19:37 ` vineetg at gcc dot gnu.org
  2023-10-06 19:38 ` vineetg at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-10-06 19:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #18 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #17)
> (In reply to Vineet Gupta from comment #16)
> > > Which is what this produces:
> > > ```
> > > long long f(void)
> > > {
> > >   unsigned t = 16843009;
> > >   long long t1 = t;
> > >   long long t2 = ((unsigned long long )t) << 32;
> > >   asm("":"+r"(t1));
> > >   return t1 | t2;
> > > }
> > > ```

> > 	li	a0,16842752
> > 	addi	a0,a0,257
> > 	li	a5,16842752
> > 	slli	a0,a0,32
> > 	addi	a5,a5,257
> > 	or	a0,a5,a0
> > 	ret
> 
> This is again IRA inflicted pain (similar to [PR110748]). 
> IRA seems to be undoing split1 since we have 2 insn sequences to synthesize
> the constant pieces. This explains why the problem got exacerbated with
> commit 0530254413f8 ("riscv: relax splitter restrictions for creating
> pseudos") since now different regs are used to create parts of const, vs 1
> reg being repeatedly used for assembling a const (fooling IRA's equivalent
> replacement logic).

After commit 
    2023-08-18 a047513c9222 RISC-V: Enable pressure-aware scheduling by
default.  

the test above has improved.

        li      a5,16842752
        addi    a5,a5,257
        mv      a0,a5
        slli    a5,a5,32
        or      a0,a0,a5
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/109279] RISC-V: complex constants synthesized should be improved
  2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
                   ` (17 preceding siblings ...)
  2023-10-06 19:37 ` vineetg at gcc dot gnu.org
@ 2023-10-06 19:38 ` vineetg at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-10-06 19:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279

--- Comment #19 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
FWIW with today's change, splitter is now hidden from IRA, but we are still
getting the extraneous mv.

2023-10-06 c1bc7513b1d7 RISC-V: const: hide mvconst splitter from IRA

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-10-06 19:38 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
2023-03-24 22:08 ` pinskia at gcc dot gnu.org
2023-03-24 22:13 ` vineet.gupta at linux dot dev
2023-03-24 22:15 ` vineet.gupta at linux dot dev
2023-03-24 22:17 ` pinskia at gcc dot gnu.org
2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
2023-03-24 22:33 ` pinskia at gcc dot gnu.org
2023-03-24 22:34 ` pinskia at gcc dot gnu.org
2023-03-24 23:33 ` pinskia at gcc dot gnu.org
2023-03-25  0:44 ` vineet.gupta at linux dot dev
2023-03-25  0:48 ` vineet.gupta at linux dot dev
2023-03-25  0:56 ` pinskia at gcc dot gnu.org
2023-03-25  1:12 ` vineet.gupta at linux dot dev
2023-03-30 18:40 ` vineet.gupta at linux dot dev
2023-05-19 17:31 ` vineetg at gcc dot gnu.org
2023-05-19 18:13 ` vineetg at gcc dot gnu.org
2023-08-15 17:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:38 ` vineetg at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).