public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
@ 2023-03-24 22:04 vineet.gupta at linux dot dev
2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
` (18 more replies)
0 siblings, 19 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
Bug ID: 109279
Summary: [13 Regression] RISC-V: complex constants synthesized
vs. fetching from constant pool
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vineet.gupta at linux dot dev
Target Milestone: ---
This is code bloat regression since gcc 12.1, seen yet again in SPEC2017
deepsjeng. After 2e886eef7f2b5a ("RISC-V: Produce better code with complex
constants [PR95632] [PR106602]").
unsigned long long FileAttacks(unsigned long long occ, const unsigned int sq) {
unsigned int o;
unsigned int f = sq & 7;
occ = 0x0101010101010101ULL & (occ >> f);
o = ( 0x0080402010080400ULL * occ ) >> 58;
return ( aFileAttacks[o][sq>>3] ) << f;
}
cc1 -O2 -march=rv64gc_zba_zbb_zbc_zbs -mabi=lp64d # stage1 is enough
Before above commit
-------------------
lui a4,%hi(.LC0)
ld a4,%lo(.LC0)(a4)
andi a3,a1,7
srl a5,a0,a3
and a5,a5,a4
lui a4,%hi(.LC1)
ld a4,%lo(.LC1)(a4)
srliw a1,a1,3
mul a5,a5,a4
lui a4,%hi(aFileAttacks)
addi a4,a4,%lo(aFileAttacks)
srli a5,a5,58
sh3add a5,a5,a1
sh3add a5,a5,a4
ld a0,0(a5)
sll a0,a0,a3
ret
.section .srodata.cst8,"aM",@progbits,8
.align 3
.LC0:
.dword 0x0101010101010101
.align 3
.LC1:
.dword 0x0080402010080400
With commit
-----------
li a5,16842752
addi a5,a5,257
slli a5,a5,16
addi a5,a5,257
andi a3,a1,7
slli a5,a5,16
srl a4,a0,a3
addi a5,a5,257
and a4,a4,a5
slli a5,a4,9
add a5,a5,a4
slli a5,a5,9
add a5,a5,a4
slli a4,a5,27
add a5,a5,a4
srli a5,a5,45
srliw a1,a1,3
andi a5,a5,504
lui a4,%hi(aFileAttacks)
add a5,a5,a1
addi a4,a4,%lo(aFileAttacks)
sh3add a5,a5,a4
ld a0,0(a5)
sll a0,a0,a3
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
@ 2023-03-24 22:08 ` pinskia at gcc dot gnu.org
2023-03-24 22:08 ` pinskia at gcc dot gnu.org
` (17 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note I think this really depends on the micro-arch if loading from memory is
faster than doing instructions for constant forming.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
@ 2023-03-24 22:08 ` pinskia at gcc dot gnu.org
2023-03-24 22:13 ` vineet.gupta at linux dot dev
` (16 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
If this was about -Os, then I would say yes this is a big code bloat but this
is about -O2.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
2023-03-24 22:08 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:13 ` vineet.gupta at linux dot dev
2023-03-24 22:15 ` vineet.gupta at linux dot dev
` (15 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #3 from Vineet Gupta <vineet.gupta at linux dot dev> ---
We start off with following:
(insn 18 17 19 2 (set (reg:DI 154)
(mem/u/c:DI (reg/f:DI 155) [0 S8 A64])) "...":9:8 179 {*movdi_64bit}
(expr_list:REG_DEAD (reg/f:DI 155)
(expr_list:REG_EQUAL (const_int [0x101010101010101])
cse1 matches the new pattern and converts mem to const_int.
(insn 18 17 19 2 (set (reg:DI 154)
(const_int [0x101010101010101])) "...":9:8 177 {*mvconst_internal}
(expr_list:REG_DEAD (reg/f:DI 155)
(expr_list:REG_EQUAL (const_int [0x101010101010101])
split1 then does its job, again using the introduced define_insn_and_split
"*mvconst_internal"
Splitting with gen_split_15 (riscv.md:1677)
deleting insn with uid = 18.
(insn 47 15 48 2 (set (reg:DI 154)
(const_int 16842752 [0x1010000])) "...":9:8 -1
(insn 48 47 49 2 (set (reg:DI 154)
(plus:DI (reg:DI 154)
(const_int 257 [0x101]))) "...":9:8 -1
(insn 49 48 50 2 (set (reg:DI 154)
(ashift:DI (reg:DI 154)
(const_int 16 [0x10]))) "...":9:8 -1
(insn 50 49 51 2 (set (reg:DI 154)
(plus:DI (reg:DI 154)
(const_int 257 [0x101]))) "...":9:8 -1
(insn 51 50 52 2 (set (reg:DI 154)
(ashift:DI (reg:DI 154)
(const_int 16 [0x10]))) "...":9:8 -1
(insn 52 51 19 2 (set (reg:DI 154)
(plus:DI (reg:DI 154)
(const_int 257 [0x101]))) "...":9:8 -1
(expr_list:REG_EQUAL (const_int [0x101010101010101])
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (2 preceding siblings ...)
2023-03-24 22:13 ` vineet.gupta at linux dot dev
@ 2023-03-24 22:15 ` vineet.gupta at linux dot dev
2023-03-24 22:17 ` pinskia at gcc dot gnu.org
` (14 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-24 22:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #4 from Vineet Gupta <vineet.gupta at linux dot dev> ---
(In reply to Andrew Pinski from comment #2)
> If this was about -Os, then I would say yes this is a big code bloat but
> this is about -O2.
But this is not so much about code bloat, we see 3.5% additional dynamic icount
on qemu which will affect perf even if we didn't care about code size at all.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (3 preceding siblings ...)
2023-03-24 22:15 ` vineet.gupta at linux dot dev
@ 2023-03-24 22:17 ` pinskia at gcc dot gnu.org
2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
` (13 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #4)
> But this is not so much about code bloat, we see 3.5% additional dynamic
> icount on qemu which will affect perf even if we didn't care about code size
> at all.
Again dynamic icount increase does not always equals worse performance;
especially when it comes to loading from memory. It is a much more complex
equation depending on the cache and such.
Now we should be able to form these constants better in the first place too.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (4 preceding siblings ...)
2023-03-24 22:17 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:29 ` pinskia at gcc dot gnu.org
2023-03-24 22:33 ` pinskia at gcc dot gnu.org
` (12 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2023-03-24
Status|UNCONFIRMED |NEW
Summary|[13 Regression] RISC-V: |RISC-V: complex constants
|complex constants |synthesized should be
|synthesized vs. fetching |improved
|from constant pool |
Ever confirmed|0 |1
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Take:
long long f(void)
{
return 0x0101010101010101ull;
}
---- CUT ----
This should be done as:
li a0,16842752
addi a0,a0,257
slli a1,a0,32
or a0,a0,a1
Which is what this produces:
```
long long f(void)
{
unsigned t = 16843009;
long long t1 = t;
long long t2 = ((unsigned long long )t) << 32;
asm("":"+r"(t1));
return t1 | t2;
}
```
I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
added too. Will definitely improve complex constants forming too.
Right now the backend does (const<<16+const)<<16+const... which is just so bad.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (5 preceding siblings ...)
2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
@ 2023-03-24 22:33 ` pinskia at gcc dot gnu.org
2023-03-24 22:34 ` pinskia at gcc dot gnu.org
` (11 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
riscv_split_integer does something similarlly to what I was suggesting but for
some reason it is not kicking in for this case ...
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (6 preceding siblings ...)
2023-03-24 22:33 ` pinskia at gcc dot gnu.org
@ 2023-03-24 22:34 ` pinskia at gcc dot gnu.org
2023-03-24 23:33 ` pinskia at gcc dot gnu.org
` (10 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 22:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
oh right this is because can_create_pseudo is false ....
This code is just so broken ...
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (7 preceding siblings ...)
2023-03-24 22:34 ` pinskia at gcc dot gnu.org
@ 2023-03-24 23:33 ` pinskia at gcc dot gnu.org
2023-03-25 0:44 ` vineet.gupta at linux dot dev
` (9 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-24 23:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The comment in riscv.cc:
/* We can't call gen_reg_rtx from a splitter, because this might realloc
the regno_reg_rtx array, which would invalidate reg rtx pointers in the
combine undo buffer. */
bool can_create_pseudo = can_create_pseudo_p () && ! in_splitter;
is no longer true after r12-8030-g61bee6aed26eb3
so you should be able to get rid of the `&& ! in_splitter` part.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (8 preceding siblings ...)
2023-03-24 23:33 ` pinskia at gcc dot gnu.org
@ 2023-03-25 0:44 ` vineet.gupta at linux dot dev
2023-03-25 0:48 ` vineet.gupta at linux dot dev
` (8 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25 0:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #10 from Vineet Gupta <vineet.gupta at linux dot dev> ---
I tried removing the in_splitter check (in 2 places), but no change in
results.
@@ -1313,7 +1313,7 @@ riscv_force_temporary (rtx dest, rtx value, bool
in_splitter)
- if (can_create_pseudo_p () && !in_splitter)
+ if (can_create_pseudo_p ())
@@ -1669,7 +1669,7 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT
value,
- bool can_create_pseudo = can_create_pseudo_p () && ! in_splitter;
+ bool can_create_pseudo = can_create_pseudo_p ();
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (9 preceding siblings ...)
2023-03-25 0:44 ` vineet.gupta at linux dot dev
@ 2023-03-25 0:48 ` vineet.gupta at linux dot dev
2023-03-25 0:56 ` pinskia at gcc dot gnu.org
` (7 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25 0:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #11 from Vineet Gupta <vineet.gupta at linux dot dev> ---
With change suggested by @pinksia, I do see that in split1,
riscv_move_integer() -> riscv_split_integer() is now called.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (10 preceding siblings ...)
2023-03-25 0:48 ` vineet.gupta at linux dot dev
@ 2023-03-25 0:56 ` pinskia at gcc dot gnu.org
2023-03-25 1:12 ` vineet.gupta at linux dot dev
` (6 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-25 0:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is something to look into:
#define const1 0x0101010101010101ULL
#define const2 0x0080402010080400ULL
#define const0 const1
unsigned long long g(unsigned long long occ, const unsigned int sq) {
return const0 ;
}
unsigned long long f(unsigned long long occ, const unsigned int sq) {
unsigned long long t= (const0)>>32<<32 ;
unsigned long long t1= (unsigned int)(const0) ;
asm("":"+r"(t));
return t | t1;
}
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (11 preceding siblings ...)
2023-03-25 0:56 ` pinskia at gcc dot gnu.org
@ 2023-03-25 1:12 ` vineet.gupta at linux dot dev
2023-03-30 18:40 ` vineet.gupta at linux dot dev
` (5 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-25 1:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #13 from Vineet Gupta <vineet.gupta at linux dot dev> ---
Ok it seems I missed _some_ improvement with prev change, although not ideal
still.
With 2e886eef7f2b
li a0,0x0101_0000
addi a0,a0,0x0101
slli a0,a0,16
addi a0,a0,0x0101
slli a0,a0,16
addi a0,a0,0x0101
ret
Allow can_create_pseudo() in splitter
li a0,0x0101_0000
addi a5,a5,0x0101
mv a0,a5
slli a5,a5,32
add a0,a5,a0
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (12 preceding siblings ...)
2023-03-25 1:12 ` vineet.gupta at linux dot dev
@ 2023-03-30 18:40 ` vineet.gupta at linux dot dev
2023-05-19 17:31 ` vineetg at gcc dot gnu.org
` (4 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineet.gupta at linux dot dev @ 2023-03-30 18:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #14 from Vineet Gupta <vineet.gupta at linux dot dev> ---
(In reply to Andrew Pinski from comment #12)
> Here is something to look into:
> #define const1 0x0101010101010101ULL
> #define const0 const1
> unsigned long long f(unsigned long long occ, const unsigned int sq) {
> unsigned long long t= (const0)>>32<<32 ;
> unsigned long long t1= (unsigned int)(const0) ;
> asm("":"+r"(t));
> return t | t1;
> }
Before commit 2e886eef7f2
li a5,16842752
addi a5,a5,257
slli a0,a5,32
or a0,a0,a5
ret
With commit 2e886eef7f2
li a0,16842752
addi a0,a0,257
mv a5,a0
slli a0,a0,32
or a0,a0,a5
ret
With commit 2e886eef7f2 + in_splitter check removed from riscv_move_integer()
li a0,16842752
li a5,16842752
addi a5,a5,257
addi a0,a0,257
slli a0,a0,32
or a0,a0,a5
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (13 preceding siblings ...)
2023-03-30 18:40 ` vineet.gupta at linux dot dev
@ 2023-05-19 17:31 ` vineetg at gcc dot gnu.org
2023-05-19 18:13 ` vineetg at gcc dot gnu.org
` (3 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-05-19 17:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
Vineet Gupta <vineetg at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vineetg at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #15 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #6)
> Take:
> long long f(void)
> {
> return 0x0101010101010101ull;
> }
>
> ---- CUT ----
> This should be done as:
> li a0,16842752
> addi a0,a0,257
> slli a1,a0,32
> or a0,a0,a1
I committed a fix today which gets us to exactly that [1]
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618948.html
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (14 preceding siblings ...)
2023-05-19 17:31 ` vineetg at gcc dot gnu.org
@ 2023-05-19 18:13 ` vineetg at gcc dot gnu.org
2023-08-15 17:37 ` vineetg at gcc dot gnu.org
` (2 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-05-19 18:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #16 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
> Which is what this produces:
> ```
> long long f(void)
> {
> unsigned t = 16843009;
> long long t1 = t;
> long long t2 = ((unsigned long long )t) << 32;
> asm("":"+r"(t1));
> return t1 | t2;
> }
> ```
> I suspect: 0x0080402010080400ULL should be done as two 32bit with a shift/or
> added too. Will definitely improve complex constants forming too.
>
> Right now the backend does (const<<16+const)<<16+const... which is just so
> bad.
Umm this testcase is a different problem. It used to generate the same output
but no longer after g2e886eef7f2b5a and the other related updates:
g0530254413f8 and gc104ef4b5eb1.
For the test above, the low and high words are created independently and then
stitched.
260r.dfinit
# lower word
(insn 6 2 7 2 (set (reg:DI 138)
(const_int [0x1010000])) {*movdi_64bit}
(insn 7 6 8 2 (set (reg:DI 137)
(plus:DI (reg:DI 138)
(const_int [0x101]))) {adddi3}
(expr_list:REG_EQUAL (const_int [0x1010101]) )
(insn 5 8 9 2 (set (reg/v:DI 134 [ t1 ])
(reg:DI 136 [ t1 ])) {*movdi_64bit}
# upper word created independently, no reuse from prior values)
(insn 9 5 10 2 (set (reg:DI 141)
(const_int [0x1010000])) {*movdi_64bit}
(insn 10 9 11 2 (set (reg:DI 142)
(plus:DI (reg:DI 141)
(const_int [0x101]))) {adddi3}
(insn 11 10 12 2 (set (reg:DI 140)
(ashift:DI (reg:DI 142)
(const_int 32 [0x20]))) {ashldi3}
(expr_list:REG_EQUAL (const_int [0x101010100000000]))
# stitch them
(insn 12 11 13 2 (set (reg:DI 139)
(ior:DI (reg/v:DI 134 [ t1 ])
(reg:DI 140))) "const2.c":7:13 99 {iordi3}
cse1 matches the new "*mvconst_internal" pattern independently on each of them
(insn 7 6 8 2 (set (reg:DI 137)
(const_int [0x1010101])) {*mvconst_internal}
(expr_list:REG_EQUAL (const_int [0x1010101])))
(insn 11 10 12 2 (set (reg:DI 140)
(const_int [0x1010101_00000000])) {*mvconst_internal}
(expr_list:REG_EQUAL (const_int
[0x1010101_00000000]) ))
This ultimately gets in the way, as otherwise it would find the equivalent reg
across the 2 snippets and reuse reg.
It is interesting that due to same pattern, split1 undoes what cse1 did so in
theory cse2 ? could redo it it. Anyhow needs to be investigated. But ATM we
have the following codegen for the aforementioned test which clearly needs more
work.
li a0,16842752
addi a0,a0,257
li a5,16842752
slli a0,a0,32
addi a5,a5,257
or a0,a5,a0
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (15 preceding siblings ...)
2023-05-19 18:13 ` vineetg at gcc dot gnu.org
@ 2023-08-15 17:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:38 ` vineetg at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-08-15 17:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #17 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #16)
> > Which is what this produces:
> > ```
> > long long f(void)
> > {
> > unsigned t = 16843009;
> > long long t1 = t;
> > long long t2 = ((unsigned long long )t) << 32;
> > asm("":"+r"(t1));
> > return t1 | t2;
> > }
> > ```
>
> li a0,16842752
> addi a0,a0,257
> li a5,16842752
> slli a0,a0,32
> addi a5,a5,257
> or a0,a5,a0
> ret
This is again IRA inflicted pain (similar to [PR110748]).
IRA seems to be undoing split1 since we have 2 insn sequences to synthesize the
constant pieces. This explains why the problem got exacerbated with commit
0530254413f8 ("riscv: relax splitter restrictions for creating pseudos") since
now different regs are used to create parts of const, vs 1 reg being repeatedly
used for assembling a const (fooling IRA's equivalent replacement logic).
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (16 preceding siblings ...)
2023-08-15 17:37 ` vineetg at gcc dot gnu.org
@ 2023-10-06 19:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:38 ` vineetg at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-10-06 19:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #18 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
(In reply to Vineet Gupta from comment #17)
> (In reply to Vineet Gupta from comment #16)
> > > Which is what this produces:
> > > ```
> > > long long f(void)
> > > {
> > > unsigned t = 16843009;
> > > long long t1 = t;
> > > long long t2 = ((unsigned long long )t) << 32;
> > > asm("":"+r"(t1));
> > > return t1 | t2;
> > > }
> > > ```
> > li a0,16842752
> > addi a0,a0,257
> > li a5,16842752
> > slli a0,a0,32
> > addi a5,a5,257
> > or a0,a5,a0
> > ret
>
> This is again IRA inflicted pain (similar to [PR110748]).
> IRA seems to be undoing split1 since we have 2 insn sequences to synthesize
> the constant pieces. This explains why the problem got exacerbated with
> commit 0530254413f8 ("riscv: relax splitter restrictions for creating
> pseudos") since now different regs are used to create parts of const, vs 1
> reg being repeatedly used for assembling a const (fooling IRA's equivalent
> replacement logic).
After commit
2023-08-18 a047513c9222 RISC-V: Enable pressure-aware scheduling by
default.
the test above has improved.
li a5,16842752
addi a5,a5,257
mv a0,a5
slli a5,a5,32
or a0,a0,a5
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/109279] RISC-V: complex constants synthesized should be improved
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
` (17 preceding siblings ...)
2023-10-06 19:37 ` vineetg at gcc dot gnu.org
@ 2023-10-06 19:38 ` vineetg at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: vineetg at gcc dot gnu.org @ 2023-10-06 19:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109279
--- Comment #19 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
FWIW with today's change, splitter is now hidden from IRA, but we are still
getting the extraneous mv.
2023-10-06 c1bc7513b1d7 RISC-V: const: hide mvconst splitter from IRA
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2023-10-06 19:38 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-24 22:04 [Bug target/109279] New: [13 Regression] RISC-V: complex constants synthesized vs. fetching from constant pool vineet.gupta at linux dot dev
2023-03-24 22:08 ` [Bug target/109279] " pinskia at gcc dot gnu.org
2023-03-24 22:08 ` pinskia at gcc dot gnu.org
2023-03-24 22:13 ` vineet.gupta at linux dot dev
2023-03-24 22:15 ` vineet.gupta at linux dot dev
2023-03-24 22:17 ` pinskia at gcc dot gnu.org
2023-03-24 22:29 ` [Bug target/109279] RISC-V: complex constants synthesized should be improved pinskia at gcc dot gnu.org
2023-03-24 22:33 ` pinskia at gcc dot gnu.org
2023-03-24 22:34 ` pinskia at gcc dot gnu.org
2023-03-24 23:33 ` pinskia at gcc dot gnu.org
2023-03-25 0:44 ` vineet.gupta at linux dot dev
2023-03-25 0:48 ` vineet.gupta at linux dot dev
2023-03-25 0:56 ` pinskia at gcc dot gnu.org
2023-03-25 1:12 ` vineet.gupta at linux dot dev
2023-03-30 18:40 ` vineet.gupta at linux dot dev
2023-05-19 17:31 ` vineetg at gcc dot gnu.org
2023-05-19 18:13 ` vineetg at gcc dot gnu.org
2023-08-15 17:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:37 ` vineetg at gcc dot gnu.org
2023-10-06 19:38 ` vineetg at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).