public inbox for gcc-bugs@sourceware.org
* [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct.
@ 2022-12-20 3:55 jiawei at iscas dot ac.cn
2022-12-20 3:58 ` [Bug target/108185] " jiawei at iscas dot ac.cn
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: jiawei at iscas dot ac.cn @ 2022-12-20 3:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
Bug ID: 108185
Summary: [RISC-V]RVV assemble not set vsetvli correct.
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jiawei at iscas dot ac.cn
Target Milestone: ---
Currently, when using GCC 13 to compile the following code with the RVV
extension (-march=rv64gcv -O3),
void foo5_3 (int32_t * restrict in, int32_t * restrict out, size_t n, int cond)
{
vint8m1_t v = *(vint8m1_t*)in;
*(vint8m1_t*)out = v;
vbool8_t v3 = *(vbool8_t*)in;
*(vbool8_t*)(out + 200) = v3;
}
it generates the following assembly:
vl1re8.v v25,0(a0)
sub a5,a3,a5
vs1r.v v25,0(a1)
vs1r.v v25,0(a5)
It seems vsetvli is not used correctly. Any suggestions?
* [Bug target/108185] [RISC-V]RVV assemble not set vsetvli correct.
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
@ 2022-12-20 3:58 ` jiawei at iscas dot ac.cn
2022-12-29 9:51 ` kito at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: jiawei at iscas dot ac.cn @ 2022-12-20 3:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
jiawei <jiawei at iscas dot ac.cn> changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|13.0 |fortran-dev
--- Comment #1 from jiawei <jiawei at iscas dot ac.cn> ---
vl1re8.v v25,0(a0)
sub a5,a3,a5
vs1r.v v25,0(a1)
vs1r.v v25,0(a5)
addi a4,a1,800
csrr t0,vlenb
slli t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v v25,0(a4)
* [Bug target/108185] [RISC-V]RVV assemble not set vsetvli correct.
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
2022-12-20 3:58 ` [Bug target/108185] " jiawei at iscas dot ac.cn
@ 2022-12-29 9:51 ` kito at gcc dot gnu.org
2023-01-03 1:54 ` jiawei at iscas dot ac.cn
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: kito at gcc dot gnu.org @ 2022-12-29 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
Kito Cheng <kito at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kito at gcc dot gnu.org
--- Comment #2 from Kito Cheng <kito at gcc dot gnu.org> ---
It seems right to me?
```
$ riscv64-unknown-elf-gcc pr108185.c -march=rv64gcv -mabi=lp64d -O3 -S -o -
.file "pr108185.c"
.option nopic
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_v1p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0"
.attribute unaligned_access, 0
.attribute stack_align, 16
.text
.align 1
.globl foo5_3
.type foo5_3, @function
foo5_3:
csrr t0,vlenb
slli t1,t0,1
csrr a5,vlenb
sub sp,sp,t1
slli a3,a5,1
add a3,a3,sp
vl1re8.v v25,0(a0) # Load value from *(vint8m1_t*)in
sub a5,a3,a5
vs1r.v v25,0(a1) # Store value to *(vint8m1_t*)out
vs1r.v v25,0(a5) # Store value to stack, although it's unused.
addi a4,a1,800
csrr t0,vlenb
slli t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma # Right vsetvli for vsm.v
vsm.v v25,0(a4)
add sp,sp,t1
jr ra
.size foo5_3, .-foo5_3
.ident "GCC: (g44b22ab81cf) 13.0.0 20221229 (experimental)"
```
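The vsetvli operands in the listing above follow from the mask type: vboolN_t corresponds to a SEW/LMUL ratio of N, and the compiler output in this thread pairs it with SEW=8. A minimal sketch of that mapping, assuming the helper names `vbool_lmul` and `vbool_vtype` (both hypothetical, not GCC functions) and extrapolating the m4, m2 and mf4 entries from the ratio rule, since only m8, m1, mf2 and mf8 appear in the thread's assembly:

```c
#include <stdio.h>

/* Hypothetical helper, not a GCC function: returns the LMUL operand
   that pairs with SEW=8 for vboolN_t (N = SEW / LMUL), or NULL when
   N is not one of the seven mask ratios.  */
static const char *vbool_lmul(int n)
{
    switch (n) {
    case 1:  return "m8";
    case 2:  return "m4";
    case 4:  return "m2";
    case 8:  return "m1";
    case 16: return "mf2";
    case 32: return "mf4";
    case 64: return "mf8";
    default: return NULL;
    }
}

/* Formats the vtype operands of a vsetvli for vboolN_t into buf.
   Returns 0 on success, -1 for an unknown N.  */
static int vbool_vtype(int n, char *buf, size_t len)
{
    const char *lmul = vbool_lmul(n);
    if (lmul == NULL)
        return -1;
    snprintf(buf, len, "e8,%s,ta,ma", lmul);
    return 0;
}
```

Calling `vbool_vtype(8, buf, sizeof buf)` fills `buf` with the operands seen before the vsm.v for vbool8_t. The whole-register vl1re8.v and vs1r.v used for vint8m1_t do not depend on vl or vtype, which is why no vsetvli precedes them.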
* [Bug target/108185] [RISC-V]RVV assemble not set vsetvli correct.
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
2022-12-20 3:58 ` [Bug target/108185] " jiawei at iscas dot ac.cn
2022-12-29 9:51 ` kito at gcc dot gnu.org
@ 2023-01-03 1:54 ` jiawei at iscas dot ac.cn
2023-01-03 2:32 ` [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store kito at gcc dot gnu.org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: jiawei at iscas dot ac.cn @ 2023-01-03 1:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
--- Comment #3 from jiawei <jiawei at iscas dot ac.cn> ---
(In reply to Kito Cheng from comment #2)
> It seems right to me?
Yes, it has the same behavior as clang, but it could generate better
assembly code, like:
vl1re8.v v24,0(a0)
addi a4,a1,800
vs1r.v v24,0(a1)
vsetvli a5,zero,e8,m1,ta,ma
vlm.v v24,0(a0)
vsm.v v24,0(a4)
ret
* [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
` (2 preceding siblings ...)
2023-01-03 1:54 ` jiawei at iscas dot ac.cn
@ 2023-01-03 2:32 ` kito at gcc dot gnu.org
2023-02-03 3:05 ` juzhe.zhong at rivai dot ai
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: kito at gcc dot gnu.org @ 2023-01-03 2:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
Kito Cheng <kito at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Last reconfirmed| |2023-01-03
--- Comment #4 from Kito Cheng <kito at gcc dot gnu.org> ---
So it's about code-gen quality rather than correctness; let me update the
title.
* [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
` (3 preceding siblings ...)
2023-01-03 2:32 ` [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store kito at gcc dot gnu.org
@ 2023-02-03 3:05 ` juzhe.zhong at rivai dot ai
2023-03-07 13:45 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2023-02-03 3:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
--- Comment #5 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
I revised the test case, and it exposes a bug:
void foo5_3 (int32_t * restrict in, int32_t * restrict out, size_t n, int cond)
{
vint8m1_t v = *(vint8m1_t*)in;
*(vint8m1_t*)out = v;
vbool8_t v3 = *(vbool8_t*)in;
*(vbool8_t*)(out + 200) = v3;
vbool16_t v4 = *(vbool16_t *)in;
*(vbool16_t *)(out + 300) = v4;
}
The second vlm.v, for vbool16_t, is missing, which is incorrect codegen.
I confirmed that vbool8/16/32/64 all have the same issue.
This can be observed with -fdump-tree-optimized:
they are all tied together and treated as the same in the GIMPLE IR.
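To see why the mask types cannot be treated as interchangeable, it helps to compare how much memory each access touches. Per the RVV 1.0 spec, vlm.v and vsm.v transfer ceil(vl/8) bytes, and with vl = VLMAX a vboolN_t holds VLEN/N mask bits. The sketch below assumes the helper names `vbool_vlmax` and `mask_mem_bytes` (hypothetical, for illustration only) and takes VLEN as a parameter:

```c
/* Mask bits in a vboolN_t when vl = VLMAX: VLEN / N.
   vlen_bits is the vector register width in bits.  */
static unsigned vbool_vlmax(unsigned vlen_bits, unsigned n)
{
    return vlen_bits / n;
}

/* Bytes moved by vlm.v / vsm.v for a given vl: ceil(vl / 8).  */
static unsigned mask_mem_bytes(unsigned vl)
{
    return (vl + 7) / 8;
}
```

With VLEN = 128, for example, vbool8_t gives vl = 16 and a 2-byte access, while vbool16_t gives vl = 8 and a 1-byte access, so a store issued under the vbool8_t configuration writes more bytes than the vbool16_t store in the source asks for.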
* [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
` (4 preceding siblings ...)
2023-02-03 3:05 ` juzhe.zhong at rivai dot ai
@ 2023-03-07 13:45 ` cvs-commit at gcc dot gnu.org
2023-03-08 1:11 ` kito at gcc dot gnu.org
2023-03-23 8:40 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-07 13:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kito Cheng <kito@gcc.gnu.org>:
https://gcc.gnu.org/g:247cacc9e381d666a492dfa4ed61b7b19e2d008f
commit r13-6524-g247cacc9e381d666a492dfa4ed61b7b19e2d008f
Author: Pan Li <pan2.li@intel.com>
Date: Tue Mar 7 20:05:15 2023 +0800
RISC-V: Bugfix for rvv bool mode precision adjustment
Fix the rvv bool mode precision bug by adjusting the precision.
The bit size of vbool*_t will be adjusted to
[1, 2, 4, 8, 16, 32, 64] according to the RVV 1.0 ISA spec. The
adjusted mode precision of vbool*_t will help the underlying passes
make the right decisions for both correctness and optimization.
Given the sample code below:
void test_1(int8_t * restrict in, int8_t * restrict out)
{
vbool8_t v2 = *(vbool8_t*)in;
vbool16_t v5 = *(vbool16_t*)in;
*(vbool16_t*)(out + 200) = v5;
*(vbool8_t*)(out + 100) = v2;
}
Before the precision adjustment:
addi a4,a1,100
vsetvli a5,zero,e8,m1,ta,ma
addi a1,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a4)
// Need one vsetvli and vlm.v for correctness here.
vsm.v v24,0(a1)
After the precision adjustment:
csrr t0,vlenb
slli t1,t0,1
csrr a3,vlenb
sub sp,sp,t1
slli a4,a3,1
add a4,a4,sp
sub a3,a4,a3
vsetvli a5,zero,e8,m1,ta,ma
addi a2,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a3)
addi a1,a1,100
vsetvli a4,zero,e8,mf2,ta,ma
csrr t0,vlenb
vlm.v v25,0(a3)
vsm.v v25,0(a2)
slli t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v v24,0(a1)
add sp,sp,t1
jr ra
However, there may be some optimization opportunities after
the mode precision adjustment. They can be taken care of in
the RISC-V backend in separate follow-up PR(s).
gcc/ChangeLog:
PR target/108185
PR target/108654
* config/riscv/riscv-modes.def (ADJUST_PRECISION): Adjust VNx*BI
modes.
* config/riscv/riscv.cc (riscv_v_adjust_precision): New.
* config/riscv/riscv.h (riscv_v_adjust_precision): New.
* genmodes.cc (adj_precision): New.
(ADJUST_PRECISION): New.
(emit_mode_adjustments): Handle ADJUST_PRECISION.
gcc/testsuite/ChangeLog:
PR target/108185
PR target/108654
* gcc.target/riscv/rvv/base/pr108185-1.c: New test.
* gcc.target/riscv/rvv/base/pr108185-2.c: New test.
* gcc.target/riscv/rvv/base/pr108185-3.c: New test.
* gcc.target/riscv/rvv/base/pr108185-4.c: New test.
* gcc.target/riscv/rvv/base/pr108185-5.c: New test.
* gcc.target/riscv/rvv/base/pr108185-6.c: New test.
* gcc.target/riscv/rvv/base/pr108185-7.c: New test.
* gcc.target/riscv/rvv/base/pr108185-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
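The precision list [1, 2, 4, 8, 16, 32, 64] in the commit message above can be reproduced with a small calculation. This is only a sketch under two assumptions the commit message does not state: that the list is ordered from vbool64_t to vbool1_t, and that it corresponds to a VLEN of 64 bits. The function name `vbool_precisions` is hypothetical, and GCC's riscv_v_adjust_precision is not reproduced here.

```c
#include <stddef.h>

/* The seven mask ratios, ordered from vbool64_t down to vbool1_t.  */
static const unsigned vbool_ratio[7] = {64, 32, 16, 8, 4, 2, 1};

/* Fills out[i] with the number of mask bits in vbool<ratio[i]>_t for
   the given VLEN (in bits).  Illustrative only.  */
static void vbool_precisions(unsigned vlen_bits, unsigned out[7])
{
    for (size_t i = 0; i < 7; i++)
        out[i] = vlen_bits / vbool_ratio[i];
}
```

Every entry in the result is distinct, which fits the commit's point that the adjusted precision lets later passes tell the mask modes apart.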
* [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
` (5 preceding siblings ...)
2023-03-07 13:45 ` cvs-commit at gcc dot gnu.org
@ 2023-03-08 1:11 ` kito at gcc dot gnu.org
2023-03-23 8:40 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: kito at gcc dot gnu.org @ 2023-03-08 1:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
Kito Cheng <kito at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #7 from Kito Cheng <kito at gcc dot gnu.org> ---
Resolved by Pan's patch :)
* [Bug target/108185] [RISC-V] Sub-optimal code-gen for vsetvli: redundant stack store
2022-12-20 3:55 [Bug target/108185] New: [RISC-V]RVV assemble not set vsetvli correct jiawei at iscas dot ac.cn
` (6 preceding siblings ...)
2023-03-08 1:11 ` kito at gcc dot gnu.org
@ 2023-03-23 8:40 ` cvs-commit at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-23 8:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108185
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kito Cheng <kito@gcc.gnu.org>:
https://gcc.gnu.org/g:3a982e07d28a46da81ee5b65b03a896d84b32a48
commit r13-6826-g3a982e07d28a46da81ee5b65b03a896d84b32a48
Author: Pan Li <pan2.li@intel.com>
Date: Wed Mar 8 15:33:33 2023 +0800
RISC-V: Bugfix for rvv bool mode size adjustment
Fix the rvv bool mode size bug by adjusting the size.
Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64])
of vbool*_t, the mode size (aka byte size) will be adjusted to
[1, 1, 1, 1, 2, 4, 8] according to the RVV 1.0 ISA spec. The
adjustment provides correct information for the underlying
redundant instruction elimination.
Given the sample code below:
{
vbool1_t v1 = *(vbool1_t*)in;
vbool64_t v2 = *(vbool64_t*)in;
*(vbool1_t*)(out + 100) = v1;
*(vbool64_t*)(out + 200) = v2;
}
Before the size adjustment:
csrr t0,vlenb
slli t1,t0,1
csrr a3,vlenb
sub sp,sp,t1
slli a4,a3,1
add a4,a4,sp
addi a2,a1,100
vsetvli a5,zero,e8,m8,ta,ma
sub a3,a4,a3
vlm.v v24,0(a0)
vsm.v v24,0(a2)
vsm.v v24,0(a3)
addi a1,a1,200
csrr t0,vlenb
vsetvli a4,zero,e8,mf8,ta,ma
slli t1,t0,1
vlm.v v24,0(a3)
vsm.v v24,0(a1)
add sp,sp,t1
jr ra
After the size adjustment:
addi a3,a1,100
vsetvli a4,zero,e8,m8,ta,ma
addi a1,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a3)
vsetvli a5,zero,e8,mf8,ta,ma
vlm.v v24,0(a0)
vsm.v v24,0(a1)
ret
Note that the size adjustment cannot cover all possible combinations
of vbool*_t code patterns like the one above. We will look into them
in further patches.
PR 108185
PR 108654
gcc/ChangeLog:
PR target/108654
PR target/108185
* config/riscv/riscv-modes.def (ADJUST_BYTESIZE): Adjust size
for vector mask modes.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize): New.
* config/riscv/riscv.h (riscv_v_adjust_bytesize): New.
gcc/testsuite/ChangeLog:
PR target/108654
PR target/108185
* gcc.target/riscv/rvv/base/pr108185-1.c: Update.
* gcc.target/riscv/rvv/base/pr108185-2.c: Ditto.
* gcc.target/riscv/rvv/base/pr108185-3.c: Ditto.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
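The byte-size list in the commit message above is consistent with rounding each bit precision up to whole bytes. A minimal sketch of that relationship, assuming the function name `vbool_bytesizes` (hypothetical; GCC's riscv_v_adjust_bytesize is not reproduced here):

```c
#include <stddef.h>

/* Converts mask precisions in bits to sizes in bytes, rounding each
   one up to a whole byte.  Illustrative only.  */
static void vbool_bytesizes(const unsigned bits[], unsigned bytes[],
                            size_t count)
{
    for (size_t i = 0; i < count; i++)
        bytes[i] = (bits[i] + 7) / 8;
}
```

Feeding in the precisions [1, 2, 4, 8, 16, 32, 64] yields [1, 1, 1, 1, 2, 4, 8]. The four smallest masks share a byte size of 1, so the byte size alone does not distinguish them.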