* [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor @ 2023-12-23 11:07 pan2.li 2023-12-23 17:19 ` Jeff Law ` (4 more replies) 0 siblings, 5 replies; 23+ messages in thread From: pan2.li @ 2023-12-23 11:07 UTC (permalink / raw) To: gcc-patches Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, richard.guenther, jeffreyalaw From: Pan Li <pan2.li@intel.com> This patch would like to XFAIL the test case pr30957-1.c for the RVV when build the elf with some configurations (list at the end of the log) It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as 1. Of course, it will do nothing during unroll_loops when loop->unroll is 1. After this patch the loops vectorized with a variable factor of the RVV will be treated as XFAIL by the tree dump. Aka the blow configuration will be treated as XFAIL and we still need further investigation for the failures of other configurations. * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with variable length. Signed-off-by: Pan Li <pan2.li@intel.com> --- gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c index 564410913ab..4c5c4cd4ad8 100644 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ b/gcc/testsuite/gcc.dg/pr30957-1.c @@ -3,7 +3,7 @@ where each addition is a library call. / /* { dg-require-effective-target hard_float } */ /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ +/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll -fdump-tree-vect-details" } */ extern void abort (void); extern void exit (int); @@ -34,3 +34,4 @@ main () } /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ +/* { dg-final { scan-tree-dump "Disabling unrolling due to variable-length vectorization factor" "vect" { xfail riscv*-*-* } } } */ -- 2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li @ 2023-12-23 17:19 ` Jeff Law 2023-12-24 2:01 ` Li, Pan2 2023-12-26 9:34 ` [PATCH v2] " pan2.li ` (3 subsequent siblings) 4 siblings, 1 reply; 23+ messages in thread From: Jeff Law @ 2023-12-23 17:19 UTC (permalink / raw) To: pan2.li, gcc-patches Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther On 12/23/23 04:07, pan2.li@intel.com wrote: > From: Pan Li <pan2.li@intel.com> > > This patch would like to XFAIL the test case pr30957-1.c for the RVV when > build the elf with some configurations (list at the end of the log) > It will be vectorized during vect_transform_loop with a variable factor. > It won't benefit from unrolling/peeling and mark the loop->unroll as 1. > Of course, it will do nothing during unroll_loops when loop->unroll is 1. > > After this patch the loops vectorized with a variable factor of the RVV > will be treated as XFAIL by the tree dump. > > Aka the blow configuration will be treated as XFAIL and we still need > further investigation for the failures of other configurations. > > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with > variable length. Isn't this going to XPASS for non-vector configurations? If I understand correctly, the test requires loop unrolling and its associated variable expansion to trigger the desired behavior. VLA style vectorization is inhibiting loop unrolling and thus we get the failure? So the natural question here is whether or not aarch64 SVE sees the same failure, if not, why? If so, then can we conditionalize this on an effective target test (check_effective_target_vect_variable_length perhaps?) Jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-23 17:19 ` Jeff Law @ 2023-12-24 2:01 ` Li, Pan2 0 siblings, 0 replies; 23+ messages in thread From: Li, Pan2 @ 2023-12-24 2:01 UTC (permalink / raw) To: Jeff Law, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther Thanks Jeff for comments. > Isn't this going to XPASS for non-vector configurations? Yes, I think we still need something like riscv_v here. > If I understand correctly, the test requires loop unrolling and its > associated variable expansion to trigger the desired behavior. VLA > style vectorization is inhibiting loop unrolling and thus we get the > failure? Yes, exactly. > So the natural question here is whether or not aarch64 SVE sees the same > failure, if not, why? If so, then can we conditionalize this on an > effective target test (check_effective_target_vect_variable_length perhaps?) Sure, will have a try for this. Pan -----Original Message----- From: Jeff Law <jeffreyalaw@gmail.com> Sent: Sunday, December 24, 2023 1:20 AM To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com Subject: Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor On 12/23/23 04:07, pan2.li@intel.com wrote: > From: Pan Li <pan2.li@intel.com> > > This patch would like to XFAIL the test case pr30957-1.c for the RVV when > build the elf with some configurations (list at the end of the log) > It will be vectorized during vect_transform_loop with a variable factor. > It won't benefit from unrolling/peeling and mark the loop->unroll as 1. > Of course, it will do nothing during unroll_loops when loop->unroll is 1. > > After this patch the loops vectorized with a variable factor of the RVV > will be treated as XFAIL by the tree dump. > > Aka the blow configuration will be treated as XFAIL and we still need > further investigation for the failures of other configurations. > > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2 > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax > * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with > variable length. Isn't this going to XPASS for non-vector configurations? If I understand correctly, the test requires loop unrolling and its associated variable expansion to trigger the desired behavior. VLA style vectorization is inhibiting loop unrolling and thus we get the failure? So the natural question here is whether or not aarch64 SVE sees the same failure, if not, why? If so, then can we conditionalize this on an effective target test (check_effective_target_vect_variable_length perhaps?) Jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li 2023-12-23 17:19 ` Jeff Law @ 2023-12-26 9:34 ` pan2.li 2023-12-28 16:39 ` Jeff Law 2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li ` (2 subsequent siblings) 4 siblings, 1 reply; 23+ messages in thread From: pan2.li @ 2023-12-26 9:34 UTC (permalink / raw) To: gcc-patches Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, richard.guenther, jeffreyalaw From: Pan Li <pan2.li@intel.com> This patch would like to XFAIL the test case pr30957-1.c for the RVV when build the elf with some configurations (list at the end of the log) It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as 1. Of course, it will do nothing during unroll_loops when loop->unroll is 1. The aarch64_sve may have the similar issue but it initialize the const `0.0 / -5.0` in the test file to `+0.0` before pass to the function foo. Then it will pass the execution test. aarch64: movi v0.2s, #0x0 stp x29, x30, [sp, #-16]! mov w0, #0xa mov x29, sp bl 400280 <foo> <== s0 is +0.0 Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the `-0.0`, and then pass it to the function foo. Of course it the execution test will fail. riscv: flw fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4> addi sp,sp,-16 li a0,10 sd ra,8(sp) jal 101fc <foo> <== fa0 is -0.0 After this patch the loops vectorized with a variable factor of the RVV will be treated as XFAIL by the tree dump when riscv_v and variable_vect_length. The below configurations are validated as XFAIL for RV64. * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with variable length. Signed-off-by: Pan Li <pan2.li@intel.com> --- gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c index 564410913ab..7a7242ec16d 100644 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ b/gcc/testsuite/gcc.dg/pr30957-1.c @@ -3,7 +3,7 @@ where each addition is a library call. / /* { dg-require-effective-target hard_float } */ /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ +/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll -fdump-tree-vect-details" } */ extern void abort (void); extern void exit (int); @@ -34,3 +34,4 @@ main () } /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ +/* { dg-final { scan-tree-dump-times "Disabling unrolling due to variable-length vectorization factor" 1 "vect" { target { riscv_v } xfail { vect_variable_length } } } } */ -- 2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-26 9:34 ` [PATCH v2] " pan2.li @ 2023-12-28 16:39 ` Jeff Law 2023-12-29 0:42 ` Li, Pan2 0 siblings, 1 reply; 23+ messages in thread From: Jeff Law @ 2023-12-28 16:39 UTC (permalink / raw) To: pan2.li, gcc-patches Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther On 12/26/23 02:34, pan2.li@intel.com wrote: > From: Pan Li <pan2.li@intel.com> > > This patch would like to XFAIL the test case pr30957-1.c for the RVV when > build the elf with some configurations (list at the end of the log) > It will be vectorized during vect_transform_loop with a variable factor. > It won't benefit from unrolling/peeling and mark the loop->unroll as 1. > Of course, it will do nothing during unroll_loops when loop->unroll is 1. > > The aarch64_sve may have the similar issue but it initialize the const > `0.0 / -5.0` in the test file to `+0.0` before pass to the function foo. > Then it will pass the execution test. > > aarch64: > movi v0.2s, #0x0 > stp x29, x30, [sp, #-16]! > mov w0, #0xa > mov x29, sp > bl 400280 <foo> <== s0 is +0.0 > > Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the > `-0.0`, and then pass it to the function foo. Of course it the execution > test will fail. > > riscv: > flw fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4> > addi sp,sp,-16 > li a0,10 > sd ra,8(sp) > jal 101fc <foo> <== fa0 is -0.0 > > After this patch the loops vectorized with a variable factor of the RVV > will be treated as XFAIL by the tree dump when riscv_v and > variable_vect_length. > > The below configurations are validated as XFAIL for RV64. Interesting. So I'd actually peel one more layer off this onion. Why do the aarch64 and riscv targets generate different constants (0.0 vs -0.0)? Is it possible that the aarch64 is generating 0.0 when asked for -0.0 and -fno-signed-zeros is in effect? That's a valid thing to do when -fno-signed-zeros is on. Look for HONOR_SIGNED_ZEROs in the aarch64 backend. Jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-28 16:39 ` Jeff Law @ 2023-12-29 0:42 ` Li, Pan2 2023-12-29 1:03 ` Jeff Law 0 siblings, 1 reply; 23+ messages in thread From: Li, Pan2 @ 2023-12-29 0:42 UTC (permalink / raw) To: Jeff Law, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther Thanks Jeff for comments, and Happy new year! > Interesting. So I'd actually peel one more layer off this onion. Why > do the aarch64 and riscv targets generate different constants (0.0 vs > -0.0)? Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize. > Is it possible that the aarch64 is generating 0.0 when asked for -0.0 > and -fno-signed-zeros is in effect? That's a valid thing to do when > -fno-signed-zeros is on. Look for HONOR_SIGNED_ZEROs in the aarch64 > backend. Sure, will have a try for making the -0.0 happen in aarch64. Pan -----Original Message----- From: Jeff Law <jeffreyalaw@gmail.com> Sent: Friday, December 29, 2023 12:39 AM To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor On 12/26/23 02:34, pan2.li@intel.com wrote: > From: Pan Li <pan2.li@intel.com> > > This patch would like to XFAIL the test case pr30957-1.c for the RVV when > build the elf with some configurations (list at the end of the log) > It will be vectorized during vect_transform_loop with a variable factor. > It won't benefit from unrolling/peeling and mark the loop->unroll as 1. > Of course, it will do nothing during unroll_loops when loop->unroll is 1. > > The aarch64_sve may have the similar issue but it initialize the const > `0.0 / -5.0` in the test file to `+0.0` before pass to the function foo. > Then it will pass the execution test. > > aarch64: > movi v0.2s, #0x0 > stp x29, x30, [sp, #-16]! > mov w0, #0xa > mov x29, sp > bl 400280 <foo> <== s0 is +0.0 > > Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the > `-0.0`, and then pass it to the function foo. Of course it the execution > test will fail. > > riscv: > flw fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4> > addi sp,sp,-16 > li a0,10 > sd ra,8(sp) > jal 101fc <foo> <== fa0 is -0.0 > > After this patch the loops vectorized with a variable factor of the RVV > will be treated as XFAIL by the tree dump when riscv_v and > variable_vect_length. > > The below configurations are validated as XFAIL for RV64. Interesting. So I'd actually peel one more layer off this onion. Why do the aarch64 and riscv targets generate different constants (0.0 vs -0.0)? Is it possible that the aarch64 is generating 0.0 when asked for -0.0 and -fno-signed-zeros is in effect? That's a valid thing to do when -fno-signed-zeros is on. Look for HONOR_SIGNED_ZEROs in the aarch64 backend. Jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-29 0:42 ` Li, Pan2 @ 2023-12-29 1:03 ` Jeff Law 2023-12-29 5:56 ` Li, Pan2 0 siblings, 1 reply; 23+ messages in thread From: Jeff Law @ 2023-12-29 1:03 UTC (permalink / raw) To: Li, Pan2, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther On 12/28/23 17:42, Li, Pan2 wrote: > Thanks Jeff for comments, and Happy new year! > >> Interesting. So I'd actually peel one more layer off this onion. Why >> do the aarch64 and riscv targets generate different constants (0.0 vs >> -0.0)? > > Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize. > >> Is it possible that the aarch64 is generating 0.0 when asked for -0.0 >> and -fno-signed-zeros is in effect? That's a valid thing to do when >> -fno-signed-zeros is on. Look for HONOR_SIGNED_ZEROs in the aarch64 >> backend. > > Sure, will have a try for making the -0.0 happen in aarch64. I would first look at the .optimized dump, then I'd look at the .final dump alongside the resulting assembly for aarch64. I bet we're going to find that the aarch64 target internally converts -0.0 to 0.0 when we're not honoring signed zeros. jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-29 1:03 ` Jeff Law @ 2023-12-29 5:56 ` Li, Pan2 2023-12-30 3:13 ` Jeff Law 0 siblings, 1 reply; 23+ messages in thread From: Li, Pan2 @ 2023-12-29 5:56 UTC (permalink / raw) To: Jeff Law, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther Thanks Jeff. I think I locate where aarch64 performs the trick here. 1. In the .final we have rtl like (insn:TI 6 8 29 (set (reg:SF 32 v0) (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64} (nil)) 2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if the aarch64_reg_or_fp_zero is true. 1640 (define_insn "*mov<mode>_aarch64" 1641 [(set (match_operand:SFD 0 "nonimmediate_operand") 1642 match_operand:SFD 1 "general_operand"))] 1643 "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode) 1644 || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))" 1645 {@ [ cons: =0 , 1 ; attrs: type , arch ] 1646 [ w , Y ; neon_move , simd ] movi\t%0.2s, #0 3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given. 10863 bool 10864 aarch64_float_const_zero_rtx_p (rtx x 10865 { 10866 /* 0.0 in Decimal Floating Point cannot be represented by #0 or 10867 zr as our callers expect, so no need to check the actual 10868 value if X is of Decimal Floating Point type. */ 10869 if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT) 10870 return false; 10871 10872 if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x))) 10873 return !HONOR_SIGNED_ZEROS (GET_MODE (x)); 10874 return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0); 10875 } I think that explain why we have +0.0 in aarch64 here. Pan -----Original Message----- From: Jeff Law <jeffreyalaw@gmail.com> Sent: Friday, December 29, 2023 9:04 AM To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor On 12/28/23 17:42, Li, Pan2 wrote: > Thanks Jeff for comments, and Happy new year! > >> Interesting. So I'd actually peel one more layer off this onion. Why >> do the aarch64 and riscv targets generate different constants (0.0 vs >> -0.0)? > > Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize. > >> Is it possible that the aarch64 is generating 0.0 when asked for -0.0 >> and -fno-signed-zeros is in effect? That's a valid thing to do when >> -fno-signed-zeros is on. Look for HONOR_SIGNED_ZEROs in the aarch64 >> backend. > > Sure, will have a try for making the -0.0 happen in aarch64. I would first look at the .optimized dump, then I'd look at the .final dump alongside the resulting assembly for aarch64. I bet we're going to find that the aarch64 target internally converts -0.0 to 0.0 when we're not honoring signed zeros. jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-29 5:56 ` Li, Pan2 @ 2023-12-30 3:13 ` Jeff Law 2024-01-01 8:56 ` Li, Pan2 0 siblings, 1 reply; 23+ messages in thread From: Jeff Law @ 2023-12-30 3:13 UTC (permalink / raw) To: Li, Pan2, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther On 12/28/23 22:56, Li, Pan2 wrote: > Thanks Jeff. > > I think I locate where aarch64 performs the trick here. > > 1. In the .final we have rtl like > > (insn:TI 6 8 29 (set (reg:SF 32 v0) > (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64} > (nil)) > > 2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if > the aarch64_reg_or_fp_zero is true. > > 1640 (define_insn "*mov<mode>_aarch64" > 1641 [(set (match_operand:SFD 0 "nonimmediate_operand") > 1642 match_operand:SFD 1 "general_operand"))] > 1643 "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode) > 1644 || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))" > 1645 {@ [ cons: =0 , 1 ; attrs: type , arch ] > 1646 [ w , Y ; neon_move , simd ] movi\t%0.2s, #0 > > 3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given. > > 10863 bool > 10864 aarch64_float_const_zero_rtx_p (rtx x > 10865 { > 10866 /* 0.0 in Decimal Floating Point cannot be represented by #0 or > 10867 zr as our callers expect, so no need to check the actual > 10868 value if X is of Decimal Floating Point type. */ > 10869 if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT) > 10870 return false; > 10871 > 10872 if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x))) > 10873 return !HONOR_SIGNED_ZEROS (GET_MODE (x)); > 10874 return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0); > 10875 } > > I think that explain why we have +0.0 in aarch64 here. Yup. Thanks a ton for diving into this. So I think that points us to the right fix, specifically we should be turning -0.0 into 0.0 when !HONOR_SIGNED_ZEROS rather than xfailing the test. I think we'd need to adjust reg_or_0_operand and riscv_output_move, probably the G constraint as well. We might also need to adjust move_operand and perhaps riscv_legitimize_move. jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor 2023-12-30 3:13 ` Jeff Law @ 2024-01-01 8:56 ` Li, Pan2 0 siblings, 0 replies; 23+ messages in thread From: Li, Pan2 @ 2024-01-01 8:56 UTC (permalink / raw) To: Jeff Law, gcc-patches Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther Thanks Jeff for the confirmation and suggestions. It looks like not a corner case for the option no-signed-zero. Consider 2 sample function as below with build with option " -march=rv64gcv -mabi=lp64d -O2 -fno-signed-zeros". void __attribute__ ((noinline)) test_float_zero_assign_0 (float *a) { *a = +0.0; } void __attribute__ ((noinline)) test_float_zero_assign_1 (float *a) { *a = -0.0; } For the first one (aka float 0.0) we have rtl as below: (insn 6 3 0 2 (set (mem:SF (reg/v/f:DI 134 [ a ]) [1 *a_2(D)+0 S4 A32]) (const_double:SF 0.0 [0x0.0p+0])) "test.c":14:6 -1 (nil)) But for the second one (aka float -0.0 with no-signed-zero) we have rtl as below but we expect const_double -0.0 here. (insn 6 3 7 2 (set (reg:DI 135 (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "test.c":21:6 -1 (nil)) (insn 7 6 8 2 (set (reg:SF 136) (mem/u/c:SF (lo_sum:DI (reg:DI 135) (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0 S4 A32])) "test.c":21:6 -1 (nil)) I will have a try to fix it in V3. Pan -----Original Message----- From: Jeff Law <jeffreyalaw@gmail.com> Sent: Saturday, December 30, 2023 11:14 AM To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor On 12/28/23 22:56, Li, Pan2 wrote: > Thanks Jeff. > > I think I locate where aarch64 performs the trick here. > > 1. In the .final we have rtl like > > (insn:TI 6 8 29 (set (reg:SF 32 v0) > (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64} > (nil)) > > 2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if > the aarch64_reg_or_fp_zero is true. > > 1640 (define_insn "*mov<mode>_aarch64" > 1641 [(set (match_operand:SFD 0 "nonimmediate_operand") > 1642 match_operand:SFD 1 "general_operand"))] > 1643 "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode) > 1644 || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))" > 1645 {@ [ cons: =0 , 1 ; attrs: type , arch ] > 1646 [ w , Y ; neon_move , simd ] movi\t%0.2s, #0 > > 3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given. > > 10863 bool > 10864 aarch64_float_const_zero_rtx_p (rtx x > 10865 { > 10866 /* 0.0 in Decimal Floating Point cannot be represented by #0 or > 10867 zr as our callers expect, so no need to check the actual > 10868 value if X is of Decimal Floating Point type. */ > 10869 if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT) > 10870 return false; > 10871 > 10872 if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x))) > 10873 return !HONOR_SIGNED_ZEROS (GET_MODE (x)); > 10874 return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0); > 10875 } > > I think that explain why we have +0.0 in aarch64 here. Yup. Thanks a ton for diving into this. So I think that points us to the right fix, specifically we should be turning -0.0 into 0.0 when !HONOR_SIGNED_ZEROS rather than xfailing the test. I think we'd need to adjust reg_or_0_operand and riscv_output_move, probably the G constraint as well. We might also need to adjust move_operand and perhaps riscv_legitimize_move. jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li 2023-12-23 17:19 ` Jeff Law 2023-12-26 9:34 ` [PATCH v2] " pan2.li @ 2024-01-02 11:55 ` pan2.li 2024-01-08 10:45 ` Richard Biener 2024-01-11 1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li 2024-01-11 8:50 ` [PATCH v5] " pan2.li 4 siblings, 1 reply; 23+ messages in thread From: pan2.li @ 2024-01-02 11:55 UTC (permalink / raw) To: gcc-patches Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, richard.guenther, jeffreyalaw From: Pan Li <pan2.li@intel.com> According to the sematics of no-signed-zeros option, the backend like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. Consider below example with option -fno-signed-zeros. void test (float *a) { *a = -0.0; } We will generate code as below, which doesn't treat the minus zero as plus zero. test: lui a5,%hi(.LC0) flw fa5,%lo(.LC0)(a5) fsw fa5,0(a0) ret .LC0: .word -2147483648 // aka -0.0 (0x80000000 in hex) This patch would like to fix the bug and treat the minus zero -0.0 as plus zero, aka +0.0. Thus after this patch we will have asm code as below for the above sampe code. test: sw zero,0(a0) ret This patch also fix the run failure of the test case pr30957-1.c. The below tests are passed for this patch. * The riscv regression tests. * The pr30957-1.c run tests. gcc/ChangeLog: * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p for predicating the rtx is const zero float or not. * config/riscv/predicates.md: Ditto. * config/riscv/riscv.cc (riscv_const_insns): Ditto. (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is const zero float or not. (riscv_const_zero_rtx_p): New func impl for predicating the rtx is const zero (both int and fp) or not. * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p): New func decl. (riscv_const_zero_rtx_p): Ditto. * config/riscv/riscv.md: Making sure the operand[1] of movfp is CONST0_RTX when the operand[1] is const zero float. gcc/testsuite/ChangeLog: * gcc.target/riscv/no-signed-zeros-0.c: New test. * gcc.target/riscv/no-signed-zeros-1.c: New test. * gcc.target/riscv/no-signed-zeros-2.c: New test. * gcc.target/riscv/no-signed-zeros-3.c: New test. * gcc.target/riscv/no-signed-zeros-4.c: New test. * gcc.target/riscv/no-signed-zeros-5.c: New test. * gcc.target/riscv/no-signed-zeros-run-0.c: New test. * gcc.target/riscv/no-signed-zeros-run-1.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com> --- gcc/config/riscv/constraints.md | 2 +- gcc/config/riscv/predicates.md | 2 +- gcc/config/riscv/riscv-protos.h | 2 + gcc/config/riscv/riscv.cc | 35 ++++++++++++- gcc/config/riscv/riscv.md | 49 ++++++++++++++++--- .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++++++++++ .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++++++++++ .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++++++++++ .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++++++++++ .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++++++++++ .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++++++++++ .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++++++++++++++ .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++++++++++++++ 13 files changed, 314 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md index de4359af00d..db1d5e1385f 100644 --- a/gcc/config/riscv/constraints.md +++ b/gcc/config/riscv/constraints.md @@ -108,7 +108,7 @@ (define_constraint "DnS" (define_constraint "G" "@internal" (and (match_code "const_double") - (match_test "op == CONST0_RTX (mode)"))) + (match_test "riscv_float_const_zero_rtx_p (op)"))) (define_memory_constraint "A" "An address that is held in a general-purpose register." diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index b87a6900841..b428d842101 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -78,7 +78,7 @@ (define_predicate "sleu_operand" (define_predicate "const_0_operand" (and (match_code "const_int,const_wide_int,const_double,const_vector") - (match_test "op == CONST0_RTX (GET_MODE (op))"))) + (match_test "riscv_const_zero_rtx_p (op)"))) (define_predicate "const_1_operand" (and (match_code "const_int,const_wide_int,const_vector") diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 31049ef7523..fcf30e084a3 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); extern bool riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); +extern bool riscv_float_const_zero_rtx_p (rtx); +extern bool riscv_const_zero_rtx_p (rtx); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 0d1cbc5cb5f..a8ad86b7068 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x) return 1; /* We can use x0 to load floating-point zero. */ - return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; + return riscv_float_const_zero_rtx_p (x) ? 1 : 0; case CONST_VECTOR: { /* TODO: This is not accurate, we will need to @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num) || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP; } +/* Return true if rtx op is const zero of the floating-point. */ +bool +riscv_float_const_zero_rtx_p (rtx op) +{ + machine_mode mode = GET_MODE (op); + + if (GET_MODE_CLASS (mode) != MODE_FLOAT) + return false; + + if (GET_CODE (op) != CONST_DOUBLE) + return false; + + const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op); + + if (REAL_VALUE_MINUS_ZERO (*op_rvalue)) + return !HONOR_SIGNED_ZEROS (mode); + + return real_equal (op_rvalue, &dconst0); +} + +/* Return true if rtx op is const zero, include both the integer + and floating-point. */ +bool +riscv_const_zero_rtx_p (rtx op) +{ + machine_mode mode = GET_MODE (op); + + if (GET_MODE_CLASS (mode) == MODE_FLOAT) + return riscv_float_const_zero_rtx_p (op); + + return op == CONST0_RTX (mode); +} + /* Return true if it's valid gpr_save pattern. */ bool diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 68f7203b676..cd429f6dcdd 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat" "TARGET_ZFHMIN && (register_operand (operands[0], HFmode) || reg_or_0_operand (operands[1], HFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") (set_attr "type" "fmove") (set_attr "mode" "HF")]) @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat" "!TARGET_ZFHMIN && (register_operand (operands[0], HFmode) || reg_or_0_operand (operands[1], HFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "fmove,move,load,store,mtc,mfc") (set_attr "type" "fmove") (set_attr "mode" "HF")]) @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat" "TARGET_HARD_FLOAT && (register_operand (operands[0], SFmode) || reg_or_0_operand (operands[1], SFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") (set_attr "type" "fmove") (set_attr "mode" "SF")]) @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat" "!TARGET_HARD_FLOAT && (register_operand (operands[0], SFmode) || reg_or_0_operand (operands[1], SFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "move,load,store") (set_attr "type" "fmove") (set_attr "mode" "SF")]) @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32" "!TARGET_64BIT && TARGET_DOUBLE_FLOAT && (register_operand (operands[0], DFmode) || reg_or_0_operand (operands[1], DFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") (set_attr "type" "fmove") (set_attr "mode" "DF")]) @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64" "TARGET_64BIT && TARGET_DOUBLE_FLOAT && (register_operand (operands[0], DFmode) || reg_or_0_operand (operands[1], DFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") (set_attr "type" "fmove") (set_attr "mode" "DF")]) @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat" "!TARGET_DOUBLE_FLOAT && (register_operand (operands[0], DFmode) || reg_or_0_operand (operands[1], DFmode))" - { return riscv_output_move (operands[0], operands[1]); } + { + if (riscv_float_const_zero_rtx_p (operands[1])) + operands[1] = CONST0_RTX (GET_MODE (operands[1])); + + return riscv_output_move (operands[0], operands[1]); + } [(set_attr "move_type" "move,load,store") (set_attr "type" "fmove") (set_attr "mode" "DF")]) diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c new file mode 100644 index 00000000000..1eda13a3406 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** test_float_plus_zero_assign: +** sw\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_plus_zero_assign (float *a) +{ + *a = +0.0; +} + +/* +** test_float_plus_zero_assign: +** sw\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_minus_zero_assign (float *a) +{ + *a = -0.0; +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c new file mode 100644 index 00000000000..8041ec3ea95 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +extern void float_assign (float *a, float b); + +/* +** test_float_plus_zero_assign: +** fmv\.s\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_plus_zero_assign (float *a) +{ + float_assign (a, +0.0); +} + +/* +** test_float_minus_zero_assign: +** fmv\.s\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_minus_zero_assign (float *a) +{ + float_assign (a, -0.0); +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c new file mode 100644 index 00000000000..a6996dae4de --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** test_float_plus_zero_assign: +** sd\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_plus_zero_assign (double *a) +{ + *a = +0.0; +} + +/* +** test_float_plus_zero_assign: +** sd\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_minus_zero_assign (double *a) +{ + *a = -0.0; +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c new file mode 100644 index 00000000000..b4ba8a247df --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +extern void float_assign (double *a, double b); + +/* +** test_float_plus_zero_assign: +** fmv\.d\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_plus_zero_assign (double *a) +{ + float_assign (a, +0.0); +} + +/* +** test_float_minus_zero_assign: +** fmv\.d\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_minus_zero_assign (double *a) +{ + float_assign (a, -0.0); +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c new file mode 100644 index 00000000000..60acf7155d3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +/* +** test_float_plus_zero_assign: +** sh\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_plus_zero_assign (_Float16 *a) +{ + *a = +0.0; +} + +/* +** test_float_plus_zero_assign: +** sh\s+zero,0\([atx][0-9]+\) +** ret +*/ +void +test_float_minus_zero_assign (_Float16 *a) +{ + *a = -0.0; +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c new file mode 100644 index 00000000000..d10efbeb37b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +extern void float_assign (_Float16 *a, _Float16 b); + +/* +** test_float_plus_zero_assign: +** fmv\.h\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_plus_zero_assign (_Float16 *a) +{ + float_assign (a, +0.0); +} + +/* +** test_float_minus_zero_assign: +** fmv\.h\.x\s+fa[0-9]+,\s*zero +** tail\s+float_assign +*/ +void +test_float_minus_zero_assign (_Float16 *a) +{ + float_assign (a, -0.0); +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c new file mode 100644 index 00000000000..347e4b0ff74 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c @@ -0,0 +1,36 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ + +float f_val; + +void +__attribute__ ((noinline)) +test_float_plus_zero_assign (float *a) +{ + *a = +0.0; +} + +void +__attribute__ ((noinline)) +test_float_minus_zero_assign (float *a) +{ + *a = -0.0; +} + +int +main () +{ + f_val = -1.0; + test_float_plus_zero_assign (&f_val); + + if (__builtin_copysignf (1.0, f_val) != 1.0) + __builtin_abort (); + + f_val = -1.0; + test_float_minus_zero_assign (&f_val); + + if (__builtin_copysignf (1.0, f_val) != 1.0) + __builtin_abort (); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c new file mode 100644 index 00000000000..1eb1edba457 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c @@ -0,0 +1,36 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ + +double f_val; + +void +__attribute__ ((noinline)) +test_float_plus_zero_assign (double *a) +{ + *a = +0.0; +} + +void +__attribute__ ((noinline)) +test_float_minus_zero_assign (double *a) +{ + *a = -0.0; +} + +int +main () +{ + f_val = -1.0; + test_float_plus_zero_assign (&f_val); + + if (__builtin_copysignf (1.0, f_val) != 1.0) + __builtin_abort (); + + f_val = -1.0; + test_float_minus_zero_assign (&f_val); + + if (__builtin_copysignf (1.0, f_val) != 1.0) + __builtin_abort (); + + return 0; +} -- 2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li @ 2024-01-08 10:45 ` Richard Biener 2024-01-09 1:22 ` Li, Pan2 2024-01-09 17:46 ` Jeff Law 0 siblings, 2 replies; 23+ messages in thread From: Richard Biener @ 2024-01-08 10:45 UTC (permalink / raw) To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > According to the sematics of no-signed-zeros option, the backend > like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. > > Consider below example with option -fno-signed-zeros. > > void > test (float *a) > { > *a = -0.0; > } > > We will generate code as below, which doesn't treat the minus zero > as plus zero. > > test: > lui a5,%hi(.LC0) > flw fa5,%lo(.LC0)(a5) > fsw fa5,0(a0) > ret > > .LC0: > .word -2147483648 // aka -0.0 (0x80000000 in hex) > > This patch would like to fix the bug and treat the minus zero -0.0 > as plus zero, aka +0.0. Thus after this patch we will have asm code > as below for the above sampe code. > > test: > sw zero,0(a0) > ret > > This patch also fix the run failure of the test case pr30957-1.c. The > below tests are passed for this patch. We don't really expect targets to do this. The small testcase above is somewhat ill-formed with -fno-signed-zeros. Note there's no -0.0 in pr30957-1.c so why does that one fail for you? Does the -fvariable-expansion-in-unroller code maybe not trigger for riscv? I think we should go to PR30957 and see what that was filed originally for, the testcase doesn't make much sense to me. > * The riscv regression tests. > * The pr30957-1.c run tests. > > gcc/ChangeLog: > > * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p > for predicating the rtx is const zero float or not. > * config/riscv/predicates.md: Ditto. > * config/riscv/riscv.cc (riscv_const_insns): Ditto. > (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is > const zero float or not. > (riscv_const_zero_rtx_p): New func impl for predicating the rtx > is const zero (both int and fp) or not. > * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p): > New func decl. > (riscv_const_zero_rtx_p): Ditto. > * config/riscv/riscv.md: Making sure the operand[1] of movfp is > CONST0_RTX when the operand[1] is const zero float. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/no-signed-zeros-0.c: New test. > * gcc.target/riscv/no-signed-zeros-1.c: New test. > * gcc.target/riscv/no-signed-zeros-2.c: New test. > * gcc.target/riscv/no-signed-zeros-3.c: New test. > * gcc.target/riscv/no-signed-zeros-4.c: New test. > * gcc.target/riscv/no-signed-zeros-5.c: New test. > * gcc.target/riscv/no-signed-zeros-run-0.c: New test. > * gcc.target/riscv/no-signed-zeros-run-1.c: New test. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/config/riscv/constraints.md | 2 +- > gcc/config/riscv/predicates.md | 2 +- > gcc/config/riscv/riscv-protos.h | 2 + > gcc/config/riscv/riscv.cc | 35 ++++++++++++- > gcc/config/riscv/riscv.md | 49 ++++++++++++++++--- > .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++++++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++++++++++++++ > 13 files changed, 314 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md > index de4359af00d..db1d5e1385f 100644 > --- a/gcc/config/riscv/constraints.md > +++ b/gcc/config/riscv/constraints.md > @@ -108,7 +108,7 @@ (define_constraint "DnS" > (define_constraint "G" > "@internal" > (and (match_code "const_double") > - (match_test "op == CONST0_RTX (mode)"))) > + (match_test "riscv_float_const_zero_rtx_p (op)"))) > > (define_memory_constraint "A" > "An address that is held in a general-purpose register." > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > index b87a6900841..b428d842101 100644 > --- a/gcc/config/riscv/predicates.md > +++ b/gcc/config/riscv/predicates.md > @@ -78,7 +78,7 @@ (define_predicate "sleu_operand" > > (define_predicate "const_0_operand" > (and (match_code "const_int,const_wide_int,const_double,const_vector") > - (match_test "op == CONST0_RTX (GET_MODE (op))"))) > + (match_test "riscv_const_zero_rtx_p (op)"))) > > (define_predicate "const_1_operand" > (and (match_code "const_int,const_wide_int,const_vector") > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h > index 31049ef7523..fcf30e084a3 100644 > --- a/gcc/config/riscv/riscv-protos.h > +++ b/gcc/config/riscv/riscv-protos.h > @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); > extern bool > riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); > extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); > +extern bool riscv_float_const_zero_rtx_p (rtx); > +extern bool riscv_const_zero_rtx_p (rtx); > > #ifdef RTX_CODE > extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 0d1cbc5cb5f..a8ad86b7068 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x) > return 1; > > /* We can use x0 to load floating-point zero. */ > - return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; > + return riscv_float_const_zero_rtx_p (x) ? 1 : 0; > case CONST_VECTOR: > { > /* TODO: This is not accurate, we will need to > @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num) > || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP; > } > > +/* Return true if rtx op is const zero of the floating-point. */ > +bool > +riscv_float_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) != MODE_FLOAT) > + return false; > + > + if (GET_CODE (op) != CONST_DOUBLE) > + return false; > + > + const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op); > + > + if (REAL_VALUE_MINUS_ZERO (*op_rvalue)) > + return !HONOR_SIGNED_ZEROS (mode); > + > + return real_equal (op_rvalue, &dconst0); > +} > + > +/* Return true if rtx op is const zero, include both the integer > + and floating-point. */ > +bool > +riscv_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) == MODE_FLOAT) > + return riscv_float_const_zero_rtx_p (op); > + > + return op == CONST0_RTX (mode); > +} > + > /* Return true if it's valid gpr_save pattern. */ > > bool > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index 68f7203b676..cd429f6dcdd 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat" > "TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat" > "!TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,move,load,store,mtc,mfc") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat" > "TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat" > "!TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32" > "!TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64" > "TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat" > "!TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > new file mode 100644 > index 00000000000..1eda13a3406 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > new file mode 100644 > index 00000000000..8041ec3ea95 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (float *a, float b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > new file mode 100644 > index 00000000000..a6996dae4de > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > new file mode 100644 > index 00000000000..b4ba8a247df > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (double *a, double b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > new file mode 100644 > index 00000000000..60acf7155d3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > new file mode 100644 > index 00000000000..d10efbeb37b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (_Float16 *a, _Float16 b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > new file mode 100644 > index 00000000000..347e4b0ff74 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +float f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > new file mode 100644 > index 00000000000..1eb1edba457 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +double f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-08 10:45 ` Richard Biener @ 2024-01-09 1:22 ` Li, Pan2 2024-01-09 7:17 ` Li, Pan2 2024-01-09 17:46 ` Jeff Law 1 sibling, 1 reply; 23+ messages in thread From: Li, Pan2 @ 2024-01-09 1:22 UTC (permalink / raw) To: Richard Biener Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw Thanks Richard B for comments. > We don't really expect targets to do this. The small testcase above > is somewhat ill-formed with -fno-signed-zeros. Note there's no > -0.0 in pr30957-1.c so why does that one fail for you? Does > the -fvariable-expansion-in-unroller code maybe not trigger for > riscv? Sorry this confused me a little about the sematics of the option -fno-signed-zeros. I wonder what the target/backend need to do for this option. About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64. if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) abort (); If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below: Disabling unrolling due to variable-length vectorization factor. > I think we should go to PR30957 and see what that was filed originally > for, the testcase doesn't make much sense to me. Sure thing, will take a look and back to you later. Pan -----Original Message----- From: Richard Biener <richard.guenther@gmail.com> Sent: Monday, January 8, 2024 6:45 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > According to the sematics of no-signed-zeros option, the backend > like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. > > Consider below example with option -fno-signed-zeros. > > void > test (float *a) > { > *a = -0.0; > } > > We will generate code as below, which doesn't treat the minus zero > as plus zero. > > test: > lui a5,%hi(.LC0) > flw fa5,%lo(.LC0)(a5) > fsw fa5,0(a0) > ret > > .LC0: > .word -2147483648 // aka -0.0 (0x80000000 in hex) > > This patch would like to fix the bug and treat the minus zero -0.0 > as plus zero, aka +0.0. Thus after this patch we will have asm code > as below for the above sampe code. > > test: > sw zero,0(a0) > ret > > This patch also fix the run failure of the test case pr30957-1.c. The > below tests are passed for this patch. We don't really expect targets to do this. The small testcase above is somewhat ill-formed with -fno-signed-zeros. Note there's no -0.0 in pr30957-1.c so why does that one fail for you? Does the -fvariable-expansion-in-unroller code maybe not trigger for riscv? I think we should go to PR30957 and see what that was filed originally for, the testcase doesn't make much sense to me. > * The riscv regression tests. > * The pr30957-1.c run tests. > > gcc/ChangeLog: > > * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p > for predicating the rtx is const zero float or not. > * config/riscv/predicates.md: Ditto. > * config/riscv/riscv.cc (riscv_const_insns): Ditto. > (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is > const zero float or not. > (riscv_const_zero_rtx_p): New func impl for predicating the rtx > is const zero (both int and fp) or not. > * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p): > New func decl. > (riscv_const_zero_rtx_p): Ditto. > * config/riscv/riscv.md: Making sure the operand[1] of movfp is > CONST0_RTX when the operand[1] is const zero float. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/no-signed-zeros-0.c: New test. > * gcc.target/riscv/no-signed-zeros-1.c: New test. > * gcc.target/riscv/no-signed-zeros-2.c: New test. > * gcc.target/riscv/no-signed-zeros-3.c: New test. > * gcc.target/riscv/no-signed-zeros-4.c: New test. > * gcc.target/riscv/no-signed-zeros-5.c: New test. > * gcc.target/riscv/no-signed-zeros-run-0.c: New test. > * gcc.target/riscv/no-signed-zeros-run-1.c: New test. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/config/riscv/constraints.md | 2 +- > gcc/config/riscv/predicates.md | 2 +- > gcc/config/riscv/riscv-protos.h | 2 + > gcc/config/riscv/riscv.cc | 35 ++++++++++++- > gcc/config/riscv/riscv.md | 49 ++++++++++++++++--- > .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++++++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++++++++++++++ > 13 files changed, 314 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md > index de4359af00d..db1d5e1385f 100644 > --- a/gcc/config/riscv/constraints.md > +++ b/gcc/config/riscv/constraints.md > @@ -108,7 +108,7 @@ (define_constraint "DnS" > (define_constraint "G" > "@internal" > (and (match_code "const_double") > - (match_test "op == CONST0_RTX (mode)"))) > + (match_test "riscv_float_const_zero_rtx_p (op)"))) > > (define_memory_constraint "A" > "An address that is held in a general-purpose register." > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > index b87a6900841..b428d842101 100644 > --- a/gcc/config/riscv/predicates.md > +++ b/gcc/config/riscv/predicates.md > @@ -78,7 +78,7 @@ (define_predicate "sleu_operand" > > (define_predicate "const_0_operand" > (and (match_code "const_int,const_wide_int,const_double,const_vector") > - (match_test "op == CONST0_RTX (GET_MODE (op))"))) > + (match_test "riscv_const_zero_rtx_p (op)"))) > > (define_predicate "const_1_operand" > (and (match_code "const_int,const_wide_int,const_vector") > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h > index 31049ef7523..fcf30e084a3 100644 > --- a/gcc/config/riscv/riscv-protos.h > +++ b/gcc/config/riscv/riscv-protos.h > @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); > extern bool > riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); > extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); > +extern bool riscv_float_const_zero_rtx_p (rtx); > +extern bool riscv_const_zero_rtx_p (rtx); > > #ifdef RTX_CODE > extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 0d1cbc5cb5f..a8ad86b7068 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x) > return 1; > > /* We can use x0 to load floating-point zero. */ > - return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; > + return riscv_float_const_zero_rtx_p (x) ? 1 : 0; > case CONST_VECTOR: > { > /* TODO: This is not accurate, we will need to > @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num) > || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP; > } > > +/* Return true if rtx op is const zero of the floating-point. */ > +bool > +riscv_float_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) != MODE_FLOAT) > + return false; > + > + if (GET_CODE (op) != CONST_DOUBLE) > + return false; > + > + const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op); > + > + if (REAL_VALUE_MINUS_ZERO (*op_rvalue)) > + return !HONOR_SIGNED_ZEROS (mode); > + > + return real_equal (op_rvalue, &dconst0); > +} > + > +/* Return true if rtx op is const zero, include both the integer > + and floating-point. */ > +bool > +riscv_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) == MODE_FLOAT) > + return riscv_float_const_zero_rtx_p (op); > + > + return op == CONST0_RTX (mode); > +} > + > /* Return true if it's valid gpr_save pattern. */ > > bool > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index 68f7203b676..cd429f6dcdd 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat" > "TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat" > "!TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,move,load,store,mtc,mfc") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat" > "TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat" > "!TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32" > "!TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64" > "TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat" > "!TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > new file mode 100644 > index 00000000000..1eda13a3406 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > new file mode 100644 > index 00000000000..8041ec3ea95 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (float *a, float b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > new file mode 100644 > index 00000000000..a6996dae4de > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > new file mode 100644 > index 00000000000..b4ba8a247df > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (double *a, double b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > new file mode 100644 > index 00000000000..60acf7155d3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > new file mode 100644 > index 00000000000..d10efbeb37b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (_Float16 *a, _Float16 b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > new file mode 100644 > index 00000000000..347e4b0ff74 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +float f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > new file mode 100644 > index 00000000000..1eb1edba457 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +double f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-09 1:22 ` Li, Pan2 @ 2024-01-09 7:17 ` Li, Pan2 2024-01-09 13:08 ` Richard Biener 0 siblings, 1 reply; 23+ messages in thread From: Li, Pan2 @ 2024-01-09 7:17 UTC (permalink / raw) To: Richard Biener Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw, eres, tom The test case pr30957-1.c first comes from this commit about 19 years ago which expect the -1.0 for testing. https://github.com/gcc-mirror/gcc/commit/290358f770d21d9204ea621f839ee8fba606a275 Then the below commit changes from -1.0 to +1.0 for this test file only, because of the instantiates copy(s) of the accumulator which it initializes with +0.0. https://github.com/gcc-mirror/gcc/commit/ffefa9288ab95b06b1dfed95e7235f4c09619a91 According to the implementation details of insert_var_expansion_initialization. The zero_init will be the CONST0_RTX (mode) if HONOR_SIGNED_ZEROS is false. If my understanding is correct, maybe the test case is not well designed for the variable expanding in unrolling? At least it is not good idea to mix/rely on the HONOR_SIGNED_ZEROS when testing variable-expansion-in-unroller. CC the original author and please feel free to correct me if any misunderstanding. Pan -----Original Message----- From: Li, Pan2 Sent: Tuesday, January 9, 2024 9:22 AM To: Richard Biener <richard.guenther@gmail.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com Subject: RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option Thanks Richard B for comments. > We don't really expect targets to do this. The small testcase above > is somewhat ill-formed with -fno-signed-zeros. Note there's no > -0.0 in pr30957-1.c so why does that one fail for you? Does > the -fvariable-expansion-in-unroller code maybe not trigger for > riscv? Sorry this confused me a little about the sematics of the option -fno-signed-zeros. I wonder what the target/backend need to do for this option. About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64. if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) abort (); If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below: Disabling unrolling due to variable-length vectorization factor. > I think we should go to PR30957 and see what that was filed originally > for, the testcase doesn't make much sense to me. Sure thing, will take a look and back to you later. Pan -----Original Message----- From: Richard Biener <richard.guenther@gmail.com> Sent: Monday, January 8, 2024 6:45 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > According to the sematics of no-signed-zeros option, the backend > like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. > > Consider below example with option -fno-signed-zeros. > > void > test (float *a) > { > *a = -0.0; > } > > We will generate code as below, which doesn't treat the minus zero > as plus zero. > > test: > lui a5,%hi(.LC0) > flw fa5,%lo(.LC0)(a5) > fsw fa5,0(a0) > ret > > .LC0: > .word -2147483648 // aka -0.0 (0x80000000 in hex) > > This patch would like to fix the bug and treat the minus zero -0.0 > as plus zero, aka +0.0. Thus after this patch we will have asm code > as below for the above sampe code. > > test: > sw zero,0(a0) > ret > > This patch also fix the run failure of the test case pr30957-1.c. The > below tests are passed for this patch. We don't really expect targets to do this. The small testcase above is somewhat ill-formed with -fno-signed-zeros. Note there's no -0.0 in pr30957-1.c so why does that one fail for you? Does the -fvariable-expansion-in-unroller code maybe not trigger for riscv? I think we should go to PR30957 and see what that was filed originally for, the testcase doesn't make much sense to me. > * The riscv regression tests. > * The pr30957-1.c run tests. > > gcc/ChangeLog: > > * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p > for predicating the rtx is const zero float or not. > * config/riscv/predicates.md: Ditto. > * config/riscv/riscv.cc (riscv_const_insns): Ditto. > (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is > const zero float or not. > (riscv_const_zero_rtx_p): New func impl for predicating the rtx > is const zero (both int and fp) or not. > * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p): > New func decl. > (riscv_const_zero_rtx_p): Ditto. > * config/riscv/riscv.md: Making sure the operand[1] of movfp is > CONST0_RTX when the operand[1] is const zero float. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/no-signed-zeros-0.c: New test. > * gcc.target/riscv/no-signed-zeros-1.c: New test. > * gcc.target/riscv/no-signed-zeros-2.c: New test. > * gcc.target/riscv/no-signed-zeros-3.c: New test. > * gcc.target/riscv/no-signed-zeros-4.c: New test. > * gcc.target/riscv/no-signed-zeros-5.c: New test. > * gcc.target/riscv/no-signed-zeros-run-0.c: New test. > * gcc.target/riscv/no-signed-zeros-run-1.c: New test. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/config/riscv/constraints.md | 2 +- > gcc/config/riscv/predicates.md | 2 +- > gcc/config/riscv/riscv-protos.h | 2 + > gcc/config/riscv/riscv.cc | 35 ++++++++++++- > gcc/config/riscv/riscv.md | 49 ++++++++++++++++--- > .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++++++++++ > .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++++++++++++++ > .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++++++++++++++ > 13 files changed, 314 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md > index de4359af00d..db1d5e1385f 100644 > --- a/gcc/config/riscv/constraints.md > +++ b/gcc/config/riscv/constraints.md > @@ -108,7 +108,7 @@ (define_constraint "DnS" > (define_constraint "G" > "@internal" > (and (match_code "const_double") > - (match_test "op == CONST0_RTX (mode)"))) > + (match_test "riscv_float_const_zero_rtx_p (op)"))) > > (define_memory_constraint "A" > "An address that is held in a general-purpose register." > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > index b87a6900841..b428d842101 100644 > --- a/gcc/config/riscv/predicates.md > +++ b/gcc/config/riscv/predicates.md > @@ -78,7 +78,7 @@ (define_predicate "sleu_operand" > > (define_predicate "const_0_operand" > (and (match_code "const_int,const_wide_int,const_double,const_vector") > - (match_test "op == CONST0_RTX (GET_MODE (op))"))) > + (match_test "riscv_const_zero_rtx_p (op)"))) > > (define_predicate "const_1_operand" > (and (match_code "const_int,const_wide_int,const_vector") > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h > index 31049ef7523..fcf30e084a3 100644 > --- a/gcc/config/riscv/riscv-protos.h > +++ b/gcc/config/riscv/riscv-protos.h > @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); > extern bool > riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); > extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); > +extern bool riscv_float_const_zero_rtx_p (rtx); > +extern bool riscv_const_zero_rtx_p (rtx); > > #ifdef RTX_CODE > extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 0d1cbc5cb5f..a8ad86b7068 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x) > return 1; > > /* We can use x0 to load floating-point zero. */ > - return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; > + return riscv_float_const_zero_rtx_p (x) ? 1 : 0; > case CONST_VECTOR: > { > /* TODO: This is not accurate, we will need to > @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num) > || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP; > } > > +/* Return true if rtx op is const zero of the floating-point. */ > +bool > +riscv_float_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) != MODE_FLOAT) > + return false; > + > + if (GET_CODE (op) != CONST_DOUBLE) > + return false; > + > + const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op); > + > + if (REAL_VALUE_MINUS_ZERO (*op_rvalue)) > + return !HONOR_SIGNED_ZEROS (mode); > + > + return real_equal (op_rvalue, &dconst0); > +} > + > +/* Return true if rtx op is const zero, include both the integer > + and floating-point. */ > +bool > +riscv_const_zero_rtx_p (rtx op) > +{ > + machine_mode mode = GET_MODE (op); > + > + if (GET_MODE_CLASS (mode) == MODE_FLOAT) > + return riscv_float_const_zero_rtx_p (op); > + > + return op == CONST0_RTX (mode); > +} > + > /* Return true if it's valid gpr_save pattern. */ > > bool > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index 68f7203b676..cd429f6dcdd 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat" > "TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat" > "!TARGET_ZFHMIN > && (register_operand (operands[0], HFmode) > || reg_or_0_operand (operands[1], HFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,move,load,store,mtc,mfc") > (set_attr "type" "fmove") > (set_attr "mode" "HF")]) > @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat" > "TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat" > "!TARGET_HARD_FLOAT > && (register_operand (operands[0], SFmode) > || reg_or_0_operand (operands[1], SFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "SF")]) > @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32" > "!TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64" > "TARGET_64BIT && TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat" > "!TARGET_DOUBLE_FLOAT > && (register_operand (operands[0], DFmode) > || reg_or_0_operand (operands[1], DFmode))" > - { return riscv_output_move (operands[0], operands[1]); } > + { > + if (riscv_float_const_zero_rtx_p (operands[1])) > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > + > + return riscv_output_move (operands[0], operands[1]); > + } > [(set_attr "move_type" "move,load,store") > (set_attr "type" "fmove") > (set_attr "mode" "DF")]) > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > new file mode 100644 > index 00000000000..1eda13a3406 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sw\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > new file mode 100644 > index 00000000000..8041ec3ea95 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (float *a, float b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (float *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (float *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > new file mode 100644 > index 00000000000..a6996dae4de > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sd\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > new file mode 100644 > index 00000000000..b4ba8a247df > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (double *a, double b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (double *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (double *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > new file mode 100644 > index 00000000000..60acf7155d3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + *a = +0.0; > +} > + > +/* > +** test_float_plus_zero_assign: > +** sh\s+zero,0\([atx][0-9]+\) > +** ret > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + *a = -0.0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > new file mode 100644 > index 00000000000..d10efbeb37b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > +/* { dg-final { check-function-bodies "**" "" } } */ > + > +extern void float_assign (_Float16 *a, _Float16 b); > + > +/* > +** test_float_plus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_plus_zero_assign (_Float16 *a) > +{ > + float_assign (a, +0.0); > +} > + > +/* > +** test_float_minus_zero_assign: > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > +** tail\s+float_assign > +*/ > +void > +test_float_minus_zero_assign (_Float16 *a) > +{ > + float_assign (a, -0.0); > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > new file mode 100644 > index 00000000000..347e4b0ff74 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +float f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (float *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (float *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > new file mode 100644 > index 00000000000..1eb1edba457 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > @@ -0,0 +1,36 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > + > +double f_val; > + > +void > +__attribute__ ((noinline)) > +test_float_plus_zero_assign (double *a) > +{ > + *a = +0.0; > +} > + > +void > +__attribute__ ((noinline)) > +test_float_minus_zero_assign (double *a) > +{ > + *a = -0.0; > +} > + > +int > +main () > +{ > + f_val = -1.0; > + test_float_plus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + f_val = -1.0; > + test_float_minus_zero_assign (&f_val); > + > + if (__builtin_copysignf (1.0, f_val) != 1.0) > + __builtin_abort (); > + > + return 0; > +} > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-09 7:17 ` Li, Pan2 @ 2024-01-09 13:08 ` Richard Biener 0 siblings, 0 replies; 23+ messages in thread From: Richard Biener @ 2024-01-09 13:08 UTC (permalink / raw) To: Li, Pan2 Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw, eres, tdevries On Tue, Jan 9, 2024 at 8:17 AM Li, Pan2 <pan2.li@intel.com> wrote: > > The test case pr30957-1.c first comes from this commit about 19 years ago which expect the -1.0 for testing. > > https://github.com/gcc-mirror/gcc/commit/290358f770d21d9204ea621f839ee8fba606a275 > > Then the below commit changes from -1.0 to +1.0 for this test file only, because of the instantiates copy(s) of the > accumulator which it initializes with +0.0. > > https://github.com/gcc-mirror/gcc/commit/ffefa9288ab95b06b1dfed95e7235f4c09619a91 This second change looks bogus - it makes the testcase not well-defined by specifying -fno-signed-zeros but expecting defined behavior when signed zeros are involved. It might be that it's currently not possible to trigger the "bad" behavior but then the testcase should still not runfail. I suggest to instead remove -funsafe-math-optimizations from the original option set. > According to the implementation details of insert_var_expansion_initialization. The zero_init will be > the CONST0_RTX (mode) if HONOR_SIGNED_ZEROS is false. If my understanding is correct, maybe the test > case is not well designed for the variable expanding in unrolling? At least it is not good idea to mix/rely on > the HONOR_SIGNED_ZEROS when testing variable-expansion-in-unroller. For the first change I think insert_var_expansion_initialization behavior shouldn't depend on HONOR_SIGNED_ZEROS but instead check MODE_HAS_SINGED_ZEROS as -0.0 is always the correct initial value to use. Note that this will likely break the testcase expectation when it's not changed, and the testcase would need to be guarded for targets not supporting signed zeros. Richard. > CC the original author and please feel free to correct me if any misunderstanding. > > Pan > > -----Original Message----- > From: Li, Pan2 > Sent: Tuesday, January 9, 2024 9:22 AM > To: Richard Biener <richard.guenther@gmail.com> > Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com > Subject: RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option > > Thanks Richard B for comments. > > > We don't really expect targets to do this. The small testcase above > > is somewhat ill-formed with -fno-signed-zeros. Note there's no > > -0.0 in pr30957-1.c so why does that one fail for you? Does > > the -fvariable-expansion-in-unroller code maybe not trigger for > > riscv? > > Sorry this confused me a little about the sematics of the option -fno-signed-zeros. > I wonder what the target/backend need to do for this option. > > About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64. > > if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) > abort (); > > If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor. > It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below: > > Disabling unrolling due to variable-length vectorization factor. > > > I think we should go to PR30957 and see what that was filed originally > > for, the testcase doesn't make much sense to me. > > Sure thing, will take a look and back to you later. > > Pan > > -----Original Message----- > From: Richard Biener <richard.guenther@gmail.com> > Sent: Monday, January 8, 2024 6:45 PM > To: Li, Pan2 <pan2.li@intel.com> > Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com > Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option > > On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: > > > > From: Pan Li <pan2.li@intel.com> > > > > According to the sematics of no-signed-zeros option, the backend > > like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. > > > > Consider below example with option -fno-signed-zeros. > > > > void > > test (float *a) > > { > > *a = -0.0; > > } > > > > We will generate code as below, which doesn't treat the minus zero > > as plus zero. > > > > test: > > lui a5,%hi(.LC0) > > flw fa5,%lo(.LC0)(a5) > > fsw fa5,0(a0) > > ret > > > > .LC0: > > .word -2147483648 // aka -0.0 (0x80000000 in hex) > > > > This patch would like to fix the bug and treat the minus zero -0.0 > > as plus zero, aka +0.0. Thus after this patch we will have asm code > > as below for the above sampe code. > > > > test: > > sw zero,0(a0) > > ret > > > > This patch also fix the run failure of the test case pr30957-1.c. The > > below tests are passed for this patch. > > We don't really expect targets to do this. The small testcase above > is somewhat ill-formed with -fno-signed-zeros. Note there's no > -0.0 in pr30957-1.c so why does that one fail for you? Does > the -fvariable-expansion-in-unroller code maybe not trigger for > riscv? > > I think we should go to PR30957 and see what that was filed originally > for, the testcase doesn't make much sense to me. > > > * The riscv regression tests. > > * The pr30957-1.c run tests. > > > > gcc/ChangeLog: > > > > * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p > > for predicating the rtx is const zero float or not. > > * config/riscv/predicates.md: Ditto. > > * config/riscv/riscv.cc (riscv_const_insns): Ditto. > > (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is > > const zero float or not. > > (riscv_const_zero_rtx_p): New func impl for predicating the rtx > > is const zero (both int and fp) or not. > > * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p): > > New func decl. > > (riscv_const_zero_rtx_p): Ditto. > > * config/riscv/riscv.md: Making sure the operand[1] of movfp is > > CONST0_RTX when the operand[1] is const zero float. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/no-signed-zeros-0.c: New test. > > * gcc.target/riscv/no-signed-zeros-1.c: New test. > > * gcc.target/riscv/no-signed-zeros-2.c: New test. > > * gcc.target/riscv/no-signed-zeros-3.c: New test. > > * gcc.target/riscv/no-signed-zeros-4.c: New test. > > * gcc.target/riscv/no-signed-zeros-5.c: New test. > > * gcc.target/riscv/no-signed-zeros-run-0.c: New test. > > * gcc.target/riscv/no-signed-zeros-run-1.c: New test. > > > > Signed-off-by: Pan Li <pan2.li@intel.com> > > --- > > gcc/config/riscv/constraints.md | 2 +- > > gcc/config/riscv/predicates.md | 2 +- > > gcc/config/riscv/riscv-protos.h | 2 + > > gcc/config/riscv/riscv.cc | 35 ++++++++++++- > > gcc/config/riscv/riscv.md | 49 ++++++++++++++++--- > > .../gcc.target/riscv/no-signed-zeros-0.c | 26 ++++++++++ > > .../gcc.target/riscv/no-signed-zeros-1.c | 28 +++++++++++ > > .../gcc.target/riscv/no-signed-zeros-2.c | 26 ++++++++++ > > .../gcc.target/riscv/no-signed-zeros-3.c | 28 +++++++++++ > > .../gcc.target/riscv/no-signed-zeros-4.c | 26 ++++++++++ > > .../gcc.target/riscv/no-signed-zeros-5.c | 28 +++++++++++ > > .../gcc.target/riscv/no-signed-zeros-run-0.c | 36 ++++++++++++++ > > .../gcc.target/riscv/no-signed-zeros-run-1.c | 36 ++++++++++++++ > > 13 files changed, 314 insertions(+), 10 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > > > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md > > index de4359af00d..db1d5e1385f 100644 > > --- a/gcc/config/riscv/constraints.md > > +++ b/gcc/config/riscv/constraints.md > > @@ -108,7 +108,7 @@ (define_constraint "DnS" > > (define_constraint "G" > > "@internal" > > (and (match_code "const_double") > > - (match_test "op == CONST0_RTX (mode)"))) > > + (match_test "riscv_float_const_zero_rtx_p (op)"))) > > > > (define_memory_constraint "A" > > "An address that is held in a general-purpose register." > > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > > index b87a6900841..b428d842101 100644 > > --- a/gcc/config/riscv/predicates.md > > +++ b/gcc/config/riscv/predicates.md > > @@ -78,7 +78,7 @@ (define_predicate "sleu_operand" > > > > (define_predicate "const_0_operand" > > (and (match_code "const_int,const_wide_int,const_double,const_vector") > > - (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > + (match_test "riscv_const_zero_rtx_p (op)"))) > > > > (define_predicate "const_1_operand" > > (and (match_code "const_int,const_wide_int,const_vector") > > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h > > index 31049ef7523..fcf30e084a3 100644 > > --- a/gcc/config/riscv/riscv-protos.h > > +++ b/gcc/config/riscv/riscv-protos.h > > @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *); > > extern bool > > riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int); > > extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx); > > +extern bool riscv_float_const_zero_rtx_p (rtx); > > +extern bool riscv_const_zero_rtx_p (rtx); > > > > #ifdef RTX_CODE > > extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0); > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > > index 0d1cbc5cb5f..a8ad86b7068 100644 > > --- a/gcc/config/riscv/riscv.cc > > +++ b/gcc/config/riscv/riscv.cc > > @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x) > > return 1; > > > > /* We can use x0 to load floating-point zero. */ > > - return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; > > + return riscv_float_const_zero_rtx_p (x) ? 1 : 0; > > case CONST_VECTOR: > > { > > /* TODO: This is not accurate, we will need to > > @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num) > > || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP; > > } > > > > +/* Return true if rtx op is const zero of the floating-point. */ > > +bool > > +riscv_float_const_zero_rtx_p (rtx op) > > +{ > > + machine_mode mode = GET_MODE (op); > > + > > + if (GET_MODE_CLASS (mode) != MODE_FLOAT) > > + return false; > > + > > + if (GET_CODE (op) != CONST_DOUBLE) > > + return false; > > + > > + const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op); > > + > > + if (REAL_VALUE_MINUS_ZERO (*op_rvalue)) > > + return !HONOR_SIGNED_ZEROS (mode); > > + > > + return real_equal (op_rvalue, &dconst0); > > +} > > + > > +/* Return true if rtx op is const zero, include both the integer > > + and floating-point. */ > > +bool > > +riscv_const_zero_rtx_p (rtx op) > > +{ > > + machine_mode mode = GET_MODE (op); > > + > > + if (GET_MODE_CLASS (mode) == MODE_FLOAT) > > + return riscv_float_const_zero_rtx_p (op); > > + > > + return op == CONST0_RTX (mode); > > +} > > + > > /* Return true if it's valid gpr_save pattern. */ > > > > bool > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index 68f7203b676..cd429f6dcdd 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat" > > "TARGET_ZFHMIN > > && (register_operand (operands[0], HFmode) > > || reg_or_0_operand (operands[1], HFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "HF")]) > > @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat" > > "!TARGET_ZFHMIN > > && (register_operand (operands[0], HFmode) > > || reg_or_0_operand (operands[1], HFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "fmove,move,load,store,mtc,mfc") > > (set_attr "type" "fmove") > > (set_attr "mode" "HF")]) > > @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat" > > "TARGET_HARD_FLOAT > > && (register_operand (operands[0], SFmode) > > || reg_or_0_operand (operands[1], SFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "SF")]) > > @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat" > > "!TARGET_HARD_FLOAT > > && (register_operand (operands[0], SFmode) > > || reg_or_0_operand (operands[1], SFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "SF")]) > > @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32" > > "!TARGET_64BIT && TARGET_DOUBLE_FLOAT > > && (register_operand (operands[0], DFmode) > > || reg_or_0_operand (operands[1], DFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "DF")]) > > @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64" > > "TARGET_64BIT && TARGET_DOUBLE_FLOAT > > && (register_operand (operands[0], DFmode) > > || reg_or_0_operand (operands[1], DFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "DF")]) > > @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat" > > "!TARGET_DOUBLE_FLOAT > > && (register_operand (operands[0], DFmode) > > || reg_or_0_operand (operands[1], DFmode))" > > - { return riscv_output_move (operands[0], operands[1]); } > > + { > > + if (riscv_float_const_zero_rtx_p (operands[1])) > > + operands[1] = CONST0_RTX (GET_MODE (operands[1])); > > + > > + return riscv_output_move (operands[0], operands[1]); > > + } > > [(set_attr "move_type" "move,load,store") > > (set_attr "type" "fmove") > > (set_attr "mode" "DF")]) > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > > new file mode 100644 > > index 00000000000..1eda13a3406 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c > > @@ -0,0 +1,26 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sw\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_plus_zero_assign (float *a) > > +{ > > + *a = +0.0; > > +} > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sw\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_minus_zero_assign (float *a) > > +{ > > + *a = -0.0; > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > > new file mode 100644 > > index 00000000000..8041ec3ea95 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c > > @@ -0,0 +1,28 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +extern void float_assign (float *a, float b); > > + > > +/* > > +** test_float_plus_zero_assign: > > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_plus_zero_assign (float *a) > > +{ > > + float_assign (a, +0.0); > > +} > > + > > +/* > > +** test_float_minus_zero_assign: > > +** fmv\.s\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_minus_zero_assign (float *a) > > +{ > > + float_assign (a, -0.0); > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > > new file mode 100644 > > index 00000000000..a6996dae4de > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c > > @@ -0,0 +1,26 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sd\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_plus_zero_assign (double *a) > > +{ > > + *a = +0.0; > > +} > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sd\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_minus_zero_assign (double *a) > > +{ > > + *a = -0.0; > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > > new file mode 100644 > > index 00000000000..b4ba8a247df > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c > > @@ -0,0 +1,28 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +extern void float_assign (double *a, double b); > > + > > +/* > > +** test_float_plus_zero_assign: > > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_plus_zero_assign (double *a) > > +{ > > + float_assign (a, +0.0); > > +} > > + > > +/* > > +** test_float_minus_zero_assign: > > +** fmv\.d\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_minus_zero_assign (double *a) > > +{ > > + float_assign (a, -0.0); > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > > new file mode 100644 > > index 00000000000..60acf7155d3 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c > > @@ -0,0 +1,26 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sh\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_plus_zero_assign (_Float16 *a) > > +{ > > + *a = +0.0; > > +} > > + > > +/* > > +** test_float_plus_zero_assign: > > +** sh\s+zero,0\([atx][0-9]+\) > > +** ret > > +*/ > > +void > > +test_float_minus_zero_assign (_Float16 *a) > > +{ > > + *a = -0.0; > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > > new file mode 100644 > > index 00000000000..d10efbeb37b > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c > > @@ -0,0 +1,28 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */ > > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > + > > +extern void float_assign (_Float16 *a, _Float16 b); > > + > > +/* > > +** test_float_plus_zero_assign: > > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_plus_zero_assign (_Float16 *a) > > +{ > > + float_assign (a, +0.0); > > +} > > + > > +/* > > +** test_float_minus_zero_assign: > > +** fmv\.h\.x\s+fa[0-9]+,\s*zero > > +** tail\s+float_assign > > +*/ > > +void > > +test_float_minus_zero_assign (_Float16 *a) > > +{ > > + float_assign (a, -0.0); > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > > new file mode 100644 > > index 00000000000..347e4b0ff74 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c > > @@ -0,0 +1,36 @@ > > +/* { dg-do run { target { riscv_v } } } */ > > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > > + > > +float f_val; > > + > > +void > > +__attribute__ ((noinline)) > > +test_float_plus_zero_assign (float *a) > > +{ > > + *a = +0.0; > > +} > > + > > +void > > +__attribute__ ((noinline)) > > +test_float_minus_zero_assign (float *a) > > +{ > > + *a = -0.0; > > +} > > + > > +int > > +main () > > +{ > > + f_val = -1.0; > > + test_float_plus_zero_assign (&f_val); > > + > > + if (__builtin_copysignf (1.0, f_val) != 1.0) > > + __builtin_abort (); > > + > > + f_val = -1.0; > > + test_float_minus_zero_assign (&f_val); > > + > > + if (__builtin_copysignf (1.0, f_val) != 1.0) > > + __builtin_abort (); > > + > > + return 0; > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > new file mode 100644 > > index 00000000000..1eb1edba457 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c > > @@ -0,0 +1,36 @@ > > +/* { dg-do run { target { riscv_v } } } */ > > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */ > > + > > +double f_val; > > + > > +void > > +__attribute__ ((noinline)) > > +test_float_plus_zero_assign (double *a) > > +{ > > + *a = +0.0; > > +} > > + > > +void > > +__attribute__ ((noinline)) > > +test_float_minus_zero_assign (double *a) > > +{ > > + *a = -0.0; > > +} > > + > > +int > > +main () > > +{ > > + f_val = -1.0; > > + test_float_plus_zero_assign (&f_val); > > + > > + if (__builtin_copysignf (1.0, f_val) != 1.0) > > + __builtin_abort (); > > + > > + f_val = -1.0; > > + test_float_minus_zero_assign (&f_val); > > + > > + if (__builtin_copysignf (1.0, f_val) != 1.0) > > + __builtin_abort (); > > + > > + return 0; > > +} > > -- > > 2.34.1 > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-08 10:45 ` Richard Biener 2024-01-09 1:22 ` Li, Pan2 @ 2024-01-09 17:46 ` Jeff Law 2024-01-10 4:28 ` Li, Pan2 1 sibling, 1 reply; 23+ messages in thread From: Jeff Law @ 2024-01-09 17:46 UTC (permalink / raw) To: Richard Biener, pan2.li Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng On 1/8/24 03:45, Richard Biener wrote: > On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: >> >> From: Pan Li <pan2.li@intel.com> >> >> According to the sematics of no-signed-zeros option, the backend >> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. >> >> Consider below example with option -fno-signed-zeros. >> >> void >> test (float *a) >> { >> *a = -0.0; >> } >> >> We will generate code as below, which doesn't treat the minus zero >> as plus zero. >> >> test: >> lui a5,%hi(.LC0) >> flw fa5,%lo(.LC0)(a5) >> fsw fa5,0(a0) >> ret >> >> .LC0: >> .word -2147483648 // aka -0.0 (0x80000000 in hex) >> >> This patch would like to fix the bug and treat the minus zero -0.0 >> as plus zero, aka +0.0. Thus after this patch we will have asm code >> as below for the above sampe code. >> >> test: >> sw zero,0(a0) >> ret >> >> This patch also fix the run failure of the test case pr30957-1.c. The >> below tests are passed for this patch. > > We don't really expect targets to do this. The small testcase above > is somewhat ill-formed with -fno-signed-zeros. Note there's no > -0.0 in pr30957-1.c so why does that one fail for you? Does > the -fvariable-expansion-in-unroller code maybe not trigger for > riscv? Loop unrolling (and thus variable expansion) doesn't trigger on the VLA style architectures. aarch64 passes becuase its backend knows it can translate -0.0 into 0.0. While we don't require that from ports, I'd just assume do the optimization similar to aarch64 rather than xfail or skip the test on RISC-V. We can load 0.0 more efficiently than -0.0. > > I think we should go to PR30957 and see what that was filed originally > for, the testcase doesn't make much sense to me. It's got more history than I'd like :( jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option 2024-01-09 17:46 ` Jeff Law @ 2024-01-10 4:28 ` Li, Pan2 0 siblings, 0 replies; 23+ messages in thread From: Li, Pan2 @ 2024-01-10 4:28 UTC (permalink / raw) To: Jeff Law, Richard Biener Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng Thanks Jeff and Richard for confirmation and comments. It looks like firstly we should address the issue of the original commits in v4 and then back to if there is something we need to deal with option no-signed-zero for the riscv. Pan -----Original Message----- From: Jeff Law <jeffreyalaw@gmail.com> Sent: Wednesday, January 10, 2024 1:46 AM To: Richard Biener <richard.guenther@gmail.com>; Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option On 1/8/24 03:45, Richard Biener wrote: > On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote: >> >> From: Pan Li <pan2.li@intel.com> >> >> According to the sematics of no-signed-zeros option, the backend >> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. >> >> Consider below example with option -fno-signed-zeros. >> >> void >> test (float *a) >> { >> *a = -0.0; >> } >> >> We will generate code as below, which doesn't treat the minus zero >> as plus zero. >> >> test: >> lui a5,%hi(.LC0) >> flw fa5,%lo(.LC0)(a5) >> fsw fa5,0(a0) >> ret >> >> .LC0: >> .word -2147483648 // aka -0.0 (0x80000000 in hex) >> >> This patch would like to fix the bug and treat the minus zero -0.0 >> as plus zero, aka +0.0. Thus after this patch we will have asm code >> as below for the above sampe code. >> >> test: >> sw zero,0(a0) >> ret >> >> This patch also fix the run failure of the test case pr30957-1.c. The >> below tests are passed for this patch. > > We don't really expect targets to do this. The small testcase above > is somewhat ill-formed with -fno-signed-zeros. Note there's no > -0.0 in pr30957-1.c so why does that one fail for you? Does > the -fvariable-expansion-in-unroller code maybe not trigger for > riscv? Loop unrolling (and thus variable expansion) doesn't trigger on the VLA style architectures. aarch64 passes becuase its backend knows it can translate -0.0 into 0.0. While we don't require that from ports, I'd just assume do the optimization similar to aarch64 rather than xfail or skip the test on RISC-V. We can load 0.0 more efficiently than -0.0. > > I think we should go to PR30957 and see what that was filed originally > for, the testcase doesn't make much sense to me. It's got more history than I'd like :( jeff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li ` (2 preceding siblings ...) 2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li @ 2024-01-11 1:38 ` pan2.li 2024-01-11 8:33 ` Richard Biener 2024-01-11 8:50 ` [PATCH v5] " pan2.li 4 siblings, 1 reply; 23+ messages in thread From: pan2.li @ 2024-01-11 1:38 UTC (permalink / raw) To: gcc-patches Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, richard.guenther, jeffreyalaw From: Pan Li <pan2.li@intel.com> The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to support the target that always honor signed zero. Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize instead of HONOR_SIGNED_ZEROS. Then the target/backend can decide to honor the no-signed-zero or not. The below tests are passed for this patch: * The riscv regression tests. * The aarch64 regression tests. * The x86 bootstrap and regression tests. gcc/ChangeLog: * loop-unroll.cc (insert_var_expansion_initialization): Leverage MODE_HAS_SIGNED_ZEROS for expansion variable initialization. gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios. Signed-off-by: Pan Li <pan2.li@intel.com> --- gcc/loop-unroll.cc | 4 +-- gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++---- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc index 4176a21e308..bfdfe6c2bb7 100644 --- a/gcc/loop-unroll.cc +++ b/gcc/loop-unroll.cc @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, rtx var, zero_init; unsigned i; machine_mode mode = GET_MODE (ve->reg); - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); if (ve->var_expansions.length () == 0) return; @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, case MINUS: FOR_EACH_VEC_ELT (ve->var_expansions, i, var) { - if (honor_signed_zero_p) + if (has_signed_zero_p) zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); else zero_init = CONST0_RTX (mode); diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c index 564410913ab..6a9d3d87932 100644 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ b/gcc/testsuite/gcc.dg/pr30957-1.c @@ -20,16 +20,52 @@ foo (float d, int n) return accum; } +float __attribute__((noinline)) +get_minus_zero() +{ + return 0.0 / -5.0; +} + int main () { - /* When compiling standard compliant we expect foo to return -0.0. But the - variable expansion during unrolling optimization (for this testcase enabled - by non-compliant -fassociative-math) instantiates copy(s) of the - accumulator which it initializes with +0.0. Hence we expect that foo - returns +0.0. */ - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) + /* The variable expansion in unroll requires option unsafe-math-optimizations + (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math + and -freciprocal-math). + + When loop like above will have expansion after unrolling as below: + + accum_1 += d_1; + accum_2 += d_2; + accum_3 += d_3; + ... + + The accum_1, accum_2 and accum_3 need to be initialized. Given the + floating-point we have + +0.0f + -0.0f = +0.0f. + + Thus, we should initialize the accum_* to -0.0 for correctness. But + the things become more complicated when no-signed-zeros, as well as VLA + vectorizer mode which doesn't trigger variable expansion. Then we have: + + Case 1: Trigger variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 2: Trigger variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + Case 3: No variable expansion but target doesn't honor no-signed-zero. + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. + Case 4: No variable expansion but target does honor no-signed-zero. + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. + + The test case covers above 4 cases for running. + */ + float minus_zero = get_minus_zero (); + float a = __builtin_copysignf (1.0, minus_zero); + float b = __builtin_copysignf (1.0, foo (minus_zero, 10)); + + if (a != b) abort (); + exit (0); } -- 2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2024-01-11 1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li @ 2024-01-11 8:33 ` Richard Biener 2024-01-11 8:48 ` Li, Pan2 0 siblings, 1 reply; 23+ messages in thread From: Richard Biener @ 2024-01-11 8:33 UTC (permalink / raw) To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw On Thu, Jan 11, 2024 at 2:39 AM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > The insert_var_expansion_initialization depends on the > HONOR_SIGNED_ZEROS to initialize the unrolling variables > to +0.0f when -0.0f and no-signed-option. Unfortunately, > we should always keep the -0.0f here because: > > * The -0.0f is always the correct initial value. > * We need to support the target that always honor signed zero. > > Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize > instead of HONOR_SIGNED_ZEROS. Then the target/backend can > decide to honor the no-signed-zero or not. > > The below tests are passed for this patch: > > * The riscv regression tests. > * The aarch64 regression tests. > * The x86 bootstrap and regression tests. > > gcc/ChangeLog: > > * loop-unroll.cc (insert_var_expansion_initialization): Leverage > MODE_HAS_SIGNED_ZEROS for expansion variable initialization. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/loop-unroll.cc | 4 +-- > gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++---- > 2 files changed, 44 insertions(+), 8 deletions(-) > > diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc > index 4176a21e308..bfdfe6c2bb7 100644 > --- a/gcc/loop-unroll.cc > +++ b/gcc/loop-unroll.cc > @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > rtx var, zero_init; > unsigned i; > machine_mode mode = GET_MODE (ve->reg); > - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); > + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); > > if (ve->var_expansions.length () == 0) > return; > @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > case MINUS: > FOR_EACH_VEC_ELT (ve->var_expansions, i, var) > { > - if (honor_signed_zero_p) > + if (has_signed_zero_p) > zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); > else > zero_init = CONST0_RTX (mode); This change is OK. > diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c > index 564410913ab..6a9d3d87932 100644 > --- a/gcc/testsuite/gcc.dg/pr30957-1.c > +++ b/gcc/testsuite/gcc.dg/pr30957-1.c > @@ -20,16 +20,52 @@ foo (float d, int n) > return accum; > } > > +float __attribute__((noinline)) > +get_minus_zero() > +{ > + return 0.0 / -5.0; > +} > + I still think this test shouldn't use -fno-signed-zeros since that makes it undefined whether the return value is positive or negative - checking whether get_minus_zero returns negative or positive zero isn't a good indicator and prone to break. We have a { dg-add-options ieee } we'd need. The original testcase indicates variable expansion wouldn't trigger then, and in the PR it's noted we should remove this broken testcase and the PR itself is invalid. I agree there. So, can you remove the testcase instead? Thanks, Richard. > int > main () > { > - /* When compiling standard compliant we expect foo to return -0.0. But the > - variable expansion during unrolling optimization (for this testcase enabled > - by non-compliant -fassociative-math) instantiates copy(s) of the > - accumulator which it initializes with +0.0. Hence we expect that foo > - returns +0.0. */ > - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) > + /* The variable expansion in unroll requires option unsafe-math-optimizations > + (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math > + and -freciprocal-math). > + > + When loop like above will have expansion after unrolling as below: > + > + accum_1 += d_1; > + accum_2 += d_2; > + accum_3 += d_3; > + ... > + > + The accum_1, accum_2 and accum_3 need to be initialized. Given the > + floating-point we have > + +0.0f + -0.0f = +0.0f. > + > + Thus, we should initialize the accum_* to -0.0 for correctness. But > + the things become more complicated when no-signed-zeros, as well as VLA > + vectorizer mode which doesn't trigger variable expansion. Then we have: > + > + Case 1: Trigger variable expansion but target doesn't honor no-signed-zero. > + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. > + Case 2: Trigger variable expansion but target does honor no-signed-zero. > + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. > + Case 3: No variable expansion but target doesn't honor no-signed-zero. > + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. > + Case 4: No variable expansion but target does honor no-signed-zero. > + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. > + > + The test case covers above 4 cases for running. > + */ > + float minus_zero = get_minus_zero (); > + float a = __builtin_copysignf (1.0, minus_zero); > + float b = __builtin_copysignf (1.0, foo (minus_zero, 10)); > + > + if (a != b) > abort (); > + > exit (0); > } > > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2024-01-11 8:33 ` Richard Biener @ 2024-01-11 8:48 ` Li, Pan2 0 siblings, 0 replies; 23+ messages in thread From: Li, Pan2 @ 2024-01-11 8:48 UTC (permalink / raw) To: Richard Biener Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw Thanks Richard, will delete the test case pr30957-1.c in patch V5. Pan -----Original Message----- From: Richard Biener <richard.guenther@gmail.com> Sent: Thursday, January 11, 2024 4:33 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com Subject: Re: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion On Thu, Jan 11, 2024 at 2:39 AM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > The insert_var_expansion_initialization depends on the > HONOR_SIGNED_ZEROS to initialize the unrolling variables > to +0.0f when -0.0f and no-signed-option. Unfortunately, > we should always keep the -0.0f here because: > > * The -0.0f is always the correct initial value. > * We need to support the target that always honor signed zero. > > Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize > instead of HONOR_SIGNED_ZEROS. Then the target/backend can > decide to honor the no-signed-zero or not. > > The below tests are passed for this patch: > > * The riscv regression tests. > * The aarch64 regression tests. > * The x86 bootstrap and regression tests. > > gcc/ChangeLog: > > * loop-unroll.cc (insert_var_expansion_initialization): Leverage > MODE_HAS_SIGNED_ZEROS for expansion variable initialization. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/loop-unroll.cc | 4 +-- > gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++---- > 2 files changed, 44 insertions(+), 8 deletions(-) > > diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc > index 4176a21e308..bfdfe6c2bb7 100644 > --- a/gcc/loop-unroll.cc > +++ b/gcc/loop-unroll.cc > @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > rtx var, zero_init; > unsigned i; > machine_mode mode = GET_MODE (ve->reg); > - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); > + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); > > if (ve->var_expansions.length () == 0) > return; > @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > case MINUS: > FOR_EACH_VEC_ELT (ve->var_expansions, i, var) > { > - if (honor_signed_zero_p) > + if (has_signed_zero_p) > zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); > else > zero_init = CONST0_RTX (mode); This change is OK. > diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c > index 564410913ab..6a9d3d87932 100644 > --- a/gcc/testsuite/gcc.dg/pr30957-1.c > +++ b/gcc/testsuite/gcc.dg/pr30957-1.c > @@ -20,16 +20,52 @@ foo (float d, int n) > return accum; > } > > +float __attribute__((noinline)) > +get_minus_zero() > +{ > + return 0.0 / -5.0; > +} > + I still think this test shouldn't use -fno-signed-zeros since that makes it undefined whether the return value is positive or negative - checking whether get_minus_zero returns negative or positive zero isn't a good indicator and prone to break. We have a { dg-add-options ieee } we'd need. The original testcase indicates variable expansion wouldn't trigger then, and in the PR it's noted we should remove this broken testcase and the PR itself is invalid. I agree there. So, can you remove the testcase instead? Thanks, Richard. > int > main () > { > - /* When compiling standard compliant we expect foo to return -0.0. But the > - variable expansion during unrolling optimization (for this testcase enabled > - by non-compliant -fassociative-math) instantiates copy(s) of the > - accumulator which it initializes with +0.0. Hence we expect that foo > - returns +0.0. */ > - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) > + /* The variable expansion in unroll requires option unsafe-math-optimizations > + (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math > + and -freciprocal-math). > + > + When loop like above will have expansion after unrolling as below: > + > + accum_1 += d_1; > + accum_2 += d_2; > + accum_3 += d_3; > + ... > + > + The accum_1, accum_2 and accum_3 need to be initialized. Given the > + floating-point we have > + +0.0f + -0.0f = +0.0f. > + > + Thus, we should initialize the accum_* to -0.0 for correctness. But > + the things become more complicated when no-signed-zeros, as well as VLA > + vectorizer mode which doesn't trigger variable expansion. Then we have: > + > + Case 1: Trigger variable expansion but target doesn't honor no-signed-zero. > + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. > + Case 2: Trigger variable expansion but target does honor no-signed-zero. > + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. > + Case 3: No variable expansion but target doesn't honor no-signed-zero. > + minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f. > + Case 4: No variable expansion but target does honor no-signed-zero. > + minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f. > + > + The test case covers above 4 cases for running. > + */ > + float minus_zero = get_minus_zero (); > + float a = __builtin_copysignf (1.0, minus_zero); > + float b = __builtin_copysignf (1.0, foo (minus_zero, 10)); > + > + if (a != b) > abort (); > + > exit (0); > } > > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li ` (3 preceding siblings ...) 2024-01-11 1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li @ 2024-01-11 8:50 ` pan2.li 2024-01-11 9:21 ` Richard Biener 4 siblings, 1 reply; 23+ messages in thread From: pan2.li @ 2024-01-11 8:50 UTC (permalink / raw) To: gcc-patches Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng, richard.guenther, jeffreyalaw From: Pan Li <pan2.li@intel.com> The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to support the target that always honor signed zero. Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize instead of HONOR_SIGNED_ZEROS. Then the target/backend can decide to honor the no-signed-zero or not. We also removed the testcase pr30957-1.c, as it makes undefined behavior whether the return value is positive or negative. The below tests are passed for this patch: * The riscv regression tests. * The aarch64 regression tests. * The x86 bootstrap and regression tests. gcc/ChangeLog: * loop-unroll.cc (insert_var_expansion_initialization): Leverage MODE_HAS_SIGNED_ZEROS for expansion variable initialization. gcc/testsuite/ChangeLog: * gcc.dg/pr30957-1.c: Remove. Signed-off-by: Pan Li <pan2.li@intel.com> --- gcc/loop-unroll.cc | 4 ++-- gcc/testsuite/gcc.dg/pr30957-1.c | 36 -------------------------------- 2 files changed, 2 insertions(+), 38 deletions(-) delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc index 4176a21e308..bfdfe6c2bb7 100644 --- a/gcc/loop-unroll.cc +++ b/gcc/loop-unroll.cc @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, rtx var, zero_init; unsigned i; machine_mode mode = GET_MODE (ve->reg); - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); if (ve->var_expansions.length () == 0) return; @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, case MINUS: FOR_EACH_VEC_ELT (ve->var_expansions, i, var) { - if (honor_signed_zero_p) + if (has_signed_zero_p) zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); else zero_init = CONST0_RTX (mode); diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c deleted file mode 100644 index 564410913ab..00000000000 --- a/gcc/testsuite/gcc.dg/pr30957-1.c +++ /dev/null @@ -1,36 +0,0 @@ -/* { dg-do run { xfail { mmix-*-* } } } */ -/* We don't (and don't want to) perform this optimisation on soft-float targets, - where each addition is a library call. / -/* { dg-require-effective-target hard_float } */ -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ - -extern void abort (void); -extern void exit (int); - -float __attribute__((noinline)) -foo (float d, int n) -{ - unsigned i; - float accum = d; - - for (i = 0; i < n; i++) - accum += d; - - return accum; -} - -int -main () -{ - /* When compiling standard compliant we expect foo to return -0.0. But the - variable expansion during unrolling optimization (for this testcase enabled - by non-compliant -fassociative-math) instantiates copy(s) of the - accumulator which it initializes with +0.0. Hence we expect that foo - returns +0.0. */ - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) - abort (); - exit (0); -} - -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ -- 2.34.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2024-01-11 8:50 ` [PATCH v5] " pan2.li @ 2024-01-11 9:21 ` Richard Biener 2024-01-11 10:35 ` Li, Pan2 0 siblings, 1 reply; 23+ messages in thread From: Richard Biener @ 2024-01-11 9:21 UTC (permalink / raw) To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw On Thu, Jan 11, 2024 at 9:50 AM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > The insert_var_expansion_initialization depends on the > HONOR_SIGNED_ZEROS to initialize the unrolling variables > to +0.0f when -0.0f and no-signed-option. Unfortunately, > we should always keep the -0.0f here because: > > * The -0.0f is always the correct initial value. > * We need to support the target that always honor signed zero. > > Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize > instead of HONOR_SIGNED_ZEROS. Then the target/backend can > decide to honor the no-signed-zero or not. > > We also removed the testcase pr30957-1.c, as it makes undefined behavior > whether the return value is positive or negative. > > The below tests are passed for this patch: > > * The riscv regression tests. > * The aarch64 regression tests. > * The x86 bootstrap and regression tests. OK > gcc/ChangeLog: > > * loop-unroll.cc (insert_var_expansion_initialization): Leverage > MODE_HAS_SIGNED_ZEROS for expansion variable initialization. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Remove. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/loop-unroll.cc | 4 ++-- > gcc/testsuite/gcc.dg/pr30957-1.c | 36 -------------------------------- > 2 files changed, 2 insertions(+), 38 deletions(-) > delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c > > diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc > index 4176a21e308..bfdfe6c2bb7 100644 > --- a/gcc/loop-unroll.cc > +++ b/gcc/loop-unroll.cc > @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > rtx var, zero_init; > unsigned i; > machine_mode mode = GET_MODE (ve->reg); > - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); > + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); > > if (ve->var_expansions.length () == 0) > return; > @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > case MINUS: > FOR_EACH_VEC_ELT (ve->var_expansions, i, var) > { > - if (honor_signed_zero_p) > + if (has_signed_zero_p) > zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); > else > zero_init = CONST0_RTX (mode); > diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c > deleted file mode 100644 > index 564410913ab..00000000000 > --- a/gcc/testsuite/gcc.dg/pr30957-1.c > +++ /dev/null > @@ -1,36 +0,0 @@ > -/* { dg-do run { xfail { mmix-*-* } } } */ > -/* We don't (and don't want to) perform this optimisation on soft-float targets, > - where each addition is a library call. / > -/* { dg-require-effective-target hard_float } */ > -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ > -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ > - > -extern void abort (void); > -extern void exit (int); > - > -float __attribute__((noinline)) > -foo (float d, int n) > -{ > - unsigned i; > - float accum = d; > - > - for (i = 0; i < n; i++) > - accum += d; > - > - return accum; > -} > - > -int > -main () > -{ > - /* When compiling standard compliant we expect foo to return -0.0. But the > - variable expansion during unrolling optimization (for this testcase enabled > - by non-compliant -fassociative-math) instantiates copy(s) of the > - accumulator which it initializes with +0.0. Hence we expect that foo > - returns +0.0. */ > - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) > - abort (); > - exit (0); > -} > - > -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion 2024-01-11 9:21 ` Richard Biener @ 2024-01-11 10:35 ` Li, Pan2 0 siblings, 0 replies; 23+ messages in thread From: Li, Pan2 @ 2024-01-11 10:35 UTC (permalink / raw) To: Richard Biener Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw Committed, thanks Richard. Pan -----Original Message----- From: Richard Biener <richard.guenther@gmail.com> Sent: Thursday, January 11, 2024 5:22 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com Subject: Re: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion On Thu, Jan 11, 2024 at 9:50 AM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > The insert_var_expansion_initialization depends on the > HONOR_SIGNED_ZEROS to initialize the unrolling variables > to +0.0f when -0.0f and no-signed-option. Unfortunately, > we should always keep the -0.0f here because: > > * The -0.0f is always the correct initial value. > * We need to support the target that always honor signed zero. > > Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize > instead of HONOR_SIGNED_ZEROS. Then the target/backend can > decide to honor the no-signed-zero or not. > > We also removed the testcase pr30957-1.c, as it makes undefined behavior > whether the return value is positive or negative. > > The below tests are passed for this patch: > > * The riscv regression tests. > * The aarch64 regression tests. > * The x86 bootstrap and regression tests. OK > gcc/ChangeLog: > > * loop-unroll.cc (insert_var_expansion_initialization): Leverage > MODE_HAS_SIGNED_ZEROS for expansion variable initialization. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr30957-1.c: Remove. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/loop-unroll.cc | 4 ++-- > gcc/testsuite/gcc.dg/pr30957-1.c | 36 -------------------------------- > 2 files changed, 2 insertions(+), 38 deletions(-) > delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c > > diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc > index 4176a21e308..bfdfe6c2bb7 100644 > --- a/gcc/loop-unroll.cc > +++ b/gcc/loop-unroll.cc > @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > rtx var, zero_init; > unsigned i; > machine_mode mode = GET_MODE (ve->reg); > - bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode); > + bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode); > > if (ve->var_expansions.length () == 0) > return; > @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve, > case MINUS: > FOR_EACH_VEC_ELT (ve->var_expansions, i, var) > { > - if (honor_signed_zero_p) > + if (has_signed_zero_p) > zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode); > else > zero_init = CONST0_RTX (mode); > diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c > deleted file mode 100644 > index 564410913ab..00000000000 > --- a/gcc/testsuite/gcc.dg/pr30957-1.c > +++ /dev/null > @@ -1,36 +0,0 @@ > -/* { dg-do run { xfail { mmix-*-* } } } */ > -/* We don't (and don't want to) perform this optimisation on soft-float targets, > - where each addition is a library call. / > -/* { dg-require-effective-target hard_float } */ > -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */ > -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */ > - > -extern void abort (void); > -extern void exit (int); > - > -float __attribute__((noinline)) > -foo (float d, int n) > -{ > - unsigned i; > - float accum = d; > - > - for (i = 0; i < n; i++) > - accum += d; > - > - return accum; > -} > - > -int > -main () > -{ > - /* When compiling standard compliant we expect foo to return -0.0. But the > - variable expansion during unrolling optimization (for this testcase enabled > - by non-compliant -fassociative-math) instantiates copy(s) of the > - accumulator which it initializes with +0.0. Hence we expect that foo > - returns +0.0. */ > - if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0) > - abort (); > - exit (0); > -} > - > -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */ > -- > 2.34.1 > ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-01-11 10:35 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li 2023-12-23 17:19 ` Jeff Law 2023-12-24 2:01 ` Li, Pan2 2023-12-26 9:34 ` [PATCH v2] " pan2.li 2023-12-28 16:39 ` Jeff Law 2023-12-29 0:42 ` Li, Pan2 2023-12-29 1:03 ` Jeff Law 2023-12-29 5:56 ` Li, Pan2 2023-12-30 3:13 ` Jeff Law 2024-01-01 8:56 ` Li, Pan2 2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li 2024-01-08 10:45 ` Richard Biener 2024-01-09 1:22 ` Li, Pan2 2024-01-09 7:17 ` Li, Pan2 2024-01-09 13:08 ` Richard Biener 2024-01-09 17:46 ` Jeff Law 2024-01-10 4:28 ` Li, Pan2 2024-01-11 1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li 2024-01-11 8:33 ` Richard Biener 2024-01-11 8:48 ` Li, Pan2 2024-01-11 8:50 ` [PATCH v5] " pan2.li 2024-01-11 9:21 ` Richard Biener 2024-01-11 10:35 ` Li, Pan2
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).