[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
@ 2023-12-23 11:07 pan2.li
  2023-12-23 17:19 ` Jeff Law
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: pan2.li @ 2023-12-23 11:07 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

This patch would like to XFAIL the test case pr30957-1.c for the RVV when
build the elf with some configurations (list at the end of the log)
It will be vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
Of course, it will do nothing during unroll_loops when loop->unroll is 1.

After this patch the loops vectorized with a variable factor of the RVV
will be treated as XFAIL by the tree dump.

Aka the blow configuration will be treated as XFAIL and we still need
further investigation for the failures of other configurations.

* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

	* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
	variable length.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..4c5c4cd4ad8 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -3,7 +3,7 @@
    where each addition is a library call.  /
 /* { dg-require-effective-target hard_float } */
 /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll -fdump-tree-vect-details" } */
 
 extern void abort (void);
 extern void exit (int);
@@ -34,3 +34,4 @@ main ()
 }
 
 /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
+/* { dg-final { scan-tree-dump "Disabling unrolling due to variable-length vectorization factor" "vect" { xfail riscv*-*-* } } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
@ 2023-12-23 17:19 ` Jeff Law
  2023-12-24  2:01   ` Li, Pan2
  2023-12-26  9:34 ` [PATCH v2] " pan2.li
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Jeff Law @ 2023-12-23 17:19 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther



On 12/23/23 04:07, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> This patch would like to XFAIL the test case pr30957-1.c for the RVV when
> build the elf with some configurations (list at the end of the log)
> It will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
> Of course, it will do nothing during unroll_loops when loop->unroll is 1.
> 
> After this patch the loops vectorized with a variable factor of the RVV
> will be treated as XFAIL by the tree dump.
> 
> Aka the blow configuration will be treated as XFAIL and we still need
> further investigation for the failures of other configurations.
> 
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
> 	variable length.
Isn't this going to XPASS for non-vector configurations?

If I understand correctly, the test requires loop unrolling and its 
associated variable expansion to trigger the desired behavior.  VLA 
style vectorization is inhibiting loop unrolling and thus we get the 
failure?

So the natural question here is whether or not aarch64 SVE sees the same 
failure, if not, why?  If so, then can we conditionalize this on an 
effective target test (check_effective_target_vect_variable_length perhaps?)

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-23 17:19 ` Jeff Law
@ 2023-12-24  2:01   ` Li, Pan2
  0 siblings, 0 replies; 23+ messages in thread
From: Li, Pan2 @ 2023-12-24  2:01 UTC (permalink / raw)
  To: Jeff Law, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

Thanks Jeff for comments.

> Isn't this going to XPASS for non-vector configurations?

Yes, I think we still need something like riscv_v here.

> If I understand correctly, the test requires loop unrolling and its 
> associated variable expansion to trigger the desired behavior.  VLA 
> style vectorization is inhibiting loop unrolling and thus we get the 
> failure?

Yes, exactly.

> So the natural question here is whether or not aarch64 SVE sees the same 
> failure, if not, why?  If so, then can we conditionalize this on an 
> effective target test (check_effective_target_vect_variable_length perhaps?)

Sure, will have a try for this.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Sunday, December 24, 2023 1:20 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor



On 12/23/23 04:07, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> This patch would like to XFAIL the test case pr30957-1.c for the RVV when
> build the elf with some configurations (list at the end of the log)
> It will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
> Of course, it will do nothing during unroll_loops when loop->unroll is 1.
> 
> After this patch the loops vectorized with a variable factor of the RVV
> will be treated as XFAIL by the tree dump.
> 
> Aka the blow configuration will be treated as XFAIL and we still need
> further investigation for the failures of other configurations.
> 
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
> 	variable length.
Isn't this going to XPASS for non-vector configurations?

If I understand correctly, the test requires loop unrolling and its 
associated variable expansion to trigger the desired behavior.  VLA 
style vectorization is inhibiting loop unrolling and thus we get the 
failure?

So the natural question here is whether or not aarch64 SVE sees the same 
failure, if not, why?  If so, then can we conditionalize this on an 
effective target test (check_effective_target_vect_variable_length perhaps?)

Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
  2023-12-23 17:19 ` Jeff Law
@ 2023-12-26  9:34 ` pan2.li
  2023-12-28 16:39   ` Jeff Law
  2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: pan2.li @ 2023-12-26  9:34 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

This patch would like to XFAIL the test case pr30957-1.c for the RVV when
build the elf with some configurations (list at the end of the log)
It will be vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
Of course, it will do nothing during unroll_loops when loop->unroll is 1.

The aarch64_sve may have the similar issue but it initialize the const
`0.0 / -5.0` in the test file to `+0.0` before pass to the function foo.
Then it will pass the execution test.

aarch64:
movi    v0.2s, #0x0
stp     x29, x30, [sp, #-16]!
mov     w0, #0xa
mov     x29, sp
bl      400280 <foo> <== s0 is +0.0

Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the
`-0.0`, and then pass it to the function foo. Of course it the execution
test will fail.

riscv:
flw     fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4>
addi    sp,sp,-16
li      a0,10
sd      ra,8(sp)
jal     101fc <foo>  <== fa0 is -0.0

After this patch the loops vectorized with a variable factor of the RVV
will be treated as XFAIL by the tree dump when riscv_v and
variable_vect_length.

The below configurations are validated as XFAIL for RV64.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

	* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with
	variable length.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/testsuite/gcc.dg/pr30957-1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..7a7242ec16d 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -3,7 +3,7 @@
    where each addition is a library call.  /
 /* { dg-require-effective-target hard_float } */
 /* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
+/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll -fdump-tree-vect-details" } */
 
 extern void abort (void);
 extern void exit (int);
@@ -34,3 +34,4 @@ main ()
 }
 
 /* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
+/* { dg-final { scan-tree-dump-times "Disabling unrolling due to variable-length vectorization factor" 1 "vect" { target { riscv_v } xfail { vect_variable_length } } } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-26  9:34 ` [PATCH v2] " pan2.li
@ 2023-12-28 16:39   ` Jeff Law
  2023-12-29  0:42     ` Li, Pan2
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Law @ 2023-12-28 16:39 UTC (permalink / raw)
  To: pan2.li, gcc-patches
  Cc: juzhe.zhong, yanzhang.wang, kito.cheng, richard.guenther



On 12/26/23 02:34, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> This patch would like to XFAIL the test case pr30957-1.c for the RVV when
> build the elf with some configurations (list at the end of the log)
> It will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
> Of course, it will do nothing during unroll_loops when loop->unroll is 1.
> 
> The aarch64_sve may have the similar issue but it initialize the const
> `0.0 / -5.0` in the test file to `+0.0` before pass to the function foo.
> Then it will pass the execution test.
> 
> aarch64:
> movi    v0.2s, #0x0
> stp     x29, x30, [sp, #-16]!
> mov     w0, #0xa
> mov     x29, sp
> bl      400280 <foo> <== s0 is +0.0
> 
> Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the
> `-0.0`, and then pass it to the function foo. Of course it the execution
> test will fail.
> 
> riscv:
> flw     fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4>
> addi    sp,sp,-16
> li      a0,10
> sd      ra,8(sp)
> jal     101fc <foo>  <== fa0 is -0.0
> 
> After this patch the loops vectorized with a variable factor of the RVV
> will be treated as XFAIL by the tree dump when riscv_v and
> variable_vect_length.
> 
> The below configurations are validated as XFAIL for RV64.
Interesting.  So I'd actually peel one more layer off this onion.  Why 
do the aarch64 and riscv targets generate different constants (0.0 vs 
-0.0)?

Is it possible that the aarch64 is generating 0.0 when asked for -0.0 
and -fno-signed-zeros is in effect?  That's a valid thing to do when 
-fno-signed-zeros is on.  Look for HONOR_SIGNED_ZEROs in the aarch64 
backend.



Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-28 16:39   ` Jeff Law
@ 2023-12-29  0:42     ` Li, Pan2
  2023-12-29  1:03       ` Jeff Law
  0 siblings, 1 reply; 23+ messages in thread
From: Li, Pan2 @ 2023-12-29  0:42 UTC (permalink / raw)
  To: Jeff Law, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

Thanks Jeff for comments, and Happy new year!

> Interesting.  So I'd actually peel one more layer off this onion.  Why 
> do the aarch64 and riscv targets generate different constants (0.0 vs 
> -0.0)?

Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize.

> Is it possible that the aarch64 is generating 0.0 when asked for -0.0 
> and -fno-signed-zeros is in effect?  That's a valid thing to do when 
> -fno-signed-zeros is on.  Look for HONOR_SIGNED_ZEROs in the aarch64 
> backend.

Sure, will have a try for making the -0.0 happen in aarch64.

Pan


-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Friday, December 29, 2023 12:39 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor



On 12/26/23 02:34, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> This patch would like to XFAIL the test case pr30957-1.c for the RVV when
> build the elf with some configurations (list at the end of the log)
> It will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling and mark the loop->unroll as 1.
> Of course, it will do nothing during unroll_loops when loop->unroll is 1.
> 
> The aarch64_sve may have the similar issue but it initialize the const
> `0.0 / -5.0` in the test file to `+0.0` before pass to the function foo.
> Then it will pass the execution test.
> 
> aarch64:
> movi    v0.2s, #0x0
> stp     x29, x30, [sp, #-16]!
> mov     w0, #0xa
> mov     x29, sp
> bl      400280 <foo> <== s0 is +0.0
> 
> Unfortunately, the riscv initialize the the const `0.0 / -5.0` to the
> `-0.0`, and then pass it to the function foo. Of course it the execution
> test will fail.
> 
> riscv:
> flw     fa0,388(gp) # 1299c <__SDATA_BEGIN__+0x4>
> addi    sp,sp,-16
> li      a0,10
> sd      ra,8(sp)
> jal     101fc <foo>  <== fa0 is -0.0
> 
> After this patch the loops vectorized with a variable factor of the RVV
> will be treated as XFAIL by the tree dump when riscv_v and
> variable_vect_length.
> 
> The below configurations are validated as XFAIL for RV64.
Interesting.  So I'd actually peel one more layer off this onion.  Why 
do the aarch64 and riscv targets generate different constants (0.0 vs 
-0.0)?

Is it possible that the aarch64 is generating 0.0 when asked for -0.0 
and -fno-signed-zeros is in effect?  That's a valid thing to do when 
-fno-signed-zeros is on.  Look for HONOR_SIGNED_ZEROs in the aarch64 
backend.



Jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-29  0:42     ` Li, Pan2
@ 2023-12-29  1:03       ` Jeff Law
  2023-12-29  5:56         ` Li, Pan2
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Law @ 2023-12-29  1:03 UTC (permalink / raw)
  To: Li, Pan2, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther



On 12/28/23 17:42, Li, Pan2 wrote:
> Thanks Jeff for comments, and Happy new year!
> 
>> Interesting.  So I'd actually peel one more layer off this onion.  Why
>> do the aarch64 and riscv targets generate different constants (0.0 vs
>> -0.0)?
> 
> Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize.
> 
>> Is it possible that the aarch64 is generating 0.0 when asked for -0.0
>> and -fno-signed-zeros is in effect?  That's a valid thing to do when
>> -fno-signed-zeros is on.  Look for HONOR_SIGNED_ZEROs in the aarch64
>> backend.
> 
> Sure, will have a try for making the -0.0 happen in aarch64.
I would first look at the .optimized dump, then I'd look at the .final 
dump alongside the resulting assembly for aarch64.

I bet we're going to find that the aarch64 target internally converts 
-0.0 to 0.0 when we're not honoring signed zeros.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-29  1:03       ` Jeff Law
@ 2023-12-29  5:56         ` Li, Pan2
  2023-12-30  3:13           ` Jeff Law
  0 siblings, 1 reply; 23+ messages in thread
From: Li, Pan2 @ 2023-12-29  5:56 UTC (permalink / raw)
  To: Jeff Law, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

Thanks Jeff.

I think I locate where aarch64 performs the trick here.

1. In the .final we have rtl like

(insn:TI 6 8 29 (set (reg:SF 32 v0)
        (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64}
     (nil))

2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if
the aarch64_reg_or_fp_zero is true.

1640 (define_insn "*mov<mode>_aarch64"
1641   [(set (match_operand:SFD 0 "nonimmediate_operand")
1642       match_operand:SFD 1 "general_operand"))]
1643   "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode)
1644     || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))"
1645   {@ [ cons: =0 , 1   ; attrs: type , arch  ]
1646      [ w        , Y   ; neon_move   , simd  ] movi\t%0.2s, #0

3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given.

10863 bool
10864 aarch64_float_const_zero_rtx_p (rtx x
10865 {
10866   /* 0.0 in Decimal Floating Point cannot be represented by #0 or
10867      zr as our callers expect, so no need to check the actual
10868      value if X is of Decimal Floating Point type.  */
10869   if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT)
10870     return false;
10871
10872   if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x)))
10873     return !HONOR_SIGNED_ZEROS (GET_MODE (x));
10874   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
10875 }

I think that explain why we have +0.0 in aarch64 here.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Friday, December 29, 2023 9:04 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor



On 12/28/23 17:42, Li, Pan2 wrote:
> Thanks Jeff for comments, and Happy new year!
> 
>> Interesting.  So I'd actually peel one more layer off this onion.  Why
>> do the aarch64 and riscv targets generate different constants (0.0 vs
>> -0.0)?
> 
> Yeah, it surprise me too when debugging the foo function. But didn't dig into it in previous as it may be unrelated to vectorize.
> 
>> Is it possible that the aarch64 is generating 0.0 when asked for -0.0
>> and -fno-signed-zeros is in effect?  That's a valid thing to do when
>> -fno-signed-zeros is on.  Look for HONOR_SIGNED_ZEROs in the aarch64
>> backend.
> 
> Sure, will have a try for making the -0.0 happen in aarch64.
I would first look at the .optimized dump, then I'd look at the .final 
dump alongside the resulting assembly for aarch64.

I bet we're going to find that the aarch64 target internally converts 
-0.0 to 0.0 when we're not honoring signed zeros.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-29  5:56         ` Li, Pan2
@ 2023-12-30  3:13           ` Jeff Law
  2024-01-01  8:56             ` Li, Pan2
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Law @ 2023-12-30  3:13 UTC (permalink / raw)
  To: Li, Pan2, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther



On 12/28/23 22:56, Li, Pan2 wrote:
> Thanks Jeff.
> 
> I think I locate where aarch64 performs the trick here.
> 
> 1. In the .final we have rtl like
> 
> (insn:TI 6 8 29 (set (reg:SF 32 v0)
>          (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64}
>       (nil))
> 
> 2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if
> the aarch64_reg_or_fp_zero is true.
> 
> 1640 (define_insn "*mov<mode>_aarch64"
> 1641   [(set (match_operand:SFD 0 "nonimmediate_operand")
> 1642       match_operand:SFD 1 "general_operand"))]
> 1643   "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode)
> 1644     || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))"
> 1645   {@ [ cons: =0 , 1   ; attrs: type , arch  ]
> 1646      [ w        , Y   ; neon_move   , simd  ] movi\t%0.2s, #0
> 
> 3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given.
> 
> 10863 bool
> 10864 aarch64_float_const_zero_rtx_p (rtx x
> 10865 {
> 10866   /* 0.0 in Decimal Floating Point cannot be represented by #0 or
> 10867      zr as our callers expect, so no need to check the actual
> 10868      value if X is of Decimal Floating Point type.  */
> 10869   if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT)
> 10870     return false;
> 10871
> 10872   if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x)))
> 10873     return !HONOR_SIGNED_ZEROS (GET_MODE (x));
> 10874   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
> 10875 }
> 
> I think that explain why we have +0.0 in aarch64 here.
Yup.  Thanks a ton for diving into this.  So I think that points us to 
the right fix, specifically we should be turning -0.0 into 0.0 when 
!HONOR_SIGNED_ZEROS rather than xfailing the test.

I think we'd need to adjust reg_or_0_operand and riscv_output_move, 
probably the G constraint as well.   We might also need to adjust 
move_operand and perhaps riscv_legitimize_move.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor
  2023-12-30  3:13           ` Jeff Law
@ 2024-01-01  8:56             ` Li, Pan2
  0 siblings, 0 replies; 23+ messages in thread
From: Li, Pan2 @ 2024-01-01  8:56 UTC (permalink / raw)
  To: Jeff Law, gcc-patches
  Cc: juzhe.zhong, Wang, Yanzhang, kito.cheng, richard.guenther

Thanks Jeff for the confirmation and suggestions. It looks like not a corner case for the option no-signed-zero.
Consider 2 sample function as below with build with option " -march=rv64gcv -mabi=lp64d -O2 -fno-signed-zeros".

void
__attribute__ ((noinline))
test_float_zero_assign_0 (float *a)
{
  *a = +0.0;
}

void
__attribute__ ((noinline))
test_float_zero_assign_1 (float *a)
{
  *a = -0.0;
}

For the first one (aka float 0.0) we have rtl as below:
(insn 6 3 0 2 (set (mem:SF (reg/v/f:DI 134 [ a ]) [1 *a_2(D)+0 S4 A32])
        (const_double:SF 0.0 [0x0.0p+0])) "test.c":14:6 -1
     (nil))

But for the second one (aka float -0.0 with no-signed-zero) we have rtl as below but we expect const_double -0.0 here.
(insn 6 3 7 2 (set (reg:DI 135
        (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "test.c":21:6 -1
     (nil))
(insn 7 6 8 2 (set (reg:SF 136)
        (mem/u/c:SF (lo_sum:DI (reg:DI 135)
                (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0  S4 A32])) "test.c":21:6 -1
     (nil))

I will have a try to fix it in V3.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Saturday, December 30, 2023 11:14 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; richard.guenther@gmail.com
Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor



On 12/28/23 22:56, Li, Pan2 wrote:
> Thanks Jeff.
> 
> I think I locate where aarch64 performs the trick here.
> 
> 1. In the .final we have rtl like
> 
> (insn:TI 6 8 29 (set (reg:SF 32 v0)
>          (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64}
>       (nil))
> 
> 2. the movsf_aarch64 comes from the aarch64.md file similar to the below rtl. Aka, it will generate movi\t%0.2s, #0 if
> the aarch64_reg_or_fp_zero is true.
> 
> 1640 (define_insn "*mov<mode>_aarch64"
> 1641   [(set (match_operand:SFD 0 "nonimmediate_operand")
> 1642       match_operand:SFD 1 "general_operand"))]
> 1643   "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode)
> 1644     || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))"
> 1645   {@ [ cons: =0 , 1   ; attrs: type , arch  ]
> 1646      [ w        , Y   ; neon_move   , simd  ] movi\t%0.2s, #0
> 
> 3. Then we will have aarch64_float_const_zero_rtx_p here, and the -0.0 input rtl will return true in line 10873 because of no-signed-zero is given.
> 
> 10863 bool
> 10864 aarch64_float_const_zero_rtx_p (rtx x
> 10865 {
> 10866   /* 0.0 in Decimal Floating Point cannot be represented by #0 or
> 10867      zr as our callers expect, so no need to check the actual
> 10868      value if X is of Decimal Floating Point type.  */
> 10869   if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT)
> 10870     return false;
> 10871
> 10872   if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x)))
> 10873     return !HONOR_SIGNED_ZEROS (GET_MODE (x));
> 10874   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
> 10875 }
> 
> I think that explain why we have +0.0 in aarch64 here.
Yup.  Thanks a ton for diving into this.  So I think that points us to 
the right fix, specifically we should be turning -0.0 into 0.0 when 
!HONOR_SIGNED_ZEROS rather than xfailing the test.

I think we'd need to adjust reg_or_0_operand and riscv_output_move, 
probably the G constraint as well.   We might also need to adjust 
move_operand and perhaps riscv_legitimize_move.

jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
  2023-12-23 17:19 ` Jeff Law
  2023-12-26  9:34 ` [PATCH v2] " pan2.li
@ 2024-01-02 11:55 ` pan2.li
  2024-01-08 10:45   ` Richard Biener
  2024-01-11  1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li
  2024-01-11  8:50 ` [PATCH v5] " pan2.li
  4 siblings, 1 reply; 23+ messages in thread
From: pan2.li @ 2024-01-02 11:55 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

According to the sematics of no-signed-zeros option, the backend
like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.

Consider below example with option -fno-signed-zeros.

void
test (float *a)
{
  *a = -0.0;
}

We will generate code as below, which doesn't treat the minus zero
as plus zero.

test:
  lui  a5,%hi(.LC0)
  flw  fa5,%lo(.LC0)(a5)
  fsw  fa5,0(a0)
  ret

.LC0:
  .word -2147483648 // aka -0.0 (0x80000000 in hex)

This patch would like to fix the bug and treat the minus zero -0.0
as plus zero, aka +0.0. Thus after this patch we will have asm code
as below for the above sampe code.

test:
  sw zero,0(a0)
  ret

This patch also fix the run failure of the test case pr30957-1.c. The
below tests are passed for this patch.

* The riscv regression tests.
* The pr30957-1.c run tests.

gcc/ChangeLog:

	* config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p
	for predicating the rtx is const zero float or not.
	* config/riscv/predicates.md: Ditto.
	* config/riscv/riscv.cc (riscv_const_insns): Ditto.
	(riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
	const zero float or not.
	(riscv_const_zero_rtx_p): New func impl for predicating the rtx
	is const zero (both int and fp) or not.
	* config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
	New func decl.
	(riscv_const_zero_rtx_p): Ditto.
	* config/riscv/riscv.md: Making sure the operand[1] of movfp is
	CONST0_RTX when the operand[1] is const zero float.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/no-signed-zeros-0.c: New test.
	* gcc.target/riscv/no-signed-zeros-1.c: New test.
	* gcc.target/riscv/no-signed-zeros-2.c: New test.
	* gcc.target/riscv/no-signed-zeros-3.c: New test.
	* gcc.target/riscv/no-signed-zeros-4.c: New test.
	* gcc.target/riscv/no-signed-zeros-5.c: New test.
	* gcc.target/riscv/no-signed-zeros-run-0.c: New test.
	* gcc.target/riscv/no-signed-zeros-run-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/config/riscv/constraints.md               |  2 +-
 gcc/config/riscv/predicates.md                |  2 +-
 gcc/config/riscv/riscv-protos.h               |  2 +
 gcc/config/riscv/riscv.cc                     | 35 ++++++++++++-
 gcc/config/riscv/riscv.md                     | 49 ++++++++++++++++---
 .../gcc.target/riscv/no-signed-zeros-0.c      | 26 ++++++++++
 .../gcc.target/riscv/no-signed-zeros-1.c      | 28 +++++++++++
 .../gcc.target/riscv/no-signed-zeros-2.c      | 26 ++++++++++
 .../gcc.target/riscv/no-signed-zeros-3.c      | 28 +++++++++++
 .../gcc.target/riscv/no-signed-zeros-4.c      | 26 ++++++++++
 .../gcc.target/riscv/no-signed-zeros-5.c      | 28 +++++++++++
 .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++++++++++++++
 .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++++++++++++++
 13 files changed, 314 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index de4359af00d..db1d5e1385f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -108,7 +108,7 @@ (define_constraint "DnS"
 (define_constraint "G"
   "@internal"
   (and (match_code "const_double")
-       (match_test "op == CONST0_RTX (mode)")))
+       (match_test "riscv_float_const_zero_rtx_p (op)")))
 
 (define_memory_constraint "A"
   "An address that is held in a general-purpose register."
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index b87a6900841..b428d842101 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
 
 (define_predicate "const_0_operand"
   (and (match_code "const_int,const_wide_int,const_double,const_vector")
-       (match_test "op == CONST0_RTX (GET_MODE (op))")))
+       (match_test "riscv_const_zero_rtx_p (op)")))
 
 (define_predicate "const_1_operand"
   (and (match_code "const_int,const_wide_int,const_vector")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 31049ef7523..fcf30e084a3 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
 extern bool
 riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
 extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
+extern bool riscv_float_const_zero_rtx_p (rtx);
+extern bool riscv_const_zero_rtx_p (rtx);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0d1cbc5cb5f..a8ad86b7068 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x)
 	return 1;
 
       /* We can use x0 to load floating-point zero.  */
-      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
+      return riscv_float_const_zero_rtx_p (x) ? 1 : 0;
     case CONST_VECTOR:
       {
 	/* TODO: This is not accurate, we will need to
@@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
 	 || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
 }
 
+/* Return true if rtx op is const zero of the floating-point.  */
+bool
+riscv_float_const_zero_rtx_p (rtx op)
+{
+  machine_mode mode = GET_MODE (op);
+
+  if (GET_MODE_CLASS (mode) != MODE_FLOAT)
+    return false;
+
+  if (GET_CODE (op) != CONST_DOUBLE)
+    return false;
+
+  const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op);
+
+  if (REAL_VALUE_MINUS_ZERO (*op_rvalue))
+    return !HONOR_SIGNED_ZEROS (mode);
+
+  return real_equal (op_rvalue, &dconst0);
+}
+
+/* Return true if rtx op is const zero, include both the integer
+   and floating-point.  */
+bool
+riscv_const_zero_rtx_p (rtx op)
+{
+  machine_mode mode = GET_MODE (op);
+
+  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
+    return riscv_float_const_zero_rtx_p (op);
+
+  return op == CONST0_RTX (mode);
+}
+
 /* Return true if it's valid gpr_save pattern.  */
 
 bool
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 68f7203b676..cd429f6dcdd 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat"
   "TARGET_ZFHMIN
    && (register_operand (operands[0], HFmode)
        || reg_or_0_operand (operands[1], HFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "HF")])
@@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat"
   "!TARGET_ZFHMIN
    && (register_operand (operands[0], HFmode)
        || reg_or_0_operand (operands[1], HFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
    (set_attr "type" "fmove")
    (set_attr "mode" "HF")])
@@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat"
   "TARGET_HARD_FLOAT
    && (register_operand (operands[0], SFmode)
        || reg_or_0_operand (operands[1], SFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "SF")])
@@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat"
   "!TARGET_HARD_FLOAT
    && (register_operand (operands[0], SFmode)
        || reg_or_0_operand (operands[1], SFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "SF")])
@@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32"
   "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "DF")])
@@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64"
   "TARGET_64BIT && TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "DF")])
@@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat"
   "!TARGET_DOUBLE_FLOAT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
-  { return riscv_output_move (operands[0], operands[1]); }
+  {
+    if (riscv_float_const_zero_rtx_p (operands[1]))
+      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
+
+    return riscv_output_move (operands[0], operands[1]);
+  }
   [(set_attr "move_type" "move,load,store")
    (set_attr "type" "fmove")
    (set_attr "mode" "DF")])
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
new file mode 100644
index 00000000000..1eda13a3406
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** test_float_plus_zero_assign:
+** sw\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_plus_zero_assign (float *a)
+{
+  *a = +0.0;
+}
+
+/*
+** test_float_plus_zero_assign:
+** sw\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_minus_zero_assign (float *a)
+{
+  *a = -0.0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
new file mode 100644
index 00000000000..8041ec3ea95
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+extern void float_assign (float *a, float b);
+
+/*
+** test_float_plus_zero_assign:
+** fmv\.s\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_plus_zero_assign (float *a)
+{
+  float_assign (a, +0.0);
+}
+
+/*
+** test_float_minus_zero_assign:
+** fmv\.s\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_minus_zero_assign (float *a)
+{
+  float_assign (a, -0.0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
new file mode 100644
index 00000000000..a6996dae4de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** test_float_plus_zero_assign:
+** sd\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_plus_zero_assign (double *a)
+{
+  *a = +0.0;
+}
+
+/*
+** test_float_plus_zero_assign:
+** sd\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_minus_zero_assign (double *a)
+{
+  *a = -0.0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
new file mode 100644
index 00000000000..b4ba8a247df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+extern void float_assign (double *a, double b);
+
+/*
+** test_float_plus_zero_assign:
+** fmv\.d\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_plus_zero_assign (double *a)
+{
+  float_assign (a, +0.0);
+}
+
+/*
+** test_float_minus_zero_assign:
+** fmv\.d\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_minus_zero_assign (double *a)
+{
+  float_assign (a, -0.0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
new file mode 100644
index 00000000000..60acf7155d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** test_float_plus_zero_assign:
+** sh\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_plus_zero_assign (_Float16 *a)
+{
+  *a = +0.0;
+}
+
+/*
+** test_float_plus_zero_assign:
+** sh\s+zero,0\([atx][0-9]+\)
+** ret
+*/
+void
+test_float_minus_zero_assign (_Float16 *a)
+{
+  *a = -0.0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
new file mode 100644
index 00000000000..d10efbeb37b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+extern void float_assign (_Float16 *a, _Float16 b);
+
+/*
+** test_float_plus_zero_assign:
+** fmv\.h\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_plus_zero_assign (_Float16 *a)
+{
+  float_assign (a, +0.0);
+}
+
+/*
+** test_float_minus_zero_assign:
+** fmv\.h\.x\s+fa[0-9]+,\s*zero
+** tail\s+float_assign
+*/
+void
+test_float_minus_zero_assign (_Float16 *a)
+{
+  float_assign (a, -0.0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
new file mode 100644
index 00000000000..347e4b0ff74
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
@@ -0,0 +1,36 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
+
+float f_val;
+
+void
+__attribute__ ((noinline))
+test_float_plus_zero_assign (float *a)
+{
+  *a = +0.0;
+}
+
+void
+__attribute__ ((noinline))
+test_float_minus_zero_assign (float *a)
+{
+  *a = -0.0;
+}
+
+int
+main ()
+{
+  f_val = -1.0;
+  test_float_plus_zero_assign (&f_val);
+
+  if (__builtin_copysignf (1.0, f_val) != 1.0)
+    __builtin_abort ();
+
+  f_val = -1.0;
+  test_float_minus_zero_assign (&f_val);
+
+  if (__builtin_copysignf (1.0, f_val) != 1.0)
+    __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
new file mode 100644
index 00000000000..1eb1edba457
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
@@ -0,0 +1,36 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
+
+double f_val;
+
+void
+__attribute__ ((noinline))
+test_float_plus_zero_assign (double *a)
+{
+  *a = +0.0;
+}
+
+void
+__attribute__ ((noinline))
+test_float_minus_zero_assign (double *a)
+{
+  *a = -0.0;
+}
+
+int
+main ()
+{
+  f_val = -1.0;
+  test_float_plus_zero_assign (&f_val);
+
+  if (__builtin_copysignf (1.0, f_val) != 1.0)
+    __builtin_abort ();
+
+  f_val = -1.0;
+  test_float_minus_zero_assign (&f_val);
+
+  if (__builtin_copysignf (1.0, f_val) != 1.0)
+    __builtin_abort ();
+
+  return 0;
+}
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li
@ 2024-01-08 10:45   ` Richard Biener
  2024-01-09  1:22     ` Li, Pan2
  2024-01-09 17:46     ` Jeff Law
  0 siblings, 2 replies; 23+ messages in thread
From: Richard Biener @ 2024-01-08 10:45 UTC (permalink / raw)
  To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw

On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> According to the sematics of no-signed-zeros option, the backend
> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>
> Consider below example with option -fno-signed-zeros.
>
> void
> test (float *a)
> {
>   *a = -0.0;
> }
>
> We will generate code as below, which doesn't treat the minus zero
> as plus zero.
>
> test:
>   lui  a5,%hi(.LC0)
>   flw  fa5,%lo(.LC0)(a5)
>   fsw  fa5,0(a0)
>   ret
>
> .LC0:
>   .word -2147483648 // aka -0.0 (0x80000000 in hex)
>
> This patch would like to fix the bug and treat the minus zero -0.0
> as plus zero, aka +0.0. Thus after this patch we will have asm code
> as below for the above sampe code.
>
> test:
>   sw zero,0(a0)
>   ret
>
> This patch also fix the run failure of the test case pr30957-1.c. The
> below tests are passed for this patch.

We don't really expect targets to do this.  The small testcase above
is somewhat ill-formed with -fno-signed-zeros.  Note there's no
-0.0 in pr30957-1.c so why does that one fail for you?  Does
the -fvariable-expansion-in-unroller code maybe not trigger for
riscv?

I think we should go to PR30957 and see what that was filed originally
for, the testcase doesn't make much sense to me.

> * The riscv regression tests.
> * The pr30957-1.c run tests.
>
> gcc/ChangeLog:
>
>         * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p
>         for predicating the rtx is const zero float or not.
>         * config/riscv/predicates.md: Ditto.
>         * config/riscv/riscv.cc (riscv_const_insns): Ditto.
>         (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
>         const zero float or not.
>         (riscv_const_zero_rtx_p): New func impl for predicating the rtx
>         is const zero (both int and fp) or not.
>         * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
>         New func decl.
>         (riscv_const_zero_rtx_p): Ditto.
>         * config/riscv/riscv.md: Making sure the operand[1] of movfp is
>         CONST0_RTX when the operand[1] is const zero float.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/no-signed-zeros-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-1.c: New test.
>         * gcc.target/riscv/no-signed-zeros-2.c: New test.
>         * gcc.target/riscv/no-signed-zeros-3.c: New test.
>         * gcc.target/riscv/no-signed-zeros-4.c: New test.
>         * gcc.target/riscv/no-signed-zeros-5.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-1.c: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/config/riscv/constraints.md               |  2 +-
>  gcc/config/riscv/predicates.md                |  2 +-
>  gcc/config/riscv/riscv-protos.h               |  2 +
>  gcc/config/riscv/riscv.cc                     | 35 ++++++++++++-
>  gcc/config/riscv/riscv.md                     | 49 ++++++++++++++++---
>  .../gcc.target/riscv/no-signed-zeros-0.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-1.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-2.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-3.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-4.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-5.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++++++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++++++++++++++
>  13 files changed, 314 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
>
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index de4359af00d..db1d5e1385f 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -108,7 +108,7 @@ (define_constraint "DnS"
>  (define_constraint "G"
>    "@internal"
>    (and (match_code "const_double")
> -       (match_test "op == CONST0_RTX (mode)")))
> +       (match_test "riscv_float_const_zero_rtx_p (op)")))
>
>  (define_memory_constraint "A"
>    "An address that is held in a general-purpose register."
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index b87a6900841..b428d842101 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
>
>  (define_predicate "const_0_operand"
>    (and (match_code "const_int,const_wide_int,const_double,const_vector")
> -       (match_test "op == CONST0_RTX (GET_MODE (op))")))
> +       (match_test "riscv_const_zero_rtx_p (op)")))
>
>  (define_predicate "const_1_operand"
>    (and (match_code "const_int,const_wide_int,const_vector")
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 31049ef7523..fcf30e084a3 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
>  extern bool
>  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> +extern bool riscv_float_const_zero_rtx_p (rtx);
> +extern bool riscv_const_zero_rtx_p (rtx);
>
>  #ifdef RTX_CODE
>  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 0d1cbc5cb5f..a8ad86b7068 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x)
>         return 1;
>
>        /* We can use x0 to load floating-point zero.  */
> -      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> +      return riscv_float_const_zero_rtx_p (x) ? 1 : 0;
>      case CONST_VECTOR:
>        {
>         /* TODO: This is not accurate, we will need to
> @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
>          || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
>  }
>
> +/* Return true if rtx op is const zero of the floating-point.  */
> +bool
> +riscv_float_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) != MODE_FLOAT)
> +    return false;
> +
> +  if (GET_CODE (op) != CONST_DOUBLE)
> +    return false;
> +
> +  const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op);
> +
> +  if (REAL_VALUE_MINUS_ZERO (*op_rvalue))
> +    return !HONOR_SIGNED_ZEROS (mode);
> +
> +  return real_equal (op_rvalue, &dconst0);
> +}
> +
> +/* Return true if rtx op is const zero, include both the integer
> +   and floating-point.  */
> +bool
> +riscv_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
> +    return riscv_float_const_zero_rtx_p (op);
> +
> +  return op == CONST0_RTX (mode);
> +}
> +
>  /* Return true if it's valid gpr_save pattern.  */
>
>  bool
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 68f7203b676..cd429f6dcdd 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat"
>    "TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat"
>    "!TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat"
>    "TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat"
>    "!TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32"
>    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64"
>    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat"
>    "!TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> new file mode 100644
> index 00000000000..1eda13a3406
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> new file mode 100644
> index 00000000000..8041ec3ea95
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (float *a, float b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> new file mode 100644
> index 00000000000..a6996dae4de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> new file mode 100644
> index 00000000000..b4ba8a247df
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (double *a, double b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> new file mode 100644
> index 00000000000..60acf7155d3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> new file mode 100644
> index 00000000000..d10efbeb37b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (_Float16 *a, _Float16 b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> new file mode 100644
> index 00000000000..347e4b0ff74
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +float f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> new file mode 100644
> index 00000000000..1eb1edba457
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +double f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-08 10:45   ` Richard Biener
@ 2024-01-09  1:22     ` Li, Pan2
  2024-01-09  7:17       ` Li, Pan2
  2024-01-09 17:46     ` Jeff Law
  1 sibling, 1 reply; 23+ messages in thread
From: Li, Pan2 @ 2024-01-09  1:22 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw

Thanks Richard B for comments.

> We don't really expect targets to do this.  The small testcase above
> is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> -0.0 in pr30957-1.c so why does that one fail for you?  Does
> the -fvariable-expansion-in-unroller code maybe not trigger for
> riscv?

Sorry this confused me a little about the sematics of the option -fno-signed-zeros.
I wonder what the target/backend need to do for this option.

About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64.

  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
    abort ();

If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below:

Disabling unrolling due to variable-length vectorization factor.

> I think we should go to PR30957 and see what that was filed originally
> for, the testcase doesn't make much sense to me.

Sure thing, will take a look and back to you later.

Pan

-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Monday, January 8, 2024 6:45 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> According to the sematics of no-signed-zeros option, the backend
> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>
> Consider below example with option -fno-signed-zeros.
>
> void
> test (float *a)
> {
>   *a = -0.0;
> }
>
> We will generate code as below, which doesn't treat the minus zero
> as plus zero.
>
> test:
>   lui  a5,%hi(.LC0)
>   flw  fa5,%lo(.LC0)(a5)
>   fsw  fa5,0(a0)
>   ret
>
> .LC0:
>   .word -2147483648 // aka -0.0 (0x80000000 in hex)
>
> This patch would like to fix the bug and treat the minus zero -0.0
> as plus zero, aka +0.0. Thus after this patch we will have asm code
> as below for the above sampe code.
>
> test:
>   sw zero,0(a0)
>   ret
>
> This patch also fix the run failure of the test case pr30957-1.c. The
> below tests are passed for this patch.

We don't really expect targets to do this.  The small testcase above
is somewhat ill-formed with -fno-signed-zeros.  Note there's no
-0.0 in pr30957-1.c so why does that one fail for you?  Does
the -fvariable-expansion-in-unroller code maybe not trigger for
riscv?

I think we should go to PR30957 and see what that was filed originally
for, the testcase doesn't make much sense to me.

> * The riscv regression tests.
> * The pr30957-1.c run tests.
>
> gcc/ChangeLog:
>
>         * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p
>         for predicating the rtx is const zero float or not.
>         * config/riscv/predicates.md: Ditto.
>         * config/riscv/riscv.cc (riscv_const_insns): Ditto.
>         (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
>         const zero float or not.
>         (riscv_const_zero_rtx_p): New func impl for predicating the rtx
>         is const zero (both int and fp) or not.
>         * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
>         New func decl.
>         (riscv_const_zero_rtx_p): Ditto.
>         * config/riscv/riscv.md: Making sure the operand[1] of movfp is
>         CONST0_RTX when the operand[1] is const zero float.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/no-signed-zeros-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-1.c: New test.
>         * gcc.target/riscv/no-signed-zeros-2.c: New test.
>         * gcc.target/riscv/no-signed-zeros-3.c: New test.
>         * gcc.target/riscv/no-signed-zeros-4.c: New test.
>         * gcc.target/riscv/no-signed-zeros-5.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-1.c: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/config/riscv/constraints.md               |  2 +-
>  gcc/config/riscv/predicates.md                |  2 +-
>  gcc/config/riscv/riscv-protos.h               |  2 +
>  gcc/config/riscv/riscv.cc                     | 35 ++++++++++++-
>  gcc/config/riscv/riscv.md                     | 49 ++++++++++++++++---
>  .../gcc.target/riscv/no-signed-zeros-0.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-1.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-2.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-3.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-4.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-5.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++++++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++++++++++++++
>  13 files changed, 314 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
>
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index de4359af00d..db1d5e1385f 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -108,7 +108,7 @@ (define_constraint "DnS"
>  (define_constraint "G"
>    "@internal"
>    (and (match_code "const_double")
> -       (match_test "op == CONST0_RTX (mode)")))
> +       (match_test "riscv_float_const_zero_rtx_p (op)")))
>
>  (define_memory_constraint "A"
>    "An address that is held in a general-purpose register."
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index b87a6900841..b428d842101 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
>
>  (define_predicate "const_0_operand"
>    (and (match_code "const_int,const_wide_int,const_double,const_vector")
> -       (match_test "op == CONST0_RTX (GET_MODE (op))")))
> +       (match_test "riscv_const_zero_rtx_p (op)")))
>
>  (define_predicate "const_1_operand"
>    (and (match_code "const_int,const_wide_int,const_vector")
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 31049ef7523..fcf30e084a3 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
>  extern bool
>  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> +extern bool riscv_float_const_zero_rtx_p (rtx);
> +extern bool riscv_const_zero_rtx_p (rtx);
>
>  #ifdef RTX_CODE
>  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 0d1cbc5cb5f..a8ad86b7068 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x)
>         return 1;
>
>        /* We can use x0 to load floating-point zero.  */
> -      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> +      return riscv_float_const_zero_rtx_p (x) ? 1 : 0;
>      case CONST_VECTOR:
>        {
>         /* TODO: This is not accurate, we will need to
> @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
>          || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
>  }
>
> +/* Return true if rtx op is const zero of the floating-point.  */
> +bool
> +riscv_float_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) != MODE_FLOAT)
> +    return false;
> +
> +  if (GET_CODE (op) != CONST_DOUBLE)
> +    return false;
> +
> +  const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op);
> +
> +  if (REAL_VALUE_MINUS_ZERO (*op_rvalue))
> +    return !HONOR_SIGNED_ZEROS (mode);
> +
> +  return real_equal (op_rvalue, &dconst0);
> +}
> +
> +/* Return true if rtx op is const zero, include both the integer
> +   and floating-point.  */
> +bool
> +riscv_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
> +    return riscv_float_const_zero_rtx_p (op);
> +
> +  return op == CONST0_RTX (mode);
> +}
> +
>  /* Return true if it's valid gpr_save pattern.  */
>
>  bool
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 68f7203b676..cd429f6dcdd 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat"
>    "TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat"
>    "!TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat"
>    "TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat"
>    "!TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32"
>    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64"
>    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat"
>    "!TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> new file mode 100644
> index 00000000000..1eda13a3406
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> new file mode 100644
> index 00000000000..8041ec3ea95
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (float *a, float b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> new file mode 100644
> index 00000000000..a6996dae4de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> new file mode 100644
> index 00000000000..b4ba8a247df
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (double *a, double b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> new file mode 100644
> index 00000000000..60acf7155d3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> new file mode 100644
> index 00000000000..d10efbeb37b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (_Float16 *a, _Float16 b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> new file mode 100644
> index 00000000000..347e4b0ff74
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +float f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> new file mode 100644
> index 00000000000..1eb1edba457
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +double f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-09  1:22     ` Li, Pan2
@ 2024-01-09  7:17       ` Li, Pan2
  2024-01-09 13:08         ` Richard Biener
  0 siblings, 1 reply; 23+ messages in thread
From: Li, Pan2 @ 2024-01-09  7:17 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	jeffreyalaw, eres, tom

The test case pr30957-1.c first comes from this commit about 19 years ago which expect the -1.0 for testing.

https://github.com/gcc-mirror/gcc/commit/290358f770d21d9204ea621f839ee8fba606a275

Then the below commit changes from -1.0 to +1.0 for this test file only, because of the instantiates copy(s) of the
accumulator which it initializes with +0.0.

https://github.com/gcc-mirror/gcc/commit/ffefa9288ab95b06b1dfed95e7235f4c09619a91

According to the implementation details of insert_var_expansion_initialization. The zero_init will be
the CONST0_RTX (mode) if HONOR_SIGNED_ZEROS is false. If my understanding is correct, maybe the test
case is not well designed for the variable expanding in unrolling? At least it is not good idea to mix/rely on
the HONOR_SIGNED_ZEROS when testing variable-expansion-in-unroller.

CC the original author and please feel free to correct me if any misunderstanding.

Pan

-----Original Message-----
From: Li, Pan2 
Sent: Tuesday, January 9, 2024 9:22 AM
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
Subject: RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

Thanks Richard B for comments.

> We don't really expect targets to do this.  The small testcase above
> is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> -0.0 in pr30957-1.c so why does that one fail for you?  Does
> the -fvariable-expansion-in-unroller code maybe not trigger for
> riscv?

Sorry this confused me a little about the sematics of the option -fno-signed-zeros.
I wonder what the target/backend need to do for this option.

About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64.

  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
    abort ();

If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below:

Disabling unrolling due to variable-length vectorization factor.

> I think we should go to PR30957 and see what that was filed originally
> for, the testcase doesn't make much sense to me.

Sure thing, will take a look and back to you later.

Pan

-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Monday, January 8, 2024 6:45 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> According to the sematics of no-signed-zeros option, the backend
> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>
> Consider below example with option -fno-signed-zeros.
>
> void
> test (float *a)
> {
>   *a = -0.0;
> }
>
> We will generate code as below, which doesn't treat the minus zero
> as plus zero.
>
> test:
>   lui  a5,%hi(.LC0)
>   flw  fa5,%lo(.LC0)(a5)
>   fsw  fa5,0(a0)
>   ret
>
> .LC0:
>   .word -2147483648 // aka -0.0 (0x80000000 in hex)
>
> This patch would like to fix the bug and treat the minus zero -0.0
> as plus zero, aka +0.0. Thus after this patch we will have asm code
> as below for the above sampe code.
>
> test:
>   sw zero,0(a0)
>   ret
>
> This patch also fix the run failure of the test case pr30957-1.c. The
> below tests are passed for this patch.

We don't really expect targets to do this.  The small testcase above
is somewhat ill-formed with -fno-signed-zeros.  Note there's no
-0.0 in pr30957-1.c so why does that one fail for you?  Does
the -fvariable-expansion-in-unroller code maybe not trigger for
riscv?

I think we should go to PR30957 and see what that was filed originally
for, the testcase doesn't make much sense to me.

> * The riscv regression tests.
> * The pr30957-1.c run tests.
>
> gcc/ChangeLog:
>
>         * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p
>         for predicating the rtx is const zero float or not.
>         * config/riscv/predicates.md: Ditto.
>         * config/riscv/riscv.cc (riscv_const_insns): Ditto.
>         (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
>         const zero float or not.
>         (riscv_const_zero_rtx_p): New func impl for predicating the rtx
>         is const zero (both int and fp) or not.
>         * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
>         New func decl.
>         (riscv_const_zero_rtx_p): Ditto.
>         * config/riscv/riscv.md: Making sure the operand[1] of movfp is
>         CONST0_RTX when the operand[1] is const zero float.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/no-signed-zeros-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-1.c: New test.
>         * gcc.target/riscv/no-signed-zeros-2.c: New test.
>         * gcc.target/riscv/no-signed-zeros-3.c: New test.
>         * gcc.target/riscv/no-signed-zeros-4.c: New test.
>         * gcc.target/riscv/no-signed-zeros-5.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-0.c: New test.
>         * gcc.target/riscv/no-signed-zeros-run-1.c: New test.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/config/riscv/constraints.md               |  2 +-
>  gcc/config/riscv/predicates.md                |  2 +-
>  gcc/config/riscv/riscv-protos.h               |  2 +
>  gcc/config/riscv/riscv.cc                     | 35 ++++++++++++-
>  gcc/config/riscv/riscv.md                     | 49 ++++++++++++++++---
>  .../gcc.target/riscv/no-signed-zeros-0.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-1.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-2.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-3.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-4.c      | 26 ++++++++++
>  .../gcc.target/riscv/no-signed-zeros-5.c      | 28 +++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++++++++++++++
>  .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++++++++++++++
>  13 files changed, 314 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
>
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index de4359af00d..db1d5e1385f 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -108,7 +108,7 @@ (define_constraint "DnS"
>  (define_constraint "G"
>    "@internal"
>    (and (match_code "const_double")
> -       (match_test "op == CONST0_RTX (mode)")))
> +       (match_test "riscv_float_const_zero_rtx_p (op)")))
>
>  (define_memory_constraint "A"
>    "An address that is held in a general-purpose register."
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index b87a6900841..b428d842101 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
>
>  (define_predicate "const_0_operand"
>    (and (match_code "const_int,const_wide_int,const_double,const_vector")
> -       (match_test "op == CONST0_RTX (GET_MODE (op))")))
> +       (match_test "riscv_const_zero_rtx_p (op)")))
>
>  (define_predicate "const_1_operand"
>    (and (match_code "const_int,const_wide_int,const_vector")
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 31049ef7523..fcf30e084a3 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
>  extern bool
>  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> +extern bool riscv_float_const_zero_rtx_p (rtx);
> +extern bool riscv_const_zero_rtx_p (rtx);
>
>  #ifdef RTX_CODE
>  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 0d1cbc5cb5f..a8ad86b7068 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x)
>         return 1;
>
>        /* We can use x0 to load floating-point zero.  */
> -      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> +      return riscv_float_const_zero_rtx_p (x) ? 1 : 0;
>      case CONST_VECTOR:
>        {
>         /* TODO: This is not accurate, we will need to
> @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
>          || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
>  }
>
> +/* Return true if rtx op is const zero of the floating-point.  */
> +bool
> +riscv_float_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) != MODE_FLOAT)
> +    return false;
> +
> +  if (GET_CODE (op) != CONST_DOUBLE)
> +    return false;
> +
> +  const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op);
> +
> +  if (REAL_VALUE_MINUS_ZERO (*op_rvalue))
> +    return !HONOR_SIGNED_ZEROS (mode);
> +
> +  return real_equal (op_rvalue, &dconst0);
> +}
> +
> +/* Return true if rtx op is const zero, include both the integer
> +   and floating-point.  */
> +bool
> +riscv_const_zero_rtx_p (rtx op)
> +{
> +  machine_mode mode = GET_MODE (op);
> +
> +  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
> +    return riscv_float_const_zero_rtx_p (op);
> +
> +  return op == CONST0_RTX (mode);
> +}
> +
>  /* Return true if it's valid gpr_save pattern.  */
>
>  bool
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 68f7203b676..cd429f6dcdd 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat"
>    "TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat"
>    "!TARGET_ZFHMIN
>     && (register_operand (operands[0], HFmode)
>         || reg_or_0_operand (operands[1], HFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "HF")])
> @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat"
>    "TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat"
>    "!TARGET_HARD_FLOAT
>     && (register_operand (operands[0], SFmode)
>         || reg_or_0_operand (operands[1], SFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "SF")])
> @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32"
>    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64"
>    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat"
>    "!TARGET_DOUBLE_FLOAT
>     && (register_operand (operands[0], DFmode)
>         || reg_or_0_operand (operands[1], DFmode))"
> -  { return riscv_output_move (operands[0], operands[1]); }
> +  {
> +    if (riscv_float_const_zero_rtx_p (operands[1]))
> +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> +
> +    return riscv_output_move (operands[0], operands[1]);
> +  }
>    [(set_attr "move_type" "move,load,store")
>     (set_attr "type" "fmove")
>     (set_attr "mode" "DF")])
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> new file mode 100644
> index 00000000000..1eda13a3406
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sw\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> new file mode 100644
> index 00000000000..8041ec3ea95
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (float *a, float b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (float *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (float *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> new file mode 100644
> index 00000000000..a6996dae4de
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sd\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> new file mode 100644
> index 00000000000..b4ba8a247df
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (double *a, double b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (double *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (double *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> new file mode 100644
> index 00000000000..60acf7155d3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  *a = +0.0;
> +}
> +
> +/*
> +** test_float_plus_zero_assign:
> +** sh\s+zero,0\([atx][0-9]+\)
> +** ret
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  *a = -0.0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> new file mode 100644
> index 00000000000..d10efbeb37b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +extern void float_assign (_Float16 *a, _Float16 b);
> +
> +/*
> +** test_float_plus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_plus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, +0.0);
> +}
> +
> +/*
> +** test_float_minus_zero_assign:
> +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> +** tail\s+float_assign
> +*/
> +void
> +test_float_minus_zero_assign (_Float16 *a)
> +{
> +  float_assign (a, -0.0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> new file mode 100644
> index 00000000000..347e4b0ff74
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +float f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (float *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (float *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> new file mode 100644
> index 00000000000..1eb1edba457
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> @@ -0,0 +1,36 @@
> +/* { dg-do run { target { riscv_v } } } */
> +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> +
> +double f_val;
> +
> +void
> +__attribute__ ((noinline))
> +test_float_plus_zero_assign (double *a)
> +{
> +  *a = +0.0;
> +}
> +
> +void
> +__attribute__ ((noinline))
> +test_float_minus_zero_assign (double *a)
> +{
> +  *a = -0.0;
> +}
> +
> +int
> +main ()
> +{
> +  f_val = -1.0;
> +  test_float_plus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  f_val = -1.0;
> +  test_float_minus_zero_assign (&f_val);
> +
> +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-09  7:17       ` Li, Pan2
@ 2024-01-09 13:08         ` Richard Biener
  0 siblings, 0 replies; 23+ messages in thread
From: Richard Biener @ 2024-01-09 13:08 UTC (permalink / raw)
  To: Li, Pan2
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng,
	jeffreyalaw, eres, tdevries

On Tue, Jan 9, 2024 at 8:17 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> The test case pr30957-1.c first comes from this commit about 19 years ago which expect the -1.0 for testing.
>
> https://github.com/gcc-mirror/gcc/commit/290358f770d21d9204ea621f839ee8fba606a275
>
> Then the below commit changes from -1.0 to +1.0 for this test file only, because of the instantiates copy(s) of the
> accumulator which it initializes with +0.0.
>
> https://github.com/gcc-mirror/gcc/commit/ffefa9288ab95b06b1dfed95e7235f4c09619a91

This second change looks bogus - it makes the testcase not
well-defined by specifying
-fno-signed-zeros but expecting defined behavior when signed zeros are
involved.  It might be
that it's currently not possible to trigger the "bad" behavior but
then the testcase should still
not runfail.  I suggest to instead remove -funsafe-math-optimizations
from the original option set.

> According to the implementation details of insert_var_expansion_initialization. The zero_init will be
> the CONST0_RTX (mode) if HONOR_SIGNED_ZEROS is false. If my understanding is correct, maybe the test
> case is not well designed for the variable expanding in unrolling? At least it is not good idea to mix/rely on
> the HONOR_SIGNED_ZEROS when testing variable-expansion-in-unroller.

For the first change I think insert_var_expansion_initialization
behavior shouldn't depend on
HONOR_SIGNED_ZEROS but instead check MODE_HAS_SINGED_ZEROS as -0.0 is
always the correct initial value to use.  Note that this will likely
break the testcase expectation
when it's not changed, and the testcase would need to be guarded for
targets not supporting
signed zeros.

Richard.

> CC the original author and please feel free to correct me if any misunderstanding.
>
> Pan
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Tuesday, January 9, 2024 9:22 AM
> To: Richard Biener <richard.guenther@gmail.com>
> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
> Subject: RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
>
> Thanks Richard B for comments.
>
> > We don't really expect targets to do this.  The small testcase above
> > is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> > -0.0 in pr30957-1.c so why does that one fail for you?  Does
> > the -fvariable-expansion-in-unroller code maybe not trigger for
> > riscv?
>
> Sorry this confused me a little about the sematics of the option -fno-signed-zeros.
> I wonder what the target/backend need to do for this option.
>
> About the failure, it comes from below code in pr30957-1.c. The 0.0 / -5.0 is initialized to -0.0 in riscv but +0.0 in aarch64.
>
>   if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
>     abort ();
>
> If my understanding is correct, the loop will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling and mark the loop->unroll as 1, and then we have tree-vect log similar to below:
>
> Disabling unrolling due to variable-length vectorization factor.
>
> > I think we should go to PR30957 and see what that was filed originally
> > for, the testcase doesn't make much sense to me.
>
> Sure thing, will take a look and back to you later.
>
> Pan
>
> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Monday, January 8, 2024 6:45 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
> Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
>
> On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
> >
> > From: Pan Li <pan2.li@intel.com>
> >
> > According to the sematics of no-signed-zeros option, the backend
> > like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
> >
> > Consider below example with option -fno-signed-zeros.
> >
> > void
> > test (float *a)
> > {
> >   *a = -0.0;
> > }
> >
> > We will generate code as below, which doesn't treat the minus zero
> > as plus zero.
> >
> > test:
> >   lui  a5,%hi(.LC0)
> >   flw  fa5,%lo(.LC0)(a5)
> >   fsw  fa5,0(a0)
> >   ret
> >
> > .LC0:
> >   .word -2147483648 // aka -0.0 (0x80000000 in hex)
> >
> > This patch would like to fix the bug and treat the minus zero -0.0
> > as plus zero, aka +0.0. Thus after this patch we will have asm code
> > as below for the above sampe code.
> >
> > test:
> >   sw zero,0(a0)
> >   ret
> >
> > This patch also fix the run failure of the test case pr30957-1.c. The
> > below tests are passed for this patch.
>
> We don't really expect targets to do this.  The small testcase above
> is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> -0.0 in pr30957-1.c so why does that one fail for you?  Does
> the -fvariable-expansion-in-unroller code maybe not trigger for
> riscv?
>
> I think we should go to PR30957 and see what that was filed originally
> for, the testcase doesn't make much sense to me.
>
> > * The riscv regression tests.
> > * The pr30957-1.c run tests.
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/constraints.md: Leverage func riscv_float_const_zero_rtx_p
> >         for predicating the rtx is const zero float or not.
> >         * config/riscv/predicates.md: Ditto.
> >         * config/riscv/riscv.cc (riscv_const_insns): Ditto.
> >         (riscv_float_const_zero_rtx_p): New func impl for predicating the rtx is
> >         const zero float or not.
> >         (riscv_const_zero_rtx_p): New func impl for predicating the rtx
> >         is const zero (both int and fp) or not.
> >         * config/riscv/riscv-protos.h (riscv_float_const_zero_rtx_p):
> >         New func decl.
> >         (riscv_const_zero_rtx_p): Ditto.
> >         * config/riscv/riscv.md: Making sure the operand[1] of movfp is
> >         CONST0_RTX when the operand[1] is const zero float.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/no-signed-zeros-0.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-1.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-2.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-3.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-4.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-5.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-run-0.c: New test.
> >         * gcc.target/riscv/no-signed-zeros-run-1.c: New test.
> >
> > Signed-off-by: Pan Li <pan2.li@intel.com>
> > ---
> >  gcc/config/riscv/constraints.md               |  2 +-
> >  gcc/config/riscv/predicates.md                |  2 +-
> >  gcc/config/riscv/riscv-protos.h               |  2 +
> >  gcc/config/riscv/riscv.cc                     | 35 ++++++++++++-
> >  gcc/config/riscv/riscv.md                     | 49 ++++++++++++++++---
> >  .../gcc.target/riscv/no-signed-zeros-0.c      | 26 ++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-1.c      | 28 +++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-2.c      | 26 ++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-3.c      | 28 +++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-4.c      | 26 ++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-5.c      | 28 +++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-run-0.c  | 36 ++++++++++++++
> >  .../gcc.target/riscv/no-signed-zeros-run-1.c  | 36 ++++++++++++++
> >  13 files changed, 314 insertions(+), 10 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> >
> > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> > index de4359af00d..db1d5e1385f 100644
> > --- a/gcc/config/riscv/constraints.md
> > +++ b/gcc/config/riscv/constraints.md
> > @@ -108,7 +108,7 @@ (define_constraint "DnS"
> >  (define_constraint "G"
> >    "@internal"
> >    (and (match_code "const_double")
> > -       (match_test "op == CONST0_RTX (mode)")))
> > +       (match_test "riscv_float_const_zero_rtx_p (op)")))
> >
> >  (define_memory_constraint "A"
> >    "An address that is held in a general-purpose register."
> > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> > index b87a6900841..b428d842101 100644
> > --- a/gcc/config/riscv/predicates.md
> > +++ b/gcc/config/riscv/predicates.md
> > @@ -78,7 +78,7 @@ (define_predicate "sleu_operand"
> >
> >  (define_predicate "const_0_operand"
> >    (and (match_code "const_int,const_wide_int,const_double,const_vector")
> > -       (match_test "op == CONST0_RTX (GET_MODE (op))")))
> > +       (match_test "riscv_const_zero_rtx_p (op)")))
> >
> >  (define_predicate "const_1_operand"
> >    (and (match_code "const_int,const_wide_int,const_vector")
> > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> > index 31049ef7523..fcf30e084a3 100644
> > --- a/gcc/config/riscv/riscv-protos.h
> > +++ b/gcc/config/riscv/riscv-protos.h
> > @@ -131,6 +131,8 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
> >  extern bool
> >  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
> >  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> > +extern bool riscv_float_const_zero_rtx_p (rtx);
> > +extern bool riscv_const_zero_rtx_p (rtx);
> >
> >  #ifdef RTX_CODE
> >  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 0d1cbc5cb5f..a8ad86b7068 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -1635,7 +1635,7 @@ riscv_const_insns (rtx x)
> >         return 1;
> >
> >        /* We can use x0 to load floating-point zero.  */
> > -      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> > +      return riscv_float_const_zero_rtx_p (x) ? 1 : 0;
> >      case CONST_VECTOR:
> >        {
> >         /* TODO: This is not accurate, we will need to
> > @@ -9481,6 +9481,39 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT total, int regs_num)
> >          || additioanl_bytes == ZCMP_MAX_SPIMM * ZCMP_SP_INC_STEP;
> >  }
> >
> > +/* Return true if rtx op is const zero of the floating-point.  */
> > +bool
> > +riscv_float_const_zero_rtx_p (rtx op)
> > +{
> > +  machine_mode mode = GET_MODE (op);
> > +
> > +  if (GET_MODE_CLASS (mode) != MODE_FLOAT)
> > +    return false;
> > +
> > +  if (GET_CODE (op) != CONST_DOUBLE)
> > +    return false;
> > +
> > +  const struct real_value *op_rvalue = CONST_DOUBLE_REAL_VALUE (op);
> > +
> > +  if (REAL_VALUE_MINUS_ZERO (*op_rvalue))
> > +    return !HONOR_SIGNED_ZEROS (mode);
> > +
> > +  return real_equal (op_rvalue, &dconst0);
> > +}
> > +
> > +/* Return true if rtx op is const zero, include both the integer
> > +   and floating-point.  */
> > +bool
> > +riscv_const_zero_rtx_p (rtx op)
> > +{
> > +  machine_mode mode = GET_MODE (op);
> > +
> > +  if (GET_MODE_CLASS (mode) == MODE_FLOAT)
> > +    return riscv_float_const_zero_rtx_p (op);
> > +
> > +  return op == CONST0_RTX (mode);
> > +}
> > +
> >  /* Return true if it's valid gpr_save pattern.  */
> >
> >  bool
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index 68f7203b676..cd429f6dcdd 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -1878,7 +1878,12 @@ (define_insn "*movhf_hardfloat"
> >    "TARGET_ZFHMIN
> >     && (register_operand (operands[0], HFmode)
> >         || reg_or_0_operand (operands[1], HFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "HF")])
> > @@ -1889,7 +1894,12 @@ (define_insn "*movhf_softfloat"
> >    "!TARGET_ZFHMIN
> >     && (register_operand (operands[0], HFmode)
> >         || reg_or_0_operand (operands[1], HFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "fmove,move,load,store,mtc,mfc")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "HF")])
> > @@ -2243,7 +2253,12 @@ (define_insn "*movsf_hardfloat"
> >    "TARGET_HARD_FLOAT
> >     && (register_operand (operands[0], SFmode)
> >         || reg_or_0_operand (operands[1], SFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "SF")])
> > @@ -2254,7 +2269,12 @@ (define_insn "*movsf_softfloat"
> >    "!TARGET_HARD_FLOAT
> >     && (register_operand (operands[0], SFmode)
> >         || reg_or_0_operand (operands[1], SFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "SF")])
> > @@ -2279,7 +2299,12 @@ (define_insn "*movdf_hardfloat_rv32"
> >    "!TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >     && (register_operand (operands[0], DFmode)
> >         || reg_or_0_operand (operands[1], DFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "DF")])
> > @@ -2290,7 +2315,12 @@ (define_insn "*movdf_hardfloat_rv64"
> >    "TARGET_64BIT && TARGET_DOUBLE_FLOAT
> >     && (register_operand (operands[0], DFmode)
> >         || reg_or_0_operand (operands[1], DFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "DF")])
> > @@ -2301,7 +2331,12 @@ (define_insn "*movdf_softfloat"
> >    "!TARGET_DOUBLE_FLOAT
> >     && (register_operand (operands[0], DFmode)
> >         || reg_or_0_operand (operands[1], DFmode))"
> > -  { return riscv_output_move (operands[0], operands[1]); }
> > +  {
> > +    if (riscv_float_const_zero_rtx_p (operands[1]))
> > +      operands[1] = CONST0_RTX (GET_MODE (operands[1]));
> > +
> > +    return riscv_output_move (operands[0], operands[1]);
> > +  }
> >    [(set_attr "move_type" "move,load,store")
> >     (set_attr "type" "fmove")
> >     (set_attr "mode" "DF")])
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> > new file mode 100644
> > index 00000000000..1eda13a3406
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-0.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sw\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_plus_zero_assign (float *a)
> > +{
> > +  *a = +0.0;
> > +}
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sw\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_minus_zero_assign (float *a)
> > +{
> > +  *a = -0.0;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> > new file mode 100644
> > index 00000000000..8041ec3ea95
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-1.c
> > @@ -0,0 +1,28 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +extern void float_assign (float *a, float b);
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_plus_zero_assign (float *a)
> > +{
> > +  float_assign (a, +0.0);
> > +}
> > +
> > +/*
> > +** test_float_minus_zero_assign:
> > +** fmv\.s\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_minus_zero_assign (float *a)
> > +{
> > +  float_assign (a, -0.0);
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> > new file mode 100644
> > index 00000000000..a6996dae4de
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-2.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sd\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_plus_zero_assign (double *a)
> > +{
> > +  *a = +0.0;
> > +}
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sd\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_minus_zero_assign (double *a)
> > +{
> > +  *a = -0.0;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> > new file mode 100644
> > index 00000000000..b4ba8a247df
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-3.c
> > @@ -0,0 +1,28 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +extern void float_assign (double *a, double b);
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_plus_zero_assign (double *a)
> > +{
> > +  float_assign (a, +0.0);
> > +}
> > +
> > +/*
> > +** test_float_minus_zero_assign:
> > +** fmv\.d\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_minus_zero_assign (double *a)
> > +{
> > +  float_assign (a, -0.0);
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> > new file mode 100644
> > index 00000000000..60acf7155d3
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-4.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sh\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_plus_zero_assign (_Float16 *a)
> > +{
> > +  *a = +0.0;
> > +}
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** sh\s+zero,0\([atx][0-9]+\)
> > +** ret
> > +*/
> > +void
> > +test_float_minus_zero_assign (_Float16 *a)
> > +{
> > +  *a = -0.0;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> > new file mode 100644
> > index 00000000000..d10efbeb37b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-5.c
> > @@ -0,0 +1,28 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gcv_zfh -mabi=lp64d -O3 -fno-signed-zeros -fno-schedule-insns -fno-schedule-insns2" } */
> > +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
> > +
> > +extern void float_assign (_Float16 *a, _Float16 b);
> > +
> > +/*
> > +** test_float_plus_zero_assign:
> > +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_plus_zero_assign (_Float16 *a)
> > +{
> > +  float_assign (a, +0.0);
> > +}
> > +
> > +/*
> > +** test_float_minus_zero_assign:
> > +** fmv\.h\.x\s+fa[0-9]+,\s*zero
> > +** tail\s+float_assign
> > +*/
> > +void
> > +test_float_minus_zero_assign (_Float16 *a)
> > +{
> > +  float_assign (a, -0.0);
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> > new file mode 100644
> > index 00000000000..347e4b0ff74
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-0.c
> > @@ -0,0 +1,36 @@
> > +/* { dg-do run { target { riscv_v } } } */
> > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> > +
> > +float f_val;
> > +
> > +void
> > +__attribute__ ((noinline))
> > +test_float_plus_zero_assign (float *a)
> > +{
> > +  *a = +0.0;
> > +}
> > +
> > +void
> > +__attribute__ ((noinline))
> > +test_float_minus_zero_assign (float *a)
> > +{
> > +  *a = -0.0;
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  f_val = -1.0;
> > +  test_float_plus_zero_assign (&f_val);
> > +
> > +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> > +    __builtin_abort ();
> > +
> > +  f_val = -1.0;
> > +  test_float_minus_zero_assign (&f_val);
> > +
> > +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> > +    __builtin_abort ();
> > +
> > +  return 0;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> > new file mode 100644
> > index 00000000000..1eb1edba457
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/no-signed-zeros-run-1.c
> > @@ -0,0 +1,36 @@
> > +/* { dg-do run { target { riscv_v } } } */
> > +/* { dg-additional-options "-std=c99 -O3 -fno-signed-zeros" } */
> > +
> > +double f_val;
> > +
> > +void
> > +__attribute__ ((noinline))
> > +test_float_plus_zero_assign (double *a)
> > +{
> > +  *a = +0.0;
> > +}
> > +
> > +void
> > +__attribute__ ((noinline))
> > +test_float_minus_zero_assign (double *a)
> > +{
> > +  *a = -0.0;
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  f_val = -1.0;
> > +  test_float_plus_zero_assign (&f_val);
> > +
> > +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> > +    __builtin_abort ();
> > +
> > +  f_val = -1.0;
> > +  test_float_minus_zero_assign (&f_val);
> > +
> > +  if (__builtin_copysignf (1.0, f_val) != 1.0)
> > +    __builtin_abort ();
> > +
> > +  return 0;
> > +}
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-08 10:45   ` Richard Biener
  2024-01-09  1:22     ` Li, Pan2
@ 2024-01-09 17:46     ` Jeff Law
  2024-01-10  4:28       ` Li, Pan2
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff Law @ 2024-01-09 17:46 UTC (permalink / raw)
  To: Richard Biener, pan2.li
  Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng



On 1/8/24 03:45, Richard Biener wrote:
> On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
>>
>> From: Pan Li <pan2.li@intel.com>
>>
>> According to the sematics of no-signed-zeros option, the backend
>> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>>
>> Consider below example with option -fno-signed-zeros.
>>
>> void
>> test (float *a)
>> {
>>    *a = -0.0;
>> }
>>
>> We will generate code as below, which doesn't treat the minus zero
>> as plus zero.
>>
>> test:
>>    lui  a5,%hi(.LC0)
>>    flw  fa5,%lo(.LC0)(a5)
>>    fsw  fa5,0(a0)
>>    ret
>>
>> .LC0:
>>    .word -2147483648 // aka -0.0 (0x80000000 in hex)
>>
>> This patch would like to fix the bug and treat the minus zero -0.0
>> as plus zero, aka +0.0. Thus after this patch we will have asm code
>> as below for the above sampe code.
>>
>> test:
>>    sw zero,0(a0)
>>    ret
>>
>> This patch also fix the run failure of the test case pr30957-1.c. The
>> below tests are passed for this patch.
> 
> We don't really expect targets to do this.  The small testcase above
> is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> -0.0 in pr30957-1.c so why does that one fail for you?  Does
> the -fvariable-expansion-in-unroller code maybe not trigger for
> riscv?
Loop unrolling (and thus variable expansion) doesn't trigger on the VLA 
style architectures.  aarch64 passes becuase its backend knows it can 
translate -0.0 into 0.0.

While we don't require that from ports, I'd just assume do the 
optimization similar to aarch64 rather than xfail or skip the test on 
RISC-V.  We can load 0.0 more efficiently than -0.0.


> 
> I think we should go to PR30957 and see what that was filed originally
> for, the testcase doesn't make much sense to me.
It's got more history than I'd like :(


jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option
  2024-01-09 17:46     ` Jeff Law
@ 2024-01-10  4:28       ` Li, Pan2
  0 siblings, 0 replies; 23+ messages in thread
From: Li, Pan2 @ 2024-01-10  4:28 UTC (permalink / raw)
  To: Jeff Law, Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng

Thanks Jeff and Richard for confirmation and comments.

It looks like firstly we should address the issue of the original commits in v4 and then
back to if there is something we need to deal with option no-signed-zero for the riscv.

Pan

-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com> 
Sent: Wednesday, January 10, 2024 1:46 AM
To: Richard Biener <richard.guenther@gmail.com>; Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option



On 1/8/24 03:45, Richard Biener wrote:
> On Tue, Jan 2, 2024 at 2:37 PM <pan2.li@intel.com> wrote:
>>
>> From: Pan Li <pan2.li@intel.com>
>>
>> According to the sematics of no-signed-zeros option, the backend
>> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>>
>> Consider below example with option -fno-signed-zeros.
>>
>> void
>> test (float *a)
>> {
>>    *a = -0.0;
>> }
>>
>> We will generate code as below, which doesn't treat the minus zero
>> as plus zero.
>>
>> test:
>>    lui  a5,%hi(.LC0)
>>    flw  fa5,%lo(.LC0)(a5)
>>    fsw  fa5,0(a0)
>>    ret
>>
>> .LC0:
>>    .word -2147483648 // aka -0.0 (0x80000000 in hex)
>>
>> This patch would like to fix the bug and treat the minus zero -0.0
>> as plus zero, aka +0.0. Thus after this patch we will have asm code
>> as below for the above sampe code.
>>
>> test:
>>    sw zero,0(a0)
>>    ret
>>
>> This patch also fix the run failure of the test case pr30957-1.c. The
>> below tests are passed for this patch.
> 
> We don't really expect targets to do this.  The small testcase above
> is somewhat ill-formed with -fno-signed-zeros.  Note there's no
> -0.0 in pr30957-1.c so why does that one fail for you?  Does
> the -fvariable-expansion-in-unroller code maybe not trigger for
> riscv?
Loop unrolling (and thus variable expansion) doesn't trigger on the VLA 
style architectures.  aarch64 passes becuase its backend knows it can 
translate -0.0 into 0.0.

While we don't require that from ports, I'd just assume do the 
optimization similar to aarch64 rather than xfail or skip the test on 
RISC-V.  We can load 0.0 more efficiently than -0.0.


> 
> I think we should go to PR30957 and see what that was filed originally
> for, the testcase doesn't make much sense to me.
It's got more history than I'd like :(


jeff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
                   ` (2 preceding siblings ...)
  2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li
@ 2024-01-11  1:38 ` pan2.li
  2024-01-11  8:33   ` Richard Biener
  2024-01-11  8:50 ` [PATCH v5] " pan2.li
  4 siblings, 1 reply; 23+ messages in thread
From: pan2.li @ 2024-01-11  1:38 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

The insert_var_expansion_initialization depends on the
HONOR_SIGNED_ZEROS to initialize the unrolling variables
to +0.0f when -0.0f and no-signed-option.  Unfortunately,
we should always keep the -0.0f here because:

* The -0.0f is always the correct initial value.
* We need to support the target that always honor signed zero.

Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
decide to honor the no-signed-zero or not.

The below tests are passed for this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

	* loop-unroll.cc (insert_var_expansion_initialization): Leverage
	MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/loop-unroll.cc               |  4 +--
 gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++----
 2 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
index 4176a21e308..bfdfe6c2bb7 100644
--- a/gcc/loop-unroll.cc
+++ b/gcc/loop-unroll.cc
@@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
   rtx var, zero_init;
   unsigned i;
   machine_mode mode = GET_MODE (ve->reg);
-  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
+  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
 
   if (ve->var_expansions.length () == 0)
     return;
@@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
     case MINUS:
       FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
         {
-	  if (honor_signed_zero_p)
+	  if (has_signed_zero_p)
 	    zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
 	  else
 	    zero_init = CONST0_RTX (mode);
diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
index 564410913ab..6a9d3d87932 100644
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ b/gcc/testsuite/gcc.dg/pr30957-1.c
@@ -20,16 +20,52 @@ foo (float d, int n)
   return accum;
 }
 
+float __attribute__((noinline))
+get_minus_zero()
+{
+  return 0.0 / -5.0;
+}
+
 int
 main ()
 {
-  /* When compiling standard compliant we expect foo to return -0.0.  But the
-     variable expansion during unrolling optimization (for this testcase enabled
-     by non-compliant -fassociative-math) instantiates copy(s) of the
-     accumulator which it initializes with +0.0.  Hence we expect that foo
-     returns +0.0.  */
-  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
+  /* The variable expansion in unroll requires option unsafe-math-optimizations
+     (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math
+     and -freciprocal-math).
+
+     When loop like above will have expansion after unrolling as below:
+
+     accum_1 += d_1;
+     accum_2 += d_2;
+     accum_3 += d_3;
+     ...
+
+     The accum_1, accum_2 and accum_3 need to be initialized. Given the
+     floating-point we have
+     +0.0f + -0.0f = +0.0f.
+
+     Thus, we should initialize the accum_* to -0.0 for correctness.  But
+     the things become more complicated when no-signed-zeros, as well as VLA
+     vectorizer mode which doesn't trigger variable expansion. Then we have:
+
+     Case 1: Trigger variable expansion but target doesn't honor no-signed-zero.
+       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
+     Case 2: Trigger variable expansion but target does honor no-signed-zero.
+       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
+     Case 3: No variable expansion but target doesn't honor no-signed-zero.
+       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
+     Case 4: No variable expansion but target does honor no-signed-zero.
+       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
+
+     The test case covers above 4 cases for running.
+     */
+  float minus_zero = get_minus_zero ();
+  float a = __builtin_copysignf (1.0, minus_zero);
+  float b = __builtin_copysignf (1.0, foo (minus_zero, 10));
+
+  if (a != b)
     abort ();
+
   exit (0);
 }
 
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2024-01-11  1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li
@ 2024-01-11  8:33   ` Richard Biener
  2024-01-11  8:48     ` Li, Pan2
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2024-01-11  8:33 UTC (permalink / raw)
  To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw

On Thu, Jan 11, 2024 at 2:39 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The insert_var_expansion_initialization depends on the
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f when -0.0f and no-signed-option.  Unfortunately,
> we should always keep the -0.0f here because:
>
> * The -0.0f is always the correct initial value.
> * We need to support the target that always honor signed zero.
>
> Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
> instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
> decide to honor the no-signed-zero or not.
>
> The below tests are passed for this patch:
>
> * The riscv regression tests.
> * The aarch64 regression tests.
> * The x86 bootstrap and regression tests.
>
> gcc/ChangeLog:
>
>         * loop-unroll.cc (insert_var_expansion_initialization): Leverage
>         MODE_HAS_SIGNED_ZEROS for expansion variable initialization.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/loop-unroll.cc               |  4 +--
>  gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++----
>  2 files changed, 44 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
> index 4176a21e308..bfdfe6c2bb7 100644
> --- a/gcc/loop-unroll.cc
> +++ b/gcc/loop-unroll.cc
> @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>    rtx var, zero_init;
>    unsigned i;
>    machine_mode mode = GET_MODE (ve->reg);
> -  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
> +  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
>
>    if (ve->var_expansions.length () == 0)
>      return;
> @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>      case MINUS:
>        FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
>          {
> -         if (honor_signed_zero_p)
> +         if (has_signed_zero_p)
>             zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
>           else
>             zero_init = CONST0_RTX (mode);

This change is OK.

> diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
> index 564410913ab..6a9d3d87932 100644
> --- a/gcc/testsuite/gcc.dg/pr30957-1.c
> +++ b/gcc/testsuite/gcc.dg/pr30957-1.c
> @@ -20,16 +20,52 @@ foo (float d, int n)
>    return accum;
>  }
>
> +float __attribute__((noinline))
> +get_minus_zero()
> +{
> +  return 0.0 / -5.0;
> +}
> +

I still think this test shouldn't use -fno-signed-zeros since that makes it
undefined whether the return value is positive or negative - checking
whether get_minus_zero returns negative or positive zero isn't a
good indicator and prone to break.

We have a { dg-add-options ieee } we'd need.

The original testcase indicates variable expansion wouldn't trigger then,
and in the PR it's noted we should remove this broken testcase and the
PR itself is invalid.

I agree there.

So, can you remove the testcase instead?

Thanks,
Richard.


>  int
>  main ()
>  {
> -  /* When compiling standard compliant we expect foo to return -0.0.  But the
> -     variable expansion during unrolling optimization (for this testcase enabled
> -     by non-compliant -fassociative-math) instantiates copy(s) of the
> -     accumulator which it initializes with +0.0.  Hence we expect that foo
> -     returns +0.0.  */
> -  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
> +  /* The variable expansion in unroll requires option unsafe-math-optimizations
> +     (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math
> +     and -freciprocal-math).
> +
> +     When loop like above will have expansion after unrolling as below:
> +
> +     accum_1 += d_1;
> +     accum_2 += d_2;
> +     accum_3 += d_3;
> +     ...
> +
> +     The accum_1, accum_2 and accum_3 need to be initialized. Given the
> +     floating-point we have
> +     +0.0f + -0.0f = +0.0f.
> +
> +     Thus, we should initialize the accum_* to -0.0 for correctness.  But
> +     the things become more complicated when no-signed-zeros, as well as VLA
> +     vectorizer mode which doesn't trigger variable expansion. Then we have:
> +
> +     Case 1: Trigger variable expansion but target doesn't honor no-signed-zero.
> +       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
> +     Case 2: Trigger variable expansion but target does honor no-signed-zero.
> +       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
> +     Case 3: No variable expansion but target doesn't honor no-signed-zero.
> +       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
> +     Case 4: No variable expansion but target does honor no-signed-zero.
> +       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
> +
> +     The test case covers above 4 cases for running.
> +     */
> +  float minus_zero = get_minus_zero ();
> +  float a = __builtin_copysignf (1.0, minus_zero);
> +  float b = __builtin_copysignf (1.0, foo (minus_zero, 10));
> +
> +  if (a != b)
>      abort ();
> +
>    exit (0);
>  }
>
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2024-01-11  8:33   ` Richard Biener
@ 2024-01-11  8:48     ` Li, Pan2
  0 siblings, 0 replies; 23+ messages in thread
From: Li, Pan2 @ 2024-01-11  8:48 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw

Thanks Richard, will delete the test case pr30957-1.c in patch V5.

Pan

-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Thursday, January 11, 2024 4:33 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
Subject: Re: [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

On Thu, Jan 11, 2024 at 2:39 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The insert_var_expansion_initialization depends on the
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f when -0.0f and no-signed-option.  Unfortunately,
> we should always keep the -0.0f here because:
>
> * The -0.0f is always the correct initial value.
> * We need to support the target that always honor signed zero.
>
> Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
> instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
> decide to honor the no-signed-zero or not.
>
> The below tests are passed for this patch:
>
> * The riscv regression tests.
> * The aarch64 regression tests.
> * The x86 bootstrap and regression tests.
>
> gcc/ChangeLog:
>
>         * loop-unroll.cc (insert_var_expansion_initialization): Leverage
>         MODE_HAS_SIGNED_ZEROS for expansion variable initialization.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/pr30957-1.c: Adjust tests cases for different scenarios.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/loop-unroll.cc               |  4 +--
>  gcc/testsuite/gcc.dg/pr30957-1.c | 48 ++++++++++++++++++++++++++++----
>  2 files changed, 44 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
> index 4176a21e308..bfdfe6c2bb7 100644
> --- a/gcc/loop-unroll.cc
> +++ b/gcc/loop-unroll.cc
> @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>    rtx var, zero_init;
>    unsigned i;
>    machine_mode mode = GET_MODE (ve->reg);
> -  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
> +  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
>
>    if (ve->var_expansions.length () == 0)
>      return;
> @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>      case MINUS:
>        FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
>          {
> -         if (honor_signed_zero_p)
> +         if (has_signed_zero_p)
>             zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
>           else
>             zero_init = CONST0_RTX (mode);

This change is OK.

> diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
> index 564410913ab..6a9d3d87932 100644
> --- a/gcc/testsuite/gcc.dg/pr30957-1.c
> +++ b/gcc/testsuite/gcc.dg/pr30957-1.c
> @@ -20,16 +20,52 @@ foo (float d, int n)
>    return accum;
>  }
>
> +float __attribute__((noinline))
> +get_minus_zero()
> +{
> +  return 0.0 / -5.0;
> +}
> +

I still think this test shouldn't use -fno-signed-zeros since that makes it
undefined whether the return value is positive or negative - checking
whether get_minus_zero returns negative or positive zero isn't a
good indicator and prone to break.

We have a { dg-add-options ieee } we'd need.

The original testcase indicates variable expansion wouldn't trigger then,
and in the PR it's noted we should remove this broken testcase and the
PR itself is invalid.

I agree there.

So, can you remove the testcase instead?

Thanks,
Richard.


>  int
>  main ()
>  {
> -  /* When compiling standard compliant we expect foo to return -0.0.  But the
> -     variable expansion during unrolling optimization (for this testcase enabled
> -     by non-compliant -fassociative-math) instantiates copy(s) of the
> -     accumulator which it initializes with +0.0.  Hence we expect that foo
> -     returns +0.0.  */
> -  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
> +  /* The variable expansion in unroll requires option unsafe-math-optimizations
> +     (aka -fno-signed-zeros, -fno-trapping-math, -fassociative-math
> +     and -freciprocal-math).
> +
> +     When loop like above will have expansion after unrolling as below:
> +
> +     accum_1 += d_1;
> +     accum_2 += d_2;
> +     accum_3 += d_3;
> +     ...
> +
> +     The accum_1, accum_2 and accum_3 need to be initialized. Given the
> +     floating-point we have
> +     +0.0f + -0.0f = +0.0f.
> +
> +     Thus, we should initialize the accum_* to -0.0 for correctness.  But
> +     the things become more complicated when no-signed-zeros, as well as VLA
> +     vectorizer mode which doesn't trigger variable expansion. Then we have:
> +
> +     Case 1: Trigger variable expansion but target doesn't honor no-signed-zero.
> +       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
> +     Case 2: Trigger variable expansion but target does honor no-signed-zero.
> +       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
> +     Case 3: No variable expansion but target doesn't honor no-signed-zero.
> +       minus_zero will be -0.0f and foo (minus_zero, 10) will be -0.0f.
> +     Case 4: No variable expansion but target does honor no-signed-zero.
> +       minus_zero will be +0.0f and foo (minus_zero, 10) will be +0.0f.
> +
> +     The test case covers above 4 cases for running.
> +     */
> +  float minus_zero = get_minus_zero ();
> +  float a = __builtin_copysignf (1.0, minus_zero);
> +  float b = __builtin_copysignf (1.0, foo (minus_zero, 10));
> +
> +  if (a != b)
>      abort ();
> +
>    exit (0);
>  }
>
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
                   ` (3 preceding siblings ...)
  2024-01-11  1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li
@ 2024-01-11  8:50 ` pan2.li
  2024-01-11  9:21   ` Richard Biener
  4 siblings, 1 reply; 23+ messages in thread
From: pan2.li @ 2024-01-11  8:50 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, pan2.li, yanzhang.wang, kito.cheng,
	richard.guenther, jeffreyalaw

From: Pan Li <pan2.li@intel.com>

The insert_var_expansion_initialization depends on the
HONOR_SIGNED_ZEROS to initialize the unrolling variables
to +0.0f when -0.0f and no-signed-option.  Unfortunately,
we should always keep the -0.0f here because:

* The -0.0f is always the correct initial value.
* We need to support the target that always honor signed zero.

Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
decide to honor the no-signed-zero or not.

We also removed the testcase pr30957-1.c, as it makes undefined behavior
whether the return value is positive or negative.

The below tests are passed for this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

	* loop-unroll.cc (insert_var_expansion_initialization): Leverage
	MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr30957-1.c: Remove.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/loop-unroll.cc               |  4 ++--
 gcc/testsuite/gcc.dg/pr30957-1.c | 36 --------------------------------
 2 files changed, 2 insertions(+), 38 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c

diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
index 4176a21e308..bfdfe6c2bb7 100644
--- a/gcc/loop-unroll.cc
+++ b/gcc/loop-unroll.cc
@@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
   rtx var, zero_init;
   unsigned i;
   machine_mode mode = GET_MODE (ve->reg);
-  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
+  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
 
   if (ve->var_expansions.length () == 0)
     return;
@@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
     case MINUS:
       FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
         {
-	  if (honor_signed_zero_p)
+	  if (has_signed_zero_p)
 	    zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
 	  else
 	    zero_init = CONST0_RTX (mode);
diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
deleted file mode 100644
index 564410913ab..00000000000
--- a/gcc/testsuite/gcc.dg/pr30957-1.c
+++ /dev/null
@@ -1,36 +0,0 @@
-/* { dg-do run { xfail { mmix-*-* } } } */
-/* We don't (and don't want to) perform this optimisation on soft-float targets,
-   where each addition is a library call.  /
-/* { dg-require-effective-target hard_float } */
-/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
-/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
-
-extern void abort (void);
-extern void exit (int);
-
-float __attribute__((noinline))
-foo (float d, int n)
-{
-  unsigned i;
-  float accum = d;
-
-  for (i = 0; i < n; i++)
-    accum += d;
-
-  return accum;
-}
-
-int
-main ()
-{
-  /* When compiling standard compliant we expect foo to return -0.0.  But the
-     variable expansion during unrolling optimization (for this testcase enabled
-     by non-compliant -fassociative-math) instantiates copy(s) of the
-     accumulator which it initializes with +0.0.  Hence we expect that foo
-     returns +0.0.  */
-  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
-    abort ();
-  exit (0);
-}
-
-/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2024-01-11  8:50 ` [PATCH v5] " pan2.li
@ 2024-01-11  9:21   ` Richard Biener
  2024-01-11 10:35     ` Li, Pan2
  0 siblings, 1 reply; 23+ messages in thread
From: Richard Biener @ 2024-01-11  9:21 UTC (permalink / raw)
  To: pan2.li; +Cc: gcc-patches, juzhe.zhong, yanzhang.wang, kito.cheng, jeffreyalaw

On Thu, Jan 11, 2024 at 9:50 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The insert_var_expansion_initialization depends on the
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f when -0.0f and no-signed-option.  Unfortunately,
> we should always keep the -0.0f here because:
>
> * The -0.0f is always the correct initial value.
> * We need to support the target that always honor signed zero.
>
> Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
> instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
> decide to honor the no-signed-zero or not.
>
> We also removed the testcase pr30957-1.c, as it makes undefined behavior
> whether the return value is positive or negative.
>
> The below tests are passed for this patch:
>
> * The riscv regression tests.
> * The aarch64 regression tests.
> * The x86 bootstrap and regression tests.

OK

> gcc/ChangeLog:
>
>         * loop-unroll.cc (insert_var_expansion_initialization): Leverage
>         MODE_HAS_SIGNED_ZEROS for expansion variable initialization.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/pr30957-1.c: Remove.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/loop-unroll.cc               |  4 ++--
>  gcc/testsuite/gcc.dg/pr30957-1.c | 36 --------------------------------
>  2 files changed, 2 insertions(+), 38 deletions(-)
>  delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c
>
> diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
> index 4176a21e308..bfdfe6c2bb7 100644
> --- a/gcc/loop-unroll.cc
> +++ b/gcc/loop-unroll.cc
> @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>    rtx var, zero_init;
>    unsigned i;
>    machine_mode mode = GET_MODE (ve->reg);
> -  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
> +  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
>
>    if (ve->var_expansions.length () == 0)
>      return;
> @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>      case MINUS:
>        FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
>          {
> -         if (honor_signed_zero_p)
> +         if (has_signed_zero_p)
>             zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
>           else
>             zero_init = CONST0_RTX (mode);
> diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
> deleted file mode 100644
> index 564410913ab..00000000000
> --- a/gcc/testsuite/gcc.dg/pr30957-1.c
> +++ /dev/null
> @@ -1,36 +0,0 @@
> -/* { dg-do run { xfail { mmix-*-* } } } */
> -/* We don't (and don't want to) perform this optimisation on soft-float targets,
> -   where each addition is a library call.  /
> -/* { dg-require-effective-target hard_float } */
> -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
> -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
> -
> -extern void abort (void);
> -extern void exit (int);
> -
> -float __attribute__((noinline))
> -foo (float d, int n)
> -{
> -  unsigned i;
> -  float accum = d;
> -
> -  for (i = 0; i < n; i++)
> -    accum += d;
> -
> -  return accum;
> -}
> -
> -int
> -main ()
> -{
> -  /* When compiling standard compliant we expect foo to return -0.0.  But the
> -     variable expansion during unrolling optimization (for this testcase enabled
> -     by non-compliant -fassociative-math) instantiates copy(s) of the
> -     accumulator which it initializes with +0.0.  Hence we expect that foo
> -     returns +0.0.  */
> -  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
> -    abort ();
> -  exit (0);
> -}
> -
> -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion
  2024-01-11  9:21   ` Richard Biener
@ 2024-01-11 10:35     ` Li, Pan2
  0 siblings, 0 replies; 23+ messages in thread
From: Li, Pan2 @ 2024-01-11 10:35 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, juzhe.zhong, Wang, Yanzhang, kito.cheng, jeffreyalaw

Committed, thanks Richard.

Pan

-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Thursday, January 11, 2024 5:22 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com; jeffreyalaw@gmail.com
Subject: Re: [PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

On Thu, Jan 11, 2024 at 9:50 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> The insert_var_expansion_initialization depends on the
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f when -0.0f and no-signed-option.  Unfortunately,
> we should always keep the -0.0f here because:
>
> * The -0.0f is always the correct initial value.
> * We need to support the target that always honor signed zero.
>
> Thus, we need to leverage MODE_HAS_SIGNED_ZEROS when initialize
> instead of HONOR_SIGNED_ZEROS.  Then the target/backend can
> decide to honor the no-signed-zero or not.
>
> We also removed the testcase pr30957-1.c, as it makes undefined behavior
> whether the return value is positive or negative.
>
> The below tests are passed for this patch:
>
> * The riscv regression tests.
> * The aarch64 regression tests.
> * The x86 bootstrap and regression tests.

OK

> gcc/ChangeLog:
>
>         * loop-unroll.cc (insert_var_expansion_initialization): Leverage
>         MODE_HAS_SIGNED_ZEROS for expansion variable initialization.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/pr30957-1.c: Remove.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/loop-unroll.cc               |  4 ++--
>  gcc/testsuite/gcc.dg/pr30957-1.c | 36 --------------------------------
>  2 files changed, 2 insertions(+), 38 deletions(-)
>  delete mode 100644 gcc/testsuite/gcc.dg/pr30957-1.c
>
> diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
> index 4176a21e308..bfdfe6c2bb7 100644
> --- a/gcc/loop-unroll.cc
> +++ b/gcc/loop-unroll.cc
> @@ -1855,7 +1855,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>    rtx var, zero_init;
>    unsigned i;
>    machine_mode mode = GET_MODE (ve->reg);
> -  bool honor_signed_zero_p = HONOR_SIGNED_ZEROS (mode);
> +  bool has_signed_zero_p = MODE_HAS_SIGNED_ZEROS (mode);
>
>    if (ve->var_expansions.length () == 0)
>      return;
> @@ -1869,7 +1869,7 @@ insert_var_expansion_initialization (struct var_to_expand *ve,
>      case MINUS:
>        FOR_EACH_VEC_ELT (ve->var_expansions, i, var)
>          {
> -         if (honor_signed_zero_p)
> +         if (has_signed_zero_p)
>             zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
>           else
>             zero_init = CONST0_RTX (mode);
> diff --git a/gcc/testsuite/gcc.dg/pr30957-1.c b/gcc/testsuite/gcc.dg/pr30957-1.c
> deleted file mode 100644
> index 564410913ab..00000000000
> --- a/gcc/testsuite/gcc.dg/pr30957-1.c
> +++ /dev/null
> @@ -1,36 +0,0 @@
> -/* { dg-do run { xfail { mmix-*-* } } } */
> -/* We don't (and don't want to) perform this optimisation on soft-float targets,
> -   where each addition is a library call.  /
> -/* { dg-require-effective-target hard_float } */
> -/* -fassociative-math requires -fno-trapping-math and -fno-signed-zeros. */
> -/* { dg-options "-O2 -funroll-loops -fassociative-math -fno-trapping-math -fno-signed-zeros -fvariable-expansion-in-unroller -fdump-rtl-loop2_unroll" } */
> -
> -extern void abort (void);
> -extern void exit (int);
> -
> -float __attribute__((noinline))
> -foo (float d, int n)
> -{
> -  unsigned i;
> -  float accum = d;
> -
> -  for (i = 0; i < n; i++)
> -    accum += d;
> -
> -  return accum;
> -}
> -
> -int
> -main ()
> -{
> -  /* When compiling standard compliant we expect foo to return -0.0.  But the
> -     variable expansion during unrolling optimization (for this testcase enabled
> -     by non-compliant -fassociative-math) instantiates copy(s) of the
> -     accumulator which it initializes with +0.0.  Hence we expect that foo
> -     returns +0.0.  */
> -  if (__builtin_copysignf (1.0, foo (0.0 / -5.0, 10)) != 1.0)
> -    abort ();
> -  exit (0);
> -}
> -
> -/* { dg-final { scan-rtl-dump "Expanding Accumulator" "loop2_unroll" { xfail mmix-*-* } } } */
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-01-11 10:35 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-23 11:07 [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor pan2.li
2023-12-23 17:19 ` Jeff Law
2023-12-24  2:01   ` Li, Pan2
2023-12-26  9:34 ` [PATCH v2] " pan2.li
2023-12-28 16:39   ` Jeff Law
2023-12-29  0:42     ` Li, Pan2
2023-12-29  1:03       ` Jeff Law
2023-12-29  5:56         ` Li, Pan2
2023-12-30  3:13           ` Jeff Law
2024-01-01  8:56             ` Li, Pan2
2024-01-02 11:55 ` [PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option pan2.li
2024-01-08 10:45   ` Richard Biener
2024-01-09  1:22     ` Li, Pan2
2024-01-09  7:17       ` Li, Pan2
2024-01-09 13:08         ` Richard Biener
2024-01-09 17:46     ` Jeff Law
2024-01-10  4:28       ` Li, Pan2
2024-01-11  1:38 ` [PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion pan2.li
2024-01-11  8:33   ` Richard Biener
2024-01-11  8:48     ` Li, Pan2
2024-01-11  8:50 ` [PATCH v5] " pan2.li
2024-01-11  9:21   ` Richard Biener
2024-01-11 10:35     ` Li, Pan2

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).