public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
@ 2022-06-24  2:02 HAO CHEN GUI
  2022-07-04  6:32 ` Ping " HAO CHEN GUI
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: HAO CHEN GUI @ 2022-06-24  2:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
Tests show that outputs of xs[min/max]dp are consistent with the standard
of C99 fmin/max.

  This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
of smin/max. So the builtins always generate xs[min/max]dp on all
platforms.

  Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	PR target/103605
	* config/rs6000/rs6000.md (FMINMAX): New.
	(minmax_op): New.
	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
	pattern to fmaxdf3.
	(__builtin_vsx_xsmindp): Set pattern to fmindf3.

gcc/testsuite/
	PR target/103605
	* gcc.dg/powerpc/pr103605.c: New.


patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8b735493b40 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@
     XSCVSPDP vsx_xscvspdp {}

   const double __builtin_vsx_xsmaxdp (double, double);
-    XSMAXDP smaxdf3 {}
+    XSMAXDP fmaxdf3 {}

   const double __builtin_vsx_xsmindp (double, double);
-    XSMINDP smindf3 {}
+    XSMINDP fmindf3 {}

   const double __builtin_vsx_xsrdpi (double);
     XSRDPI vsx_xsrdpi {}
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..ae0dd98f0f9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@ (define_c_enum "unspec"
    UNSPEC_HASHCHK
    UNSPEC_XXSPLTIDP_CONST
    UNSPEC_XXSPLTIW_CONST
+   UNSPEC_FMAX
+   UNSPEC_FMIN
   ])

 ;;
@@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
   DONE;
 })

+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
+			     (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+		     FMINMAX))]
+  "TARGET_VSX && !flag_finite_math_only"
+  "xs<minmax_op>dp %x0,%x1,%x2"
+  [(set_attr "type" "fp")]
+)
+
 (define_expand "mov<mode>cc"
    [(set (match_operand:GPR 0 "gpc_reg_operand")
 	 (if_then_else:GPR (match_operand 1 "comparison_operator")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
new file mode 100644
index 00000000000..1c938d40e61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+  return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+  return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+  return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+  return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+  return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+  return __builtin_vsx_xsmaxdp (d0, d1);
+}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Ping [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-06-24  2:02 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] HAO CHEN GUI
@ 2022-07-04  6:32 ` HAO CHEN GUI
  2022-08-01  2:03   ` Ping^2 " HAO CHEN GUI
  2022-09-21  9:34 ` Kewen.Lin
  2022-09-21 21:56 ` Segher Boessenkool
  2 siblings, 1 reply; 11+ messages in thread
From: HAO CHEN GUI @ 2022-07-04  6:32 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.

On 24/6/2022 上午 10:02, HAO CHEN GUI wrote:
> Hi,
>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
> Tests show that outputs of xs[min/max]dp are consistent with the standard
> of C99 fmin/max.
> 
>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
> of smin/max. So the builtins always generate xs[min/max]dp on all
> platforms.
> 
>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	PR target/103605
> 	* config/rs6000/rs6000.md (FMINMAX): New.
> 	(minmax_op): New.
> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
> 	pattern to fmaxdf3.
> 	(__builtin_vsx_xsmindp): Set pattern to fmindf3.
> 
> gcc/testsuite/
> 	PR target/103605
> 	* gcc.dg/powerpc/pr103605.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index f4a9f24bcc5..8b735493b40 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1613,10 +1613,10 @@
>      XSCVSPDP vsx_xscvspdp {}
> 
>    const double __builtin_vsx_xsmaxdp (double, double);
> -    XSMAXDP smaxdf3 {}
> +    XSMAXDP fmaxdf3 {}
> 
>    const double __builtin_vsx_xsmindp (double, double);
> -    XSMINDP smindf3 {}
> +    XSMINDP fmindf3 {}
> 
>    const double __builtin_vsx_xsrdpi (double);
>      XSRDPI vsx_xsrdpi {}
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bf85baa5370..ae0dd98f0f9 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>     UNSPEC_HASHCHK
>     UNSPEC_XXSPLTIDP_CONST
>     UNSPEC_XXSPLTIW_CONST
> +   UNSPEC_FMAX
> +   UNSPEC_FMIN
>    ])
> 
>  ;;
> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
>    DONE;
>  })
> 
> +
> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
> +
> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
> +			     (UNSPEC_FMIN "min")])
> +
> +(define_insn "f<minmax_op><mode>3"
> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> +	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> +		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> +		     FMINMAX))]
> +  "TARGET_VSX && !flag_finite_math_only"
> +  "xs<minmax_op>dp %x0,%x1,%x2"
> +  [(set_attr "type" "fp")]
> +)
> +
>  (define_expand "mov<mode>cc"
>     [(set (match_operand:GPR 0 "gpc_reg_operand")
>  	 (if_then_else:GPR (match_operand 1 "comparison_operator")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> new file mode 100644
> index 00000000000..1c938d40e61
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-O2 -mvsx" } */
> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
> +
> +#include <math.h>
> +
> +double test1 (double d0, double d1)
> +{
> +  return fmin (d0, d1);
> +}
> +
> +float test2 (float d0, float d1)
> +{
> +  return fmin (d0, d1);
> +}
> +
> +double test3 (double d0, double d1)
> +{
> +  return fmax (d0, d1);
> +}
> +
> +float test4 (float d0, float d1)
> +{
> +  return fmax (d0, d1);
> +}
> +
> +double test5 (double d0, double d1)
> +{
> +  return __builtin_vsx_xsmindp (d0, d1);
> +}
> +
> +double test6 (double d0, double d1)
> +{
> +  return __builtin_vsx_xsmaxdp (d0, d1);
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Ping^2 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-07-04  6:32 ` Ping " HAO CHEN GUI
@ 2022-08-01  2:03   ` HAO CHEN GUI
  2022-09-21  5:20     ` Ping^3 " HAO CHEN GUI
  0 siblings, 1 reply; 11+ messages in thread
From: HAO CHEN GUI @ 2022-08-01  2:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.


On 4/7/2022 下午 2:32, HAO CHEN GUI wrote:
> Hi,
>    Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
> 
> On 24/6/2022 上午 10:02, HAO CHEN GUI wrote:
>> Hi,
>>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
>> Tests show that outputs of xs[min/max]dp are consistent with the standard
>> of C99 fmin/max.
>>
>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>> of smin/max. So the builtins always generate xs[min/max]dp on all
>> platforms.
>>
>>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>
>>
>> gcc/
>> 	PR target/103605
>> 	* config/rs6000/rs6000.md (FMINMAX): New.
>> 	(minmax_op): New.
>> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
>> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
>> 	pattern to fmaxdf3.
>> 	(__builtin_vsx_xsmindp): Set pattern to fmindf3.
>>
>> gcc/testsuite/
>> 	PR target/103605
>> 	* gcc.dg/powerpc/pr103605.c: New.
>>
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index f4a9f24bcc5..8b735493b40 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1613,10 +1613,10 @@
>>      XSCVSPDP vsx_xscvspdp {}
>>
>>    const double __builtin_vsx_xsmaxdp (double, double);
>> -    XSMAXDP smaxdf3 {}
>> +    XSMAXDP fmaxdf3 {}
>>
>>    const double __builtin_vsx_xsmindp (double, double);
>> -    XSMINDP smindf3 {}
>> +    XSMINDP fmindf3 {}
>>
>>    const double __builtin_vsx_xsrdpi (double);
>>      XSRDPI vsx_xsrdpi {}
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index bf85baa5370..ae0dd98f0f9 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>>     UNSPEC_HASHCHK
>>     UNSPEC_XXSPLTIDP_CONST
>>     UNSPEC_XXSPLTIW_CONST
>> +   UNSPEC_FMAX
>> +   UNSPEC_FMIN
>>    ])
>>
>>  ;;
>> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
>>    DONE;
>>  })
>>
>> +
>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
>> +
>> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
>> +			     (UNSPEC_FMIN "min")])
>> +
>> +(define_insn "f<minmax_op><mode>3"
>> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
>> +	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
>> +		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
>> +		     FMINMAX))]
>> +  "TARGET_VSX && !flag_finite_math_only"
>> +  "xs<minmax_op>dp %x0,%x1,%x2"
>> +  [(set_attr "type" "fp")]
>> +)
>> +
>>  (define_expand "mov<mode>cc"
>>     [(set (match_operand:GPR 0 "gpc_reg_operand")
>>  	 (if_then_else:GPR (match_operand 1 "comparison_operator")
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>> new file mode 100644
>> index 00000000000..1c938d40e61
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>> @@ -0,0 +1,37 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>> +/* { dg-options "-O2 -mvsx" } */
>> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
>> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
>> +
>> +#include <math.h>
>> +
>> +double test1 (double d0, double d1)
>> +{
>> +  return fmin (d0, d1);
>> +}
>> +
>> +float test2 (float d0, float d1)
>> +{
>> +  return fmin (d0, d1);
>> +}
>> +
>> +double test3 (double d0, double d1)
>> +{
>> +  return fmax (d0, d1);
>> +}
>> +
>> +float test4 (float d0, float d1)
>> +{
>> +  return fmax (d0, d1);
>> +}
>> +
>> +double test5 (double d0, double d1)
>> +{
>> +  return __builtin_vsx_xsmindp (d0, d1);
>> +}
>> +
>> +double test6 (double d0, double d1)
>> +{
>> +  return __builtin_vsx_xsmaxdp (d0, d1);
>> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Ping^3 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-08-01  2:03   ` Ping^2 " HAO CHEN GUI
@ 2022-09-21  5:20     ` HAO CHEN GUI
  0 siblings, 0 replies; 11+ messages in thread
From: HAO CHEN GUI @ 2022-09-21  5:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
    Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.

On 1/8/2022 上午 10:03, HAO CHEN GUI wrote:
> Hi,
>    Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
> 
> 
> On 4/7/2022 下午 2:32, HAO CHEN GUI wrote:
>> Hi,
>>    Gentle ping this:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
>> Thanks.
>>
>> On 24/6/2022 上午 10:02, HAO CHEN GUI wrote:
>>> Hi,
>>>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
>>> Tests show that outputs of xs[min/max]dp are consistent with the standard
>>> of C99 fmin/max.
>>>
>>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>>> of smin/max. So the builtins always generate xs[min/max]dp on all
>>> platforms.
>>>
>>>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
>>> Is this okay for trunk? Any recommendations? Thanks a lot.
>>>
>>> ChangeLog
>>> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>
>>>
>>> gcc/
>>> 	PR target/103605
>>> 	* config/rs6000/rs6000.md (FMINMAX): New.
>>> 	(minmax_op): New.
>>> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
>>> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
>>> 	pattern to fmaxdf3.
>>> 	(__builtin_vsx_xsmindp): Set pattern to fmindf3.
>>>
>>> gcc/testsuite/
>>> 	PR target/103605
>>> 	* gcc.dg/powerpc/pr103605.c: New.
>>>
>>>
>>> patch.diff
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>>> index f4a9f24bcc5..8b735493b40 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -1613,10 +1613,10 @@
>>>      XSCVSPDP vsx_xscvspdp {}
>>>
>>>    const double __builtin_vsx_xsmaxdp (double, double);
>>> -    XSMAXDP smaxdf3 {}
>>> +    XSMAXDP fmaxdf3 {}
>>>
>>>    const double __builtin_vsx_xsmindp (double, double);
>>> -    XSMINDP smindf3 {}
>>> +    XSMINDP fmindf3 {}
>>>
>>>    const double __builtin_vsx_xsrdpi (double);
>>>      XSRDPI vsx_xsrdpi {}
>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>> index bf85baa5370..ae0dd98f0f9 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>>>     UNSPEC_HASHCHK
>>>     UNSPEC_XXSPLTIDP_CONST
>>>     UNSPEC_XXSPLTIW_CONST
>>> +   UNSPEC_FMAX
>>> +   UNSPEC_FMIN
>>>    ])
>>>
>>>  ;;
>>> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
>>>    DONE;
>>>  })
>>>
>>> +
>>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
>>> +
>>> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
>>> +			     (UNSPEC_FMIN "min")])
>>> +
>>> +(define_insn "f<minmax_op><mode>3"
>>> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
>>> +	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
>>> +		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
>>> +		     FMINMAX))]
>>> +  "TARGET_VSX && !flag_finite_math_only"
>>> +  "xs<minmax_op>dp %x0,%x1,%x2"
>>> +  [(set_attr "type" "fp")]
>>> +)
>>> +
>>>  (define_expand "mov<mode>cc"
>>>     [(set (match_operand:GPR 0 "gpc_reg_operand")
>>>  	 (if_then_else:GPR (match_operand 1 "comparison_operator")
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>>> new file mode 100644
>>> index 00000000000..1c938d40e61
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
>>> @@ -0,0 +1,37 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target powerpc_vsx_ok } */
>>> +/* { dg-options "-O2 -mvsx" } */
>>> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
>>> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
>>> +
>>> +#include <math.h>
>>> +
>>> +double test1 (double d0, double d1)
>>> +{
>>> +  return fmin (d0, d1);
>>> +}
>>> +
>>> +float test2 (float d0, float d1)
>>> +{
>>> +  return fmin (d0, d1);
>>> +}
>>> +
>>> +double test3 (double d0, double d1)
>>> +{
>>> +  return fmax (d0, d1);
>>> +}
>>> +
>>> +float test4 (float d0, float d1)
>>> +{
>>> +  return fmax (d0, d1);
>>> +}
>>> +
>>> +double test5 (double d0, double d1)
>>> +{
>>> +  return __builtin_vsx_xsmindp (d0, d1);
>>> +}
>>> +
>>> +double test6 (double d0, double d1)
>>> +{
>>> +  return __builtin_vsx_xsmaxdp (d0, d1);
>>> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-06-24  2:02 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] HAO CHEN GUI
  2022-07-04  6:32 ` Ping " HAO CHEN GUI
@ 2022-09-21  9:34 ` Kewen.Lin
  2022-09-21 21:56 ` Segher Boessenkool
  2 siblings, 0 replies; 11+ messages in thread
From: Kewen.Lin @ 2022-09-21  9:34 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Segher Boessenkool, David, Peter Bergner, gcc-patches

Hi Haochen,

on 2022/6/24 10:02, HAO CHEN GUI wrote:
> Hi,
>   This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
> Tests show that outputs of xs[min/max]dp are consistent with the standard
> of C99 fmin/max.
> 
>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
> of smin/max. So the builtins always generate xs[min/max]dp on all
> platforms.
> 
>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	PR target/103605
> 	* config/rs6000/rs6000.md (FMINMAX): New.
> 	(minmax_op): New.
> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.

Nit: here miss UNSPEC_FMAX and UNSPEC_FMIN.

> 	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
> 	pattern to fmaxdf3.
> 	(__builtin_vsx_xsmindp): Set pattern to fmindf3.
> 
> gcc/testsuite/
> 	PR target/103605
> 	* gcc.dg/powerpc/pr103605.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index f4a9f24bcc5..8b735493b40 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1613,10 +1613,10 @@
>      XSCVSPDP vsx_xscvspdp {}
> 
>    const double __builtin_vsx_xsmaxdp (double, double);
> -    XSMAXDP smaxdf3 {}
> +    XSMAXDP fmaxdf3 {}
> 
>    const double __builtin_vsx_xsmindp (double, double);
> -    XSMINDP smindf3 {}
> +    XSMINDP fmindf3 {}
> 
>    const double __builtin_vsx_xsrdpi (double);
>      XSRDPI vsx_xsrdpi {}
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bf85baa5370..ae0dd98f0f9 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
>     UNSPEC_HASHCHK
>     UNSPEC_XXSPLTIDP_CONST
>     UNSPEC_XXSPLTIW_CONST
> +   UNSPEC_FMAX
> +   UNSPEC_FMIN
>    ])
> 
>  ;;
> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
>    DONE;
>  })
> 
> +
> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
> +
> +(define_int_attr  minmax_op [(UNSPEC_FMAX "max")
> +			     (UNSPEC_FMIN "min")])
> +
> +(define_insn "f<minmax_op><mode>3"
> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> +	(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> +		      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> +		     FMINMAX))]
> +  "TARGET_VSX && !flag_finite_math_only"
> +  "xs<minmax_op>dp %x0,%x1,%x2"
> +  [(set_attr "type" "fp")]
> +)
> +
>  (define_expand "mov<mode>cc"
>     [(set (match_operand:GPR 0 "gpc_reg_operand")
>  	 (if_then_else:GPR (match_operand 1 "comparison_operator")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> new file mode 100644
> index 00000000000..1c938d40e61
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */

Nit: This dg-do line isn't needed.  OK with or without two nits fixed.  Thanks!

BR,
Kewen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-06-24  2:02 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] HAO CHEN GUI
  2022-07-04  6:32 ` Ping " HAO CHEN GUI
  2022-09-21  9:34 ` Kewen.Lin
@ 2022-09-21 21:56 ` Segher Boessenkool
  2022-09-22  2:28   ` Kewen.Lin
  2 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2022-09-21 21:56 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: gcc-patches, David, Kewen.Lin, Peter Bergner

Hi!

On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
> of smin/max. So the builtins always generate xs[min/max]dp on all
> platforms.

But how does this not blow up with -ffast-math?

In the other direction I am worried that the unspecs will degrade
performance (relative to smin/smax) when -ffast-math *is* active (and
this new builtin code and pattern doesn't blow up).

I still think we should get RTL codes for this, to have access to proper
floating point min/max semantics always and everywhere.  "fmin" and
"fmax" seem to be good names :-)


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-09-21 21:56 ` Segher Boessenkool
@ 2022-09-22  2:28   ` Kewen.Lin
  2022-09-22  9:59     ` HAO CHEN GUI
  2022-09-22 14:05     ` Segher Boessenkool
  0 siblings, 2 replies; 11+ messages in thread
From: Kewen.Lin @ 2022-09-22  2:28 UTC (permalink / raw)
  To: Segher Boessenkool, HAO CHEN GUI; +Cc: gcc-patches, David, Peter Bergner

on 2022/9/22 05:56, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>> of smin/max. So the builtins always generate xs[min/max]dp on all
>> platforms.
> 
> But how does this not blow up with -ffast-math?

Indeed.  Since it guards with "TARGET_VSX && !flag_finite_math_only",
the bifs seem to cause ICE at -ffast-math.

Haochen, could you double check it?

> 
> In the other direction I am worried that the unspecs will degrade
> performance (relative to smin/smax) when -ffast-math *is* active (and
> this new builtin code and pattern doesn't blow up).

For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
the previous expansion to smin/smax contradicts with the bif names (users
expect to map them to xs{min,max}dp than others).

> 
> I still think we should get RTL codes for this, to have access to proper
> floating point min/max semantics always and everywhere.  "fmin" and
> "fmax" seem to be good names :-)

It would be good, especially if we have observed some uses of these bifs
and further opportunities around them.  :)

BR,
Kewen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-09-22  2:28   ` Kewen.Lin
@ 2022-09-22  9:59     ` HAO CHEN GUI
  2022-09-22 13:56       ` Segher Boessenkool
  2022-09-22 14:05     ` Segher Boessenkool
  1 sibling, 1 reply; 11+ messages in thread
From: HAO CHEN GUI @ 2022-09-22  9:59 UTC (permalink / raw)
  To: Kewen.Lin, Segher Boessenkool; +Cc: gcc-patches, David, Peter Bergner

Hi Kewen & Segher,

Thanks so much for your review comments.

On 22/9/2022 上午 10:28, Kewen.Lin wrote:
> on 2022/9/22 05:56, Segher Boessenkool wrote:
>> Hi!
>>
>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>>> of smin/max. So the builtins always generate xs[min/max]dp on all
>>> platforms.
>>
>> But how does this not blow up with -ffast-math?
> 
> Indeed.  Since it guards with "TARGET_VSX && !flag_finite_math_only",
> the bifs seem to cause ICE at -ffast-math.
> 
> Haochen, could you double check it?
I tested it with "-ffast-math". fmin/max functions are converted to
MIN/MAX_EXPR in gimple lower pass. But the built-ins are not and hit the
ICE. I thought the built-ins are folded to MIN/MAX_EXPR like vec_ versions'
when fast-math is set. In fact they're not. Sorry for that.

I made a patch to fold these two built-ins to MIN/MAX_EXPR when fast-math
is set. Then the built-ins are converted to MIN/MAX_EXPR and expanded to
smin/max.

Thanks for pointing out the problem!

> 
>>
>> In the other direction I am worried that the unspecs will degrade
>> performance (relative to smin/smax) when -ffast-math *is* active (and
>> this new builtin code and pattern doesn't blow up).
> 
> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
> the previous expansion to smin/smax contradicts with the bif names (users
> expect to map them to xs{min,max}dp than others).
> 
>>
>> I still think we should get RTL codes for this, to have access to proper
>> floating point min/max semantics always and everywhere.  "fmin" and
>> "fmax" seem to be good names :-)
> 
> It would be good, especially if we have observed some uses of these bifs
> and further opportunities around them.  :)
> 
Shall we submit a PR to add fmin/fmax to RTL codes?

> BR,
> Kewen

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-09-22  9:59     ` HAO CHEN GUI
@ 2022-09-22 13:56       ` Segher Boessenkool
  0 siblings, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2022-09-22 13:56 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: Kewen.Lin, gcc-patches, David, Peter Bergner

Hi!

On Thu, Sep 22, 2022 at 05:59:07PM +0800, HAO CHEN GUI wrote:
> >> I still think we should get RTL codes for this, to have access to proper
> >> floating point min/max semantics always and everywhere.  "fmin" and
> >> "fmax" seem to be good names :-)
> > 
> > It would be good, especially if we have observed some uses of these bifs
> > and further opportunities around them.  :)
> > 
> Shall we submit a PR to add fmin/fmax to RTL codes?

Yes, please do.

If we have fmin/fmax RTL codes that describe the standard semantics,
we can generate code for that with -ffast-math as well, since the
code generated is optimal in either case; it's just the *generic*
optimisations that fall behind.


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-09-22  2:28   ` Kewen.Lin
  2022-09-22  9:59     ` HAO CHEN GUI
@ 2022-09-22 14:05     ` Segher Boessenkool
  2022-09-26  5:58       ` Kewen.Lin
  1 sibling, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2022-09-22 14:05 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: HAO CHEN GUI, gcc-patches, David, Peter Bergner

Hi!

On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote:
> on 2022/9/22 05:56, Segher Boessenkool wrote:
> > On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
> > In the other direction I am worried that the unspecs will degrade
> > performance (relative to smin/smax) when -ffast-math *is* active (and
> > this new builtin code and pattern doesn't blow up).
> 
> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
> the previous expansion to smin/smax contradicts with the bif names (users
> expect to map them to xs{min,max}dp than others).

But builtins *never* say to generate any particular instruction.  They
say to generate code that implements certain functionality.  For many
builtins this does of course boil down to specific instructions, but
even then it could be optimised away completely or replace with
something more specific if things can be folded or such.

> > I still think we should get RTL codes for this, to have access to proper
> > floating point min/max semantics always and everywhere.  "fmin" and
> > "fmax" seem to be good names :-)
> 
> It would be good, especially if we have observed some uses of these bifs
> and further opportunities around them.  :)

Currently we only have smin/smax for float, and those are not valid for
NaNs, or when the sign of zeros is relevant.  On the other hand the
semantics of fmin/fmax are settled and in most standards nowadays.  So
it is time we did this I would say :-)


Segher

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
  2022-09-22 14:05     ` Segher Boessenkool
@ 2022-09-26  5:58       ` Kewen.Lin
  0 siblings, 0 replies; 11+ messages in thread
From: Kewen.Lin @ 2022-09-26  5:58 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: HAO CHEN GUI, gcc-patches, David, Peter Bergner

Hi Segher,

on 2022/9/22 22:05, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote:
>> on 2022/9/22 05:56, Segher Boessenkool wrote:
>>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>> In the other direction I am worried that the unspecs will degrade
>>> performance (relative to smin/smax) when -ffast-math *is* active (and
>>> this new builtin code and pattern doesn't blow up).
>>
>> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
>> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
>> the previous expansion to smin/smax contradicts with the bif names (users
>> expect to map them to xs{min,max}dp than others).
> 
> But builtins *never* say to generate any particular instruction.  They
> say to generate code that implements certain functionality.  For many
> builtins this does of course boil down to specific instructions, but
> even then it could be optimised away completely or replace with
> something more specific if things can be folded or such.

ah, your explanation refreshed my mind, thanks!  Previously I thought the
bifs with specific mnemonic as part of their names should be used to generate
specific instructions, it's to save users' efforts using inline-asm, if
we want them to represent the generic functionality (not bind with specific),
we can use some generic names instead.  As your explanation, binding at
fast-math isn't needed, then I think Haochen's patch v7 with gimple folding
can avoid the concern on degradation at fast-math (still smax/smin), nice. :)

BR,
Kewen

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-09-26  5:58 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-24  2:02 [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] HAO CHEN GUI
2022-07-04  6:32 ` Ping " HAO CHEN GUI
2022-08-01  2:03   ` Ping^2 " HAO CHEN GUI
2022-09-21  5:20     ` Ping^3 " HAO CHEN GUI
2022-09-21  9:34 ` Kewen.Lin
2022-09-21 21:56 ` Segher Boessenkool
2022-09-22  2:28   ` Kewen.Lin
2022-09-22  9:59     ` HAO CHEN GUI
2022-09-22 13:56       ` Segher Boessenkool
2022-09-22 14:05     ` Segher Boessenkool
2022-09-26  5:58       ` Kewen.Lin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).