public inbox for gcc-patches@gcc.gnu.org
* [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
@ 2021-04-16  7:10 Xiong Hu Luo
  2021-05-06  2:36 ` Ping: " Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: Xiong Hu Luo @ 2021-04-16  7:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xiong Hu Luo

fmod/fmodf and remainder/remainderf can be expanded inline instead of
calling the library when building with fast-math, which is much faster.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

remainderf:
     fdivs   f0,f1,f2
     frin    f0,f0
     fnmsubs f1,f2,f0,f1

gcc/ChangeLog:

2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
	(remainder<mode>3): Likewise.

gcc/testsuite/ChangeLog:

2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* gcc.target/powerpc/pr97142.c: New test.
---
 gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a1315523fec..7e0e94e6ba4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
   [(set_attr "type" "fp")
    (set_attr "isa" "*,<Fisa>")])
 
+(define_expand "fmod<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+	(use (match_operand:SFDF 1 "gpc_reg_operand"))
+	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx friz = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_btrunc<mode>2 (friz, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+  DONE;
+ })
+
+(define_expand "remainder<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+	(use (match_operand:SFDF 1 "gpc_reg_operand"))
+	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx frin = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_round<mode>2 (frin, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+  DONE;
+ })
+
 (define_insn "*rsqrt<mode>2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
 	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..48f25ca5b5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+  return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+  return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+  return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+  return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+
-- 
2.27.0.90.geebb51ba8c



* Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-04-16  7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo
@ 2021-05-06  2:36 ` Xionghu Luo
  2021-05-14  7:13   ` Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-05-06  2:36 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw

Gentle ping, thanks.



-- 
Thanks,
Xionghu


* Re: Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-05-06  2:36 ` Ping: " Xionghu Luo
@ 2021-05-14  7:13   ` Xionghu Luo
  2021-06-07  5:08     ` Ping^2: " Xionghu Luo
  2021-06-30  1:44     ` Ping ^ 2: " Xionghu Luo
  0 siblings, 2 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-05-14  7:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw

Tested SPEC2017 at -Ofast on P8LE with this patch: 511.povray_r +1.14%,
526.blender_r +1.72%; no obvious changes in the others.



-- 
Thanks,
Xionghu


* Ping^2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-05-14  7:13   ` Xionghu Luo
@ 2021-06-07  5:08     ` Xionghu Luo
  2021-06-30  1:44     ` Ping ^ 2: " Xionghu Luo
  1 sibling, 0 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-06-07  5:08 UTC (permalink / raw)
  To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw

Ping, thanks.



-- 
Thanks,
Xionghu


* Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-05-14  7:13   ` Xionghu Luo
  2021-06-07  5:08     ` Ping^2: " Xionghu Luo
@ 2021-06-30  1:44     ` Xionghu Luo
  2021-07-09 18:40       ` will schmidt
  1 sibling, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-06-30  1:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw

Gentle ping ^2, thanks.

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html



-- 
Thanks,
Xionghu


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-06-30  1:44     ` Ping ^ 2: " Xionghu Luo
@ 2021-07-09 18:40       ` will schmidt
  2021-07-12  1:25         ` Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: will schmidt @ 2021-07-09 18:40 UTC (permalink / raw)
  To: Xionghu Luo, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw

On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
> Gentle ping ^2, thanks.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
> 
> 
> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
> > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
> > 526.blender_r +1.72%, no obvious changes to others.

Ok.

> > 
> > 
> > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
> > > Gentle ping, thanks.
> > > 
> > > 
> > > On 2021/4/16 15:10, Xiong Hu Luo wrote:
> > > > fmod/fmodf and remainder/remainderf could be expanded instead of library
> > > > call when fast-math build, which is much faster.
> > > > 
> > > > fmodf:
> > > >       fdivs   f0,f1,f2
> > > >       friz    f0,f0
> > > >       fnmsubs f1,f2,f0,f1
> > > > 
> > > > remainderf:
> > > >       fdivs   f0,f1,f2
> > > >       frin    f0,f0
> > > >       fnmsubs f1,f2,f0,f1
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > 2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>
> > > > 
> > > >     PR target/97142

That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".

OK.


> > > >     * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> > > >     (remainder<mode>3): Likewise.


> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > 2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>
> > > > 
> > > >     PR target/97142
> > > >     * gcc.target/powerpc/pr97142.c: New test.

Ok.

> > > > ---
> > > >   gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
> > > >   gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
> > > >   2 files changed, 66 insertions(+)
> > > >   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > 
> > > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > > > index a1315523fec..7e0e94e6ba4 100644
> > > > --- a/gcc/config/rs6000/rs6000.md
> > > > +++ b/gcc/config/rs6000/rs6000.md
> > > > @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
> > > >     [(set_attr "type" "fp")
> > > >      (set_attr "isa" "*,<Fisa>")])
> > > > +(define_expand "fmod<mode>3"
> > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > +  "TARGET_HARD_FLOAT
> > > > +  && TARGET_FPRND
> > > > +  && flag_unsafe_math_optimizations"
> > > > +{
> > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > +  rtx friz = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_btrunc<mode>2 (friz, div));
> > > > +
> > > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
> > > > +  DONE;
> > > > + })
> > > > +
> > > > +(define_expand "remainder<mode>3"
> > > > +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > +  "TARGET_HARD_FLOAT
> > > > +  && TARGET_FPRND
> > > > +  && flag_unsafe_math_optimizations"
> > > > +{
> > > > +  rtx div = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > +  rtx frin = gen_reg_rtx (<MODE>mode);
> > > > +  emit_insn (gen_round<mode>2 (frin, div));
> > > > +
> > > > > +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
> > > > +  DONE;
> > > > + })

I notice the pattern of arguments to the final emit
is op[0],op[2],fri*,op[1]
while the description comment suggests the generated instruction 
will be fnmsubs  f1,f2,f0,f1  ;

I don't see any rearranging in the nfms<mode>4 expansions, but
presumably this is correct and just a cosmetic nit that catches my eye.

Ok.


> > > > +
> > > >   (define_insn "*rsqrt<mode>2"
> > > >     [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
> > > >       (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
> > > > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > new file mode 100644
> > > > index 00000000000..48f25ca5b5b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > @@ -0,0 +1,30 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Ofast" } */
> > > > +
> > > > +#include <math.h>
> > > > +
> > > > +float test1 (float x, float y)
> > > > +{
> > > > +  return fmodf (x, y);
> > > > +}
> > > > +
> > > > +double test2 (double x, double y)
> > > > +{
> > > > +  return fmod (x, y);
> > > > +}
> > > > +
> > > > +float test3 (float x, float y)
> > > > +{
> > > > +  return remainderf (x, y);
> > > > +}
> > > > +
> > > > +double test4 (double x, double y)
> > > > +{
> > > > +  return remainder (x, y);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */


Ok.
I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
instructions as well. 
I defer to others on that, of course.. :-) 

lgtm, 
thanks
-Will



> > > > +
> > > > 
> 
> 



* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-07-09 18:40       ` will schmidt
@ 2021-07-12  1:25         ` Xionghu Luo
  2021-09-03  2:31           ` Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-07-12  1:25 UTC (permalink / raw)
  To: will schmidt, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw



On 2021/7/10 02:40, will schmidt wrote:
> On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping ^2, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>>
>>
>> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
>>> Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
>>> 526.blender_r +1.72%, no obvious changes to others.
> 
> Ok.
> 
>>>
>>>
>>> On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
>>>> Gentle ping, thanks.
>>>>
>>>>
>>>> On 2021/4/16 15:10, Xiong Hu Luo wrote:
>>>>> fmod/fmodf and remainder/remainderf could be expanded instead of library
>>>>> call when fast-math build, which is much faster.
>>>>>
>>>>> fmodf:
>>>>>        fdivs   f0,f1,f2
>>>>>        friz    f0,f0
>>>>>        fnmsubs f1,f2,f0,f1
>>>>>
>>>>> remainderf:
>>>>>        fdivs   f0,f1,f2
>>>>>        frin    f0,f0
>>>>>        fnmsubs f1,f2,f0,f1
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>
>>>>>
>>>>>      PR target/97142
> 
> That PR is " Bug 97142
>        - __builtin_fmod not optimized on POWER   "
> 
> OK.
> 
> 
>>>>>      * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>>>>>      (remainder<mode>3): Likewise.
> 
> 
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>
>>>>>
>>>>>      PR target/97142
>>>>>      * gcc.target/powerpc/pr97142.c: New test.
> 
> Ok.
> 
>>>>> ---
>>>>>    gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
>>>>>    gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
>>>>>    2 files changed, 66 insertions(+)
>>>>>    create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>>
>>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>>>> index a1315523fec..7e0e94e6ba4 100644
>>>>> --- a/gcc/config/rs6000/rs6000.md
>>>>> +++ b/gcc/config/rs6000/rs6000.md
>>>>> @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
>>>>>      [(set_attr "type" "fp")
>>>>>       (set_attr "isa" "*,<Fisa>")])
>>>>> +(define_expand "fmod<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +  && TARGET_FPRND
>>>>> +  && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx friz = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_btrunc<mode>2 (friz, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
>>>>> +  DONE;
>>>>> + })
>>>>> +
>>>>> +(define_expand "remainder<mode>3"
>>>>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> +    (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> +  "TARGET_HARD_FLOAT
>>>>> +  && TARGET_FPRND
>>>>> +  && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> +  rtx div = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> +  rtx frin = gen_reg_rtx (<MODE>mode);
>>>>> +  emit_insn (gen_round<mode>2 (frin, div));
>>>>> +
>>>>> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
>>>>> +  DONE;
>>>>> + })
> 
> I notice the pattern of arguments to the final emit
> is op[0],op[2],fri*,op[1]
> while the description comment suggests the generated instruction
> will be fnmsubs  f1,f2,f0,f1  ;
> 
> I don't see any rearranging in the nfms<mode>4 expansions, but
> presumably this is correct and just a cosmetic nit that catches my eye.


From the ISA:

fnmsub FRT,FRA,FRC,FRB

The operation
FRT ← - ( [(FRA) × (FRC)] - (FRB) )
is performed.

 fmodf:
       fdivs   f0,f1,f2
       friz    f0,f0
       fnmsubs f1,f2,f0,f1

So the assembly computes:

f1 = -(f2 * f0 - f1) = -(f2 * trunc(f1/f2) - f1) = f1 - f2 * trunc(f1/f2)

That is, f1 ends up holding the fmod result.

> 
> Ok.
> 
> 
>>>>> +
>>>>>    (define_insn "*rsqrt<mode>2"
>>>>>      [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
>>>>>        (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> new file mode 100644
>>>>> index 00000000000..48f25ca5b5b
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Ofast" } */
>>>>> +
>>>>> +#include <math.h>
>>>>> +
>>>>> +float test1 (float x, float y)
>>>>> +{
>>>>> +  return fmodf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test2 (double x, double y)
>>>>> +{
>>>>> +  return fmod (x, y);
>>>>> +}
>>>>> +
>>>>> +float test3 (float x, float y)
>>>>> +{
>>>>> +  return remainderf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test4 (double x, double y)
>>>>> +{
>>>>> +  return remainder (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> 
> 
> Ok.
> I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
> instructions as well.
> I defer to others on that, of course.. :-)

Thanks, I will add the checks below:

diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
index 48f25ca5b5b..081ab40b4c0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr97142.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -27,4 +27,11 @@ double test4 (double x, double y)
 /* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
+

> 
> lgtm,
> thanks
> -Will
> 
> 
> 
>>>>> +
>>>>>
>>
>>
> 

-- 
Thanks,
Xionghu


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-07-12  1:25         ` Xionghu Luo
@ 2021-09-03  2:31           ` Xionghu Luo
  2021-09-03 14:51             ` Bill Schmidt
                               ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-09-03  2:31 UTC (permalink / raw)
  To: will schmidt, gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw

Resending the patch, which now addresses Will's comments.


fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
the library when building with fast-math, which is much faster.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

remainderf:
     fdivs   f0,f1,f2
     frin    f0,f0
     fnmsubs f1,f2,f0,f1

SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%

gcc/ChangeLog:

2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
	(remainder<mode>3): Likewise.

gcc/testsuite/ChangeLog:

2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* gcc.target/powerpc/pr97142.c: New test.
---
 gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr97142.c | 35 +++++++++++++++++++++
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8cdc42533c..84820d3b5cb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4932,6 +4932,42 @@ (define_insn "fre<sd>"
   [(set_attr "type" "fp")
    (set_attr "isa" "*,<Fisa>")])
 
+(define_expand "fmod<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+	(use (match_operand:SFDF 1 "gpc_reg_operand"))
+	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx friz = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_btrunc<mode>2 (friz, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+  DONE;
+ })
+
+(define_expand "remainder<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+	(use (match_operand:SFDF 1 "gpc_reg_operand"))
+	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx frin = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_round<mode>2 (frin, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+  DONE;
+ })
+
 (define_insn "*rsqrt<mode>2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
 	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..e5306eb681b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+  return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+  return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+  return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+  return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
-- 
2.25.1



* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-09-03  2:31           ` Xionghu Luo
@ 2021-09-03 14:51             ` Bill Schmidt
  2021-09-03 14:53             ` David Edelsohn
  2021-09-03 21:44             ` Segher Boessenkool
  2 siblings, 0 replies; 13+ messages in thread
From: Bill Schmidt @ 2021-09-03 14:51 UTC (permalink / raw)
  To: Xionghu Luo, will schmidt, gcc-patches; +Cc: segher, dje.gcc, linkw

Hi Xionghu,

This looks okay to me.  Recommend maintainers approve.

Thanks!
Bill

On 9/2/21 9:31 PM, Xionghu Luo wrote:
> Resending the patch, which now addresses Will's comments.
>
>
> fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
> the library when building with fast-math, which is much faster.
>
> fmodf:
>       fdivs   f0,f1,f2
>       friz    f0,f0
>       fnmsubs f1,f2,f0,f1
>
> remainderf:
>       fdivs   f0,f1,f2
>       frin    f0,f0
>       fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
> 	PR target/97142
> 	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> 	(remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
> 	PR target/97142
> 	* gcc.target/powerpc/pr97142.c: New test.
> ---
>   gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
>   gcc/testsuite/gcc.target/powerpc/pr97142.c | 35 +++++++++++++++++++++
>   2 files changed, 71 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c8cdc42533c..84820d3b5cb 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -4932,6 +4932,42 @@ (define_insn "fre<sd>"
>     [(set_attr "type" "fp")
>      (set_attr "isa" "*,<Fisa>")])
>   
> +(define_expand "fmod<mode>3"
> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 1 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT
> +  && TARGET_FPRND
> +  && flag_unsafe_math_optimizations"
> +{
> +  rtx div = gen_reg_rtx (<MODE>mode);
> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> +
> +  rtx friz = gen_reg_rtx (<MODE>mode);
> +  emit_insn (gen_btrunc<mode>2 (friz, div));
> +
> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
> +  DONE;
> + })
> +
> +(define_expand "remainder<mode>3"
> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 1 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT
> +  && TARGET_FPRND
> +  && flag_unsafe_math_optimizations"
> +{
> +  rtx div = gen_reg_rtx (<MODE>mode);
> +  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> +
> +  rtx frin = gen_reg_rtx (<MODE>mode);
> +  emit_insn (gen_round<mode>2 (frin, div));
> +
> +  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
> +  DONE;
> + })
> +
>   (define_insn "*rsqrt<mode>2"
>     [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
>   	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> new file mode 100644
> index 00000000000..e5306eb681b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast" } */
> +
> +#include <math.h>
> +
> +float test1 (float x, float y)
> +{
> +  return fmodf (x, y);
> +}
> +
> +double test2 (double x, double y)
> +{
> +  return fmod (x, y);
> +}
> +
> +float test3 (float x, float y)
> +{
> +  return remainderf (x, y);
> +}
> +
> +double test4 (double x, double y)
> +{
> +  return remainder (x, y);
> +}
> +
> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> +/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-09-03  2:31           ` Xionghu Luo
  2021-09-03 14:51             ` Bill Schmidt
@ 2021-09-03 14:53             ` David Edelsohn
  2021-09-03 21:44             ` Segher Boessenkool
  2 siblings, 0 replies; 13+ messages in thread
From: David Edelsohn @ 2021-09-03 14:53 UTC (permalink / raw)
  To: Xionghu Luo
  Cc: will schmidt, GCC Patches, Bill Schmidt, Segher Boessenkool, linkw

On Thu, Sep 2, 2021 at 10:31 PM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> Resending the patch, which now addresses Will's comments.
>
>
> fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
> the library when building with fast-math, which is much faster.
>
> fmodf:
>      fdivs   f0,f1,f2
>      friz    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> remainderf:
>      fdivs   f0,f1,f2
>      frin    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
>         PR target/97142
>         * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>         (remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
>         PR target/97142
>         * gcc.target/powerpc/pr97142.c: New test.

Okay.

Thanks, David


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-09-03  2:31           ` Xionghu Luo
  2021-09-03 14:51             ` Bill Schmidt
  2021-09-03 14:53             ` David Edelsohn
@ 2021-09-03 21:44             ` Segher Boessenkool
  2021-09-06  8:59               ` Xionghu Luo
  2 siblings, 1 reply; 13+ messages in thread
From: Segher Boessenkool @ 2021-09-03 21:44 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw

Hi!

On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
> fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
> the library when building with fast-math, which is much faster.

Thank you very much for this patch.

Some trivial comments if you haven't committed it yet:

> +(define_expand "fmod<mode>3"
> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 1 "gpc_reg_operand"))
> +	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT
> +  && TARGET_FPRND
> +  && flag_unsafe_math_optimizations"

It should have one extra space before each && here:

  "TARGET_HARD_FLOAT
   && TARGET_FPRND
   && flag_unsafe_math_optimizations"

(so that everything inside of the string aligns).

> +(define_expand "remainder<mode>3"

(same here).

> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */

These are negative tests, so won't spuriously fail, but this does not
test for the function prefixes we can have.  See
gcc.target/powerpc/builtins-1.c for example.

Again, thank you, and thanks to everyone else for the patch review
action :-)


Segher


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-09-03 21:44             ` Segher Boessenkool
@ 2021-09-06  8:59               ` Xionghu Luo
  2021-09-06 21:57                 ` Segher Boessenkool
  0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-09-06  8:59 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw



On 2021/9/4 05:44, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
>> fmod/fmodf and remainder/remainderf can be expanded inline instead of calling
>> the library when building with fast-math, which is much faster.
> 
> Thank you very much for this patch.
> 
> Some trivial comments if you haven't committed it yet:
> 
>> +(define_expand "fmod<mode>3"
>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>> +	(use (match_operand:SFDF 1 "gpc_reg_operand"))
>> +	(use (match_operand:SFDF 2 "gpc_reg_operand"))]
>> +  "TARGET_HARD_FLOAT
>> +  && TARGET_FPRND
>> +  && flag_unsafe_math_optimizations"
> 
> It should have one extra space before each && here:

OK.

> 
>    "TARGET_HARD_FLOAT
>     && TARGET_FPRND
>     && flag_unsafe_math_optimizations"
> 
> (so that everything inside of the string aligns).
> 
>> +(define_expand "remainder<mode>3"
> 
> (same here).
> 
>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> 
> These are negative tests, so won't spuriously fail, but this does not
> test for the function prefixes we can have.  See
> gcc.target/powerpc/builtins-1.c for example.

Thanks.  Verified that different calls are generated on different platforms
without this patch.

P8BE-64: bl __fmodf_finite
P8BE-32: b __fmodf_finite
P8LE-64:  bl fmodf

"l", "__" and "_finite" are optional, so is it OK to check them with below patterns?

+/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } } */
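
To show what the optional parts buy, here is a small C sketch that checks
the three call forms listed above against the old and the new fmodf pattern.
It uses POSIX <regex.h> rather than the Tcl regexps the testsuite uses, so
the \m and \M word boundaries are spelled out by hand; line_matches,
WB_START, WB_END, old_pattern and new_pattern are names invented for this
example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if LINE matches the POSIX ERE PATTERN.  */
bool line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;               /* treat a bad pattern as "no match" */
  bool hit = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return hit;
}

/* POSIX ERE has no \m and \M, so approximate the word boundaries.  */
#define WB_START "(^|[^[:alnum:]_])"
#define WB_END   "($|[^[:alnum:]_])"

/* The original check and the relaxed one, for fmodf only.  */
const char *const old_pattern = WB_START "bl fmodf" WB_END;
const char *const new_pattern = WB_START "bl? (__)?fmodf(_finite)?" WB_END;
```

With these definitions, new_pattern matches "bl __fmodf_finite",
"b __fmodf_finite" and "bl fmodf", while old_pattern only matches the last
one, so the old scan-assembler-not would pass on the first two platforms
even if the library call were still there.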


> 
> Again, thank you, and thanks to everyone else for the patch review
> action :-)
> 
> 
> Segher
> 

-- 
Thanks,
Xionghu


* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-09-06  8:59               ` Xionghu Luo
@ 2021-09-06 21:57                 ` Segher Boessenkool
  0 siblings, 0 replies; 13+ messages in thread
From: Segher Boessenkool @ 2021-09-06 21:57 UTC (permalink / raw)
  To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw

Hi!

On Mon, Sep 06, 2021 at 04:59:27PM +0800, Xionghu Luo wrote:
> On 2021/9/4 05:44, Segher Boessenkool wrote:
> >>+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> >
> >These are negative tests, so won't spuriously fail, but this does not
> >test for the function prefixes we can have.  See
> >gcc.target/powerpc/builtins-1.c for example.
> 
> Thanks.  Verified that different calls are generated on different platforms
> without this patch.
> 
> P8BE-64: bl __fmodf_finite
> P8BE-32: b __fmodf_finite
> P8LE-64:  bl fmodf

Ah, it won't use the "dot-names" here, okay.  I think for Darwin you
need to allow a single underscore, but you'll find out (or Iain will,
most likely ;-) )

> "l", "__" and "_finite" are optional, so is it OK to check them with below 
> patterns?
> 
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } } */

You could even do

/* { dg-final { scan-assembler-not {(?n)\mb.*fmod} } } */
/* { dg-final { scan-assembler-not {(?n)\mb.*remainder} } } */

or even

/* { dg-final { scan-assembler-not {fmod} } } */
/* { dg-final { scan-assembler-not {remainder} } } */

(and the testcase name will not accidentally match either of those REs
either, I checked :-) )

And yeah, on some subtargets the calls will be tail-optimised, good
find.  You can get around that (in general, on any target) by doing

float test1 (float x, float y)
{
  float z = fmodf (x, y);
  asm (""); // to prevent tail calls
  return z;
}

but what you do is fine as well, and much more elegant.

Please pick (and test ;-) ) whichever option you like best.  Thanks!


Segher


end of thread, other threads:[~2021-09-06 21:58 UTC | newest]

Thread overview: 13+ messages
2021-04-16  7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo
2021-05-06  2:36 ` Ping: " Xionghu Luo
2021-05-14  7:13   ` Xionghu Luo
2021-06-07  5:08     ` Ping^2: " Xionghu Luo
2021-06-30  1:44     ` Ping ^ 2: " Xionghu Luo
2021-07-09 18:40       ` will schmidt
2021-07-12  1:25         ` Xionghu Luo
2021-09-03  2:31           ` Xionghu Luo
2021-09-03 14:51             ` Bill Schmidt
2021-09-03 14:53             ` David Edelsohn
2021-09-03 21:44             ` Segher Boessenkool
2021-09-06  8:59               ` Xionghu Luo
2021-09-06 21:57                 ` Segher Boessenkool
