* [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
@ 2021-04-16 7:10 Xiong Hu Luo
2021-05-06 2:36 ` Ping: " Xionghu Luo
0 siblings, 1 reply; 13+ messages in thread
From: Xiong Hu Luo @ 2021-04-16 7:10 UTC (permalink / raw)
To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xiong Hu Luo
fmod/fmodf and remainder/remainderf can be expanded inline instead of
calling the library when building with fast-math, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f0
fnmsubs f1,f2,f0,f1
gcc/ChangeLog:
2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.
gcc/testsuite/ChangeLog:
2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* gcc.target/powerpc/pr97142.c: New test.
---
gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
2 files changed, 66 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a1315523fec..7e0e94e6ba4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
[(set_attr "type" "fp")
(set_attr "isa" "*,<Fisa>")])
+(define_expand "fmod<mode>3"
+ [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+ (use (match_operand:SFDF 1 "gpc_reg_operand"))
+ (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+ "TARGET_HARD_FLOAT
+ && TARGET_FPRND
+ && flag_unsafe_math_optimizations"
+{
+ rtx div = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+ rtx friz = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_btrunc<mode>2 (friz, div));
+
+ emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+ DONE;
+ })
+
+(define_expand "remainder<mode>3"
+ [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+ (use (match_operand:SFDF 1 "gpc_reg_operand"))
+ (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+ "TARGET_HARD_FLOAT
+ && TARGET_FPRND
+ && flag_unsafe_math_optimizations"
+{
+ rtx div = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+ rtx frin = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_round<mode>2 (frin, div));
+
+ emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+ DONE;
+ })
+
(define_insn "*rsqrt<mode>2"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..48f25ca5b5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+ return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+ return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+ return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+ return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+
--
2.27.0.90.geebb51ba8c
* Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-04-16 7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo
@ 2021-05-06 2:36 ` Xionghu Luo
2021-05-14 7:13 ` Xionghu Luo
0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-05-06 2:36 UTC (permalink / raw)
To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw
Gentle ping, thanks.
--
Thanks,
Xionghu
* Re: Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-05-06 2:36 ` Ping: " Xionghu Luo
@ 2021-05-14 7:13 ` Xionghu Luo
2021-06-07 5:08 ` Ping^2: " Xionghu Luo
2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo
0 siblings, 2 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-05-14 7:13 UTC (permalink / raw)
To: gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw
Tested SPEC2017 at -Ofast on P8LE with this patch: 511.povray_r +1.14%,
526.blender_r +1.72%; no obvious changes in the other benchmarks.
--
Thanks,
Xionghu
* Ping^2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-05-14 7:13 ` Xionghu Luo
@ 2021-06-07 5:08 ` Xionghu Luo
2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo
1 sibling, 0 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-06-07 5:08 UTC (permalink / raw)
To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw
Ping, thanks.
--
Thanks,
Xionghu
* Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-05-14 7:13 ` Xionghu Luo
2021-06-07 5:08 ` Ping^2: " Xionghu Luo
@ 2021-06-30 1:44 ` Xionghu Luo
2021-07-09 18:40 ` will schmidt
1 sibling, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-06-30 1:44 UTC (permalink / raw)
To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw
Gentle ping ^2, thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
--
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo
@ 2021-07-09 18:40 ` will schmidt
2021-07-12 1:25 ` Xionghu Luo
0 siblings, 1 reply; 13+ messages in thread
From: will schmidt @ 2021-07-09 18:40 UTC (permalink / raw)
To: Xionghu Luo, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw
On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
> Gentle ping ^2, thanks.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>
>
> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
> > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
> > 526.blender_r +1.72%, no obvious changes to others.
Ok.
> >
> >
> > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
> > > Gentle ping, thanks.
> > >
> > >
> > > On 2021/4/16 15:10, Xiong Hu Luo wrote:
> > > > fmod/fmodf and remainder/remainderf could be expanded instead of library
> > > > call when fast-math build, which is much faster.
> > > >
> > > > fmodf:
> > > > fdivs f0,f1,f2
> > > > friz f0,f0
> > > > fnmsubs f1,f2,f0,f1
> > > >
> > > > remainderf:
> > > > fdivs f0,f1,f2
> > > > frin f0,f0
> > > > fnmsubs f1,f2,f0,f1
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
> > > >
> > > > PR target/97142
That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".
OK.
> > > > * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> > > > (remainder<mode>3): Likewise.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
> > > >
> > > > PR target/97142
> > > > * gcc.target/powerpc/pr97142.c: New test.
Ok.
> > > > ---
> > > > gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
> > > > gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
> > > > 2 files changed, 66 insertions(+)
> > > > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > >
> > > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > > > index a1315523fec..7e0e94e6ba4 100644
> > > > --- a/gcc/config/rs6000/rs6000.md
> > > > +++ b/gcc/config/rs6000/rs6000.md
> > > > @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
> > > > [(set_attr "type" "fp")
> > > > (set_attr "isa" "*,<Fisa>")])
> > > > +(define_expand "fmod<mode>3"
> > > > + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > + "TARGET_HARD_FLOAT
> > > > + && TARGET_FPRND
> > > > + && flag_unsafe_math_optimizations"
> > > > +{
> > > > + rtx div = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > + rtx friz = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_btrunc<mode>2 (friz, div));
> > > > +
> > > > > + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
> > > > + DONE;
> > > > + })
> > > > +
> > > > +(define_expand "remainder<mode>3"
> > > > + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > + "TARGET_HARD_FLOAT
> > > > + && TARGET_FPRND
> > > > + && flag_unsafe_math_optimizations"
> > > > +{
> > > > + rtx div = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > + rtx frin = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_round<mode>2 (frin, div));
> > > > +
> > > > > + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
> > > > + DONE;
> > > > + })
I notice the pattern of arguments to the final emit
is op[0],op[2],fri*,op[1]
while the description comment suggests the generated instruction
will be fnmsubs f1,f2,f0,f1 ;
I don't see any rearranging in the nfms<mode>4 expansions, but
presumably this is correct and just a cosmetic nit that catches my eye.
Ok.
> > > > +
> > > > (define_insn "*rsqrt<mode>2"
> > > > [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
> > > > (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
> > > > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > new file mode 100644
> > > > index 00000000000..48f25ca5b5b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > @@ -0,0 +1,30 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Ofast" } */
> > > > +
> > > > +#include <math.h>
> > > > +
> > > > +float test1 (float x, float y)
> > > > +{
> > > > + return fmodf (x, y);
> > > > +}
> > > > +
> > > > +double test2 (double x, double y)
> > > > +{
> > > > + return fmod (x, y);
> > > > +}
> > > > +
> > > > +float test3 (float x, float y)
> > > > +{
> > > > + return remainderf (x, y);
> > > > +}
> > > > +
> > > > +double test4 (double x, double y)
> > > > +{
> > > > + return remainder (x, y);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
Ok.
I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
instructions as well.
I defer to others on that, of course.. :-)
lgtm,
thanks
-Will
> > > > +
> > > >
>
>
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-07-09 18:40 ` will schmidt
@ 2021-07-12 1:25 ` Xionghu Luo
2021-09-03 2:31 ` Xionghu Luo
0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-07-12 1:25 UTC (permalink / raw)
To: will schmidt, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw
On 2021/7/10 02:40, will schmidt wrote:
> On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
>> Gentle ping ^2, thanks.
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>>
>>
>> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
>>> Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
>>> 526.blender_r +1.72%, no obvious changes to others.
>
> Ok.
>
>>>
>>>
>>> On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
>>>> Gentle ping, thanks.
>>>>
>>>>
>>>> On 2021/4/16 15:10, Xiong Hu Luo wrote:
>>>>> fmod/fmodf and remainder/remainderf could be expanded instead of library
>>>>> call when fast-math build, which is much faster.
>>>>>
>>>>> fmodf:
>>>>> fdivs f0,f1,f2
>>>>> friz f0,f0
>>>>> fnmsubs f1,f2,f0,f1
>>>>>
>>>>> remainderf:
>>>>> fdivs f0,f1,f2
>>>>> frin f0,f0
>>>>> fnmsubs f1,f2,f0,f1
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
>>>>>
>>>>> PR target/97142
>
> That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".
>
> OK.
>
>
>>>>> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>>>>> (remainder<mode>3): Likewise.
>
>
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
>>>>>
>>>>> PR target/97142
>>>>> * gcc.target/powerpc/pr97142.c: New test.
>
> Ok.
>
>>>>> ---
>>>>> gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
>>>>> gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
>>>>> 2 files changed, 66 insertions(+)
>>>>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>>
>>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>>>> index a1315523fec..7e0e94e6ba4 100644
>>>>> --- a/gcc/config/rs6000/rs6000.md
>>>>> +++ b/gcc/config/rs6000/rs6000.md
>>>>> @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
>>>>> [(set_attr "type" "fp")
>>>>> (set_attr "isa" "*,<Fisa>")])
>>>>> +(define_expand "fmod<mode>3"
>>>>> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> + "TARGET_HARD_FLOAT
>>>>> + && TARGET_FPRND
>>>>> + && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> + rtx div = gen_reg_rtx (<MODE>mode);
>>>>> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> + rtx friz = gen_reg_rtx (<MODE>mode);
>>>>> + emit_insn (gen_btrunc<mode>2 (friz, div));
>>>>> +
>>>>> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
>>>>> + DONE;
>>>>> + })
>>>>> +
>>>>> +(define_expand "remainder<mode>3"
>>>>> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>>>>> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
>>>>> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>>>>> + "TARGET_HARD_FLOAT
>>>>> + && TARGET_FPRND
>>>>> + && flag_unsafe_math_optimizations"
>>>>> +{
>>>>> + rtx div = gen_reg_rtx (<MODE>mode);
>>>>> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
>>>>> +
>>>>> + rtx frin = gen_reg_rtx (<MODE>mode);
>>>>> + emit_insn (gen_round<mode>2 (frin, div));
>>>>> +
>>>>> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
>>>>> + DONE;
>>>>> + })
>
> I notice the pattern of arguments to the final emit
> is op[0],op[2],fri*,op[1]
> while the description comment suggests the generated instruction
> will be fnmsubs f1,f2,f0,f1 ;
>
> I don't see any rearranging in the nfms<mode>4 expansions, but
> presumably this is correct and just a cosmetic nit that catches my eye.
From the ISA,
fnmsub FRT,FRA,FRC,FRB
The operation
FRT ← - ( [(FRA) × (FRC)] - (FRB) )
is performed.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
Then the ASM means:
f1 = - (f2 * f0 - f1) = - (f2 * trunc(f1/f2) - f1) = f1 - f2 * trunc(f1/f2)
So f1 is set to the mod result.
>
> Ok.
>
>
>>>>> +
>>>>> (define_insn "*rsqrt<mode>2"
>>>>> [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
>>>>> (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> new file mode 100644
>>>>> index 00000000000..48f25ca5b5b
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Ofast" } */
>>>>> +
>>>>> +#include <math.h>
>>>>> +
>>>>> +float test1 (float x, float y)
>>>>> +{
>>>>> + return fmodf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test2 (double x, double y)
>>>>> +{
>>>>> + return fmod (x, y);
>>>>> +}
>>>>> +
>>>>> +float test3 (float x, float y)
>>>>> +{
>>>>> + return remainderf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test4 (double x, double y)
>>>>> +{
>>>>> + return remainder (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
>
>
> Ok.
> I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
> instructions as well.
> I defer to others on that, of course.. :-)
Thanks, I will add the checks below:
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
index 48f25ca5b5b..081ab40b4c0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr97142.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -27,4 +27,11 @@ double test4 (double x, double y)
/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
+
>
> lgtm,
> thanks
> -Will
>
>
>
>>>>> +
>>>>>
>>
>>
>
--
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-07-12 1:25 ` Xionghu Luo
@ 2021-09-03 2:31 ` Xionghu Luo
2021-09-03 14:51 ` Bill Schmidt
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Xionghu Luo @ 2021-09-03 2:31 UTC (permalink / raw)
To: will schmidt, gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw
Resend the patch that addressed Will's comments.
fmod/fmodf and remainder/remainderf can be expanded inline instead of
calling the library when building with fast-math, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f0
fnmsubs f1,f2,f0,f1
SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
gcc/ChangeLog:
2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.
gcc/testsuite/ChangeLog:
2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* gcc.target/powerpc/pr97142.c: New test.
---
gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr97142.c | 35 +++++++++++++++++++++
2 files changed, 71 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8cdc42533c..84820d3b5cb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4932,6 +4932,42 @@ (define_insn "fre<sd>"
[(set_attr "type" "fp")
(set_attr "isa" "*,<Fisa>")])
+(define_expand "fmod<mode>3"
+ [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+ (use (match_operand:SFDF 1 "gpc_reg_operand"))
+ (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+ "TARGET_HARD_FLOAT
+ && TARGET_FPRND
+ && flag_unsafe_math_optimizations"
+{
+ rtx div = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+ rtx friz = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_btrunc<mode>2 (friz, div));
+
+ emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+ DONE;
+ })
+
+(define_expand "remainder<mode>3"
+ [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+ (use (match_operand:SFDF 1 "gpc_reg_operand"))
+ (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+ "TARGET_HARD_FLOAT
+ && TARGET_FPRND
+ && flag_unsafe_math_optimizations"
+{
+ rtx div = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+ rtx frin = gen_reg_rtx (<MODE>mode);
+ emit_insn (gen_round<mode>2 (frin, div));
+
+ emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+ DONE;
+ })
+
(define_insn "*rsqrt<mode>2"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..e5306eb681b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+ return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+ return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+ return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+ return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
--
2.25.1
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-09-03 2:31 ` Xionghu Luo
@ 2021-09-03 14:51 ` Bill Schmidt
2021-09-03 14:53 ` David Edelsohn
2021-09-03 21:44 ` Segher Boessenkool
2 siblings, 0 replies; 13+ messages in thread
From: Bill Schmidt @ 2021-09-03 14:51 UTC (permalink / raw)
To: Xionghu Luo, will schmidt, gcc-patches; +Cc: segher, dje.gcc, linkw
Hi Xionghu,
This looks okay to me. Recommend maintainers approve.
Thanks!
Bill
On 9/2/21 9:31 PM, Xionghu Luo wrote:
> Resend the patch that addressed Will's comments.
>
>
> fmod/fmodf and remainder/remainderf can be expanded inline instead of
> calling the library when building with fast-math, which is much faster.
>
> fmodf:
> fdivs f0,f1,f2
> friz f0,f0
> fnmsubs f1,f2,f0,f1
>
> remainderf:
> fdivs f0,f1,f2
> frin f0,f0
> fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
>
> PR target/97142
> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> (remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
>
> PR target/97142
> * gcc.target/powerpc/pr97142.c: New test.
> ---
> gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
> gcc/testsuite/gcc.target/powerpc/pr97142.c | 35 +++++++++++++++++++++
> 2 files changed, 71 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c8cdc42533c..84820d3b5cb 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -4932,6 +4932,42 @@ (define_insn "fre<sd>"
> [(set_attr "type" "fp")
> (set_attr "isa" "*,<Fisa>")])
>
> +(define_expand "fmod<mode>3"
> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> + "TARGET_HARD_FLOAT
> + && TARGET_FPRND
> + && flag_unsafe_math_optimizations"
> +{
> + rtx div = gen_reg_rtx (<MODE>mode);
> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> +
> + rtx friz = gen_reg_rtx (<MODE>mode);
> + emit_insn (gen_btrunc<mode>2 (friz, div));
> +
> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
> + DONE;
> + })
> +
> +(define_expand "remainder<mode>3"
> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> + "TARGET_HARD_FLOAT
> + && TARGET_FPRND
> + && flag_unsafe_math_optimizations"
> +{
> + rtx div = gen_reg_rtx (<MODE>mode);
> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> +
> + rtx frin = gen_reg_rtx (<MODE>mode);
> + emit_insn (gen_round<mode>2 (frin, div));
> +
> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
> + DONE;
> + })
> +
> (define_insn "*rsqrt<mode>2"
> [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
> (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> new file mode 100644
> index 00000000000..e5306eb681b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast" } */
> +
> +#include <math.h>
> +
> +float test1 (float x, float y)
> +{
> + return fmodf (x, y);
> +}
> +
> +double test2 (double x, double y)
> +{
> + return fmod (x, y);
> +}
> +
> +float test3 (float x, float y)
> +{
> + return remainderf (x, y);
> +}
> +
> +double test4 (double x, double y)
> +{
> + return remainder (x, y);
> +}
> +
> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> +/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-09-03 2:31 ` Xionghu Luo
2021-09-03 14:51 ` Bill Schmidt
@ 2021-09-03 14:53 ` David Edelsohn
2021-09-03 21:44 ` Segher Boessenkool
2 siblings, 0 replies; 13+ messages in thread
From: David Edelsohn @ 2021-09-03 14:53 UTC (permalink / raw)
To: Xionghu Luo
Cc: will schmidt, GCC Patches, Bill Schmidt, Segher Boessenkool, linkw
On Thu, Sep 2, 2021 at 10:31 PM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> Resend the patch that addressed Will's comments.
>
>
> fmod/fmodf and remainder/remainderf can be expanded inline instead of
> calling the library when building with fast-math, which is much faster.
>
> fmodf:
> fdivs f0,f1,f2
> friz f0,f0
> fnmsubs f1,f2,f0,f1
>
> remainderf:
> fdivs f0,f1,f2
> frin f0,f0
> fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
>
> PR target/97142
> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> (remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03 Xionghu Luo <luoxhu@linux.ibm.com>
>
> PR target/97142
> * gcc.target/powerpc/pr97142.c: New test.
Okay.
Thanks, David
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-09-03 2:31 ` Xionghu Luo
2021-09-03 14:51 ` Bill Schmidt
2021-09-03 14:53 ` David Edelsohn
@ 2021-09-03 21:44 ` Segher Boessenkool
2021-09-06 8:59 ` Xionghu Luo
2 siblings, 1 reply; 13+ messages in thread
From: Segher Boessenkool @ 2021-09-03 21:44 UTC (permalink / raw)
To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw
Hi!
On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
> fmod/fmodf and remainder/remainderf can be expanded inline instead of
> calling the library when building with fast-math, which is much faster.
Thank you very much for this patch.
Some trivial comments if you haven't committed it yet:
> +(define_expand "fmod<mode>3"
> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> + "TARGET_HARD_FLOAT
> + && TARGET_FPRND
> + && flag_unsafe_math_optimizations"
It should have one extra space before each && here:
  "TARGET_HARD_FLOAT
   && TARGET_FPRND
   && flag_unsafe_math_optimizations"
(so that everything inside of the string aligns).
> +(define_expand "remainder<mode>3"
(same here).
> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
These are negative tests, so won't spuriously fail, but this does not
test for the function prefixes we can have. See
gcc.target/powerpc/builtins-1.c for example.
Again, thank you, and thanks to everyone else for the patch review
action :-)
Segher
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-09-03 21:44 ` Segher Boessenkool
@ 2021-09-06 8:59 ` Xionghu Luo
2021-09-06 21:57 ` Segher Boessenkool
0 siblings, 1 reply; 13+ messages in thread
From: Xionghu Luo @ 2021-09-06 8:59 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw
On 2021/9/4 05:44, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
>> fmod/fmodf and remainder/remainderf can be expanded inline instead of
>> calling the library when building with fast-math, which is much faster.
>
> Thank you very much for this patch.
>
> Some trivial comments if you haven't committed it yet:
>
>> +(define_expand "fmod<mode>3"
>> + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>> + (use (match_operand:SFDF 1 "gpc_reg_operand"))
>> + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>> + "TARGET_HARD_FLOAT
>> + && TARGET_FPRND
>> + && flag_unsafe_math_optimizations"
>
> It should have one extra space before each && here:
OK.
>
>   "TARGET_HARD_FLOAT
>    && TARGET_FPRND
>    && flag_unsafe_math_optimizations"
>
> (so that everything inside of the string aligns).
>
>> +(define_expand "remainder<mode>3"
>
> (same here).
>
>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
>
> These are negative tests, so won't spuriously fail, but this does not
> test for the function prefixes we can have. See
> gcc.target/powerpc/builtins-1.c for example.
Thanks. Verified that different calls are generated on different platforms
without this patch.
P8BE-64: bl __fmodf_finite
P8BE-32: b __fmodf_finite
P8LE-64: bl fmodf
"l", "__" and "_finite" are optional, so is it OK to check them with the patterns below?
+/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } } */
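To illustrate what the relaxed patterns accept, here is a small stand-alone
check using POSIX regcomp/regexec. It is a sketch only: DejaGnu uses Tcl
regular expressions, so the \m and \M word-boundary escapes are approximated
here with explicit character classes (WB_START and WB_END, both hypothetical
names), and ere_matches is a helper invented for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Approximations of Tcl's \m and \M word boundaries for POSIX ERE.  */
#define WB_START "(^|[^[:alnum:]_])"
#define WB_END   "($|[^[:alnum:]_])"

/* The original pattern and the relaxed ones proposed above.  */
static const char strict_fmodf[]  = WB_START "bl fmodf" WB_END;
static const char relaxed_fmodf[] = WB_START "bl? (__)?fmodf(_finite)?" WB_END;
static const char relaxed_fmod[]  = WB_START "bl? (__)?fmod(_finite)?" WB_END;

/* Return 1 if PATTERN matches anywhere in TEXT, 0 if it does not,
   and -1 if PATTERN fails to compile.  */
static int ere_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With these definitions, strict_fmodf matches only the "bl fmodf" form, while
relaxed_fmodf matches all three call forms listed above. relaxed_fmod does not
match a call to fmodf, because the trailing boundary rejects the extra "f", so
the separate fmod and fmodf patterns stay distinct.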
>
> Again, thank you, and thanks to everyone else for the patch review
> action :-)
>
>
> Segher
>
--
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
2021-09-06 8:59 ` Xionghu Luo
@ 2021-09-06 21:57 ` Segher Boessenkool
0 siblings, 0 replies; 13+ messages in thread
From: Segher Boessenkool @ 2021-09-06 21:57 UTC (permalink / raw)
To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw
Hi!
On Mon, Sep 06, 2021 at 04:59:27PM +0800, Xionghu Luo wrote:
> On 2021/9/4 05:44, Segher Boessenkool wrote:
> >>+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> >
> >These are negative tests, so won't spuriously fail, but this does not
> >test for the function prefixes we can have. See
> >gcc.target/powerpc/builtins-1.c for example.
>
> Thanks. Verified that different calls are generated on different platforms
> without this patch.
>
> P8BE-64: bl __fmodf_finite
> P8BE-32: b __fmodf_finite
> P8LE-64: bl fmodf
Ah, it won't use the "dot-names" here, okay. I think for Darwin you
need to allow a single underscore, but you'll find out (or Iain will,
most likely ;-) )
> "l", "__" and "_finite" are optional, so is it OK to check them with the
> patterns below?
>
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } } */
You could even do
/* { dg-final { scan-assembler-not {(?n)\mb.*fmod} } } */
/* { dg-final { scan-assembler-not {(?n)\mb.*remainder} } } */
or even
/* { dg-final { scan-assembler-not {fmod} } } */
/* { dg-final { scan-assembler-not {remainder} } } */
(and the testcase name will not accidentally match either of those REs
either, I checked :-) )
And yeah, on some subtargets the calls will be tail-optimised, good
find. You can get around that (in general, on any target) by doing
float test1 (float x, float y)
{
float z = fmodf (x, y);
asm (""); // to prevent tail calls
return z;
}
but what you do is fine as well, and much more elegant.
Please pick (and test ;-) ) whichever option you like best. Thanks!
Segher
end of thread, other threads:[~2021-09-06 21:58 UTC | newest]
Thread overview: 13+ messages
2021-04-16 7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo
2021-05-06 2:36 ` Ping: " Xionghu Luo
2021-05-14 7:13 ` Xionghu Luo
2021-06-07 5:08 ` Ping^2: " Xionghu Luo
2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo
2021-07-09 18:40 ` will schmidt
2021-07-12 1:25 ` Xionghu Luo
2021-09-03 2:31 ` Xionghu Luo
2021-09-03 14:51 ` Bill Schmidt
2021-09-03 14:53 ` David Edelsohn
2021-09-03 21:44 ` Segher Boessenkool
2021-09-06 8:59 ` Xionghu Luo
2021-09-06 21:57 ` Segher Boessenkool