* [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
@ 2021-04-16 7:10 Xiong Hu Luo
  2021-05-06 2:36 ` Ping: " Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: Xiong Hu Luo @ 2021-04-16 7:10 UTC
To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw, Xiong Hu Luo

fmod/fmodf and remainder/remainderf can be expanded inline instead of
calling the library when building with fast-math, which is much faster.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

remainderf:
     fdivs   f0,f1,f2
     frin    f0,f0
     fnmsubs f1,f2,f0,f1

gcc/ChangeLog:

2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
	(remainder<mode>3): Likewise.

gcc/testsuite/ChangeLog:

2021-04-16  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* gcc.target/powerpc/pr97142.c: New test.
---
 gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a1315523fec..7e0e94e6ba4 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
   [(set_attr "type" "fp")
    (set_attr "isa" "*,<Fisa>")])
 
+(define_expand "fmod<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+   (use (match_operand:SFDF 1 "gpc_reg_operand"))
+   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+   && TARGET_FPRND
+   && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx friz = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_btrunc<mode>2 (friz, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+  DONE;
+ })
+
+(define_expand "remainder<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+   (use (match_operand:SFDF 1 "gpc_reg_operand"))
+   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+   && TARGET_FPRND
+   && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx frin = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_round<mode>2 (frin, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+  DONE;
+ })
+
 (define_insn "*rsqrt<mode>2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
 	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..48f25ca5b5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+  return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+  return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+  return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+  return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+
-- 
2.27.0.90.geebb51ba8c
* Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] 2021-04-16 7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo @ 2021-05-06 2:36 ` Xionghu Luo 2021-05-14 7:13 ` Xionghu Luo 0 siblings, 1 reply; 13+ messages in thread From: Xionghu Luo @ 2021-05-06 2:36 UTC (permalink / raw) To: gcc-patches; +Cc: segher, dje.gcc, wschmidt, guojiufu, linkw Gentle ping, thanks. On 2021/4/16 15:10, Xiong Hu Luo wrote: > fmod/fmodf and remainder/remainderf could be expanded instead of library > call when fast-math build, which is much faster. > > fmodf: > fdivs f0,f1,f2 > friz f0,f0 > fnmsubs f1,f2,f0,f1 > > remainderf: > fdivs f0,f1,f2 > frin f0,f0 > fnmsubs f1,f2,f0,f1 > > gcc/ChangeLog: > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> > > PR target/97142 > * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. > (remainder<mode>3): Likewise. > > gcc/testsuite/ChangeLog: > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> > > PR target/97142 > * gcc.target/powerpc/pr97142.c: New test. 
-- 
Thanks,
Xionghu
* Re: Ping: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] 2021-05-06 2:36 ` Ping: " Xionghu Luo @ 2021-05-14 7:13 ` Xionghu Luo 2021-06-07 5:08 ` Ping^2: " Xionghu Luo 2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo 0 siblings, 2 replies; 13+ messages in thread From: Xionghu Luo @ 2021-05-14 7:13 UTC (permalink / raw) To: gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%, 526.blender_r +1.72%, no obvious changes to others. On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote: > Gentle ping, thanks. > > > On 2021/4/16 15:10, Xiong Hu Luo wrote: >> fmod/fmodf and remainder/remainderf could be expanded instead of library >> call when fast-math build, which is much faster. >> >> fmodf: >> fdivs f0,f1,f2 >> friz f0,f0 >> fnmsubs f1,f2,f0,f1 >> >> remainderf: >> fdivs f0,f1,f2 >> frin f0,f0 >> fnmsubs f1,f2,f0,f1 >> >> gcc/ChangeLog: >> >> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >> >> PR target/97142 >> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. >> (remainder<mode>3): Likewise. >> >> gcc/testsuite/ChangeLog: >> >> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >> >> PR target/97142 >> * gcc.target/powerpc/pr97142.c: New test. 
-- 
Thanks,
Xionghu
* Ping^2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] 2021-05-14 7:13 ` Xionghu Luo @ 2021-06-07 5:08 ` Xionghu Luo 2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo 1 sibling, 0 replies; 13+ messages in thread From: Xionghu Luo @ 2021-06-07 5:08 UTC (permalink / raw) To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw Ping, thanks. On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote: > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%, > 526.blender_r +1.72%, no obvious changes to others. > > > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote: >> Gentle ping, thanks. >> >> >> On 2021/4/16 15:10, Xiong Hu Luo wrote: >>> fmod/fmodf and remainder/remainderf could be expanded instead of library >>> call when fast-math build, which is much faster. >>> >>> fmodf: >>> fdivs f0,f1,f2 >>> friz f0,f0 >>> fnmsubs f1,f2,f0,f1 >>> >>> remainderf: >>> fdivs f0,f1,f2 >>> frin f0,f0 >>> fnmsubs f1,f2,f0,f1 >>> >>> gcc/ChangeLog: >>> >>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>> >>> PR target/97142 >>> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. >>> (remainder<mode>3): Likewise. >>> >>> gcc/testsuite/ChangeLog: >>> >>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>> >>> PR target/97142 >>> * gcc.target/powerpc/pr97142.c: New test. 
-- 
Thanks,
Xionghu
* Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] 2021-05-14 7:13 ` Xionghu Luo 2021-06-07 5:08 ` Ping^2: " Xionghu Luo @ 2021-06-30 1:44 ` Xionghu Luo 2021-07-09 18:40 ` will schmidt 1 sibling, 1 reply; 13+ messages in thread From: Xionghu Luo @ 2021-06-30 1:44 UTC (permalink / raw) To: gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw Gentle ping ^2, thanks. https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote: > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%, > 526.blender_r +1.72%, no obvious changes to others. > > > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote: >> Gentle ping, thanks. >> >> >> On 2021/4/16 15:10, Xiong Hu Luo wrote: >>> fmod/fmodf and remainder/remainderf could be expanded instead of library >>> call when fast-math build, which is much faster. >>> >>> fmodf: >>> fdivs f0,f1,f2 >>> friz f0,f0 >>> fnmsubs f1,f2,f0,f1 >>> >>> remainderf: >>> fdivs f0,f1,f2 >>> frin f0,f0 >>> fnmsubs f1,f2,f0,f1 >>> >>> gcc/ChangeLog: >>> >>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>> >>> PR target/97142 >>> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. >>> (remainder<mode>3): Likewise. >>> >>> gcc/testsuite/ChangeLog: >>> >>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>> >>> PR target/97142 >>> * gcc.target/powerpc/pr97142.c: New test. 
-- 
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
  2021-06-30 1:44 ` Ping ^ 2: " Xionghu Luo
@ 2021-07-09 18:40 ` will schmidt
  2021-07-12 1:25 ` Xionghu Luo
  0 siblings, 1 reply; 13+ messages in thread
From: will schmidt @ 2021-07-09 18:40 UTC
To: Xionghu Luo, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw

On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote:
> Gentle ping ^2, thanks.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>
>
> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote:
> > Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%,
> > 526.blender_r +1.72%, no obvious changes to others.

Ok.

> >
> >
> > On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote:
> > > Gentle ping, thanks.
> > >
> > >
> > > On 2021/4/16 15:10, Xiong Hu Luo wrote:
> > > > fmod/fmodf and remainder/remainderf could be expanded instead of library
> > > > call when fast-math build, which is much faster.
> > > >
> > > > fmodf:
> > > > fdivs f0,f1,f2
> > > > friz f0,f0
> > > > fnmsubs f1,f2,f0,f1
> > > >
> > > > remainderf:
> > > > fdivs f0,f1,f2
> > > > frin f0,f0
> > > > fnmsubs f1,f2,f0,f1
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
> > > >
> > > > PR target/97142

That PR is "Bug 97142 - __builtin_fmod not optimized on POWER".

OK.

> > > > * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> > > > (remainder<mode>3): Likewise.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com>
> > > >
> > > > PR target/97142
> > > > * gcc.target/powerpc/pr97142.c: New test.

Ok.

> > > > ---
> > > > gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++
> > > > gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++
> > > > 2 files changed, 66 insertions(+)
> > > > create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > >
> > > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > > > index a1315523fec..7e0e94e6ba4 100644
> > > > --- a/gcc/config/rs6000/rs6000.md
> > > > +++ b/gcc/config/rs6000/rs6000.md
> > > > @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>"
> > > > [(set_attr "type" "fp")
> > > > (set_attr "isa" "*,<Fisa>")])
> > > > +(define_expand "fmod<mode>3"
> > > > + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > + "TARGET_HARD_FLOAT
> > > > + && TARGET_FPRND
> > > > + && flag_unsafe_math_optimizations"
> > > > +{
> > > > + rtx div = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > + rtx friz = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_btrunc<mode>2 (friz, div));
> > > > +
> > > > + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz,
> > > > operands[1]));
> > > > + DONE;
> > > > + })
> > > > +
> > > > +(define_expand "remainder<mode>3"
> > > > + [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 1 "gpc_reg_operand"))
> > > > + (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> > > > + "TARGET_HARD_FLOAT
> > > > + && TARGET_FPRND
> > > > + && flag_unsafe_math_optimizations"
> > > > +{
> > > > + rtx div = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
> > > > +
> > > > + rtx frin = gen_reg_rtx (<MODE>mode);
> > > > + emit_insn (gen_round<mode>2 (frin, div));
> > > > +
> > > > + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin,
> > > > operands[1]));
> > > > + DONE;
> > > > + })

I notice the pattern of arguments to the final emit
is op[0],op[2],fri*,op[1]
while the description comment suggests the generated instruction
will be fnmsubs f1,f2,f0,f1 ;

I don't see any rearranging in the nfms<mode>4 expansions, but
presumably this is correct and just a cosmetic nit that catches my eye.

Ok.

> > > > +
> > > > (define_insn "*rsqrt<mode>2"
> > > > [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
> > > > (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
> > > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > new file mode 100644
> > > > index 00000000000..48f25ca5b5b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
> > > > @@ -0,0 +1,30 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Ofast" } */
> > > > +
> > > > +#include <math.h>
> > > > +
> > > > +float test1 (float x, float y)
> > > > +{
> > > > + return fmodf (x, y);
> > > > +}
> > > > +
> > > > +double test2 (double x, double y)
> > > > +{
> > > > + return fmod (x, y);
> > > > +}
> > > > +
> > > > +float test3 (float x, float y)
> > > > +{
> > > > + return remainderf (x, y);
> > > > +}
> > > > +
> > > > +double test4 (double x, double y)
> > > > +{
> > > > + return remainder (x, y);
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> > > > +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */

Ok.
I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
instructions as well.
I defer to others on that, of course.. :-)

lgtm,
thanks
-Will

> > > > +
> > > >
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] 2021-07-09 18:40 ` will schmidt @ 2021-07-12 1:25 ` Xionghu Luo 2021-09-03 2:31 ` Xionghu Luo 0 siblings, 1 reply; 13+ messages in thread From: Xionghu Luo @ 2021-07-12 1:25 UTC (permalink / raw) To: will schmidt, gcc-patches; +Cc: wschmidt, dje.gcc, segher, linkw On 2021/7/10 02:40, will schmidt wrote: > On Wed, 2021-06-30 at 09:44 +0800, Xionghu Luo via Gcc-patches wrote: >> Gentle ping ^2, thanks. >> >> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html >> >> >> On 2021/5/14 15:13, Xionghu Luo via Gcc-patches wrote: >>> Test SPEC2017 Ofast P8LE for this patch : 511.povray_r +1.14%, >>> 526.blender_r +1.72%, no obvious changes to others. > > Ok. > >>> >>> >>> On 2021/5/6 10:36, Xionghu Luo via Gcc-patches wrote: >>>> Gentle ping, thanks. >>>> >>>> >>>> On 2021/4/16 15:10, Xiong Hu Luo wrote: >>>>> fmod/fmodf and remainder/remainderf could be expanded instead of library >>>>> call when fast-math build, which is much faster. >>>>> >>>>> fmodf: >>>>> fdivs f0,f1,f2 >>>>> friz f0,f0 >>>>> fnmsubs f1,f2,f0,f1 >>>>> >>>>> remainderf: >>>>> fdivs f0,f1,f2 >>>>> frin f0,f0 >>>>> fnmsubs f1,f2,f0,f1 >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>>>> >>>>> PR target/97142 > > That PR is " Bug 97142 > - __builtin_fmod not optimized on POWER " > > OK. > > >>>>> * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. >>>>> (remainder<mode>3): Likewise. > > >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>> 2021-04-16 Xionghu Luo <luoxhu@linux.ibm.com> >>>>> >>>>> PR target/97142 >>>>> * gcc.target/powerpc/pr97142.c: New test. > > Ok. 
> >>>>> --- >>>>> gcc/config/rs6000/rs6000.md | 36 ++++++++++++++++++++++ >>>>> gcc/testsuite/gcc.target/powerpc/pr97142.c | 30 ++++++++++++++++++ >>>>> 2 files changed, 66 insertions(+) >>>>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c >>>>> >>>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >>>>> index a1315523fec..7e0e94e6ba4 100644 >>>>> --- a/gcc/config/rs6000/rs6000.md >>>>> +++ b/gcc/config/rs6000/rs6000.md >>>>> @@ -4902,6 +4902,42 @@ (define_insn "fre<sd>" >>>>> [(set_attr "type" "fp") >>>>> (set_attr "isa" "*,<Fisa>")]) >>>>> +(define_expand "fmod<mode>3" >>>>> + [(use (match_operand:SFDF 0 "gpc_reg_operand")) >>>>> + (use (match_operand:SFDF 1 "gpc_reg_operand")) >>>>> + (use (match_operand:SFDF 2 "gpc_reg_operand"))] >>>>> + "TARGET_HARD_FLOAT >>>>> + && TARGET_FPRND >>>>> + && flag_unsafe_math_optimizations" >>>>> +{ >>>>> + rtx div = gen_reg_rtx (<MODE>mode); >>>>> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2])); >>>>> + >>>>> + rtx friz = gen_reg_rtx (<MODE>mode); >>>>> + emit_insn (gen_btrunc<mode>2 (friz, div)); >>>>> + >>>>> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, >>>>> operands[1])); >>>>> + DONE; >>>>> + }) >>>>> + >>>>> +(define_expand "remainder<mode>3" >>>>> + [(use (match_operand:SFDF 0 "gpc_reg_operand")) >>>>> + (use (match_operand:SFDF 1 "gpc_reg_operand")) >>>>> + (use (match_operand:SFDF 2 "gpc_reg_operand"))] >>>>> + "TARGET_HARD_FLOAT >>>>> + && TARGET_FPRND >>>>> + && flag_unsafe_math_optimizations" >>>>> +{ >>>>> + rtx div = gen_reg_rtx (<MODE>mode); >>>>> + emit_insn (gen_div<mode>3 (div, operands[1], operands[2])); >>>>> + >>>>> + rtx frin = gen_reg_rtx (<MODE>mode); >>>>> + emit_insn (gen_round<mode>2 (frin, div)); >>>>> + >>>>> + emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, >>>>> operands[1])); >>>>> + DONE; >>>>> + }) > > I notice the pattern of arguments to the final emit > is op[0],op[2],fri*,op[1] > while the description comment 
suggests the generated instruction
> will be fnmsubs f1,f2,f0,f1 ;
>
> I don't see any rearranging in the nfms<mode>4 expansions, but
> presumably this is correct and just a cosmetic nit that catches my eye.

From the ISA, the instruction is

  fnmsub FRT,FRA,FRC,FRB

and the operation performed is FRT ← -( [(FRA) × (FRC)] - (FRB) ).

For fmodf:

  fdivs f0,f1,f2
  friz f0,f0
  fnmsubs f1,f2,f0,f1

the assembly therefore computes:

  f1 = -(f2 * f0 - f1) = -(f2 * trunc(f1/f2) - f1) = f1 - f2 * trunc(f1/f2)

So f1 is set to the mod result, and the operand order
op[0],op[2],fri*,op[1] matches FRT,FRA,FRC,FRB.

>
> Ok.
>
>
>>>>> +
>>>>> (define_insn "*rsqrt<mode>2"
>>>>> [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
>>>>> (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> new file mode 100644
>>>>> index 00000000000..48f25ca5b5b
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
>>>>> @@ -0,0 +1,30 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-Ofast" } */
>>>>> +
>>>>> +#include <math.h>
>>>>> +
>>>>> +float test1 (float x, float y)
>>>>> +{
>>>>> + return fmodf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test2 (double x, double y)
>>>>> +{
>>>>> + return fmod (x, y);
>>>>> +}
>>>>> +
>>>>> +float test3 (float x, float y)
>>>>> +{
>>>>> + return remainderf (x, y);
>>>>> +}
>>>>> +
>>>>> +double test4 (double x, double y)
>>>>> +{
>>>>> + return remainder (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>>>>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
>
>
> Ok.
> I'd be tempted to add scan-assembler checks for the fdivs,fri*,fnmsubs
> instructions as well.
> I defer to others on that, of course.. :-)

Thanks, I will add the checks below:

diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
index 48f25ca5b5b..081ab40b4c0 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr97142.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -27,4 +27,11 @@ double test4 (double x, double y)
 /* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
 /* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
+

>
> lgtm,
> thanks
> -Will

-- 
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: Xionghu Luo @ 2021-09-03  2:31 UTC
To: will schmidt, gcc-patches; +Cc: wschmidt, segher, dje.gcc, linkw

Resend the patch that addressed Will's comments.


fmod/fmodf and remainder/remainderf could be expanded instead of library
call when fast-math build, which is much faster.

fmodf:
     fdivs   f0,f1,f2
     friz    f0,f0
     fnmsubs f1,f2,f0,f1

remainderf:
     fdivs   f0,f1,f2
     frin    f0,f0
     fnmsubs f1,f2,f0,f1

SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%

gcc/ChangeLog:

2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
	(remainder<mode>3): Likewise.

gcc/testsuite/ChangeLog:

2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/97142
	* gcc.target/powerpc/pr97142.c: New test.
---
 gcc/config/rs6000/rs6000.md                | 36 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr97142.c | 35 +++++++++++++++++++++
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr97142.c

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8cdc42533c..84820d3b5cb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4932,6 +4932,42 @@ (define_insn "fre<sd>"
   [(set_attr "type" "fp")
    (set_attr "isa" "*,<Fisa>")])
 
+(define_expand "fmod<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+   (use (match_operand:SFDF 1 "gpc_reg_operand"))
+   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx friz = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_btrunc<mode>2 (friz, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], friz, operands[1]));
+  DONE;
+  })
+
+(define_expand "remainder<mode>3"
+  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
+   (use (match_operand:SFDF 1 "gpc_reg_operand"))
+   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
+  "TARGET_HARD_FLOAT
+  && TARGET_FPRND
+  && flag_unsafe_math_optimizations"
+{
+  rtx div = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_div<mode>3 (div, operands[1], operands[2]));
+
+  rtx frin = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_round<mode>2 (frin, div));
+
+  emit_insn (gen_nfms<mode>4 (operands[0], operands[2], frin, operands[1]));
+  DONE;
+  })
+
 (define_insn "*rsqrt<mode>2"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa")
 	(unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,wa")]
diff --git a/gcc/testsuite/gcc.target/powerpc/pr97142.c b/gcc/testsuite/gcc.target/powerpc/pr97142.c
new file mode 100644
index 00000000000..e5306eb681b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr97142.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#include <math.h>
+
+float test1 (float x, float y)
+{
+  return fmodf (x, y);
+}
+
+double test2 (double x, double y)
+{
+  return fmod (x, y);
+}
+
+float test3 (float x, float y)
+{
+  return remainderf (x, y);
+}
+
+double test4 (double x, double y)
+{
+  return remainder (x, y);
+}
+
+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
+/* { dg-final { scan-assembler-times {\mfdiv\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfdivs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsub\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfnmsubs\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfriz\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mfrin\M} 2 } } */
-- 
2.25.1
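The remainder expansion in the patch differs from the fmod one only in the rounding step (gen_round, i.e. frin, in place of gen_btrunc/friz). A rough C model of it follows, under the same caveats as any sketch: round_half_away and remainder_fast are hypothetical names, the integer cast assumes the quotient fits in long long, and the final multiply-subtract is not fused. As far as I can tell, frin rounds ties away from zero while the C library remainder rounds ties to even, so the two can disagree in sign when x/y is exactly halfway between integers; that looks like the kind of difference -funsafe-math-optimizations is expected to tolerate.

```c
#include <assert.h>

/* Round to nearest with ties away from zero, standing in for frin.
   Assumption of this sketch: |q| fits in long long.  */
static double round_half_away (double q)
{
  double t = (double) (long long) q;  /* truncate toward zero */
  double frac = q - t;                /* fractional part, same sign as q */
  if (frac >= 0.5)
    t += 1.0;
  else if (frac <= -0.5)
    t -= 1.0;
  return t;
}

/* Hypothetical helper mirroring the three-instruction remainder
   sequence; the last step is rounded twice rather than fused.  */
double remainder_fast (double x, double y)
{
  double q = x / y;                /* fdiv   f0,f1,f2    */
  double n = round_half_away (q);  /* frin   f0,f0       */
  return -((y * n) - x);           /* fnmsub f1,f2,f0,f1 */
}

/* Self-check on exactly representable inputs.  */
void remainder_fast_selfcheck (void)
{
  assert (remainder_fast (5.5, 2.0) == -0.5);  /* 2.75 rounds to 3 */
  assert (remainder_fast (7.0, 2.0) == -1.0);  /* tie 3.5 rounds to 4 */
  /* Tie 2.5 rounds away from zero to 3, giving -1; an IEEE remainder
     would pick the even quotient 2 and give +1 instead.  */
  assert (remainder_fast (5.0, 2.0) == -1.0);
}
```

The magnitude of the result is the same in the tie case; only the choice of quotient, and therefore the sign, changes.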
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: Bill Schmidt @ 2021-09-03 14:51 UTC
To: Xionghu Luo, will schmidt, gcc-patches; +Cc: segher, dje.gcc, linkw

Hi Xionghu,

This looks okay to me.  Recommend maintainers approve.

Thanks!
Bill

On 9/2/21 9:31 PM, Xionghu Luo wrote:
> Resend the patch that addressed Will's comments.
>
>
> fmod/fmodf and remainder/remainderf could be expanded instead of library
> call when fast-math build, which is much faster.
>
> fmodf:
>      fdivs   f0,f1,f2
>      friz    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> remainderf:
>      fdivs   f0,f1,f2
>      frin    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
> 	PR target/97142
> 	* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
> 	(remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
> 	PR target/97142
> 	* gcc.target/powerpc/pr97142.c: New test.
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: David Edelsohn @ 2021-09-03 14:53 UTC
To: Xionghu Luo
Cc: will schmidt, GCC Patches, Bill Schmidt, Segher Boessenkool, linkw

On Thu, Sep 2, 2021 at 10:31 PM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> Resend the patch that addressed Will's comments.
>
>
> fmod/fmodf and remainder/remainderf could be expanded instead of library
> call when fast-math build, which is much faster.
>
> fmodf:
>      fdivs   f0,f1,f2
>      friz    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> remainderf:
>      fdivs   f0,f1,f2
>      frin    f0,f0
>      fnmsubs f1,f2,f0,f1
>
> SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
>
> gcc/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
>         PR target/97142
>         * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
>         (remainder<mode>3): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2021-09-03  Xionghu Luo  <luoxhu@linux.ibm.com>
>
>         PR target/97142
>         * gcc.target/powerpc/pr97142.c: New test.

Okay.

Thanks, David
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: Segher Boessenkool @ 2021-09-03 21:44 UTC
To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw

Hi!

On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
> fmod/fmodf and remainder/remainderf could be expanded instead of library
> call when fast-math build, which is much faster.

Thank you very much for this patch.

Some trivial comments if you haven't committed it yet:

> +(define_expand "fmod<mode>3"
> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
> +   (use (match_operand:SFDF 1 "gpc_reg_operand"))
> +   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
> +  "TARGET_HARD_FLOAT
> +  && TARGET_FPRND
> +  && flag_unsafe_math_optimizations"

It should have one extra space before each && here:

  "TARGET_HARD_FLOAT
   && TARGET_FPRND
   && flag_unsafe_math_optimizations"

(so that everything inside of the string aligns).

> +(define_expand "remainder<mode>3"

(same here).

> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */

These are negative tests, so won't spuriously fail, but this does not
test for the function prefixes we can have.  See
gcc.target/powerpc/builtins-1.c for example.

Again, thank you, and thanks to everyone else for the patch review
action :-)


Segher
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: Xionghu Luo @ 2021-09-06  8:59 UTC
To: Segher Boessenkool; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw

On 2021/9/4 05:44, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Sep 03, 2021 at 10:31:24AM +0800, Xionghu Luo wrote:
>> fmod/fmodf and remainder/remainderf could be expanded instead of library
>> call when fast-math build, which is much faster.
>
> Thank you very much for this patch.
>
> Some trivial comments if you haven't committed it yet:
>
>> +(define_expand "fmod<mode>3"
>> +  [(use (match_operand:SFDF 0 "gpc_reg_operand"))
>> +   (use (match_operand:SFDF 1 "gpc_reg_operand"))
>> +   (use (match_operand:SFDF 2 "gpc_reg_operand"))]
>> +  "TARGET_HARD_FLOAT
>> +  && TARGET_FPRND
>> +  && flag_unsafe_math_optimizations"
>
> It should have one extra space before each && here:

OK.

>
>   "TARGET_HARD_FLOAT
>    && TARGET_FPRND
>    && flag_unsafe_math_optimizations"
>
> (so that everything inside of the string aligns).
>
>> +(define_expand "remainder<mode>3"
>
> (same here).
>
>> +/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
>> +/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
>
> These are negative tests, so won't spuriously fail, but this does not
> test for the function prefixes we can have.  See
> gcc.target/powerpc/builtins-1.c for example.

Thanks.  Verified that different calls are generated on different platforms
without this patch.

P8BE-64: bl __fmodf_finite
P8BE-32: b __fmodf_finite
P8LE-64: bl fmodf

"l", "__" and "_finite" are optional, so is it OK to check them with below
patterns?

+/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } } */
+/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } } */

>
> Again, thank you, and thanks to everyone else for the patch review
> action :-)
>
>
> Segher
>

-- 
Thanks,
Xionghu
* Re: Ping ^ 2: [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142]
From: Segher Boessenkool @ 2021-09-06 21:57 UTC
To: Xionghu Luo; +Cc: will schmidt, gcc-patches, wschmidt, dje.gcc, linkw

Hi!

On Mon, Sep 06, 2021 at 04:59:27PM +0800, Xionghu Luo wrote:
> On 2021/9/4 05:44, Segher Boessenkool wrote:
> >>+/* { dg-final { scan-assembler-not {\mbl fmod\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl fmodf\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainder\M} } } */
> >>+/* { dg-final { scan-assembler-not {\mbl remainderf\M} } } */
> >
> >These are negative tests, so won't spuriously fail, but this does not
> >test for the function prefixes we can have.  See
> >gcc.target/powerpc/builtins-1.c for example.
>
> Thanks.  Verified that different calls are generated on different platforms
> without this patch.
>
> P8BE-64: bl __fmodf_finite
> P8BE-32: b __fmodf_finite
> P8LE-64: bl fmodf

Ah, it won't use the "dot-names" here, okay.  I think for Darwin you
need to allow a single underscore, but you'll find out (or Iain will,
most likely ;-) )

> "l", "__" and "_finite" are optional, so is it OK to check them with below
> patterns?
>
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmod(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?fmodf(_finite)?\M} } } */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainder(_finite)?\M} } }
> */
> +/* { dg-final { scan-assembler-not {\mbl? (__)?remainderf(_finite)?\M} } }
> */

You could even do

/* { dg-final { scan-assembler-not {(?n)\mb.*fmod} } } */
/* { dg-final { scan-assembler-not {(?n)\mb.*remainder} } } */

or even

/* { dg-final { scan-assembler-not {fmod} } } */
/* { dg-final { scan-assembler-not {remainder} } } */

(and the testcase name will not accidentally match either of those REs
either, I checked :-) )

And yeah, on some subtargets the calls will be tail-optimised, good
find.  You can get around that (in general, on any target) by doing

float test1 (float x, float y)
{
  float z = fmodf (x, y);
  asm ("");  // to prevent tail calls
  return z;
}

but what you do is fine as well, and much more elegant.

Please pick (and test ;-) ) whichever option you like best.  Thanks!


Segher
end of thread, other threads:[~2021-09-06 21:58 UTC | newest]

Thread overview: 13+ messages
2021-04-16  7:10 [PATCH] rs6000: Expand fmod and remainder when built with fast-math [PR97142] Xiong Hu Luo
2021-05-06  2:36 ` Ping: " Xionghu Luo
2021-05-14  7:13 ` Xionghu Luo
2021-06-07  5:08 ` Ping^2: " Xionghu Luo
2021-06-30  1:44 ` Ping ^ 2: " Xionghu Luo
2021-07-09 18:40 ` will schmidt
2021-07-12  1:25 ` Xionghu Luo
2021-09-03  2:31 ` Xionghu Luo
2021-09-03 14:51 ` Bill Schmidt
2021-09-03 14:53 ` David Edelsohn
2021-09-03 21:44 ` Segher Boessenkool
2021-09-06  8:59 ` Xionghu Luo
2021-09-06 21:57 ` Segher Boessenkool