public inbox for gcc-bugs@sourceware.org
* [Bug target/97142] New: __builtin_fmod not optimized on POWER
From: fx at gnu dot org @ 2020-09-21 10:57 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142
Bug ID: 97142
Summary: __builtin_fmod not optimized on POWER
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: fx at gnu dot org
Target Milestone: ---
I ran some Fortran benchmarks (the "Polyhedron" set) on POWER9, and found one
of them has pathologically bad performance compared with xlf. Profiling shows
that's due to spending most of its time in fmod via a random-number function.
fmod isn't called when compiled with xlf -O5 or when compiling the same on
x86_64. Although it's Fortran, this doesn't appear to be Fortran-specific as
the DMOD intrinsic is turned into __builtin_fmod.
The following is with gcc 10.2, comparing the two targets.
On RHEL7 POWER9 (and the same with -mcpu=native):
$ cat ggl.f90
REAL FUNCTION GGL(Ds)
DOUBLE PRECISION Ds , d2
DATA d2/2147483647.D0/
Ds = DMOD(16807.D0*Ds,d2)
GGL = Ds/d2
END
$ gfortran -O3 -fopt-info-all -c ggl.f90
ggl.f90:4:0: missed: not inlinable: ggl/0 -> __builtin_fmod/2, function body
not available
Unit growth for small function inlining: 12->12 (0%)
Inlined 0 calls, eliminated 0 functions
ggl.f90:6:0: note: ***** Analysis failed with vector mode V2DF
$ nm ggl.o
U fmod
0000000000000000 T ggl_
U .TOC.
On Debian 10 SKX with the same source:
$ gfortran-10 -Ofast -fopt-info-all -c ggl.f90
ggl.f90:4:0: missed: not inlinable: ggl/0 -> __builtin_fmod/2, function body
not available
Unit growth for small function inlining: 12->12 (0%)
Inlined 0 calls, eliminated 0 functions
ggl.f90:6:0: note: ***** Analysis failed with vector mode V2DF
ggl.f90:6:0: note: ***** Skipping vector mode V16QI, which would repeat the
analysis for V2DF
$ nm ggl.o
0000000000000000 r .LC0
0000000000000008 r .LC1
0000000000000010 r .LC2
0000000000000000 T ggl_
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: rguenth at gcc dot gnu.org @ 2020-09-21 11:24 UTC
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2020-09-21
Target| |powerpc
Status|UNCONFIRMED |WAITING
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I guess xlf inlines fmod while we simply dispatch to what libm from glibc
provides. That in turn might not be optimized to the same extent as what
xlf provides.
Can you provide the assembly as produced by xlf?
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: rguenth at gcc dot gnu.org @ 2020-09-21 11:26 UTC
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, with -ffast-math (or -Ofast) on x86 you get fmod inlined, I guess xlf -O5
is to some extent doing -ffast-math?
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: bergner at gcc dot gnu.org @ 2020-09-21 14:42 UTC
--- Comment #3 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> Btw, with -ffast-math (or -Ofast) on x86 you get fmod inlined, I guess xlf
> -O5 is to some extent doing -ffast-math?
Xlf at -O3, -O4 and -O5 automatically enables -qnostrict, which is similar to
-ffast-math plus other stuff.
https://www.ibm.com/support/knowledgecenter/en/SSGH4D_16.1.0/com.ibm.xlf161.aix.doc/compiler_ref/opt_strict.html
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: fx at gnu dot org @ 2020-09-21 14:45 UTC
--- Comment #4 from Dave Love <fx at gnu dot org> ---
Created attachment 49249
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49249&action=edit
xlf -O5 -S
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: fx at gnu dot org @ 2020-09-21 14:47 UTC
--- Comment #5 from Dave Love <fx at gnu dot org> ---
I meant to show that x86_64 expands the built-in fmod too. (I wasn't sure
whether "inline" was the right term, as it seems not to be done by
-finline-functions.)
Yes, xlf -O3 (?) and above implies something like -ffast-math. xlf also does
vectorized maths functions, but I don't think the speed of libm is as relevant
as not calling out to it. (Nothing leapt out from profiles for the rest of
that set to suggest other builtins are similar, but I haven't examined them
closely.)
I'll attach the assembler, which there's an awful lot of compared with x86_64.
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2020-09-21 17:52 UTC
Segher Boessenkool <segher at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
But GCC (at least for rs6000) does not inline-expand this even with
-ffast-math. Confirmed.
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: bergner at gcc dot gnu.org @ 2020-09-21 18:10 UTC
Peter Bergner <bergner at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |meissner at gcc dot gnu.org,
| |tuliom at ascii dot art.br
--- Comment #7 from Peter Bergner <bergner at gcc dot gnu.org> ---
I looked at libm/glibc and I don't think we have an fmod library call that even
uses the fmod hw instruction.
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2020-09-21 19:33 UTC
--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
I don't think we have an instruction for that? But we can inline the
code we need instead of doing a library call, which is much faster.
(We probably can use FMAs here usefully, btw; maybe even without needing
-ffast-math.)
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: bergner at gcc dot gnu.org @ 2020-09-21 21:54 UTC
--- Comment #9 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #8)
> I don't think we have an instruction for that? But we can inline the
> code we need instead of doing a library call, which is much faster.
> (We probably can use FMAs here usefully, btw; maybe even without needing
> -ffast-math.)
Yes, I was mistaken in thinking we had a hw insn. I do see xlc does not expand
__builtin_fmod (x, y), so it's only xlf that does this optimization.
Agreed on the inlining thing. It's interesting that even libm's fmod() routine
on ppc just calls a C function and doesn't have an optimized asm routine.
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: luoxhu at gcc dot gnu.org @ 2021-04-13 6:24 UTC
--- Comment #10 from luoxhu at gcc dot gnu.org ---
If not built with fast-math, gimple_has_side_effects will return true and cause
expand_call_stmt to fail to expand "_1 = fmod (x_2(D), y_3(D));" to an
internal function. x86 also produces a call to fmod for an -O3 build.
xlf expands the fmod to the ASM below; no FMA generated?
0000000010000900 <ggl>:
10000900: 8c 03 01 10 vspltisw v0,1
10000904: 00 00 24 c8 lfd f1,0(r4)
10000908: 00 00 03 c8 lfd f0,0(r3)
1000090c: e2 03 40 f0 xvcvsxwdp vs2,vs32
10000910: c0 09 62 f0 xsdivdp vs3,vs2,vs1
10000914: 80 19 80 f0 xsmuldp vs4,vs0,vs3
10000918: 64 21 a0 f0 xsrdpiz vs5,vs4
1000091c: 88 2d 01 f0 xsnmsubadp vs0,vs1,vs5
10000920: 18 00 20 fc frsp f1,f0
10000924: 20 00 80 4e blr
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2021-04-13 16:26 UTC
--- Comment #11 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #10)
> If not built with fast-math, gimple_has_side_effects will return true and
> cause the expand_call_stmt fail to expand the "_1 = fmod (x_2(D), y_3(D));"
> to internal function. X86 also produces "bl fmod" for O3 build.
>
>
> xlF expands the fmod to below ASM, no FMA generated?
>
>
> 0000000010000900 <ggl>:
> 10000900: 8c 03 01 10 vspltisw v0,1
> 10000904: 00 00 24 c8 lfd f1,0(r4)
> 10000908: 00 00 03 c8 lfd f0,0(r3)
> 1000090c: e2 03 40 f0 xvcvsxwdp vs2,vs32
> 10000910: c0 09 62 f0 xsdivdp vs3,vs2,vs1
> 10000914: 80 19 80 f0 xsmuldp vs4,vs0,vs3
> 10000918: 64 21 a0 f0 xsrdpiz vs5,vs4
> 1000091c: 88 2d 01 f0 xsnmsubadp vs0,vs1,vs5
> 10000920: 18 00 20 fc frsp f1,f0
> 10000924: 20 00 80 4e blr
xsnmsubadp is an FMA. Multiply-subtract in this case, but that is just
a sign switch -- I often say FMA for all of fmadd, fnmadd, fnmsub, fmsub,
and their VSX counterparts. "Anything that does a multiply-type operation
followed by an addition-type operation". (And often call integer MADs
"FMA" as well, which is totally wrong, but :-) )
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: luoxhu at gcc dot gnu.org @ 2021-05-27 0:54 UTC
--- Comment #12 from luoxhu at gcc dot gnu.org ---
Patch submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: bergner at gcc dot gnu.org @ 2021-09-02 22:05 UTC
Peter Bergner <bergner at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
URL| |https://gcc.gnu.org/piperma
| |il/gcc-patches/2021-June/57
| |3967.html
--- Comment #13 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to luoxhu from comment #12)
> Patch submitted:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
Looks like Will reviewed this and changes were made back in mid July. Segher
or David, can we get a review of the last version of this patch? Link to last
ping above which includes link to original patch. Will's review is here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574863.html
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2021-09-02 22:38 UTC
--- Comment #14 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #13)
> (In reply to luoxhu from comment #12)
> > Patch submitted:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
>
> Looks like Will reviewed this and changes were made back in mid July.
> Segher or David, can we get a review of the last version of this patch?
> Link to last ping above which includes link to original patch. Will's
> review is here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574863.html
The last email in there promised a new patch:
> Thanks, will add below check:
[etc.]
but a new patch never materialised?
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: luoxhu at gcc dot gnu.org @ 2021-09-03 2:33 UTC
--- Comment #15 from luoxhu at gcc dot gnu.org ---
Patch updated:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578740.html
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: cvs-commit at gcc dot gnu.org @ 2021-09-07 1:29 UTC
--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:
https://gcc.gnu.org/g:546ecb0054af302acf0839c7f3eb78598f8c0672
commit r12-3375-g546ecb0054af302acf0839c7f3eb78598f8c0672
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Mon Sep 6 20:22:50 2021 -0500
rs6000: Expand fmod and remainder when built with fast-math [PR97142]
fmod/fmodf and remainder/remainderf could be expanded instead of library
call when fast-math build, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f0
fnmsubs f1,f2,f0,f1
SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
gcc/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.
gcc/testsuite/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* gcc.target/powerpc/pr97142.c: New test.
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: bergner at gcc dot gnu.org @ 2021-09-07 17:58 UTC
--- Comment #17 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to CVS Commits from comment #16)
> The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:
So fixed on trunk.
The Version above is set to 10.2; does that mean we're going to backport this
to the release branches, or are we calling it good with trunk?
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2021-09-07 19:34 UTC
--- Comment #18 from Segher Boessenkool <segher at gcc dot gnu.org> ---
+/* { dg-final { scan-assembler-not {(?n)\mb.*fmod} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*fmodf} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*remainder} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*remainderf} } } */
The "f" variants are unnecessary, those are matched by the other REs
already. This is harmless of course.
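The redundancy is easy to check with any regex engine, since a pattern ending in fmod also matches a line containing fmodf. The sketch below uses POSIX regcomp/regexec rather than the Tcl regexps that DejaGnu uses, so the pattern is simplified to "b.*fmod" (POSIX ERE has neither \m nor (?n)); the helper name line_matches and the sample assembly lines are made up for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the POSIX extended regex `pattern` matches anywhere in
   `line`, 0 otherwise. Assumes `pattern` is a valid ERE. */
int line_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    assert(rc == 0);   /* the patterns used in this sketch are valid */

    int hit = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}
```

With this helper, line_matches("b.*fmod", "\tbl fmodf") and line_matches("b.*fmod", "\tbl fmod") both report a match, so a scan-assembler-not on the shorter name already rejects both calls.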
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: segher at gcc dot gnu.org @ 2021-09-07 21:20 UTC
--- Comment #19 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #17)
> The Version above is set to 10.2; does that mean we're going to backport this
> to the release branches, or are we calling it good with trunk?
This is a pretty low risk patch, so if it has an important positive effect
we could decide to backport it. To both 11 and 10 then? Does it have such
an important effect?
But first let it stew for a while, see if surprises show up on trunk :-)
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: cvs-commit at gcc dot gnu.org @ 2021-09-14 5:33 UTC
--- Comment #20 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Xiong Hu Luo
<luoxhu@gcc.gnu.org>:
https://gcc.gnu.org/g:a87d7fbef55f72781905bffb298aab698fe6ed40
commit r11-8985-ga87d7fbef55f72781905bffb298aab698fe6ed40
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Mon Sep 6 20:22:50 2021 -0500
rs6000: Expand fmod and remainder when built with fast-math [PR97142]
fmod/fmodf and remainder/remainderf could be expanded instead of library
call when fast-math build, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f0
fnmsubs f1,f2,f0,f1
SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
gcc/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.
gcc/testsuite/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* gcc.target/powerpc/pr97142.c: New test.
(cherry-picked from 546ecb0054af302acf0839c7f3eb78598f8c0672)
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: cvs-commit at gcc dot gnu.org @ 2021-09-14 5:34 UTC
--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Xiong Hu Luo
<luoxhu@gcc.gnu.org>:
https://gcc.gnu.org/g:68d525ee859041b21d87b23030d1e829a9cc3b6f
commit r10-10116-g68d525ee859041b21d87b23030d1e829a9cc3b6f
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date: Mon Sep 6 20:22:50 2021 -0500
rs6000: Expand fmod and remainder when built with fast-math [PR97142]
fmod/fmodf and remainder/remainderf could be expanded instead of library
call when fast-math build, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
frin f0,f0
fnmsubs f1,f2,f0,f1
SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72%
gcc/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
(remainder<mode>3): Likewise.
gcc/testsuite/ChangeLog:
2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/97142
* gcc.target/powerpc/pr97142.c: New test.
(cherry picked from commit 546ecb0054af302acf0839c7f3eb78598f8c0672)
* [Bug target/97142] __builtin_fmod not optimized on POWER
From: luoxhu at gcc dot gnu.org @ 2021-09-14 5:36 UTC
luoxhu at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #22 from luoxhu at gcc dot gnu.org ---
Fixed on master and backported to gcc-11 and gcc-10.