* [Bug target/97142] New: __builtin_fmod not optimized on POWER
@ 2020-09-21 10:57 fx at gnu dot org
  2020-09-21 11:24 ` [Bug target/97142] " rguenth at gcc dot gnu.org
                   ` (21 more replies)
  0 siblings, 22 replies; 23+ messages in thread
From: fx at gnu dot org @ 2020-09-21 10:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

            Bug ID: 97142
           Summary: __builtin_fmod not optimized on POWER
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fx at gnu dot org
  Target Milestone: ---

I ran some Fortran benchmarks (the "Polyhedron" set) on POWER9, and found one
of them has pathologically bad performance compared with xlf.  Profiling shows
that's due to spending most of its time in fmod via a random-number function. 
fmod isn't called when compiled with xlf -O5 or when compiling the same on
x86_64.  Although it's Fortran, this doesn't appear to be Fortran-specific as
the DMOD intrinsic is turned into __builtin_fmod.

The following is with gcc 10.2, comparing the two targets.

On RHEL7 POWER9 (and the same with -mcpu=native):

$ cat ggl.f90
      REAL FUNCTION GGL(Ds)
      DOUBLE PRECISION Ds , d2
      DATA d2/2147483647.D0/
      Ds = DMOD(16807.D0*Ds,d2)
      GGL = Ds/d2
      END
$ gfortran -O3 -fopt-info-all -c ggl.f90
ggl.f90:4:0: missed:   not inlinable: ggl/0 -> __builtin_fmod/2, function body
not available
Unit growth for small function inlining: 12->12 (0%)

Inlined 0 calls, eliminated 0 functions

ggl.f90:6:0: note: ***** Analysis failed with vector mode V2DF
$ nm ggl.o
                 U fmod
0000000000000000 T ggl_
                 U .TOC.

On Debian 10 SKX with the same source:

$ gfortran-10 -Ofast -fopt-info-all -c ggl.f90
ggl.f90:4:0: missed:   not inlinable: ggl/0 -> __builtin_fmod/2, function body
not available
Unit growth for small function inlining: 12->12 (0%)

Inlined 0 calls, eliminated 0 functions

ggl.f90:6:0: note: ***** Analysis failed with vector mode V2DF
ggl.f90:6:0: note: ***** Skipping vector mode V16QI, which would repeat the
analysis for V2DF
$ nm ggl.o
0000000000000000 r .LC0
0000000000000008 r .LC1
0000000000000010 r .LC2
0000000000000000 T ggl_


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
@ 2020-09-21 11:24 ` rguenth at gcc dot gnu.org
  2020-09-21 11:26 ` rguenth at gcc dot gnu.org
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-21 11:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-09-21
             Target|                            |powerpc
             Status|UNCONFIRMED                 |WAITING
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I guess xlf inlines fmod while we simply dispatch to what libm from glibc
provides.  That in turn might not be optimized to the same extent as what
xlf provides.

Can you provide the assembly as produced by xlf?


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
  2020-09-21 11:24 ` [Bug target/97142] " rguenth at gcc dot gnu.org
@ 2020-09-21 11:26 ` rguenth at gcc dot gnu.org
  2020-09-21 14:42 ` bergner at gcc dot gnu.org
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-21 11:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, with -ffast-math (or -Ofast) on x86 you get fmod inlined, I guess xlf -O5
is to some extent doing -ffast-math?


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
  2020-09-21 11:24 ` [Bug target/97142] " rguenth at gcc dot gnu.org
  2020-09-21 11:26 ` rguenth at gcc dot gnu.org
@ 2020-09-21 14:42 ` bergner at gcc dot gnu.org
  2020-09-21 14:45 ` fx at gnu dot org
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: bergner at gcc dot gnu.org @ 2020-09-21 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #3 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> Btw, with -ffast-math (or -Ofast) on x86 you get fmod inlined, I guess xlf
> -O5
> is to some extent doing -ffast-math?

xlf at -O3, -O4 and -O5 automatically enables -qnostrict, which is similar to
-ffast-math plus other stuff.

https://www.ibm.com/support/knowledgecenter/en/SSGH4D_16.1.0/com.ibm.xlf161.aix.doc/compiler_ref/opt_strict.html


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (2 preceding siblings ...)
  2020-09-21 14:42 ` bergner at gcc dot gnu.org
@ 2020-09-21 14:45 ` fx at gnu dot org
  2020-09-21 14:47 ` fx at gnu dot org
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: fx at gnu dot org @ 2020-09-21 14:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #4 from Dave Love <fx at gnu dot org> ---
Created attachment 49249
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49249&action=edit
xlf -O5 -S


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (3 preceding siblings ...)
  2020-09-21 14:45 ` fx at gnu dot org
@ 2020-09-21 14:47 ` fx at gnu dot org
  2020-09-21 17:52 ` segher at gcc dot gnu.org
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: fx at gnu dot org @ 2020-09-21 14:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #5 from Dave Love <fx at gnu dot org> ---
I meant to show that x86_64 expands the built in fmod too.  (I wasn't sure
whether "inline" was the right term, as it seems not to be done by
-finline-functions.)

Yes, xlf -O3 (?) and above implies something like -ffast-math.  xlf also does
vectorized maths functions, but I don't think the speed of libm is as relevant
as not calling out to it.  (Nothing leapt out from profiles for the rest of
that set to suggest other builtins are similar, but I haven't examined them
closely.)

I'll attach the assembler, which there's an awful lot of compared with x86_64.


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (4 preceding siblings ...)
  2020-09-21 14:47 ` fx at gnu dot org
@ 2020-09-21 17:52 ` segher at gcc dot gnu.org
  2020-09-21 18:10 ` bergner at gcc dot gnu.org
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2020-09-21 17:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
But GCC (at least for rs6000) does not inline expand this with
-ffast-math even.  Confirmed.


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (5 preceding siblings ...)
  2020-09-21 17:52 ` segher at gcc dot gnu.org
@ 2020-09-21 18:10 ` bergner at gcc dot gnu.org
  2020-09-21 19:33 ` segher at gcc dot gnu.org
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: bergner at gcc dot gnu.org @ 2020-09-21 18:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |meissner at gcc dot gnu.org,
                   |                            |tuliom at ascii dot art.br

--- Comment #7 from Peter Bergner <bergner at gcc dot gnu.org> ---
I looked at libm/glibc and I don't think we have an fmod library call that even
uses the fmod hw instruction.


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (6 preceding siblings ...)
  2020-09-21 18:10 ` bergner at gcc dot gnu.org
@ 2020-09-21 19:33 ` segher at gcc dot gnu.org
  2020-09-21 21:54 ` bergner at gcc dot gnu.org
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2020-09-21 19:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
I don't think we have an instruction for that?  But we can inline the
code we need instead of doing a library call, which is much faster.
(We probably can use FMAs here usefully, btw; maybe even without needing
-ffast-math.)


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (7 preceding siblings ...)
  2020-09-21 19:33 ` segher at gcc dot gnu.org
@ 2020-09-21 21:54 ` bergner at gcc dot gnu.org
  2021-04-13  6:24 ` luoxhu at gcc dot gnu.org
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: bergner at gcc dot gnu.org @ 2020-09-21 21:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #9 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #8)
> I don't think we have an instruction for that?  But we can inline the
> code we need instead of doing a library call, which is much faster.
> (We probably can use FMAs here usefully, btw; maybe even without needing
> -ffast-math.)

Yes, I was mistaken in thinking we had a hw insn.  I do see xlc does not expand
__builtin_fmod (x, y), so it's only xlf that does this optimization.

Agreed on the inlining thing.  It's interesting that even libm's fmod() routine
on ppc just calls a C function and doesn't have an optimized asm routine.


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (8 preceding siblings ...)
  2020-09-21 21:54 ` bergner at gcc dot gnu.org
@ 2021-04-13  6:24 ` luoxhu at gcc dot gnu.org
  2021-04-13 16:26 ` segher at gcc dot gnu.org
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-13  6:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #10 from luoxhu at gcc dot gnu.org ---

If not built with fast-math, gimple_has_side_effects returns true, which causes
expand_call_stmt to fail to expand "_1 = fmod (x_2(D), y_3(D));" to an
internal function.  x86 also produces a call to fmod for an -O3 build.


xlf expands the fmod to the assembly below; no FMA generated?


0000000010000900 <ggl>:
    10000900:   8c 03 01 10     vspltisw v0,1
    10000904:   00 00 24 c8     lfd     f1,0(r4)
    10000908:   00 00 03 c8     lfd     f0,0(r3)
    1000090c:   e2 03 40 f0     xvcvsxwdp vs2,vs32
    10000910:   c0 09 62 f0     xsdivdp vs3,vs2,vs1
    10000914:   80 19 80 f0     xsmuldp vs4,vs0,vs3
    10000918:   64 21 a0 f0     xsrdpiz vs5,vs4
    1000091c:   88 2d 01 f0     xsnmsubadp vs0,vs1,vs5
    10000920:   18 00 20 fc     frsp    f1,f0
    10000924:   20 00 80 4e     blr


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (9 preceding siblings ...)
  2021-04-13  6:24 ` luoxhu at gcc dot gnu.org
@ 2021-04-13 16:26 ` segher at gcc dot gnu.org
  2021-05-27  0:54 ` luoxhu at gcc dot gnu.org
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2021-04-13 16:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #11 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #10)
> If not built with fast-math, gimple_has_side_effects will return true and
> cause the expand_call_stmt fail to expand the "_1 = fmod (x_2(D), y_3(D));"
> to internal function. X86 also produces "bl fmod" for O3 build.
>  
> 
> xlF expands the fmod to below ASM, no FMA generated?
> 
> 
> 0000000010000900 <ggl>:
>     10000900:   8c 03 01 10     vspltisw v0,1
>     10000904:   00 00 24 c8     lfd     f1,0(r4)
>     10000908:   00 00 03 c8     lfd     f0,0(r3)
>     1000090c:   e2 03 40 f0     xvcvsxwdp vs2,vs32
>     10000910:   c0 09 62 f0     xsdivdp vs3,vs2,vs1
>     10000914:   80 19 80 f0     xsmuldp vs4,vs0,vs3
>     10000918:   64 21 a0 f0     xsrdpiz vs5,vs4
>     1000091c:   88 2d 01 f0     xsnmsubadp vs0,vs1,vs5
>     10000920:   18 00 20 fc     frsp    f1,f0
>     10000924:   20 00 80 4e     blr

xsnmsubadp is an FMA.  Multiply-subtract in this case, but that is just
a sign switch -- I often say FMA for all of fmadd, fnmadd, fnmsub, fmsub,
and their VSX counterparts.  "Anything that does a multiply-type operation
followed by an addition-type operation".  (And often call integer MADs
"FMA" as well, which is totally wrong, but :-) )


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (10 preceding siblings ...)
  2021-04-13 16:26 ` segher at gcc dot gnu.org
@ 2021-05-27  0:54 ` luoxhu at gcc dot gnu.org
  2021-09-02 22:05 ` bergner at gcc dot gnu.org
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-05-27  0:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #12 from luoxhu at gcc dot gnu.org ---
Patch submitted:

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (11 preceding siblings ...)
  2021-05-27  0:54 ` luoxhu at gcc dot gnu.org
@ 2021-09-02 22:05 ` bergner at gcc dot gnu.org
  2021-09-02 22:38 ` segher at gcc dot gnu.org
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: bergner at gcc dot gnu.org @ 2021-09-02 22:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

Peter Bergner <bergner at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/piperma
                   |                            |il/gcc-patches/2021-June/57
                   |                            |3967.html

--- Comment #13 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to luoxhu from comment #12)
> Patch submitted:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html

Looks like Will reviewed this and changes were made back in mid July.  Segher
or David, can we get a review of the last version of this patch?  Link to last
ping above which includes link to original patch.  Will's review is here:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574863.html


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (12 preceding siblings ...)
  2021-09-02 22:05 ` bergner at gcc dot gnu.org
@ 2021-09-02 22:38 ` segher at gcc dot gnu.org
  2021-09-03  2:33 ` luoxhu at gcc dot gnu.org
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2021-09-02 22:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #14 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #13)
> (In reply to luoxhu from comment #12)
> > Patch submitted:
> > 
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568143.html
> 
> Looks like Will reviewed this and changes were made back in mid July. 
> Segher or David, can we get a review of the last version of this patch? 
> Link to last ping above which includes link to original patch.  Will's
> review is here:
> 
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574863.html

The last email in there promised a new patch:

> Thanks, will add below check:
[etc.]

but a new patch never materialised?


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (13 preceding siblings ...)
  2021-09-02 22:38 ` segher at gcc dot gnu.org
@ 2021-09-03  2:33 ` luoxhu at gcc dot gnu.org
  2021-09-07  1:29 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-09-03  2:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #15 from luoxhu at gcc dot gnu.org ---
Patch updated:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578740.html


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (14 preceding siblings ...)
  2021-09-03  2:33 ` luoxhu at gcc dot gnu.org
@ 2021-09-07  1:29 ` cvs-commit at gcc dot gnu.org
  2021-09-07 17:58 ` bergner at gcc dot gnu.org
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-07  1:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:

https://gcc.gnu.org/g:546ecb0054af302acf0839c7f3eb78598f8c0672

commit r12-3375-g546ecb0054af302acf0839c7f3eb78598f8c0672
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date:   Mon Sep 6 20:22:50 2021 -0500

    rs6000: Expand fmod and remainder when built with fast-math [PR97142]

    fmod/fmodf and remainder/remainderf could be expanded instead of library
    call when fast-math build, which is much faster.

    fmodf:
         fdivs   f0,f1,f2
         friz    f0,f0
         fnmsubs f1,f2,f0,f1

    remainderf:
         fdivs   f0,f1,f2
         frin    f0,f0
         fnmsubs f1,f2,f0,f1

    SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%

    gcc/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
            (remainder<mode>3): Likewise.

    gcc/testsuite/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * gcc.target/powerpc/pr97142.c: New test.


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (15 preceding siblings ...)
  2021-09-07  1:29 ` cvs-commit at gcc dot gnu.org
@ 2021-09-07 17:58 ` bergner at gcc dot gnu.org
  2021-09-07 19:34 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: bergner at gcc dot gnu.org @ 2021-09-07 17:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #17 from Peter Bergner <bergner at gcc dot gnu.org> ---
(In reply to CVS Commits from comment #16)
> The master branch has been updated by Xiong Hu Luo <luoxhu@gcc.gnu.org>:

So fixed on trunk.

The Version above is set to 10.2; does that mean we're going to backport this
to the release branches, or are we calling it good with trunk?


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (16 preceding siblings ...)
  2021-09-07 17:58 ` bergner at gcc dot gnu.org
@ 2021-09-07 19:34 ` segher at gcc dot gnu.org
  2021-09-07 21:20 ` segher at gcc dot gnu.org
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2021-09-07 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #18 from Segher Boessenkool <segher at gcc dot gnu.org> ---
+/* { dg-final { scan-assembler-not {(?n)\mb.*fmod} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*fmodf} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*remainder} } } */
+/* { dg-final { scan-assembler-not {(?n)\mb.*remainderf} } } */

The "f" variants are unnecessary, those are matched by the other REs
already.  This is harmless of course.
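
The redundancy can be seen with a simplified stand-in for those patterns.
The sketch below is plain C rather than a Tcl regular expression, and the
helper `branch_mentions` is hypothetical: it only approximates
{(?n)\mb.*SYM} by looking for a word that starts with 'b' followed, later on
the same line, by the symbol name.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True if some word in LINE starts with 'b' and SYM occurs after it.
   Approximates \m (start of word) by checking the preceding character. */
bool branch_mentions(const char *line, const char *sym)
{
    for (const char *p = line; *p != '\0'; p++) {
        bool word_start = (p == line) ||
                          !(isalnum((unsigned char)p[-1]) || p[-1] == '_');

        if (*p == 'b' && word_start && strstr(p + 1, sym) != NULL)
            return true;
    }
    return false;
}
```

Any line that matches for "fmodf" necessarily matches for "fmod" as well,
because the shorter name is a prefix of the longer one; the reverse does not
hold.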


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (17 preceding siblings ...)
  2021-09-07 19:34 ` segher at gcc dot gnu.org
@ 2021-09-07 21:20 ` segher at gcc dot gnu.org
  2021-09-14  5:33 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 23+ messages in thread
From: segher at gcc dot gnu.org @ 2021-09-07 21:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #19 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #17)
> The Version about is to 10.2, does that mean we're going to back port this
> to the release branches, or are we calling it good with trunk?

This is a pretty low risk patch, so if it has an important positive effect
we could decide to backport it.  To both 11 and 10 then?  Does it have such
an important effect?

But first let it stew for a while, see if surprises show up on trunk :-)


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (18 preceding siblings ...)
  2021-09-07 21:20 ` segher at gcc dot gnu.org
@ 2021-09-14  5:33 ` cvs-commit at gcc dot gnu.org
  2021-09-14  5:34 ` cvs-commit at gcc dot gnu.org
  2021-09-14  5:36 ` luoxhu at gcc dot gnu.org
  21 siblings, 0 replies; 23+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-14  5:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #20 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Xiong Hu Luo
<luoxhu@gcc.gnu.org>:

https://gcc.gnu.org/g:a87d7fbef55f72781905bffb298aab698fe6ed40

commit r11-8985-ga87d7fbef55f72781905bffb298aab698fe6ed40
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date:   Mon Sep 6 20:22:50 2021 -0500

    rs6000: Expand fmod and remainder when built with fast-math [PR97142]

    fmod/fmodf and remainder/remainderf could be expanded instead of library
    call when fast-math build, which is much faster.

    fmodf:
         fdivs   f0,f1,f2
         friz    f0,f0
         fnmsubs f1,f2,f0,f1

    remainderf:
         fdivs   f0,f1,f2
         frin    f0,f0
         fnmsubs f1,f2,f0,f1

    SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%

    gcc/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
            (remainder<mode>3): Likewise.

    gcc/testsuite/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * gcc.target/powerpc/pr97142.c: New test.

            (cherry-picked from 546ecb0054af302acf0839c7f3eb78598f8c0672)


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (19 preceding siblings ...)
  2021-09-14  5:33 ` cvs-commit at gcc dot gnu.org
@ 2021-09-14  5:34 ` cvs-commit at gcc dot gnu.org
  2021-09-14  5:36 ` luoxhu at gcc dot gnu.org
  21 siblings, 0 replies; 23+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-09-14  5:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Xiong Hu Luo
<luoxhu@gcc.gnu.org>:

https://gcc.gnu.org/g:68d525ee859041b21d87b23030d1e829a9cc3b6f

commit r10-10116-g68d525ee859041b21d87b23030d1e829a9cc3b6f
Author: Xionghu Luo <luoxhu@linux.ibm.com>
Date:   Mon Sep 6 20:22:50 2021 -0500

    rs6000: Expand fmod and remainder when built with fast-math [PR97142]

    fmod/fmodf and remainder/remainderf could be expanded instead of library
    call when fast-math build, which is much faster.

    fmodf:
         fdivs   f0,f1,f2
         friz    f0,f0
         fnmsubs f1,f2,f0,f1

    remainderf:
         fdivs   f0,f1,f2
         frin    f0,f0
         fnmsubs f1,f2,f0,f1

    SPEC2017 Ofast P8LE: 511.povray_r +1.14%,  526.blender_r +1.72%

    gcc/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * config/rs6000/rs6000.md (fmod<mode>3): New define_expand.
            (remainder<mode>3): Likewise.

    gcc/testsuite/ChangeLog:

    2021-09-07  Xionghu Luo  <luoxhu@linux.ibm.com>

            PR target/97142
            * gcc.target/powerpc/pr97142.c: New test.

    (cherry picked from commit 546ecb0054af302acf0839c7f3eb78598f8c0672)


* [Bug target/97142] __builtin_fmod not optimized on POWER
  2020-09-21 10:57 [Bug target/97142] New: __builtin_fmod not optimized on POWER fx at gnu dot org
                   ` (20 preceding siblings ...)
  2021-09-14  5:34 ` cvs-commit at gcc dot gnu.org
@ 2021-09-14  5:36 ` luoxhu at gcc dot gnu.org
  21 siblings, 0 replies; 23+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-09-14  5:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97142

luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #22 from luoxhu at gcc dot gnu.org ---
Fixed on master and backported to gcc-11 and gcc-10.

