[Bug fortran/104535] New: don't use fmod?

public inbox for gcc-bugs@sourceware.org

* [Bug fortran/104535] New: don't use fmod?
@ 2022-02-14 20:16 fx at gnu dot org
  2022-02-15  8:04 ` [Bug fortran/104535] " rguenth at gcc dot gnu.org
  2022-02-17 19:02 ` kargl at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: fx at gnu dot org @ 2022-02-14 20:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

            Bug ID: 104535
           Summary: don't use fmod?
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fx at gnu dot org
  Target Milestone: ---

I was reminded by comments on the report I made about poor fmod performance on
x86 that I should have commented on the original observation.

I'd looked at one of the Polyhedron benchmarks which suffers badly from a
simple random number routine that calls DMOD.  That gets compiled to fmod,
which is only inlined, albeit poorly on x86, with the relevant component(s) of
-ffast-math.  It seems to me that MOD should compile to the arithmetical
expression in the standard, which doesn't have the complication of having to
treat errors.  (When I defined DMOD as a statement function for it in that
routine, I got performance much closer to ifort.  I should have kept the
profiles I compared, but could regenerate them.)

Is there a good reason not to do that (and maybe similarly with other
intrinsics I haven't checked)?  I could probably have a go at implementing it
if appropriate, though I don't know my way around now.


* [Bug fortran/104535] don't use fmod?
  2022-02-14 20:16 [Bug fortran/104535] New: don't use fmod? fx at gnu dot org
@ 2022-02-15  8:04 ` rguenth at gcc dot gnu.org
  2022-02-17 19:02 ` kargl at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-15  8:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
            Version|unknown                     |12.0


* [Bug fortran/104535] don't use fmod?
  2022-02-14 20:16 [Bug fortran/104535] New: don't use fmod? fx at gnu dot org
  2022-02-15  8:04 ` [Bug fortran/104535] " rguenth at gcc dot gnu.org
@ 2022-02-17 19:02 ` kargl at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: kargl at gcc dot gnu.org @ 2022-02-17 19:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
This should be closed with WONTFIX.  gcc/gfortran already provides at
least three options to get the requested behavior; namely, -ffast-math,
-Ofast, and -funsafe-math-optimizations.  In addition, gfortran maps mod()
to __builtin_fmod() for REAL arguments.  It is trivial to show that gfortran
compiled Fortran code matches gcc compiled C code.

% cat u.f90
function mymod(x,y)
   real mymod
   real, intent(in), value :: x, y
   mymod = mod(x,y)
end function mymod

% cat u.c
#include <math.h>
float
mymod(float x, float y)
{
    return (fmodf(x,y));
}

% gfortran -O3 -S u.f90
% cat u.s
...
   .cfi_startproc
   jmp   fmodf
   .cfi_endproc
...

% gcc -O3 -S u.c
% cat u.s
...
   .cfi_startproc
   jmp   fmodf
   .cfi_endproc
...

% gfortran -O3 -S -ffast-math u.f90
% cat u.s
...
   .cfi_startproc
   pushq   %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq    %rsp, %rbp
   .cfi_def_cfa_register 6
   movss   %xmm0, -4(%rbp)
   movss   %xmm1, -8(%rbp)
   flds    -8(%rbp)
   flds    -4(%rbp)
.L2:
   fprem
   fnstsw  %ax
   testb   $4, %ah
   jne     .L2
   fstp    %st(1)
   fstps   -4(%rbp)
   movss   -4(%rbp), %xmm0
   popq    %rbp
   .cfi_def_cfa 7, 8
   ret
   .cfi_endproc
...


% gcc -O3 -S -ffast-math u.c
   .cfi_startproc
   pushq   %rbp
   .cfi_def_cfa_offset 16
   .cfi_offset 6, -16
   movq    %rsp, %rbp
   .cfi_def_cfa_register 6
   movss   %xmm0, -4(%rbp)
   movss   %xmm1, -8(%rbp)
   flds    -8(%rbp)
   flds    -4(%rbp)
.L2:
   fprem
   fnstsw  %ax
   testb   $4, %ah
   jne     .L2
   fstp    %st(1)
   fstps   -4(%rbp)
   movss   -4(%rbp), %xmm0
   popq    %rbp
   .cfi_def_cfa 7, 8
   ret
   .cfi_endproc


Now, let's look at trans-intrinsic.cc.  One finds the following comment

/* Remainder function MOD(A, P) = A - INT(A / P) * P
                      MODULO(A, P) = A - FLOOR (A / P) * P

   The obvious algorithms above are numerically instable for large
   arguments, hence these intrinsics are instead implemented via calls
   to the fmod family of functions.  It is the responsibility of the
   user to ensure that the second argument is non-zero.  */

In infinite precision arithmetic the above equations could use
a straightforward naive in-lined implementation.  Unfortunately,
computers tend to use finite precision arithmetic, so numerical
algorithms must be appropriately considered (See Goldberg).

