public inbox for gcc-bugs@sourceware.org
* [Bug fortran/104535] New: don't use fmod?
From: fx at gnu dot org @ 2022-02-14 20:16 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

            Bug ID: 104535
           Summary: don't use fmod?
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fx at gnu dot org
  Target Milestone: ---

I was reminded by comments on the report I made about poor fmod performance
on x86 that I should have commented on the original observation.

I'd looked at one of the Polyhedron benchmarks which suffers badly from a
simple random number routine that calls DMOD.  That gets compiled to fmod,
which is only inlined, albeit poorly on x86, with the relevant component(s)
of -ffast-math.

It seems to me that MOD should compile to the arithmetical expression in the
standard, which doesn't have the complication of having to treat errors.
(When I defined DMOD as a statement function for it in that routine, I got
performance much closer to ifort.  I should have kept the profiles I
compared, but could regenerate them.)

Is there a good reason not to do that (and maybe similarly with other
intrinsics I haven't checked)?  I could probably have a go at implementing
it if appropriate, though I don't know my way around now.
* [Bug fortran/104535] don't use fmod?
From: rguenth at gcc dot gnu.org @ 2022-02-15 8:04 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
            Version|unknown                     |12.0
* [Bug fortran/104535] don't use fmod?
From: kargl at gcc dot gnu.org @ 2022-02-17 19:02 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104535

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
This should be closed with WONTFIX.

gcc/gfortran already provides at least three options to get the requested
behavior; namely, -ffast-math, -Ofast, and -funsafe-math-optimizations.

In addition, gfortran maps mod() to __builtin_fmod() for REAL arguments.
It is trivial to show that gfortran compiled Fortran code matches gcc
compiled C code.

% cat u.f90
function mymod(x,y)
  real mymod
  real, intent(in), value :: x, y
  mymod = mod(x,y)
end function mymod

% cat u.c
#include <math.h>
float
mymod(float x, float y)
{
  return (fmodf(x,y));
}

% gfortran -O3 -S u.f90
% cat u.s
...
        .cfi_startproc
        jmp     fmodf
        .cfi_endproc
...

% gcc -O3 -S u.c
% cat u.s
...
        .cfi_startproc
        jmp     fmodf
        .cfi_endproc
...

% gfortran -O3 -S -ffast-math u.f90
% cat u.s
...
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movss   %xmm0, -4(%rbp)
        movss   %xmm1, -8(%rbp)
        flds    -8(%rbp)
        flds    -4(%rbp)
.L2:
        fprem
        fnstsw  %ax
        testb   $4, %ah
        jne     .L2
        fstp    %st(1)
        fstps   -4(%rbp)
        movss   -4(%rbp), %xmm0
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
...

% gcc -O3 -S -ffast-math u.c
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movss   %xmm0, -4(%rbp)
        movss   %xmm1, -8(%rbp)
        flds    -8(%rbp)
        flds    -4(%rbp)
.L2:
        fprem
        fnstsw  %ax
        testb   $4, %ah
        jne     .L2
        fstp    %st(1)
        fstps   -4(%rbp)
        movss   -4(%rbp), %xmm0
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

Now, let's look at trans-intrinsic.cc.  One finds the following comment:

/* Remainder function MOD(A, P) = A - INT(A / P) * P
                      MODULO(A, P) = A - FLOOR (A / P) * P

   The obvious algorithms above are numerically instable for large
   arguments, hence these intrinsics are instead implemented via calls
   to the fmod family of functions.  It is the responsibility of the
   user to ensure that the second argument is non-zero.  */

In infinite precision arithmetic the above equations could use a
straightforward naive in-lined implementation.  Unfortunately, computers
tend to use finite precision arithmetic, so numerical algorithms must be
appropriately considered (see Goldberg).