public inbox for libc-ports@sourceware.org
 help / color / mirror / Atom feed
From: "Carlos O'Donell" <carlos@redhat.com>
To: Steve Ellcey <sellcey@mips.com>
Cc: libc-ports@sourceware.org
Subject: Re: [PATCH] Speed up libm on MIPS
Date: Fri, 20 Sep 2013 03:32:00 -0000	[thread overview]
Message-ID: <523BC1B8.4040102@redhat.com> (raw)
In-Reply-To: <1379631395.5770.445.camel@ubuntu-sellcey>

On 09/19/2013 06:56 PM, Steve Ellcey wrote:
> This patch defines various inline routines and macros used by the math
> library in order to speed up libm on MIPS.  It does not affect
> soft-float builds but for hard-float builds 'make bench' shows a
> speed-up.  With an o32 little-endian glibc, sin() went from 27792.6
> iter/s to 31293.6 iter/s.  On n32 it went from 32955.2 to 36179.7 and on
> n64 from 33074.7 to 36242. exp() went from 45742.4 to 56511.2 on o32 and
> pow() went from 19008.8 to 20508.7.  I have attached the original and
> new bench.out files for o32, n32, and n64 ABIs in case you want to see
> more of the data.  These are all little-endian hard-float runs.
> 
> I ran 'make check' and 'make bench' using the o32, n32, and n64 ABIs
> with big and little endian and with hard and soft float to verify there
> were no failures.  I did run into an unrelated problem that is being
> fixed (https://sourceware.org/ml/libc-alpha/2013-09/msg00601.html) but
> there were no other failures except the expected ones for MIPS.

Thank you for running `make bench' and posting the results. I appreciate
you going out of your way to make the measurements and post them.

> OK for checkin?

Does MIPS have a slow fpu save/restore?

Does using HAVE_RM_CTX speed anything up for you?

For example see 2506109403de69bd454de27835d42e6eb6ec3abc

Two nits below.

> Steve Ellcey
> sellcey@mips.com
> 
> 
> 2013-09-18  Steve Ellcey  <sellcey@mips.com>
> 
> 	* sysdeps/mips/math_private.h (libc_feholdexcept_mips): New function.
> 	(libc_feholdexcept): New macro.
> 	(libc_feholdexceptf): New macro.
> 	(libc_feholdexceptl): New macro.
> 	(libc_fesetround_mips): New function.
> 	(libc_fesetround): New macro.
> 	(libc_fesetroundf): New macro.
> 	(libc_fesetroundl): New macro.
> 	(libc_feholdexcept_setround_mips): New function.
> 	(libc_feholdexcept_setround): New macro.
> 	(libc_feholdexcept_setroundf): New macro.
> 	(libc_feholdexcept_setroundl): New macro.
> 	(libc_fesetenv_mips): New function.
> 	(libc_fesetenv): New macro.
> 	(libc_fesetenvf): New macro.
> 	(libc_fesetenvl): New macro.
> 	(libc_feupdateenv_mips): New function.
> 	(libc_feupdateenv): New macro.
> 	(libc_feupdateenvf): New macro.
> 	(libc_feupdateenvl): New macro.
> 
> 
> mips-libm.patch
> 
> 
> diff --git a/ports/sysdeps/mips/math_private.h b/ports/sysdeps/mips/math_private.h
> index 6b99957..0ac18fd 100644
> --- a/ports/sysdeps/mips/math_private.h
> +++ b/ports/sysdeps/mips/math_private.h
> @@ -26,6 +26,119 @@
>  # define HIGH_ORDER_BIT_IS_SET_FOR_SNAN
>  #endif
>  
> +/* Inline functions to speed up the math library implementation.  The
> +   default versions of these routines are in generic/math_private.h
> +   and call fesetround, feholdexcept, etc.  These routines use inlined
> +   code instead.  */
> +
> +#ifdef __mips_hard_float
> +
> +#include <fenv.h>
> +#include <fenv_libc.h>
> +#include <fpu_control.h>
> +
> +static __always_inline void
> +libc_feholdexcept_mips (fenv_t *envp)
> +{
> +  fpu_control_t cw;
> +
> +  /* Save the current state.  */
> +  _FPU_GETCW (cw);
> +  envp->__fp_control_register = cw;
> +
> +  /* Clear all exception enable bits and flags.  */
> +  cw &= ~(_FPU_MASK_V|_FPU_MASK_Z|_FPU_MASK_O|_FPU_MASK_U|_FPU_MASK_I|FE_ALL_EXCEPT);
> +  _FPU_SETCW (cw);
> +}
> +#define libc_feholdexcept libc_feholdexcept_mips
> +#define libc_feholdexceptf libc_feholdexcept_mips
> +#define libc_feholdexceptl libc_feholdexcept_mips
> +
> +static __always_inline void
> +libc_fesetround_mips (int round)
> +{
> +  fpu_control_t cw;
> +
> +  /* Get current state.  */
> +  _FPU_GETCW (cw);
> +
> +  /* Set rounding bits.  */
> +  cw &= ~0x3;

What's the magic ~0x3? Should it be a new macro?

> +  cw |= round;
> +
> +  /* Set new state.  */
> +  _FPU_SETCW (cw);
> +}
> +#define libc_fesetround libc_fesetround_mips
> +#define libc_fesetroundf libc_fesetround_mips
> +#define libc_fesetroundl libc_fesetround_mips
> +
> +static __always_inline void
> +libc_feholdexcept_setround_mips (fenv_t *envp, int round)
> +{
> +  fpu_control_t cw;
> +
> +  /* Save the current state.  */
> +  _FPU_GETCW (cw);
> +  envp->__fp_control_register = cw;
> +
> +  /* Clear all exception enable bits and flags.  */
> +  cw &= ~(_FPU_MASK_V|_FPU_MASK_Z|_FPU_MASK_O|_FPU_MASK_U|_FPU_MASK_I|FE_ALL_EXCEPT);
> +
> +  /* Set rounding bits.  */
> +  cw &= ~0x3;

Likewise?

> +  cw |= round;
> +
> +  /* Set new state.  */
> +  _FPU_SETCW (cw);
> +}
> +#define libc_feholdexcept_setround libc_feholdexcept_setround_mips
> +#define libc_feholdexcept_setroundf libc_feholdexcept_setround_mips
> +#define libc_feholdexcept_setroundl libc_feholdexcept_setround_mips
> +
> +static __always_inline void
> +libc_fesetenv_mips (fenv_t *envp)
> +{
> +  fpu_control_t cw;
> +
> +  /* Read first current state to flush fpu pipeline.  */
> +  _FPU_GETCW (cw);
> +
> +  if (envp == FE_DFL_ENV)
> +    _FPU_SETCW (_FPU_DEFAULT);
> +  else if (envp == FE_NOMASK_ENV)
> +    _FPU_SETCW (_FPU_IEEE);
> +  else
> +    _FPU_SETCW (envp->__fp_control_register);
> +}
> +#define libc_fesetenv libc_fesetenv_mips
> +#define libc_fesetenvf libc_fesetenv_mips
> +#define libc_fesetenvl libc_fesetenv_mips
> +
> +static __always_inline void
> +libc_feupdateenv_mips (fenv_t *envp)
> +{
> +  int temp;
> +
> +  /* Save current exceptions.  */
> +  _FPU_GETCW (temp);
> +
> +  /* Set flag bits (which are accumulative), and *also* set the
> +     cause bits. The setting of the cause bits is what actually causes

Two spaces after period.

> +     the hardware to generate the exception, if the corresponding enable
> +     bit is set as well.  */
> +  temp &= FE_ALL_EXCEPT;
> +  temp |= envp->__fp_control_register | (temp << CAUSE_SHIFT);
> +
> +  /* Set new state.  */
> +  _FPU_SETCW (temp);
> +}
> +#define libc_feupdateenv libc_feupdateenv_mips
> +#define libc_feupdateenvf libc_feupdateenv_mips
> +#define libc_feupdateenvl libc_feupdateenv_mips
> +
> +#endif
> +
>  #include_next <math_private.h>
>  
>  #endif

Otherwise looks good to me.

Cheers,
Carlos.

  reply	other threads:[~2013-09-20  3:32 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-19 22:58 Steve Ellcey
2013-09-20  3:32 ` Carlos O'Donell [this message]
2013-09-20 16:51   ` Steve Ellcey
2013-09-20 17:06     ` Carlos O'Donell
2013-09-21 18:47       ` Maciej W. Rozycki
2013-09-22 17:40         ` Carlos O'Donell
2013-09-23 16:42           ` Steve Ellcey
2013-09-20 15:01 ` Richard Henderson
2013-09-20 15:35 ` Joseph S. Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=523BC1B8.4040102@redhat.com \
    --to=carlos@redhat.com \
    --cc=libc-ports@sourceware.org \
    --cc=sellcey@mips.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).