[Bug target/53513] SH Target: Add support for fschg and fpchg insns

From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/53513] SH Target: Add support for fschg and fpchg insns
Date: Sun, 16 Mar 2014 20:47:00 -0000
Message-ID: <bug-53513-4-BNVSOM6Xu5@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-53513-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-03-16
     Ever confirmed|0                           |1

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> ---
As mentioned in PR 60138, this issue also prevents a working implementation of
fenv.h & friends on SH.

The idea would be to get rid of the __fpscr_values first and set the FPSCR.PR
bit with insn sequences such as:

set pr = 1:
  sts     fpscr,r2
  mov.l   #(1 << 19),r1
  or      r1,r2
  lds     r2,fpscr

set pr = 0:
  sts     fpscr,r2
  mov.l   #~(1 << 19),r1
  and     r1,r2
  lds     r2,fpscr

This would obviously result in a performance regression but would work with all
SH FPUs.
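
For reference, the bit manipulation done by the two sequences above can be
modelled in plain C on a copy of the FPSCR value.  This is only a sketch: the
names FPSCR_PR, fpscr_set_pr and fpscr_clear_pr are made up for illustration,
and the actual register access (sts / lds) is not shown.

```c
#include <stdint.h>

/* FPSCR.PR is bit 19, as used in the insn sequences above.  */
#define FPSCR_PR (UINT32_C (1) << 19)

/* Model of "set pr = 1": OR the PR bit into the current value.  */
static inline uint32_t fpscr_set_pr (uint32_t fpscr)
{
  return fpscr | FPSCR_PR;
}

/* Model of "set pr = 0": mask the PR bit out, keep all other bits.  */
static inline uint32_t fpscr_clear_pr (uint32_t fpscr)
{
  return fpscr & ~FPSCR_PR;
}
```

Both helpers leave every other FPSCR bit untouched, which is what makes this
approach usable without the __fpscr_values table.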

On SH4A this can then be improved by adding support for fpchg, although this
would require changes/extensions to the mode switching machinery, as mentioned
in PR 29349.  The problem is that the mode switching pass only emits mode
changes to a particular mode, not from mode 'x' to mode 'y'.  In PR 29349 an
extension of the pre_edge_lcm function is suggested which would make the
necessary information available.
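
To illustrate why the source mode matters, here is a small C model of a
toggling insn such as fpchg.  It is a sketch under assumptions: model_fpchg
and fpchg_count are hypothetical names, and modes are encoded as 0 = single,
1 = double just for this example.

```c
#include <stdint.h>

#define FPSCR_PR (UINT32_C (1) << 19)

/* Model of fpchg: it flips PR, whatever the current value is.  */
static inline uint32_t model_fpchg (uint32_t fpscr)
{
  return fpscr ^ FPSCR_PR;
}

/* Hypothetical helper: number of fpchg insns needed to get from mode
   'from' to mode 'to' (0 = single, 1 = double).  The answer depends on
   'from', which is the information the mode switching pass lacks.  */
static inline int fpchg_count (int from, int to)
{
  return from != to;
}
```

A "switch to double" request that is blindly turned into one fpchg would end
up in single mode whenever PR was already set to double.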

Here are a few more 'requirements' for SH-specific mode change issues:

1)
The following function:

double test (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0];
  double cc = c[0] + c[1];

  aa += b[1] * b[2];
  cc += c[2] + c[3];
  aa += b[3] * b[4];
  cc += c[4] + c[5];
  aa += b[5] * b[6];

  return aa / cc;
}

compiled with -m4 -O2 (default PR mode = double), results in 4 mode switches.
Rewriting it as:

double test (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0] + b[1] * b[2] + b[3] * b[4] + b[5] * b[6];

  double cc = c[0] + c[1];
  cc += c[2] + c[3];
  cc += c[4] + c[5];

  return aa / cc;
}

results in only 2 mode switches (as expected).  FP operations which are
independent should be reordered in order to minimize mode switches.
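
The effect of ordering can be shown with a deliberately naive model that walks
a sequence of FP operations in the given order and counts how often PR has to
change.  This is a sketch, not what the mode switching pass does: the function
name and the 'S' / 'D' string encoding are assumptions made for this example.

```c
#include <stddef.h>

/* Count PR mode switches for FP operations executed in the given order.
   ops is a string of 'S' (single) and 'D' (double) operations; mode is
   the PR mode on entry, which is also restored on exit.  */
static unsigned int count_pr_switches (const char *ops, char mode)
{
  const char entry_mode = mode;
  unsigned int switches = 0;
  size_t i;

  for (i = 0; ops[i] != '\0'; ++i)
    if (ops[i] != mode)
      {
        mode = ops[i];
        ++switches;
      }

  /* Switch back to the entry mode if the last operation left another.  */
  if (mode != entry_mode)
    ++switches;

  return switches;
}
```

With all single precision operations grouped in front of the double precision
ones, e.g. count_pr_switches ("SSSSDDDD", 'D'), the model yields 2 switches,
which matches the rewritten function.  It ignores any reordering the compiler
already does, so it is not meant to reproduce the 4 switches reported for the
first version.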
This could go as far as ...

2)
... doing loop distribution, for cases such as:

double test (const float* x, const double* y, unsigned int c)
{
  float r0 = 0;
  double r1 = 0;

  while (c--)
  {
    float xx = x[0] * x[1] + x[2] + 123.0f;
    x += 3;

    double yy = y[0] + y[1];
    y += 2;

    r0 += xx;
    r1 += yy;
  }

  return r0 + r1;
}

which currently produces a loop with 4 mode switches in it.  Reordering the FP
operations would bring this down to 2 mode switches in the loop.  Since r0 and
r1 are calculated independently, 2 loops can be used, having the mode switches
outside the loops.
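
Written out by hand at the source level, the distributed version could look
roughly like the following.  This is only a sketch of the intended result of
the transformation; the name test_distributed is made up, and it relies on r0
and r1 being independent as stated above.

```c
double test_distributed (const float* x, const double* y, unsigned int c)
{
  float r0 = 0;
  double r1 = 0;
  unsigned int i;

  /* Single precision loop: PR = single can be set once before it.  */
  for (i = 0; i < c; ++i)
  {
    r0 += x[0] * x[1] + x[2] + 123.0f;
    x += 3;
  }

  /* Double precision loop: PR = double can be set once before it.  */
  for (i = 0; i < c; ++i)
  {
    r1 += y[0] + y[1];
    y += 2;
  }

  return r0 + r1;
}
```

Each accumulator still sees its additions in the same order as in the original
loop, so the result should not change; only the loop overhead is paid twice.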

3)
FPSCR.SZ mode changes might interfere with FPSCR.PR mode changes.  For example,
using fschg to flip FPSCR.SZ might require changing FPSCR.PR first (and
potentially changing it back).  If fpchg is not available (it is SH4A only),
it's better to set both bits directly.  In order to minimize mode switches it
might be necessary to reorder instructions and do loop distribution while
looking at the PR and SZ bits simultaneously.
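
"Setting both bits directly" means one read-modify-write of FPSCR instead of
separate PR and SZ changes.  A minimal C model of that, assuming FPSCR.SZ sits
at bit 20 (next to PR at bit 19) and using the hypothetical helper name
fpscr_set_pr_sz:

```c
#include <stdint.h>

#define FPSCR_PR (UINT32_C (1) << 19)
#define FPSCR_SZ (UINT32_C (1) << 20)  /* Assumed bit position.  */

/* Set PR and SZ to the requested values in one step and keep all other
   FPSCR bits as they are.  */
static inline uint32_t fpscr_set_pr_sz (uint32_t fpscr, int pr, int sz)
{
  fpscr &= ~(FPSCR_PR | FPSCR_SZ);
  if (pr)
    fpscr |= FPSCR_PR;
  if (sz)
    fpscr |= FPSCR_SZ;
  return fpscr;
}
```

Which PR / SZ combinations are actually valid on a given SH variant is not
checked by this model.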

4)
Rounding mode settings also mean FPSCR mode changes:
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01378.html

5)
In some cases preserving FPSCR bits across mode changes is not required (if I'm
not mistaken):

double func (float a, float b, double c, double d)
{
  #pragma STDC FENV_ACCESS ON

  // function entry, PR = double

  // mode switch PR = single
  float ab = a + b;

  // mode switch PR = double
  double x = ab + c + d;

  // read back FP status bits and do something with it
  return x;

  // function exit, PR = double
}

In this case the mode switch double -> float -> double can be done more
efficiently by pushing the PR = double FPSCR state onto the stack, switching to
PR = single and then switching back to PR = double by popping FPSCR from the
stack.

However, this must not happen if other FPSCR settings are changed after the
first switch to PR = single, for example by invoking a standard function that
modifies the floating-point environment or by changing the FPSCR.FR bit on SH4.
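
The hazard can be seen in a small C model that works on a plain FPSCR value.
It is a sketch under assumptions: switch_via_save_restore is a hypothetical
name, FPSCR.FR is assumed to be bit 21, and the push / pop is modelled with a
local variable.

```c
#include <stdint.h>

#define FPSCR_PR (UINT32_C (1) << 19)
#define FPSCR_FR (UINT32_C (1) << 21)  /* Assumed bit position.  */

/* Model of double -> single -> double done by saving and restoring the
   whole FPSCR.  If flip_fr is nonzero, FPSCR.FR is changed while
   PR = single.  Returns the FPSCR value after the restore.  */
static uint32_t switch_via_save_restore (uint32_t fpscr, int flip_fr)
{
  const uint32_t saved = fpscr;   /* push FPSCR  */

  fpscr &= ~FPSCR_PR;             /* PR = single  */
  if (flip_fr)
    fpscr ^= FPSCR_FR;            /* some other FPSCR change  */

  fpscr = saved;                  /* pop FPSCR, PR = double again  */
  return fpscr;
}
```

The restore brings back PR = double, but it also silently undoes the FR change
made in between, which is why the shortcut is only valid when nothing else in
FPSCR is modified between the push and the pop.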
