public inbox for gcc-bugs@sourceware.org
From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/53513] SH Target: Add support for fschg and fpchg insns
Date: Sun, 16 Mar 2014 20:47:00 -0000
Message-ID: <bug-53513-4-BNVSOM6Xu5@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-53513-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-03-16
     Ever confirmed|0                           |1

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> ---
As mentioned in PR 60138, this issue also prevents a working implementation of
fenv.h & friends on SH.

The idea would be to get rid of the __fpscr_values first and set the FPSCR.PR
bit with insn sequences such as:

set pr = 1:
    sts    fpscr,r2
    mov.l  #(1 << 19),r1
    or     r1,r2
    lds    r2,fpscr

set pr = 0:
    sts    fpscr,r2
    mov.l  #~(1 << 19),r1
    and    r1,r2
    lds    r2,fpscr

This would obviously result in a performance regression but would work with all
SH FPUs.

On SH4A this can then be improved by adding support for fpchg. However, this
would require changes/extensions to the mode switching machinery, as mentioned
in PR 29349. The problem is that the mode switching pass emits only mode
changes to a particular mode, not from mode 'x' to mode 'y'. In PR 29349 an
extension of the pre_edge_lcm function is suggested which would make the
necessary information available.

Here are a few more 'requirements' for SH specific mode change issues:

1) The following function:

double test (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0];
  double cc = c[0] + c[1];

  aa += b[1] * b[2];
  cc += c[2] + c[3];

  aa += b[3] * b[4];
  cc += c[4] + c[5];

  aa += b[5] * b[6];

  return aa / cc;
}

compiled with -m4 -O2 (default PR mode = double) results in 4 mode switches.
Rewriting it as:

double test (const float* a, const float* b, const double* c, float x)
{
  float aa = a[0] * b[0] + b[1] * b[2] + b[3] * b[4] + b[5] * b[6];

  double cc = c[0] + c[1];
  cc += c[2] + c[3];
  cc += c[4] + c[5];

  return aa / cc;
}

results in only 2 mode switches (as expected).
FP operations which are independent should be reordered in order to minimize
mode switches. This could go as far as ...

2) ... doing loop distribution, for cases such as:

double test (const float* x, const double* y, unsigned int c)
{
  float r0 = 0;
  double r1 = 0;

  while (c--)
  {
    float xx = x[0] * x[1] + x[2] + 123.0f;
    x += 3;

    double yy = y[0] + y[1];
    y += 2;

    r0 += xx;
    r1 += yy;
  }
  return r0 + r1;
}

which currently produces a loop with 4 mode switches in it. Reordering the FP
operations would bring this down to 2 mode switches in the loop. Since r0 and
r1 are calculated independently, 2 loops can be used, having the mode switches
outside the loops.

3) FPSCR.SZ mode changes might interfere with FPSCR.PR mode changes. For
example, using fschg to flip FPSCR.SZ might require changing FPSCR.PR first
(and potentially changing it back). If fpchg (SH4A only) is not available,
it's better to set both bits directly. In order to minimize mode switches it
might be necessary to reorder instructions and do loop distribution while
looking at the PR and SZ bits simultaneously.

4) rounding mode settings also mean FPSCR mode changes.
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01378.html

5) in some cases preserving FPSCR bits across mode changes is not required (if
I'm not mistaken):

double func (float a, float b, double c, double d)
{
  #pragma STDC FENV_ACCESS ON

  // function entry, PR = double

  // mode switch PR = single
  float ab = a + b;

  // mode switch PR = double
  double x = ab + c + d;

  // read back FP status bits and do something with it

  return x;

  // function exit, PR = double
}

In this case the mode switch double -> float -> double can be done more
efficiently by pushing the PR = double FPSCR state onto the stack, switching
to PR = single and then switching back to PR = double by popping FPSCR from
the stack. However, this must not happen if other FPSCR settings are changed
after the first switch to PR = single, such as invoking a fenv-modifying
standard function or changing the FPSCR.FR bit on SH4.
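
For illustration only, the PR bit handling discussed above can be modelled in
plain C on an ordinary 32-bit value instead of the real register. This is a
rough sketch, not what GCC emits. All helper names are hypothetical. PR = bit
19 is taken from the insn sequences above; the RM field in bits 1..0 is an
assumption based on the SH4 FPSCR layout. The toggle helper models what fpchg
does to FPSCR.PR, which is why it only gives the desired mode if the current
mode is known. The last two helpers contrast the push/pop idea from 5) with
the or-based sequence when a rounding mode change happens while PR = single.

```c
#include <assert.h>
#include <stdint.h>

/* PR = bit 19 as in the sequences above; RM in bits 1..0 is an assumption. */
#define FPSCR_PR  (UINT32_C(1) << 19)
#define FPSCR_RM  UINT32_C(0x3)

/* Hypothetical helpers, working on a plain value, not the real FPSCR. */

/* sts / or / lds: always ends up with PR = 1 (double). */
static uint32_t fpscr_set_pr (uint32_t fpscr)    { return fpscr | FPSCR_PR; }

/* sts / and / lds: always ends up with PR = 0 (single). */
static uint32_t fpscr_clear_pr (uint32_t fpscr)  { return fpscr & ~FPSCR_PR; }

/* fpchg-like: flips PR, so the result depends on the mode before it. */
static uint32_t fpscr_toggle_pr (uint32_t fpscr) { return fpscr ^ FPSCR_PR; }

/* Leave PR = single by "popping" the value saved at function entry.
   Anything changed in 'current' after the save is thrown away.  */
static uint32_t
leave_single_by_pop (uint32_t saved, uint32_t current)
{
  assert ((saved & FPSCR_PR) != 0);  /* saved state must be PR = double */
  (void) current;
  return saved;
}

/* Leave PR = single by setting the PR bit; other bits are preserved. */
static uint32_t
leave_single_by_or (uint32_t current)
{
  return fpscr_set_pr (current);
}
```

With an entry state of PR = double and RM = 0, changing RM to 1 while in
PR = single and then leaving via leave_single_by_pop gives back RM = 0, while
leave_single_by_or keeps RM = 1. That is the case where the push/pop shortcut
must not be used.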