public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/60138] New: superh single/double precision fpu mode setting is backwards and unusable
@ 2014-02-10 21:16 bugdal at aerifal dot cx
  2014-03-13 20:48 ` [Bug target/60138] " olegendo at gcc dot gnu.org
  2014-03-13 20:56 ` bugdal at aerifal dot cx
  0 siblings, 2 replies; 3+ messages in thread
From: bugdal at aerifal dot cx @ 2014-02-10 21:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60138

            Bug ID: 60138
           Summary: superh single/double precision fpu mode setting is
                    backwards and unusable
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bugdal at aerifal dot cx

The sh target switches between single and double precision fpu modes (needed
for operating on floats and doubles, respectively) via loading the fpscr from a
global variable array __fpscr_values[2]. At times in the past this array has
been provided by libgcc; it seems it's currently expected that libc provides
it.

Each time gcc needs to perform a floating point operation and does not know the
current mode, it loads the full fpscr value, clobbering any non-default
rounding mode (set by fesetround), exception flags (set by floating point
operations), and other state. This makes it impossible to implement working
fenv.h API on sh; while glibc implements these functions, they don't work,
because gcc immediately clobbers the fpscr before performing the next floating
point operation.

It should be noted that __fpscr_values is non-const, and in fact I've found
some software that writes non-default bits into this array so that it can use
non-default settings (particular, flush-denormals-to-zero mode) without them
getting clobbered (since gcc will happily reload whatever you put in this
array). I suspect the original intent was that the fenv.h functions could also
do this. However, that's utterly broken since __fpscr_values is global and
floating point environment is supposed to be thread-local.

To fix this, the ridiculous hack of loading the value for fpscr from
__fpscr_values and directly writing to the fpscr should be removed, and
replaced with a read-modify-write cycle (read old fpscr, set/mask precision
bit, and write back to fpscr). Naturally this will hurt performance but the
current code is simply non-working and needs to be fixed.

One could move __fpscr_values to TLS, but given that TLS access does not seem
to perform that well on sh, I suspect that's slower than the read-modify-write
cycle on fpscr.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/60138] superh single/double precision fpu mode setting is backwards and unusable
  2014-02-10 21:16 [Bug target/60138] New: superh single/double precision fpu mode setting is backwards and unusable bugdal at aerifal dot cx
@ 2014-03-13 20:48 ` olegendo at gcc dot gnu.org
  2014-03-13 20:56 ` bugdal at aerifal dot cx
  1 sibling, 0 replies; 3+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-03-13 20:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60138

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |sh*-*-*
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |olegendo at gcc dot gnu.org
         Resolution|---                         |DUPLICATE

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Rich Felker from comment #0)
> The sh target switches between single and double precision fpu modes (needed
> for operating on floats and doubles, respectively) via loading the fpscr
> from a global variable array __fpscr_values[2]. At times in the past this
> array has been provided by libgcc; it seems it's currently expected that
> libc provides it.

What leads you to this conclusion?  __fpscr_values is still in libgcc -- at
least last time I checked.

> 
> Each time gcc needs to perform a floating point operation and does not know
> the current mode, it loads the full fpscr value, clobbering any non-default
> rounding mode (set by fesetround), exception flags (set by floating point
> operations), and other state. This makes it impossible to implement working
> fenv.h API on sh; while glibc implements these functions, they don't work,
> because gcc immediately clobbers the fpscr before performing the next
> floating point operation.
> 
> It should be noted that __fpscr_values is non-const, and in fact I've found
> some software that writes non-default bits into this array so that it can
> use non-default settings (particular, flush-denormals-to-zero mode) without
> them getting clobbered (since gcc will happily reload whatever you put in
> this array). I suspect the original intent was that the fenv.h functions
> could also do this. However, that's utterly broken since __fpscr_values is
> global and floating point environment is supposed to be thread-local.
> 
> To fix this, the ridiculous hack of loading the value for fpscr from
> __fpscr_values and directly writing to the fpscr should be removed, and
> replaced with a read-modify-write cycle (read old fpscr, set/mask precision
> bit, and write back to fpscr). Naturally this will hurt performance but the
> current code is simply non-working and needs to be fixed.
> 
> One could move __fpscr_values to TLS, but given that TLS access does not
> seem to perform that well on sh, I suspect that's slower than the
> read-modify-write cycle on fpscr.

No need to move it to TLS, since the FPSCR register is usually already part of
an execution context (i.e. thread) and will be saved and restored accordingly.

I've raised the FPSCR issue a while ago ( PR 53513 ) and briefly discussed it
with Kaz.  The conclusion was to get rid of __fpscr_values usage altogether, as
you suggested it.

The reason why it's a constant global is/was performance.  Flipping the
FPSCR.PR bit with 'sts - xor - lds' takes way more cycles than simply reloading
it with a  single lds.

When this was introduced, double precision usage was not so popular and things
such as fenv.h did not exist, so 'correctness' wasn't so important.  Now of
course times have changed.  To proceed with this issue the missing bit in the
mode-switching RTL pass needs to be solved first.  See 3rd bullet point in PR
29349.  Unfortunately I haven't had the time to do it yet.

*** This bug has been marked as a duplicate of bug 53513 ***


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/60138] superh single/double precision fpu mode setting is backwards and unusable
  2014-02-10 21:16 [Bug target/60138] New: superh single/double precision fpu mode setting is backwards and unusable bugdal at aerifal dot cx
  2014-03-13 20:48 ` [Bug target/60138] " olegendo at gcc dot gnu.org
@ 2014-03-13 20:56 ` bugdal at aerifal dot cx
  1 sibling, 0 replies; 3+ messages in thread
From: bugdal at aerifal dot cx @ 2014-03-13 20:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60138

--- Comment #2 from Rich Felker <bugdal at aerifal dot cx> ---
> No need to move it to TLS, since the FPSCR register is usually already part of
> an execution context (i.e. thread) and will be saved and restored accordingly.

This does not help. The compiler constantly clobbers the FPSCR with the values
read from __fpscr_values[]. If you want to affect a change to the fpu behavior,
you'd have to modify the values in this array, and they are global, not
thread-local, which is wrong.

If use of __fpscr_values is eliminated, of course this will not matter.

Right now this is a blocking issue for musl libc having properly a working sh
port (floating point tests fail because fenv is not honored). We could treat sh
as a no-fenv port, but I'd rather not do that since the hardware does have
working fenv and it's just gcc messing it up; also we don't really support
no-fenv targets in a conforming way (we don't provide errno for math functions)
so I'd much rather have fenv.


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-03-13 20:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-10 21:16 [Bug target/60138] New: superh single/double precision fpu mode setting is backwards and unusable bugdal at aerifal dot cx
2014-03-13 20:48 ` [Bug target/60138] " olegendo at gcc dot gnu.org
2014-03-13 20:56 ` bugdal at aerifal dot cx

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).