From: Frank Klemm <pfk@fuchs.offl.uni-jena.de>
To: Jan Hubicka <jh@suse.cz>
Cc: gcc@gcc.gnu.org
Subject: Re: floor on i386
Date: Tue, 25 Sep 2001 12:48:00 -0000
Message-ID: <20010925210156.C328@fuchs.offl.uni-jena.de>
In-Reply-To: <20010925154659.B13734@atrey.karlin.mff.cuni.cz>
On Tue, Sep 25, 2001 at 03:46:59PM +0200, Jan Hubicka wrote:
> > It would probably be best to introduce a hard register to indicate the
> > rounding mode, and use OPTIMIZE_MODE_SWITCHING to do as few mode
> > changes as possible. For reference, have a look at the SH4
> > implementation of floating-point support, that defines an explicit
> > floating-point control register, mode-switching RTL and USEs that
>
> The USEs themselves are a problem: you lose a lot of optimizations then.
> The trick can be to lower the code before reload using pre-reload splitting.
>
> A major problem still remains in reload.
> If we don't want to get exact IEEE by setting the proper precision before each
> mathematical operation (as SH4 does, IMO), we will run into problems with spills,
> since these can be put in a place where the control word is set to some wrong
> value, resulting in wrong rounding before storing.
>
> That's the main reason why my original patch didn't contain it.
>
> The problem can be solved by a mode-switching pass after reload, when all spills
> are visible: you use the existing pass before reload to compute control word
> values, as these need pseudos, and after reload just insert fldcw/fstcw at
> strategic places.
>
> If you insert them at the last optimal position in the code, you will get them
> after the lazy code to compute the control word inserted by the pre-reload pass.
>
> As discussed with Timothy, the benefits are relatively small compared to the first
> half (computing control word values optimally), as CPUs do have hardware bypass.
>
> > register in all instructions that depend on the floating-point mode,
> > indicating in an attribute which mode the register is supposed to be
> > in. The difference is that SH4 uses the floating-point control
> > register to switch between single- and double-precision operations,
> > that have the same encoding but different behavior depending on the
> > state of the control register. Modeling mode switching for purposes
> > of rounding on x86 should be far simpler. In fact, I'm not even sure
> > you'd need the hard register: just define unspec patterns that switch
> > back and forth and you're done.
> You need a scheduling barrier, but it is a big problem.
>
Note that this optimization is necessary unless gcc wants to deliver only 4% of
the performance of icc on Intel IA-32. For example, an MPEG-2 Layer 2 decoder
spends 65% of its execution time rounding floats to integers (Athlon).
This is not a joke; it's a flaw in the compiler.
Currently gcc is unusable if you need fast float-to-int conversion
(video).
______________________________________________________________________
Another workaround is the following. It can be implemented very quickly.

    /* Values are the rounding-control bits (10-11) of the x87 control word. */
    enum rounding_model_e {
        round_default = 0x0000,
        round_floor   = 0x0400,   /* toward -infinity */
        round_ceil    = 0x0800,   /* toward +infinity */
        round_trunc   = 0x0C00,   /* toward zero */
        round_round   = 0x0000    /* to nearest */
    };

    enum rounding_model_e set_rounding_model ( enum rounding_model_e );

    double      rint    ( double );
    float       rintf   ( float );
    long double rintl   ( long double );
    int         irint   ( double );        // 64 bit float to 32 bit int
    long long   llrintl ( long double );   // 80 bit float to 64 bit int

Other target types (signed|unsigned, char|short|int|long|long long) are
also possible, as are other saturation models (wrap|saturate|integer infinity).
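
To make the proposal concrete, here is a minimal sketch of how
set_rounding_model() and irint() could look. It assumes GCC or Clang extended
asm on x86, where the enum values drop straight into the rounding-control
field of the x87 control word. The names USE_X87 and RC_MASK are made up for
this example, and the C99 <fenv.h> fallback for other targets is an
assumption of mine, not part of the proposal.

```c
/* Sketch only: GCC/Clang extended asm on x86, C99 <fenv.h> elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define USE_X87 1
#else
#define USE_X87 0
#include <fenv.h>
#include <math.h>
#endif

enum rounding_model_e {
    round_default = 0x0000,
    round_floor   = 0x0400,
    round_ceil    = 0x0800,
    round_trunc   = 0x0C00,
    round_round   = 0x0000
};

#define RC_MASK 0x0C00u   /* hypothetical name: x87 rounding control, bits 10-11 */

/* Select a rounding model and return the one that was active before. */
enum rounding_model_e set_rounding_model(enum rounding_model_e model)
{
#if USE_X87
    unsigned short cw, new_cw;

    __asm__ __volatile__("fnstcw %0" : "=m"(cw));          /* read control word */
    new_cw = (unsigned short)((cw & ~RC_MASK) | ((unsigned)model & RC_MASK));
    __asm__ __volatile__("fldcw %0" : : "m"(new_cw));      /* write it back */
    return (enum rounding_model_e)(cw & RC_MASK);
#else
    /* Fallback: remember the model ourselves and map it to <fenv.h>. */
    static enum rounding_model_e current = round_default;
    enum rounding_model_e previous = current;
    int mode = FE_TONEAREST;

    if (model == round_floor)      mode = FE_DOWNWARD;
    else if (model == round_ceil)  mode = FE_UPWARD;
    else if (model == round_trunc) mode = FE_TOWARDZERO;
    if (fesetround(mode) == 0)
        current = model;
    return previous;
#endif
}

/* Convert a double to int using whatever rounding model is active. */
int irint(double x)
{
#if USE_X87
    int result;

    /* fistpl stores st(0) as a 32-bit integer and pops it, so the
       input register is listed as clobbered. */
    __asm__ __volatile__("fistpl %0" : "=m"(result) : "t"(x) : "st");
    return result;
#else
    return (int)lrint(x);
#endif
}
```

The point of the interface is that the control word is written once, outside
the loop, and every irint() inside the loop is then a single store
instruction, instead of a save/modify/restore of the control word around each
conversion.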
--
Frank Klemm
PS: There are CPUs with the following mapping of 32-bit integers:
0x80000001...0x7FFFFFFF: -2^31+1 ... +2^31-1
0x80000000: integer NAN