From: Frank Klemm <pfk@fuchs.offl.uni-jena.de>
To: Jan Hubicka <jh@suse.cz>
Cc: gcc@gcc.gnu.org
Subject: Re: floor on i386
Date: Tue, 25 Sep 2001 12:48:00 -0000
Message-ID: <20010925210156.C328@fuchs.offl.uni-jena.de>
In-Reply-To: <20010925154659.B13734@atrey.karlin.mff.cuni.cz>

On Tue, Sep 25, 2001 at 03:46:59PM +0200, Jan Hubicka wrote:
> > It would probably be best to introduce a hard register to indicate the
> > rounding mode, and use OPTIMIZE_MODE_SWITCHING to do as few mode
> > changes as possible.  For reference, have a look at the SH4
> > implementation of floating-point support, that defines an explicit
> > floating-point control register, mode-switching RTL and USEs that
> 
> The USEs themselves are a problem - you lose a lot of optimizations then.
> The trick can be to lower the code before reload using pre-reload splitting.
> 
> A major problem still remains in reload.
> If we don't want to get exact IEEE by setting the proper precision before each
> mathematical operation (as SH4 does IMO), we will run into problems with spills,
> since these can be placed where the control word is set to some wrong value,
> resulting in wrong rounding before storing.
> 
> That's the main reason why my original patch didn't contain it.
> 
> The problem can be solved by a mode switching pass after reload, when all spills
> are visible - you use the existing pass before reload to compute control word values,
> as these need pseudos, and after reload just insert fldcw/fstcw at strategic places.
> 
> If you insert them at the last optimal position in the code, you will get them after the
> lazy code to compute the control word inserted by the pre-reload pass.
> 
> As discussed with Timothy, the benefits are relatively small compared to the first
> half (computing control word values optimally), as CPUs do have hardware bypass.
> 
> > register in all instructions that depend on the floating-point mode,
> > indicating in an attribute which mode the register is supposed to be
> > in.  The difference is that SH4 uses the floating-point control
> > register to switch between single- and double-precision operations,
> > that have the same encoding but different behavior depending on the
> > state of the control register.  Modeling mode switching for purposes
> > of rounding on x86 should be far simpler.  In fact, I'm not even sure
> > you'd need the hard register: just define unspec patterns that switch
> > back and forth and you're done.
> You need a scheduling barrier, but that is a big problem.
> 

Note that this optimization is necessary unless gcc is content with 4% of
the performance of icc on Intel IA-32. For example, an MPEG-2 Layer 2 decoder
spends 65% of its execution time rounding floats to integers (Athlon).
This is not a joke, it's a flaw of the compiler.

Currently gcc is unusable if you need fast float-to-int conversion
(video).
______________________________________________________________________

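Another work-around is the following. It can be implemented very quickly.

// Values are the rounding-control bits (bits 10-11) of the x87 control word.
enum rounding_model_e {
    round_default = 0x0000,
    round_floor   = 0x0400,   // toward -infinity
    round_ceil    = 0x0800,   // toward +infinity
    round_trunc   = 0x0C00,   // toward zero
    round_round   = 0x0000    // to nearest
};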

enum rounding_model_e  set_rounding_model ( enum rounding_model_e );

double         rint    ( double );
float          rintf   ( float );
long double    rintl   ( long double );
int            irint   ( double );		// 64 bit float to 32 bit int
long long      llrintl ( long double );		// 80 bit float to 64 bit int

Other target types (signed|unsigned, char|short|int|long|long long) are
also possible, as are other saturation models (wrap|saturate|integer infinity).

-- 
Frank Klemm

PS: There are CPUs with the following mapping of 32 bit integers:
    0x80000001...0x7FFFFFFF:   -2^31+1 ... +2^31-1
    0x80000000:                integer NAN

