public inbox for gcc@gcc.gnu.org
* Re: Multiplications on Pentium 4
@ 2001-09-08  8:31 dewar
  2001-09-08  9:17 ` Jan Hubicka
  0 siblings, 1 reply; 18+ messages in thread
From: dewar @ 2001-09-08  8:31 UTC (permalink / raw)
  To: jh, pfk; +Cc: gcc

It's amazing how poor the scaling lea's are on the Pentium 4; they probably
should never be generated.

(at least you could take care of this by simply removing them :-)
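
For readers following along, here is a minimal C sketch of the two forms
under discussion, using multiplication by 5 as the example. It assumes the
concern is the scaled address (base + index*4) that a single lea would
compute; the function names are hypothetical, and which instruction
sequence a given GCC version actually emits for either function depends on
the target and tuning options.

```c
#include <stdint.h>

/* x * 5 written as base + index * 4, the shape a scaled lea computes
   in one instruction.  (Illustrative name, not from the thread.) */
static uint32_t mul5_scaled(uint32_t x)
{
    return x + (x << 2);
}

/* x * 5 written with additions only, avoiding any shift or scaled
   address.  (Illustrative name, not from the thread.) */
static uint32_t mul5_adds(uint32_t x)
{
    uint32_t x2 = x + x;    /* 2x */
    uint32_t x4 = x2 + x2;  /* 4x */
    return x4 + x;          /* 5x */
}
```

Both functions return the same value for every input, including on
unsigned overflow, so a compiler is free to pick whichever form is cheaper
on the target.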

* Re: long long / long long
@ 2001-09-10 10:48 mike stump
  0 siblings, 0 replies; 18+ messages in thread
From: mike stump @ 2001-09-10 10:48 UTC (permalink / raw)
  To: gcc, jbuck, torvalds

> From: Linus Torvalds <torvalds@transmeta.com>
> Date: Mon, 10 Sep 2001 10:16:22 -0700
> To: jbuck@synopsys.COM, gcc@gcc.gnu.org
> Cc: 

> Well, the linux kernel people would also scream very loudly if the
> compiler started using floating point for integer divides (Linux
> uses -fno-fp-regs on architectures where it is needed/supported, but
> x86 doesn't even _have_ that flag right now).  In the kernel, we do
> NOT want to pollute the (big) FP state, as the kernel doesn't want
> to save/restore it all the time.

I'll echo this point as well.  In our OS, we use gcc, and we need the
ability to generate code that avoids the FP resources, as long as
there isn't any user code that appears to use those resources.

So, for example:

  long long a,b;

  int main(void) { a = b; return 0; }

should not use FP resources, but:

  double a,b;

  int main(void) { a = b; return 0; }

can.  Presently, our best use of gcc has us using -msoft-float for
this purpose, but that's not quite what we want.

Would be nice if gcc had a target independent way of doing this.
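
As an illustration of the source-level intent, here is a minimal sketch of
handling a 64-bit value as two 32-bit integer halves, so that a copy or an
addition needs only integer operations. The type u64_pair and the function
names are hypothetical and do not come from gcc or any kernel; whether a
particular compiler keeps such code out of the FP registers is still up to
that compiler.

```c
#include <stdint.h>

/* Hypothetical type: a 64-bit value stored as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Copy a 64-bit value as two 32-bit integer word moves. */
static void pair_copy(u64_pair *dst, const u64_pair *src)
{
    dst->lo = src->lo;
    dst->hi = src->hi;
}

/* Add two 64-bit values, propagating the carry from the low half. */
static u64_pair pair_add(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;
    /* The low word wrapped around exactly when the sum is below a.lo. */
    r.hi = a.hi + b.hi + (r.lo < a.lo);
    return r;
}
```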

* Re: long long / long long
@ 2001-09-11  5:06 Benedetto Proietti
  0 siblings, 0 replies; 18+ messages in thread
From: Benedetto Proietti @ 2001-09-11  5:06 UTC (permalink / raw)
  To: bernds; +Cc: gcc

On Mon, 10 Sep 2001 21:45:10 +0200 (MET DST) Michael Matz wrote:
>
> Hi,
>
> On Mon, 10 Sep 2001, Bernd Schmidt wrote:
> > > Well, the Linux kernel developers found that they couldn't let gcc
> > > do long long arithmetic because it did such a poor job, so they do
> > > it in assembly or in C on pairs of 32 bit values instead.  So at
> > > least some folks probably wouldn't mind seeing an improvement.
> >
> > The main problem probably is the register allocator's requirement that
> > DImode values be allocated to adjacent registers.
> 
> Yes, that requirement creates many constraints.
> 
> > Unfortunately this is not going to be easy to change.
> 
> The sad thing is that it _is_ easy to change in the allocator, and in
> fact would make the algorithm simpler (I'm talking only about the
> new-regalloc) and the graph easier to color.  The thing which horrifies
> me is the encoding of that requirement in the different machine
> descriptions.  A first step would be to define a new rtx code MREG
> ("multi" reg), which can possibly contain a set of (disjoint) REG
> or SUBREG expressions, including the then necessary handling of multi-reg
> moves (with cycle breaking).  The occurrences of those MREG rtx's could
> probably be limited to a few passes around the allocator.  Unfortunately,
> all .md files would nevertheless need a good overhaul.  If we only had
> such a multi-reg rtx from the beginning ;-|
> 

Hi,

In my university thesis I did something like this.  I called it "REGSET"
(SET of REGisters) instead of MREG, but it sounds like the same idea.
In the .md file I added "movblk" patterns like this:

(define_expand "movblk"
[(set (match_operand:BLK 0 "general_operand" "") 
       (match_operand:BLK 1 "general_operand" ""))]
....

(define_insn "hard_movblk_to_regset"
[(set (match_operand:BLK 0 "regset_operand" "") 
      (match_operand:BLK 1 "memory_operand" "m"))]
....

(define_insn "hard_movblk_from_regset"
[(set (match_operand:BLK 0 "memory_operand" "m") 
       (match_operand:BLK 1 "regset_operand" ""))]
....

Maybe a little primitive, but effective.
I also posted a first patch, but not in the *standard* way and not against
the latest release.
I hope to have time to do that soon.
Anyhow, many changes to the gcc code were necessary, some of them a little
hard! ;)
I did not implement attributes or anything else to specify that registers
must be consecutive, nor asm constraints for use with global register
variables.
The idea came about as a way to handle memory access to big structures,
because our machine had lots of registers (512 per processor!).
My comment is: it is possible and not so difficult; you would need to agree
on the syntax and the exact behaviour.

ciao
benedetto



> 
> Ciao,
> Michael.



Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
2001-09-08  8:31 Multiplications on Pentium 4 dewar
2001-09-08  9:17 ` Jan Hubicka
2001-09-08 17:13   ` Profiling and optimization Frank Klemm
2001-09-08 19:08   ` long long / long long Frank Klemm
2001-09-09 21:53     ` Joe Buck
2001-09-09 22:21       ` John S. Dyson
2001-09-10  6:20       ` Bernd Schmidt
2001-09-10 12:47         ` Michael Matz
2001-09-10 19:55           ` Hans-Peter Nilsson
2001-09-11  2:26           ` Jan Hubicka
2001-09-10 10:21       ` Linus Torvalds
2001-09-10 10:40         ` David Edelsohn
2001-09-11  2:21           ` Jan Hubicka
2001-09-11  2:20         ` Jan Hubicka
2001-09-11  8:06           ` Linus Torvalds
2001-09-10 12:18       ` Florian Weimer
2001-09-10 10:48 mike stump
2001-09-11  5:06 Benedetto Proietti
