From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marc Lehmann
To: gcc@gcc.gnu.org
Subject: Re: MMX regs and GCC
Date: Thu, 30 Sep 1999 18:02:00 -0000
Message-ID: <19990912010441.C16740@cerebro.laendle>
References: <199909081748.KAA22465@netcom1.netcom.com> <37D6C6D7.80917F58@pobox.com>
X-SW-Source: 1999-09n/msg00444.html
Message-ID: <19990930180200.fyq87oewPXZPdjtdXSI7p-wt1LGOjHdarqmg2wlN8f4@z>

On Wed, Sep 08, 1999 at 04:28:07PM -0400, Jeff Garzik wrote:
> hmmm. I have the Pentium Pro databooks, and the quoted text below seems
> to imply direct moves between general registers are possible. Since

Fortunately, I have some direct data for an mmx implementation of this
sort in gcc (actually, pgcc).

pgcc implements (very suboptimally) mmx registers as general integer
registers that can store one and only one integer value (no
parallelism).

There are two options. The first automatically emits emms (quite
aggressively, though), which works even when you mix integer (mmx) and
fp code. That option is a loss on every cpu except the p-ii, where it is
almost as fast as without the option.

The second option just removes all emms from the output. With it,
benchmarks seem to run faster on the p-ii, and a bit slower on other
cpus.

I'm quite sure that with proper scheduling the first case could be
improved into a net win, even if we make no effort to optimize for
parallel use.

(I have also just been told that pgcc's placement of emms is _very_ bad)

> reg<->MMX reg transfers are possible, that seems to imply that loading
> and using data in MMX registers would be cheaper than loaded data from
> memory.

Yes, on the p-ii, that is. On other cpus (intel or not) moves between
general registers and mmx might be very slow.

> "The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data
> from memory to MMX registers and visa versa, or from integer registers
> to MMX registers and visa versa."

The problem was actually 64 bit moves, which are implemented as
push;push;movq;add.

Using the 3dnow or katmai instructions will eventually get rid of the
x86-fp nonsense, independent of mmx usage (but I don't have such a cpu
to test that).

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg@goof.com      |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |