From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sylvain Pion
To: Richard Henderson
Cc: law@cygnus.com, Jason Merrill, egcs@egcs.cygnus.com
Subject: Re: C++ default copy ctor not optimal
Date: Sun, 28 Feb 1999 22:53:00 -0000
Message-ID: <19990216091048.A13581@rigel.inria.fr>
References: <19990215171524.A19063@cygnus.com> <14555.919137968@upchuck> <19990215213927.A19254@cygnus.com>
X-SW-Source: 1999-02n/msg00699.html
Message-ID: <19990228225300.s_JAAAuQaYI7WkLtBo_RG0L-KSZAw8yqz6t7gyQyiSw@z>

On Mon, Feb 15, 1999 at 09:39:27PM -0800, Richard Henderson wrote:
> On Mon, Feb 15, 1999 at 09:06:08PM -0700, Jeffrey A Law wrote:
> > > The x86 fpu can load DImode values without faulting, and since
> > > the fractional part of the extended double register is 64-bits
> > > wide, we don't lose bits.

"can"? Does that mean it depends on some flags in the FPCW?
What if the FPU is in MMX mode? I guess it won't work, will it?
In MMX mode we can use MMX insns, but the compiler doesn't know which
mode we are in.

> > But is it profitable? Particularly in cases where the addresses are
> > not 64bit aligned?
>
> Certainly not when alignment is not to be had. But on Pentiums,
> it can speed things up quite a bit.

Yes. The speedup is noticeable for my stuff, so I guess that using it
more widely is a good idea, if it's feasible.
The speed difference is also significant when the alignment is not
correct.

> I'm not sure what effect it has on p2. Probably still a good thing
> in small doses. Larger copies should use rep movsl, as the microcode
> does neat cache tricks.

I don't know, but the FP memcpy() patch for the Linux kernel worked very
well (at least on Pentiums), and it was for large areas.

-- 
Sylvain