public inbox for gcc-help@gcc.gnu.org
* _mm_movpi64_epi64 does not generate MOVQ2DQ
From: James Milne @ 2006-05-19 13:50 UTC
  To: gcc-help

Hello,

I have some code written in SSE2 intrinsics that is compiled with GCC 4.1.0, and
I've been profiling it with Intel's VTune 8.0.

I'm unpacking some interleaved data into planar form, and due to the nature of
the packing I'm going through the MMX registers first, before moving into the
XMM registers.

At the point where I want to move my data from MMX to XMM registers, I'm calling
_mm_movpi64_epi64(). Ideally, this ought to generate a MOVQ2DQ instruction, but
instead GCC is saving the value from the MMX register to the stack, then loading
that value back into an XMM register.
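
A minimal sketch of the kind of call in question, for anyone who wants to
inspect the generated code. The helper names mmx_to_xmm and widen_u64 are
hypothetical and only serve as a self-contained test case; they are not taken
from the real image-processing loop.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the MMX __m64 type */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move a 64-bit MMX value into the low half of an
   XMM register. The upper 64 bits of the result are zero. */
static __m128i mmx_to_xmm(__m64 v)
{
    return _mm_movpi64_epi64(v);
}

/* Hypothetical driver: pushes a 64-bit integer through an __m64 value and
   stores the widened 128-bit result as two 64-bit words. */
static void widen_u64(uint64_t in, uint64_t out[2])
{
    __m64 m;
    memcpy(&m, &in, sizeof m);            /* build the __m64 value */
    __m128i x = mmx_to_xmm(m);            /* the conversion under discussion */
    _mm_empty();                          /* leave MMX state before returning */
    _mm_storeu_si128((__m128i *)out, x);  /* unaligned store of all 128 bits */
}
```

Compiling this with -O2 -msse2 -S and reading the output for mmx_to_xmm should
show whether a MOVQ2DQ or a store/reload pair is emitted; the result may differ
between compiler versions.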

The assembly generated is this:

mov $0x0, -56(%ebp)
movq %mm0, -88(%ebp)
movq -88(%ebp), %xmm3
movhps -56(%ebp), %xmm3

I would have expected to see this:

movq2dq %mm0, %xmm3
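
One possible stopgap is to request the instruction explicitly through GCC's
extended inline assembly. This is only a sketch, not verified against
GCC 4.1.0: it assumes the x86 machine constraints "y" (any MMX register) and
"x" (any SSE register), and the names force_movq2dq and widen_u64_asm are
hypothetical.

```c
#include <emmintrin.h>  /* __m128i, __m64, _mm_empty, _mm_storeu_si128 */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of any library: emit MOVQ2DQ directly.
   "y" asks for an MMX register, "x" for an SSE register (assumed GCC x86
   constraints). MOVQ2DQ zeroes the upper 64 bits of the destination. */
static inline __m128i force_movq2dq(__m64 src)
{
    __m128i dst;
    __asm__("movq2dq %1, %0" : "=x"(dst) : "y"(src));
    return dst;
}

/* Hypothetical driver used only to exercise the helper. */
static void widen_u64_asm(uint64_t in, uint64_t out[2])
{
    __m64 m;
    memcpy(&m, &in, sizeof m);            /* build the __m64 value */
    __m128i x = force_movq2dq(m);         /* MMX -> XMM without a stack trip */
    _mm_empty();                          /* leave MMX state before returning */
    _mm_storeu_si128((__m128i *)out, x);  /* unaligned store of all 128 bits */
}
```

The trade-off is that the compiler cannot see through the asm statement, so it
cannot combine or eliminate the move the way it could with the intrinsic.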

The issue is that VTune reports that the former sequence blocks store
forwarding and introduces a large stall in my code. This is in the inner loop
of some image processing code.

There's not exactly much register pressure, since my register usage is 
distributed about 50/50 between MMX and XMM, and I'm only using half of each 
register set.

Has anyone else seen similar behaviour? Is there something that is preventing
GCC from issuing MOVQ2DQ? I'm building with -msse2.

-- 
Kind regards
James Milne
