* Questions about the instruction selection for sse2 intrinsic with -O optimization
@ 2004-03-09 3:36 haibo
From: haibo @ 2004-03-09 3:36 UTC (permalink / raw)
To: gcc
Hi:
I am trying to compile an SSE2 program with gcc-3.4-02-25 (prerelease version).
The assembly code I get is surprising: it does not correspond to the
intrinsic functions I used, and the final result is wrong.
For example, for the intrinsics:
mark0 = _mm_set1_epi16(0xffff);
mark0 = _mm_insert_epi16(mark0, 0, 0);
I get:
movl $65535, %eax
movd %eax, %xmm0
punpcklwd %xmm0, %xmm0
pshufd $0, %xmm0, %xmm0
movdqa %xmm0, -888(%ebp)
movw $15, %ax
movd %eax, %xmm0
punpcklbw %xmm0, %xmm0
punpcklbw %xmm0, %xmm0
pshufd $0, %xmm0, %xmm0
movdqa %xmm0, -568(%ebp)
and no "pinsrw" instruction is emitted.
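For reference, the lane-level result those two calls are expected to produce can be sketched in plain C, without any SSE2 support. This is only a scalar model: the type vec8x16 and the model_* helpers are hypothetical names made up for illustration, not part of GCC or of emmintrin.h, and masking the index to 0..7 is an assumption of the model.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit vector holding eight 16-bit lanes. */
typedef struct { uint16_t lane[8]; } vec8x16;

/* Model of _mm_set1_epi16: broadcast one 16-bit value to every lane. */
static vec8x16 model_set1_epi16(uint16_t w)
{
    vec8x16 v;
    int i;
    for (i = 0; i < 8; i++)
        v.lane[i] = w;
    return v;
}

/* Model of _mm_insert_epi16: overwrite one lane, leave the others alone.
   The index is masked to 0..7 (an assumption of this model). */
static vec8x16 model_insert_epi16(vec8x16 v, uint16_t w, int idx)
{
    v.lane[idx & 7] = w;
    return v;
}

/* Lane i of the mask built in the post: all ones, then lane 0 cleared. */
uint16_t expected_lane(int i)
{
    vec8x16 mark0 = model_set1_epi16(0xffff);
    mark0 = model_insert_epi16(mark0, 0, 0);
    return mark0.lane[i & 7];
}
```

Under this model, lane 0 is 0 and lanes 1 to 7 stay 0xffff, whichever instructions the compiler chooses to get there.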
I wonder when and where in the source code gcc does this.
I have stepped through the source code, and it does reach the code that
processes the _mm_insert_epi16 intrinsic and emits the RTL for it, but
when gcc does insn matching, it doesn't match it with sse2_pinsrw.
thanks.
haibo
chbchb1130@sina.com
2004-03-09
* Re: Questions about the instruction selection for sse2 intrinsic with -O optimization
@ 2004-03-10 1:52 ` Giovanni Bajo
From: Giovanni Bajo @ 2004-03-10 1:52 UTC (permalink / raw)
To: haibo, gcc
haibo wrote:
> Hi:
> I am trying to compile an SSE2 program with gcc-3.4-02-25 (prerelease
> version). The assembly code I get is surprising: it does not correspond
> to the intrinsic functions I used, and the final result is wrong.
Can you please file a bug report on Bugzilla about this, together with a minimal
testcase to reproduce the bug? Thank you!
Giovanni Bajo
* Re: Questions about the instruction selection for sse2 intrinsic with -O optimization
@ 2004-03-10 2:05 ` Jim Wilson
From: Jim Wilson @ 2004-03-10 2:05 UTC (permalink / raw)
To: haibo; +Cc: gcc
haibo wrote:
> I wonder when and where in the source code gcc does this.
> I have stepped through the source code, and it does reach the code that
> processes the _mm_insert_epi16 intrinsic and emits the RTL for it, but
> when gcc does insn matching, it doesn't match it with sse2_pinsrw.
This is a bit too vague to easily answer.
You didn't provide a proper testcase that I can compile. You didn't
provide example RTL dumps, so I am not sure exactly what you are
complaining about.
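A compilable testcase of the kind being asked for might look like the sketch below. It is not the original poster's program: the function names build_mask and mask_is_correct are made up here, and only the two intrinsic calls come from the post. It assumes an x86 target with SSE2 enabled (for example -msse2 on 32-bit x86).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Build the mask from the post: every 16-bit lane 0xffff, then lane 0 cleared. */
static __m128i build_mask(void)
{
    __m128i mark0 = _mm_set1_epi16((short)0xffff);
    mark0 = _mm_insert_epi16(mark0, 0, 0);
    return mark0;
}

/* Return 1 if the mask holds the expected lanes, 0 otherwise. */
int mask_is_correct(void)
{
    uint16_t lanes[8];
    int i;

    /* Unaligned store, so lanes[] needs no special alignment. */
    _mm_storeu_si128((__m128i *)lanes, build_mask());

    if (lanes[0] != 0)
        return 0;
    for (i = 1; i < 8; i++)
        if (lanes[i] != 0xffff)
            return 0;
    return 1;
}
```

Calling mask_is_correct() from main and building once without optimization and once with -O (adding -S to keep the assembly) would show whether the result really changes with optimization, which is the information missing from the report.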
I am guessing that the RTL emitted for the sse2_pinsrw pattern is
also matched by other patterns, and hence you got other instructions.
This might be an optimization. This might be an accidental bug. It
might be a false impression on your part that the _mm_insert_epi16
intrinsic should always map to a pinsrw instruction. There isn't enough
information here to tell.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com