public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/40122] New: missed optimization when using union of __m128i and int[4]
@ 2009-05-12 13:52 kretz at kde dot org
2009-05-12 15:01 ` [Bug middle-end/40122] " rguenth at gcc dot gnu dot org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: kretz at kde dot org @ 2009-05-12 13:52 UTC (permalink / raw)
To: gcc-bugs
The following testcase
#include <emmintrin.h>

typedef union {
  __m128i v;
  int m[4];
} VectorUnion;

VectorUnion one()
{
  VectorUnion r = { _mm_set1_epi32(1) };
  return r;
}

int main()
{
  VectorUnion x = one();
  if (0xffff == _mm_movemask_epi8(_mm_cmpeq_epi32(x.v, x.v))) {
    return 0;
  }
  return 1;
}
compiles (-Wall -Wextra -O2 -mssse3) to
00000000004004d0 <main>:
  4004d0:  66 0f 6f 05 38 01 00 00   movdqa 0x138(%rip),%xmm0
  4004d8:  66 0f 7f 44 24 d8         movdqa %xmm0,-0x28(%rsp)
  4004de:  48 8b 44 24 d8            mov    -0x28(%rsp),%rax
  4004e3:  48 89 44 24 e8            mov    %rax,-0x18(%rsp)
  4004e8:  48 8b 44 24 e0            mov    -0x20(%rsp),%rax
  4004ed:  48 89 44 24 f0            mov    %rax,-0x10(%rsp)
  4004f2:  66 0f 6f 44 24 e8         movdqa -0x18(%rsp),%xmm0
  4004f8:  66 0f 76 c0               pcmpeqd %xmm0,%xmm0
  4004fc:  66 0f d7 c0               pmovmskb %xmm0,%eax
As can be seen, the xmm0 register is stored to the stack, copied to a second
stack slot via two 64-bit moves, and then loaded back into xmm0 from there.
The values on the stack are not used later on.
I expected gcc to recognize those moves as no-ops and produce code like
  movdqa 0x138(%rip),%xmm0
  pcmpeqd %xmm0,%xmm0
  pmovmskb %xmm0,%eax
--
Summary: missed optimization when using union of __m128i and
int[4]
Product: gcc
Version: 4.3.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: kretz at kde dot org
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40122
* [Bug middle-end/40122] missed optimization when using union of __m128i and int[4]
From: rguenth at gcc dot gnu dot org @ 2009-05-12 15:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2009-05-12 15:00 -------
The union copy confuses GCC:
  r.v = VIEW_CONVERT_EXPR<vector long long int>({1, 1, 1, 1});
  D.6990 = r;
  x = D.6990;
  D.6997 = VIEW_CONVERT_EXPR<vector int>(x.v);
  D.6994 = __builtin_ia32_pcmpeqd128 (D.6997, D.6997);
  D.7000 = __builtin_ia32_pmovmskb128 (VIEW_CONVERT_EXPR<vector char>(VIEW_CONVERT_EXPR<vector long long int>(D.6994)));
  return D.7000 != 65535;
This will likely be fixed by the new SRA, or it is a duplicate of PR36327.
Martin, can you check this (and maybe add a testcase)?
--
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot
                   |                            |org, mjambor at suse dot cz
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization
* [Bug middle-end/40122] missed optimization when using union of __m128i and int[4]
From: pinskia at gcc dot gnu dot org @ 2009-05-12 15:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2009-05-12 15:24 -------
This is a dup of bug 36327.
*** This bug has been marked as a duplicate of 36327 ***
--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE
* [Bug middle-end/40122] missed optimization when using union of __m128i and int[4]
From: jamborm at gcc dot gnu dot org @ 2009-05-21 16:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jamborm at gcc dot gnu dot org 2009-05-21 16:02 -------
With the new SRA, the optimized dump looks like:
D.6886_10 = {1, 1, 1, 1};
D.6887_11 = VIEW_CONVERT_EXPR<vector long long int>(D.6886_10);
D.6893_12 = VIEW_CONVERT_EXPR<vector int>(D.6887_11);
D.6891_14 = __builtin_ia32_pcmpeqd128 (D.6893_12, D.6893_12);
D.6890_15 = VIEW_CONVERT_EXPR<vector long long int>(D.6891_14);
D.6897_16 = VIEW_CONVERT_EXPR<vector char>(D.6890_15);
D.6896_17 = __builtin_ia32_pmovmskb128 (D.6897_16);
D.6933_21 = D.6896_17 != 65535;
return D.6933_21;
x is completely gone.
The (relevant) assembly output is
main:
        movdqa  .LC0, %xmm0
        pcmpeqd %xmm0, %xmm0
        pmovmskb %xmm0, %eax
        cmpl    $65535, %eax
        pushl   %ebp
        setne   %al
        movl    %esp, %ebp
        movzbl  %al, %eax
        popl    %ebp
        ret
So even though I don't really understand the SSE instructions, I
believe the new SRA does indeed help. I'll add a testcase checking
that x vanishes to the patch series, as I am finalizing the patch
set now.
* [Bug middle-end/40122] missed optimization when using union of __m128i and int[4]
From: jamborm at gcc dot gnu dot org @ 2009-05-25 15:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from jamborm at gcc dot gnu dot org 2009-05-25 15:20 -------
...hm, when I wanted to make such a testcase I realized that the SSE
code is not very portable. So I changed my mind and won't use it.
I'll be adding different union scalarization checks, though.
* [Bug middle-end/40122] missed optimization when using union of __m128i and int[4]
From: rguenth at gcc dot gnu dot org @ 2009-05-25 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenth at gcc dot gnu dot org 2009-05-25 15:59 -------
I have some CCP / fold_stmt patches that produce
movdqa .LC1(%rip), %xmm0
pcmpeqd %xmm0, %xmm0
pmovmskb %xmm0, %eax
cmpl $65535, %eax
setne %al
movzbl %al, %eax
ret
as well. The issue is that the CONSTRUCTOR from _mm_set1_epi32(1) is neither
marked TREE_CONSTANT nor folded to VECTOR_CST.