public inbox for gcc-bugs@sourceware.org
* [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: kretz at kde dot org @ 2012-11-23 15:15 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
Bug #: 55448
Summary: using const-reference SSE or AVX types leads to
unnecessary unaligned loads
Classification: Unclassified
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: kretz@kde.org
The following testcase:
#include <immintrin.h>
static inline __m256 add(const __m256 &a, const __m256 &b) { return _mm256_add_ps(a, b); }
void foo(__m256 &a, const __m256 b) { a = add(a, b); }
static inline __m128 add(const __m128 &a, const __m128 &b) { return _mm_add_ps(a, b); }
void foo(__m128 &a, const __m128 b) { a = add(a, b); }
compiled with "-O2 -mavx" leads to
vmovups (%rdi), %xmm1
vinsertf128 $0x1, 16(%rdi), %ymm1, %ymm1
vaddps %ymm0, %ymm1, %ymm0
vmovaps %ymm0, (%rdi)
for the __m256 case and
vmovups (%rdi), %xmm1
vaddps %xmm0, %xmm1, %xmm0
vmovaps %xmm0, (%rdi)
for the __m128 case.
It should rather be:
vaddps (%rdi), %ymm0, %ymm0
vmovaps %ymm0, (%rdi)
and:
vaddps (%rdi), %xmm0, %xmm0
vmovaps %xmm0, (%rdi)
The desired output is generated if the const-reference parameters of add are
changed to pass-by-value.
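A minimal sketch of the pass-by-value workaround next to the original const-reference form, restricted to the __m128 case so that it builds without -mavx. The names add_ref, add_val, foo_ref, foo_val and variants_agree are hypothetical and only serve this illustration; the code shows that both forms compute the same result, not which instructions a given compiler version emits.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_add_ps, ...

// Form from the report: operands taken by const reference.
static inline __m128 add_ref(const __m128 &a, const __m128 &b) {
    return _mm_add_ps(a, b);
}

// Workaround form: operands taken by value.
static inline __m128 add_val(__m128 a, __m128 b) {
    return _mm_add_ps(a, b);
}

void foo_ref(__m128 &a, const __m128 b) { a = add_ref(a, b); }
void foo_val(__m128 &a, const __m128 b) { a = add_val(a, b); }

// Hypothetical helper: runs both forms on identical inputs and
// reports whether all four lanes of the results are equal.
bool variants_agree() {
    __m128 x = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 y = x;
    const __m128 b = _mm_set1_ps(0.5f);
    foo_ref(x, b);
    foo_val(y, b);
    // _mm_cmpeq_ps sets every bit of each equal lane;
    // _mm_movemask_ps gathers the four lane sign bits into an int.
    return _mm_movemask_ps(_mm_cmpeq_ps(x, y)) == 0xF;
}
```

To compare the generated code, one could compile this with "-O2 -mavx -S" and inspect foo_ref and foo_val; the difference described above is only expected on an affected GCC version.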
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: jakub at gcc dot gnu.org @ 2012-11-23 17:28 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2012-11-23
CC| |jakub at gcc dot gnu.org,
| |jamborm at gcc dot gnu.org
Ever Confirmed|0 |1
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-11-23 17:28:01 UTC ---
The low alignment originates from eipa_sra, foo isn't still in SSA form, and
ipa_modify_call_arguments computes align and misalign of base, which is
PARM_DECL (of REFERENCE_TYPE, referencing __m256 with 256-bit alignment).
But get_pointer_alignment_1 for PARM_DECLs always returns 8-bit alignment.
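To illustrate the alignment that the referenced type carries, here is a small sketch at the source level, using __m128 (16-byte aligned) so that it builds without -mavx; __m256 behaves the same way with 32 bytes. The helper names are hypothetical, and the code says nothing about GCC's internal data structures. It only shows that an object bound to a `const __m128 &` is expected to sit at a suitably aligned address, which is the information the 8-bit answer discards.

```cpp
#include <xmmintrin.h>  // __m128, _mm_setzero_ps
#include <cstdint>      // std::uintptr_t

// The vector type itself requires 16-byte (128-bit) alignment.
static_assert(alignof(__m128) == 16, "__m128 is expected to be 16-byte aligned");

// Hypothetical helper: true if the object bound to the reference lies
// at an address that is a multiple of alignof(__m128).
bool bound_object_is_aligned(const __m128 &v) {
    return reinterpret_cast<std::uintptr_t>(&v) % alignof(__m128) == 0;
}

// Hypothetical helper: checks an ordinary local variable of the vector type.
bool local_is_aligned() {
    __m128 local = _mm_setzero_ps();
    return bound_object_is_aligned(local);
}
```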
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: jakub at gcc dot gnu.org @ 2012-11-23 17:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |uros at gcc dot gnu.org
Target Milestone|--- |4.8.0
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-11-23 17:50:56 UTC ---
With -O2 -mavx -fno-ipa-sra the whole
#include <x86intrin.h>
static inline __m256
add1 (const __m256 & a, const __m256 & b)
{
return _mm256_add_ps (a, b);
}
void
f1 (__m256 & a, const __m256 b)
{
a = add1 (a, b);
}
static inline __m128
add2 (const __m128 & a, const __m128 & b)
{
return _mm_add_ps (a, b);
}
void
f2 (__m128 & a, const __m128 b)
{
a = add2 (a, b);
}
static inline __m256
add3 (const __m256 *a, const __m256 *b)
{
return _mm256_add_ps (*a, *b);
}
void
f3 (__m256 *a, const __m256 b)
{
*a = add3 (a, &b);
}
static inline __m128
add4 (const __m128 *a, const __m128 *b)
{
return _mm_add_ps (*a, *b);
}
void
f4 (__m128 *a, const __m128 b)
{
*a = add4 (a, &b);
}
testcase compiles into optimal code. Beyond the eipa_sra issue, the thing is
that for AVX/AVX2 we should generally attempt to combine unaligned loads with
the operations that use them (unless it is a plain move), but there is
UNSPEC_LOADU involved (and for 256-bit values also a vec_concat with another
MEM load), so I am not sure which pass would be best to handle that: some hack
in the combiner, peephole2 (but we'd need many of them), or something else.
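As a source-level illustration of the pattern discussed here, the sketch below performs an explicit unaligned load and feeds it straight into an addition. With VEX-encoded (AVX) instructions, the memory operand of an arithmetic instruction such as vaddps does not have to be aligned, so such a load could in principle be folded into the add; whether a particular compiler version does so is not guaranteed. The function names are hypothetical, and the __m128 case is used so that the code builds without -mavx.

```cpp
#include <xmmintrin.h>  // _mm_loadu_ps, _mm_add_ps, _mm_set1_ps, _mm_cvtss_f32

// Adds b to the four floats starting at p; p need not be 16-byte aligned.
__m128 add_from_unaligned(const float *p, __m128 b) {
    __m128 v = _mm_loadu_ps(p);  // explicit unaligned load
    return _mm_add_ps(v, b);     // candidate for folding the load into the add
}

// Hypothetical helper: loads from buf + 1, which is offset by one float
// from the start of the array, and returns lane 0 of the sum.
float first_lane_of_sum() {
    float buf[5] = {0.0f, 1.0f, 2.0f, 3.0f, 4.0f};
    __m128 r = add_from_unaligned(buf + 1, _mm_set1_ps(1.0f));
    return _mm_cvtss_f32(r);  // lane 0 is buf[1] + 1.0f
}
```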
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: kretz at kde dot org @ 2012-11-24 21:38 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
--- Comment #3 from Matthias Kretz <kretz at kde dot org> 2012-11-24 21:38:21 UTC ---
BTW, the problem is just as visible with SSE only. The __m128 case then
compiles to movlps and movhps instead of using the memory operand.
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: jamborm at gcc dot gnu.org @ 2012-11-27 20:46 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-11-27 20:45:50 UTC ---
I have proposed a patch on the mailing list:
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02265.html
It still needs a testcase from this bug but addresses this problem.
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: jamborm at gcc dot gnu.org @ 2012-11-30 16:12 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
--- Comment #5 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-11-30 16:11:44 UTC ---
Author: jamborm
Date: Fri Nov 30 16:11:33 2012
New Revision: 193998
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193998
Log:
2012-11-30 Martin Jambor <mjambor@suse.cz>
PR middle-end/52890
PR tree-optimization/55415
PR tree-optimization/54386
PR target/55448
* ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
get_pointer_alignment_1 returns false and the base was not a
dereference.
* tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
added check for required alignment. Update the user.
* testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.
* testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise.
* testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise.
* testsuite/gcc.target/i386/pr55448.c: Likewise.
Added:
trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c
trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c
trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c
trunk/gcc/testsuite/gcc.target/i386/pr55448.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-prop.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-sra.c
* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
From: jamborm at gcc dot gnu.org @ 2012-12-03 13:29 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448
Martin Jambor <jamborm at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |DUPLICATE
--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-12-03 13:28:48 UTC ---
Duplicate. Now a fixed one, fortunately.
*** This bug has been marked as a duplicate of bug 54386 ***