public inbox for gcc-bugs@sourceware.org
* [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads
@ 2012-11-23 15:15 kretz at kde dot org
  2012-11-23 17:28 ` [Bug target/55448] " jakub at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: kretz at kde dot org @ 2012-11-23 15:15 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

             Bug #: 55448
           Summary: using const-reference SSE or AVX types leads to
                    unnecessary unaligned loads
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: kretz@kde.org


The following testcase:

#include <immintrin.h>
static inline __m256 add(const __m256 &a, const __m256 &b) { return _mm256_add_ps(a, b); }
void foo(__m256 &a, const __m256 b) { a = add(a, b); }

static inline __m128 add(const __m128 &a, const __m128 &b) { return _mm_add_ps(a, b); }
void foo(__m128 &a, const __m128 b) { a = add(a, b); }

compiled with "-O2 -mavx"

leads to
        vmovups (%rdi), %xmm1
        vinsertf128     $0x1, 16(%rdi), %ymm1, %ymm1
        vaddps  %ymm0, %ymm1, %ymm0
        vmovaps %ymm0, (%rdi)

for the __m256 case and

        vmovups (%rdi), %xmm1
        vaddps  %xmm0, %xmm1, %xmm0
        vmovaps %xmm0, (%rdi)

for the __m128 case.

It should rather be:
        vaddps  (%rdi), %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
and:
        vaddps  (%rdi), %xmm0, %xmm0
        vmovaps %xmm0, (%rdi)

The latter result can be obtained if the const-ref arguments to add are changed
to pass by value.
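
A minimal sketch of the by-value workaround mentioned above, limited to the
__m128 case so that it builds without -mavx. The names add_by_value,
foo_by_value and by_value_adds_lanes are hypothetical and not part of the
original testcase; only the parameter passing of the helper differs.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_add_ps, _mm_set_ps, _mm_store_ps

// Hypothetical by-value variant of add() from the report: the arguments are
// passed as values instead of const references.
static inline __m128 add_by_value(__m128 a, __m128 b) {
    return _mm_add_ps(a, b);
}

// Same shape as foo() in the report, calling the by-value helper.
void foo_by_value(__m128 &a, const __m128 b) {
    a = add_by_value(a, b);
}

// Sanity check: the rewritten helper still adds lane by lane.
bool by_value_adds_lanes() {
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);  // lanes 1,2,3,4 from low to high
    const __m128 b = _mm_set1_ps(10.0f);
    foo_by_value(a, b);
    alignas(16) float out[4];
    _mm_store_ps(out, a);
    return out[0] == 11.0f && out[1] == 12.0f &&
           out[2] == 13.0f && out[3] == 14.0f;
}
```

The sketch only shows the source-level change; which instructions a given
compiler version emits for it has to be checked in the generated assembly.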



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
@ 2012-11-23 17:28 ` jakub at gcc dot gnu.org
  2012-11-23 17:51 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-11-23 17:28 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-11-23
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |jamborm at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-11-23 17:28:01 UTC ---
The low alignment originates from eipa_sra: foo isn't in SSA form yet, and
ipa_modify_call_arguments computes align and misalign of the base, which is a
PARM_DECL (of REFERENCE_TYPE, referencing __m256 with 256-bit alignment).
But get_pointer_alignment_1 always returns 8-bit alignment for PARM_DECLs.



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
  2012-11-23 17:28 ` [Bug target/55448] " jakub at gcc dot gnu.org
@ 2012-11-23 17:51 ` jakub at gcc dot gnu.org
  2012-11-24 21:38 ` kretz at kde dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-11-23 17:51 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org
   Target Milestone|---                         |4.8.0

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-11-23 17:50:56 UTC ---
With -O2 -mavx -fno-ipa-sra the whole
#include <x86intrin.h>

static inline __m256
add1 (const __m256 & a, const __m256 & b)
{
  return _mm256_add_ps (a, b);
}

void
f1 (__m256 & a, const __m256 b)
{
  a = add1 (a, b);
}

static inline __m128
add2 (const __m128 & a, const __m128 & b)
{
  return _mm_add_ps (a, b);
}

void
f2 (__m128 & a, const __m128 b)
{
  a = add2 (a, b);
}

static inline __m256
add3 (const __m256 *a, const __m256 *b)
{
  return _mm256_add_ps (*a, *b);
}

void
f3 (__m256 *a, const __m256 b)
{
  *a = add3 (a, &b);
}

static inline __m128
add4 (const __m128 *a, const __m128 *b)
{
  return _mm_add_ps (*a, *b);
}

void
f4 (__m128 *a, const __m128 b)
{
  *a = add4 (a, &b);
}

testcase compiles into optimal code.  Beyond the eipa_sra issue, for AVX/AVX2
we should generally attempt to combine unaligned loads with the operations that
use them (unless the operation is a plain move).  But UNSPEC_LOADU is involved
(and, for 256-bit values, also a vec_concat with another MEM load), so I'm not
sure which pass would be best to handle that: some hack in the combiner,
peephole2 (but we'd need many peepholes), or something else.
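
For illustration, the source-level pattern this is about looks roughly like the
sketch below: an explicit unaligned load whose only use is an arithmetic
operation. VEX-encoded arithmetic instructions accept unaligned memory
operands, which is why folding the load into the operation is possible under
-mavx. The function names are hypothetical, and the sketch uses the 128-bit
SSE intrinsics so that it builds without extra flags; it makes no claim about
what any particular GCC version emits.

```cpp
#include <xmmintrin.h>  // SSE: _mm_loadu_ps, _mm_add_ps, _mm_store_ps

// Hypothetical example: add four floats from a possibly unaligned address
// to an accumulator.
__m128 add_from_unaligned(const float *p, __m128 acc) {
    const __m128 loaded = _mm_loadu_ps(p);  // explicit unaligned load
    return _mm_add_ps(loaded, acc);         // the operation that consumes it
}

// Sanity check on an address that is deliberately not 16-byte aligned.
bool unaligned_add_works() {
    alignas(16) float buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    const __m128 r = add_from_unaligned(buf + 1, _mm_set1_ps(1.0f));
    alignas(16) float out[4];
    _mm_store_ps(out, r);
    return out[0] == 2.0f && out[1] == 3.0f &&
           out[2] == 4.0f && out[3] == 5.0f;
}
```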



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
  2012-11-23 17:28 ` [Bug target/55448] " jakub at gcc dot gnu.org
  2012-11-23 17:51 ` jakub at gcc dot gnu.org
@ 2012-11-24 21:38 ` kretz at kde dot org
  2012-11-27 20:46 ` jamborm at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: kretz at kde dot org @ 2012-11-24 21:38 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

--- Comment #3 from Matthias Kretz <kretz at kde dot org> 2012-11-24 21:38:21 UTC ---
BTW, the problem is equally visible with only SSE. The __m128 case then
compiles to movlps and movhps instead of using the memory operand.



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
                   ` (2 preceding siblings ...)
  2012-11-24 21:38 ` kretz at kde dot org
@ 2012-11-27 20:46 ` jamborm at gcc dot gnu.org
  2012-11-30 16:12 ` jamborm at gcc dot gnu.org
  2012-12-03 13:29 ` jamborm at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2012-11-27 20:46 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-11-27 20:45:50 UTC ---
I have proposed a patch on the mailing list:

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02265.html

It still needs a testcase from this bug but addresses this problem.



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
                   ` (3 preceding siblings ...)
  2012-11-27 20:46 ` jamborm at gcc dot gnu.org
@ 2012-11-30 16:12 ` jamborm at gcc dot gnu.org
  2012-12-03 13:29 ` jamborm at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2012-11-30 16:12 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

--- Comment #5 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-11-30 16:11:44 UTC ---
Author: jamborm
Date: Fri Nov 30 16:11:33 2012
New Revision: 193998

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193998
Log:
2012-11-30  Martin Jambor  <mjambor@suse.cz>

    PR middle-end/52890
    PR tree-optimization/55415
    PR tree-optimization/54386
    PR target/55448
    * ipa-prop.c (ipa_modify_call_arguments): Be optimistic when
    get_pointer_alignment_1 returns false and the base was not a
    dereference.
    * tree-sra.c (access_precludes_ipa_sra_p): New parameter req_align,
    added check for required alignment.  Update the user.

    * testsuite/gcc.dg/ipa/ipa-sra-7.c: New test.
    * testsuite/gcc.dg/ipa/ipa-sra-8.c: Likewise.
    * testsuite/gcc.dg/ipa/ipa-sra-9.c: Likewise.
    * testsuite/gcc.target/i386/pr55448.c: Likewise.


Added:
    trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-7.c
    trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-8.c
    trunk/gcc/testsuite/gcc.dg/ipa/ipa-sra-9.c
    trunk/gcc/testsuite/gcc.target/i386/pr55448.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa-prop.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c



* [Bug target/55448] using const-reference SSE or AVX types leads to unnecessary unaligned loads
  2012-11-23 15:15 [Bug target/55448] New: using const-reference SSE or AVX types leads to unnecessary unaligned loads kretz at kde dot org
                   ` (4 preceding siblings ...)
  2012-11-30 16:12 ` jamborm at gcc dot gnu.org
@ 2012-12-03 13:29 ` jamborm at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: jamborm at gcc dot gnu.org @ 2012-12-03 13:29 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55448

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> 2012-12-03 13:28:48 UTC ---
Duplicate. Now a fixed one, fortunately.

*** This bug has been marked as a duplicate of bug 54386 ***


