public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/54422] New: Merge adjacent stores of elements of a vector (or loads)
@ 2012-08-30 16:01 glisse at gcc dot gnu.org
  2012-08-30 21:29 ` [Bug tree-optimization/54422] " steven at gcc dot gnu.org
  2012-09-03 10:37 ` rguenth at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-08-30 16:01 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54422

             Bug #: 54422
           Summary: Merge adjacent stores of elements of a vector (or
                    loads)
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: glisse@gcc.gnu.org
            Target: x86_64-linux-gnu


Hello,

#include <x86intrin.h>
void f1(__m128d*dd,__m128d e){
  double*d=(double*)dd;
  d[0]=e[0];
  d[1]=e[1];
}
void f2(__m128d*dd,__m128d e){
  _mm_storeu_pd((double*)dd,e);
}
void f3(__m128d*dd,__m128d e){
  __builtin_memcpy(dd,&e,16);
}

For this code, gcc -O3 -mavx2 generates:

for f2:
    vmovupd    %xmm0, (%rdi)
(it could possibly have inferred that dd is 16-byte aligned and used vmovapd,
but I don't mind today)

for f1:
    vmovlpd    %xmm0, (%rdi)
    vmovhpd    %xmm0, 8(%rdi)
(this is my main issue: could it merge those into a single vmovupd?)

for f3:
    vmovdqa    %xmm0, -40(%rsp)
    movq    -40(%rsp), %rax
    vmovapd    %xmm0, -24(%rsp)
    movq    %rax, (%rdi)
    movq    -16(%rsp), %rax
    movq    %rax, 8(%rdi)
(I hope the SSE memcpy patch at
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00336.html will eventually help
with that)


At tree level, for f1, we have:
  _3 = BIT_FIELD_REF <e_5(D), 64, 0>;
  MEM[(double *)dd_1(D)] = _3;
  _6 = BIT_FIELD_REF <e_5(D), 64, 64>;
  MEM[(double *)dd_1(D) + 8B] = _6;

Merging those two stores looks like it might be possible (though I am not
familiar with that part of the compiler, so maybe only the backend can handle
it). Note that I am interested in both the aligned and the unaligned case (the
latter being f1 taking a double* argument instead of a __m128d*), and in both
loads and stores.
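
To make the variants mentioned above concrete, here is a minimal sketch of the
double* (unaligned) store and of the load counterpart. It is an illustration,
not code from the report: the type v2df and all function names are hypothetical,
and a generic GCC/Clang vector type stands in for __m128d so that the sketch
does not depend on x86intrin.h.

```c
#include <assert.h>
#include <string.h>

/* Assumption: generic 2 x double vector type standing in for __m128d. */
typedef double v2df __attribute__((vector_size(16)));

/* Store variant of f1 where only double alignment is guaranteed. */
void store_elementwise(double *d, v2df e) {
  d[0] = e[0];
  d[1] = e[1];
}

/* Load counterpart: two adjacent scalar loads feeding one vector. */
v2df load_elementwise(const double *d) {
  v2df r = { d[0], d[1] };
  return r;
}

/* Reference forms: one 16-byte copy, which is what a merged access
   would amount to semantically. */
void store_whole(double *d, v2df e) {
  memcpy(d, &e, sizeof e);
}

v2df load_whole(const double *d) {
  v2df r;
  memcpy(&r, d, sizeof r);
  return r;
}

/* Hypothetical self-check: both forms must agree, including at an
   address that is not necessarily 16-byte aligned (buf + 1). */
void check_elementwise_matches_whole(void) {
  double buf[3] = { 0.0, 0.0, 0.0 };
  double out[2];
  v2df e = { 1.5, -2.25 };

  store_elementwise(buf + 1, e);
  v2df a = load_elementwise(buf + 1);
  v2df b = load_whole(buf + 1);
  assert(a[0] == 1.5 && a[1] == -2.25);
  assert(a[0] == b[0] && a[1] == b[1]);

  store_whole(out, a);
  assert(out[0] == 1.5 && out[1] == -2.25);
}
```

The element-wise and whole-vector forms are interchangeable in what they
compute; the request in this report is only about the instructions the
compiler emits for the element-wise ones.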

The most relevant other bugs I found are PR 41464, PR 23684 and PR 47059.


* [Bug tree-optimization/54422] Merge adjacent stores of elements of a vector (or loads)
From: steven at gcc dot gnu.org @ 2012-08-30 21:29 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54422

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-08-30
     Ever Confirmed|0                           |1



* [Bug tree-optimization/54422] Merge adjacent stores of elements of a vector (or loads)
From: rguenth at gcc dot gnu.org @ 2012-09-03 10:37 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54422

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-09-03 10:36:59 UTC ---
This might again be a target for the basic-block vectorizer (though the
vectorizer does not deal well with pre-existing vector types), or for a
tree-level store combiner, judging by the bugs you cite.
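
As a rough illustration of what a store combiner would have to recognize, the
sketch below checks whether a sequence of lane stores copies a whole vector to
one contiguous region. It is a toy model under strong assumptions, not GCC
code: struct lane_store and can_merge_into_vector_store are hypothetical names,
the stores are assumed to share a base pointer and a source vector, to appear
in lane order, and to have no intervening aliasing accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one scalar store of a vector lane, modelling
   MEM[base + mem_off] = BIT_FIELD_REF <vec, size * 8, lane_off * 8>. */
struct lane_store {
  size_t mem_off;   /* byte offset from the common base pointer */
  size_t lane_off;  /* byte offset of the lane inside the source vector */
  size_t size;      /* number of bytes stored */
};

/* Return 1 if the n stores, taken in order, copy the whole vec_size-byte
   vector to one contiguous memory region, so that a single vector store
   could replace them. Return 0 otherwise. */
int can_merge_into_vector_store(const struct lane_store *s, size_t n,
                                size_t vec_size) {
  size_t covered = 0;

  if (n == 0)
    return 0;
  for (size_t i = 0; i < n; i++) {
    /* The lane read must be the next one in the source vector. */
    if (s[i].lane_off != covered)
      return 0;
    /* The memory offset must advance in lockstep with the lane. */
    if (s[i].mem_off != s[0].mem_off + covered)
      return 0;
    covered += s[i].size;
  }
  return covered == vec_size;
}

/* Hypothetical self-check using the two stores from f1. */
void check_f1_pattern(void) {
  struct lane_store f1[2] = { { 0, 0, 8 }, { 8, 8, 8 } };
  assert(can_merge_into_vector_store(f1, 2, 16) == 1);
}
```

A real pass would additionally have to prove that nothing between the stores
may alias the destination and to decide whether an aligned or an unaligned
vector store is appropriate.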


