public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/34011]  New: Memory load is not eliminated from tight vectorized loop
@ 2007-11-07  9:05 ubizjak at gmail dot com
  2007-11-07 18:06 ` [Bug rtl-optimization/34011] " dorit at gcc dot gnu dot org
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2007-11-07  9:05 UTC
  To: gcc-bugs

The following testcase exposes an optimization problem with current SVN gcc:

--cut here--
extern const int srcshift;

void good (const int *srcdata, int *dstdata)
{
  int i;

  for (i = 0; i < 256; i++)
    dstdata[i] = srcdata[i] << srcshift;
}


void bad (const int *srcdata, int *dstdata)
{
  int i;

  for (i = 0; i < 256; i++)
    {
      dstdata[i] |= srcdata[i] << srcshift;
    }
}
--cut here--

With -O3 -msse2, the loop in the above testcase gets vectorized, and the
generated code differs substantially between the good and bad functions:

good:
        ...
.L8:
        xorl    %eax, %eax
        movd    srcshift, %xmm1
        .p2align 4,,7
        .p2align 3
.L4:
        movdqu  (%ebx,%eax), %xmm0
        pslld   %xmm1, %xmm0
        movdqa  %xmm0, (%esi,%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L4
        ...

bad:
        ...
.L21:
        movl    %esi, %eax        (2)
        movl    %ebx, %edx
        leal    1024(%esi), %ecx
        .p2align 4,,7
        .p2align 3
.L17:
        movdqu  (%edx), %xmm0
        movd    srcshift, %xmm1   (1)
        pslld   %xmm1, %xmm0
        movdqu  (%eax), %xmm1     (3)
        por     %xmm1, %xmm0
        movdqa  %xmm0, (%eax)
        addl    $16, %eax         (4)
        addl    $16, %edx
        cmpl    %ecx, %eax
        jne     .L17
        popl    %ebx
        popl    %esi
        popl    %ebp
        ret

In addition to the memory load inside the loop (1), several other problems can
be identified: there is no need to copy the registers (2), because the loop is
followed directly by the function exit. For some reason, an additional IV is
used (4), and the same address is accessed with an unaligned load (3) as well
as an aligned store.
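
To separate problem (1) from the others, the shift count can be hoisted by
hand at the source level. The following is only a minimal sketch: the names
bad_hoisted and bad_reference are hypothetical, and srcshift is given a
definition (with the arbitrary value 3) only so that the fragment links on its
own. It makes no claim about the code any particular GCC version generates.

```c
/* The testcase declares srcshift as extern; it is defined here, with an
   arbitrary value, only so that this sketch is self-contained.  */
const int srcshift = 3;

/* Same computation as bad(), but the shift count is read once into a
   local before the loop, so no load of srcshift is needed per iteration.  */
void bad_hoisted (const int *srcdata, int *dstdata)
{
  const int shift = srcshift;
  int i;

  for (i = 0; i < 256; i++)
    dstdata[i] |= srcdata[i] << shift;
}

/* Reference version: the loop exactly as written in the testcase.  */
void bad_reference (const int *srcdata, int *dstdata)
{
  int i;

  for (i = 0; i < 256; i++)
    dstdata[i] |= srcdata[i] << srcshift;
}
```

Both functions compute the same result; comparing their assembly shows
whether the in-loop movd depends on how srcshift is read.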

The expected code for the "bad" case would be something like the "good" case
with additional movaps+por instructions:

.L8:
        xorl    %eax, %eax
        movd    srcshift, %xmm1
        .p2align 4,,7
        .p2align 3
.L4:
        movdqu  (%ebx,%eax), %xmm0
        movaps  %xmm0, %xmm2
        pslld   %xmm1, %xmm0
        por     %xmm2, %xmm0
        movdqa  %xmm0, (%esi,%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L4
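
The loop shape asked for here (shift count loaded once, a single index shared
by both arrays) can also be written by hand with SSE2 intrinsics. This is a
sketch under stated assumptions, not the code GCC is expected to emit: the
name bad_sse2 and the shift parameter are hypothetical, an x86 target with
SSE2 is assumed, and unaligned loads and stores are used throughout so that
the fragment does not depend on the alignment of dstdata. Since |= reads
dstdata[i], the sketch loads the destination vector before the por.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hand-vectorized dstdata[i] |= srcdata[i] << shift for 256 ints.
   The shift count is passed in as a parameter (an assumption made to
   keep the sketch self-contained; the testcase reads srcshift).  */
void bad_sse2 (const int *srcdata, int *dstdata, int shift)
{
  /* movd: move the shift count into an XMM register once, before the loop.  */
  const __m128i count = _mm_cvtsi32_si128 (shift);
  int i;

  /* One index serves both arrays; four ints are processed per iteration.  */
  for (i = 0; i < 256; i += 4)
    {
      __m128i s = _mm_loadu_si128 ((const __m128i *) (srcdata + i));
      __m128i d = _mm_loadu_si128 ((const __m128i *) (dstdata + i));

      s = _mm_sll_epi32 (s, count);   /* pslld with the count in a register */
      d = _mm_or_si128 (d, s);        /* por */
      _mm_storeu_si128 ((__m128i *) (dstdata + i), d);
    }
}
```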

The missing IV elimination could be attributed to the tree loop optimizations,
but the others are IMO RTL optimization problems, because we enter RTL
generation with:

good:
<bb 3>:
  MEM[base: dstdata, index: ivtmp.60] = M*(vect_p.29 + ivtmp.60){misalignment: 0} << srcshift.1;

bad:
<bb 4>:
  MEM[index: ivtmp.127] = M*(vector int *) ivtmp.130{misalignment: 0} << srcshift.3 | M*(vector int *) ivtmp.127{misalignment: 0};
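
In the dumps above, the "good" loop addresses both arrays through a single
ivtmp.60, while the "bad" loop steps two separate pointers, ivtmp.127 and
ivtmp.130. A rough source-level analogy of the two shapes is sketched below.
It is an illustration only: the function names and the shift parameter are
hypothetical, and the compiler chooses its induction variables itself
regardless of how the source is written.

```c
/* One induction variable: both arrays are indexed by the same i,
   comparable to the base + index form in the "good" dump.  */
void or_shift_one_iv (const int *src, int *dst, int shift)
{
  int i;

  for (i = 0; i < 256; i++)
    dst[i] |= src[i] << shift;
}

/* Two induction variables plus a precomputed end pointer,
   comparable to the separately stepped pointers in the "bad" dump.  */
void or_shift_two_iv (const int *src, int *dst, int shift)
{
  int *end = dst + 256;

  for (; dst != end; dst++, src++)
    *dst |= *src << shift;
}
```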


-- 
           Summary: Memory load is not eliminated from tight vectorized loop
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ubizjak at gmail dot com
GCC target triplet: i686-*-*, x86_64-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011




Thread overview: 10+ messages
2007-11-07  9:05 [Bug rtl-optimization/34011] New: Memory load is not eliminated from tight vectorized loop ubizjak at gmail dot com
2007-11-07 18:06 ` [Bug rtl-optimization/34011] " dorit at gcc dot gnu dot org
2009-09-12 19:25 ` ubizjak at gmail dot com
2009-09-12 20:02 ` [Bug tree-optimization/34011] " rguenth at gcc dot gnu dot org
2009-09-15 14:07 ` rguenth at gcc dot gnu dot org
2009-09-15 14:40 ` rguenth at gcc dot gnu dot org
2009-09-16  8:51 ` rguenth at gcc dot gnu dot org
2009-09-17  9:08 ` rguenth at gcc dot gnu dot org
     [not found] <bug-34011-4@http.gcc.gnu.org/bugzilla/>
2012-01-20 10:42 ` ubizjak at gmail dot com
2021-07-26 19:49 ` pinskia at gcc dot gnu.org
