public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/51492] New: vectorizer generates unnecessary code
@ 2011-12-10  1:38 drepper.fsp at gmail dot com
  2011-12-12 10:25 ` [Bug tree-optimization/51492] vectorizer does not support saturated arithmetic patterns rguenth at gcc dot gnu.org
                   ` (19 more replies)
  0 siblings, 20 replies; 21+ messages in thread
From: drepper.fsp at gmail dot com @ 2011-12-10  1:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492

             Bug #: 51492
           Summary: vectorizer generates unnecessary code
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: drepper.fsp@gmail.com
             Build: x86_64-linux


Compile this code with 4.6.2 on an x86-64 machine with -O3:

#define SIZE 65536
#define WSIZE 64
unsigned short head[SIZE] __attribute__((aligned(64)));

void
f(void)
{
  for (unsigned n = 0; n < SIZE; ++n) {
    unsigned short m = head[n];
    head[n] = (unsigned short)(m >= WSIZE ? m-WSIZE : 0);
  }
}

The result I see is this:

0000000000000000 <f>:
   0:    66 0f ef d2              pxor   %xmm2,%xmm2
   4:    b8 00 00 00 00           mov    $0x0,%eax
            5: R_X86_64_32    head
   9:    66 0f 6f 25 00 00 00     movdqa 0x0(%rip),%xmm4        # 11 <f+0x11>
  10:    00 
            d: R_X86_64_PC32    .LC0-0x4
  11:    66 0f 6f 1d 00 00 00     movdqa 0x0(%rip),%xmm3        # 19 <f+0x19>
  18:    00 
            15: R_X86_64_PC32    .LC1-0x4
  19:    0f 1f 80 00 00 00 00     nopl   0x0(%rax)
  20:    66 0f 6f 00              movdqa (%rax),%xmm0
  24:    66 0f 6f c8              movdqa %xmm0,%xmm1
  28:    66 0f d9 c4              psubusw %xmm4,%xmm0
  2c:    66 0f 75 c2              pcmpeqw %xmm2,%xmm0
  30:    66 0f fd cb              paddw  %xmm3,%xmm1
  34:    66 0f df c1              pandn  %xmm1,%xmm0
  38:    66 0f 7f 00              movdqa %xmm0,(%rax)
  3c:    48 83 c0 10              add    $0x10,%rax
  40:    48 3d 00 00 00 00        cmp    $0x0,%rax
            42: R_X86_64_32S    head+0x20000
  46:    75 d8                    jne    20 <f+0x20>
  48:    f3 c3                    repz retq 
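
For readers following the loop body above, here is a rough scalar model of what one 16-bit lane of the emitted sequence computes. It is only a sketch: the report does not show the contents of .LC0 and .LC1, so the values used here (63 or 64 for .LC0, -WSIZE for .LC1) are assumptions, and the helper names lane_psubusw, lane_emitted and lane_mismatches are made up for illustration.

```c
#include <stdint.h>

#define WSIZE 64

/* One 16-bit lane of psubusw: unsigned subtract, clamped at 0. */
static uint16_t lane_psubusw(uint16_t a, uint16_t b)
{
  return (uint16_t)(a > b ? a - b : 0);
}

/* One lane of the emitted loop body.
   ASSUMPTION: .LC0 holds lc0 in every lane and .LC1 holds -WSIZE;
   neither constant is shown in the report. */
static uint16_t lane_emitted(uint16_t m, uint16_t lc0)
{
  uint16_t lc1  = (uint16_t)(0 - WSIZE);
  uint16_t x0   = lane_psubusw(m, lc0);            /* psubusw %xmm4,%xmm0 */
  uint16_t mask = (uint16_t)(x0 == 0 ? 0xFFFF : 0); /* pcmpeqw %xmm2,%xmm0 */
  uint16_t x1   = (uint16_t)(m + lc1);             /* paddw   %xmm3,%xmm1 */
  return (uint16_t)(~mask & x1);                   /* pandn   %xmm1,%xmm0 */
}

/* Count the 16-bit inputs for which the emitted sequence differs from a
   single saturating subtract of WSIZE. */
static unsigned lane_mismatches(uint16_t lc0)
{
  unsigned bad = 0;
  for (unsigned m = 0; m <= 0xFFFF; ++m)
    if (lane_emitted((uint16_t)m, lc0) != lane_psubusw((uint16_t)m, WSIZE))
      ++bad;
  return bad;
}
```

Under either assumed value of .LC0 the compare, add and and-not steps reproduce what a single psubusw with WSIZE would already deliver, which is the redundancy this report is about.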


Most of this code is unnecessary.  The psubusw instruction alone is
sufficient: it implements unsigned saturating subtraction, which is exactly
what the loop body computes.  Why does gcc create all this extra code?  The
loop body should just be

   movdqa (%rax), %xmm0
   psubusw %xmm1, %xmm0
   movdqa %xmm0, (%rax)

where %xmm1 holds WSIZE in each of its eight 16-bit lanes.
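
To make the expected result concrete, the following sketch hand-writes the loop with the SSE2 intrinsic _mm_subs_epu16 (which maps to psubusw) and compares it against the scalar loop from the test case. The function names are hypothetical, the vector path assumes a 16-byte-aligned buffer whose length is a multiple of 8, and on targets without SSE2 it simply falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

#define SIZE 65536
#define WSIZE 64

/* Scalar reference: the loop body from the test case. */
static void sub_sat_scalar(uint16_t *buf, size_t n)
{
  for (size_t i = 0; i < n; ++i) {
    uint16_t m = buf[i];
    buf[i] = (uint16_t)(m >= WSIZE ? m - WSIZE : 0);
  }
}

/* Hand-vectorized version: one saturating subtract per 8 elements.
   ASSUMPTION: buf is 16-byte aligned and n is a multiple of 8. */
static void sub_sat_vector(uint16_t *buf, size_t n)
{
#if defined(__SSE2__)
  const __m128i w = _mm_set1_epi16(WSIZE);
  for (size_t i = 0; i < n; i += 8) {
    __m128i v = _mm_load_si128((const __m128i *)(buf + i)); /* movdqa load  */
    v = _mm_subs_epu16(v, w);                               /* psubusw      */
    _mm_store_si128((__m128i *)(buf + i), v);               /* movdqa store */
  }
#else
  sub_sat_scalar(buf, n); /* no SSE2: fall back to the reference loop */
#endif
}

/* Returns 1 if both versions agree on every possible 16-bit input. */
static int sub_sat_variants_agree(void)
{
  static uint16_t a[SIZE] __attribute__((aligned(64)));
  static uint16_t b[SIZE] __attribute__((aligned(64)));
  for (size_t i = 0; i < SIZE; ++i)
    a[i] = b[i] = (uint16_t)i;
  sub_sat_scalar(a, SIZE);
  sub_sat_vector(b, SIZE);
  for (size_t i = 0; i < SIZE; ++i)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

Because SIZE is 65536, the check covers every 16-bit value exactly once.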



end of thread, other threads:[~2024-05-18  2:17 UTC | newest]

Thread overview: 21+ messages
2011-12-10  1:38 [Bug tree-optimization/51492] New: vectorizer generates unnecessary code drepper.fsp at gmail dot com
2011-12-12 10:25 ` [Bug tree-optimization/51492] vectorizer does not support saturated arithmetic patterns rguenth at gcc dot gnu.org
2012-01-08 18:57 ` drepper.fsp at gmail dot com
2012-07-13  8:39 ` rguenth at gcc dot gnu.org
2021-08-24 23:44 ` pinskia at gcc dot gnu.org
2021-08-25  3:54 ` pinskia at gcc dot gnu.org
2024-02-01 13:06 ` pan2.li at intel dot com
2024-02-01 13:37 ` juzhe.zhong at rivai dot ai
2024-02-01 13:42 ` juzhe.zhong at rivai dot ai
2024-02-01 14:40 ` juzhe.zhong at rivai dot ai
2024-02-01 14:41 ` juzhe.zhong at rivai dot ai
2024-02-01 15:10 ` tnfchris at gcc dot gnu.org
2024-02-02  1:04 ` pan2.li at intel dot com
2024-02-02 11:11 ` tnfchris at gcc dot gnu.org
2024-02-03  6:57 ` pan2.li at intel dot com
2024-02-06  1:13 ` pan2.li at intel dot com
2024-02-06 22:11 ` tnfchris at gcc dot gnu.org
2024-02-07  0:57 ` pan2.li at intel dot com
2024-05-16 12:09 ` cvs-commit at gcc dot gnu.org
2024-05-16 12:09 ` cvs-commit at gcc dot gnu.org
2024-05-18  2:17 ` cvs-commit at gcc dot gnu.org
