public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/54855] New: Unnecessary duplication when performing scalar operation on vector element
@ 2012-10-08 13:58 drepper.fsp at gmail dot com
  2012-10-08 14:19 ` [Bug tree-optimization/54855] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: drepper.fsp at gmail dot com @ 2012-10-08 13:58 UTC
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855

             Bug #: 54855
           Summary: Unnecessary duplication when performing scalar
                    operation on vector element
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: drepper.fsp@gmail.com


Take the following code:


#include <stdio.h>

typedef double v2df __attribute__((vector_size(16)));

int
main(int argc, char *argv[])
{
  v2df v = { 2.0, 2.0 };
  v2df v2 = { 2.0, 2.0 };
  while (argc-- > 1)
    {
      v[0] -= 1.0;
      v *= v2;
    }
  printf("%g\n", v[0] + v[1]);
  return 0;
}

It compiles as both C and C++, and both compilers behave the same.

When compiled on x86-64 (and therefore with SSE enabled), the compiler
generates this code for the loop:


  4003f0:       66 0f 28 c1             movapd %xmm1,%xmm0
  4003f4:       83 e8 01                sub    $0x1,%eax
  4003f7:       f2 0f 5c c2             subsd  %xmm2,%xmm0
  4003fb:       f2 0f 10 c8             movsd  %xmm0,%xmm1
  4003ff:       66 0f 58 c9             addpd  %xmm1,%xmm1
  400403:       75 eb                   jne    4003f0 <main+0x20>


I.e., the whole vector is copied to a scratch register (movapd), the
subtraction is performed on element zero of the copy (subsd), and the scalar
result is then merged back into element zero of the original vector (movsd).

Instead, the following sequence would have been sufficient:

sub    $0x1,%eax
subsd  %xmm2,%xmm1
addpd  %xmm1,%xmm1
jne    ...back

The subsd instruction doesn't touch the high part of its destination register,
so it can operate on %xmm1 directly without the copy and the merge.
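
The equivalence of the two sequences can be checked with a minimal sketch
using SSE2 intrinsics. This is an illustration only, assuming an x86 target
with SSE2 and a compiler that provides <emmintrin.h>. The function names
(step_with_copy, step_in_place, sequences_agree, high_lane_after_sub_sd) are
hypothetical and do not appear in the report, and the compiler is free to emit
different instructions for these intrinsics than the ones named in the
comments.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics: _mm_sub_sd, _mm_move_sd, ... */

/* Models the sequence from the report's disassembly: copy the vector,
   subtract in the low lane of the copy, merge that lane back, then double. */
static __m128d step_with_copy(__m128d v, __m128d one)
{
  __m128d tmp = v;                 /* movapd: copy the whole vector     */
  tmp = _mm_sub_sd(tmp, one);      /* subsd:  low lane of the copy      */
  v = _mm_move_sd(v, tmp);         /* movsd:  low from tmp, high from v */
  return _mm_add_pd(v, v);         /* addpd:  v * {2.0, 2.0}            */
}

/* Models the shorter sequence proposed in the report. */
static __m128d step_in_place(__m128d v, __m128d one)
{
  v = _mm_sub_sd(v, one);          /* subsd directly on the vector */
  return _mm_add_pd(v, v);         /* addpd */
}

/* Returns 1 if both sequences produce identical lanes for {lo, hi}. */
int sequences_agree(double lo, double hi)
{
  __m128d v = _mm_set_pd(hi, lo);  /* _mm_set_pd takes the high lane first */
  __m128d one = _mm_set_sd(1.0);   /* {1.0, 0.0} */
  double a[2], b[2];

  _mm_storeu_pd(a, step_with_copy(v, one));
  _mm_storeu_pd(b, step_in_place(v, one));
  return a[0] == b[0] && a[1] == b[1];
}

/* Returns the high lane after a scalar subtract on {lo, hi}.  The
   subtrahend has a nonzero high lane on purpose, to show that it is
   ignored and the high lane of the first operand passes through. */
double high_lane_after_sub_sd(double lo, double hi)
{
  double out[2];

  _mm_storeu_pd(out, _mm_sub_sd(_mm_set_pd(hi, lo), _mm_set1_pd(1.0)));
  assert(out[0] == lo - 1.0);      /* only the low lane was subtracted */
  return out[1];
}
```

Calling sequences_agree(2.0, 2.0) corresponds to one loop iteration of the
test case above, and high_lane_after_sub_sd returns its hi argument unchanged.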


I know this is a special case: it only works if the scalar operation is on
element zero of the vector.  But code can be designed like that, and I have
some code which would work nicely this way.  I don't know whether this
translates to other architectures as well.



Thread overview: 9+ messages
2012-10-08 13:58 [Bug tree-optimization/54855] New: Unnecessary duplication when performing scalar operation on vector element drepper.fsp at gmail dot com
2012-10-08 14:19 ` [Bug tree-optimization/54855] " rguenth at gcc dot gnu.org
2012-10-12 13:41 ` glisse at gcc dot gnu.org
2012-10-12 17:08 ` glisse at gcc dot gnu.org
2012-10-12 17:34 ` glisse at gcc dot gnu.org
2012-10-12 18:09 ` glisse at gcc dot gnu.org
2012-10-20 17:44 ` glisse at gcc dot gnu.org
2012-11-30  1:31 ` glisse at gcc dot gnu.org
2013-09-06  8:21 ` glisse at gcc dot gnu.org