public inbox for gcc-bugs@sourceware.org
* [Bug c/56160] New: unnecessary additions in loop [x86, x86_64]
@ 2013-01-31 10:43 jtaylor.debian at gmail dot com
  2013-01-31 10:44 ` [Bug c/56160] " jtaylor.debian at gmail dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:43 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

             Bug #: 56160
           Summary: unnecessary additions in loop [x86, x86_64]
    Classification: Unclassified
           Product: gcc
           Version: 4.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jtaylor.debian@gmail.com


The attached code, which performs complex float multiplication using SSE3,
produces 4 unnecessary integer additions in the loop if the NaN fallback
function comp_mult is inlined.

The assembly for the loop generated with -msse3 -O3 -std=c99 by gcc 4.4, 4.6,
4.7 and 4.8 svn r195604 looks like this:
  28:    0f 28 0e                 movaps (%esi),%xmm1
  2b:    f3 0f 12 c1              movsldup %xmm1,%xmm0
  2f:    8b 55 08                 mov    0x8(%ebp),%edx
  32:    0f 28 13                 movaps (%ebx),%xmm2
  35:    f3 0f 16 c9              movshdup %xmm1,%xmm1
  39:    0f 59 c2                 mulps  %xmm2,%xmm0
  3c:    0f c6 d2 b1              shufps $0xb1,%xmm2,%xmm2
  40:    0f 59 ca                 mulps  %xmm2,%xmm1
  43:    f2 0f d0 c1              addsubps %xmm1,%xmm0
  47:    0f 29 04 fa              movaps %xmm0,(%edx,%edi,8)
  4b:    0f c2 c0 04              cmpneqps %xmm0,%xmm0
  4f:    0f 50 c0                 movmskps %xmm0,%eax
  52:    85 c0                    test   %eax,%eax
  54:    75 1d                    jne    73 <sse3_mult+0x73> // inlined comp_mult
  56:    83 c7 02                 add    $0x2,%edi
  59:    83 c6 10                 add    $0x10,%esi
  5c:    83 c3 10                 add    $0x10,%ebx
  5f:    83 c1 10                 add    $0x10,%ecx
  62:    83 45 e4 10              addl   $0x10,-0x1c(%ebp)
  66:    39 7d 14                 cmp    %edi,0x14(%ebp)
  69:    7f bd                    jg     28 <sse3_mult+0x28>
...

The 4 adds for esi, ebx, ecx and the stack slot at -0x1c(%ebp) are completely
unnecessary and reduce performance by about 20% on my Core 2 Duo.
On amd64 it also creates two seemingly unnecessary additions, but I did not
test the performance there.
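
For illustration, the two loop shapes look roughly like the following at the C level. This is a sketch only, not the attached code: the function names mul_indexed and mul_bumped are made up, plain float arrays are used instead of complex ones, and the compiler's induction variable selection is free to turn either form into the other. The first form needs one integer add per iteration, the second one add per pointer.

```c
#include <stddef.h>

/* Shape 1: a single induction variable i; every address is base + i*scale,
 * which x86 can fold into the addressing mode of the load or store. */
void mul_indexed(float *r, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] * b[i];
}

/* Shape 2: one induction variable per pointer, each bumped on every
 * iteration. This is the pattern of repeated "add $0x10" in the listing. */
void mul_bumped(float *r, const float *a, const float *b, size_t n)
{
    const float *end = a + n;
    for (; a != end; a++, b++, r++)
        *r = *a * *b;
}
```

Both functions compute the same result; only the number of values that have to be advanced per iteration differs.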

A way to coax gcc into emitting proper code is to not allow it to inline the
fallback. It then generates the following good assembly with only one integer add:

  a8:    0f 28 0c df              movaps (%edi,%ebx,8),%xmm1
  ac:    f3 0f 12 c1              movsldup %xmm1,%xmm0
  b0:    8b 45 08                 mov    0x8(%ebp),%eax
  b3:    0f 28 14 de              movaps (%esi,%ebx,8),%xmm2
  b7:    f3 0f 16 c9              movshdup %xmm1,%xmm1
  bb:    0f 59 c2                 mulps  %xmm2,%xmm0
  be:    0f c6 d2 b1              shufps $0xb1,%xmm2,%xmm2
  c2:    0f 59 ca                 mulps  %xmm2,%xmm1
  c5:    f2 0f d0 c1              addsubps %xmm1,%xmm0
  c9:    0f 29 04 d8              movaps %xmm0,(%eax,%ebx,8)
  cd:    0f c2 c0 04              cmpneqps %xmm0,%xmm0
  d1:    0f 50 c0                 movmskps %xmm0,%eax
  d4:    85 c0                    test   %eax,%eax
  d6:    75 10                    jne    e8 <sse3_mult+0x58> // non-inlined comp_mult
  d8:    83 c3 02                 add    $0x2,%ebx
  db:    39 5d 14                 cmp    %ebx,0x14(%ebp)
  de:    7f c8                    jg     a8 <sse3_mult+0x18>
...
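
For reference, a minimal sketch of the workaround, assuming the loop has roughly the shape implied by the assembly above. The body of comp_mult here is a hypothetical stand-in (the real one is in the attachment and may differ), and the target("sse3") attribute is only there so the sketch builds without -msse3. Marking the fallback noinline is the one change that matters.

```c
#include <assert.h>
#include <complex.h>
#include <pmmintrin.h>

#define UNLIKELY(x) __builtin_expect((x), 0)

/* Hypothetical stand-in for the attachment's fallback: redo both elements
 * of the pair with the compiler's own complex multiply. noinline keeps
 * its body out of the hot loop. */
__attribute__((noinline))
static void comp_mult(float complex *r, const float complex *a,
                      const float complex *b, int i)
{
    r[i] = a[i] * b[i];
    r[i + 1] = a[i + 1] * b[i + 1];
}

/* target("sse3") stands in for -msse3 on the command line.
 * All arrays must be 16-byte aligned and n must be even. */
__attribute__((target("sse3")))
void sse3_mult(float complex *r, const float complex *a,
               const float complex *b, int n)
{
    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
        __m128 x = _mm_load_ps((const float *)&a[i]);
        __m128 y = _mm_load_ps((const float *)&b[i]);
        __m128 re = _mm_moveldup_ps(x);          /* ar0 ar0 ar1 ar1 */
        __m128 im = _mm_movehdup_ps(x);          /* ai0 ai0 ai1 ai1 */
        __m128 t0 = _mm_mul_ps(re, y);           /* ar*br  ar*bi */
        __m128 ys = _mm_shuffle_ps(y, y, 0xb1);  /* bi0 br0 bi1 br1 */
        __m128 t1 = _mm_mul_ps(im, ys);          /* ai*bi  ai*br */
        __m128 res = _mm_addsub_ps(t0, t1);      /* ar*br-ai*bi  ar*bi+ai*br */
        _mm_store_ps((float *)&r[i], res);
        /* res != res is true only in lanes that hold a NaN */
        if (UNLIKELY(_mm_movemask_ps(_mm_cmpneq_ps(res, res))))
            comp_mult(r, a, b, i);
    }
}
```

The intrinsics map one to one onto the movsldup, movshdup, shufps, mulps, addsubps, cmpneqps and movmskps instructions in the listings. Whether a given gcc version then picks the single-counter loop still has to be checked in the generated assembly.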



* [Bug c/56160] unnecessary additions in loop [x86, x86_64]
  2013-01-31 10:43 [Bug c/56160] New: unnecessary additions in loop [x86, x86_64] jtaylor.debian at gmail dot com
@ 2013-01-31 10:44 ` jtaylor.debian at gmail dot com
  2013-01-31 10:48 ` jtaylor.debian at gmail dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:44 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

--- Comment #1 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 10:43:51 UTC ---
Created attachment 29313
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29313
code



* [Bug c/56160] unnecessary additions in loop [x86, x86_64]
  2013-01-31 10:43 [Bug c/56160] New: unnecessary additions in loop [x86, x86_64] jtaylor.debian at gmail dot com
  2013-01-31 10:44 ` [Bug c/56160] " jtaylor.debian at gmail dot com
@ 2013-01-31 10:48 ` jtaylor.debian at gmail dot com
  2013-01-31 17:51 ` [Bug middle-end/56160] " pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jtaylor.debian at gmail dot com @ 2013-01-31 10:48 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

--- Comment #2 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 10:47:54 UTC ---
These three lines are missing at the top of the attachment:

#include <complex.h>
#include <pmmintrin.h>
#define UNLIKELY(x)     __builtin_expect((x),0)



* [Bug middle-end/56160] unnecessary additions in loop [x86, x86_64]
  2013-01-31 10:43 [Bug c/56160] New: unnecessary additions in loop [x86, x86_64] jtaylor.debian at gmail dot com
  2013-01-31 10:44 ` [Bug c/56160] " jtaylor.debian at gmail dot com
  2013-01-31 10:48 ` jtaylor.debian at gmail dot com
@ 2013-01-31 17:51 ` pinskia at gcc dot gnu.org
  2013-01-31 17:53 ` jtaylor.debian at gmail dot com
  2021-07-24  4:53 ` [Bug target/56160] " pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-01-31 17:51 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |middle-end

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-01-31 17:50:42 UTC ---
Can you try a newer compiler? 4.4 is no longer maintained.



* [Bug middle-end/56160] unnecessary additions in loop [x86, x86_64]
  2013-01-31 10:43 [Bug c/56160] New: unnecessary additions in loop [x86, x86_64] jtaylor.debian at gmail dot com
                   ` (2 preceding siblings ...)
  2013-01-31 17:51 ` [Bug middle-end/56160] " pinskia at gcc dot gnu.org
@ 2013-01-31 17:53 ` jtaylor.debian at gmail dot com
  2021-07-24  4:53 ` [Bug target/56160] " pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jtaylor.debian at gmail dot com @ 2013-01-31 17:53 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

Julian Taylor <jtaylor.debian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|4.4.1                       |4.8.0

--- Comment #4 from Julian Taylor <jtaylor.debian at gmail dot com> 2013-01-31 17:53:17 UTC ---
It is still the case in 4.8 svn r195604 (built on i586 Fedora 11) and in the
versions in between; 4.4 is the oldest I tested.



* [Bug target/56160] unnecessary additions in loop [x86, x86_64]
  2013-01-31 10:43 [Bug c/56160] New: unnecessary additions in loop [x86, x86_64] jtaylor.debian at gmail dot com
                   ` (3 preceding siblings ...)
  2013-01-31 17:53 ` jtaylor.debian at gmail dot com
@ 2021-07-24  4:53 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-24  4:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-linux-gnu
          Component|middle-end                  |target

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is just IV-OPTs (induction variable optimization) going wrong.

One thing I will note that does improve the code is storing the result only
when the fallback is not needed:

        __m128 n = _mm_cmpneq_ps(res, res);
        int need = _mm_movemask_ps(n);
        if (UNLIKELY(need)) {
            comp_mult(r, a, b, i); 
        } else
            _mm_store_ps((float*)&r[i], res);
