public inbox for gcc-bugs@sourceware.org
From: "jtaylor.debian at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/56160] New: unnecessary additions in loop [x86, x86_64]
Date: Thu, 31 Jan 2013 10:43:00 -0000
Message-ID: <bug-56160-4@http.gcc.gnu.org/bugzilla/>


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56160

             Bug #: 56160
           Summary: unnecessary additions in loop [x86, x86_64]
    Classification: Unclassified
           Product: gcc
           Version: 4.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jtaylor.debian@gmail.com


The attached code, which does complex float multiplication using SSE3, produces 4
unnecessary integer additions in the loop if the NaN fallback function comp_mult
is inlined.

The assembly for the loop, generated with -msse3 -O3 -std=c99 by gcc 4.4, 4.6,
4.7 and 4.8 svn 195604, looks like this:
  28:    0f 28 0e                 movaps (%esi),%xmm1
  2b:    f3 0f 12 c1              movsldup %xmm1,%xmm0
  2f:    8b 55 08                 mov    0x8(%ebp),%edx
  32:    0f 28 13                 movaps (%ebx),%xmm2
  35:    f3 0f 16 c9              movshdup %xmm1,%xmm1
  39:    0f 59 c2                 mulps  %xmm2,%xmm0
  3c:    0f c6 d2 b1              shufps $0xb1,%xmm2,%xmm2
  40:    0f 59 ca                 mulps  %xmm2,%xmm1
  43:    f2 0f d0 c1              addsubps %xmm1,%xmm0
  47:    0f 29 04 fa              movaps %xmm0,(%edx,%edi,8)
  4b:    0f c2 c0 04              cmpneqps %xmm0,%xmm0
  4f:    0f 50 c0                 movmskps %xmm0,%eax
  52:    85 c0                    test   %eax,%eax
  54:    75 1d                    jne    73 <sse3_mult+0x73> // inlined comp_mult
  56:    83 c7 02                 add    $0x2,%edi
  59:    83 c6 10                 add    $0x10,%esi
  5c:    83 c3 10                 add    $0x10,%ebx
  5f:    83 c1 10                 add    $0x10,%ecx
  62:    83 45 e4 10              addl   $0x10,-0x1c(%ebp)
  66:    39 7d 14                 cmp    %edi,0x14(%ebp)
  69:    7f bd                    jg     28 <sse3_mult+0x28>
...

The 4 adds of $0x10 (to esi, ebx, ecx and the stack slot at -0x1c(%ebp)) are
completely unnecessary and reduce performance by about 20% on my Core 2 Duo.
On amd64 it also creates two seemingly unnecessary additions, but I did not test
the performance.
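
At the source level, the difference between the two forms of the loop can be
pictured roughly as follows. This is only an illustrative sketch, not the
attached code: the function names and the plain float arrays are made up, and
an optimizing compiler is free to turn either form into the other.

```c
#include <stddef.h>

/* One induction variable: every access is base + i, which maps onto
 * scaled-index addressing such as (%edi,%ebx,8). Only i needs an add. */
void mul_indexed(float *r, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] * b[i];
}

/* One induction variable per pointer: each pointer is stepped on its
 * own, so every iteration pays an extra add for r, a and b. */
void mul_stepped(float *r, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        *r = *a * *b;
        r++;
        a++;
        b++;
    }
}
```

Both functions compute the same result; the extra adds in the listing above
correspond to the second form.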

A way to coax gcc into emitting proper code is to not allow it to inline the
fallback; it then generates the following good assembly with only one integer add:

  a8:    0f 28 0c df              movaps (%edi,%ebx,8),%xmm1
  ac:    f3 0f 12 c1              movsldup %xmm1,%xmm0
  b0:    8b 45 08                 mov    0x8(%ebp),%eax
  b3:    0f 28 14 de              movaps (%esi,%ebx,8),%xmm2
  b7:    f3 0f 16 c9              movshdup %xmm1,%xmm1
  bb:    0f 59 c2                 mulps  %xmm2,%xmm0
  be:    0f c6 d2 b1              shufps $0xb1,%xmm2,%xmm2
  c2:    0f 59 ca                 mulps  %xmm2,%xmm1
  c5:    f2 0f d0 c1              addsubps %xmm1,%xmm0
  c9:    0f 29 04 d8              movaps %xmm0,(%eax,%ebx,8)
  cd:    0f c2 c0 04              cmpneqps %xmm0,%xmm0
  d1:    0f 50 c0                 movmskps %xmm0,%eax
  d4:    85 c0                    test   %eax,%eax
  d6:    75 10                    jne    e8 <sse3_mult+0x58> // non-inlined comp_mult
  d8:    83 c3 02                 add    $0x2,%ebx
  db:    39 5d 14                 cmp    %ebx,0x14(%ebp)
  de:    7f c8                    jg     a8 <sse3_mult+0x18>
...
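
The attachment is not reproduced in this message. As a rough sketch of a loop
that would correspond to the listings above, with the fallback kept out of
line, something like the following could be used. The signatures of sse3_mult
and comp_mult, the body of comp_mult, and the use of
__attribute__((noinline)) are assumptions reconstructed from the disassembly,
not the original code. The #ifdef only lets the sketch build when -msse3 is
not given.

```c
#include <assert.h>
#include <complex.h>
#ifdef __SSE3__
#include <pmmintrin.h>
#endif

/* Assumed fallback: redo both products with plain C99 complex
 * multiplication. noinline keeps it out of the loop body. */
__attribute__((noinline))
static void comp_mult(float complex *r, const float complex *a,
                      const float complex *b)
{
    r[0] = a[0] * b[0];
    r[1] = a[1] * b[1];
}

/* r[i] = a[i] * b[i] for n elements, two per iteration. n must be even;
 * on the SSE3 path all three arrays must be 16-byte aligned (movaps). */
void sse3_mult(float complex *r, const float complex *a,
               const float complex *b, int n)
{
    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2) {
#ifdef __SSE3__
        __m128 va = _mm_load_ps((const float *)&a[i]);
        __m128 vb = _mm_load_ps((const float *)&b[i]);
        __m128 re = _mm_moveldup_ps(va);          /* a.re a.re ...   */
        __m128 im = _mm_movehdup_ps(va);          /* a.im a.im ...   */
        __m128 t0 = _mm_mul_ps(re, vb);           /* a.re*b          */
        __m128 sw = _mm_shuffle_ps(vb, vb, 0xb1); /* b.im b.re ...   */
        __m128 t1 = _mm_mul_ps(im, sw);           /* a.im*swapped b  */
        __m128 res = _mm_addsub_ps(t0, t1);       /* sub even, add odd */
        _mm_store_ps((float *)&r[i], res);
        /* x != x is true only for NaN lanes */
        if (_mm_movemask_ps(_mm_cmpneq_ps(res, res)))
            comp_mult(&r[i], &a[i], &b[i]);
#else
        comp_mult(&r[i], &a[i], &b[i]);           /* no SSE3: scalar only */
#endif
    }
}
```

Removing the noinline attribute would give the compiler the freedom to inline
comp_mult, which is the configuration in which the extra adds were observed.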


Thread overview: 6+ messages
2013-01-31 10:43 jtaylor.debian at gmail dot com [this message]
2013-01-31 10:44 ` [Bug c/56160] " jtaylor.debian at gmail dot com
2013-01-31 10:48 ` jtaylor.debian at gmail dot com
2013-01-31 17:51 ` [Bug middle-end/56160] " pinskia at gcc dot gnu.org
2013-01-31 17:53 ` jtaylor.debian at gmail dot com
2021-07-24  4:53 ` [Bug target/56160] " pinskia at gcc dot gnu.org
