public inbox for gcc-bugs@sourceware.org
From: "tim at klingt dot org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/38824]  New: [4.4 regression] performance regression of sse code from 4.2/4.3
Date: Tue, 13 Jan 2009 11:25:00 -0000
Message-ID: <bug-38824-12873@http.gcc.gnu.org/bugzilla/>

the following code shows a performance regression from gcc-4.2 to gcc-4.3 and
gcc-4.4 (20090111) on an Intel Core 2 running x86_64 code:

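#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_load_ps, ... */

/* adds f to each of the n floats at in and stores the results at out.
   in and out must be 16-byte aligned; n should be a nonzero multiple
   of 4, because the do/while body always runs at least once. */
void bench_1(float * out, float * in, float f, unsigned int n)
{
    n /= 4;                              /* 4 floats per iteration */
    __m128 scalar = _mm_set_ps1(f);      /* broadcast f to all 4 lanes */
    do
    {
        __m128 arg = _mm_load_ps(in);    /* aligned load */
        __m128 result = _mm_add_ps(arg, scalar);
        _mm_store_ps(out, result);       /* aligned store */
        in += 4;
        out += 4;
    }
    while (--n);
}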

results from running the function 100000000 times, measured with performance
counters (requires a patched kernel), compiled with -O3 -mfpmath=sse -msse:

           cycles      instructions  branch misses
gcc-4.2:   1946256122  8394301290    5005
gcc-4.3:   2191990305  7658465214    3442
gcc-4.4:   2532778908  7462359830    8593402

although the instruction count decreases, the number of cycles spent in the
function increases. gcc-4.4 also shows a far larger number of branch misses
(8593402, versus 5005 for gcc-4.2 and 3442 for gcc-4.3).
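
to put the counters in relative terms, here is a small sketch that divides each
value by the gcc-4.2 baseline. the counter values are copied from the report;
the names Sample, samples, ratio and print_relative_to_baseline are made up for
this illustration and are not part of the original test case.

```cpp
#include <cstdint>
#include <cstdio>

// Counter values copied from the report; the struct itself is hypothetical.
struct Sample {
    const char *compiler;
    std::uint64_t cycles;
    std::uint64_t instructions;
    std::uint64_t branch_misses;
};

constexpr Sample samples[] = {
    {"gcc-4.2", 1946256122ULL, 8394301290ULL, 5005ULL},
    {"gcc-4.3", 2191990305ULL, 7658465214ULL, 3442ULL},
    {"gcc-4.4", 2532778908ULL, 7462359830ULL, 8593402ULL},
};

// Plain floating-point quotient of two counter values.
constexpr double ratio(std::uint64_t a, std::uint64_t b)
{
    return static_cast<double>(a) / static_cast<double>(b);
}

// Prints every compiler's counters as a multiple of the gcc-4.2 values.
inline void print_relative_to_baseline()
{
    const Sample &base = samples[0];
    for (const Sample &s : samples) {
        std::printf("%s: cycles x%.3f, instructions x%.3f, branch misses x%.1f\n",
                    s.compiler,
                    ratio(s.cycles, base.cycles),
                    ratio(s.instructions, base.instructions),
                    ratio(s.branch_misses, base.branch_misses));
    }
}
```

by this arithmetic gcc-4.3 needs about 13% more cycles than gcc-4.2 and gcc-4.4
about 30% more, while executing about 9% and 11% fewer instructions.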

the generated code is

gcc-4.2:
.globl _Z7bench_1PfS_fj
        .type   _Z7bench_1PfS_fj, @function
_Z7bench_1PfS_fj:
.LFB2695:
        movaps  %xmm0, %xmm2
        shrl    $2, %edx
        shufps  $0, %xmm2, %xmm2
        movaps  %xmm2, %xmm1
        .p2align 4,,7
.L15:
        movaps  (%rsi), %xmm0
        addq    $16, %rsi
        addps   %xmm1, %xmm0
        movaps  %xmm0, (%rdi)
        addq    $16, %rdi
        subl    $1, %edx
        jne     .L15
        rep ; ret
.LFE2695:
        .size   _Z7bench_1PfS_fj, .-_Z7bench_1PfS_fj
        .align 2
        .p2align 4,,15

gcc-4.3:
.globl _Z7bench_1PfS_fj
        .type   _Z7bench_1PfS_fj, @function
_Z7bench_1PfS_fj:
.LFB2563:
        movaps  %xmm0, %xmm2
        shrl    $2, %edx
        subl    $1, %edx
        xorl    %eax, %eax
        shufps  $0, %xmm2, %xmm2
        mov     %edx, %edx
        addq    $1, %rdx
        salq    $4, %rdx
        movaps  %xmm2, %xmm1
        .p2align 4,,10
        .p2align 3
.L17:
        movaps  (%rsi,%rax), %xmm0
        addps   %xmm1, %xmm0
        movaps  %xmm0, (%rdi,%rax)
        addq    $16, %rax
        cmpq    %rdx, %rax
        jne     .L17
        rep
        ret
.LFE2563:
        .size   _Z7bench_1PfS_fj, .-_Z7bench_1PfS_fj
        .p2align 4,,15

gcc-4.4:
.globl _Z7bench_1PfS_fj
        .type   _Z7bench_1PfS_fj, @function
_Z7bench_1PfS_fj:
.LFB2489:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        shrl    $2, %edx
        shufps  $0, %xmm0, %xmm0
        subl    $1, %edx
        xorl    %eax, %eax
        addq    $1, %rdx
        salq    $4, %rdx
        .p2align 4,,10
        .p2align 3
.L17:
        movaps  %xmm0, %xmm1
        addps   (%rsi,%rax), %xmm1
        movaps  %xmm1, (%rdi,%rax)
        addq    $16, %rax
        cmpq    %rdx, %rax
        jne     .L17
        rep
        ret
        .cfi_endproc
.LFE2489:
        .size   _Z7bench_1PfS_fj, .-_Z7bench_1PfS_fj
        .p2align 4,,15
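
the listings differ mainly in loop control: gcc-4.2 advances both pointers and
counts n down, while gcc-4.3 and gcc-4.4 use a single byte offset for both
addresses and compare it against a precomputed end offset. the sketch below
writes both shapes out by hand at source level. it is only an approximation of
what the compilers emit, assuming an x86 target with SSE; the names
bench_1_pointer_form, bench_1_indexed_form and forms_agree are hypothetical.

```cpp
#include <xmmintrin.h>
#include <cassert>
#include <cstddef>

// Shape of the gcc-4.2 loop: two pointer increments and a count-down counter.
void bench_1_pointer_form(float *out, const float *in, float f, unsigned int n)
{
    assert(n >= 4);                 // the do/while body always runs once
    n /= 4;
    const __m128 scalar = _mm_set_ps1(f);
    do {
        _mm_store_ps(out, _mm_add_ps(_mm_load_ps(in), scalar));
        in += 4;                    // addq $16, %rsi
        out += 4;                   // addq $16, %rdi
    } while (--n);                  // subl $1, %edx ; jne
}

// Shape of the gcc-4.3/4.4 loop: one byte offset shared by load and store,
// compared against an end offset computed before the loop.
void bench_1_indexed_form(float *out, const float *in, float f, unsigned int n)
{
    assert(n >= 4);
    // shrl $2 ; subl $1 ; zero-extend ; addq $1 ; salq $4
    const std::size_t end = (static_cast<std::size_t>(n / 4 - 1) + 1) << 4;
    const __m128 scalar = _mm_set_ps1(f);
    const char *src = reinterpret_cast<const char *>(in);
    char *dst = reinterpret_cast<char *>(out);
    std::size_t off = 0;            // xorl %eax, %eax
    do {
        const __m128 v = _mm_load_ps(reinterpret_cast<const float *>(src + off));
        _mm_store_ps(reinterpret_cast<float *>(dst + off), _mm_add_ps(v, scalar));
        off += 16;                  // addq $16, %rax
    } while (off != end);           // cmpq %rdx, %rax ; jne
}

// Runs both forms on a small aligned buffer and checks they match in[i] + f.
bool forms_agree()
{
    alignas(16) float in[16], a[16], b[16];
    for (int i = 0; i < 16; ++i) {
        in[i] = static_cast<float>(i);
        a[i] = b[i] = 0.0f;
    }
    bench_1_pointer_form(a, in, 0.5f, 16);
    bench_1_indexed_form(b, in, 0.5f, 16);
    for (int i = 0; i < 16; ++i) {
        if (a[i] != in[i] + 0.5f || b[i] != a[i]) return false;
    }
    return true;
}
```

both forms compute the same result, so any timing difference between them comes
from how the loop is expressed rather than from the arithmetic. the sketch does
not reproduce the other difference visible in the gcc-4.4 listing, where the
register copy (movaps %xmm0, %xmm1) sits inside the loop and the load is folded
into addps.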


-- 
           Summary: [4.4 regression] performance regression of sse code from
                    4.2/4.3
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tim at klingt dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38824

