From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 17 Nov 2008 18:13:00 -0000
Message-ID: <20081117181149.4646.qmail@sourceware.org>
Subject: [Bug target/38134] [4.4 Regression] speed regression with inline-asm sse code
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "ubizjak at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2008-11/txt/msg01396.txt.bz2

------- Comment #6 from ubizjak at gmail dot com  2008-11-17 18:11 -------
I think that

        addps   .LC10(%rip), %xmm0
        mulps   %xmm1, %xmm0
        addps   .LC11(%rip), %xmm0
        mulps   %xmm1, %xmm0
        addps   .LC12(%rip), %xmm0
        mulps   %xmm1, %xmm0
        addps   .LC13(%rip), %xmm0
        mulps   %xmm1, %xmm0
        addps   .LC14(%rip), %xmm0
        mulps   %xmm1, %xmm0

is the bottleneck. Perhaps we should split implicit memory operands out of
the insn by some generic peephole (if a register is available) and schedule
the loads appropriately. OTOH, the loop optimizer should detect invariant
loads and move them out of the loop.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38134