From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 14148 invoked by alias); 12 May 2009 16:18:50 -0000
Received: (qmail 13773 invoked by uid 48); 12 May 2009 16:18:33 -0000
Date: Tue, 12 May 2009 16:18:00 -0000
Message-ID: <20090512161833.13772.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug middle-end/40106] Time increase with inlining for the Polyhedron test air.f90
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "dominiq at lps dot ens dot fr"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-05/txt/msg01036.txt.bz2

------- Comment #4 from dominiq at lps dot ens dot fr  2009-05-12 16:18 -------
Assembly code for the inlined inner loop:

L123:
        movsd   (%rdx), %xmm15
        movsd   8(%rdx), %xmm6
        mulsd   (%rax), %xmm15
        mulsd   1200(%rax), %xmm6
        movsd   16(%rdx), %xmm4
        movsd   24(%rdx), %xmm3
        mulsd   2400(%rax), %xmm4
        mulsd   3600(%rax), %xmm3
        addsd   %xmm15, %xmm0
        movsd   32(%rdx), %xmm9
        movsd   40(%rdx), %xmm1
        mulsd   4800(%rax), %xmm9
        mulsd   6000(%rax), %xmm1
        addsd   %xmm6, %xmm0
        movsd   48(%rdx), %xmm7
        movsd   56(%rdx), %xmm2
        addq    $64, %rdx
        mulsd   7200(%rax), %xmm7
        mulsd   8400(%rax), %xmm2
        addq    $9600, %rax
        addsd   %xmm4, %xmm0
        cmpq    %rax, %rcx
        addsd   %xmm3, %xmm0
        addsd   %xmm9, %xmm0
        addsd   %xmm1, %xmm0
        addsd   %xmm7, %xmm0
        addsd   %xmm2, %xmm0
        jne     L123

and in the subroutine DERIVX:

L953:
        movsd   (%rax), %xmm9
        addl    $8, %ebx
        movsd   8(%rax), %xmm8
        mulsd   (%rcx), %xmm9
        mulsd   1200(%rcx), %xmm8
        movsd   16(%rax), %xmm7
        movsd   24(%rax), %xmm6
        mulsd   2400(%rcx), %xmm7
        mulsd   3600(%rcx), %xmm6
        addsd   %xmm9, %xmm0
        movsd   32(%rax), %xmm5
        movsd   40(%rax), %xmm4
        mulsd   4800(%rcx), %xmm5
        mulsd   6000(%rcx), %xmm4
        addsd   %xmm8, %xmm0
        movsd   48(%rax), %xmm3
        movsd   56(%rax), %xmm1
        addq    $64, %rax
        mulsd   7200(%rcx), %xmm3
        mulsd   8400(%rcx), %xmm1
        addq    $9600, %rcx
        cmpl    %edi, %ebx
        addsd   %xmm7, %xmm0
        addsd   %xmm6, %xmm0
        addsd   %xmm5, %xmm0
        addsd   %xmm4, %xmm0
        addsd   %xmm3, %xmm0
        addsd   %xmm1, %xmm0
        jne     L953

The structure of the outer loops seems quite comparable in both cases.


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40106
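
For readers who prefer source to assembly, here is a rough C sketch of what both inner loops above appear to compute. It is inferred from the listings only, not taken from air.f90. One operand is read contiguously (8-byte steps) and the other with a 1200-byte step, i.e. 150 doubles, and all products are added in order into a single accumulator (%xmm0). The compiler has unrolled this eight times, hence the $64 and $9600 pointer increments. The names dot_strided, a, b, n and stride are hypothetical.

```c
#include <stddef.h>

/* Hypothetical reconstruction of the inner loop, read off the assembly:
 * a is walked with unit stride, b with a stride of `stride` doubles
 * (150 in the listings, since 1200 bytes / 8 bytes per double = 150).
 * The products are summed in order into one scalar accumulator,
 * matching the chain of addsd instructions on %xmm0. */
double dot_strided(const double *a, const double *b, size_t n, size_t stride)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i * stride];
    return sum;
}
```

Under this reading the only visible difference between the two listings is the loop exit test: the inlined version compares pointers (cmpq %rax, %rcx), while DERIVX keeps a separate 32-bit counter (addl $8, %ebx; cmpl %edi, %ebx).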