From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 7251 invoked by alias); 18 May 2012 17:46:25 -0000
Received: (qmail 7184 invoked by uid 22791); 18 May 2012 17:46:24 -0000
X-SWARE-Spam-Status: No, hits=-4.2 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,KHOP_THREADED,TW_SL,TW_ZJ
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 18 May 2012 17:46:11 +0000
From: "ubizjak at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/53346] [4.6/4.7/4.8 Regression] Bad vectorization in the proc cptrf2 of rnflow.f90
Date: Fri, 18 May 2012 17:46:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ubizjak at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.8.0
X-Bugzilla-Changed-Fields: CC
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-05/txt/msg01846.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53346

Uros Bizjak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #14 from Uros Bizjak 2012-05-18 17:17:45 UTC ---
Compile and execute the slow assembly:

gfortran rnflow.s && time ./a.out

real    0m24.454s
user    0m24.167s
sys     0m0.231s

Apply the following patch, which changes cmova in the very fast loops
(cptrf2) to jumps:

--cut here--
--- rnflow.s    2012-05-18 19:00:22.314102061 +0200
+++ rnflow1.s   2012-05-18 19:10:59.363428625 +0200
@@ -1305,7 +1305,9 @@
     movslq    %edx, %rbx
     movss    -4(%rdi,%rbx,4), %xmm0
     ucomiss    (%r9), %xmm0
-    cmova    %ecx, %edx
+    jbe    .L183x
+    movl    %ecx, %edx
+.L183x:
     subl    $1, %ecx
     subq    $4, %r9
     cmpl    %r10d, %ecx
@@ -1329,7 +1331,9 @@
     movslq    %ecx, %r10
     movss    -4(%rdi,%r10,4), %xmm0
     ucomiss    (%r9), %xmm0
-    cmova    %r11d, %ecx
+    jbe    .L192x
+    movl    %r11d, %ecx
+.L192x:
     subl    $1, %r11d
     subq    $4, %r9
     cmpl    %eax, %r11d
@@ -1485,7 +1489,9 @@
     movslq    %edx, %r10
     movss    -4(%rdi,%r10,4), %xmm0
     ucomiss    (%r9), %xmm0
-    cmova    %ecx, %edx
+    jbe    .L179x
+    movl    %ecx, %edx
+.L179x:
     subq    $4, %r9
     subl    $1, %ecx
     jne    .L179
--cut here--

gfortran rnflow.s && time ./a.out

real    0m18.170s
user    0m17.907s
sys     0m0.223s

WTF happened here?!

Relevant part of my /proc/cpuinfo:

vendor_id    : GenuineIntel
cpu family    : 6
model        : 42

Adding CC.
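
For readers who do not want to decode the hunks, here is a rough C
rendering of the loop shape the patch touches. This is only a sketch under
assumptions, not the code from rnflow.f90: it assumes %r9 walks the array
backwards one float at a time, that the patched register (%edx or %ecx)
holds a 1-based index of the current candidate, and that the loop starts
from the last element. The function names minloc_select and minloc_branch
are hypothetical. The first function writes the index update as a select
(the shape of the original cmova), the second as a conditional jump around
a move (the shape after the patch).

```c
#include <assert.h>

/* Sketch only: a[] is 0-based in C, while the returned index is 1-based to
 * mirror the -4(%rdi,%rbx,4) addressing in the assembly. The loop bounds
 * and starting index are assumptions, not taken from rnflow.f90. */

/* Select form: the index update is an unconditional assignment whose value
 * depends on the compare. This mirrors "ucomiss; cmova". */
int minloc_select(const float *a, int n)
{
    assert(n >= 1);
    int best = n;                        /* assumed start: last element */
    for (int i = n - 1; i >= 1; i--)
        best = (a[best - 1] > a[i - 1]) ? i : best;
    return best;
}

/* Branch form: the index is only written when the compare succeeds.
 * This mirrors "ucomiss; jbe skip; movl; skip:". */
int minloc_branch(const float *a, int n)
{
    assert(n >= 1);
    int best = n;                        /* assumed start: last element */
    for (int i = n - 1; i >= 1; i--) {
        if (a[best - 1] > a[i - 1])
            best = i;
    }
    return best;
}
```

Both functions return the same index for the same input. Whether a given
compiler emits cmov or a jump for either form is not guaranteed, so the
generated assembly has to be checked before drawing timing conclusions
from the C source.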