From: "dominiq at lps dot ens.fr"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/34265] Missed optimizations
Date: Sun, 22 May 2011 12:33:00 -0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34265

--- Comment #34 from Dominique d'Humieres 2011-05-22 12:06:20 UTC ---
Created attachment 24325
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24325
reduced tests

The attached bzipped tar contains the files induct_red.f90 with all the
infrastructure to provide a realistic framework to run a reduced version of
the subroutine mutual_ind_quad_cir_coil contained in induct_qc_x.F90
(reduced to a single critical loop nest).

When the macro XPA is defined, the original rotate code

    rot_q_vector(1) = dot_product(rotate_quad(1,:),q_vector(:))
    rot_q_vector(2) = dot_product(rotate_quad(2,:),q_vector(:))
    rot_q_vector(3) = dot_product(rotate_quad(3,:),q_vector(:))

is unrolled as follows if the macro FLD is not defined (q_vector(3)==0, so
the third term is dropped):

    rot_q_vector(1) = rotate_quad(1,1) * q_vector(1) + &
                      rotate_quad(1,2) * q_vector(2)
    rot_q_vector(2) = rotate_quad(2,1) * q_vector(1) + &
                      rotate_quad(2,2) * q_vector(2)
    rot_q_vector(3) = rotate_quad(3,1) * q_vector(1) + &
                      rotate_quad(3,2) * q_vector(2)

Otherwise it is folded as

    rot_q_vector(:) = rotate_quad(:,1) * q_vector(1) + &
                      rotate_quad(:,2) * q_vector(2)

When the macro XPB is defined, the original numerator

    numerator = w1gauss(j) * w2gauss(k) * &
                dot_product(coil_current_vec,current_vector)

is unrolled as

    numerator = w1gauss(j) * w2gauss(k) * &
                (coil_current_vec(1)*current_vector(1) + &
                 coil_current_vec(2)*current_vector(2) + &
                 coil_current_vec(3)*current_vector(3))

When the macro XPC is defined, the original denominator

    denominator = sqrt(dot_product(rot_c_vector-rot_q_vector, &
                                   rot_c_vector-rot_q_vector))

is unrolled as

    denominator = sqrt((rot_c_vector(1)-rot_q_vector(1))**2 + &
                       (rot_c_vector(2)-rot_q_vector(2))**2 + &
                       (rot_c_vector(3)-rot_q_vector(3))**2)

The tar also contains a script to run the twelve cases plus one case with
graphite, and the raw results for revisions 167530, 167531, and 173917.
For 173917 there are three sets of results: the original, the one with
r167531 reverted (173917r1), and the one with

    /* NEXT_PASS (pass_complete_unrolli); */

(173917n), since I think this is related to revision 134730.

See also pr49006.
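
For readers without the attachment, the three rotate variants can be compared in a small standalone program. This is only a sketch and is not taken from induct_red.f90: the program name, the kind parameter dp, the result arrays rot_ref, rot_unrolled and rot_folded, and all numerical values are made up for illustration. It assumes the third component of q_vector is zero, which is what dropping the rotate_quad(:,3) term implies.

```fortran
! Illustrative sketch only: names other than rotate_quad and q_vector,
! and all values, are assumptions and do not come from induct_red.f90.
program rotate_variants
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: rotate_quad(3,3), q_vector(3)
  real(dp) :: rot_ref(3), rot_unrolled(3), rot_folded(3)
  integer :: i

  ! Arbitrary matrix (filled column by column with 1..9) and a vector
  ! whose third component is zero.
  rotate_quad = reshape([ (real(i, dp), i = 1, 9) ], [3, 3])
  q_vector = [0.5_dp, -1.25_dp, 0.0_dp]

  ! Original form: one dot_product per matrix row.
  rot_ref(1) = dot_product(rotate_quad(1,:), q_vector(:))
  rot_ref(2) = dot_product(rotate_quad(2,:), q_vector(:))
  rot_ref(3) = dot_product(rotate_quad(3,:), q_vector(:))

  ! XPA defined, FLD not defined: hand-unrolled, third term dropped.
  rot_unrolled(1) = rotate_quad(1,1) * q_vector(1) + &
                    rotate_quad(1,2) * q_vector(2)
  rot_unrolled(2) = rotate_quad(2,1) * q_vector(1) + &
                    rotate_quad(2,2) * q_vector(2)
  rot_unrolled(3) = rotate_quad(3,1) * q_vector(1) + &
                    rotate_quad(3,2) * q_vector(2)

  ! XPA and FLD defined: folded into a single array expression.
  rot_folded(:) = rotate_quad(:,1) * q_vector(1) + &
                  rotate_quad(:,2) * q_vector(2)

  ! Stop with a nonzero code if either rewrite disagrees with the original.
  if (maxval(abs(rot_unrolled - rot_ref)) > 1.0e-12_dp) stop 1
  if (maxval(abs(rot_folded   - rot_ref)) > 1.0e-12_dp) stop 2

  print *, 'all three rotate forms agree'
end program rotate_variants
```

The three forms compute the same values, so any timing difference between the cases comes from how the compiler handles each spelling, not from a change in the arithmetic.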