From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29725 invoked by alias); 18 Dec 2010 03:42:58 -0000
Received: (qmail 29716 invoked by uid 22791); 18 Dec 2010 03:42:57 -0000
X-SWARE-Spam-Status: No, hits=-1.0 required=5.0 tests=AWL,BAYES_05,TW_DB,T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from bromo.med.uc.edu (HELO bromo.med.uc.edu) (129.137.3.146)
    by sourceware.org (qpsmtpd/0.43rc1) with SMTP; Sat, 18 Dec 2010 03:42:52 +0000
Received: from bromo.med.uc.edu (localhost.localdomain [127.0.0.1])
    by bromo.med.uc.edu (Postfix) with ESMTP id 5BE47B2DE4; Fri, 17 Dec 2010 22:42:50 -0500 (EST)
Received: (from howarth@localhost)
    by bromo.med.uc.edu (8.14.3/8.14.3/Submit) id oBI3gniw021767; Fri, 17 Dec 2010 22:42:49 -0500
Date: Sat, 18 Dec 2010 11:35:00 -0000
From: Jack Howarth
To: "Fang, Changpeng"
Cc: Zdenek Dvorak , Richard Guenther , Xinliang David Li , "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
Message-ID: <20101218034249.GA21749@bromo.med.uc.edu>
References: <20101214075629.GA10020@kam.mff.cuni.cz> <20101214210552.GA19633@kam.mff.cuni.cz> <20101215092220.GA9872@kam.mff.cuni.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.18 (2008-05-17)
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2010-12/txt/msg01448.txt.bz2

On Fri, Dec 17, 2010 at 03:30:37PM -0600, Fang, Changpeng wrote:
> Hi, Jack:
>
> Is prefetch default on at -O3 on your systems?
>
> Can you do an additional test to test
> -O3 -ffast-math with/without the original patch?
>
> This way, we can know that whether it is the rtl-loop unrolling problem.

unpatched r168001

================================================================================
Date & Time     : 17 Dec 2010 19:24:47
Test Name       : gfortran_lin_O3_nounroll
Compile Command : gfortran -ffast-math -O3 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :     2000.0
Target Error %  :      0.100
Minimum Repeats :    10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   -------  -------  ------
          ac      1.01       10000      8.76       10  0.0129
      aermod     46.64       10000     17.23       10  0.0154
         air      2.57       10000      5.45       10  0.0579
    capacita      1.63       10000     33.17       10  0.0407
     channel      0.62       10000      1.87       10  0.0166
       doduc      6.21       10000     27.33       10  0.0067
     fatigue      2.02       10000      7.95       10  0.0339
     gas_dyn      2.18       10000      4.36       19  0.0871
      induct      5.08       10000     12.44       10  0.0054
       linpk      0.70       10000     15.52       10  0.0638
        mdbx      1.87       10000     11.46       10  0.0080
          nf      1.25       10000     31.27       19  0.0955
     protein      3.66       10000     35.68       10  0.0062
      rnflow      4.54       10000     26.09       10  0.0104
    test_fpu      3.46       10000      8.81       10  0.0443
        tfft      0.50       10000      1.90       10  0.0442

Geometric Mean Execution Time =      11.09 seconds

================================================================================

r168001 with original patch

================================================================================
Date & Time     : 17 Dec 2010 21:49:34
Test Name       : gfortran_lin_O3_nounroll
Compile Command : gfortran -ffast-math -O3 %n.f90 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :     2000.0
Target Error %  :      0.100
Minimum Repeats :    10
Maximum Repeats :   100

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   -------  -------  ------
          ac      0.89       10000      8.76       10  0.0146
      aermod     45.65       10000     17.23       10  0.0068
         air      2.48       10000      5.44       12  0.0345
    capacita      1.50       10000     33.20       10  0.0661
     channel      0.60       10000      1.87       10  0.0331
       doduc      6.07       10000     27.34       10  0.0118
     fatigue      1.84       10000      7.96       10  0.0169
     gas_dyn      1.95       10000      4.36       17  0.0911
      induct      4.97       10000     12.44       10  0.0068
       linpk      0.69       10000     15.54       10  0.0884
        mdbx      1.83       10000     11.46       10  0.0258
          nf      1.09       10000     31.38       20  0.0899
     protein      3.09       10000     35.64       10  0.0141
      rnflow      4.22       10000     26.06       10  0.0883
    test_fpu      3.24       10000      8.82       12  0.0866
        tfft      0.50       10000      1.91       10  0.0475

Geometric Mean Execution Time =      11.09 seconds

================================================================================

So the performance regression with the patch only manifests itself with
-funroll-loops.
            Jack

>
> Thanks,
>
> Changpeng
>
>
>
>
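
For anyone who wants to cross-check the two summary lines, here is a minimal Python sketch that recomputes the geometric mean of the "Ave Run" column for both runs and the largest per-benchmark relative change. It assumes the harness reports the ordinary geometric mean (the exponential of the mean log run time); the names `UNPATCHED`, `PATCHED`, `geometric_mean` and `max_relative_change` are illustrative only and are not part of the benchmark harness or of GCC.

```python
import math

# Average run times in seconds, copied from the "Ave Run" column
# of the two tables in this message.
UNPATCHED = {
    "ac": 8.76, "aermod": 17.23, "air": 5.45, "capacita": 33.17,
    "channel": 1.87, "doduc": 27.33, "fatigue": 7.95, "gas_dyn": 4.36,
    "induct": 12.44, "linpk": 15.52, "mdbx": 11.46, "nf": 31.27,
    "protein": 35.68, "rnflow": 26.09, "test_fpu": 8.81, "tfft": 1.90,
}
PATCHED = {
    "ac": 8.76, "aermod": 17.23, "air": 5.44, "capacita": 33.20,
    "channel": 1.87, "doduc": 27.34, "fatigue": 7.96, "gas_dyn": 4.36,
    "induct": 12.44, "linpk": 15.54, "mdbx": 11.46, "nf": 31.38,
    "protein": 35.64, "rnflow": 26.06, "test_fpu": 8.82, "tfft": 1.91,
}


def geometric_mean(times):
    """Return exp(mean(log(t))) over the given run times."""
    values = list(times)
    return math.exp(math.fsum(math.log(t) for t in values) / len(values))


def max_relative_change(before, after):
    """Return (benchmark, |after - before| / before) for the largest change."""
    name = max(before, key=lambda k: abs(after[k] - before[k]) / before[k])
    return name, abs(after[name] - before[name]) / before[name]


gm_unpatched = geometric_mean(UNPATCHED.values())
gm_patched = geometric_mean(PATCHED.values())
worst_name, worst_change = max_relative_change(UNPATCHED, PATCHED)

if __name__ == "__main__":
    print(f"unpatched geometric mean: {gm_unpatched:.2f} s")
    print(f"patched geometric mean:   {gm_patched:.2f} s")
    print(f"largest relative change:  {worst_name} ({worst_change:.2%})")
```

Comparing geometric means keeps the long-running benchmarks from dominating the summary, and the per-benchmark check shows whether any single test moved more than the rest without -funroll-loops.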