From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2942 invoked by alias); 16 Feb 2008 18:50:58 -0000
Received: (qmail 1417 invoked by uid 48); 16 Feb 2008 18:50:09 -0000
Date: Sat, 16 Feb 2008 18:50:00 -0000
Message-ID: <20080216185009.1416.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug fortran/29549] matmul slow for complex matrices
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "fxcoudert at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2008-02/txt/msg01806.txt.bz2

------- Comment #7 from fxcoudert at gcc dot gnu dot org 2008-02-16 18:50 -------
Thomas is right: -fcx-limited-range sets flag_complex_method to 0, but already
with flag_complex_method == 1 we have some rather good figures. Here are the
execution times of 300x300 matmul on my MacBook Pro (i386-apple-darwin8.11.1):

  - a home-made triple do loop in Fortran (Janne's comment #2) is 0.1876 sec
  - unpatched matmul is 0.5499 sec
  - matmul compiled with flag_complex_method == 1 is 0.1448 sec

The following patch is what I used to benchmark: it creates a
-fcx-fortran-rules (of course, we do know that Fortran actually rules, but
hiding it in an option name is a clever way for people to slowly start
realizing it) option that sets flag_complex_method to 1, and uses it to compile
libgfortran's matmul routines.

Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c	(revision 132353)
+++ gcc/toplev.c	(working copy)
@@ -2001,6 +2001,10 @@
   if (flag_cx_limited_range)
     flag_complex_method = 0;
 
+  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
+  if (flag_cx_fortran_rules)
+    flag_complex_method = 1;
+
   /* Targets must be able to place spill slots at lower addresses.
      If the target already uses a soft frame pointer, the transition is trivial.  */
   if (!FRAME_GROWS_DOWNWARD && flag_stack_protect)
Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 132353)
+++ gcc/common.opt	(working copy)
@@ -390,6 +390,10 @@
 Common Report Var(flag_cx_limited_range) Optimization
 Omit range reduction step when performing complex division
 
+fcx-fortran-rules
+Common Report Var(flag_cx_fortran_rules) Optimization
+Complex multiplication and division follow Fortran rules
+
 fdata-sections
 Common Report Var(flag_data_sections) Optimization
 Place data items into their own section
Index: libgfortran/Makefile.am
===================================================================
--- libgfortran/Makefile.am	(revision 132353)
+++ libgfortran/Makefile.am	(working copy)
@@ -636,7 +636,7 @@
 install-pdf:
 
 # Turn on vectorization and loop unrolling for matmul.
-$(patsubst %.c,%.lo,$(notdir $(i_matmul_c))): AM_CFLAGS += -ftree-vectorize -fs
+$(patsubst %.c,%.lo,$(notdir $(i_matmul_c))): AM_CFLAGS += -ftree-vectorize -fs
 
 # Logical matmul doesn't vectorize.
 $(patsubst %.c,%.lo,$(notdir $(i_matmull_c))): AM_CFLAGS += -funroll-loops


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot
                   |                            |org
           Keywords|                            |patch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29549
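
For readers following the flag_complex_method discussion in this comment, here is a rough C sketch of the kind of triple loop being timed, written with the straightforward complex product (four real multiplies, two adds, no NaN/Inf recovery). It is an illustration only: cheap_cmul and cmatmul are hypothetical names, not libgfortran routines. The assumption that the cheaper settings of flag_complex_method amount to this formula for multiplication is my reading of the thread, not something the patch itself states.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: straightforward complex product.
   (ar + i*ai) * (br + i*bi) = (ar*br - ai*bi) + i*(ar*bi + ai*br).
   No recovery of NaN/Inf results, unlike the full C99 Annex G behaviour.  */
static double complex cheap_cmul(double complex a, double complex b)
{
    double ar = creal(a), ai = cimag(a);
    double br = creal(b), bi = cimag(b);
    return (ar * br - ai * bi) + (ar * bi + ai * br) * I;
}

/* Hypothetical helper: c = a * b for n x n row-major complex matrices,
   written as the plain triple loop mentioned in the comment.  */
static void cmatmul(size_t n, const double complex *a,
                    const double complex *b, double complex *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double complex sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += cheap_cmul(a[i * n + k], b[k * n + j]);
            c[i * n + j] = sum;
        }
}
```

Replacing the call to cheap_cmul with the built-in `*` operator and building once with and once without -fcx-limited-range should give a feel for how much the complex-arithmetic setting alone costs, although the numbers will of course differ from the ones quoted above.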