Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
In-Reply-To: <63f1ffa4-02b5-11c9-b64d-86336733d4b5@charter.net>
References: <2aad89ce-02e1-45bf-0bdc-d318e7995595@charter.net>
 <49101da5-e1cc-0817-7cb6-64cfe5778e60@netcologne.de>
 <63f1ffa4-02b5-11c9-b64d-86336733d4b5@charter.net>
From: Richard Biener
Date: Tue, 15 Nov 2016 08:21:00 -0000
Subject: Re: [patch,libgfortran] PR51119 - MATMUL slow for large matrices
To: Jerry DeLisle
Cc: Thomas Koenig, "fortran@gcc.gnu.org", GCC Patches
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2016-11/txt/msg01404.txt.bz2

On Mon, Nov 14, 2016 at 11:13 PM, Jerry DeLisle wrote:
> On 11/13/2016 11:03 PM, Thomas Koenig wrote:
>>
>> Hi Jerry,
>>
>> I think this
>>
>> +      /* Parameter adjustments */
>> +      c_dim1 = m;
>> +      c_offset = 1 + c_dim1;
>>
>> should be
>>
>> +      /* Parameter adjustments */
>> +      c_dim1 = rystride;
>> +      c_offset = 1 + c_dim1;
>>
>> Regarding options for matmul:  It is possible to add the
>> options to the lines in Makefile.in
>>
>> # Turn on vectorization and loop unrolling for matmul.
>> $(patsubst %.c,%.lo,$(notdir $(i_matmul_c))): AM_CFLAGS +=
>> -ftree-vectorize
>> -funroll-loops
>>
>> This is a great step forward.  I think we can close most matmul-related
>> PRs once this patch has been applied.
>>
>> Regards
>>
>>         Thomas
>>
>
> With Thomas suggestion, I can remove the #pragma optimize from the source
> code. Doing this: (long lines wrapped as shown)
>
> diff --git a/libgfortran/Makefile.am b/libgfortran/Makefile.am
> index 39d3e11..9ee17f9 100644
> --- a/libgfortran/Makefile.am
> +++ b/libgfortran/Makefile.am
> @@ -850,7 +850,7 @@ intrinsics/dprod_r8.f90 \
>  intrinsics/f2c_specifics.F90
>
>  # Turn on vectorization and loop unrolling for matmul.
> -$(patsubst %.c,%.lo,$(notdir $(i_matmul_c))): AM_CFLAGS += -ftree-vectorize
> -funroll-loops
> +$(patsubst %.c,%.lo,$(notdir $(i_matmul_c))): AM_CFLAGS += -ffast-math
> -fno-protect-parens -fstack-arrays -ftree-vectorize -funroll-loops --param
> max-unroll-times=4 -ftree-loop-vectorize

-ftree-vectorize turns on -ftree-loop-vectorize and -ftree-slp-vectorize
already.

>  # Logical matmul doesn't vectorize.
>  $(patsubst %.c,%.lo,$(notdir $(i_matmull_c))): AM_CFLAGS += -funroll-loops
>
>
> Comparing gfortran 6 vs 7: (test program posted in PR51119)
>
> $ gfc6 -static -Ofast -finline-matmul-limit=32 -funroll-loops --param
> max-unroll-times=4 compare.f90
> $ ./a.out
> ========================================================
> ================          MEASURED GIGAFLOPS           =
> ========================================================
>                   Matmul                           Matmul
>                    fixed                Matmul   variable
>  Size  Loops   explicit  refMatmul    assumed   explicit
> ========================================================
>     2   2000     11.928      0.047      0.082      0.138
>     4   2000      1.455      0.220      0.371      0.316
>     8   2000      1.476      0.737      0.704      1.574
>    16   2000      4.536      3.755      2.825      3.820
>    32   2000      6.070      5.443      3.124      5.158
>    64   2000      5.423      5.355      5.405      5.413
>   128   2000      5.913      5.841      5.917      5.917
>   256    477      5.865      5.252      5.863      5.862
>   512     59      2.794      2.841      2.794      2.791
>  1024      7      1.662      1.356      1.662      1.661
>  2048      1      1.753      1.724      1.753      1.754
>
> $ gfc -static -Ofast -finline-matmul-limit=32 -funroll-loops --param
> max-unroll-times=4 compare.f90
> $ ./a.out
> ========================================================
> ================          MEASURED GIGAFLOPS           =
> ========================================================
>                   Matmul                           Matmul
>                    fixed                Matmul   variable
>  Size  Loops   explicit  refMatmul    assumed   explicit
> ========================================================
>     2   2000     12.146      0.042      0.090      0.146
>     4   2000      1.496      0.232      0.384      0.325
>     8   2000      2.330      0.765      0.763      0.965
>    16   2000      4.611      4.120      2.792      3.830
>    32   2000      6.068      5.265      3.102      4.859
>    64   2000      6.527      5.329      6.425      6.495
>   128   2000      8.207      5.643      8.336      8.441
>   256    477      9.210      4.967      9.367      9.299
>   512     59      8.330      2.772      8.422      8.342
>  1024      7      8.430      1.378      8.511      8.424
>  2048      1      8.339      1.718      8.425      8.322
>
> I do think we need to adjust the default inline limit and should do this
> separately from this patch.
>
> With these changes, OK for trunk?
>
> Regards,
>
> Jerry
>
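
For readers following the c_dim1 correction quoted above: the sketch below
illustrates, under stated assumptions, why the leading dimension probably has
to be the destination's column stride rather than m. It is not the libgfortran
code. elem_index and fill_section are hypothetical names, the matrix is
assumed to be column-major with unit row stride, and c_dim1 stands in for
whatever stride the caller passes (rystride in Thomas's suggestion).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not libgfortran code: 0-based position of the
   1-based element (i, j) in a column-major array whose consecutive
   columns are c_dim1 elements apart.  */
ptrdiff_t
elem_index (ptrdiff_t c_dim1, ptrdiff_t i, ptrdiff_t j)
{
  /* f2c-style "parameter adjustment": subtracting c_offset makes
     element (1, 1) land on position 0.  */
  ptrdiff_t c_offset = 1 + c_dim1;
  return i + j * c_dim1 - c_offset;
}

/* Hypothetical helper: store 10*i + j into every element of the m x n
   section that starts at c[0], treating c_dim1 as the distance between
   columns.  */
void
fill_section (double *c, ptrdiff_t c_dim1, ptrdiff_t m, ptrdiff_t n)
{
  /* A column stride shorter than the row count would overlap columns.  */
  assert (c_dim1 >= m);

  for (ptrdiff_t j = 1; j <= n; j++)
    for (ptrdiff_t i = 1; i <= m; i++)
      c[elem_index (c_dim1, i, j)] = (double) (10 * i + j);
}
```

For a 2x2 section of a parent array with 3 rows, fill_section (c, 3, 2, 2)
writes positions 0, 1, 3 and 4 and leaves the parent's third row alone.
fill_section (c, 2, 2, 2) writes positions 0 to 3 instead, so element (1,2)
lands in the parent's third row. The two choices agree only when the stride
equals m.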