Subject: Re: [patch, fortran, RFC] First steps towards inlining matmul
From: Dominique d'Humières
Date: Fri, 10 Apr 2015 14:15:00 -0000
Cc: GCC Patches, GNU GFortran
In-Reply-To: <5526FAAA.8010408@netcologne.de>
References: <20150405135519.AF21A105@mailhost.lps.ens.fr> <552147A3.7000603@netcologne.de>
 <4DA1047C-92D3-485A-9457-61655ED14681@lps.ens.fr> <55219D1C.6070507@netcologne.de> <9F588746-B32C-4919-AEFC-4EF5E6792001@lps.ens.fr> <3916A939-23B3-4D17-B91C-8B6CC34CD7BC@lps.ens.fr> <5526FAAA.8010408@netcologne.de>
To: Thomas Koenig

> On 10 Apr 2015, at 00:18, Thomas Koenig wrote:
>
> Hello world,
>
> here is an update on the matmul inlining patch.  Now, the
> rank one + rank two cases are also handled.

Preliminary tests show that my variant of fatigue.f90 with matmul is as fast as (if not faster than) the original test with dot products.

However, this causes new failures:

FAIL: gfortran.dg/matmul_bounds_4.f90 -O1 execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -O2 execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -O3 -fomit-frame-pointer execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -O3 -g execution test
FAIL: gfortran.dg/matmul_bounds_4.f90 -Os execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O1 execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O2 execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O3 -fomit-frame-pointer execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -O3 -g execution test
FAIL: gfortran.dg/matmul_bounds_5.f90 -Os execution test

and maybe

FAIL: gfortran.dg/coarray_lib_this_image_2.f90 -O scan-tree-dump-times original "mylbound = parm...dim\\[0\\].stride >= 0 && parm...dim\\[0\\].ubound >= parm...dim\\[0\\].lbound \\|\\| parm...dim\\[0\\].stride < 0 \\?[^\n\r]* parm...dim\\[0\\].lbound : 1;" 1
FAIL: gfortran.dg/dependency_26.f90 -O scan-tree-dump-times original "&a" 1

> Reallocation on assignment also works now.

Confirmed.

>
> Still missing:
>
> 1. Control via an option, BLAS inlining has to take precedence
> 2. handling of matmul(a,b) occurring in the middle of an expression.
> 3. Bounds checking from the front end pass (basically, calling
>    gfortran_runtime_error).
> 4. More test cases
>
> What do you think? 1. is straightforward,

I agree if the bounds are known at compile time (it would be nice to use -fblas-matmul-limit=n for the threshold). However, for bounds known at run time only, I don't see how we can escape some kind of versioning.

> we don't really need to do 2. for committing early in stage one.

Agreed: there is no point in messing with complicated expressions if the simple ones are not properly debugged!

> For 3, we could maybe just issue a STOP.

I have silenced these failures (gfortran.dg/matmul_bounds_*.f90) by adding -fno-frontend-optimize to the list of options. IMO, having the same message with and without inlining is needed.

> 4. is required, I need to look at some more corner cases where bugs may
> still be lurking.
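
To make the "versioning" remark for item 1 concrete, here is a minimal sketch of what is meant: when the bounds are known only at run time, both an inline loop nest and a library call are emitted, and a run-time size test selects between them. This is an illustration only, not code from the patch. The name `limit`, its value, and the simple per-extent comparison are assumptions made for the example; the real criterion behind -fblas-matmul-limit=n may differ.

```fortran
program versioning_sketch
  implicit none
  ! Hypothetical threshold standing in for -fblas-matmul-limit=n.
  integer, parameter :: limit = 30
  real :: x(4,3), y(3,5), z(4,5)

  call random_number (x)
  call random_number (y)
  call mm (x, y, z)

  ! Self-check against the intrinsic, allowing for rounding differences.
  if (maxval (abs (z - matmul (x, y))) > 1.0e-5) error stop "mismatch"
  print *, "max abs difference:", maxval (abs (z - matmul (x, y)))

contains

  subroutine mm (a, b, c)
    real, intent(in) :: a(:,:), b(:,:)
    real, intent(out) :: c(:,:)
    integer :: i, j, k

    ! The extents are only known here, at run time, so both versions
    ! must exist and the choice is made by this test (an assumed criterion).
    if (size (a, 1) < limit .and. size (a, 2) < limit &
        .and. size (b, 2) < limit) then
      ! Version 1: explicit loop nest, as an inlined matmul would produce.
      c = 0.0
      do j = 1, size (b, 2)
        do k = 1, size (a, 2)
          do i = 1, size (a, 1)
            c(i,j) = c(i,j) + a(i,k) * b(k,j)
          end do
        end do
      end do
    else
      ! Version 2: stands in for the call to the library or to BLAS.
      c = matmul (a, b)
    end if
  end subroutine mm

end program versioning_sketch
```

With bounds known at compile time, the test can be folded away and only one of the two versions needs to be generated.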

I also see the following failures:

FAIL: gfortran.dg/shape_2.f90 -O0 execution test
FAIL: gfortran.dg/shape_2.f90 -O1 execution test
FAIL: gfortran.dg/shape_2.f90 -O2 execution test
FAIL: gfortran.dg/shape_2.f90 -O3 -fomit-frame-pointer execution test
FAIL: gfortran.dg/shape_2.f90 -O3 -fomit-frame-pointer -funroll-loops execution test
FAIL: gfortran.dg/shape_2.f90 -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions execution test
FAIL: gfortran.dg/shape_2.f90 -O3 -g execution test
FAIL: gfortran.dg/shape_2.f90 -Os execution test
FAIL: gfortran.dg/shape_2.f90 -g -flto execution test

A reduced test is:

program main
  integer, dimension (40, 80) :: a = 1
  call test (a)
contains
  subroutine test (b)
    integer, dimension (11:, -8:), target :: b
    integer, dimension (:, :), pointer :: ptr

    print *, lbound (b (:, :), 1)
!    if (lbound (b (:, :), 1) .ne. 1) call abort
    print *, lbound (b (:, :), 2)
!    if (lbound (b (:, :), 2) .ne. 1) call abort
    print *, lbound (b (20:30:3, 40), 1)
    if (lbound (b (20:30:3, 40), 1) .ne. 1) call abort
  end subroutine test
end program main

(the other tests in the original file succeed).

Thanks for working on this,

Dominique

> What do you think?
>
> Thomas