Date: Thu, 18 Mar 2021 09:13:47 -0700
From: Steve Kargl
To: Richard Biener
Cc: Tobias Burnus, "fortran@gcc.gnu.org", Thomas Koenig
Subject: Re: MATMUL broken with frontend optimization.
Message-ID: <20210318161347.GA24201@troutmask.apl.washington.edu>
References: <20210318074849.GA22541@troutmask.apl.washington.edu> <563cee48-fbcc-09bc-0cd1-f05082e4feb3@codesourcery.com>
List-Id: Fortran mailing list

On Thu, Mar 18, 2021 at 04:05:40PM +0100, Richard Biener wrote:
> On Thu, Mar 18, 2021 at 3:48 PM Tobias Burnus wrote:
> >
> > Richard,
> >
> > On 18.03.21 13:35, Richard Biener via Fortran wrote:
> > > [...]
> > > Since the libgfortran MATMUL should be vectorized
> > > I think it's not reasonable to inline any but _very_ small
> > > MATMUL at optimization levels that do not enable vectorization.
> >
> > Besides the obvious if (!flag_external_blas) which should always prevent
> > inlining (possibly except for tiny N like N=1), your idea is 'if (N
> > small || flag_tree_loop_vectorize)'?
> >
> > Or are you thinking of a different or additional flag_... than
> > flag_tree_loop_vectorize for making this choice?
>
> Yes, I was thinking of flag_tree_loop_vectorize.  Of course libgfortran
> is far from having micro-optimized matmul for various architectures
> but IIRC it uses attribute(target) to provide several overloads.  So
> maybe only ever inlining tiny matmul makes sense as well (does the
> runtime have specializations for small sizes?)
>

With -fexternal-blas, there is a cross-over value of N=30, which can
be changed with the -fblas-matmul-limit=N option.  I forgot the
important example, but Thomas seems to be aware.

% gfcx -o z -O2 -fno-frontend-optimize -fexternal-blas a.f90 && ./z
/usr/local/bin/ld: /tmp/ccOe3VoD.o: in function `MAIN__':
a.f90:(.text+0x156): undefined reference to `sgemm_'
collect2: error: ld returned 1 exit status

sgemm_ would come from a tuned BLAS library such as OpenBLAS.

I was going to suggest adding a testcase that scans a dump for
sgemm.  It seems matmul_blas_1.f tests the -fexternal-blas and
-fblas-matmul-limit=N options, but it doesn't look for sgemm.
This, I believe, does the checking:

diff --git a/gcc/testsuite/gfortran.dg/matmul_blas_1.f b/gcc/testsuite/gfortran.dg/matmul_blas_1.f
index 6a88981c9d7..52298d09cce 100644
--- a/gcc/testsuite/gfortran.dg/matmul_blas_1.f
+++ b/gcc/testsuite/gfortran.dg/matmul_blas_1.f
@@ -237,4 +237,4 @@ C Test calling of BLAS routines
       if (any (c /= cres)) stop 20
       end
-! { dg-final { scan-tree-dump-times "_gfortran_matmul" 0 "optimized" } }
+! { dg-final { scan-tree-dump "sgemm" "optimized" } }

-- 
Steve
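
For reference, a small program of the kind that should reproduce the
undefined reference to sgemm_ might look like the sketch below.  It is
not the a.f90 used above (that file was not posted); the program name
and the size n = 64 are assumptions, chosen only so that the matrices
are single precision and larger than the N=30 cross-over.

```fortran
! Hypothetical stand-in for a.f90; the original file was not posted.
program matmul_blas_sketch
  implicit none
  ! Assumed size: anything above the N=30 cross-over should do.
  integer, parameter :: n = 64
  real :: a(n,n), b(n,n), c(n,n)
  integer :: i

  call random_number(a)

  ! Use the identity for b so the product is easy to verify.
  b = 0.0
  do i = 1, n
     b(i,i) = 1.0
  end do

  ! Single-precision MATMUL; with -fexternal-blas and n above the
  ! limit, this is where a reference to sgemm_ is expected.
  c = matmul(a, b)

  ! a * I must reproduce a.
  if (any(abs(c - a) > 1.0e-5)) stop 1
  print *, 'ok'
end program matmul_blas_sketch
```

Without -fexternal-blas this builds and runs against libgfortran
alone.  With -fexternal-blas it would need a BLAS at link time, for
example by appending -lopenblas, assuming OpenBLAS is installed.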