Date: Thu, 18 Mar 2021 13:22:39 -0700
From: Steve Kargl
To: Thomas Koenig
Cc: Richard Biener, Tobias Burnus, "fortran@gcc.gnu.org"
Subject: Re: MATMUL broken with frontend optimization.
Message-ID: <20210318202239.GA25584@troutmask.apl.washington.edu>
List-Id: Fortran mailing list

On Thu, Mar 18, 2021 at 07:24:21PM +0100, Thomas Koenig wrote:
> I didn't finish the previous mail before hitting "send", so
> here is the postscript...
>
> > OK, so I've had a bit of time to look at the actual test case.  I
> > missed one very important detail before:  This is a vector-matrix
> > operation.
> >
> > For this, we do not have a good library routine (Harald just
> > removed it because of a bug in buffering), and -fexternal-blas
> > does not work because we do not handle calls to anything but
> > *GEMM.
>
> A vector-matrix multiplication would be a call to *GEMV, a worthy
> goal, but out of scope so close to a release.

Agreed.

> > The idea is that, for a vector-matrix-multiplication, the
> > compiler should have enough information about the information
> about how to optimize for the relevant architecture, especially
> if the user compiles with the right flags.
>
> So, the current idea is that, if we optimize, we can inline.
>
> What would a better heuristic be?
>

Does _gfortran_matmul_r4 (and friends) work for vector-matrix
products?  I haven't checked.  If so, how about disabling in-lining
of MATMUL for 11.1; then, for 11.2, this can be revisited and
a small N can be chosen for in-lining.  With -fexternal-blas
and *gemm, the default cross-over is N = 30.

BTW, I came across this on StackOverflow.

https://stackoverflow.com/questions/66682180/why-is-matmul-slower-with-gfortran-compiler-optimization-turned-on

-- 
Steve