Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
References: <48286910-ebbb-10e4-488b-8c96e505375c@tkoenig.net>
 <43b9fcf0-f457-90a7-c807-4aebc65cb045@tkoenig.net>
 <2981fd67-007e-7327-8208-27e8fd18d9db@netcologne.de>
 <4e68f250-1e41-ac7c-dc64-88f91cdf183e@tkoenig.net>
In-Reply-To: <4e68f250-1e41-ac7c-dc64-88f91cdf183e@tkoenig.net>
From: Janne Blomqvist
Date: Wed, 20 Nov 2019 22:19:00 -0000
Subject: Re: [patch, fortran] Load scalar intent-in variables at the beginning of procedures
To:
Thomas König
Cc: Thomas Koenig, Tobias Burnus, "fortran@gcc.gnu.org", gcc-patches

On Wed, Nov 20, 2019 at 11:35 PM Thomas König wrote:
>
> On 20.11.19 at 21:45, Janne Blomqvist wrote:
> > BTW, since this is done for the purpose of optimization, have you done
> > testing on some suitable benchmark suite such as polyhedron, whether
> > it a) generates any different code b) does it make it go faster?
>
> I haven't run any actual benchmarks.
>
> However, there is a simple example which shows its advantages.
> Consider
>
>        subroutine foo(n,m)
>        m = 0
>        do 100 i=1,100
>          call bar
>          m = m + n
>   100  continue
>        end
>
> (I used old-style DO loops just because :-)
>
> Without the optimization, the inner loop is translated to
>
> .L2:
>          xorl    %eax, %eax
>          call    bar_
>          movl    (%r12), %eax
>          addl    %eax, 0(%rbp)
>          subl    $1, %ebx
>          jne     .L2
>
> and with the optimization to
>
> .L2:
>          xorl    %eax, %eax
>          call    bar_
>          addl    %r12d, 0(%rbp)
>          subl    $1, %ebx
>          jne     .L2
>
> so the load of the address is missing.  (Why do we zero %eax
> before each call? It should not be a variadic call right?)

Not sure. Maybe some belt and suspenders thing? I guess someone better
versed in ABI minutiae knows better. It's not Fortran-specific though,
the C frontend does the same when calling a void function.

AFAIK on reasonably current OoO CPUs, xor'ing a register with itself is
handled by the renamer and doesn't consume an execute slot, so it's in
effect a zero-cycle instruction. It still bloats the code slightly,
though.

> Of course, Fortran language rules specify that the call to bar
> cannot do anything to n

Hmm, does it?
What about the following modification to your testcase:

module nmod
  integer :: n
end module nmod

subroutine foo(n,m)
  m = 0
  do 100 i=1,100
     call bar
     m = m + n
100 continue
end subroutine foo

subroutine bar()
  use nmod
  n = 0
end subroutine bar

program main
  use nmod
  implicit none
  integer :: m
  n = 1
  m = 0
  call foo(n, m)
  print *, m
end program main

> So, a copy in / copy out for variables where we can not be sure that
> no value is assigned? Does anybody see a downside for that?)

In principle sounds good, unless my concerns above are real and affect
this case too.

> > Is there a risk of performance regressions due to higher register pressure?
>
> I don't think so. Either the compiler realizes that it can
> keep the variable in a register (then it makes no difference),
> or it has to load it fresh from its address (then there is
> one additional register needed).

Yes, true. Good point.

-- 
Janne Blomqvist
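
To make the concern about the modified testcase concrete, here is a
minimal sketch of the two code-generation strategies. It is written in C
rather than Fortran, only because modifying a global while a pointer to
it is live is well-defined in C, so both outcomes can be shown
deterministically. The names n_global, foo_reload and foo_cached are
made up for illustration and do not come from the patch: foo_reload
models re-reading n through its address after every call, and foo_cached
models loading n once at procedure entry.

```c
/* Hypothetical stand-in for the module variable n in the testcase. */
int n_global = 1;

/* Hypothetical stand-in for subroutine bar: it changes n behind the
   back of the procedure that received n as an argument. */
void bar(void)
{
    n_global = 0;
}

/* Models the code generated without the patch: n is re-read through
   its address after every call, like the movl (%r12), %eax above. */
int foo_reload(const int *n)
{
    int m = 0;
    for (int i = 1; i <= 100; i++) {
        bar();
        m += *n;        /* fresh load on every iteration */
    }
    return m;
}

/* Models the patched behaviour: n is loaded once at procedure entry
   and only the cached copy is used inside the loop. */
int foo_cached(const int *n)
{
    const int n_copy = *n;  /* single load at entry */
    int m = 0;
    for (int i = 1; i <= 100; i++) {
        bar();
        m += n_copy;    /* later changes to *n are not observed */
    }
    return m;
}
```

Starting from n_global = 1, foo_reload(&n_global) returns 0, because
bar() clears the variable before the first addition, whereas
foo_cached(&n_global) returns 100. Whether the Fortran testcase is
allowed to observe that difference is exactly the open question above.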