From: Janne Blomqvist
Date: Wed, 18 Jul 2018 14:55:00 -0000
Subject: Re: [PATCH][Fortran][v2] Use MIN/MAX_EXPR for min/max intrinsics
To: Kyrill Tkachov
Cc: Thomas König, Thomas Koenig, Richard Biener, "fortran@gcc.gnu.org", GCC Patches, Richard Sandiford
In-Reply-To: <5B4F48A7.9030804@foss.arm.com>
References: <5B4DE283.9060100@foss.arm.com> <5B4DF325.2050609@foss.arm.com>
 <9d0cf3dc-8c5c-bbb2-960c-386b2c936a50@netcologne.de> <5B4F21E0.3060307@foss.arm.com>
 <2AD7441E-7E57-47EC-9073-92375938B8B8@tkoenig.net> <5B4F48A7.9030804@foss.arm.com>

On Wed, Jul 18, 2018 at 5:03 PM, Kyrill Tkachov wrote:

>
> On 18/07/18 14:26, Thomas König wrote:
>
>> Hi Kyrill,
>>
>> On 18.07.2018 at 13:17, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>>>
>>> Thomas, Janne, would this relaxation of NaN handling be acceptable given
>>> the benefits mentioned above? If so, what would be the recommended
>>> adjustment to the nan_1.f90 test?
>>>
>> I would be a bit careful about changing behavior in such a major way.
>> What would the results with NaN and infinity then be, with or without
>> optimization? Would the results be consistent with min(nan,num) vs
>> min(num,nan)? Would they be consistent with the new IEEE standard?
>>
>> In general, I think that min(nan,num) should be nan and that our current
>> behavior is not the best.
>>
>> Does anybody have data points on how this is handled by other compilers?
>>
>> Oh, and if anything is changed, then compile and runtime behavior should
>> always be the same.
>>
>
> Thanks, that makes it clearer what behaviour is acceptable.
>
> So this v3 patch follows Richard Sandiford's suggested approach of
> emitting IFN_FMIN/FMAX when dealing with floating-point values where NaN
> handling is important and the target supports IFN_FMIN/FMAX.
> Otherwise the current explicit comparison sequence is emitted.
> For integer types and -ffast-math floating-point it will emit MIN/MAX_EXPR.
>
> With this patch the nan_1.f90 behaviour is preserved on all targets, we
> get the optimal sequence on aarch64 and on x86_64 we avoid the function
> call, with no changes in code generation.
>
> This gives the performance improvement on 521.wrf on aarch64 and leaves it
> unchanged on x86_64.
>
> I'm hoping this addresses all the concerns raised in this thread:
> * The NaN-handling behaviour is unchanged on all platforms.
> * The fast inline sequence is emitted where it is available.
> * No calls to library fmin*/fmax* are emitted where there were none.
> * MIN/MAX_EXPR sequences are emitted where possible.
>
> Is this acceptable?
>

So if I understand it correctly, the "internal fn" mechanism allows checking
whether the target supports expanding a builtin inline or whether it
requires a call to an external library function?

If so, then yes, OK. Thanks for the patch!

--
Janne Blomqvist
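
The question in the quoted thread about min(nan,num) versus min(num,nan)
can be illustrated with a small C sketch. It is not taken from the patch or
from the gfortran sources: the helper names naive_min and nan_aware_min are
hypothetical, and nan_aware_min is only an assumed shape for a comparison
sequence with an explicit NaN check. The one library call used, fmin from
<math.h>, returns the non-NaN operand when exactly one operand is a NaN.

```c
#include <assert.h>
#include <math.h>

/* Plain comparison (hypothetical helper). Every comparison involving a
   NaN is false, so the second argument is returned whenever either
   argument is a NaN: the result depends on argument order. */
double naive_min(double a, double b)
{
    return a < b ? a : b;
}

/* Comparison with an explicit NaN check (hypothetical helper, an assumed
   shape rather than the code gfortran emits). A NaN in the running value
   is replaced by the next argument, so a single NaN is ignored in either
   position. */
double nan_aware_min(double a, double b)
{
    double m = a;
    if (b < m || isnan(m))
        m = b;
    return m;
}
```

With these definitions, naive_min(NAN, 1.0) yields 1.0 while
naive_min(1.0, NAN) yields a NaN. Both nan_aware_min and fmin yield 1.0 for
either argument order. Compiling with -ffast-math lets the compiler assume
that no NaNs occur, so none of these results should be relied on under that
option.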