From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-481731-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 35610 invoked by alias); 17 Jul 2018 13:46:19 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 35074 invoked by uid 89); 17 Jul 2018 13:46:18 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*i:sk:5hQ@mai, H*f:sk:5hQ@mai, H*i:sk:caLQr8W, H*f:sk:caLQr8W
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 17 Jul 2018 13:46:17 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A1F0A18A;	Tue, 17 Jul 2018 06:46:15 -0700 (PDT)
Received: from [10.2.207.77] (e100706-lin.cambridge.arm.com [10.2.207.77])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 074AB3F5B1;	Tue, 17 Jul 2018 06:46:14 -0700 (PDT)
Message-ID: <5B4DF325.2050609@foss.arm.com>
Date: Tue, 17 Jul 2018 13:46:00 -0000
From: Kyrill  Tkachov <kyrylo.tkachov@foss.arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Richard Biener <richard.guenther@gmail.com>
CC: "fortran@gcc.gnu.org" <fortran@gcc.gnu.org>,  GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Fortran] Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate
References: <5B4DE283.9060100@foss.arm.com> <CAFiYyc2F_H1bSCQg+caLQr8WnqExtkAVyAhaQMky_HbZCC=5hQ@mail.gmail.com>
In-Reply-To: <CAFiYyc2F_H1bSCQg+caLQr8WnqExtkAVyAhaQMky_HbZCC=5hQ@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2018-07/txt/msg00928.txt.bz2

Hi Richard,

On 17/07/18 14:27, Richard Biener wrote:
> On Tue, Jul 17, 2018 at 2:35 PM Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi all,
>>
>> This is my first Fortran patch, so apologies if I'm missing something.
>> The current expansion of the min and max intrinsics explicitly expands
>> the comparisons between each argument to calculate the global min/max.
>> Some targets, like aarch64, have instructions that can calculate the min/max
>> of two real (floating-point) numbers with the proper NaN-handling semantics
>> (if both inputs are NaN, return Nan. If one is NaN, return the other) and those
>> are the semantics provided by the __builtin_fmin/max family of functions that expand
>> to these instructions.
>>
>> This patch makes the frontend emit __builtin_fmin/max directly to compare each
>> pair of numbers when the numbers are floating-point, and use MIN_EXPR/MAX_EXPR otherwise
>> (integral types and -ffast-math) which should hopefully be easier to recognise in the
> What is Fortrans requirement on min/max intrinsics?  Doesn't it only
> require things that
> are guaranteed by MIN/MAX_EXPR anyways?  The only restriction here is

The current implementation expands to:
     mvar = a1;
     if (a2 .op. mvar || isnan (mvar))
       mvar = a2;
     if (a3 .op. mvar || isnan (mvar))
       mvar = a3;
     ...
     return mvar;

That is, if one of the operands is a NaN it will return the other argument.
If both (all) are NaNs, it will return NaN. This is the same as the semantics of fmin/max
as far as I can tell.

> /* Minimum and maximum values.  When used with floating point, if both
>     operands are zeros, or if either operand is NaN, then it is unspecified
>     which of the two operands is returned as the result.  */
>
> which means MIN/MAX_EXPR are not strictly IEEE compliant with signed
> zeros or NaNs.
> Thus the correct test would be !HONOR_SIGNED_ZEROS && !HONOR_NANS if singed
> zeros are significant.

True, MIN/MAX_EXPR would not be appropriate in that condition. I guarded their use
on !HONOR_NANS (type) only. I'll update it to !HONOR_SIGNED_ZEROS (type) && !HONOR_NANS (type).


>
> I'm not sure if using fmin/max calls when we cannot use MIN/MAX_EXPR
> is a good idea,
> this may both generate bigger code and be slower.

The patch will generate fmin/fmax calls (or the fminf,fminl variants) when mathfn_built_in advertises
them as available (does that mean they'll have a fast inline implementation?)

If the above doesn't hold and we can't use either MIN/MAX_EXPR of fmin/fmax then the patch falls back
to the existing expansion.

FWIW, this patch does improve performance on 521.wrf from SPEC2017 on aarch64.

Thanks,
Kyrill

>
> Richard.
>
>> midend and optimise. The previous approach of generating the open-coded version of that
>> is used when we don't have an appropriate __builtin_fmin/max available.
>> For example, for a configuration of x86_64-unknown-linux-gnu that I tested there was no
>> 128-bit __built_fminl available.
>>
>> With this patch I'm seeing more than 7000 FMINNM/FMAXNM instructions being generated at -O3
>> on aarch64 for 521.wrf from fprate SPEC2017 where none before were generated
>> (we were generating explicit comparisons and NaN checks). This gave a 2.4% improvement
>> in performance on a Cortex-A72.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu and x86_64-unknown-linux-gnu.
>>
>> Ok for trunk?
>> Thanks,
>> Kyrill
>>
>> 2018-07-17  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>       * f95-lang.c (gfc_init_builtin_functions): Define __builtin_fmin,
>>       __builtin_fminf, __builtin_fminl, __builtin_fmax, __builtin_fmaxf,
>>       __builtin_fmaxl.
>>       * trans-intrinsic.c: Include builtins.h.
>>       (gfc_conv_intrinsic_minmax): Emit __builtin_fmin/max or MIN/MAX_EXPR
>>       functions to calculate the min/max.
>>
>> 2018-07-17  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>       * gfortran.dg/max_fmaxf.f90: New test.
>>       * gfortran.dg/min_fminf.f90: Likewise.
>>       * gfortran.dg/minmax_integer.f90: Likewise.
>>       * gfortran.dg/max_fmaxl_aarch64.f90: Likewise.
>>       * gfortran.dg/min_fminl_aarch64.f90: Likewise.