From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-481795-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 57421 invoked by alias); 18 Jul 2018 11:45:42 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 55252 invoked by uid 89); 18 Jul 2018 11:44:15 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_PASS autolearn=ham version=3.3.2 spammy=H*i:sk:J2i4r3L, H*f:sk:J2i4r3L
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 18 Jul 2018 11:44:14 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 677707A9;	Wed, 18 Jul 2018 04:44:12 -0700 (PDT)
Received: from localhost (unknown [10.32.99.48])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 761343F318;	Wed, 18 Jul 2018 04:44:11 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Mail-Followup-To: Richard Biener <richard.guenther@gmail.com>,kyrylo.tkachov@foss.arm.com,  "fortran\@gcc.gnu.org" <fortran@gcc.gnu.org>,  GCC Patches <gcc-patches@gcc.gnu.org>, richard.sandiford@arm.com
Cc: kyrylo.tkachov@foss.arm.com,  "fortran\@gcc.gnu.org" <fortran@gcc.gnu.org>,  GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH]Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate
References: <5B4DE283.9060100@foss.arm.com>	<CAFiYyc2F_H1bSCQg+caLQr8WnqExtkAVyAhaQMky_HbZCC=5hQ@mail.gmail.com>	<5B4DF325.2050609@foss.arm.com>	<CAFiYyc2ECcDg+25fmb9Yr8y0-6F2kQ3FKLrsm-jyw0LPvyqQ=Q@mail.gmail.com>	<5B4F0D5A.2030401@foss.arm.com>	<CAFiYyc2KjSqJjys51QnCuPnFYOv7e3=J2i4r3LA74Xj062utEg@mail.gmail.com>
Date: Wed, 18 Jul 2018 11:45:00 -0000
In-Reply-To: <CAFiYyc2KjSqJjys51QnCuPnFYOv7e3=J2i4r3LA74Xj062utEg@mail.gmail.com>	(Richard Biener's message of "Wed, 18 Jul 2018 12:06:15 +0200")
Message-ID: <87r2k04vli.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-SW-Source: 2018-07/txt/msg00992.txt.bz2

Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Jul 18, 2018 at 11:50 AM Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>>
>>
>> On 18/07/18 10:44, Richard Biener wrote:
>> > On Tue, Jul 17, 2018 at 3:46 PM Kyrill Tkachov
>> > <kyrylo.tkachov@foss.arm.com> wrote:
>> >> Hi Richard,
>> >>
>> >> On 17/07/18 14:27, Richard Biener wrote:
>> >>> On Tue, Jul 17, 2018 at 2:35 PM Kyrill Tkachov
>> >>> <kyrylo.tkachov@foss.arm.com> wrote:
>> >>>> Hi all,
>> >>>>
>> >>>> This is my first Fortran patch, so apologies if I'm missing something.
>> >>>> The current expansion of the min and max intrinsics explicitly expands
>> >>>> the comparisons between each argument to calculate the global min/max.
>> >>>> Some targets, like aarch64, have instructions that can calculate
>> >>>> the min/max
>> >>>> of two real (floating-point) numbers with the proper NaN-handling
>> >>>> semantics
>> >>>> (if both inputs are NaN, return Nan. If one is NaN, return the
>> >>>> other) and those
>> >>>> are the semantics provided by the __builtin_fmin/max family of
>> >>>> functions that expand
>> >>>> to these instructions.
>> >>>>
>> >>>> This patch makes the frontend emit __builtin_fmin/max directly to
>> >>>> compare each
>> >>>> pair of numbers when the numbers are floating-point, and use
>> >>>> MIN_EXPR/MAX_EXPR otherwise
>> >>>> (integral types and -ffast-math) which should hopefully be easier
>> >>>> to recognise in the
>> >>> What is Fortrans requirement on min/max intrinsics?  Doesn't it only
>> >>> require things that
>> >>> are guaranteed by MIN/MAX_EXPR anyways?  The only restriction here is
>> >> The current implementation expands to:
>> >>       mvar = a1;
>> >>       if (a2 .op. mvar || isnan (mvar))
>> >>         mvar = a2;
>> >>       if (a3 .op. mvar || isnan (mvar))
>> >>         mvar = a3;
>> >>       ...
>> >>       return mvar;
>> >>
>> >> That is, if one of the operands is a NaN it will return the other argument.
>> >> If both (all) are NaNs, it will return NaN. This is the same as the semantics of fmin/max
>> >> as far as I can tell.
>> >>
>> >>> /* Minimum and maximum values.  When used with floating point, if both
>> >>>      operands are zeros, or if either operand is NaN, then it is
>> >>> unspecified
>> >>>      which of the two operands is returned as the result.  */
>> >>>
>> >>> which means MIN/MAX_EXPR are not strictly IEEE compliant with signed
>> >>> zeros or NaNs.
>> >>> Thus the correct test would be !HONOR_SIGNED_ZEROS && !HONOR_NANS
>> >>> if singed
>> >>> zeros are significant.
>> >> True, MIN/MAX_EXPR would not be appropriate in that condition. I
>> >> guarded their use
>> >> on !HONOR_NANS (type) only. I'll update it to !HONOR_SIGNED_ZEROS
>> >> (type) && !HONOR_NANS (type).
>> >>
>> >>
>> >>> I'm not sure if using fmin/max calls when we cannot use MIN/MAX_EXPR
>> >>> is a good idea,
>> >>> this may both generate bigger code and be slower.
>> >> The patch will generate fmin/fmax calls (or the fminf,fminl
>> >> variants) when mathfn_built_in advertises
>> >> them as available (does that mean they'll have a fast inline
>> >> implementation?)
>> > This doesn't mean anything given you make them available with your
>> > patch ;)  So I expect it may
>> > cause issues for !c99_runtime targets (and long double at least).
>>
>> Urgh, that can cause headaches...
>>
>> >> If the above doesn't hold and we can't use either MIN/MAX_EXPR of
>> >> fmin/fmax then the patch falls back
>> >> to the existing expansion.
>> > As said I would not use fmin/fmax calls here at all.
>>
>> ... Given the comments from Thomas and Janne, maybe we should just
>> emit MIN/MAX_EXPRs here
>> since there is no language requirement on NaN/signed zero handling on
>> these intrinsics?
>> That should make it simpler and more portable.
>
> That's fortran maintainers call.
>
>> >> FWIW, this patch does improve performance on 521.wrf from SPEC2017
>> >> on aarch64.
>> > You said that, yes.  Even without -ffast-math?
>>
>> It improves at -O3 without -ffast-math in particular. With -ffast-math
>> phiopt optimisation
>> is more aggressive and merges the conditionals into MIN/MAX_EXPRs
>> (minmax_replacement in tree-ssa-phiopt.c)
>
> The question is will it be slower without -ffast-math, that is, when
> fmin/max() calls are emitted rather
> than inline conditionals.
>
> I think a patch just using MAX/MIN_EXPR within the existing
> constraints and otherwise falling back to
> the current code would be more obvious and other changes should be
> mande independently.

If going to MIN_EXPR and MAX_EXPR unconditionally isn't acceptable,
maybe an alternative would be to go straight to internal functions,
under the usual:

  direct_internal_fn_supported_p (IFN_F{MIN,MAX}, type, OPTIMIZE_FOR_SPEED)

condition.

Thanks,
Richard