From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-481748-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 41272 invoked by alias); 17 Jul 2018 16:16:32 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 41163 invoked by uid 89); 17 Jul 2018 16:16:26 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*i:sk:9d0cf3d, H*f:sk:9d0cf3d, emits, expects
X-HELO: foss.arm.com
Received: from usa-sjc-mx-foss1.foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 17 Jul 2018 16:16:24 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CD4BC7A9;	Tue, 17 Jul 2018 09:16:22 -0700 (PDT)
Received: from [10.2.207.77] (e100706-lin.cambridge.arm.com [10.2.207.77])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0C4643F318;	Tue, 17 Jul 2018 09:16:21 -0700 (PDT)
Message-ID: <5B4E1654.3010806@foss.arm.com>
Date: Tue, 17 Jul 2018 16:16:00 -0000
From: Kyrill  Tkachov <kyrylo.tkachov@foss.arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Thomas Koenig <tkoenig@netcologne.de>,  Richard Biener <richard.guenther@gmail.com>
CC: "fortran@gcc.gnu.org" <fortran@gcc.gnu.org>,  GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Fortran] Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate
References: <5B4DE283.9060100@foss.arm.com> <CAFiYyc2F_H1bSCQg+caLQr8WnqExtkAVyAhaQMky_HbZCC=5hQ@mail.gmail.com> <5B4DF325.2050609@foss.arm.com> <9d0cf3dc-8c5c-bbb2-960c-386b2c936a50@netcologne.de>
In-Reply-To: <9d0cf3dc-8c5c-bbb2-960c-386b2c936a50@netcologne.de>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2018-07/txt/msg00945.txt.bz2

Hi Thomas,

On 17/07/18 16:36, Thomas Koenig wrote:
> Hi Kyrill,
>
>> The current implementation expands to:
>>      mvar = a1;
>>      if (a2 .op. mvar || isnan (mvar))
>>        mvar = a2;
>>      if (a3 .op. mvar || isnan (mvar))
>>        mvar = a3;
>>      ...
>>      return mvar;
>>
>> That is, if one of the operands is a NaN it will return the other argument.
>> If both (all) are NaNs, it will return NaN. This is the same as the semantics of fmin/max
>> as far as I can tell.
>
> I've looked at the F2008 standard, and, interestingly enough, the
> requirement on MIN and MAX do not mention NaNs at all. 13.7.106
> has, for MAX,
>
> Result Value. The value of the result is that of the largest argument.
>
> plus some stuff about character variables (not relevant here). Similar
> for MIN.
>
> Also, the section on IEEE_ARITHMETIC (14.9) does not mention
> comparisons; also, "Complete conformance with IEC 60559:1989 is not
> required", what is required is the correct support for +,-, and *,
> plus support for / if IEEE_SUPPORT_DIVIDE is covered.
>

Thanks for checking this.

> So, the Fortran standard does not impose many requirements. I do think
> that a patch such as yours should not change the current behavior unless
> we know what it does and do think it is a good idea.  Hmm...
>
> Having said that, I think we pretty much cover all the corner cases
> in nan_1.f90, so if that test passes without regression, then that
> aspect should be fine.
>

Looking at the test it looks like there is a de facto expected behaviour.
For example it contains:
if (max(2.d0, nan) /= 2.d0) STOP 9

So it definitely expects a MAX with a NaN argument to return the non-NaN argument,
which is the behaviour that my patch preserves.

On integral arguments or when we don't care about NaNs (-Ofast and such) we'll be using
the MIN/MAX_EXPR, which doesn't specify what's returned on a NaN argument, thus allowing
for more aggressive optimisations.
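
As an illustration of why the NaN case matters, here is a small C sketch of a
comparison-only minimum. It is only a stand-in that I made up (naive_min and
check_naive_min are hypothetical names), not what MIN_EXPR actually expands
to, but it shows how a plain comparison gives an order-dependent answer with a
NaN operand while fmin does not.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical comparison-only minimum.  Any comparison with a NaN is
   false, so the second operand is returned whenever either is a NaN.  */
static double
naive_min (double a, double b)
{
  return a < b ? a : b;
}

void
check_naive_min (void)
{
  /* The result depends on which operand holds the NaN.  */
  assert (naive_min (NAN, 2.0) == 2.0);
  assert (isnan (naive_min (2.0, NAN)));

  /* fmin returns the non-NaN operand in either position.  */
  assert (fmin (NAN, 2.0) == 2.0);
  assert (fmin (2.0, NAN) == 2.0);
}
```

When NaNs are known not to occur (integer arguments, or the floating-point
modes where we don't honour NaNs), the two forms agree, which is what leaves
room for the cheaper form.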

> Question: You have found an advantage on AArch64. Do you have
> access to other architectures to see if there is also a speed
> advantage, or maybe a disadvantage?
>

Because the expansion now emits straightline code rather than conditionals and branches
it should be easier to optimise in general, so I'd expect this to be an improvement overall.
That said, I have only benchmarked it on SPEC2017 on aarch64.

If you have any benchmarks of interest that you (or somebody else) can run on a target that you
care about, I would be very grateful for any results.

Thanks,
Kyrill

> Regards
>
>     Thomas