From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-508230-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 54227 invoked by alias); 3 Sep 2019 14:19:36 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 54219 invoked by uid 89); 3 Sep 2019 14:19:36 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-6.6 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_1,SPF_PASS autolearn=ham version=3.3.1 spammy=well-known, wellknown, ordering
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 03 Sep 2019 14:19:34 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2EE1B337;	Tue,  3 Sep 2019 07:19:33 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.62])	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 050C73F246;	Tue,  3 Sep 2019 07:19:31 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Mail-Followup-To: Richard Biener <rguenther@suse.de>,Barnaby Wilks <Barnaby.Wilks@arm.com>,  "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,  nd <nd@arm.com>,  "law\@redhat.com" <law@redhat.com>,  "ian\@airs.com" <ian@airs.com>,  Tamar Christina <Tamar.Christina@arm.com>,  Wilco Dijkstra <Wilco.Dijkstra@arm.com>, richard.sandiford@arm.com
Cc: Barnaby Wilks <Barnaby.Wilks@arm.com>,  "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,  nd <nd@arm.com>,  "law\@redhat.com" <law@redhat.com>,  "ian\@airs.com" <ian@airs.com>,  Tamar Christina <Tamar.Christina@arm.com>,  Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Subject: Re: [PATCH][GCC] Simplify to single precision where possible for binary/builtin maths operations.
References: <571395fe-921b-5a68-ec8d-84850a732253@arm.com>	<alpine.LSU.2.20.1909031006590.32458@zhemvz.fhfr.qr>
Date: Tue, 03 Sep 2019 14:19:00 -0000
In-Reply-To: <alpine.LSU.2.20.1909031006590.32458@zhemvz.fhfr.qr> (Richard	Biener's message of "Tue, 3 Sep 2019 10:23:33 +0200 (CEST)")
Message-ID: <mpt5zm9fonh.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-IsSubscribed: yes
X-SW-Source: 2019-09/txt/msg00121.txt.bz2

Richard Biener <rguenther@suse.de> writes:
> On Mon, 2 Sep 2019, Barnaby Wilks wrote:
>
>> Hello,
>> 
>> This patch introduces an optimization for narrowing binary and builtin
>> math operations to the smallest type when unsafe math optimizations are
>> enabled (typically -Ofast or -ffast-math).
>> 
>> Consider the example:
>> 
>>    float f (float x) {
>>      return 1.0 / sqrt (x);
>>    }
>> 
>>    f:
>>      fcvt	d0, s0
>>      fmov	d1, 1.0e+0
>>      fsqrt	d0, d0
>>      fdiv	d0, d1, d0
>>      fcvt	s0, d0
>>      ret
>> 
>> Given that all outputs are of float type, we can do the whole 
>> calculation in single precision and avoid any potentially expensive 
>> conversions between single and double precision.
>> 
>> Aka the expression would end up looking more like
>> 
>>    float f (float x) {
>>      return 1.0f / sqrtf (x);
>>    }
>> 
>>    f:
>>      fsqrt	s0, s0
>>      fmov	s1, 1.0e+0
>>      fdiv	s0, s1, s0
>>      ret
>> 
>> This optimization will narrow casts around math builtins, and also
>> not try to find the widest type for calculations when processing binary
>> math operations (if unsafe math optimizations are enabled).
>> 
>> Added tests to verify that narrower math builtins are chosen and
>> no unnecessary casts are introduced when appropriate.
>> 
>> Bootstrapped and regtested on aarch64 and x86_64 with no regressions.
>> 
>> I don't have write access, so if OK for trunk then can someone commit on 
>> my behalf?
> [...]
>
> Now - as a general comment I think adding this kind of narrowing is
> good but doing it via match.pd patterns is quite limiting - eventually
> the backprop pass would be a fit for propagating "needed precision"
> and narrowing feeding stmts accordingly in a more general way?
> Richard can probably tell quickest if it is feasible in that framework.

Yeah, I think it would be a good fit, and would for example cope with
cases in which we select between two double results before doing the
truncation to float.  I'd wanted to do something similar for integer
truncation but never found the time...
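
As a rough illustration of the "select between two double results" case
(the function names are made up for the example, and whether the second
form is an acceptable replacement depends on unsafe math optimizations
being enabled):

```cpp
#include <cmath>

// Hypothetical input: both arms are computed in double, and the
// truncation to float happens only after the selection.  A pattern
// that looks at a single cast around a single operation does not see
// through the select.
float
select_then_truncate (float x, float y, bool use_x)
{
  double a = std::sqrt (static_cast<double> (x));
  double b = 1.0 / static_cast<double> (y);
  double chosen = use_x ? a : b;
  return static_cast<float> (chosen);
}

// What propagating "only float precision is needed" backwards from the
// return value could in principle reduce it to: all arithmetic in float.
float
select_in_float (float x, float y, bool use_x)
{
  return use_x ? std::sqrt (x) : 1.0f / y;
}
```

The two functions agree for inputs whose results are exactly
representable, but can differ in the last bit otherwise, which is why
this only makes sense under the unsafe-math flags.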

At the moment, backprop handles a single piece of information: whether
the sign of the value matters.  This is (over?)generalised to be one bit
of information in a word of flags.  I guess we could take the same
approach here and have flags for certain well-known floating-point
types, but it might be cleaner to instead have a field that records the
widest mode that users of the result want.
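
To make the flag-per-type alternative concrete, here is a minimal sketch.
It is not the existing backprop code: the struct name and the two
precision flags are assumptions for illustration, and only the idea of
a sign-related bit comes from the current pass.

```cpp
// Hypothetical flag word.  Each bit is a property that is true only if
// it holds for every user of the value.
struct flag_usage
{
  unsigned int ignore_sign : 1;      // the one existing piece of information
  unsigned int float_suffices : 1;   // assumed: no user needs more than float
  unsigned int double_suffices : 1;  // assumed: no user needs more than double
};

// Confluence for the flag form: keep only what both users agree on.
flag_usage
operator & (const flag_usage &a, const flag_usage &b)
{
  flag_usage res;
  res.ignore_sign = a.ignore_sign & b.ignore_sign;
  res.float_suffices = a.float_suffices & b.float_suffices;
  res.double_suffices = a.double_suffices & b.double_suffices;
  return res;
}
```

Every additional floating-point type costs another bit and another
line in the confluence function, which is the reason a single
"widest mode wanted" field looks cleaner.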

I think to do this we'd need to build an array that maps floating-point
machine_modes to their order in the FOR_EACH_MODE_IN_CLASS chain.
That'll give us a total ordering over floating-point modes, so that
operator & (the usage_info confluence function) can just take
whichever of the input usage_info modes has the highest index in this
chain.
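
A self-contained sketch of that idea follows.  Everything in it is a
stand-in: FpMode and fp_mode_chain are hypothetical substitutes for
machine_mode and the FOR_EACH_MODE_IN_CLASS chain (assumed here to run
from narrowest to widest), and the usage_info below is a cut-down
illustration rather than the real structure.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the floating-point machine_modes.  The
// enumerators are deliberately not in width order, so the enum value
// alone cannot be used for comparison.
enum FpMode { MODE_DF, MODE_HF, MODE_TF, MODE_SF, NUM_FP_MODES };

// Hypothetical stand-in for the mode chain, narrowest first.
constexpr std::array<FpMode, NUM_FP_MODES> fp_mode_chain
  = { MODE_HF, MODE_SF, MODE_DF, MODE_TF };

// Build the array that maps each mode to its position in the chain.
std::array<unsigned, NUM_FP_MODES>
build_fp_mode_rank ()
{
  std::array<unsigned, NUM_FP_MODES> rank{};
  for (std::size_t i = 0; i < fp_mode_chain.size (); ++i)
    rank[fp_mode_chain[i]] = static_cast<unsigned> (i);
  return rank;
}

const std::array<unsigned, NUM_FP_MODES> fp_mode_rank = build_fp_mode_rank ();

// Cut-down usage information: the sign bit plus the widest mode that
// any user of the result wants.
struct usage_info
{
  bool ignore_sign;
  FpMode widest_mode;
};

// Confluence: the sign can be ignored only if both users ignore it,
// and the mode is whichever input ranks higher in the chain.
usage_info
operator & (const usage_info &a, const usage_info &b)
{
  usage_info res;
  res.ignore_sign = a.ignore_sign && b.ignore_sign;
  res.widest_mode = (fp_mode_rank[a.widest_mode] >= fp_mode_rank[b.widest_mode]
                     ? a.widest_mode : b.widest_mode);
  return res;
}
```

With this, combining a user that wants MODE_SF with one that wants
MODE_DF yields MODE_DF, regardless of how the enumerators happen to be
numbered.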

Thanks,
Richard