Date: Wed, 12 May 2021 11:27:56 +0200 (CEST)
From: Richard Biener
To: Richard Sandiford
cc: Tamar Christina, "gcc@gcc.gnu.org"
Subject: Re: [RFC] Implementing detection of saturation and rounding arithmetic

On Wed, 12 May 2021, Richard Sandiford wrote:

> Tamar Christina writes:
> > Hi All,
> >
> > We are looking to implement saturation support in the compiler.  The aim is to
> > recognize both scalar and vector variants of typical saturating expressions.
> >
> > As an example:
> >
> > 1. Saturating addition:
> >    char sat (char a, char b)
> >    {
> >       int tmp = a + b;
> >       return tmp > 127 ? 127 : ((tmp < -128) ?
> >         -128 : tmp);
> >    }
> >
> > 2. Saturating abs:
> >    char sat (char a)
> >    {
> >       int tmp = abs (a);
> >       return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
> >    }
> >
> > 3. Rounding shifts
> >    char rndshift (char dc)
> >    {
> >       int round_const = 1 << (shift - 1);
> >       return (dc + round_const) >> shift;
> >    }
> >
> > etc.
> >
> > Of course the first issue is that C does not really have a single idiom for
> > expressing this.
> >
> > At the RTL level we have ss_truncate and us_truncate and float_truncate for
> > truncation.
> >
> > At the Tree level we have nothing for truncation (I believe) for scalars.  For
> > Vector code there already seems to be VEC_PACK_SAT_EXPR but it looks like
> > nothing actually generates this at the moment.  It's just an unused tree code.
> >
> > For rounding there doesn't seem to be any existing infrastructure.
> >
> > The proposal to handle these is as follows.  Keep in mind that all of these
> > also exist in their scalar form, so detecting them in the vectorizer would be
> > the wrong place.
> >
> > 1. Rounding:
> >    a) Use match.pd to rewrite various rounding idioms to shifts.
> >    b) Use backwards or forward prop to rewrite these to internal functions
> >       where even if the target does not support these rounding instructions they
> >       have a chance to provide a more efficient implementation than what would
> >       be generated normally.
> >
> > 2. Saturation:
> >    a) Use match.pd to rewrite the various saturation expressions into min/max
> >       operations which opens up the expressions to further optimizations.
> >    b) Use backwards or forward prop to convert to internal functions if the
> >       resulting min/max expression still meets the criteria for being a
> >       saturating expression.  This follows the algorithm as outlined in "The
> >       Software Vectorization Handbook" by Aart J.C. Bik.
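As a rough illustration of the rounding idiom and of step 2a above, the sketch below writes the 8-bit saturating add in min/max form and gives the rounding shift an explicit shift argument. The helper names (min_int, max_int, sat_add_minmax, rnd_shift) and the "unsigned shift" parameter are assumptions made for this example; they are not names from the proposal or from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers standing in for MIN_EXPR / MAX_EXPR.  */
static inline int min_int (int a, int b) { return a < b ? a : b; }
static inline int max_int (int a, int b) { return a > b ? a : b; }

/* Saturating 8-bit add written as MIN (MAX (tmp, -128), 127).
   The addition is done in int, so it cannot overflow.  */
static int8_t sat_add_minmax (int8_t a, int8_t b)
{
  int tmp = a + b;
  return (int8_t) min_int (max_int (tmp, INT8_MIN), INT8_MAX);
}

/* Rounding right shift: add half of the divisor, then shift.
   Right-shifting a negative int is implementation-defined in ISO C;
   GCC performs an arithmetic shift.  */
static int8_t rnd_shift (int8_t x, unsigned shift)
{
  assert (shift >= 1 && shift <= 8);
  int round_const = 1 << (shift - 1);
  return (int8_t) ((x + round_const) >> shift);
}
```

With these definitions, sat_add_minmax (100, 100) yields 127 and rnd_shift (5, 1) yields 3.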
> >
> > We could get the right instructions by using combine if we don't rewrite
> > the instructions to an internal function, however then during vectorization
> > we would overestimate the cost of performing the saturation.  The constants
> > will then also be loaded into registers and so become a lot more difficult
> > to clean up solely in the backend.
> >
> > The one thing I am wondering about is whether we would need an internal function
> > for all operations supported, or if it should be modelled as an internal FN which
> > just "marks" the operation as rounding/saturating.  After all, the only difference
> > between a normal and saturating expression in RTL is the xx_truncate RTL surrounding
> > the expression.  Doing so would also mean that all targets that have saturating
> > instructions would automatically benefit from this.
>
> I might have misunderstood what you meant here, but the *_truncate
> RTL codes are true truncations: the operand has to be wider than the
> result.  Using this representation for general arithmetic is a problem
> if you're operating at the maximum size that the target supports natively.
> E.g. representing a 64-bit saturating addition as:
>
>   - extend to 128 bits
>   - do a 128-bit addition
>   - truncate to 64 bits
>
> is going to be hard to cost and code-generate on targets that don't support
> native 128-bit operations (or at least, don't support them cheaply).
> This might not be a problem when recognising C idioms, since the C source
> code has to be able to do the wider operation before truncating the result,
> but it could be a problem if we provide built-in functions or if we want
> to introduce compiler-generated saturating operations.
>
> RTL already has per-operation saturation such as ss_plus/us_plus,
> ss_minus/us_minus, ss_neg/us_neg, ss_mult/us_mult, ss_div,
> ss_ashift/us_ashift and ss_abs.  I think we should do the same
> in gimple, using internal functions like you say.
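A minimal sketch of the distinction drawn above, using hypothetical helper names: sat_add_i8_widen follows the widen, add, truncate shape of the C idiom, while sat_add_i64 saturates per operation and so never needs a 128-bit intermediate. Neither is meant as how GCC would expand ss_plus; they only show the two shapes in plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Widen, add, then do a saturating truncation back to 8 bits.
   This works because int is wider than int8_t.  */
static int8_t sat_add_i8_widen (int8_t a, int8_t b)
{
  int tmp = a + b;
  return (int8_t) (tmp > INT8_MAX ? INT8_MAX
                   : (tmp < INT8_MIN ? INT8_MIN : tmp));
}

/* Per-operation saturating add at the widest native size: detect the
   overflow before adding instead of widening to 128 bits.  */
static int64_t sat_add_i64 (int64_t a, int64_t b)
{
  if (b > 0 && a > INT64_MAX - b)
    return INT64_MAX;   /* would overflow upwards */
  if (b < 0 && a < INT64_MIN - b)
    return INT64_MIN;   /* would overflow downwards */
  return a + b;
}

/* Small self-check showing the expected behaviour of both helpers.  */
static void sat_add_examples (void)
{
  assert (sat_add_i8_widen (100, 100) == INT8_MAX);
  assert (sat_add_i64 (INT64_MAX, 1) == INT64_MAX);
  assert (sat_add_i64 (INT64_MIN, -1) == INT64_MIN);
  assert (sat_add_i64 (1, 2) == 3);
}
```

The 8-bit version only works because a wider type is cheaply available; the 64-bit version shows the kind of operation that a per-operation saturating code or internal function would represent directly.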

I think that for follow-up optimizations, keeping regular arithmetic ops
and adding just new saturating truncations is better.  Maybe we can
also do both: first match only the actual saturation with a new
tree code, and then later match the optabs the target actually supports
(in ISEL for example)?

Truly saturating ops might provide an interesting example of how to deal
with -ftrapv - one might think we can now simply use the trapping
optabs as internal functions to reflect -ftrapv onto the IL ...

Richard.

> Thanks,
> Richard
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)