From: Richard Sandiford
To: Segher Boessenkool
Cc: Tejas Joshi, gcc@gcc.gnu.org, Martin Jambor, hubicka@ucw.cz, joseph@codesourcery.com
Subject: Re: Expansion of narrowing math built-ins into power instructions
Date: Tue, 20 Aug 2019 12:59:00 -0000
References: <20190814210015.GJ31406@gate.crashing.org> <20190815184450.GO31406@gate.crashing.org> <20190819130720.GG31406@gate.crashing.org> <20190820121137.GP31406@gate.crashing.org>
In-Reply-To: <20190820121137.GP31406@gate.crashing.org> (Segher Boessenkool's message of "Tue, 20 Aug 2019 07:11:37 -0500")

Segher Boessenkool writes:
>> > And yes, various parts of GCC can
>> > manipulate RTL, doing substitution and
>> > algebraic simplification and whatnot.  All within the rules of RTL.  And
>> > that means nothing ever can "pass" a float_narrow, because there are no
>> > rules that allow it to.
>>
>> You mean create a new float_narrow out of thin air, with no justification?
>> Sure, but I don't think that was ever the issue.
>
> No.  I mean that if you have
>
>   ... (float_narrow:M (x:N))
>
> it will always stay in that form, with just x changed.  Nothing can
> change the float_narrow.

OK, I guessed wrong :-)  But it was the change to x that IMO was the
problem.  I wasn't worried about code changing the float_narrow itself
to random other stuff.

>>   [(set (match_operand:SI 0 "register_operand" "=d")
>>         (truncate:SI
>>          (lshiftrt:DI
>
> (this is optimised to a subreg, in many cases, for example).

Right.  MIPS avoids that one thanks to TARGET_TRULY_NOOP_TRUNCATION.

>> float_narrow is different in that the plus (or whatever operation
>> it's quoting) has to be kept in-place rather than folded away,
>> otherwise the rtx itself is malformed and could trigger an ICE,
>> just like the zero_extend of a const_int that I mentioned.
>
> Yes, it will not pass recog.  Structurally it is just hunky-dory though.

So maybe that's the main point of difference.  We're introducing
float_narrow to modify another rtx operation rather than to operate
on an rtx value.  So to me it makes no sense to say that:

  (float_narrow:SF (const_double:DF X))
  (float_narrow:SF (reg:DF X))
  (float_narrow:SF (mem:DF X))

are well-formed rtxes and just happen not to match any instructions.
Without an operation to modify they're meaningless on their own terms,
regardless of what the target says about it.  Just like:

  (unsigned_saturate:QI (reg:QI X))

would be meaningless if we modelled saturation this way.  There's no
way you can go from a normal unsaturated result to the equivalent
saturated result without knowing which operation was performed,
and on which operands.

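
To make the saturation point concrete, here is a small Python sketch.
It is purely illustrative and not GCC code: wrap_qi, saturate_qi, exact
and the operand values are all made up for the example.  Three different
unsigned 8-bit operations wrap to the same value, yet each has a
different saturated result, so nothing that sees only the wrapped value
could recover the saturated one:

```python
QI_MAX = 0xFF  # largest unsigned 8-bit value (illustrative stand-in for QImode)


def wrap_qi(value):
    """Normal (wrapping) unsigned 8-bit result."""
    return value & QI_MAX


def saturate_qi(value):
    """Unsigned-saturated 8-bit result: clamp to [0, QI_MAX]."""
    return max(0, min(QI_MAX, value))


def exact(op, a, b):
    """Infinitely precise result of the operation, before any wrapping."""
    return a + b if op == "plus" else a - b


# Hypothetical operations and operands, chosen so the wrapped results collide.
cases = [("plus", 200, 100), ("minus", 50, 6), ("minus", 40, 252)]

wrapped = [wrap_qi(exact(op, a, b)) for op, a, b in cases]
saturated = [saturate_qi(exact(op, a, b)) for op, a, b in cases]

for (op, a, b), w, s in zip(cases, wrapped, saturated):
    print(f"{op} {a}, {b}: wrapped={w} saturated={s}")
```

All three cases wrap to the same number, but the saturated results
differ, which is why the saturation has to be attached to the operation
and its operands rather than applied to the finished value.
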
This isn't a choice for targets to make even in principle, just like
it isn't for my favourite (zero_extend:m (const_int -1)) example.

>> > And you need many many more RTX codes, which you will not handle in
>> > almost all places, because there are too many.
>> >
>> >
>> > I agree this construct is not as nice as could be hoped for.  I don't
>> > agree that 60 new RTX codes is an acceptable solution (or that that will
>> > ever really work out, even).
>>
>> 60 sounds a high number. :-)  Do we really have that many rtx codes with
>> a floating-point rounding effect?
>
> It was meant to sound high, heh.  If things need a variant A, and also a
> variant B, then before you know it there is a variant A+B as well, and
> you have unbridled growth.
>
> plus minus neg mult div mod smin smax abs sqrt fma  I think?  And let's
> hope we never ever have to do saturating versions of FP :-)

neg, abs, smin and smax shouldn't do rounding AFAIK.  But yeah,
the rest look plausible.  That is only 7 though :-)  Unless I counted
wrong.

Not that I'm saying I like adding codes for each one either.  It just
doesn't seem that bad (and definitely better than float_narrow IMO).

>> Whatever the number is, we'll still be listing them individually for
>> built-in enumerations, internal_fn, and (I assume) optabs.  But maybe
>> after a certain point it does become too unwieldy for rtx codes.
>> We have to keep it within 16 bits at least...
>
> My main concern is all the (simplification) code that parses RTL.  All of
> that will have to handle all variant versions as well.

True, but we'd have to err on the side of caution whatever happens.
Not all existing PLUS simplifications necessarily apply as-is.

Thanks,
Richard

>> > It would be nice if somehow we could make a variant of RTL codes, so that
>> > we could have nice and simple code that applies to all variants of some
>> > code.  Not sure how that would work out.  Maybe we don't have to do this
>> > very generically, how often will we need this anyway?
>> >
>> > I have three examples so far:
>> > 1) Saturating arithmetic;
>> > 2) This float_narrow thing;
>> > 3) Ordered compares, that is, fp compares that set an exception on NaNs.
>> >
>> > Something that works for all three would be nice!
>>
>> Yeah, agree that sounds good.  Maybe we could bundle the code with some
>> flags.  Storage-wise, there should be room for that in the u2 field.
>>
>> But there might still be cases in which it's useful to view the code+flags
>> as a combined supercode, e.g. for switch statements.
>
> Yeah...  Whether to make "code" or "code+flags" the more usual version is
> the biggest design question then.  Oh, and what the rest of the interface
> to this looks like ;-)
>
>
> Segher
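
As a rough illustration of the code+flags idea discussed above, the
following Python sketch packs a base code and a set of variant flags
into one integer "supercode" that can be compared or dispatched on as a
single value.  Everything in it (Code, Variant, VARIANT_BITS, supercode,
split_supercode, simplify_plus_zero) is hypothetical and chosen for the
example; it is not GCC's rtx representation or any existing interface.

```python
from enum import IntEnum, IntFlag


class Code(IntEnum):
    """Hypothetical base operation codes."""
    PLUS = 1
    MINUS = 2
    MULT = 3


class Variant(IntFlag):
    """Hypothetical variant flags, one per example from the thread."""
    NONE = 0
    SATURATING = 1
    NARROWING = 2
    ORDERED = 4


VARIANT_BITS = 8  # assumed width of the flag field, for illustration only
VARIANT_MASK = (1 << VARIANT_BITS) - 1


def supercode(code, variant=Variant.NONE):
    """Combine code and flags into one integer usable as a switch key."""
    return (int(code) << VARIANT_BITS) | int(variant)


def split_supercode(value):
    """Recover the (code, flags) pair from a combined supercode."""
    return Code(value >> VARIANT_BITS), Variant(value & VARIANT_MASK)


def simplify_plus_zero(value, operands):
    """Fold x + 0 to x, but only for the plain, flag-free PLUS.

    Returns the simplified operand, or None when no simplification
    is known to be safe for this code+flags combination.
    """
    code, variant = split_supercode(value)
    if code == Code.PLUS and variant == Variant.NONE and operands[1] == 0:
        return operands[0]
    return None


plain = simplify_plus_zero(supercode(Code.PLUS), ("x", 0))
narrowed = simplify_plus_zero(supercode(Code.PLUS, Variant.NARROWING), ("x", 0))
print(plain, narrowed)
```

The simplifier here deliberately refuses anything with a flag set, in
the spirit of erring on the side of caution: a narrowing x + 0, for
instance, would still have to round x to the narrower format, so the
plain PLUS rule does not carry over unchanged.
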