From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 52695 invoked by alias); 25 Aug 2019 13:55:45 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 52680 invoked by uid 89); 25 Aug 2019 13:55:45 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-20.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=HX-Spam-Relays-External:209.85.208.66, H*RU:209.85.208.66, HX-Received:906 X-HELO: mail-ed1-f66.google.com Received: from mail-ed1-f66.google.com (HELO mail-ed1-f66.google.com) (209.85.208.66) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 25 Aug 2019 13:55:43 +0000 Received: by mail-ed1-f66.google.com with SMTP id x19so22314526eda.12 for ; Sun, 25 Aug 2019 06:55:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=0DuxHCZTedrevQBapcEShcyvv4UyJ6OyRNIH6bAbDkQ=; b=siKzo5g7WuIKj2/ieaYcaYcCYTWPczrh7YexflJpvP5LEHdf/Otwi0KzBu5FkV/EP6 ne4PnCrK12xjv4qZognHsl2/vi+wZVCJiTcdIfb+0RlmynP/rc+CNxXSXYecaM8AqQ+C JsCvgaxCaDkbP6hUm1iY8hUVoNnJpa9bCRy/vS22BvusLKGPU5mXR5V1sAGhL8mvryY6 7bjj5WC5+vip10bD3gPh0AXnz5614dCq77tN3EoRQQBE98854fOGV72LBKI/wg//Fe2P Jm0S+ja8IEtKo0+nXFo9QaBzVilcCI8wiyGG1tIGDcBEIiSWgKXvaAy3d/gtp1PLAuxS yowg== MIME-Version: 1.0 References: <20190820134613.GR31406@gate.crashing.org> <20190820194154.GY31406@gate.crashing.org> <20190821182852.GG31406@gate.crashing.org> <20190821191722.GH31406@gate.crashing.org> <20190822062503.GI31406@gate.crashing.org> <20190822095620.GK31406@gate.crashing.org> In-Reply-To: From: Tejas Joshi Date: Sun, 25 Aug 2019 13:55:00 -0000 Message-ID: Subject: Re: Expansion of narrowing math built-ins into power instructions To: gcc@gcc.gnu.org Cc: Martin Jambor , hubicka@ucw.cz, segher@kernel.crashing.org, joseph@codesourcery.com Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2019-08/txt/msg00211.txt.bz2 Hello. > > Similarly addtfsf3 that multiplies TFmode and produces an SFmode result, and so on. I want to extend this patch for FADDL and DADDL. What operand constraints should I use for TFmode alongside "f"? > In cases where long double and double have the same mode, >the daddl function should use the existing adddf3 pattern. So, should I use adddf3 for DADDL directly? How would I map the add3 optab with DADDL? Thanks, Tejas On Sat, 24 Aug 2019 at 15:23, Richard Sandiford wrote: > > Martin Jambor writes: > > Hello, > > > > On Thu, Aug 22 2019, Segher Boessenkool wrote: > >>> > Hi Tejas, > >>> > > >>> > [ Please do not top-post. ] > >> > >> On Thu, Aug 22, 2019 at 01:27:06PM +0530, Tejas Joshi wrote: > >>> > What happens then? "It does not work" is very very vague. At least it > >>> > seems the compiler does build now? > >>> > >>> Oh, compiler builds but instruction is still "bl fadd". It should be > >>> "fadds" right? > >> > >> Yes, but that means the problem is earlier, before it hits RTL perhaps. > >> > >> Compile with -dap, look at the expand dump (the lowest numbered one, 234 > >> or so), and see what it looked like in the final Gimple, and then in the > >> RTL generated from that. And then drill down. > >> > > > > Tejas sent me his patch and I looked at why it did not work. I found > > two reasons: > > > > 1. associated_internal_fn (in builtins.c) does not handle > > DEF_INTERNAL_OPTAB_FN kind of internal functions, and Tejas > > (sensibly, I'd say) used that macro to define the internal function. > > But when I worked around that by manually adding a case for it in the > > switch statement, I ran into an assert because... > > > > 2. direct_internal_fn_supported_p on which replacement_internal_fn > > depends to expand built-ins as internal functions cannot handle > > conversion optabs... and narrowing is a kind of conversion and the > > optab is added as such with OPTAB_CD. > > > > Actually, the second statement is not entirely true because somehow it > > can handle optab while_ult which is a conversion optab but a) the way it > > is handled, if I can understand it at all, seems to be a big hack and > > would be even worse if we decided to copy that for all narrowing math > > functions > > Think "big hack" is a bit unfair. The way that the internal function > maps argument types to the optab modes, and the way it expands calls > into rtl, depends on the "optab type" argument (the final argument to > DEF_INTERNAL_OPTAB_FN). This is relatively flexible in that it can use > a single-mode "direct" optab or a dual-mode "conversion" optab, with the > modes coming from whichever arguments are appropriate. New optab types > can be added as needed. > > FWIW, several other DEF_INTERNAL_OPTAB_FNs are conversion optabs too > (e.g. IFN_LOAD_LANES, IFN_STORE_LANES, IFN_MASK_LOAD, etc.). > > But... > > > and b) it gets both modes from argument types whereas we need one from > > the result type and so we would have to rewrite > > replacement_internal_fn anyway. > > ...yeah, I agree this breaks the current model. The reason IFN_WHILE_ULT > doesn't rely on the return type is that if you have: > > _2 = .WHILE_ULT (_0, _1) // returning a vector of 4 booleans > _3 = .WHILE_ULT (_0, _1) // returning a vector of 8 booleans > > then the calls look equivalent. So instead we pass an extra argument > indicating the required boolean vector "shape". > > The same "problem" could in principle apply to FADD if we ever needed > to support double+double->_Float16 for example. > > > Therefore, at least for now (GSoC deadline is kind of looming), I > > decided that the best way forward would be to not rely on internal > > functions but plug into expand_builtin() and I wrote the following, > > lightly tested patch - which of course misses testcases and stuff - but > > I'd be curious about any feedback now anyway. When I proposed a very > > similar approach for the roundeven x86_64 expansion, Uros actually then > > opted for a solution based on internal functions, so I am curious > > whether there are simple alternatives I do not see. > > > > Tejas, of course cases for other fadd variants should at least be added > > to expand_builtin. > > > > Thanks, > > > > Martin > > > > > > 2019-08-23 Tejas Joshi > > Martin Jambor > > > > * builtins.c (expand_builtin_binary_conversion): New function. > > (expand_builtin): Call it. > > * config/rs6000/rs6000.md (unspec): Add UNSPEC_ADD_NARROWING. > > (add_truncdfsf3): New define_insn. > > * optabs.def (fadd_optab): New. > > > > [...] > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > > index 9461693bcd1..3f56880c23f 100644 > > --- a/gcc/internal-fn.def > > +++ b/gcc/internal-fn.def > > @@ -140,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while) > > DEF_INTERNAL_OPTAB_FN (VEC_SHL_INSERT, ECF_CONST | ECF_NOTHROW, > > vec_shl_insert, binary) > > > > +DEF_INTERNAL_OPTAB_FN (FADD, ECF_CONST, fadd, binary) > > + > > DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary) > > DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary) > > DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary) > > Should be dropped now. > > OK with that change and the ones Segher asked for. > > Thanks, > Richard