From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F1F5B3858D33 for ; Wed, 3 May 2023 19:07:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F1F5B3858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E5FB52F4; Wed, 3 May 2023 12:08:11 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 00A253F64C; Wed, 3 May 2023 12:07:26 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,"Andre Vieira \(lists\)" , Richard Biener , "gcc-patches\@gcc.gnu.org" , richard.sandiford@arm.com Cc: "Andre Vieira \(lists\)" , Richard Biener , "gcc-patches\@gcc.gnu.org" Subject: Re: [PATCH 2/3] Refactor widen_plus as internal_fn References: <51ce8969-3130-452e-092e-f9d91eff2dad@arm.com> Date: Wed, 03 May 2023 20:07:25 +0100 In-Reply-To: (Richard Biener's message of "Wed, 3 May 2023 12:11:42 +0000 (UTC)") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-24.2 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Richard Biener writes: > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > >> This patch replaces the existing tree_code widen_plus and widen_minus >> patterns with internal_fn versions. >> >> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides >> convenience wrappers for defining conversions that require a hi/lo split, like >> widening and narrowing operations. Each definition for will require an >> optab named and two other optabs that you specify for signed and >> unsigned. The hi/lo pair is necessary because the widening operations take n >> narrow elements as inputs and return n/2 wide elements as outputs. The 'lo' >> operation operates on the first n/2 elements of input. The 'hi' operation >> operates on the second n/2 elements of input. Defining an internal_fn along >> with hi/lo variations allows a single internal function to be returned from a >> vect_recog function that will later be expanded to hi/lo. >> >> DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening >> internal_fn. It is defined differently in different places and internal-fn.def >> is sourced from those places so the parameters given can be reused. >> internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later >> defined to generate the 'expand_' functions for the hi/lo versions of the fn. >> internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original >> and hi/lo variants of the internal_fn >> >> For example: >> IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO >> for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_addl_hi_ -> >> (u/s)addl2 >> IFN_VEC_WIDEN_PLUS_LO -> vec_widen_addl_lo_ >> -> (u/s)addl >> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI. > > I'll note that it's interesting we have widen multiplication as > the only existing example where we have both HI/LO and EVEN/ODD cases. > I think we want to share as much of the infrastructure to eventually > support targets doing even/odd (I guess all VLA vector targets will > be even/odd?). Can't speak for all, but SVE2 certainly is. > DEF_INTERNAL_OPTAB_HILO_FN also looks to be implicitely directed to > widening operations (otherwise no signed/unsigned variants would be > necessary). What I don't understand is why we need an optab > without _hi/_lo but in that case no signed/unsigned variant? > > Looks like all plus, plus_lo and plus_hi are commutative but > only plus is widening?! So is the setup that the vectorizer > doesn't know about the split and uses 'plus' but then the > expander performs the split? It does look a bit awkward here > (the plain 'plus' is just used for the scalar case during > pattern recog it seems). > > I'd rather have DEF_INTERNAL_OPTAB_HILO_FN split up, declaring > the hi/lo pairs and the scalar variant separately using > DEF_INTERNAL_FN without expander for that, and having > DEF_INTERNAL_HILO_WIDEN_OPTAB_FN and DEF_INTERNAL_EVENODD_WIDEN_OPTAB_FN > for the signed/unsigned pairs? (if we need that helper at all) > > Targets shouldn't need to implement the plain optab (it shouldn't > exist) and the vectorizer should query the hi/lo or even/odd > optabs for support instead. I dread these kinds of review because I think I'm almost certain to flatly contradict something I said last time round, but +1 FWIW. It seems OK to define an ifn to represent the combined effect, for the scalar case, but that shouldn't leak into optabs unless we actually want to use the ifn for "real" scalar ops (as opposed to a temporary placeholder during pattern recognition). On the optabs/ifn bits: > +static int > +ifn_cmp (const void *a_, const void *b_) > +{ > + typedef std::pair ifn_pair; > + auto *a = (const std::pair *)a_; > + auto *b = (const std::pair *)b_; > + return (int) (a->first.first) - (b->first.first); > +} > + > +/* Return the optab belonging to the given internal function NAME for the given > + SIGN or unknown_optab. */ > + > +optab > +lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign) There is no NAME parameter. It also isn't clear what SIGN means: is 1 for unsigned or signed? Would be better to use signop and TYPE_SIGN IMO. > +{ > + typedef std::pair ifn_pair; > + typedef auto_vec >fn_to_optab_map_type; > + static fn_to_optab_map_type *fn_to_optab_map; > + > + if (!fn_to_optab_map) > + { > + unsigned num > + = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn); > + fn_to_optab_map = new fn_to_optab_map_type (); > + for (unsigned int i = 0; i < num - 1; ++i) > + { > + enum internal_fn fn = internal_fn_hilo_keys_array[i]; > + optab v1 = internal_fn_hilo_values_array[2*i]; > + optab v2 = internal_fn_hilo_values_array[2*i + 1]; > + ifn_pair key1 (fn, 0); > + fn_to_optab_map->safe_push ({key1, v1}); > + ifn_pair key2 (fn, 1); > + fn_to_optab_map->safe_push ({key2, v2}); > + } > + fn_to_optab_map->qsort (ifn_cmp); > + } > + > + ifn_pair new_pair (fn, sign ? 1 : 0); > + optab tmp; > + std::pair pair_wrap (new_pair, tmp); > + auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp); > + return entry != fn_to_optab_map->end () ? entry->second : unknown_optab; > +} > + Do we need to use a map for this? It seems like it follows mechanically from the macro definition and could be handled using a switch statement and preprocessor logic. Also, it would be good to make direct_internal_fn_optab DTRT for this case, rather than needing a separate function. > +extern void > +lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo, > + enum internal_fn *hi) > +{ > + gcc_assert (decomposes_to_hilo_fn_p (ifn)); > + > + *lo = internal_fn (ifn + 1); > + *hi = internal_fn (ifn + 2); > +} Nit: spurious extern. Function needs a comment. There have been requests to drop redundant "enum" keywords from new code. > +/* Return true if FN decomposes to _hi and _lo IFN. If true this should also > + be a widening function. */ > + > +bool > +decomposes_to_hilo_fn_p (internal_fn fn) > +{ > + if (!widening_fn_p (fn)) > + return false; > + > + switch (fn) > + { > + case IFN_VEC_WIDEN_PLUS: > + case IFN_VEC_WIDEN_MINUS: > + return true; > + > + default: > + return false; > + } > +} > + Similarly here I think we should use the preprocessor. It isn't clear why this returns false for !widening_fn_p. Narrowing hi/lo functions would decompose in a similar way. As a general comment, how about naming the new macro: DEF_INTERNAL_SIGNED_HILO_OPTAB_FN and make it invoke DEF_INTERNAL_SIGNED_OPTAB_FN twice, once for the hi and once for the lo? The new optabs need to be documented in md.texi. I think it'd be better to drop the "l" suffix in "addl" and "subl", since that's an Arm convention and is redundant with the earlier "widen". Sorry for the nitpicks and thanks for picking up this work. Richard