Date: Mon, 15 May 2023 10:20:54 +0000 (UTC)
From: Richard Biener
To: Richard Sandiford
cc: "Andre Vieira (lists)", Richard Biener, "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH 2/3] Refactor widen_plus as internal_fn
References: <51ce8969-3130-452e-092e-f9d91eff2dad@arm.com> <4df83136-82f9-de0b-7e66-007b9047174d@arm.com>

On Fri, 12 May 2023, Richard Sandiford wrote:

> Richard Biener writes:
>
> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >
> >> I have dealt with, I think..., most of your comments. There's quite a few
> >> changes, I think it's all a bit simpler now. I made some other changes to the
> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> >> the same behaviour as we had with the tree codes before. Also added some extra
> >> checks to tree-cfg.cc that made sense to me.
> >>
> >> I am still regression testing the gimple-range-op change, as that was a last
> >> minute change, but the rest survived a bootstrap and regression test on
> >> aarch64-unknown-linux-gnu.
> >>
> >> cover letter:
> >>
> >> This patch replaces the existing tree_code widen_plus and widen_minus
> >> patterns with internal_fn versions.
> >>
> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
> >> except they provide convenience wrappers for defining conversions that require
> >> a hi/lo split. Each definition for will require optabs for _hi and _lo
> >> and each of those will also require a signed and unsigned version in the case
> >> of widening. The hi/lo pair is necessary because the widening and narrowing
> >> operations take n narrow elements as inputs and return n/2 wide elements as
> >> outputs.
> >> The 'lo' operation operates on the first n/2 elements of input. The
> >> 'hi' operation operates on the second n/2 elements of input. Defining an
> >> internal_fn along with hi/lo variations allows a single internal function to
> >> be returned from a vect_recog function that will later be expanded to hi/lo.
> >>
> >>
> >> For example:
> >>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >>   for aarch64: IFN_VEC_WIDEN_PLUS_HI -> vec_widen_add_hi_ -> (u/s)addl2
> >>                IFN_VEC_WIDEN_PLUS_LO -> vec_widen_add_lo_ -> (u/s)addl
> >>
> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >
> > What I still don't understand is how we are so narrowly focused on
> > HI/LO? We need a combined scalar IFN for pattern selection (not
> > sure why that's now called _HILO, I expected no suffix). Then there's
> > three possibilities the target can implement this:
> >
> > 1) with a widen_[su]add instruction - I _think_ that's what
> > RISCV is going to offer since it is a target where vector modes
> > have "padding" (aka you cannot subreg a V2SI to get V4HI). Instead
> > RVV can do a V4HI to V4SI widening and widening add/subtract
> > using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> > done with a widening add of zero - eh).
> > IIRC GCN is the same here.
>
> SVE currently does this too, but the addition and widening are
> separate operations. E.g. in principle there's no reason why
> you can't sign-extend one operand, zero-extend the other, and
> then add the result together. Or you could extend them from
> different sizes (QI and HI). All of those are supported
> (if the costing allows them).

I see. So why does the target expose widen_[su]add at all?

> If the target has operations to do combined extending and adding (or
> whatever), then at the moment we rely on combine to generate them.
>
> So I think this case is separate from Andre's work. The addition
> itself is just an ordinary addition, and any widening happens by
> vectorising a CONVERT/NOP_EXPR.
>
> > 2) with a widen_[su]add{_lo,_hi} combo - that's what the tree
> > codes currently support (exclusively)
> > 3) similar, but widen_[su]add{_even,_odd}
> >
> > that said, things like decomposes_to_hilo_fn_p look to paint us into
> > a 2) corner without good reason.
>
> I suppose one question is: how much of the patch is really specific
> to HI/LO, and how much is just grouping two halves together?

Yep, that I don't know for sure.

> The nice
> thing about the internal-fn grouping macros is that, if (3) is
> implemented in future, the structure will strongly encourage even/odd
> pairs to be supported for all operations that support hi/lo. That is,
> I would expect the grouping macros to be extended to define even/odd
> ifns alongside hi/lo ones, rather than adding separate definitions
> for even/odd functions.
>
> If so, at least from the internal-fn.* side of things, I think the question
> is whether it's OK to stick with hilo names for now, or whether we should
> use more forward-looking names.

I think for parts that are independent we could use a more
forward-looking name. Maybe _halves? But I'm also not sure
how much of that is really needed (it seems to be tied around
optimizing optabs space?)

Richard.

> Thanks,
> Richard
>
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >>
> >> 2023-05-12 Andre Vieira
> >>            Joel Hutton
> >>            Tamar Christina
> >>
> >> * config/aarch64/aarch64-simd.md (vec_widen_addl_lo_): Rename
> >>   this ...
> >>   (vec_widen_add_lo_): ... to this.
> >>   (vec_widen_addl_hi_): Rename this ...
> >>   (vec_widen_add_hi_): ... to this.
> >>   (vec_widen_subl_lo_): Rename this ...
> >>   (vec_widen_sub_lo_): ... to this.
> >>   (vec_widen_subl_hi_): Rename this ...
> >>   (vec_widen_sub_hi_): ... to this.
> >> * doc/generic.texi: Document new IFN codes.
> >> * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to define an
> >>   internal_fn that expands into multiple internal_fns for widening.
> >>   (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> >>   (ifn_cmp): Function to compare ifn's for sorting/searching.
> >>   (lookup_hilo_internal_fn): Add lookup function.
> >>   (commutative_binary_fn_p): Add widen_plus fn's.
> >>   (widening_fn_p): New function.
> >>   (narrowing_fn_p): New function.
> >>   (decomposes_to_hilo_fn_p): New function.
> >>   (direct_internal_fn_optab): Change visibility.
> >> * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define widening
> >>   plus, minus functions.
> >>   (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> >>   (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> >> * internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> >>   (direct_internal_fn_optab): Declare new prototype.
> >>   (lookup_hilo_internal_fn): Likewise.
> >>   (widening_fn_p): Likewise.
> >>   (narrowing_fn_p): Likewise.
> >>   (decomposes_to_hilo_fn_p): Likewise.
> >> * optabs.cc (commutative_optab_p): Add widening plus optabs.
> >> * optabs.def (OPTAB_D): Define widen add, sub optabs.
> >> * tree-cfg.cc (verify_gimple_call): Add checks for new widen
> >>   add and sub IFNs.
> >> * tree-inline.cc (estimate_num_insns): Return same
> >>   cost for widen add and sub IFNs as previous tree_codes.
> >> * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> >>   patterns with a hi/lo split.
> >>   (vect_recog_sad_pattern): Refactor to use new IFN codes.
> >>   (vect_recog_widen_plus_pattern): Likewise.
> >>   (vect_recog_widen_minus_pattern): Likewise.
> >>   (vect_recog_average_pattern): Likewise.
> >> * tree-vect-stmts.cc (vectorizable_conversion): Add support for
> >>   _HILO IFNs.
> >>   (supportable_widening_operation): Likewise.
> >> * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.target/aarch64/vect-widen-add.c: Test that new
> >>   IFN_VEC_WIDEN_PLUS is being used.
> >> * gcc.target/aarch64/vect-widen-sub.c: Test that new
> >>   IFN_VEC_WIDEN_MINUS is being used.
> >>
>

--
Richard Biener
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)