From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 49584 invoked by alias); 9 Aug 2018 21:07:31 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 49574 invoked by uid 89); 9 Aug 2018 21:07:30 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-27.0 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.156.1) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 09 Aug 2018 21:07:28 +0000 Received: from pps.filterd (m0098394.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w79KwUCe094768 for ; Thu, 9 Aug 2018 17:07:26 -0400 Received: from e11.ny.us.ibm.com (e11.ny.us.ibm.com [129.33.205.201]) by mx0a-001b2d01.pphosted.com with ESMTP id 2krtxen4uh-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 09 Aug 2018 17:07:26 -0400 Received: from localhost by e11.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 9 Aug 2018 17:07:24 -0400 Received: from b01cxnp22036.gho.pok.ibm.com (9.57.198.26) by e11.ny.us.ibm.com (146.89.104.198) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Thu, 9 Aug 2018 17:07:22 -0400 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp22036.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w79L7KLS9765146 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 9 Aug 2018 21:07:21 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 089D3AC060; Thu, 9 Aug 2018 17:07:49 -0400 (EDT) Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A0DA7AC059; Thu, 9 Aug 2018 17:07:48 -0400 (EDT) Received: from [9.10.86.107] (unknown [9.10.86.107]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTP; Thu, 9 Aug 2018 17:07:48 -0400 (EDT) Subject: Re: [PATCH, RFC, rs6000] enable GIMPLE folding of vec_splat From: Will Schmidt Reply-To: will_schmidt@vnet.ibm.com To: Bill Schmidt Cc: Richard Biener , Segher Boessenkool , Bill Schmidt , David Edelsohn , GCC Patches In-Reply-To: <6E9306D6-B8A1-42F6-90E9-0E6CC7AC66FA@linux.ibm.com> References: <1533669927.4452.14.camel@brimstone.rchland.ibm.com> <6E9306D6-B8A1-42F6-90E9-0E6CC7AC66FA@linux.ibm.com> Content-Type: text/plain; charset="UTF-8" Date: Thu, 09 Aug 2018 21:07:00 -0000 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit x-cbid: 18080921-2213-0000-0000-000002D85793 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00009515; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000266; SDB=6.01071416; UDB=6.00551727; IPR=6.00851116; MB=3.00022619; MTD=3.00000008; XFM=3.00000015; UTC=2018-08-09 21:07:23 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080921-2214-0000-0000-00005B2540F9 Message-Id: <1533848840.4452.36.camel@brimstone.rchland.ibm.com> X-IsSubscribed: yes X-SW-Source: 2018-08/txt/msg00653.txt.bz2 On Thu, 2018-08-09 at 08:53 -0500, Bill Schmidt wrote: > > On Aug 7, 2018, at 2:25 PM, Will Schmidt wrote: > > > > Hi > > Enable GIMPLE folding of the vec_splat() intrinsic. > > > > For review.. feedback is expected. :-) > > > > I came up with the following after spending some time poking around > > at the tree_vec_extract() and vector_element() functions as seen > > in tree-vect-generic.c looking for insights. Some of this seems a > > bit clunky to me yet, but this is functional as far as make-check > > can tell, and is as far as I can get without additional input. > > > > This uses the tree_vec_extract() function out of tree-vect-generic.c > > to retrieve the splat value, which is a BIT_FIELD_REF. That function is > > made non-static as part of this change. > > > > In review of the .gimple output, this folding takes a sample testcase > > of > > vector bool int testb_0 (vector bool int x) > > { > > return vec_splat (x, 0b00000); > > } > > from: > > testb_0 (__vector __bool int x) > > { > > __vector __bool intD.1486 D.2855; > > > > _1 = VIEW_CONVERT_EXPR<__vector signed intD.1468>(xD.2778); > > _2 = __builtin_altivec_vspltwD.1919 (_1, 0); > > D.2855 = VIEW_CONVERT_EXPR<__vector __bool intD.1486>(_2); > > return D.2855; > > } > > to: > > testb_0 (__vector __bool int x) > > { > > __vector __bool intD.1486 D.2855; > > > > _1 = VIEW_CONVERT_EXPR<__vector signed intD.1468>(xD.2778); > > D.2856 = BIT_FIELD_REF <_1, 32, 0>; > > _2 = {D.2856, D.2856, D.2856, D.2856}; > > D.2855 = VIEW_CONVERT_EXPR<__vector __bool intD.1486>(_2); > > return D.2855; > > } > > > This looks okay. > > > > Testcases are being posted as a separate patch. > > > > OK for trunk? > > Thanks, > > -Will > > > > [gcc] > > > > 2018-08-07 Will Schmidt > > > > * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for > > early gimple folding of vec_splat(). > > * tree-vect-generic.c: Remove static from tree_vec_extract() definition. > > * gimple-fold.h: Add an extern define for tree_vec_extract(). > > > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > > index 35c32be..acc6b49 100644 > > --- a/gcc/config/rs6000/rs6000.c > > +++ b/gcc/config/rs6000/rs6000.c > > @@ -15764,10 +15764,56 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > > tree splat_tree = build_vector_from_val (TREE_TYPE (lhs), splat_value); > > g = gimple_build_assign (lhs, splat_tree); > > gimple_set_location (g, gimple_location (stmt)); > > gsi_replace (gsi, g, true); > > return true; > > + } > > + > > + /* Flavors of vec_splat. */ > > + // a = vec_splat (b, 0x3) ; // a = { b[3],b[3],b[3],...}; > > + case ALTIVEC_BUILTIN_VSPLTB: > > + case ALTIVEC_BUILTIN_VSPLTH: > > + case ALTIVEC_BUILTIN_VSPLTW: > > + case VSX_BUILTIN_XXSPLTD_V2DI: > > + case VSX_BUILTIN_XXSPLTD_V2DF: > > + { > > + arg0 = gimple_call_arg (stmt, 0); /* input vector. */ > > + arg1 = gimple_call_arg (stmt, 1); /* index into arg0. */ > > + /* Only fold the vec_splat_*() if arg1 is a constant > > + 5-bit unsigned literal. */ > > + if (TREE_CODE (arg1) != INTEGER_CST || TREE_INT_CST_LOW (arg1) > 0x1f) > > + return false; > > + > > + lhs = gimple_call_lhs (stmt); > > + tree lhs_type = TREE_TYPE (lhs); > > + > > + tree splat; > > + if (TREE_CODE (arg0) == VECTOR_CST) > > + splat = VECTOR_CST_ELT (arg0, TREE_INT_CST_LOW (arg1)); > > You should force arg1 into range before this. It should be adjusted modulo > VECTOR_CST_NELTS (arg0). Yes, the underlying vsplt* instructions will OK. > handle the modulo itself, but when expanding early we should make the > correction early, else something is likely to explode. Do you have a test > where arg1 is less than 32 but greater than the number of elements? Almost. :-) I test for arg1 < 32 and > #elms, but not with the arg0-is-constant scenario. I'll craft up a couple more testcases for that. > > > + else > > + { > > + /* Determine (in bits) the length and start location of the > > + splat value for a call to the tree_vec_extract helper. */ > > + int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (lhs_type)) > > + * BITS_PER_UNIT; > > I expect you should use arg0's type rather than lhs_type here. They > should be the same, of course; it's just clearer. ok. > > > + int splat_elem_size = tree_size_in_bits / VECTOR_CST_NELTS (arg0); > > + int splat_start_bit = TREE_INT_CST_LOW (arg1) * splat_elem_size; > > + /* Do not attempt to early-fold if the size + specified offset into > > + the vector would touch outside of the source vector. */ > > + if ((splat_start_bit + splat_elem_size) > tree_size_in_bits) > > + return false; > > This won't be necessary once you fix the modulo issue. ok. > The rest LGTM. thanks for the review and feedback. :-) -Will > > Thanks, > Bill > > > + tree len = build_int_cst (bitsizetype, splat_elem_size); > > + tree start = build_int_cst (bitsizetype, splat_start_bit); > > + splat = tree_vec_extract (gsi, TREE_TYPE (lhs_type), arg0, > > + len, start); > > + } > > + /* And finally, build the new vector. */ > > + tree splat_tree = build_vector_from_val (lhs_type, splat); > > + g = gimple_build_assign (lhs, splat_tree); > > + gimple_set_location (g, gimple_location (stmt)); > > + gsi_replace (gsi, g, true); > > + return true; > > } > > > > /* vec_mergel (integrals). */ > > case ALTIVEC_BUILTIN_VMRGLH: > > case ALTIVEC_BUILTIN_VMRGLW: > > diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h > > index 04e9bfa..e634180 100644 > > --- a/gcc/gimple-fold.h > > +++ b/gcc/gimple-fold.h > > @@ -59,10 +59,11 @@ extern tree gimple_fold_indirect_ref (tree); > > extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *); > > extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *); > > extern bool arith_code_with_undefined_signed_overflow (tree_code); > > extern gimple_seq rewrite_to_defined_overflow (gimple *); > > extern void replace_call_with_value (gimple_stmt_iterator *, tree); > > +extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, tree); > > > > /* gimple_build, functionally matching fold_buildN, outputs stmts > > int the provided sequence, matching and simplifying them on-the-fly. > > Supposed to replace force_gimple_operand (fold_buildN (...), ...). */ > > extern tree gimple_build (gimple_seq *, location_t, > > diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c > > index 909f790..1c9701d 100644 > > --- a/gcc/tree-vect-generic.c > > +++ b/gcc/tree-vect-generic.c > > @@ -118,11 +118,11 @@ build_word_mode_vector_type (int nunits) > > > > typedef tree (*elem_op_func) (gimple_stmt_iterator *, > > tree, tree, tree, tree, tree, enum tree_code, > > tree); > > > > -static inline tree > > +tree > > tree_vec_extract (gimple_stmt_iterator *gsi, tree type, > > tree t, tree bitsize, tree bitpos) > > { > > if (TREE_CODE (t) == SSA_NAME) > > { > > > > >