From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
In-Reply-To: <871s5spp6z.fsf@arm.com>
From: Richard Biener
Date: Mon, 07 Jan 2019 11:35:00 -0000
Subject: Re: [1/2] PR88598: Optimise x * { 0 or 1, 0 or 1, ...
 }
To: Jakub Jelinek, Eric Botcazou, GCC Patches, Richard Sandiford

On Fri, Jan 4, 2019 at 1:44 PM Richard Sandiford wrote:
>
> Jakub Jelinek writes:
> > On Fri, Jan 04, 2019 at 12:13:13PM +0000, Richard Sandiford wrote:
> >> > Can we avoid the gratuitous use of template here?  We were told that C++ would
> >> > be used only when it makes things more straightforward and it's the contrary
> >> > in this case, to wit the need for the ugly RECURSE macro in the middle.
> >>
> >> I did it that way so that it would be easy to add things like
> >> zero_or_minus_onep without cut-&-pasting the whole structure.
> >
> > IMHO we can make such a change only when it is needed.
>
> The other predicates in tree.c suggest that we won't though.
> E.g. there was never any attempt to unify integer_zerop vs. integer_onep
> and real_zerop vs. real_onep.
>
> >> The way to do that in C would be to use a macro for the full
> >> function, but that's even uglier due to the extra backslashes.
> >
> > Or just make the function static inline and pass the function pointers
> > to it as arguments?  If it is inlined, it will be the same, it could be
> > even always_inline if that is really needed.
>
> For that to work for recursive functions I think we'd need to pass the
> caller predicate in too, which means one more function pointer overall.
>
> Anyway, here's the patch without the template.

OK.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> 2019-01-04  Richard Sandiford
>
> gcc/
>         PR tree-optimization/88598
>         * tree.h (initializer_each_zero_or_onep): Declare.
>         * tree.c (initializer_each_zero_or_onep): New function.
>         (signed_or_unsigned_type_for): Handle float types too.
>         (unsigned_type_for, signed_type_for): Update comments accordingly.
>         * match.pd: Fold x * { 0 or 1, 0 or 1, ...} to
>         x & { 0 or -1, 0 or -1, ... }
>
> gcc/testsuite/
>         PR tree-optimization/88598
>         * gcc.dg/pr88598-1.c: New test.
>         * gcc.dg/pr88598-2.c: Likewise.
>         * gcc.dg/pr88598-3.c: Likewise.
>         * gcc.dg/pr88598-4.c: Likewise.
>         * gcc.dg/pr88598-5.c: Likewise.
>
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h 2019-01-04 12:40:51.000000000 +0000
> +++ gcc/tree.h 2019-01-04 12:40:51.990582844 +0000
> @@ -4506,6 +4506,7 @@ extern tree first_field (const_tree);
>     combinations indicate definitive answers.  */
>
>  extern bool initializer_zerop (const_tree, bool * = NULL);
> +extern bool initializer_each_zero_or_onep (const_tree);
>
>  extern wide_int vector_cst_int_elt (const_tree, unsigned int);
>  extern tree vector_cst_elt (const_tree, unsigned int);
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c 2019-01-04 12:40:51.000000000 +0000
> +++ gcc/tree.c 2019-01-04 12:40:51.990582844 +0000
> @@ -11229,6 +11229,45 @@ initializer_zerop (const_tree init, bool
>      }
>  }
>
> +/* Return true if EXPR is an initializer expression in which every element
> +   is a constant that is numerically equal to 0 or 1.  The elements do not
> +   need to be equal to each other.  */
> +
> +bool
> +initializer_each_zero_or_onep (const_tree expr)
> +{
> +  STRIP_ANY_LOCATION_WRAPPER (expr);
> +
> +  switch (TREE_CODE (expr))
> +    {
> +    case INTEGER_CST:
> +      return integer_zerop (expr) || integer_onep (expr);
> +
> +    case REAL_CST:
> +      return real_zerop (expr) || real_onep (expr);
> +
> +    case VECTOR_CST:
> +      {
> +        unsigned HOST_WIDE_INT nelts = vector_cst_encoded_nelts (expr);
> +        if (VECTOR_CST_STEPPED_P (expr)
> +            && !TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)).is_constant (&nelts))
> +          return false;
> +
> +        for (unsigned int i = 0; i < nelts; ++i)
> +          {
> +            tree elt = VECTOR_CST_ENCODED_ELT (expr, i);
> +            if (!initializer_each_zero_or_onep (elt))
> +              return false;
> +          }
> +
> +        return true;
> +      }
> +
> +    default:
> +      return false;
> +    }
> +}
> +
>  /* Check if vector VEC consists of all the equal elements and
>     that the number of elements corresponds to the type of VEC.
>     The function returns first element of the vector
> @@ -11672,7 +11711,10 @@ int_cst_value (const_tree x)
>
>  /* If TYPE is an integral or pointer type, return an integer type with
>     the same precision which is unsigned iff UNSIGNEDP is true, or itself
> -   if TYPE is already an integer type of signedness UNSIGNEDP.  */
> +   if TYPE is already an integer type of signedness UNSIGNEDP.
> +   If TYPE is a floating-point type, return an integer type with the same
> +   bitsize and with the signedness given by UNSIGNEDP; this is useful
> +   when doing bit-level operations on a floating-point value.  */
>
>  tree
>  signed_or_unsigned_type_for (int unsignedp, tree type)
> @@ -11702,17 +11744,23 @@ signed_or_unsigned_type_for (int unsigne
>        return build_complex_type (inner2);
>      }
>
> -  if (!INTEGRAL_TYPE_P (type)
> -      && !POINTER_TYPE_P (type)
> -      && TREE_CODE (type) != OFFSET_TYPE)
> +  unsigned int bits;
> +  if (INTEGRAL_TYPE_P (type)
> +      || POINTER_TYPE_P (type)
> +      || TREE_CODE (type) == OFFSET_TYPE)
> +    bits = TYPE_PRECISION (type);
> +  else if (TREE_CODE (type) == REAL_TYPE)
> +    bits = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (type));
> +  else
>      return NULL_TREE;
>
> -  return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp);
> +  return build_nonstandard_integer_type (bits, unsignedp);
>  }
>
>  /* If TYPE is an integral or pointer type, return an integer type with
>     the same precision which is unsigned, or itself if TYPE is already an
> -   unsigned integer type.  */
> +   unsigned integer type.  If TYPE is a floating-point type, return an
> +   unsigned integer type with the same bitsize as TYPE.  */
>
>  tree
>  unsigned_type_for (tree type)
> @@ -11722,7 +11770,8 @@ unsigned_type_for (tree type)
>
>  /* If TYPE is an integral or pointer type, return an integer type with
>     the same precision which is signed, or itself if TYPE is already a
> -   signed integer type.  */
> +   signed integer type.  If TYPE is a floating-point type, return a
> +   signed integer type with the same bitsize as TYPE.  */
>
>  tree
>  signed_type_for (tree type)
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd 2019-01-04 12:40:51.000000000 +0000
> +++ gcc/match.pd 2019-01-04 12:40:51.982582910 +0000
> @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
>     integer_each_onep integer_truep integer_nonzerop
>     real_zerop real_onep real_minus_onep
>     zerop
> +   initializer_each_zero_or_onep
>     CONSTANT_CLASS_P
>     tree_expr_nonnegative_p
>     tree_expr_nonzero_p
> @@ -194,6 +195,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>             || !COMPLEX_FLOAT_TYPE_P (type)))
>     (negate @0)))
>
> +/* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
> +   unless the target has native support for the former but not the latter.  */
> +(simplify
> + (mult @0 VECTOR_CST@1)
> + (if (initializer_each_zero_or_onep (@1)
> +      && !HONOR_SNANS (type)
> +      && !HONOR_SIGNED_ZEROS (type))
> +  (with { tree itype = FLOAT_TYPE_P (type) ? unsigned_type_for (type) : type; }
> +   (if (itype
> +        && (!VECTOR_MODE_P (TYPE_MODE (type))
> +            || (VECTOR_MODE_P (TYPE_MODE (itype))
> +                && optab_handler (and_optab,
> +                                  TYPE_MODE (itype)) != CODE_FOR_nothing)))
> +    (view_convert (bit_and:itype (view_convert @0)
> +                                 (ne @1 { build_zero_cst (type); })))))))
> +
>  (for cmp (gt ge lt le)
>       outp (convert convert negate negate)
>       outn (negate negate convert convert)
> Index: gcc/testsuite/gcc.dg/pr88598-1.c
> ===================================================================
> --- /dev/null 2018-12-31 11:20:29.178325188 +0000
> +++ gcc/testsuite/gcc.dg/pr88598-1.c 2019-01-04 12:40:51.982582910 +0000
> @@ -0,0 +1,27 @@
> +/* { dg-do run } */
> +/* { dg-options "-O -fdump-tree-ccp1" } */
> +
> +typedef int v4si __attribute__ ((vector_size (16)));
> +
> +int
> +main ()
> +{
> +  volatile v4si x1 = { 4, 5, 6, 7 };
> +  volatile v4si x2 = { 10, 11, 12, 13 };
> +  volatile v4si x3 = { 20, 21, 22, 23 };
> +
> +  x1 *= (v4si) { 0, 1, 1, 0 };
> +  x2 *= (v4si) { 1, 0, 0, 1 };
> +  x3 *= (v4si) { 0, 0, 1, 0 };
> +
> +  if (__builtin_memcmp ((void *) &x1, &(v4si) { 0, 5, 6, 0 }, sizeof (v4si))
> +      || __builtin_memcmp ((void *) &x2, &(v4si) { 10, 0, 0, 13 },
> +                           sizeof (v4si))
> +      || __builtin_memcmp ((void *) &x3, &(v4si) { 0, 0, 22, 0 },
> +                           sizeof (v4si)))
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not { \* } "ccp1" } } */
> Index: gcc/testsuite/gcc.dg/pr88598-2.c
> ===================================================================
> --- /dev/null 2018-12-31 11:20:29.178325188 +0000
> +++ gcc/testsuite/gcc.dg/pr88598-2.c 2019-01-04 12:40:51.986582877 +0000
> @@ -0,0 +1,30 @@
> +/* { dg-do run { target double64 } } */
> +/* { dg-options "-O -fdump-tree-ccp1" } */
> +/* { dg-add-options ieee } */
> +
> +typedef double v4df __attribute__ ((vector_size (32)));
> +
> +int
> +main ()
> +{
> +  volatile v4df x1 = { 4, 5, 6, -7 };
> +  volatile v4df x2 = { 10, -11, 12, 13 };
> +  volatile v4df x3 = { 20, 21, 22, 23 };
> +
> +  x1 *= (v4df) { 0, 1, 1, 0 };
> +  x2 *= (v4df) { 1, 0, 0, 1 };
> +  x3 *= (v4df) { 0.0, -0.0, 1.0, -0.0 };
> +
> +  if (__builtin_memcmp ((void *) &x1, &(v4df) { 0, 5, 6, -0.0 },
> +                        sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x2, &(v4df) { 10, -0.0, 0, 13 },
> +                           sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x3, &(v4df) { 0, -0.0, 22, -0.0 },
> +                           sizeof (v4df)))
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump { \* } "ccp1" } } */
> +/* { dg-final { scan-tree-dump-not { \& } "ccp1" } } */
> Index: gcc/testsuite/gcc.dg/pr88598-3.c
> ===================================================================
> --- /dev/null 2018-12-31 11:20:29.178325188 +0000
> +++ gcc/testsuite/gcc.dg/pr88598-3.c 2019-01-04 12:40:51.986582877 +0000
> @@ -0,0 +1,29 @@
> +/* { dg-do run { target double64 } } */
> +/* { dg-options "-O -fno-signed-zeros -fdump-tree-ccp1" } */
> +/* { dg-add-options ieee } */
> +
> +typedef double v4df __attribute__ ((vector_size (32)));
> +
> +int
> +main ()
> +{
> +  volatile v4df x1 = { 4, 5, 6, -7 };
> +  volatile v4df x2 = { 10, -11, 12, 13 };
> +  volatile v4df x3 = { 20, 21, 22, 23 };
> +
> +  x1 *= (v4df) { 0, 1, 1, 0 };
> +  x2 *= (v4df) { 1, 0, 0, 1 };
> +  x3 *= (v4df) { 0.0, -0.0, 1.0, -0.0 };
> +
> +  if (__builtin_memcmp ((void *) &x1, &(v4df) { 0, 5, 6, 0 },
> +                        sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x2, &(v4df) { 10, 0, 0, 13 },
> +                           sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x3, &(v4df) { 0, 0, 22, 0 },
> +                           sizeof (v4df)))
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not { \* } "ccp1" } } */
> Index: gcc/testsuite/gcc.dg/pr88598-4.c
> ===================================================================
> --- /dev/null 2018-12-31 11:20:29.178325188 +0000
> +++ gcc/testsuite/gcc.dg/pr88598-4.c 2019-01-04 12:40:51.986582877 +0000
> @@ -0,0 +1,28 @@
> +/* { dg-do run } */
> +/* { dg-options "-O -fdump-tree-ccp1" } */
> +
> +typedef int v4si __attribute__ ((vector_size (16)));
> +
> +int
> +main ()
> +{
> +  volatile v4si x1 = { 4, 5, 6, 7 };
> +  volatile v4si x2 = { 10, 11, 12, 13 };
> +  volatile v4si x3 = { 20, 21, 22, 23 };
> +
> +  x1 *= (v4si) { 0, 1, 2, 3 };
> +  x2 *= (v4si) { 1, 0, 2, 0 };
> +  x3 *= (v4si) { 0, 0, -1, 0 };
> +
> +  if (__builtin_memcmp ((void *) &x1, &(v4si) { 0, 5, 12, 21 }, sizeof (v4si))
> +      || __builtin_memcmp ((void *) &x2, &(v4si) { 10, 0, 24, 0 },
> +                           sizeof (v4si))
> +      || __builtin_memcmp ((void *) &x3, &(v4si) { 0, 0, -22, 0 },
> +                           sizeof (v4si)))
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump { \* } "ccp1" } } */
> +/* { dg-final { scan-tree-dump-not { \& } "ccp1" } } */
> Index: gcc/testsuite/gcc.dg/pr88598-5.c
> ===================================================================
> --- /dev/null 2018-12-31 11:20:29.178325188 +0000
> +++ gcc/testsuite/gcc.dg/pr88598-5.c 2019-01-04 12:40:51.986582877 +0000
> @@ -0,0 +1,29 @@
> +/* { dg-do run { target double64 } } */
> +/* { dg-options "-O -fno-signed-zeros -fdump-tree-ccp1" } */
> +/* { dg-add-options ieee } */
> +
> +typedef double v4df __attribute__ ((vector_size (32)));
> +
> +int
> +main ()
> +{
> +  volatile v4df x1 = { 4, 5, 6, 7 };
> +  volatile v4df x2 = { 10, 11, 12, 13 };
> +  volatile v4df x3 = { 20, 21, 22, 23 };
> +
> +  x1 *= (v4df) { 0, 1, 2, 3 };
> +  x2 *= (v4df) { 1, 0, 2, 0 };
> +  x3 *= (v4df) { 0, 0, -1, 0 };
> +
> +  if (__builtin_memcmp ((void *) &x1, &(v4df) { 0, 5, 12, 21 }, sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x2, &(v4df) { 10, 0, 24, 0 },
> +                           sizeof (v4df))
> +      || __builtin_memcmp ((void *) &x3, &(v4df) { 0, 0, -22, 0 },
> +                           sizeof (v4df)))
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump { \* } "ccp1" } } */
> +/* { dg-final { scan-tree-dump-not { \& } "ccp1" } } */
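
For readers following the match.pd hunk quoted above, the fold can be sketched one lane at a time in plain C. This is only an illustration under stated assumptions, not code from the patch: the helper names mul_as_and_i32, mul_as_and_f64 and check_lanes are hypothetical, and the double version assumes a 64-bit IEEE double. It suggests why x * m and x & mask agree for integer lanes when m is 0 or 1, and why the pattern is guarded by !HONOR_SIGNED_ZEROS for floating-point lanes: -7.0 * 0 is -0.0 (as pr88598-2.c expects), whereas ANDing with a zero mask gives +0.0.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (double) == sizeof (uint64_t),
                "sketch assumes 64-bit doubles");

/* Integer lane: for m in {0, 1}, -m is 0 or all-ones, so x & -m == x * m.  */
static int32_t
mul_as_and_i32 (int32_t x, int32_t m)
{
  return x & -m;
}

/* Double lane: derive the mask from m != 0, in the spirit of the
   (ne @1 0) operand in the pattern, and apply it to the bit image of x.  */
static double
mul_as_and_f64 (double x, double m)
{
  uint64_t bits;
  uint64_t mask = (m != 0.0) ? ~(uint64_t) 0 : 0;
  memcpy (&bits, &x, sizeof bits);
  bits &= mask;
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Hypothetical self-check mirroring a few lanes from the quoted tests.  */
static void
check_lanes (void)
{
  volatile double zero = 0.0;  /* volatile keeps the multiply at run time */

  assert (mul_as_and_i32 (5, 1) == 5 * 1);
  assert (mul_as_and_i32 (7, 0) == 7 * 0);
  assert (mul_as_and_f64 (6.0, 1.0) == 6.0 * 1.0);

  /* The sign of zero differs: the multiply keeps it, the AND clears it.  */
  assert (signbit (-7.0 * zero));
  assert (!signbit (mul_as_and_f64 (-7.0, 0.0)));
}
```

Calling check_lanes () from a small driver exercises both cases; the last two assertions are the scalar counterpart of the difference between the expected values in pr88598-2.c and pr88598-3.c.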