Hi, Richard. After investigating the codes: /* Return true if EXPR is the integer constant zero or a complex constant of zero, or a location wrapper for such a constant. */ bool integer_zerop (const_tree expr) { STRIP_ANY_LOCATION_WRAPPER (expr); switch (TREE_CODE (expr)) { case INTEGER_CST: return wi::to_wide (expr) == 0; case COMPLEX_CST: return (integer_zerop (TREE_REALPART (expr)) && integer_zerop (TREE_IMAGPART (expr))); case VECTOR_CST: return (VECTOR_CST_NPATTERNS (expr) == 1 && VECTOR_CST_DUPLICATE_P (expr) && integer_zerop (VECTOR_CST_ENCODED_ELT (expr, 0))); default: return false; } } I wonder whether we can simplify the codes as follows :? if (integer_zerop (arg1) || integer_zerop (arg2)) step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == BIT_XOR_EXPR); juzhe.zhong@rivai.ai From: Richard Biener Date: 2023-12-19 17:12 To: juzhe.zhong@rivai.ai CC: Robin Dapp; gcc-patches; pan2.li; richard.sandiford; Richard Biener; pinskia Subject: Re: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971]. On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote: > Hi?Richard. Do you mean add the check as follows ? > > if (VECTOR_CST_NELTS_PER_PATTERN (arg1) == 1 > && VECTOR_CST_NELTS_PER_PATTERN (arg2) == 3 Or <= 3 which would allow combining. As said, not sure what == 2 would be and whether that would work. Btw, integer_allonesp should also allow to be optimized for and/ior at least. Possibly IOR/AND with the sign bit for signed elements as well. I wonder if there's a programmatic way to identify OK cases rather than enumerating them. > && integer_zerop (VECTOR_CST_ELT (arg1, 0))) > step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR > || code == BIT_XOR_EXPR); > else if (VECTOR_CST_NELTS_PER_PATTERN (arg2) == 1 > && VECTOR_CST_NELTS_PER_PATTERN (arg1) == 3 > && integer_zerop (VECTOR_CST_ELT (arg2, 0))) > step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR > || code == BIT_XOR_EXPR); > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2023-12-19 16:15 > To: ??? > CC: rdapp.gcc; gcc-patches; pan2.li; richard.sandiford; richard.guenther; Andrew Pinski > Subject: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971]. > On Tue, 19 Dec 2023, ??? wrote: > > > Thanks Robin send initial patch to fix this ICE bug. > > > > CC to Richard S, Richard B, and Andrew. > > Just one comment, it seems that VECTOR_CST_STEPPED_P should > implicitly include VECTOR_CST_DUPLICATE_P since it would be > a step of zero (but as implemented it doesn't catch this). > Looking at the implementation it's odd that we can handle > VECTOR_CST_NELTS_PER_PATTERN == 1 (duplicate) and > == 3 (stepped) but not == 2 (not sure what that would be). > > Maybe the tests can be re-formulated in terms of > VECTOR_CST_NELTS_PER_PATTERN? > > Richard. > > > Thanks. > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Robin Dapp > > Date: 2023-12-19 03:50 > > To: gcc-patches > > CC: rdapp.gcc; Li, Pan2; juzhe.zhong@rivai.ai > > Subject: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971]. > > Hi, > > > > found in PR112971, this patch adds folding support for bitwise operations > > of const duplicate zero vectors and stepped vectors. > > On riscv we have the situation that a folding would perpetually continue > > without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would > > not fold to {0, 0, 0, ...}. > > > > Bootstrapped and regtested on x86 and aarch64, regtested on riscv. > > > > I won't be available to respond quickly until next year. Pan or Juzhe, > > as discussed, feel free to continue with possible revisions. > > > > Regards > > Robin > > > > > > gcc/ChangeLog: > > > > PR middle-end/112971 > > > > * fold-const.cc (const_binop): Handle > > zerop@1 AND/IOR/XOR VECT_CST_STEPPED_P@2 > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/rvv/autovec/pr112971.c: New test. > > --- > > gcc/fold-const.cc | 14 +++++++++++++- > > .../gcc.target/riscv/rvv/autovec/pr112971.c | 18 ++++++++++++++++++ > > 2 files changed, 31 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c > > > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc > > index f5d68ac323a..43ed097bf5c 100644 > > --- a/gcc/fold-const.cc > > +++ b/gcc/fold-const.cc > > @@ -1653,8 +1653,20 @@ const_binop (enum tree_code code, tree arg1, tree arg2) > > { > > tree type = TREE_TYPE (arg1); > > bool step_ok_p; > > + > > + /* AND, IOR as well as XOR with a zerop can be handled directly. */ > > if (VECTOR_CST_STEPPED_P (arg1) > > - && VECTOR_CST_STEPPED_P (arg2)) > > + && VECTOR_CST_DUPLICATE_P (arg2) > > + && integer_zerop (VECTOR_CST_ELT (arg2, 0))) > > + step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR > > + || code == BIT_XOR_EXPR; > > + else if (VECTOR_CST_STEPPED_P (arg2) > > + && VECTOR_CST_DUPLICATE_P (arg1) > > + && integer_zerop (VECTOR_CST_ELT (arg1, 0))) > > + step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR > > + || code == BIT_XOR_EXPR; > > + else if (VECTOR_CST_STEPPED_P (arg1) > > + && VECTOR_CST_STEPPED_P (arg2)) > > /* We can operate directly on the encoding if: > > a3 - a2 == a2 - a1 && b3 - b2 == b2 - b1 > > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c > > new file mode 100644 > > index 00000000000..816ebd3c493 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c > > @@ -0,0 +1,18 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -fno-vect-cost-model" } */ > > + > > +int a; > > +short b[9]; > > +char c, d; > > +void e() { > > + d = 0; > > + for (;; d++) { > > + if (b[d]) > > + break; > > + a = 8; > > + for (; a >= 0; a--) { > > + char *f = &c; > > + *f &= d == (a & d); > > + } > > + } > > +} > > > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)