From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener
Cc: "juzhe.zhong@rivai.ai", Robin Dapp, gcc-patches, "pan2.li", pinskia
Subject: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors
 [PR112971]
References: <097AABD6596FB0C3+2023121906491281154423@rivai.ai>
 <92p02r8p-46rq-s976-5r8p-s87q0q763465@fhfr.qr>
 <6FD0A43E2F3E9BD9+202312191735136921653@rivai.ai>
Date: Tue, 19 Dec 2023 10:40:29 +0000
In-Reply-To: (Richard Biener's message of "Tue, 19 Dec 2023 11:11:22 +0100 (CET)")

Richard Biener writes:
> On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>
>> Hi, Richard.
>>
>> After investigating the code:
>>
>> /* Return true if EXPR is the integer constant zero or a complex constant
>>    of zero, or a location wrapper for such a constant.  */
>>
>> bool
>> integer_zerop (const_tree expr)
>> {
>>   STRIP_ANY_LOCATION_WRAPPER (expr);
>>
>>   switch (TREE_CODE (expr))
>>     {
>>     case INTEGER_CST:
>>       return wi::to_wide (expr) == 0;
>>     case COMPLEX_CST:
>>       return (integer_zerop (TREE_REALPART (expr))
>>               && integer_zerop (TREE_IMAGPART (expr)));
>>     case VECTOR_CST:
>>       return (VECTOR_CST_NPATTERNS (expr) == 1
>>               && VECTOR_CST_DUPLICATE_P (expr)
>>               && integer_zerop (VECTOR_CST_ENCODED_ELT (expr, 0)));
>>     default:
>>       return false;
>>     }
>> }
>>
>> I wonder whether we can simplify the code as follows:
>>
>> if (integer_zerop (arg1) || integer_zerop (arg2))
>>   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>>                || code == BIT_XOR_EXPR);
>
> Possibly.  I'll let Richard S. comment on the whole structure.

The current code is handling cases that require elementwise arithmetic.
ISTM that what we're really doing here is identifying cases where
whole-vector arithmetic is possible instead.
I think that should be a separate pre-step, rather than integrated into
the current code.  Largely this would consist of writing out match.pd-style
folds in C++ code, so Andrew's fix in comment 7 seems neater to me.
But if this must happen in const_binop instead, then we could have
a function like:

/* OP is the INDEXth operand to CODE (counting from zero) and OTHER_OP
   is the other operand.  Try to use the value of OP to simplify the
   operation in one step, without having to process individual elements.  */
tree simplify_const_binop (tree_code code, tree op, tree other_op, int index)
{
  ...
}

Thanks,
Richard

>
> Richard.
>
>>
>>
>> juzhe.zhong@rivai.ai
>>
>> From: Richard Biener
>> Date: 2023-12-19 17:12
>> To: juzhe.zhong@rivai.ai
>> CC: Robin Dapp; gcc-patches; pan2.li; richard.sandiford; Richard Biener; pinskia
>> Subject: Re: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>>
>> > Hi, Richard.  Do you mean adding the check as follows?
>> >
>> > if (VECTOR_CST_NELTS_PER_PATTERN (arg1) == 1
>> >     && VECTOR_CST_NELTS_PER_PATTERN (arg2) == 3
>>
>> Or <= 3, which would allow combining.  As said, not sure what
>> == 2 would be and whether that would work.
>>
>> Btw, integer_all_onesp should also allow AND/IOR at least to be
>> optimized.  Possibly IOR/AND with the sign bit for signed
>> elements as well.
>>
>> I wonder if there's a programmatic way to identify OK cases
>> rather than enumerating them.
>>
>> > && integer_zerop (VECTOR_CST_ELT (arg1, 0)))
>> >   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> >                || code == BIT_XOR_EXPR);
>> > else if (VECTOR_CST_NELTS_PER_PATTERN (arg2) == 1
>> >          && VECTOR_CST_NELTS_PER_PATTERN (arg1) == 3
>> >          && integer_zerop (VECTOR_CST_ELT (arg2, 0)))
>> >   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> >                || code == BIT_XOR_EXPR);
>> >
>> >
>> > juzhe.zhong@rivai.ai
>> >
>> > From: Richard Biener
>> > Date: 2023-12-19 16:15
>> > To: juzhe.zhong@rivai.ai
>> > CC: rdapp.gcc; gcc-patches; pan2.li; richard.sandiford; richard.guenther; Andrew Pinski
>> > Subject: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> > On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>> >
>> > > Thanks to Robin for sending an initial patch to fix this ICE bug.
>> > >
>> > > CC to Richard S, Richard B, and Andrew.
>> >
>> > Just one comment: it seems that VECTOR_CST_STEPPED_P should
>> > implicitly include VECTOR_CST_DUPLICATE_P, since that would be
>> > a step of zero (but as implemented it doesn't catch this).
>> > Looking at the implementation, it's odd that we can handle
>> > VECTOR_CST_NELTS_PER_PATTERN == 1 (duplicate) and
>> > == 3 (stepped) but not == 2 (not sure what that would be).
>> >
>> > Maybe the tests can be re-formulated in terms of
>> > VECTOR_CST_NELTS_PER_PATTERN?
>> >
>> > Richard.
>> >
>> > > Thanks.
>> > >
>> > > juzhe.zhong@rivai.ai
>> > >
>> > > From: Robin Dapp
>> > > Date: 2023-12-19 03:50
>> > > To: gcc-patches
>> > > CC: rdapp.gcc; Li, Pan2; juzhe.zhong@rivai.ai
>> > > Subject: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> > > Hi,
>> > >
>> > > found in PR112971, this patch adds folding support for bitwise operations
>> > > of const duplicate zero vectors and stepped vectors.
>> > > On riscv we have the situation that folding would continue perpetually
>> > > without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would
>> > > not fold to {0, 0, 0, ...}.
>> > >
>> > > Bootstrapped and regtested on x86 and aarch64, regtested on riscv.
>> > >
>> > > I won't be available to respond quickly until next year.  Pan or Juzhe,
>> > > as discussed, feel free to continue with possible revisions.
>> > >
>> > > Regards
>> > >  Robin
>> > >
>> > > gcc/ChangeLog:
>> > >
>> > >         PR middle-end/112971
>> > >
>> > >         * fold-const.cc (const_binop): Handle
>> > >         zerop@1 AND/IOR/XOR VECTOR_CST_STEPPED_P@2.
>> > >
>> > > gcc/testsuite/ChangeLog:
>> > >
>> > >         * gcc.target/riscv/rvv/autovec/pr112971.c: New test.
>> > > ---
>> > >  gcc/fold-const.cc                             | 14 +++++++++++++-
>> > >  .../gcc.target/riscv/rvv/autovec/pr112971.c   | 18 ++++++++++++++++++
>> > >  2 files changed, 31 insertions(+), 1 deletion(-)
>> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > >
>> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>> > > index f5d68ac323a..43ed097bf5c 100644
>> > > --- a/gcc/fold-const.cc
>> > > +++ b/gcc/fold-const.cc
>> > > @@ -1653,8 +1653,20 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
>> > >      {
>> > >        tree type = TREE_TYPE (arg1);
>> > >        bool step_ok_p;
>> > > +
>> > > +      /* AND, IOR as well as XOR with a zerop can be handled directly.  */
>> > >        if (VECTOR_CST_STEPPED_P (arg1)
>> > > -          && VECTOR_CST_STEPPED_P (arg2))
>> > > +          && VECTOR_CST_DUPLICATE_P (arg2)
>> > > +          && integer_zerop (VECTOR_CST_ELT (arg2, 0)))
>> > > +        step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> > > +                    || code == BIT_XOR_EXPR;
>> > > +      else if (VECTOR_CST_STEPPED_P (arg2)
>> > > +               && VECTOR_CST_DUPLICATE_P (arg1)
>> > > +               && integer_zerop (VECTOR_CST_ELT (arg1, 0)))
>> > > +        step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> > > +                    || code == BIT_XOR_EXPR;
>> > > +      else if (VECTOR_CST_STEPPED_P (arg1)
>> > > +               && VECTOR_CST_STEPPED_P (arg2))
>> > >          /* We can operate directly on the encoding if:
>> > >
>> > >               a3 - a2 == a2 - a1 && b3 - b2 == b2 - b1
>> > > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > > new file mode 100644
>> > > index 00000000000..816ebd3c493
>> > > --- /dev/null
>> > > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > > @@ -0,0 +1,18 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -fno-vect-cost-model" } */
>> > > +
>> > > +int a;
>> > > +short b[9];
>> > > +char c, d;
>> > > +void e() {
>> > > +  d = 0;
>> > > +  for (;; d++) {
>> > > +    if (b[d])
>> > > +      break;
>> > > +    a = 8;
>> > > +    for (; a >= 0; a--) {
>> > > +      char *f = &c;
>> > > +      *f &= d == (a & d);
>> > > +    }
>> > > +  }
>> > > +}
>> >
>>