From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener
Cc: "juzhe.zhong@rivai.ai", Robin Dapp, gcc-patches, "pan2.li", pinskia
Subject: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors
 [PR112971]
References: <097AABD6596FB0C3+2023121906491281154423@rivai.ai>
 <92p02r8p-46rq-s976-5r8p-s87q0q763465@fhfr.qr>
 <6FD0A43E2F3E9BD9+202312191735136921653@rivai.ai>
Date: Tue, 19 Dec 2023 10:40:29 +0000
In-Reply-To: (Richard Biener's message of "Tue, 19 Dec 2023 11:11:22 +0100 (CET)")

Richard Biener writes:
> On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>
>> Hi, Richard.
>>
>> After investigating the code:
>>
>> /* Return true if EXPR is the integer constant zero or a complex constant
>>    of zero, or a location wrapper for such a constant.  */
>>
>> bool
>> integer_zerop (const_tree expr)
>> {
>>   STRIP_ANY_LOCATION_WRAPPER (expr);
>>
>>   switch (TREE_CODE (expr))
>>     {
>>     case INTEGER_CST:
>>       return wi::to_wide (expr) == 0;
>>     case COMPLEX_CST:
>>       return (integer_zerop (TREE_REALPART (expr))
>>               && integer_zerop (TREE_IMAGPART (expr)));
>>     case VECTOR_CST:
>>       return (VECTOR_CST_NPATTERNS (expr) == 1
>>               && VECTOR_CST_DUPLICATE_P (expr)
>>               && integer_zerop (VECTOR_CST_ENCODED_ELT (expr, 0)));
>>     default:
>>       return false;
>>     }
>> }
>>
>> I wonder whether we can simplify the code as follows:
>>
>> if (integer_zerop (arg1) || integer_zerop (arg2))
>>   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>>                || code == BIT_XOR_EXPR);
>
> Possibly.  I'll let Richard S. comment on the whole structure.

The current code is handling cases that require elementwise arithmetic.
ISTM that what we're really doing here is identifying cases where
whole-vector arithmetic is possible instead.
I think that should be a separate pre-step, rather than integrated into
the current code.  Largely this would consist of writing out match.pd-style
folds in C++ code, so Andrew's fix in comment 7 seems neater to me.
But if this must happen in const_binop instead, then we could have
a function like:

/* OP is the INDEXth operand to CODE (counting from zero) and OTHER_OP
   is the other operand.  Try to use the value of OP to simplify the
   operation in one step, without having to process individual elements.  */
tree simplify_const_binop (tree_code code, tree op, tree other_op, int index)
{
  ...
}

Thanks,
Richard

>
> Richard.
>
>>
>>
>> juzhe.zhong@rivai.ai
>>
>> From: Richard Biener
>> Date: 2023-12-19 17:12
>> To: juzhe.zhong@rivai.ai
>> CC: Robin Dapp; gcc-patches; pan2.li; richard.sandiford; Richard Biener; pinskia
>> Subject: Re: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>>
>> > Hi, Richard.  Do you mean adding the check as follows?
>> >
>> > if (VECTOR_CST_NELTS_PER_PATTERN (arg1) == 1
>> >     && VECTOR_CST_NELTS_PER_PATTERN (arg2) == 3
>>
>> Or <= 3, which would allow combining.  As said, not sure what
>> == 2 would be and whether that would work.
>>
>> Btw, integer_all_onesp should also allow AND/IOR at least to be
>> optimized.  Possibly IOR/AND with the sign bit for signed
>> elements as well.
>>
>> I wonder if there's a programmatic way to identify OK cases
>> rather than enumerating them.
>>
>> > && integer_zerop (VECTOR_CST_ELT (arg1, 0)))
>> >   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> >                || code == BIT_XOR_EXPR);
>> > else if (VECTOR_CST_NELTS_PER_PATTERN (arg2) == 1
>> >          && VECTOR_CST_NELTS_PER_PATTERN (arg1) == 3
>> >          && integer_zerop (VECTOR_CST_ELT (arg2, 0)))
>> >   step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> >                || code == BIT_XOR_EXPR);
>> >
>> >
>> > juzhe.zhong@rivai.ai
>> >
>> > From: Richard Biener
>> > Date: 2023-12-19 16:15
>> > To: juzhe.zhong@rivai.ai
>> > CC: rdapp.gcc; gcc-patches; pan2.li; richard.sandiford; richard.guenther; Andrew Pinski
>> > Subject: Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> > On Tue, 19 Dec 2023, juzhe.zhong@rivai.ai wrote:
>> >
>> > > Thanks to Robin for sending an initial patch to fix this ICE bug.
>> > >
>> > > CC to Richard S, Richard B, and Andrew.
>> >
>> > Just one comment: it seems that VECTOR_CST_STEPPED_P should
>> > implicitly include VECTOR_CST_DUPLICATE_P, since that would be
>> > a step of zero (but as implemented it doesn't catch this).
>> > Looking at the implementation, it's odd that we can handle
>> > VECTOR_CST_NELTS_PER_PATTERN == 1 (duplicate) and
>> > == 3 (stepped) but not == 2 (not sure what that would be).
>> >
>> > Maybe the tests can be re-formulated in terms of
>> > VECTOR_CST_NELTS_PER_PATTERN?
>> >
>> > Richard.
>> >
>> > > Thanks.
>> > >
>> > > juzhe.zhong@rivai.ai
>> > >
>> > > From: Robin Dapp
>> > > Date: 2023-12-19 03:50
>> > > To: gcc-patches
>> > > CC: rdapp.gcc; Li, Pan2; juzhe.zhong@rivai.ai
>> > > Subject: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
>> > > Hi,
>> > >
>> > > found in PR112971, this patch adds folding support for bitwise operations
>> > > of const duplicate zero vectors and stepped vectors.
>> > > On riscv we have the situation that folding would continue perpetually
>> > > without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would
>> > > not fold to {0, 0, 0, ...}.
>> > >
>> > > Bootstrapped and regtested on x86 and aarch64, regtested on riscv.
>> > >
>> > > I won't be available to respond quickly until next year.  Pan or Juzhe,
>> > > as discussed, feel free to continue with possible revisions.
>> > >
>> > > Regards
>> > >  Robin
>> > >
>> > > gcc/ChangeLog:
>> > >
>> > >         PR middle-end/112971
>> > >
>> > >         * fold-const.cc (const_binop): Handle
>> > >         zerop@1 AND/IOR/XOR VECTOR_CST_STEPPED_P@2.
>> > >
>> > > gcc/testsuite/ChangeLog:
>> > >
>> > >         * gcc.target/riscv/rvv/autovec/pr112971.c: New test.
>> > > ---
>> > >  gcc/fold-const.cc                             | 14 +++++++++++++-
>> > >  .../gcc.target/riscv/rvv/autovec/pr112971.c   | 18 ++++++++++++++++++
>> > >  2 files changed, 31 insertions(+), 1 deletion(-)
>> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > >
>> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>> > > index f5d68ac323a..43ed097bf5c 100644
>> > > --- a/gcc/fold-const.cc
>> > > +++ b/gcc/fold-const.cc
>> > > @@ -1653,8 +1653,20 @@ const_binop (enum tree_code code, tree arg1, tree arg2)
>> > >      {
>> > >        tree type = TREE_TYPE (arg1);
>> > >        bool step_ok_p;
>> > > +
>> > > +      /* AND, IOR as well as XOR with a zerop can be handled directly.  */
>> > >        if (VECTOR_CST_STEPPED_P (arg1)
>> > > -          && VECTOR_CST_STEPPED_P (arg2))
>> > > +          && VECTOR_CST_DUPLICATE_P (arg2)
>> > > +          && integer_zerop (VECTOR_CST_ELT (arg2, 0)))
>> > > +        step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> > > +                    || code == BIT_XOR_EXPR;
>> > > +      else if (VECTOR_CST_STEPPED_P (arg2)
>> > > +               && VECTOR_CST_DUPLICATE_P (arg1)
>> > > +               && integer_zerop (VECTOR_CST_ELT (arg1, 0)))
>> > > +        step_ok_p = code == BIT_AND_EXPR || code == BIT_IOR_EXPR
>> > > +                    || code == BIT_XOR_EXPR;
>> > > +      else if (VECTOR_CST_STEPPED_P (arg1)
>> > > +               && VECTOR_CST_STEPPED_P (arg2))
>> > >          /* We can operate directly on the encoding if:
>> > >
>> > >               a3 - a2 == a2 - a1 && b3 - b2 == b2 - b1
>> > > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > > new file mode 100644
>> > > index 00000000000..816ebd3c493
>> > > --- /dev/null
>> > > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
>> > > @@ -0,0 +1,18 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -fno-vect-cost-model" } */
>> > > +
>> > > +int a;
>> > > +short b[9];
>> > > +char c, d;
>> > > +void e() {
>> > > +  d = 0;
>> > > +  for (;; d++) {
>> > > +    if (b[d])
>> > > +      break;
>> > > +    a = 8;
>> > > +    for (; a >= 0; a--) {
>> > > +      char *f = &c;
>> > > +      *f &= d == (a & d);
>> > > +    }
>> > > +  }
>> > > +}
>> >
>>