From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=dkyG=AQ=intel.com=pan2.li@sourceware.org>
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
	by sourceware.org (Postfix) with ESMTPS id 4294A3858D32
	for <gcc-patches@gcc.gnu.org>; Tue, 25 Apr 2023 14:29:13 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4294A3858D32
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
  d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
  t=1682432953; x=1713968953;
  h=from:to:cc:subject:date:message-id:in-reply-to:
   references:mime-version:content-transfer-encoding;
  bh=hw47pHB1qD5PfJOr6wcvruDdb0XwO9H8QWkZT9zGOV4=;
  b=DNS2VSQSOa0k2KANkWhrNS8l974FJekghqXqSHkzyanB9TKD3bWy9tsx
   8Hrlqye0Wqmc28xQ+smFK1EyUMj/CsWMmVHKFPjBosvtztv0m/wB5C6zp
   dly8Fu5/+L6vg2UBrAhgrVMM5cmiQYvtwF3wMv1viYroYhnM2I5eWC9aB
   Fb9PquYeMytaIwgjevKRnbu3G4pr661nV96e1KuwcwLGr9CzPjAiT7Koo
   z5SZQaNy/+HVpGlI8wOJPrWZL3OQ71Pu7bFQkUoB1i/B60zVDOEJ01qip
   oqdHQPmQpAmzmHgsDKevbu1I8tfnTmiT1xKw9rnVfeMHIQhOxN3MANo7I
   g==;
X-IronPort-AV: E=McAfee;i="6600,9927,10691"; a="346806243"
X-IronPort-AV: E=Sophos;i="5.99,225,1677571200"; 
   d="scan'208";a="346806243"
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
  by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2023 07:29:11 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6600,9927,10691"; a="837452116"
X-IronPort-AV: E=Sophos;i="5.99,225,1677571200"; 
   d="scan'208";a="837452116"
Received: from shvmail03.sh.intel.com ([10.239.245.20])
  by fmsmga001.fm.intel.com with ESMTP; 25 Apr 2023 07:29:09 -0700
Received: from pli-ubuntu.sh.intel.com (pli-ubuntu.sh.intel.com [10.239.46.88])
	by shvmail03.sh.intel.com (Postfix) with ESMTP id 7EAEA10189E7;
	Tue, 25 Apr 2023 22:29:08 +0800 (CST)
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai,
	kito.cheng@sifive.com,
	pan2.li@intel.com,
	yanzhang.wang@intel.com
Subject: [PATCH v3] RISC-V: Bugfix for RVV vbool*_t vn_reference_equal
Date: Tue, 25 Apr 2023 22:29:04 +0800
Message-Id: <20230425142904.133137-1-pan2.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <Message-Id: <20230329075222.2888608-1-pan2.li@intel.com>
References: <Message-Id: <20230329075222.2888608-1-pan2.li@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SCC_5_SHORT_WORD_LINES,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

From: Pan Li <pan2.li@intel.com>

In most architecture the precision_size of vbool*_t types are caculated
like as the multiple of the type size.  For example:
precision_size = type_size * 8 (aka, bit count per bytes).

Unfortunately, some architecture like RISC-V will adjust the
precision_size
for the vbool*_t in order to align the ISA. For example as below.
type_size      = [1, 1, 1, 1,  2,  4,  8]
precision_size = [1, 2, 4, 8, 16, 32, 64]

Then the precision_size of RISC-V vbool*_t will not be the multiple of
the
type_size. This PATCH try to enrich this case when comparing the
vn_reference.

Given we have the below code:
void test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict
out) {
    vbool8_t v1 = *(vbool8_t*)in;
    vbool16_t v2 = *(vbool16_t*)in;

    *(vbool8_t*)(out + 100) = v1;
    *(vbool16_t*)(out + 200) = v2;
}

Before this PATCH:
csrr    t0,vlenb
slli    t1,t0,1
csrr    a3,vlenb
sub     sp,sp,t1
slli    a4,a3,1
add     a4,a4,sp
addi    a2,a1,100
vsetvli a5,zero,e8,m1,ta,ma
sub     a3,a4,a3
vlm.v   v24,0(a0)
vsm.v   v24,0(a2)
vsm.v   v24,0(a3)
addi    a1,a1,200
csrr    t0,vlenb
vsetvli a4,zero,e8,mf2,ta,ma
slli    t1,t0,1
vlm.v   v24,0(a3)
vsm.v   v24,0(a1)
add     sp,sp,t1
jr      ra

After this PATCH:
addi    a3,a1,100
vsetvli a4,zero,e8,m1,ta,ma
addi    a1,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a3)
vsetvli a5,zero,e8,mf2,ta,ma
vlm.v   v24,0(a0)
vsm.v   v24,0(a1)
ret

	PR 109272

gcc/ChangeLog:

	* tree-ssa-sccvn.cc (vn_reference_eq): add type vector subparts
	  check for vn_reference equal.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr108185-4.c: Update test check
	  condition.
	* gcc.target/riscv/rvv/base/pr108185-5.c: Likewise.
	* gcc.target/riscv/rvv/base/pr108185-6.c: Likewise.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 .../gcc.target/riscv/rvv/base/pr108185-4.c    |  2 +-
 .../gcc.target/riscv/rvv/base/pr108185-5.c    |  2 +-
 .../gcc.target/riscv/rvv/base/pr108185-6.c    |  2 +-
 gcc/tree-ssa-sccvn.cc                         | 20 +++++++++++++++++++
 4 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
index 6e4d1cb1e44..321cd5c818e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-4.c
@@ -65,4 +65,4 @@ test_vbool8_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
-/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
+/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
index 9fc659d2402..575a7842cdf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-5.c
@@ -65,4 +65,4 @@ test_vbool16_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
-/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */
+/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
index 98275e5267d..95a11d37016 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr108185-6.c
@@ -65,4 +65,4 @@ test_vbool32_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
 /* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
-/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 13 } } */
+/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 94b793e1caf..94d6163e6ae 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -799,6 +799,26 @@ vn_reference_eq (const_vn_reference_t const vr1, const_vn_reference_t const vr2)
 	   && (TYPE_PRECISION (vr2->type)
 	       != TREE_INT_CST_LOW (TYPE_SIZE (vr2->type))))
     return false;
+  else if (VECTOR_BOOLEAN_TYPE_P (vr1->type)
+	   && VECTOR_BOOLEAN_TYPE_P (vr2->type))
+    {
+      /* Vector boolean types can have padding, verify we are dealing with
+	 the same number of elements, aka the precision of the types.
+	 For example, In most architecture the precision_size of vbool*_t
+	 types are caculated like below:
+	 precision_size = type_size * 8
+
+	 Unfortunately, the RISC-V will adjust the precision_size for the
+	 vbool*_t in order to align the ISA as below:
+	 type_size      = [1, 1, 1, 1,  2,  4,  8]
+	 precision_size = [1, 2, 4, 8, 16, 32, 64]
+
+	 Then the precision_size of RISC-V vbool*_t will not be the multiple
+	 of the type_size.  We take care of this case consolidated here.  */
+      if (maybe_ne (TYPE_VECTOR_SUBPARTS (vr1->type),
+		    TYPE_VECTOR_SUBPARTS (vr2->type)))
+	return false;
+    }
 
   i = 0;
   j = 0;
-- 
2.34.1