From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com
Subject: [PATCH v1] RISC-V: Bugfix for the const vector in single steps
Date: Wed, 20 Dec 2023 10:39:22 +0800
Message-Id: <20231220023922.1076198-1-pan2.li@intel.com>

From: Pan Li

When generating a const vector with a single step, the existing code gen
works as below, here with npatterns = 4.

v1 = {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, ...}
v2 (diff) = v1 - vid
          = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7, ...}
          = {3, 1, -1, -3, 3, 1, -1, -3, ...}
v1 = v2 + vid.

This requires the diff to repeat with a period of npatterns, like
{3, 1, -1, -3} above.  It cannot handle a single-step sequence such as:

{ -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }

This patch restricts the above code gen to the sequences it handles
correctly and implements a separate code gen for the general case.

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (expand_const_vector): Add restriction
        for the vid-diff code gen and implement general one.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/bug-7.c: New test.

Signed-off-by: Pan Li
---
 gcc/config/riscv/riscv-v.cc                   | 73 +++++++++++++++----
 .../gcc.target/riscv/rvv/autovec/bug-7.c      | 61 ++++++++++++++++
 2 files changed, 119 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 486f5deb296..946588b7b1f 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1257,24 +1257,67 @@ expand_const_vector (rtx target, rtx src)
           else
             {
               /* Generate the variable-length vector following this rule:
-                 { a, b, a, b, a + step, b + step, a + step*2, b + step*2, ...}
-                 E.g. { 3, 2, 1, 0, 7, 6, 5, 4, ... } */
-              /* Step 2: Generate diff = TARGET - VID:
-                 { 3-0, 2-1, 1-2, 0-3, 7-4, 6-5, 5-6, 4-7, ... }*/
+                 { a, b, a + step, b + step, a + step*2, b + step*2, ... }  */
               rvv_builder v (builder.mode (), builder.npatterns (), 1);
-              for (unsigned int i = 0; i < v.npatterns (); ++i)
+              poly_int64 ele_0 = rtx_to_poly_int64 (builder.elt (0));
+              poly_int64 ele_n
+                = rtx_to_poly_int64 (builder.elt (v.npatterns ()));
+
+              if (known_eq (ele_0 - 0, ele_n - v.npatterns ()))
+                {
+                  /* Case 1: For example as below:
+                     {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... }
+                     We have 3 - 0 = 3 equals 7 - 4 = 3, the sequence is
+                     repeated as below after minus vid.
+                     {3, 1, -1, -3, 3, 1, -1, -3...}
+                     Then we can simplify the diff code gen to at most
+                     npatterns().  */
+
+                  /* Step 1: Generate diff = TARGET - VID.  */
+                  for (unsigned int i = 0; i < v.npatterns (); ++i)
+                    {
+                      poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
+                      v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+                    }
+
+                  /* Step 2: Generate result = VID + diff.  */
+                  rtx vec = v.build ();
+                  rtx add_ops[] = {target, vid, vec};
+                  emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
+                                   BINARY_OP, add_ops);
+                }
+              else
                 {
-                  /* Calculate the diff between the target sequence and
-                     vid sequence.  The elt (i) can be either const_int or
-                     const_poly_int.  */
-                  poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i;
-                  v.quick_push (gen_int_mode (diff, v.inner_mode ()));
+                  /* Case 2: For example as below:
+                     { -4, 4, -4 + 1, 4 + 1, -4 + 2, 4 + 2, -4 + 3, 4 + 3, ... }
+                   */
+
+                  /* Step 1: Generate { a, b, a, b, ... }  */
+                  for (unsigned int i = 0; i < v.npatterns (); ++i)
+                    v.quick_push (builder.elt (i));
+                  rtx new_base = v.build ();
+
+                  /* Step 2: Generate tmp = VID >> LOG2 (NPATTERNS).  */
+                  rtx shift_count
+                    = gen_int_mode (exact_log2 (builder.npatterns ()),
+                                    builder.inner_mode ());
+                  rtx tmp = expand_simple_binop (builder.mode (), LSHIFTRT,
+                                                 vid, shift_count, NULL_RTX,
+                                                 false, OPTAB_DIRECT);
+
+                  /* Step 3: Generate tmp2 = tmp * step.  */
+                  rtx tmp2 = gen_reg_rtx (builder.mode ());
+                  rtx step
+                    = simplify_binary_operation (MINUS, builder.inner_mode (),
+                                                 builder.elt (v.npatterns()),
+                                                 builder.elt (0));
+                  expand_vec_series (tmp2, const0_rtx, step, tmp);
+
+                  /* Step 4: Generate target = tmp2 + new_base.  */
+                  rtx add_ops[] = {target, tmp2, new_base};
+                  emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
+                                   BINARY_OP, add_ops);
                 }
-              /* Step 2: Generate result = VID + diff.  */
-              rtx vec = v.build ();
-              rtx add_ops[] = {target, vid, vec};
-              emit_vlmax_insn (code_for_pred (PLUS, builder.mode ()),
-                               BINARY_OP, add_ops);
             }
         }
       else if (builder.interleaved_stepped_npatterns_p ())
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c
new file mode 100644
index 00000000000..9acac391f65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-7.c
@@ -0,0 +1,61 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#define N 4
+struct C { int l, r; };
+struct C a[N], b[N], c[N];
+struct C a1[N], b1[N], c1[N];
+
+void __attribute__((noinline))
+init_data_vec (struct C * __restrict a, struct C * __restrict b,
+               struct C * __restrict c)
+{
+  int i;
+
+  for (i = 0; i < N; ++i)
+    {
+      a[i].l = N - i;
+      a[i].r = i - N;
+
+      b[i].l = i - N;
+      b[i].r = i + N;
+
+      c[i].l = -1 - i;
+      c[i].r = 2 * N - 1 - i;
+    }
+}
+
+int
+main ()
+{
+  int i;
+
+  init_data_vec (a, b, c);
+
+#pragma GCC novector
+  for (i = 0; i < N; ++i)
+    {
+      a1[i].l = N - i;
+      a1[i].r = i - N;
+
+      b1[i].l = i - N;
+      b1[i].r = i + N;
+
+      c1[i].l = -1 - i;
+      c1[i].r = 2 * N - 1 - i;
+    }
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i].l != a1[i].l || a[i].r != a1[i].r)
+        __builtin_abort ();
+
+      if (b[i].l != b1[i].l || b[i].r != b1[i].r)
+        __builtin_abort ();
+
+      if (c[i].l != c1[i].l || c[i].r != c1[i].r)
+        __builtin_abort ();
+    }
+
+  return 0;
+}
-- 
2.34.1
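
Reader's note, not part of the patch: the two code-gen paths described in
the commit message can be modelled with plain scalar C arrays.  The sketch
below is only an illustration under stated assumptions.  The helper names
(model_vid_plus_diff, model_general, vid_diff_applicable, check_examples)
are hypothetical and are not GCC functions, vid[i] is modelled as the index
i, and npatterns is assumed to be a power of two, as the exact_log2 call in
the patch implies.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical scalar model; none of these names exist in GCC.  */

/* Case 1 model: out[i] = vid[i] + diff[i % npatterns], where
   diff[j] = base[j] - j and vid[i] is modelled as i.  */
void
model_vid_plus_diff (const int *base, int npatterns, int *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      int j = i % npatterns;
      out[i] = i + (base[j] - j);
    }
}

/* Case 2 model: out[i] = base[i % npatterns]
   + (vid[i] >> log2 (npatterns)) * step.
   Assumes npatterns is a power of two.  */
void
model_general (const int *base, int npatterns, int step, int *out, int n)
{
  int shift = 0;
  while ((1 << shift) < npatterns)
    shift++;
  for (int i = 0; i < n; i++)
    out[i] = base[i % npatterns] + (i >> shift) * step;
}

/* Model of the restriction: elt (0) - 0 == elt (npatterns) - npatterns,
   with elt (npatterns) modelled as base[0] + step.  */
int
vid_diff_applicable (const int *base, int npatterns, int step)
{
  return base[0] - 0 == (base[0] + step) - npatterns;
}

/* Checks both example sequences from the commit message.  */
void
check_examples (void)
{
  static const int rev[] = { 3, 2, 1, 0 };  /* step 4, npatterns 4 */
  static const int pair[] = { -4, 4 };      /* step 1, npatterns 2 */
  int out[8];

  /* {3, 2, 1, 0, 7, 6, 5, 4}: the vid + diff path applies.  */
  assert (vid_diff_applicable (rev, 4, 4));
  model_vid_plus_diff (rev, 4, out, 8);
  assert (out[4] == 7 && out[7] == 4);

  /* {-4, 4, -3, 5, ...}: the restriction fails, use the general path.  */
  assert (!vid_diff_applicable (pair, 2, 1));
  model_general (pair, 2, 1, out, 8);
  assert (out[2] == -3 && out[3] == 5 && out[7] == 7);
}
```

Calling check_examples () from a main () exercises both sequences.  In this
model, feeding { -4, 4 } with step 1 to model_vid_plus_diff yields -2 at
index 2 instead of -3, which appears to be the kind of mismatch the new
restriction guards against.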