From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jun Sha (Joshua)" <cooper.joshua@linux.alibaba.com>
To: gcc-patches@gcc.gnu.org
Cc: jim.wilson.gcc@gmail.com, palmer@dabbelt.com, andrew@sifive.com,
	philipp.tomsich@vrull.eu, jeffreyalaw@gmail.com,
	christoph.muellner@vrull.eu, "Jun Sha (Joshua)"
Subject: [PATCH v2 6/9] RISC-V: Tests for overlapping RVV and
 XTheadVector instructions (Part4)
Date: Sat, 18 Nov 2023 12:35:54 +0800
Message-Id: <20231118043554.3863-1-cooper.joshua@linux.alibaba.com>
X-Mailer: git-send-email 2.27.0.windows.1
In-Reply-To: <20231118042258.3545-1-cooper.joshua@linux.alibaba.com>
References: <20231118042258.3545-1-cooper.joshua@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For big changes in instruction generation, we can only duplicate some
typical tests from testsuite/gcc.target/riscv/rvv/base.  This patch adds
tests for ternary and unary operations.

Contributors:
	Jun Sha (Joshua)
	Jin Ma
	Christoph Müllner

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-1.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-2.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-3.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-4.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-5.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-6.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-1.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-2.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-3.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-4.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-5.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-6.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-7.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-8.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-9.c: New test.
	* gcc.target/riscv/rvv/xtheadvector/unop_v_constraint-1.c: New test.
---
 .../rvv/xtheadvector/ternop_vv_constraint-1.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vv_constraint-2.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vv_constraint-3.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vv_constraint-4.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vv_constraint-5.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vv_constraint-6.c |  83 +++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-1.c |  71 ++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-2.c |  38 +++++
 .../rvv/xtheadvector/ternop_vx_constraint-3.c | 125 +++++++++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-4.c | 123 +++++++++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-5.c | 123 +++++++++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-6.c | 130 ++++++++++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-7.c | 130 ++++++++++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-8.c |  71 ++++++++++
 .../rvv/xtheadvector/ternop_vx_constraint-9.c |  71 ++++++++++
 .../rvv/xtheadvector/unop_v_constraint-1.c    |  68 +++++++++
 16 files changed, 1448 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/unop_v_constraint-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-1.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-1.c
new file mode 100644
index 00000000000..d98755e7040
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-1.c
@@ -0,0 +1,83 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f1:
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (void * in, void * in2, void *out)
+{
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vv_i32m1 (v, v2, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vv_i32m1(v3, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1 (v4, v2, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (void * in, void * in2, void *out)
+{
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vv_i32m1_tu (v, v2, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vv_i32m1_tu(v3, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+  v4 = __riscv_vmacc_vv_i32m1_tu (v4, v2, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4); + vint32m1_t v3 = __riscv_vmacc_vv_i32m1_m (m, v, v2, v2, 4); + vint32m1_t v4 = __riscv_vmacc_vv_i32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vmacc_vv_i32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vmacc_vv_i32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vmacc_vv_i32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-2.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-2.c new file mode 100644 index 00000000000..e9d2c7f10a5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-2.c @@ -0,0 +1,83 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out) +{ + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4); + vint32m1_t v3 = __riscv_vmadd_vv_i32m1 (v, v2, v2, 4); + vint32m1_t v4 = __riscv_vmadd_vv_i32m1(v3, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1 (v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1 (v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1 (v4, v2, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out) +{ + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4); + vint32m1_t v3 = __riscv_vmadd_vv_i32m1_tu (v, v2, v2, 4); + vint32m1_t v4 = __riscv_vmadd_vv_i32m1_tu(v3, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_tu (v4, v2, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* +** f3: +** ... 
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4); + vint32m1_t v3 = __riscv_vmadd_vv_i32m1_m (m, v, v2, v2, 4); + vint32m1_t v4 = __riscv_vmadd_vv_i32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vmadd_vv_i32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-3.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-3.c new file mode 100644 index 00000000000..2f70761558d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-3.c @@ -0,0 +1,83 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmacc_vv_f32m1 (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmacc_vv_f32m1(v3, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1 (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmacc_vv_f32m1_tu (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmacc_vv_f32m1_tu(v3, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_tu (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f3: +** ... 
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmacc_vv_f32m1_m (m, v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmacc_vv_f32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfmacc_vv_f32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-4.c new file mode 100644 index 00000000000..0ba9c866b32 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-4.c @@ -0,0 +1,83 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmadd_vv_f32m1 (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmadd_vv_f32m1(v3, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1 (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmadd_vv_f32m1_tu (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmadd_vv_f32m1_tu(v3, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_tu (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f3: +** ... 
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfmadd_vv_f32m1_m (m, v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfmadd_vv_f32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfmadd_vv_f32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-5.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-5.c new file mode 100644 index 00000000000..e913cfe9ef8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-5.c @@ -0,0 +1,83 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmacc_vv_f32m1 (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmacc_vv_f32m1(v3, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1 (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmacc_vv_f32m1_tu (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmacc_vv_f32m1_tu(v3, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_tu (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f3: +** ... 
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmacc_vv_f32m1_m (m, v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmacc_vv_f32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfnmacc_vv_f32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-6.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-6.c new file mode 100644 index 00000000000..ced00a2e43e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vv_constraint-6.c @@ -0,0 +1,83 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmadd_vv_f32m1 (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmadd_vv_f32m1(v3, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1 (v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1 (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmadd_vv_f32m1_tu (v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmadd_vv_f32m1_tu(v3, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_tu (v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_tu (v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* +** f3: +** ... 
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vfnma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f3 (void * in, void * in2, void * in3, void *out) +{ + vbool32_t m = __riscv_vlm_v_b32 (in3, 4); + vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4); + vfloat32m1_t v3 = __riscv_vfnmadd_vv_f32m1_m (m, v, v2, v2, 4); + vfloat32m1_t v4 = __riscv_vfnmadd_vv_f32m1_m(m, v3, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_m (m, v4, v2, v2, 4); + v4 = __riscv_vfnmadd_vv_f32m1_m (m, v4, v2, v2, 4); + __riscv_vse32_v_f32m1 (out, v4, 4); +} + +/* { dg-final { scan-assembler-not {th.vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-1.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-1.c new file mode 100644 index 00000000000..34e6fe355a3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-1.c @@ -0,0 +1,71 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include "riscv_th_vector.h" + +/* +** f1: +** ... 
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+ +** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f1 (void * in, void * in2, void *out, int32_t x) +{ + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4); + vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4); + vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tu (v3, x, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* +** f2: +** ... +** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\) +** ... +** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t +** th.vle.v\tv[0-9]+,0\([a-x0-9]+\) +** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+ +** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+ +** th.vse.v\tv[0-9]+,0\([a-x0-9]+\) +** ret +*/ +void f2 (void * in, void * in2, void *out, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)in; + asm volatile ("":::"memory"); + vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in2, 4); + vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4); + vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tu (v3, x, v2, 4); + __riscv_vse32_v_i32m1 (out, v4, 4); +} + +/* +** f3: +** ... +** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\) +** ... 
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vse.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f3 (void * in, void * in2, void *out, int32_t x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tumu (mask, v3, x, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-2.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-2.c
new file mode 100644
index 00000000000..290981625bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */
+#include "riscv_th_vector.h"
+
+void f1 (void * in, void * in2, void *out, int32_t x)
+{
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1 (in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+void f2 (void * in, void * in2, void *out, int32_t x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+void f3 (void * in, void * in2, void *out, int32_t x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in2, 4);
+  vint32m1_t v3 = __riscv_vmacc_vx_i32m1 (v, x, v2, 4);
+  vint32m1_t v4 = __riscv_vmacc_vx_i32m1_tumu (mask, v3, x, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/* { dg-final { scan-assembler-times {th.vma[c-d][c-d]\.vx\s+v[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+\s+} 5 } } */
+/* { dg-final { scan-assembler-times {th.vma[c-d][c-d]\.vx\s+v[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,\s*v0.t} 1 } } */
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-3.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-3.c
new file mode 100644
index 00000000000..491cd2d42af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-3.c
@@ -0,0 +1,125 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcxtheadvector -mabi=lp64d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f0:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f0 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, -16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, -16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f1:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f1 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 15, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 15, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f2 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f3 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f4:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f4 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f5:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f5 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAAAAAAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAAAAAAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f6:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f6 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, x, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, x, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-4.c
new file mode 100644
index 00000000000..70f249bfc8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-4.c
@@ -0,0 +1,123 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32 -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f0:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f0 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, -16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, -16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f1:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f1 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 15, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 15, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f2 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f3 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f4:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f4 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f5:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** ...
+*/
+void f5 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, 0xAAAAAAAAAAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, 0xAAAAAAAAAAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f6:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** ...
+*/
+void f6 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1 (v2, x, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1 (v3, x, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-5.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-5.c
new file mode 100644
index 00000000000..3de929de136
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-5.c
@@ -0,0 +1,123 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32 -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f0:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f0 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, -16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, -16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f1:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f1 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, 15, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, 15, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f2 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, 16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, 16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f3 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, 0xAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, 0xAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f4:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** ...
+** ret
+*/
+void f4 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, 0xAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, 0xAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f5:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** ...
+*/
+void f5 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, 0xAAAAAAAAAAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, 0xAAAAAAAAAAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f6:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** ...
+*/
+void f6 (void * in, void *out, int64_t x, int n)
+{
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tu (v2, x, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tu (v3, x, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-6.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-6.c
new file mode 100644
index 00000000000..ceef8794297
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-6.c
@@ -0,0 +1,130 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32 -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f0:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f0 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask, v2, -16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask, v3, -16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f1:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f1 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, 15, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, 15, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f2 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, 16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, 16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f3 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, 0xAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, 0xAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f4:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f4 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, 0xAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, 0xAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f5:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** ...
+*/
+void f5 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, 0xAAAAAAAAAAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, 0xAAAAAAAAAAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f6:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** ...
+*/
+void f6 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_m (mask,v2, x, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_m (mask,v3, x, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-7.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-7.c
new file mode 100644
index 00000000000..6e524489176
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-7.c
@@ -0,0 +1,130 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32 -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f0:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f0 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask, v2, -16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask, v3, -16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f1:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f1 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, 15, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, 15, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f2 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, 16, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, 16, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f3 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, 0xAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, 0xAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f4:
+** ...
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vx\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** ...
+** ret
+*/
+void f4 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, 0xAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, 0xAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f5:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** ...
+*/
+void f5 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, 0xAAAAAAAAAAAAAAAA, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, 0xAAAAAAAAAAAAAAAA, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/*
+** f6:
+** ...
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** th.vma[c-d][c-d]\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+,v0.t
+** ...
+*/
+void f6 (void * in, void *out, int64_t x, int n)
+{
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, 4);
+  vint64m1_t v = __riscv_vle64_v_i64m1 (in + 1, 4);
+  vint64m1_t v2 = __riscv_vle64_v_i64m1_tu (v, in + 2, 4);
+  vint64m1_t v3 = __riscv_vmacc_vx_i64m1_tumu (mask,v2, x, v2, 4);
+  vint64m1_t v4 = __riscv_vmacc_vx_i64m1_tumu (mask,v3, x, v3, 4);
+  __riscv_vse64_v_i64m1 (out + 2, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-8.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-8.c
new file mode 100644
index 00000000000..16f03203276
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-8.c
@@ -0,0 +1,71 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f1:
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (void * in, void * in2, void *out, float x)
+{
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4);
+  vfloat32m1_t v3 = __riscv_vfmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfmacc_vf_f32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vse.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (void * in, void * in2, void *out, float x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, in2, 4);
+  vfloat32m1_t v3 = __riscv_vfmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfmacc_vf_f32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vse.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f3 (void * in, void * in2, void *out, float x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, in2, 4);
+  vfloat32m1_t v3 = __riscv_vfmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfmacc_vf_f32m1_tumu (mask, v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-9.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-9.c
new file mode 100644
index 00000000000..13bd7f762f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/ternop_vx_constraint-9.c
@@ -0,0 +1,71 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_th_vector.h"
+
+/*
+** f1:
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (void * in, void * in2, void *out, float x)
+{
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1 (in2, 4);
+  vfloat32m1_t v3 = __riscv_vfnmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfnmacc_vf_f32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vse.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (void * in, void * in2, void *out, float x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, in2, 4);
+  vfloat32m1_t v3 = __riscv_vfnmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfnmacc_vf_f32m1_tu (v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vlm.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vle.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+
+** th.vfnma[c-d][c-d]\.vf\tv[0-9]+,\s*[a-x0-9]+,\s*v[0-9]+,v0.t
+** th.vse.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f3 (void * in, void * in2, void *out, float x)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vfloat32m1_t v = __riscv_vle32_v_f32m1 (in, 4);
+  vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, in2, 4);
+  vfloat32m1_t v3 = __riscv_vfnmacc_vf_f32m1 (v, x, v2, 4);
+  vfloat32m1_t v4 = __riscv_vfnmacc_vf_f32m1_tumu (mask, v3, x, v2, 4);
+  __riscv_vse32_v_f32m1 (out, v4, 4);
+}
+
+/* { dg-final { scan-assembler-not {th.vmv} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/unop_v_constraint-1.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/unop_v_constraint-1.c
new file mode 100644
index 00000000000..95b35d3ad36
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/unop_v_constraint-1.c
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcxtheadvector -mabi=ilp32d -O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+#include "riscv_vector.h"
+
+/*
+** f1:
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f1 (void * in, void *out)
+{
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_tu (v, in, 4);
+  vint32m1_t v3 = __riscv_vneg_v_i32m1 (v2, 4);
+  vint32m1_t v4 = __riscv_vneg_v_i32m1_tu (v3, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f2:
+** ...
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f2 (void * in, void *out)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, in, 4);
+  vint32m1_t v3 = __riscv_vneg_v_i32m1 (v2, 4);
+  vint32m1_t v4 = __riscv_vneg_v_i32m1_m (mask, v3, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
+
+/*
+** f3:
+** ...
+** th.vlm\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ...
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\)
+** th.vle\.v\tv[0-9]+,0\([a-x0-9]+\),v0.t
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+** th.vrsub\.vx\tv[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t
+** th.vse\.v\tv[0-9]+,0\([a-x0-9]+\)
+** ret
+*/
+void f3 (void * in, void *out)
+{
+  vbool32_t mask = *(vbool32_t*)in;
+  asm volatile ("":::"memory");
+  vint32m1_t v = __riscv_vle32_v_i32m1 (in, 4);
+  vint32m1_t v2 = __riscv_vle32_v_i32m1_tumu (mask, v, in, 4);
+  vint32m1_t v3 = __riscv_vneg_v_i32m1 (v2, 4);
+  vint32m1_t v4 = __riscv_vneg_v_i32m1_tumu (mask, v3, v2, 4);
+  __riscv_vse32_v_i32m1 (out, v4, 4);
+}
-- 
2.17.1