From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1816)
	id 5B8E63858D32; Mon, 24 Apr 2023 10:32:53 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5B8E63858D32
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1682332373;
	bh=NqtfnDUclTezPK/DDrVb4qh5khjoBcJvxL7zxKCQmlc=;
	h=From:To:Subject:Date:From;
	b=Ah351HLb7Bs079dRPI+eIgwPLA4VZOxjeINBzOzZw4jZn52EqJ1ymbJiY2Q+jsYQF
	 EVbDfaCQfhhBzE7zK8YTQrEwcVZjlTR5MUCengryS7fzsIK97hpKmzpjJNZd5RygvF
	 pe911fwl4qmmIh8CGRP9pJ4YWEOTQzWHocC8nIQE=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Kyrylo Tkachov
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r14-196] aarch64: Add mulv2di3 expander for TARGET_SVE
X-Act-Checkin: gcc
X-Git-Author: Kyrylo Tkachov
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 2abc7b68ae9e5becf9ef7c1776293a11d05f3229
X-Git-Newrev: b98c63e9e8ceaf9e04c28d83500f98313284c7f8
Message-Id: <20230424103253.5B8E63858D32@sourceware.org>
Date: Mon, 24 Apr 2023 10:32:53 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:b98c63e9e8ceaf9e04c28d83500f98313284c7f8

commit r14-196-gb98c63e9e8ceaf9e04c28d83500f98313284c7f8
Author: Kyrylo Tkachov
Date:   Mon Apr 24 11:32:15 2023 +0100

    aarch64: Add mulv2di3 expander for TARGET_SVE

    Motivated by a recent LLVM patch I saw, we can use SVE for 64-bit vector
    integer MUL (plain Advanced SIMD doesn't support it). Since the Advanced
    SIMD regs are just the low 128-bit part of the SVE regs it all works
    transparently. It's a reasonably straightforward implementation of the
    mulv2di3 optab that wires it up through the mulvnx2di3 expander and
    subregs the results back to the Advanced SIMD modes.

    There are more such tricks possible with other operations (and we could
    do 64-bit multiply-add merged operations too), but for now this
    self-contained patch improves the mul case. Without it, for the testcases
    in the patch we would have scalarised the arguments, moved them to GP
    regs, performed two GP MULs and moved them back to SIMD regs. Advertising
    a mulv2di3 optab from the backend should also allow for more flexible
    vectorisation opportunities.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    gcc/ChangeLog:

            * config/aarch64/aarch64-simd.md (mulv2di3): New expander.

    gcc/testsuite/ChangeLog:

            * gcc.target/aarch64/sve-neon-modes_1.c: New test.
            * gcc.target/aarch64/sve-neon-modes_2.c: New test.

Diff:
---
 gcc/config/aarch64/aarch64-simd.md                 | 20 ++++++++++++++
 .../gcc.target/aarch64/sve-neon-modes_1.c          | 31 ++++++++++++++++++++++
 .../gcc.target/aarch64/sve-neon-modes_2.c          | 30 +++++++++++++++++++++
 3 files changed, 81 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index e420f58633a..9f2fce6f033 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -458,6 +458,26 @@
   [(set_attr "type" "neon_mul_")]
 )
 
+;; Advanced SIMD does not support vector DImode MUL, but SVE does.
+;; Make use of the overlap between Z and V registers to implement the V2DI
+;; optab for TARGET_SVE.  The mulvnx2di3 expander can
+;; handle the TARGET_SVE2 case transparently.
+(define_expand "mulv2di3"
+  [(set (match_operand:V2DI 0 "register_operand")
+        (mult:V2DI (match_operand:V2DI 1 "register_operand")
+                   (match_operand:V2DI 2 "aarch64_sve_vsm_operand")))]
+  "TARGET_SVE"
+  {
+    machine_mode sve_mode = VNx2DImode;
+    rtx sve_op0 = simplify_gen_subreg (sve_mode, operands[0], V2DImode, 0);
+    rtx sve_op1 = simplify_gen_subreg (sve_mode, operands[1], V2DImode, 0);
+    rtx sve_op2 = simplify_gen_subreg (sve_mode, operands[2], V2DImode, 0);
+
+    emit_insn (gen_mulvnx2di3 (sve_op0, sve_op1, sve_op2));
+    DONE;
+  }
+)
+
 (define_insn "bswap2"
   [(set (match_operand:VDQHSD 0 "register_operand" "=w")
         (bswap:VDQHSD (match_operand:VDQHSD 1 "register_operand" "w")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c
new file mode 100644
index 00000000000..ce4f1c70bcc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv8.2-a+sve" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef long v2di __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	ptrue	p0.b, all
+**	mul	z0.d, p0/m, z0.d, z1.d
+**	ret
+*/
+
+v2di
+foo (v2di a, v2di b)
+{
+  return a * b;
+}
+
+/*
+** foo_imm:
+**	mul	z0.d, z0.d, #125
+**	ret
+*/
+
+v2di
+foo_imm (v2di a)
+{
+  return a * 125;
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c
new file mode 100644
index 00000000000..02bb9f34c38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve-neon-modes_2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv8.5-a+sve2" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef long v2di __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	mul	z0.d, z0.d, z1.d
+**	ret
+*/
+
+v2di
+foo (v2di a, v2di b)
+{
+  return a * b;
+}
+
+/*
+** foo_imm:
+**	mul	z0.d, z0.d, #125
+**	ret
+*/
+
+v2di
+foo_imm (v2di a)
+{
+  return a * 125;
+}
+
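As a rough illustration of what the testcases above compute (this sketch is not part of the commit), the V2DI multiply can be written in GNU C alongside a hand-scalarised version that mirrors the "two GP MULs" sequence described in the commit message. The helper names mul_v2di, mul_v2di_scalarised and mul_v2di_imm are hypothetical, and int64_t stands in for the testcases' long. Which instructions each function compiles to depends on the compiler version and -march flags, so no particular code generation is claimed here.

```c
#include <stdint.h>

/* Two 64-bit lanes in 16 bytes, matching the layout of the v2di
   typedef used by the testcases.  */
typedef int64_t v2di __attribute__((vector_size (16)));

/* Lane-wise multiply through the GNU vector extension.  This is the
   form the middle end can map onto a mulv2di3 optab when the target
   provides one.  */
v2di
mul_v2di (v2di a, v2di b)
{
  return a * b;
}

/* Hypothetical hand-scalarised equivalent: extract each lane,
   multiply the scalars, and rebuild the vector.  */
v2di
mul_v2di_scalarised (v2di a, v2di b)
{
  v2di r;
  r[0] = a[0] * b[0];
  r[1] = a[1] * b[1];
  return r;
}

/* Multiply by a constant, as in foo_imm; the scalar is broadcast to
   both lanes.  */
v2di
mul_v2di_imm (v2di a)
{
  return a * 125;
}
```

Both multiply helpers should return the same lanes for any inputs that do not overflow; only the generated code is expected to differ between the vector and scalarised forms.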