From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 7810) id 568303858D38; Mon, 20 Feb 2023 09:21:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 568303858D38 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1676884890; bh=WPMTmIEIoMqf0Rsu3GWV3evuGYcT1T8jUfM0M1zgPGc=; h=From:To:Subject:Date:From; b=KhaQC3MBo5GhDV1mgPNrIF6c5gBC/Q6u8NP27IX7ihY5FodU1Rz8jTPkOrSNdgauC KBucbJOc7MtieyzyZ8sG6d6L8d6I3yHs6UHfu6sDOBtOFM1V7L82zqRwn0turrZPY7 VkjIJm34jcx3wqhYsjTr7oV9gV1LPw46LtYz2Qbo= MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="utf-8" From: Alex Coplan To: gcc-cvs@gcc.gnu.org Subject: [gcc r12-9190] aarch64: Fix up bfmlal lane pattern [PR104921] X-Act-Checkin: gcc X-Git-Author: Alex Coplan X-Git-Refname: refs/heads/releases/gcc-12 X-Git-Oldrev: c6e3ecca0e3dcf567d0c843a4987e52591041372 X-Git-Newrev: 5b99b0f1ee65ad50b5e6d3a57aace616273cc814 Message-Id: <20230220092130.568303858D38@sourceware.org> Date: Mon, 20 Feb 2023 09:21:29 +0000 (GMT) List-Id: https://gcc.gnu.org/g:5b99b0f1ee65ad50b5e6d3a57aace616273cc814 commit r12-9190-g5b99b0f1ee65ad50b5e6d3a57aace616273cc814 Author: Alex Coplan Date: Mon Feb 6 14:32:21 2023 +0000 aarch64: Fix up bfmlal lane pattern [PR104921] As the testcase shows, this pattern had an incorrect constraint leading to GCC's output getting rejected by the assembler. This patch fixes the constraint accordingly. The test is split into two: one that can run without bf16 support from the assembler and another that checks that the output actually assembles when such support is available. gcc/ChangeLog: PR target/104921 * config/aarch64/aarch64-simd.md (aarch64_bfmlal_lanev4sf): Use correct constraint for operand 3. gcc/testsuite/ChangeLog: PR target/104921 * gcc.target/aarch64/pr104921-1.c: New test. * gcc.target/aarch64/pr104921-2.c: New test. * gcc.target/aarch64/pr104921.x: Include file for new tests. (cherry picked from commit 277e1f30a5e4e634304a7b8a532825119f0ea47f) Diff: --- gcc/config/aarch64/aarch64-simd.md | 2 +- gcc/testsuite/gcc.target/aarch64/pr104921-1.c | 12 ++++++++++++ gcc/testsuite/gcc.target/aarch64/pr104921-2.c | 6 ++++++ gcc/testsuite/gcc.target/aarch64/pr104921.x | 9 +++++++++ 4 files changed, 28 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index a00e1c6ef8d..ee7f0b89cff 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -9047,7 +9047,7 @@ [(set (match_operand:V4SF 0 "register_operand" "=w") (plus: V4SF (match_operand:V4SF 1 "register_operand" "0") (unspec:V4SF [(match_operand:V8BF 2 "register_operand" "w") - (match_operand:VBF 3 "register_operand" "w") + (match_operand:VBF 3 "register_operand" "x") (match_operand:SI 4 "const_int_operand" "n")] BF_MLA)))] "TARGET_BF16_SIMD" diff --git a/gcc/testsuite/gcc.target/aarch64/pr104921-1.c b/gcc/testsuite/gcc.target/aarch64/pr104921-1.c new file mode 100644 index 00000000000..dcf6fe7d90d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr104921-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O2 -march=armv8.2-a+bf16 -std=gnu99 -save-temps" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "pr104921.x" + +/* +**foo: +** mov v([0-9]|1[0-5])\.8b, v16\.8b +** bfmlalb v0\.4s, v1\.8h, v([0-9]|1[0-5])\.h\[0\] +** ret +*/ diff --git a/gcc/testsuite/gcc.target/aarch64/pr104921-2.c b/gcc/testsuite/gcc.target/aarch64/pr104921-2.c new file mode 100644 index 00000000000..211fcd0aca9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr104921-2.c @@ -0,0 +1,6 @@ +/* { dg-do assemble } */ +/* { dg-add-options arm_v8_2a_bf16_neon } */ +/* { dg-additional-options "-O2 -std=gnu99" } */ +/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */ + +#include "pr104921.x" diff --git a/gcc/testsuite/gcc.target/aarch64/pr104921.x b/gcc/testsuite/gcc.target/aarch64/pr104921.x new file mode 100644 index 00000000000..1e1a6f24e22 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr104921.x @@ -0,0 +1,9 @@ +#include + +float32x4_t +foo(float32x4_t x, bfloat16x8_t a) +{ + register bfloat16x4_t b asm ("v16"); + asm volatile ("" : "=w"(b)); + return vbfmlalbq_lane_f32 (x, a, b, 0); +}