From: Ramana Radhakrishnan
Date: Wed, 27 Sep 2023 02:35:05 +0100
Subject: Re: [PATCH]AArch64 Add movi for 0 moves for scalar types [PR109154]
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com

On Wed, Sep 27, 2023 at 1:51 AM Tamar Christina wrote:
>
> Hi All,
>
> Following the Neoverse N/V and Cortex-A optimization guides SIMD 0 immediates
> should be created with a movi of 0.
>
> At the moment we generate an `fmov .., xzr` which is slower and requires a
> GP -> FP transfer.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         PR tree-optimization/109154
>         * config/aarch64/aarch64.md (*mov_aarch64, *movsi_aarch64,
>         *movdi_aarch64): Add new w -> Z case.
>         * config/aarch64/iterators.md (Vbtype): Add QI and HI.
>
> gcc/testsuite/ChangeLog:
>
>         PR tree-optimization/109154
>         * gcc.target/aarch64/fneg-abs_2.c: Updated.
>         * gcc.target/aarch64/fneg-abs_4.c: Updated.
>
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index b51f979dba12b726bff0c1109b75c6d2c7ae41ab..60c92213c75a2a4c18a6b59ae52fe45d1e872718 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1232,6 +1232,7 @@ (define_insn "*mov_aarch64"
>    "(register_operand (operands[0], mode)
>      || aarch64_reg_or_zero (operands[1], mode))"
>    {@ [cons: =0, 1; attrs: type, arch]
> +     [w, Z ; neon_move , simd ] movi\t%0., #0
>       [r, r ; mov_reg , * ] mov\t%w0, %w1
>       [r, M ; mov_imm , * ] mov\t%w0, %1
>       [w, D; neon_move , simd ] << aarch64_output_scalar_simd_mov_immediate (operands[1], mode);
> @@ -1289,6 +1290,7 @@ (define_insn_and_split "*movsi_aarch64"
>    "(register_operand (operands[0], SImode)
>      || aarch64_reg_or_zero (operands[1], SImode))"
>    {@ [cons: =0, 1; attrs: type, arch, length]
> +     [w , Z ; neon_move, simd, 4] movi\t%0.2d, #0
>       [r k, r ; mov_reg , * , 4] mov\t%w0, %w1
>       [r , k ; mov_reg , * , 4] ^
>       [r , M ; mov_imm , * , 4] mov\t%w0, %1
> @@ -1322,6 +1324,7 @@ (define_insn_and_split "*movdi_aarch64"
>    "(register_operand (operands[0], DImode)
>      || aarch64_reg_or_zero (operands[1], DImode))"
>    {@ [cons: =0, 1; attrs: type, arch, length]
> +     [w, Z ; neon_move, simd, 4] movi\t%0.2d, #0
>       [r, r ; mov_reg , * , 4] mov\t%x0, %x1
>       [k, r ; mov_reg , * , 4] mov\t%0, %x1
>       [r, k ; mov_reg , * , 4] mov\t%x0, %1
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 2451d8c2cd8e2da6ac8339eed9bc975cf203fa4c..d17becc37e230684beaee3c69e2a0f0ce612eda5 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -1297,6 +1297,7 @@ (define_mode_attr Vbtype [(V8QI "8b") (V16QI "16b")
>                            (V4SF "16b") (V2DF "16b")
>                            (DI "8b") (DF "8b")
>                            (SI "8b") (SF "8b")
> +                          (QI "8b") (HI "8b")
>                            (V4BF "8b") (V8BF "16b")])
>
>  ;; Advanced SIMD vector structure to element modes.
> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> index fb14ec3e2210e0feeff80f2410d777d3046a9f78..5e253d3059cfc9b93bd0865e6eaed1231eba19bd 100644
> --- a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c
> @@ -20,7 +20,7 @@ float32_t f1 (float32_t a)
>
>  /*
>  ** f2:
> -** fmov d[0-9]+, xzr
> +** movi v[0-9]+.2d, #0
>  ** fneg v[0-9]+.2d, v[0-9]+.2d
>  ** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
>  ** ret
> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> index 4ea0105f6c0a9756070bcc60d34f142f53d8242c..c86fe3e032c9e5176467841ce1a679ea47bbd531 100644
> --- a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c
> @@ -8,7 +8,7 @@
>
>  /*
>  ** negabs:
> -** fmov d[0-9]+, xzr
> +** movi v31.2d, #0
>  ** fneg v[0-9]+.2d, v[0-9]+.2d
>  ** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
>  ** ret
>

LGTM. I just clocked that the simd attribute is disabled with
-mgeneral-regs-only which allows for this to work .. Neat.

I cannot approve.

Ramana
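
For anyone reading the updated fneg-abs expectations in the quoted patch: the movi / fneg / orr sequence builds a sign-bit mask out of zero and ORs it into the input, which is why the zero has to live in an FP/SIMD register in the first place. Below is a minimal C sketch of the same bit manipulation on one 64-bit lane. It assumes IEEE 754 binary64 doubles, and the function names are hypothetical; none of this is part of the patch or the GCC testsuite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern (memcpy avoids aliasing issues). */
uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Reinterpret a 64-bit pattern as a double. */
double double_of(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Hypothetical helper mirroring the three instructions on one 64-bit lane. */
double neg_abs_via_mask(double a)
{
    uint64_t zero = 0;                                   /* movi: all-zero lane    */
    uint64_t mask = zero ^ UINT64_C(0x8000000000000000); /* fneg: flip sign bit    */
    return double_of(bits_of(a) | mask);                 /* orr: force sign bit on */
}

/* Small usage check: the result is always the negated magnitude of the input. */
void check_neg_abs(void)
{
    assert(neg_abs_via_mask(1.5) == -1.5);
    assert(neg_abs_via_mask(-2.0) == -2.0);
    assert(bits_of(neg_abs_via_mask(0.0)) == UINT64_C(0x8000000000000000));
}
```

The sketch only models the arithmetic. Which instruction materializes the zero (movi versus fmov from xzr) is the compiler's choice and is what the patch changes.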