From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id BE2393A77C15
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Apr 2021 14:30:36 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org BE2393A77C15
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 62714ED1;
 Wed, 28 Apr 2021 07:30:36 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E05F73F694;
 Wed, 28 Apr 2021 07:30:30 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org>
Mail-Followup-To: Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org>,
 Jonathan Wright <Jonathan.Wright@arm.com>, richard.sandiford@arm.com
Subject: Re: [PATCH 4/20] aarch64: Use RTL builtins for [su]paddl[q] intrinsics
References: <DBBPR08MB47581546D83D9633964DA800EB409@DBBPR08MB4758.eurprd08.prod.outlook.com>
Date: Wed, 28 Apr 2021 15:30:29 +0100
In-Reply-To: <DBBPR08MB47581546D83D9633964DA800EB409@DBBPR08MB4758.eurprd08.prod.outlook.com>
 (Jonathan Wright via Gcc-patches's message of "Wed, 28 Apr 2021
 13:51:15 +0000")
Message-ID: <mptwnsmfpmi.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Apr 2021 14:30:38 -0000

Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi,
>
> As subject, this patch rewrites the [su]paddl[q] Neon intrinsics to use
> RTL builtins rather than inline assembly code, allowing for better
> scheduling and optimization.
>
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?

OK, thanks.  For the record…

>  __extension__ extern __inline uint64x1_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vpaddl_u32 (uint32x2_t __a)
>  {
> -  uint64x1_t __result;
> -  __asm__ ("uaddlp %0.1d,%1.2s"
> -           : "=w"(__result)
> -           : "w"(__a)
> -           : /* No clobbers */);
> -  return __result;
> +  return (uint64x1_t) __builtin_aarch64_uaddlpv2si_uu (__a);
>  }

…I wasn't sure for this whether it would be better to use (uint64x1_t) {…}
instead of a scalar-to-vector conversion, since that seems to be the more
common style in the rest of arm_neon.h.  But there are already instances
of this kind of conversion too, and if anything it should be more
efficient than creating a distinct vector object.
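
To illustrate the two styles, here is a minimal sketch assuming the GNU C
vector extension.  The names u64x1, from_cast, from_literal and
check_styles_agree are hypothetical: u64x1 stands in for uint64x1_t so that
the snippet builds without arm_neon.h, and the plain uint64_t argument
stands in for the scalar value being converted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for arm_neon.h's uint64x1_t: a one-element vector
   of 64-bit integers built with the GNU C vector extension.  */
typedef uint64_t u64x1 __attribute__ ((vector_size (8)));

/* Style used in the patch: cast the scalar straight to the vector type.
   The sizes match, so the cast reinterprets the same 64 bits.  */
static inline u64x1
from_cast (uint64_t x)
{
  return (u64x1) x;
}

/* Style that is more common in arm_neon.h: build the vector from an
   initializer list.  */
static inline u64x1
from_literal (uint64_t x)
{
  return (u64x1) { x };
}

/* Both styles should yield the same single lane.  */
static void
check_styles_agree (uint64_t x)
{
  u64x1 a = from_cast (x);
  u64x1 b = from_literal (x);
  assert (a[0] == x);
  assert (b[0] == x);
}
```

Both helpers return the same value; the difference is only in how the
source expresses the conversion.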

Richard