From: Richard Sandiford
To: "Andre Vieira (lists)"
Cc: gcc-patches@gcc.gnu.org, jakub@redhat.com, Richard Biener
Subject: Re: [RFC 0/X] Implement GCC support for AArch64 libmvec
Date: Thu, 20 Apr 2023 15:51:02 +0100
In-Reply-To: (Andre Vieira's message of "Wed, 8 Mar 2023 16:17:33 +0000")

"Andre Vieira (lists)" writes:
> Hi all,
>
> This is a series of patches/RFCs to implement support in GCC to be able
> to target AArch64's libmvec functions that will be/are being added to glibc.
> We have chosen to use the omp pragma '#pragma omp declare variant ...'
> with a simd construct as the way for glibc to inform GCC what functions
> are available.
>
> For example, if we would like to supply a vector version of the scalar
> 'cosf' we would have an include file with something like:
> typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
> typedef __attribute__((__neon_vector_type__(2))) float __f32x2_t;
> typedef __SVFloat32_t __sv_f32_t;
> typedef __SVBool_t __sv_bool_t;
> __f32x4_t _ZGVnN4v_cosf (__f32x4_t);
> __f32x2_t _ZGVnN2v_cosf (__f32x2_t);
> __sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
> #pragma omp declare variant(_ZGVnN4v_cosf) \
>      match(construct = {simd(notinbranch, simdlen(4))}, device = {isa("simd")})
> #pragma omp declare variant(_ZGVnN2v_cosf) \
>      match(construct = {simd(notinbranch, simdlen(2))}, device = {isa("simd")})
> #pragma omp declare variant(_ZGVsMxv_cosf) \
>      match(construct = {simd(inbranch)}, device = {isa("sve")})
> extern float cosf (float);
>
> The BETA ABI can be found in the vfabia64 subdir of
> https://github.com/ARM-software/abi-aa/
> This currently disagrees with how this patch series implements 'omp
> declare simd' for SVE and I also do not see a need for the 'omp declare
> variant' scalable extension constructs. I will make changes to the ABI
> once we've finalized the co-design of the ABI and this implementation.

I don't see a good reason for dropping the extension("scalable").
The problem is that since the base spec requires a simdlen clause,
GCC should in general raise an error if simdlen is omitted. Relaxing
that for an explicit extension seems better than doing it only based
on the ISA (which should in general be a free-form string). Having
"scalable" in the definition also helps to make the intent clearer.

Any change to the declare simd behaviour should probably be agreed
with the LLVM folks first. Like you say, we already know that GCC
can do your version, since it already does the equivalent thing for x86.
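
As an aside on the quoted example: the variant names themselves already
record whether a variant has a fixed length. Below is a minimal sketch,
not part of the patch series, that splits a name of the form
_ZGV<isa><mask><len><params>_<scalar> into its fields. The struct
vfabi_name and the function vfabi_decode are hypothetical helper names
invented for illustration. The sketch assumes only the 'n' (Advanced SIMD)
and 's' (SVE) ISA letters that appear in the example, and it does not
validate the individual parameter codes.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Decoded form of a vector function name such as _ZGVnN4v_cosf.
   Hypothetical helper type, for illustration only.  */
struct vfabi_name
{
  char isa;           /* 'n' = Advanced SIMD, 's' = SVE.  */
  bool masked;        /* true for 'M' (inbranch), false for 'N'.  */
  bool scalable;      /* true when the length field is 'x'.  */
  unsigned lanes;     /* Lane count; 0 when scalable.  */
  const char *params; /* Start of the parameter codes, e.g. "v".  */
  size_t params_len;  /* Number of characters of parameter codes.  */
  const char *scalar; /* Scalar function name, e.g. "cosf".  */
};

/* Return true and fill *OUT if NAME has the form
   _ZGV<isa><mask><len><params>_<scalar>; return false otherwise.
   The pointers stored in *OUT point into NAME.  */
bool
vfabi_decode (const char *name, struct vfabi_name *out)
{
  if (strncmp (name, "_ZGV", 4) != 0)
    return false;
  const char *p = name + 4;

  /* ISA letter: only the two used in the quoted example are accepted.  */
  if (*p != 'n' && *p != 's')
    return false;
  out->isa = *p++;

  /* Mask letter.  */
  if (*p != 'M' && *p != 'N')
    return false;
  out->masked = (*p++ == 'M');

  /* Length: 'x' for scalable, otherwise a decimal lane count.  */
  if (*p == 'x')
    {
      out->scalable = true;
      out->lanes = 0;
      p++;
    }
  else if (isdigit ((unsigned char) *p))
    {
      char *end;
      out->scalable = false;
      out->lanes = (unsigned) strtoul (p, &end, 10);
      p = end;
    }
  else
    return false;

  /* Parameter codes run up to the first underscore; the scalar name
     follows it and may itself contain underscores.  */
  const char *sep = strchr (p, '_');
  if (sep == NULL || sep == p || sep[1] == '\0')
    return false;
  out->params = p;
  out->params_len = (size_t) (sep - p);
  out->scalar = sep + 1;
  return true;
}
```

Calling vfabi_decode on "_ZGVsMxv_cosf" reports a masked SVE variant with
scalable set and no lane count, whereas "_ZGVnN4v_cosf" reports an unmasked
Advanced SIMD variant with 4 lanes. That is the same distinction that a
simdlen clause or an explicit extension("scalable") would spell out in the
pragma.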

I'm not sure, but I'm guessing the declare simd VFABI was written that way
because, at the time (several years ago), there were concerns about
switching SVE on and off on a function-by-function basis in LLVM.

But I'm not sure it makes sense to ignore -msve-vector-bits= when
compiling the SVE version (which is what patch 4 seems to do).
If someone compiles with -march=armv8.4-a, we'll use all Armv8.4-A
features in the Advanced SIMD routines. Why should we ignore
SVE-related target information for the SVE routines?

Of course, the fact that we take command-line options into account
means that omp simd/variant clauses on linkonce/comdat group functions
are an ODR violation waiting to happen. But the same is true for the
original scalar functions that the clauses are attached to.

Thanks,
Richard

> The patch series has three main steps:
> 1) Add SVE support for 'omp declare simd', see PR 96342
> 2) Enable GCC to use omp declare variants with simd constructs as simd
> clones during auto-vectorization.
> 3) Add SLP support for vectorizable_simd_clone_call (This sounded like a
> nice thing to add as we want to move away from non-slp vectorization).
>
> Below you can see the list of current Patches/RFCs, the difference being
> on how confident I am of the proposed changes. For the RFC I am hoping
> to get early comments on the approach, rather than more indepth
> code-reviews.
>
> I appreciate we are still in Stage 4, so I can completely understand if
> you don't have time to review this now, but I thought it can't hurt to
> post these early.
>
> Andre Vieira:
> [PATCH] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS
> [PATCH] parloops: Copy target and optimizations when creating a function
> clone
> [PATCH] parloops: Allow poly nit and bound
> [RFC] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]
> [RFC] omp: Create simd clones from 'omp declare variant's
> [RFC] omp: Allow creation of simd clones from omp declare variant with
> -fopenmp-simd flag
>
> Work in progress:
> [RFC] vect: Enable SLP codegen for vectorizable_simd_clone_call