From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 9945 invoked by alias); 12 Dec 2019 16:09:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 9933 invoked by uid 89); 12 Dec 2019 16:09:23 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-7.2 required=5.0 tests=AWL,BAYES_00,KAM_SHORT,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy=kyrill, timeframe, multilib, sk:static. X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 12 Dec 2019 16:09:21 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 16EF430E; Thu, 12 Dec 2019 08:09:20 -0800 (PST) Received: from [10.2.80.62] (e120808-lin.cambridge.arm.com [10.2.80.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 95EBD3F6CF; Thu, 12 Dec 2019 08:09:19 -0800 (PST) Subject: Re: [PATCH][ARM][GCC][0/x]: Support for MVE ACLE intrinsics. To: Srinath Parvathaneni , "gcc-patches@gcc.gnu.org" Cc: Richard Earnshaw References: <157375666998.31400.16652205595246718910.scripted-patch-series@arm.com> From: Kyrill Tkachov Message-ID: <4a4e4d86-ec1c-fe31-266a-60cc8dee8c9c@foss.arm.com> Date: Thu, 12 Dec 2019 16:09:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.1 MIME-Version: 1.0 In-Reply-To: <157375666998.31400.16652205595246718910.scripted-patch-series@arm.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit X-SW-Source: 2019-12/txt/msg00896.txt.bz2 Hi Srinath, On 11/14/19 7:12 PM, Srinath Parvathaneni wrote: > Hello, > > This patches series is to support Arm MVE ACLE intrinsics. > > Please refer to Arm reference manual [1] and MVE intrinsics [2] for > more details. > Please refer to Chapter 13 MVE ACLE [3] for MVE intrinsics concepts. > > This patch series depends on upstream patches "Armv8.1-M Mainline > Security Extension" [4], > "CLI and multilib support for Armv8.1-M Mainline MVE extensions" [5] > and "support for Armv8.1-M > Mainline scalar shifts" [6]. > > [1] > https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914 > [2] > https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics > [3] > https://static.docs.arm.com/101028/0009/Q3-ACLE_2019Q3_release-0009.pdf?_ga=2.239684871.588348166.1573726994-1501600630.1548848914 > [4] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01654.html > [5] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00641.html > [6] https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01194.html > > Srinath Parvathaneni(38): > [PATCH][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch. > [PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch. > [PATCH][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch. > [PATCH][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics. > [PATCH][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with > unary operand. > [PATCH][ARM][GCC][2/1x]: MVE intrinsics with unary operand. > [PATCH][ARM][GCC][3/1x]: MVE intrinsics with unary operand. > [PATCH][ARM][GCC][4/1x]: MVE intrinsics with unary operand. > [PATCH][ARM][GCC][1/2x]: MVE intrinsics with binary operands. > [PATCH][ARM][GCC][2/2x]: MVE intrinsics with binary operands. > [PATCH][ARM][GCC][3/2x]: MVE intrinsics with binary operands. > [PATCH][ARM][GCC][4/2x]: MVE intrinsics with binary operands. > [PATCH][ARM][GCC][5/2x]: MVE intrinsics with binary operands. > [PATCH][ARM][GCC][1/3x]: MVE intrinsics with ternary operands. > [PATCH][ARM][GCC][2/3x]: MVE intrinsics with ternary operands. > [PATCH][ARM][GCC][3/3x]: MVE intrinsics with ternary operands. > [PATCH][ARM][GCC][1/4x]: MVE intrinsics with quaternary operands. > [PATCH][ARM][GCC][2/4x]: MVE intrinsics with quaternary operands. > [PATCH][ARM][GCC][3/4x]: MVE intrinsics with quaternary operands. > [PATCH][ARM][GCC][4/4x]: MVE intrinsics with quaternary operands. > [PATCH][ARM][GCC][1/5x]: MVE store intrinsics. > [PATCH][ARM][GCC][2/5x]: MVE load intrinsics. > [PATCH][ARM][GCC][3/5x]: MVE store intrinsics with predicated suffix. > [PATCH][ARM][GCC][4/5x]: MVE load intrinsics with zero(_z) suffix. > [PATCH][ARM][GCC][5/5x]: MVE ACLE load intrinsics which load a byte, > halfword, or word from memory. > [PATCH][ARM][GCC][6/5x]: Remaining MVE load intrinsics which loads > half word and word or double word from memory. > [PATCH][ARM][GCC][7/5x]: MVE store intrinsics which stores byte,half > word or word to memory. > [PATCH][ARM][GCC][8/5x]: Remaining MVE store intrinsics which stores > an half word, word and double word to memory. > [PATCH][ARM][GCC][6x]:MVE ACLE vaddq intrinsics using arithmetic plus > operator. > [PATCH][ARM][GCC][7x]: MVE vreinterpretq and vuninitializedq intrinsics. > [PATCH][ARM][GCC][1/8x]: MVE ACLE vidup, vddup, viwdup and vdwdup > intrinsics with writeback. > [PATCH][ARM][GCC][2/8x]: MVE ACLE gather load and scatter store > intrinsics with writeback. > [PATCH][ARM][GCC][9x]: MVE ACLE predicated intrinsics with (dont-care) > variant. > [PATCH][ARM][GCC][10x]: MVE ACLE intrinsics "add with carry across > beats" and "beat-wise substract". > [PATCH][ARM][GCC][11x]: MVE ACLE vector interleaving store and > deinterleaving load intrinsics and also aliases to vstr and vldr > intrinsics. > [PATCH][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane. > [PATCH][ARM][GCC][13x]: MVE ACLE scalar shift intrinsics. > [PATCH][ARM][GCC][14x]: MVE ACLE whole vector left shift with carry > intrinsics. > Thank you for working on these. I will reply to individual patches with more targeted comments. As this is a fairly large amount of code, here's my high-level view: The MVE intrinsics spec has more complexities than the Neon intrinsics one: * It needs support for both the user-namespace versions, and the __arm_* ones. * There are also overloaded forms that in C are implemented using _Generic. The above two facts make for a rather bulky and messy arm_mve.h implementation. In the case of the _Generic usage we hit the performance problems reported in PR c/91937. Ideally, I'd like to see the frontend parts of these intrinsics implemented in a similar way to the SVE ACLE (https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00413.html) i.e. have the compiler inject the right functions into the language and do overload resolution through the appropriate hooks, thus keeping the (unavoidable) complexity in the backend rather than arm_mve.h That being said, this is a major feature that I would very much like to see in GCC 10 and the current implementation, outside of the new .md file and arm_mve.h file, doesn't disturb anything in the backend it shouldn't. That is, implementing the SVE ACLE approach shouldn't require any risky backend surgery, just ripping out a chunk of code and replacing it with another chunk of code. So I'll accept the current approach for GCC 10 as long as we improve the frontend parts for theĀ  GCC 11 timeframe. My reviews on the individual patches will be in this context. Thanks, Kyrill > Regards, > Srinath. > > Entire patch series attached to cover letter.