From: lkcl
Date: Fri, 24 Jun 2022 12:38:39 +0100
Subject: Draft Simple-V roadmap for Power ISA (was: [PATCH v3 0/6] ppc/svp64: support SVP64 and its first insns)
To: Dmitry Selyutin
Cc: Binutils, Alan Modra, Jan Beulich, Nick Alcock, Richard Earnshaw, Andreas Schwab, Libre-Soc General Development
References: <20220621115115.1068453-1-ghostmansd@gmail.com> <20220623193734.1245650-1-ghostmansd@gmail.com>

On Thu, Jun 23, 2022 at 8:46 PM Dmitry Selyutin wrote:

> Hi folks, many thanks for your tips, suggestions and ideas on
> improvements!

it's greatly appreciated, everyone, you as well, Dmitry.

just so everyone knows, the bulk of the work for binutils, adding
Draft Cray-style Scalable Vectors to the Power ISA is, astoundingly,
pretty much done.

there are *NO* actual Vector instructions in SV. we will NOT be
submitting 200-5,000 Vector opcodes as would normally be done in any
other Scalable Vector ISA: ppc64-opc.c contains the *entirety* of the
Vector "contextualisation" of *pre-existing* Scalar instructions [9]

context and roadmap:

* Simple-V is named "simple" because it adheres to a strict RISC
  paradigm [extended into the Scalable Vector space]
* there are only 5 actual "management" instructions: setvl, svstep,
  svremap, svshape, svindex (TODO [0])
  these last three are for hardware-controlled "Structure Packing"
  such as Matrices and other Dimensional shuffling, and full
  triple-loop DCT/FFT (normally only found in VLIW DSPs)
* there is "borrowing" of 25% of the EXT001 64-bit prefix space
  which gives 24 bits to "categorise" every Scalar instruction,
  according to their register profile [1]

as new *Scalar* instructions get added to the Power ISA, then if it is
appropriate to do so [8] they would correspondingly have to be run
through the "register profile analysis" [2] and, for sanity's sake, the
ppc64-opc.[ch] auto-generator re-run [3].
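the "25% of EXT001 gives 24 bits" arithmetic can be checked with a few
lines of python. this is only a sketch: it assumes the usual Power ISA
v3.1 layout of a 32-bit prefix word whose first 6 bits are the primary
opcode, and the names (PREFIX_BITS, PO_BITS, bits_for_share) are
hypothetical, not taken from binutils or the Libre-SOC sources. it says
nothing about *which* bits are used.

```python
import math

# assumption: a 64-bit prefixed instruction is a 32-bit prefix word
# followed by a 32-bit suffix, and the prefix word spends 6 bits on
# the primary opcode (EXT001 is primary opcode 1).
PREFIX_BITS = 32
PO_BITS = 6
free_bits = PREFIX_BITS - PO_BITS  # bits left after the primary opcode


def bits_for_share(share):
    """Bits left over when only `share` of the prefix space is taken.

    Claiming 1/2**n of the space means fixing n selector bits, so a
    25% share fixes 2 bits and leaves the rest free.
    """
    selector_bits = round(-math.log2(share))
    return free_bits - selector_bits


sv_bits = bits_for_share(0.25)
print(free_bits)     # 26
print(sv_bits)       # 24
print(2 ** sv_bits)  # 16777216 distinct prefix values
```

so a quarter of the prefix space costs two fixed bits, and the remaining
24 are what is available to describe how the Scalar suffix is extended.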
now, we *also* happen to be developing some Scalar instructions. it's
really important to emphasise that these have absolutely nothing to do
with SV, at all. these Scalar instructions are designed to bring the
Scalar Power ISA up-to-date in many areas outside of its primary (and
perfectly reasonable and understandable) use-case to date [IBM's
high-end customers].

example: i'm currently designing a bitmanip-mask instruction which
covers the entirety of BMI and TBM [4] *and* RVV's vsbfm suite. none of
these were needed for any IBM workloads / customers so it is perfectly
reasonable that they were never considered.

there's also a pair of biginteger math operations, one of which is a
variant of the intel "mulx" instruction [5]. the majority of the list is
on the bitmanip page [6]; these will take some time simply because
there's a lot of them (approx. 80-100)

still on the TODO list:

* macro support (including the "8" of element-width=8, sorry Dmitry!) [7]
* svindex for doing vector-looped GPR[RT] = GPR(GPR(RA)) [0]
* submit scalar instructions [6] and corresponding ppc64-opc.[ch] [2][3]

that's basically it. there's no binutils-level subsetting of SVP64
because the lower SV Compliancy Levels require soft-emulation through
illegal instruction traps. there are no Vector instructions to add:
everything Scalable-Vectorised is in the 24-bit Prefix.

overall, then, the strict RISC paradigm creates one hell of a lot less
work for everyone, yet brings something mind-melting like 2 million
intrinsics to the Power ISA. that is only manageable by sticking
strictly to RISC principles.

l.

[0] https://bugs.libre-soc.org/show_bug.cgi?id=867
[1] fascinatingly this approach was exactly the one that Peter Hsu and
    his team at MIPS came up with around 1995, when they were
    developing the R8000.
[2] https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/sv_analysis.py;hb=HEAD
[3] https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/sv_binutils.py;hb=HEAD
[4] https://bugs.libre-soc.org/show_bug.cgi?id=865#c1
[5] https://libre-soc.org/openpower/sv/biginteger/
[6] https://libre-soc.org/openpower/sv/bitmanip/
[7] https://bugs.libre-soc.org/show_bug.cgi?id=849
[8] new Scalar instructions have to make sense in a Vector context
    ("scalar===element") before they can be Prefixed to extend to
    multiple elements. mtmsr doesn't qualify, for example, because
    there's only ever going to be one MSR. sc makes no sense, but
    weirdly td/tw tdi/twi do.
[9] we tried breaking the rule by adding Vector opcodes without a
    corresponding identical Scalar instruction: it went very badly.
    lesson learned.
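
to make the "scalar===element" idea in [8] concrete, here is a minimal
python sketch of a Prefix turning one Scalar operation into a loop over
elements. everything in it is an assumption made for illustration: the
names (scalar_add, sv_prefixed), the plain list standing in for the GPR
file, and the simple "register number + element index" stepping. it
ignores predication, element-width overrides, scalar/vector register
marking and REMAP, and it is not the SVP64 specification.

```python
MASK64 = (1 << 64) - 1  # 64-bit GPRs


def scalar_add(regs, rt, ra, rb):
    """A plain Scalar operation: GPR[RT] = GPR[RA] + GPR[RB]."""
    regs[rt] = (regs[ra] + regs[rb]) & MASK64


def sv_prefixed(op, regs, vl, rt, ra, rb):
    """Hypothetical model of a Prefix: repeat the *same* Scalar
    operation VL times, stepping each register number by one element
    per iteration. No new Vector opcode is involved."""
    for i in range(vl):
        op(regs, rt + i, ra + i, rb + i)


regs = [0] * 128                 # illustrative register file size
regs[8:12] = [1, 2, 3, 4]        # "vector" starting at r8
regs[16:20] = [10, 20, 30, 40]   # "vector" starting at r16

sv_prefixed(scalar_add, regs, vl=4, rt=24, ra=8, rb=16)
print(regs[24:28])  # [11, 22, 33, 44]
```

an instruction such as mtmsr has no meaningful "element i" to step to,
which is the sense in which [8] says it does not qualify.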