From: Luke Kenneth Casson Leighton
Date: Wed, 18 May 2022 10:57:57 +0100
Subject: Re: PPC binutils opcodes
To: Peter Bergner
Cc: binutils@sourceware.org, Alan Modra, Dmitry Selyutin,
    "Toshaan Bharvani | VanTosh", Paul Mackerras
In-Reply-To: <59802a7f-7d83-e1af-b422-1617f2db2489@linux.ibm.com>

On Wed, May 18, 2022 at 4:32 AM Peter Bergner wrote:

> On 5/11/22 6:38 PM, Alan Modra via Binutils wrote:
> > I don't see any reason why you shouldn't use -mlibresoc or whatever
> > you choose.
>
> +1
>
> There is also no problem if your cpu uses a particular set of major and/or
> minor opcode bits that another cpu already uses or a new cpu might use in
> the future.

this is the bit that makes me jumpy / nervous, from a Specification
Management perspective.  it's where RISC-V, due to its popularity, is
inexorably and irredeemably heading in the direction of a public opcode
conflict.

aside from being a niggle for developers (who have to specify the right
flags, but as you point out below they have to do that anyway), as long
as there is no customer-need / drive / call from a *different* vendor to
utilise two or more operations that have publicly-conflicting opcodes,
everything's perfectly fine.

remember, we're not designing instructions that are intended for
specialist niche areas, or that are hidden behind proprietary compilers
(thank you for describing this to me yesterday, Toshaan, about the
private IBM product) where yes the product is popular but no its users
are *not* permitted to compile their own source code: they get binaries
pre-compiled by the vendor (IBM), and it's the vendor (IBM) that keeps
the [conflicting] compiler entirely secret and out of the public eye.

this is *not* of concern: such scenarios can have as many conflicts as
they like and there will be zero [public] problems.

the driving motivation for what we're doing is as a Hybrid CPU-VPU-GPU
mass-volume target.  one product family, able to do the exact same job
as ARM+MALI, or Intel i5/i7/i9 + i950 Graphics, or AMD Ryzen + Radeon,
and with the same expected popularity and market reach as those products.

however unlike those products, instead of 2 disparate ISAs and
associated brain-melting userspace-kernelspace-GPUspace RPC marshalling
mechanisms across PCIe buses (aka, "a Graphics Driver"), we're
envisioning a single unified ISA that is *completely* public and
completely end-user-programmable.  you want a huge Vector cosine
operation?  no problem, there's an opcode and associated intrinsic for
that, right here, right now.

a la ARM SVE2 and AVX-512, we fully expect, once people get over the
"disbelief" hump of a small team being able to achieve this (and think
at that level), that other vendors will go "dang, we need those opcodes
too, our customers are already clamoring for them because of the
performance increase / power-reduction / etc."

and if those vendors happen to be *already using* that same opcode
space for other customers that they're already supplying product to,
long-term, and those customers fully expect to recompile their source
code right here, right now, using both conflicting operations *in the
same binary*, that is where the nightmares begin.

opcodes that have been explicitly marked - officially - in the spec -
as deprecated, i have no problem with, because those will be for legacy
applications that will be covered by a PCR bit anyway.

sotto voce: bottom line here is, the last thing we need is for IBM to
be pissed at us for wrecking their 25-year-old superb and highly stable
ISA, *even if we warned them loudly of the possibility!* :)

> We already have multiple cases of this in the opcode table.
> You just have to ensure that your cpu's bitmask (ie, the combination of
> powerpc_opcode flags that describe what your cpu implements) doesn't enable
> your new instruction(s) as well as any instruction(s) that have conflicting
> opcode bits.

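The rule quoted just above (a cpu's bitmask must never enable two entries whose fixed opcode bits can match the same instruction word) can be sketched as a small check. This is only a simplified model, not binutils code: the `Opcode` record loosely mirrors the name/opcode/mask/flags idea of a `powerpc_opcode` entry, and the `CPU_*` bits, the instruction names and the encodings in `TABLE` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical CPU feature bits (stand-ins, not binutils' real flag values).
CPU_BASE = 1 << 0      # instructions every cpu in this model implements
CPU_VENDOR_A = 1 << 1  # some other vendor's extension (made up)
CPU_LIBRESOC = 1 << 2  # what an option like -mlibresoc might select (assumption)

PRIMARY_MASK = 0x3F << 26  # primary (major) opcode: top 6 bits of the word
XO_MASK = 0x3FF << 1       # 10-bit extended opcode of an X-Form word


@dataclass(frozen=True)
class Opcode:
    name: str
    opcode: int  # values of the fixed bits
    mask: int    # which bits of the 32-bit word are fixed
    flags: int   # cpus that implement this instruction


def x_form(name, primary, xo, flags):
    """Build a table entry for an X-Form style encoding."""
    return Opcode(name, (primary << 26) | (xo << 1), PRIMARY_MASK | XO_MASK, flags)


def encodings_overlap(a, b):
    """True if at least one 32-bit word would match both entries."""
    common = a.mask & b.mask
    return (a.opcode & common) == (b.opcode & common)


def conflicts(table, cpu_mask):
    """Pairs of entries enabled by cpu_mask whose encodings collide."""
    enabled = [op for op in table if op.flags & cpu_mask]
    return [(a.name, b.name)
            for a, b in combinations(enabled, 2)
            if encodings_overlap(a, b)]


# Hypothetical table: two vendors picked the same slot under primary opcode 22.
TABLE = [
    x_form("base_op", 31, 266, CPU_BASE),
    x_form("vendor_op", 22, 5, CPU_VENDOR_A),
    x_form("draft_op", 22, 5, CPU_LIBRESOC),
]

print(conflicts(TABLE, CPU_BASE | CPU_LIBRESOC))                 # prints []
print(conflicts(TABLE, CPU_BASE | CPU_VENDOR_A | CPU_LIBRESOC))  # prints [('vendor_op', 'draft_op')]
```

With only one vendor's bit set the table is unambiguous, which is the case described in the quote. The second call models the scenario of concern in this thread: a single binary that needs both vendors' instructions has no cpu mask under which the shared encoding decodes uniquely.
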
not a problem, technically :)

i am a little nervous about the quantity of opcodes we're designing
(and have been trying for 18+ months to draw attention to that), but
this is par-for-the-course: IBM's agreement with Motorola 12 years ago
stopped entry into anything other than the Server Market, and there is
a lot of catch-up to be done.  normally this job would be covered by
*teams* of people, i mean, look at how it's done in RISC-V, they have
multiple ISA WGs with multiple vendors contributing in each area.

* [0] madded (variant of madd with the 128-bit result split across
  2 64-bit regs)
* [0] divmod2du (merging of div and mod results again into 2 64-bit regs)
* [1] transcendentals fsin fcos atan2 log1p expm1 etc. and DCT/FFT ops
* [2] 6 CRweird instructions (much more powerful CR-field-level
  operations)
* [3] bitmanip operations with large immediates and/or operands
  (3-in 1-out)

to give you some idea of the scope:

* we're completely out of Sandbox EXT022 (combined bitmanip, SVP64)
* EXT05 is taken up entirely with 3 very-large-immediate bitmanip ops
* crweirds need to use EXT019 (a swathe of 32 just like addpcis and
  crops)
* madded/divmod2du need to go into EXT04 (same area as madd*) [5]
* transcendentals into EXT059 (joining other FP scalar ops)

when proposed as RFCs we'll need to propose using up *both* EXT05 *and*
EXT09, the two areas that, as i vaguely recall, are marked in the spec
as "if you want to officially propose something that needs an entire
opcode, use these".

once they've gone through the [new] OPF ISA WG RFC process, many of the
bitmanip X-Form operations currently jam-packed into EXT022 could be
proposed for placement in EXT031 or EXT019, but there are still two
groups of large-immediate, large-number-of-operands operations [4] that
cannot be merged into one: they'd have to be in both EXT05 and EXT09.

> As Alan said, as long as you don't break other cpus, I think you're fine.

i think / plan into the future (a long way), and there's a scenario
that i am able to visualise which is making me nervous [RISC-V has
already hit it and, as predicted ~3 years ago, there's no way back].

i appreciate it's quite complex / specific, and i would like everyone
to understand it in full.  what do you feel would be the best way to
achieve that?

if the dlopen/dlsym plugin idea is a step too far, i think what we'll
do is add a "-mdraft" option as well.  "-mlibresoc -mdraft".

firstly, i absolutely do not want people to think that we're operating
"rogue", here, and secondly i want it to be very clear that we're
acting responsibly, and in no way want the Power ISA to have any
possibility of long-term irrecoverable damage.  we're expecting to use
it for 20+ years, so have to keep it clean.

l.

[0] https://libre-soc.org/openpower/isa/svfixedarith/
[1] https://libre-soc.org/openpower/isa/svfparith/
[2] https://libre-soc.org/openpower/sv/cr_int_predication/
[3] https://libre-soc.org/openpower/sv/bitmanip/
[4] https://libre-soc.org/openpower/isa/bitmanip/ - termlog* and grev*
[5] https://libre-soc.org/openpower/sv/biginteger/
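
On the EXT0nn shorthand used in the opcode list earlier in this message: it appears to name the 6-bit primary (major) opcode, which occupies the top six bits of every 32-bit Power ISA instruction word. That reading is inferred from the examples given (EXT019 for addpcis and the CR ops, EXT04 for madd*), not from a definition in this thread. A minimal sketch, with hypothetical helper names:

```python
def primary_opcode(insn: int) -> int:
    """Top 6 bits of a 32-bit instruction word (the primary opcode)."""
    return (insn >> 26) & 0x3F


def x_form_xo(insn: int) -> int:
    """10-bit extended opcode field of an X-Form word.

    Only meaningful for X-Form encodings; other forms use different
    field widths and positions.
    """
    return (insn >> 1) & 0x3FF


def ext_name(insn: int) -> str:
    """Label in the EXT0nn style used in this thread (assumed convention)."""
    return f"EXT0{primary_opcode(insn)}"


ADD_R3_R4_R5 = 0x7C642A14  # add r3,r4,r5
LI_R3_1 = 0x38600001       # addi r3,0,1 (li r3,1)

print(ext_name(ADD_R3_R4_R5), x_form_xo(ADD_R3_R4_R5))  # prints EXT031 266
print(ext_name(LI_R3_1))                                # prints EXT014
```

Since the field is only six bits wide there are just 64 primary opcodes in total, which may help explain why taking up whole areas such as EXT05 and EXT09 is treated as a significant request.
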