Date: Thu, 12 Jan 2023 01:28:45 +0000 (GMT)
From: "Maciej W. Rozycki"
To: Jan Beulich
cc: Andrew Waterman, Jim Wilson, nelson@rivosinc.com, Nick Clifton,
    binutils@sourceware.org, Palmer Dabbelt
Subject: Re: [PATCH] gas/RISC-V: adjust assembler for opcode table re-ordering
References: <4a67f41c-3473-3833-c0fc-ed4f69a062e9@suse.com>

On Wed, 11 Jan 2023, Jan Beulich wrote:

> > And it does appear to happen, because correct machine code is produced
> > regardless of your hack, except for the spurious symbol produced. So is
> > it not the case that simply the state (internal relocations recorded) is
> > not correctly reset on an unsuccessful operand match? Why does it have to
> > be special-cased just for the `a' operand type?
>
> The parsing of an 'a' type operand involves expression(), a side effect of
> which is to insert a symbol table entry for symbols not otherwise
> recognized (and note how my_getSmallExpression() addresses the same issue
> by filtering out GPR names first [1]). Yes, in a way this is an
> "insufficient undoing" issue, just that undoing of that symbol table
> insertion would be quite hard and/or fragile (from all I can tell). And
> this is where the dual meaning of symbol names comes into play: This looks
> to be intentional, and hence we can't make use of md_parse_name() to
> suppress the symbol table insertion in the first place for symbols which
> (in other contexts) identify registers.

Thank you for looking into it. Indeed it looks to me like a problem with
`expression' (or `expr' really) and the way the RISC-V assembly dialect
defines register references (unlike the MIPS one which uses a `$' prefix).

At a glance it seems to me that the correct approach would be to define a
"dry run" mode for `expr' and use it in the RISC-V backend to validate an
operand in the first invocation without causing any side effects, and then
only once all the operands have been processed and an opcode table entry
accepted `expr' would be called to finalise the expression.

I realise it's something you may not be willing to commit to, as it's
likely a larger task than a random tweak to the RISC-V backend, but I think
it's the way we ought to do it rather than piling up workarounds. FWIW,

  Maciej
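
For illustration only, the "dry run" idea above might be sketched as follows. This is a toy model, not gas code: the symbol table, the operand kinds, the two `jal' entries and every function name (parse_symbol_operand, try_entry, assemble and so on) are hypothetical stand-ins, and the real `expr' does far more than look up a bare name. The sketch only shows why validating first and committing later avoids the spurious symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy symbol table; it stores the caller's pointers, so
   the names must outlive the table (string literals do).  */
#define MAX_SYMS 16
static const char *sym_table[MAX_SYMS];
static size_t sym_count;

void reset_symbols (void) { sym_count = 0; }

bool sym_exists (const char *name)
{
  for (size_t i = 0; i < sym_count; i++)
    if (strcmp (sym_table[i], name) == 0)
      return true;
  return false;
}

/* Hypothetical stand-in for parsing a bare symbol as an expression.
   With DRY_RUN set it only validates and never touches sym_table.  */
static bool parse_symbol_operand (const char *name, bool dry_run)
{
  if (name == NULL || *name == '\0')
    return false;
  if (!dry_run && !sym_exists (name))
    {
      assert (sym_count < MAX_SYMS);
      sym_table[sym_count++] = name;   /* the side effect */
    }
  return true;
}

/* Hypothetical, deliberately tiny subset of GPR names.  */
static bool is_register (const char *name)
{
  static const char *const regs[] = { "x0", "x1", "ra", "sp", "a0", "a1" };
  for (size_t i = 0; i < sizeof regs / sizeof regs[0]; i++)
    if (strcmp (regs[i], name) == 0)
      return true;
  return false;
}

enum operand_kind { OP_REG, OP_SYM };

struct opcode_entry
{
  enum operand_kind kinds[2];
  size_t n_operands;
};

/* Symbol-only form listed first, as after a table re-ordering.  */
static const struct opcode_entry jal_entries[] =
{
  { { OP_SYM }, 1 },           /* jal target */
  { { OP_REG, OP_SYM }, 2 },   /* jal rd, target */
};

/* Match operands left to right; the count check comes last, so in a
   non-dry run the side effects may already have happened.  */
static bool try_entry (const struct opcode_entry *e,
                       const char *const *args, size_t nargs, bool dry_run)
{
  for (size_t i = 0; i < e->n_operands; i++)
    {
      if (i >= nargs)
        return false;
      if (e->kinds[i] == OP_REG)
        {
          if (!is_register (args[i]))
            return false;
        }
      else if (!parse_symbol_operand (args[i], dry_run))
        return false;
    }
  return nargs == e->n_operands;
}

/* Return the index of the accepted entry, or -1 if none matches.
   With USE_DRY_RUN, validate first and commit only once accepted.  */
int assemble (const char *const *args, size_t nargs, bool use_dry_run)
{
  for (size_t k = 0; k < sizeof jal_entries / sizeof jal_entries[0]; k++)
    {
      const struct opcode_entry *e = &jal_entries[k];
      if (use_dry_run)
        {
          if (!try_entry (e, args, nargs, true))
            continue;
          try_entry (e, args, nargs, false);   /* commit pass */
          return (int) k;
        }
      if (try_entry (e, args, nargs, false))
        return (int) k;
    }
  return -1;
}
```

In this model, assembling the operands "ra", "foo" without the dry run leaves a spurious `ra' symbol behind from the failed first entry, while the two-pass variant records only `foo'. The cost is that each accepted operand is parsed twice, which seems an acceptable trade for not having to undo symbol table insertions.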