From: Andrew Waterman
Date: Thu, 12 Jan 2023 00:40:28 -0800
Subject: Re: [PATCH] gas/RISC-V: adjust assembler for opcode table re-ordering
To: Jan Beulich
Cc: "Maciej W. Rozycki", Jim Wilson, nelson@rivosinc.com, Nick Clifton, binutils@sourceware.org, Palmer Dabbelt
References: <4a67f41c-3473-3833-c0fc-ed4f69a062e9@suse.com>

On Thu, Jan 12, 2023 at 12:26 AM Jan Beulich wrote:
>
> On 12.01.2023 02:28, Maciej W. Rozycki wrote:
> > On Wed, 11 Jan 2023, Jan Beulich wrote:
> >
> >>> And it does appear to happen, because correct machine code is produced
> >>> regardless of your hack, except for the spurious symbol produced.
> >>> So is it not the case that simply the state (internal relocations
> >>> recorded) is not correctly reset on an unsuccessful operand match?
> >>> Why does it have to be special-cased just for the `a' operand type?
> >>
> >> The parsing of an 'a' type operand involves expression(), a side effect of
> >> which is to insert a symbol table entry for symbols not otherwise
> >> recognized (and note how my_getSmallExpression() addresses the same issue
> >> by filtering out GPR names first [1]). Yes, in a way this is an
> >> "insufficient undoing" issue, just that undoing of that symbol table
> >> insertion would be quite hard and/or fragile (from all I can tell). And
> >> this is where the dual meaning of symbol names comes into play: This looks
> >> to be intentional, and hence we can't make use of md_parse_name() to
> >> suppress the symbol table insertion in the first place for symbols which
> >> (in other contexts) identify registers.
> >
> > Thank you for looking into it. Indeed it looks to me like a problem with
> > `expression' (or `expr' really) and the way the RISC-V assembly dialect
> > defines register references (unlike the MIPS one which uses a `$' prefix).
> >
> > At a glance it seems to me that the correct approach would be to define a
> > "dry run" mode for `expr' and use it in the RISC-V backend to validate an
> > operand in the first invocation without causing any side effects, and then
> > only once all the operands have been processed and an opcode table entry
> > accepted `expr' would be called to finalise the expression.
> >
> > I realise it's something you may not be willing to commit to, as it's
> > likely a larger task than a random tweak to the RISC-V backend, but I
> > think it's the way we ought to do it rather than piling up workarounds.
>
> I might actually try to do something along those lines, but only once it was
> clarified (by the arch maintainers) that the present behavior of identifiers
> meaning different things depending on context is actually intentional.

I haven't been following this discussion until now, but if I understand the
question correctly, then yes, it is intentional. Were we to travel back in
time, we would have defined a different assembly syntax that sidestepped this
complexity. But it is now part of an API that is in widespread use, so we are
stuck with it.

> There not being a prefix to indicate registers isn't unprecedented, after
> all - at least x86 (Intel syntax, or more generally "noprefix" mode), ia64,
> and Arm permit the same. The former two take the identifier as a register
> regardless of which insn this is an operand of (creating another problem
> when you really mean a symbol of that name, with varying approaches to
> dealing with). Arm instead makes sure that different mnemonics are used
> (b vs bx for Arm32, b vs br for Arm64) and hence ambiguities cannot arise.
>
> Jan