From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x336.google.com (mail-ot1-x336.google.com [IPv6:2607:f8b0:4864:20::336]) by sourceware.org (Postfix) with ESMTPS id 7F5813858417 for ; Wed, 15 Mar 2023 10:08:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7F5813858417 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-ot1-x336.google.com with SMTP id f17-20020a9d7b51000000b00697349ab7e7so2432413oto.9 for ; Wed, 15 Mar 2023 03:08:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20210112.gappssmtp.com; s=20210112; t=1678874886; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Nenh1F0StRUO9XTB4TFNp4hDmyyDNhXLn4vEz59ChFQ=; b=FfbOXzRa1dfRETOblXJpKuH1bZBOniTQXCZKpaPPWSIG3ceaKSCQdlzMj6cHHbJBLb gUCTXJpS5CUgH7GIbCJ14bAm2X7l2DehWln4SPYhbQHjEnfdBOGr6bW4ly6NRsTA7L8K /SJ+OmPFrce/yu2Cr/CGcdyPvvPRm7JBndlCI3HkEmPXRD2h+NeZKv5iMZvaXxImHOwv LHe/zXzHcOlu34fWhZaIym55nsdY4s2fb/A3juz02BHXroLvTkxnM1FT15HFPmJhG8cJ luWFARKwtNw/MOxnBzUwWnK9chV15gezDJ7fT5MBxylm+cP5Si8AeNOjIsacgk+4UuWc nABw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678874886; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Nenh1F0StRUO9XTB4TFNp4hDmyyDNhXLn4vEz59ChFQ=; b=idJoFid0xzh6iACCiq12GuNfSztF4SUVvoScmcjGOPCoKQ5EptjuH1wIcW7G0RgYxU QsALROwKoqGmNnrNcJ1BBIqFLp6njvnWMAJ/xFux22KLZ54IJfB2PjjKNZAbxLHj0BUg FZdojinhc3bKk3rd7TvdKyMq6dQJUAtv6KFQKaOqmwLbmNCzhweexNx1L07ciXPksY0k MkU4JygYUGLzilXgXWsVj2cHGlaQ8tLVshDZqWkVgGSUKIwxzJTLdwUnAGWvyxHvZtCC m68prq0PrDpW9uAg7II9L8k5uCE1UmCdWtSaGYMYHFA8i+bx/Tp+usjnr6PYZ0MCJrnt iK8g== X-Gm-Message-State: AO0yUKWHw3V6hCr2T44dZ0Cg84V0nj0SAL4/wetE+TeB40SB758jrMcG 5apTx4WkYfwdpPqsrww8+AOJ11KXW/HJfDDUCpYl4Q== X-Google-Smtp-Source: AK7set+NaDZtZZDTuGwXd47vM9ydz5SMA6xwlY9wRDQf0jbAJwDP5DH/OCzNhBzZu8HotVr1o2yh740ApftcUaY+bbg= X-Received: by 2002:a9d:5f97:0:b0:688:d1a8:389e with SMTP id g23-20020a9d5f97000000b00688d1a8389emr14123186oti.1.1678874885749; Wed, 15 Mar 2023 03:08:05 -0700 (PDT) MIME-Version: 1.0 References: <2df3478a-3cd9-d7b5-670e-60683ae3066e@suse.com> <3bb126f6-5bf1-dab0-f540-25c386b9507e@suse.com> <574b6c36-7e00-ea69-0fdf-6c41d425b771@suse.com> In-Reply-To: <574b6c36-7e00-ea69-0fdf-6c41d425b771@suse.com> From: Nelson Chu Date: Wed, 15 Mar 2023 18:07:54 +0800 Message-ID: Subject: Re: [PATCH 2/2] RISC-V: adjust logic to avoid register name symbols To: Jan Beulich Cc: Binutils , Palmer Dabbelt , Andrew Waterman , Jim Wilson Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: My regression schedule was a bit full lately, so I just finished running for this patch now. Everything looks good when running the gcc/binutils regression with this patch in riscv-gnu-toolchain, the errors mentioned last time should be all resolved perfectly, so LGTM. Thanks for continuing to fix this issue. Nelson On Fri, Mar 3, 2023 at 9:53=E2=80=AFPM Jan Beulich wrot= e: > > On 16.02.2023 10:36, Nelson Chu wrote: > > Yeah the logs are too big, so not sure where can be uploaded these > > kind of huge files. However, we should get the same result if we just > > run the gcc upstream testsuites. Seems many errors but a quick > > review, most of them are, > > > > *Assembler messages: > > *Error: register value used as expression > > ... > > Below the tentative updated patch. I haven't got around to seeing > whether I can sensibly run gcc's testsuite yet, though. My past > experience with doing so for cross builds was quite poor (but that > has been quite a while back) ... > > Attached also the corresponding extended testcase (subject of patch 1 > of the series), which I expect is good to have in any event. > > Jan > > RISC-V: adjust logic to avoid register name symbols > > Special casing GPR names in my_getSmallExpression() leads to a number of > inconsistencies. Generalize this by utilizing the md_parse_name() hook, > limited to when instruction operands are being parsed (really: probed). > Then both the GPR lookup there and the yet more ad hoc workaround for > PR/gas 29940 can be removed (including its extension needed for making > the compressed form JAL work again). > --- > When a floating point extension (but not [hfdq]inx) is enabled, floating > point registers would perhaps better be recognized by riscv_parse_name() > as well. Same for vector registers and likely also CSRs. Otherwise > inconsistent behavior may (continue to) result when equates come into > play, afaict. > > Considering equates, the new behavior isn't the only sensible one: We > could also omit the symbol_find() call, thus never honoring equates. > Then, however, even with the "i" insn suffix (or sometimes infix) > explicitly specified, equates with names matching a GPR name (or, see > above, potentially also other register names) wouldn't be accepted > anymore. I guess the primary question here is what the intended (and > consistent) behavior is to be. > > Overall the uses of my_getSmallExpression() vs my_getExpression() look > pretty random to me: Is there any underlying principle when which of the > two is expected to be used? > --- > v2: Re-work to not break things like %hi(). > > --- a/gas/config/tc-riscv.c > +++ b/gas/config/tc-riscv.c > @@ -171,6 +171,8 @@ static enum float_abi float_abi =3D FLOAT_ > > static unsigned elf_flags =3D 0; > > +static bool probing_insn_operands; > + > /* Set the default_isa_spec. Return 0 if the spec isn't supported. > Otherwise, return 1. */ > > @@ -2228,21 +2230,10 @@ my_getSmallExpression (expressionS *ep, > char *str, const struct percent_op_match *percent_= op) > { > size_t reloc_index; > - unsigned crux_depth, str_depth, regno; > + unsigned crux_depth, str_depth; > + bool orig_probing =3D probing_insn_operands; > char *crux; > > - /* First, check for integer registers. No callers can accept a reg, b= ut > - we need to avoid accidentally creating a useless undefined symbol b= elow, > - if this is an instruction pattern that can't match. A glibc build = fails > - if this is removed. */ > - if (reg_lookup (&str, RCLASS_GPR, ®no)) > - { > - ep->X_op =3D O_register; > - ep->X_add_number =3D regno; > - expr_parse_end =3D str; > - return 0; > - } > - > /* Search for the start of the main expression. > > End the loop with CRUX pointing to the start of the main expression= and > @@ -2265,9 +2256,16 @@ my_getSmallExpression (expressionS *ep, > && reloc_index < 1 > && parse_relocation (&str, reloc, percent_op)); > > + /* Anything inside parentheses cannot be a register and hence can be > + treated the same as operands to directives (other than .insn). */ > + if (str_depth) > + probing_insn_operands =3D false; > + > my_getExpression (ep, crux); > str =3D expr_parse_end; > > + probing_insn_operands =3D orig_probing; > + > /* Match every open bracket. */ > while (crux_depth > 0 && (*str =3D=3D ')' || *str =3D=3D ' ' || *str = =3D=3D '\t')) > if (*str++ =3D=3D ')') > @@ -2453,6 +2451,13 @@ riscv_is_priv_insn (insn_t insn) > || ((insn ^ MATCH_SFENCE_VM) & MASK_SFENCE_VM) =3D=3D 0); > } > > +static symbolS *deferred_sym_rootP; > +static symbolS *deferred_sym_lastP; > +/* Since symbols can't easily be freed, try to recycle ones which weren'= t > + committed. */ > +static symbolS *orphan_sym_rootP; > +static symbolS *orphan_sym_lastP; > + > /* This routine assembles an instruction into its binary format. As a > side effect, it sets the global variable imm_reloc to the type of > relocation to do if one of the operands is an address expression. */ > @@ -2488,6 +2493,8 @@ riscv_ip (char *str, struct riscv_cl_ins > > insn =3D (struct riscv_opcode *) str_hash_find (hash, str); > > + probing_insn_operands =3D true; > + > asargStart =3D asarg; > for ( ; insn && insn->name && strcmp (insn->name, str) =3D=3D 0; insn+= +) > { > @@ -2504,6 +2511,17 @@ riscv_ip (char *str, struct riscv_cl_ins > /* Reset error message of the previous round. */ > error.msg =3D _("illegal operands"); > error.missing_ext =3D NULL; > + > + /* Purge deferred symbols from the previous round, if any. */ > + while (deferred_sym_rootP) > + { > + symbolS *sym =3D deferred_sym_rootP; > + > + symbol_remove (sym, &deferred_sym_rootP, &deferred_sym_lastP); > + symbol_append (sym, orphan_sym_lastP, &orphan_sym_rootP, > + &orphan_sym_lastP); > + } > + > create_insn (ip, insn); > > imm_expr->X_op =3D O_absent; > @@ -2558,9 +2576,22 @@ riscv_ip (char *str, struct riscv_cl_ins > } > if (*asarg !=3D '\0') > break; > + > /* Successful assembly. */ > error.msg =3D NULL; > insn_with_csr =3D false; > + > + /* Commit deferred symbols, if any. */ > + while (deferred_sym_rootP) > + { > + symbolS *sym =3D deferred_sym_rootP; > + > + symbol_remove (sym, &deferred_sym_rootP, > + &deferred_sym_lastP); > + symbol_append (sym, symbol_lastP, &symbol_rootP, > + &symbol_lastP); > + symbol_table_insert (sym); > + } > goto out; > > case 'C': /* RVC */ > @@ -2764,8 +2795,6 @@ riscv_ip (char *str, struct riscv_cl_ins > case 'p': > goto branch; > case 'a': > - if (oparg =3D=3D insn->args + 1) > - goto jump_check_gpr; > goto jump; > case 'S': /* Floating-point RS1 x8-x15. */ > if (!reg_lookup (&asarg, RCLASS_FPR, ®no) > @@ -3270,18 +3299,6 @@ riscv_ip (char *str, struct riscv_cl_ins > continue; > > case 'a': /* 20-bit PC-relative offset. */ > - /* Like in my_getSmallExpression() we need to avoid emittin= g > - a stray undefined symbol if the 1st JAL entry doesn't ma= tch, > - but the 2nd (with 2 operands) might. */ > - if (oparg =3D=3D insn->args) > - { > - jump_check_gpr: > - asargStart =3D asarg; > - if (reg_lookup (&asarg, RCLASS_GPR, NULL) > - && (*asarg =3D=3D ',' || (ISSPACE (*asarg) && asarg= [1] =3D=3D ','))) > - break; > - asarg =3D asargStart; > - } > jump: > my_getExpression (imm_expr, asarg); > asarg =3D expr_parse_end; > @@ -3504,6 +3521,8 @@ riscv_ip (char *str, struct riscv_cl_ins > if (save_c) > *(asargStart - 1) =3D save_c; > > + probing_insn_operands =3D false; > + > return error; > } > > @@ -3800,6 +3819,53 @@ riscv_after_parse_args (void) > flag_dwarf_cie_version =3D 3; > } > > +bool riscv_parse_name (const char *name, struct expressionS *ep, > + enum expr_mode mode) > +{ > + unsigned int regno; > + symbolS *sym; > + > + if (!probing_insn_operands) > + return false; > + > + gas_assert (mode =3D=3D expr_normal); > + > + regno =3D reg_lookup_internal (name, RCLASS_GPR); > + if (regno =3D=3D (unsigned int)-1) > + return false; > + > + if (symbol_find (name) !=3D NULL) > + return false; > + > + /* Create a symbol without adding it to the symbol table yet. > + Insertion will happen only once we commit to using the insn > + we're probing operands for. */ > + for (sym =3D deferred_sym_rootP; sym; sym =3D symbol_next (sym)) > + if (strcmp (name, S_GET_NAME (sym)) =3D=3D 0) > + break; > + if (!sym) > + { > + for (sym =3D orphan_sym_rootP; sym; sym =3D symbol_next (sym)) > + if (strcmp (name, S_GET_NAME (sym)) =3D=3D 0) > + { > + symbol_remove (sym, &orphan_sym_rootP, &orphan_sym_lastP); > + break; > + } > + if (!sym) > + sym =3D symbol_create (name, undefined_section, > + &zero_address_frag, 0); > + > + symbol_append (sym, deferred_sym_lastP, &deferred_sym_rootP, > + &deferred_sym_lastP); > + } > + > + ep->X_op =3D O_symbol; > + ep->X_add_symbol =3D sym; > + ep->X_add_number =3D 0; > + > + return true; > +} > + > long > md_pcrel_from (fixS *fixP) > { > --- a/gas/config/tc-riscv.h > +++ b/gas/config/tc-riscv.h > @@ -123,6 +123,10 @@ extern void riscv_elf_final_processing ( > /* Adjust debug_line after relaxation. */ > #define DWARF2_USE_FIXED_ADVANCE_PC 1 > > +#define md_parse_name(name, exp, mode, c) \ > + riscv_parse_name (name, exp, mode) > +bool riscv_parse_name (const char *, struct expressionS *, enum expr_mod= e); > + > #define md_finish riscv_md_finish > #define CONVERT_SYMBOLIC_ATTRIBUTE riscv_convert_symbolic_attribute > >