From: Nelson Chu <nelson@rivosinc.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Binutils <binutils@sourceware.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Andrew Waterman <andrew@sifive.com>,
Jim Wilson <jim.wilson.gcc@gmail.com>
Subject: Re: [PATCH 2/2] RISC-V: adjust logic to avoid register name symbols
Date: Wed, 15 Mar 2023 18:07:54 +0800 [thread overview]
Message-ID: <CAPpQWtA3xQtOZS6n9dMAWMhCzuFh_2gZ5q8gLkZoh31dxD5qvw@mail.gmail.com> (raw)
In-Reply-To: <574b6c36-7e00-ea69-0fdf-6c41d425b771@suse.com>
My regression schedule was a bit full lately, so I just finished
running for this patch now. Everything looks good when running the
gcc/binutils regression with this patch in riscv-gnu-toolchain, the
errors mentioned last time should be all resolved perfectly, so LGTM.
Thanks for continuing to fix this issue.
Nelson
On Fri, Mar 3, 2023 at 9:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.02.2023 10:36, Nelson Chu wrote:
> > Yeah the logs are too big, so not sure where can be uploaded these
> > kind of huge files. However, we should get the same result if we just
> > run the gcc upstream testsuites. Seems many errors but a quick
> > review, most of them are,
> >
> > *Assembler messages:
> > *Error: register value used as expression
> > ...
>
> Below the tentative updated patch. I haven't got around to seeing
> whether I can sensibly run gcc's testsuite yet, though. My past
> experience with doing so for cross builds was quite poor (but that
> has been quite a while back) ...
>
> Attached also the corresponding extended testcase (subject of patch 1
> of the series), which I expect is good to have in any event.
>
> Jan
>
> RISC-V: adjust logic to avoid register name symbols
>
> Special casing GPR names in my_getSmallExpression() leads to a number of
> inconsistencies. Generalize this by utilizing the md_parse_name() hook,
> limited to when instruction operands are being parsed (really: probed).
> Then both the GPR lookup there and the yet more ad hoc workaround for
> PR/gas 29940 can be removed (including its extension needed for making
> the compressed form JAL work again).
> ---
> When a floating point extension (but not [hfdq]inx) is enabled, floating
> point registers would perhaps better be recognized by riscv_parse_name()
> as well. Same for vector registers and likely also CSRs. Otherwise
> inconsistent behavior may (continue to) result when equates come into
> play, afaict.
>
> Considering equates, the new behavior isn't the only sensible one: We
> could also omit the symbol_find() call, thus never honoring equates.
> Then, however, even with the "i" insn suffix (or sometimes infix)
> explicitly specified, equates with names matching a GPR name (or, see
> above, potentially also other register names) wouldn't be accepted
> anymore. I guess the primary question here is what the intended (and
> consistent) behavior is to be.
>
> Overall the uses of my_getSmallExpression() vs my_getExpression() look
> pretty random to me: Is there any underlying principle when which of the
> two is expected to be used?
> ---
> v2: Re-work to not break things like %hi(<register-name>).
>
> --- a/gas/config/tc-riscv.c
> +++ b/gas/config/tc-riscv.c
> @@ -171,6 +171,8 @@ static enum float_abi float_abi = FLOAT_
>
> static unsigned elf_flags = 0;
>
> +static bool probing_insn_operands;
> +
> /* Set the default_isa_spec. Return 0 if the spec isn't supported.
> Otherwise, return 1. */
>
> @@ -2228,21 +2230,10 @@ my_getSmallExpression (expressionS *ep,
> char *str, const struct percent_op_match *percent_op)
> {
> size_t reloc_index;
> - unsigned crux_depth, str_depth, regno;
> + unsigned crux_depth, str_depth;
> + bool orig_probing = probing_insn_operands;
> char *crux;
>
> - /* First, check for integer registers. No callers can accept a reg, but
> - we need to avoid accidentally creating a useless undefined symbol below,
> - if this is an instruction pattern that can't match. A glibc build fails
> - if this is removed. */
> - if (reg_lookup (&str, RCLASS_GPR, ®no))
> - {
> - ep->X_op = O_register;
> - ep->X_add_number = regno;
> - expr_parse_end = str;
> - return 0;
> - }
> -
> /* Search for the start of the main expression.
>
> End the loop with CRUX pointing to the start of the main expression and
> @@ -2265,9 +2256,16 @@ my_getSmallExpression (expressionS *ep,
> && reloc_index < 1
> && parse_relocation (&str, reloc, percent_op));
>
> + /* Anything inside parentheses cannot be a register and hence can be
> + treated the same as operands to directives (other than .insn). */
> + if (str_depth)
> + probing_insn_operands = false;
> +
> my_getExpression (ep, crux);
> str = expr_parse_end;
>
> + probing_insn_operands = orig_probing;
> +
> /* Match every open bracket. */
> while (crux_depth > 0 && (*str == ')' || *str == ' ' || *str == '\t'))
> if (*str++ == ')')
> @@ -2453,6 +2451,13 @@ riscv_is_priv_insn (insn_t insn)
> || ((insn ^ MATCH_SFENCE_VM) & MASK_SFENCE_VM) == 0);
> }
>
> +static symbolS *deferred_sym_rootP;
> +static symbolS *deferred_sym_lastP;
> +/* Since symbols can't easily be freed, try to recycle ones which weren't
> + committed. */
> +static symbolS *orphan_sym_rootP;
> +static symbolS *orphan_sym_lastP;
> +
> /* This routine assembles an instruction into its binary format. As a
> side effect, it sets the global variable imm_reloc to the type of
> relocation to do if one of the operands is an address expression. */
> @@ -2488,6 +2493,8 @@ riscv_ip (char *str, struct riscv_cl_ins
>
> insn = (struct riscv_opcode *) str_hash_find (hash, str);
>
> + probing_insn_operands = true;
> +
> asargStart = asarg;
> for ( ; insn && insn->name && strcmp (insn->name, str) == 0; insn++)
> {
> @@ -2504,6 +2511,17 @@ riscv_ip (char *str, struct riscv_cl_ins
> /* Reset error message of the previous round. */
> error.msg = _("illegal operands");
> error.missing_ext = NULL;
> +
> + /* Purge deferred symbols from the previous round, if any. */
> + while (deferred_sym_rootP)
> + {
> + symbolS *sym = deferred_sym_rootP;
> +
> + symbol_remove (sym, &deferred_sym_rootP, &deferred_sym_lastP);
> + symbol_append (sym, orphan_sym_lastP, &orphan_sym_rootP,
> + &orphan_sym_lastP);
> + }
> +
> create_insn (ip, insn);
>
> imm_expr->X_op = O_absent;
> @@ -2558,9 +2576,22 @@ riscv_ip (char *str, struct riscv_cl_ins
> }
> if (*asarg != '\0')
> break;
> +
> /* Successful assembly. */
> error.msg = NULL;
> insn_with_csr = false;
> +
> + /* Commit deferred symbols, if any. */
> + while (deferred_sym_rootP)
> + {
> + symbolS *sym = deferred_sym_rootP;
> +
> + symbol_remove (sym, &deferred_sym_rootP,
> + &deferred_sym_lastP);
> + symbol_append (sym, symbol_lastP, &symbol_rootP,
> + &symbol_lastP);
> + symbol_table_insert (sym);
> + }
> goto out;
>
> case 'C': /* RVC */
> @@ -2764,8 +2795,6 @@ riscv_ip (char *str, struct riscv_cl_ins
> case 'p':
> goto branch;
> case 'a':
> - if (oparg == insn->args + 1)
> - goto jump_check_gpr;
> goto jump;
> case 'S': /* Floating-point RS1 x8-x15. */
> if (!reg_lookup (&asarg, RCLASS_FPR, ®no)
> @@ -3270,18 +3299,6 @@ riscv_ip (char *str, struct riscv_cl_ins
> continue;
>
> case 'a': /* 20-bit PC-relative offset. */
> - /* Like in my_getSmallExpression() we need to avoid emitting
> - a stray undefined symbol if the 1st JAL entry doesn't match,
> - but the 2nd (with 2 operands) might. */
> - if (oparg == insn->args)
> - {
> - jump_check_gpr:
> - asargStart = asarg;
> - if (reg_lookup (&asarg, RCLASS_GPR, NULL)
> - && (*asarg == ',' || (ISSPACE (*asarg) && asarg[1] == ',')))
> - break;
> - asarg = asargStart;
> - }
> jump:
> my_getExpression (imm_expr, asarg);
> asarg = expr_parse_end;
> @@ -3504,6 +3521,8 @@ riscv_ip (char *str, struct riscv_cl_ins
> if (save_c)
> *(asargStart - 1) = save_c;
>
> + probing_insn_operands = false;
> +
> return error;
> }
>
> @@ -3800,6 +3819,53 @@ riscv_after_parse_args (void)
> flag_dwarf_cie_version = 3;
> }
>
> +bool riscv_parse_name (const char *name, struct expressionS *ep,
> + enum expr_mode mode)
> +{
> + unsigned int regno;
> + symbolS *sym;
> +
> + if (!probing_insn_operands)
> + return false;
> +
> + gas_assert (mode == expr_normal);
> +
> + regno = reg_lookup_internal (name, RCLASS_GPR);
> + if (regno == (unsigned int)-1)
> + return false;
> +
> + if (symbol_find (name) != NULL)
> + return false;
> +
> + /* Create a symbol without adding it to the symbol table yet.
> + Insertion will happen only once we commit to using the insn
> + we're probing operands for. */
> + for (sym = deferred_sym_rootP; sym; sym = symbol_next (sym))
> + if (strcmp (name, S_GET_NAME (sym)) == 0)
> + break;
> + if (!sym)
> + {
> + for (sym = orphan_sym_rootP; sym; sym = symbol_next (sym))
> + if (strcmp (name, S_GET_NAME (sym)) == 0)
> + {
> + symbol_remove (sym, &orphan_sym_rootP, &orphan_sym_lastP);
> + break;
> + }
> + if (!sym)
> + sym = symbol_create (name, undefined_section,
> + &zero_address_frag, 0);
> +
> + symbol_append (sym, deferred_sym_lastP, &deferred_sym_rootP,
> + &deferred_sym_lastP);
> + }
> +
> + ep->X_op = O_symbol;
> + ep->X_add_symbol = sym;
> + ep->X_add_number = 0;
> +
> + return true;
> +}
> +
> long
> md_pcrel_from (fixS *fixP)
> {
> --- a/gas/config/tc-riscv.h
> +++ b/gas/config/tc-riscv.h
> @@ -123,6 +123,10 @@ extern void riscv_elf_final_processing (
> /* Adjust debug_line after relaxation. */
> #define DWARF2_USE_FIXED_ADVANCE_PC 1
>
> +#define md_parse_name(name, exp, mode, c) \
> + riscv_parse_name (name, exp, mode)
> +bool riscv_parse_name (const char *, struct expressionS *, enum expr_mode);
> +
> #define md_finish riscv_md_finish
> #define CONVERT_SYMBOLIC_ATTRIBUTE riscv_convert_symbolic_attribute
>
>
next prev parent reply other threads:[~2023-03-15 10:08 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-13 8:01 [PATCH 0/2] RISC-V/gas: re-work register named symbols avoidance logic Jan Beulich
2023-02-13 8:02 ` [PATCH 1/2] RISC-V: test for expected / no unexpected symbols Jan Beulich
2023-02-13 8:02 ` [PATCH 2/2] RISC-V: adjust logic to avoid register name symbols Jan Beulich
2023-02-15 3:09 ` Nelson Chu
2023-02-15 8:03 ` Jan Beulich
2023-02-15 8:48 ` Nelson Chu
2023-02-15 9:21 ` Jan Beulich
2023-02-16 9:36 ` Nelson Chu
2023-02-16 16:49 ` Jan Beulich
2023-02-16 16:54 ` Palmer Dabbelt
2023-03-03 13:52 ` Jan Beulich
2023-03-15 10:07 ` Nelson Chu [this message]
2023-03-15 11:49 ` Jan Beulich
2023-03-16 0:30 ` Nelson Chu
2023-04-19 9:37 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAPpQWtA3xQtOZS6n9dMAWMhCzuFh_2gZ5q8gLkZoh31dxD5qvw@mail.gmail.com \
--to=nelson@rivosinc.com \
--cc=andrew@sifive.com \
--cc=binutils@sourceware.org \
--cc=jbeulich@suse.com \
--cc=jim.wilson.gcc@gmail.com \
--cc=palmer@dabbelt.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).