public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Nelson Chu <nelson@rivosinc.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Binutils <binutils@sourceware.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	 Andrew Waterman <andrew@sifive.com>,
	Jim Wilson <jim.wilson.gcc@gmail.com>
Subject: Re: [PATCH 2/2] RISC-V: adjust logic to avoid register name symbols
Date: Wed, 15 Mar 2023 18:07:54 +0800	[thread overview]
Message-ID: <CAPpQWtA3xQtOZS6n9dMAWMhCzuFh_2gZ5q8gLkZoh31dxD5qvw@mail.gmail.com> (raw)
In-Reply-To: <574b6c36-7e00-ea69-0fdf-6c41d425b771@suse.com>

My regression schedule was a bit full lately, so I just finished
running for this patch now.  Everything looks good when running the
gcc/binutils regression with this patch in riscv-gnu-toolchain, the
errors mentioned last time should be all resolved perfectly, so LGTM.
Thanks for continuing to fix this issue.

Nelson

On Fri, Mar 3, 2023 at 9:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.02.2023 10:36, Nelson Chu wrote:
> > Yeah the logs are too big, so not sure where can be uploaded these
> > kind of huge files.  However, we should get the same result if we just
> > run the gcc upstream testsuites.  Seems many errors but a quick
> > review, most of them are,
> >
> > *Assembler messages:
> > *Error: register value used as expression
> > ...
>
> Below the tentative updated patch. I haven't got around to seeing
> whether I can sensibly run gcc's testsuite yet, though. My past
> experience with doing so for cross builds was quite poor (but that
> has been quite a while back) ...
>
> Attached also the corresponding extended testcase (subject of patch 1
> of the series), which I expect is good to have in any event.
>
> Jan
>
> RISC-V: adjust logic to avoid register name symbols
>
> Special casing GPR names in my_getSmallExpression() leads to a number of
> inconsistencies. Generalize this by utilizing the md_parse_name() hook,
> limited to when instruction operands are being parsed (really: probed).
> Then both the GPR lookup there and the yet more ad hoc workaround for
> PR/gas 29940 can be removed (including its extension needed for making
> the compressed form JAL work again).
> ---
> When a floating point extension (but not [hfdq]inx) is enabled, floating
> point registers would perhaps better be recognized by riscv_parse_name()
> as well. Same for vector registers and likely also CSRs. Otherwise
> inconsistent behavior may (continue to) result when equates come into
> play, afaict.
>
> Considering equates, the new behavior isn't the only sensible one: We
> could also omit the symbol_find() call, thus never honoring equates.
> Then, however, even with the "i" insn suffix (or sometimes infix)
> explicitly specified, equates with names matching a GPR name (or, see
> above, potentially also other register names) wouldn't be accepted
> anymore. I guess the primary question here is what the intended (and
> consistent) behavior is to be.
>
> Overall the uses of my_getSmallExpression() vs my_getExpression() look
> pretty random to me: Is there any underlying principle when which of the
> two is expected to be used?
> ---
> v2: Re-work to not break things like %hi(<register-name>).
>
> --- a/gas/config/tc-riscv.c
> +++ b/gas/config/tc-riscv.c
> @@ -171,6 +171,8 @@ static enum float_abi float_abi = FLOAT_
>
>  static unsigned elf_flags = 0;
>
> +static bool probing_insn_operands;
> +
>  /* Set the default_isa_spec.  Return 0 if the spec isn't supported.
>     Otherwise, return 1.  */
>
> @@ -2228,21 +2230,10 @@ my_getSmallExpression (expressionS *ep,
>                        char *str, const struct percent_op_match *percent_op)
>  {
>    size_t reloc_index;
> -  unsigned crux_depth, str_depth, regno;
> +  unsigned crux_depth, str_depth;
> +  bool orig_probing = probing_insn_operands;
>    char *crux;
>
> -  /* First, check for integer registers.  No callers can accept a reg, but
> -     we need to avoid accidentally creating a useless undefined symbol below,
> -     if this is an instruction pattern that can't match.  A glibc build fails
> -     if this is removed.  */
> -  if (reg_lookup (&str, RCLASS_GPR, &regno))
> -    {
> -      ep->X_op = O_register;
> -      ep->X_add_number = regno;
> -      expr_parse_end = str;
> -      return 0;
> -    }
> -
>    /* Search for the start of the main expression.
>
>       End the loop with CRUX pointing to the start of the main expression and
> @@ -2265,9 +2256,16 @@ my_getSmallExpression (expressionS *ep,
>          && reloc_index < 1
>          && parse_relocation (&str, reloc, percent_op));
>
> +  /* Anything inside parentheses cannot be a register and hence can be
> +     treated the same as operands to directives (other than .insn).  */
> +  if (str_depth)
> +    probing_insn_operands = false;
> +
>    my_getExpression (ep, crux);
>    str = expr_parse_end;
>
> +  probing_insn_operands = orig_probing;
> +
>    /* Match every open bracket.  */
>    while (crux_depth > 0 && (*str == ')' || *str == ' ' || *str == '\t'))
>      if (*str++ == ')')
> @@ -2453,6 +2451,13 @@ riscv_is_priv_insn (insn_t insn)
>           || ((insn ^ MATCH_SFENCE_VM) & MASK_SFENCE_VM) == 0);
>  }
>
> +static symbolS *deferred_sym_rootP;
> +static symbolS *deferred_sym_lastP;
> +/* Since symbols can't easily be freed, try to recycle ones which weren't
> +   committed.  */
> +static symbolS *orphan_sym_rootP;
> +static symbolS *orphan_sym_lastP;
> +
>  /* This routine assembles an instruction into its binary format.  As a
>     side effect, it sets the global variable imm_reloc to the type of
>     relocation to do if one of the operands is an address expression.  */
> @@ -2488,6 +2493,8 @@ riscv_ip (char *str, struct riscv_cl_ins
>
>    insn = (struct riscv_opcode *) str_hash_find (hash, str);
>
> +  probing_insn_operands = true;
> +
>    asargStart = asarg;
>    for ( ; insn && insn->name && strcmp (insn->name, str) == 0; insn++)
>      {
> @@ -2504,6 +2511,17 @@ riscv_ip (char *str, struct riscv_cl_ins
>        /* Reset error message of the previous round.  */
>        error.msg = _("illegal operands");
>        error.missing_ext = NULL;
> +
> +      /* Purge deferred symbols from the previous round, if any.  */
> +      while (deferred_sym_rootP)
> +       {
> +         symbolS *sym = deferred_sym_rootP;
> +
> +         symbol_remove (sym, &deferred_sym_rootP, &deferred_sym_lastP);
> +         symbol_append (sym, orphan_sym_lastP, &orphan_sym_rootP,
> +                        &orphan_sym_lastP);
> +       }
> +
>        create_insn (ip, insn);
>
>        imm_expr->X_op = O_absent;
> @@ -2558,9 +2576,22 @@ riscv_ip (char *str, struct riscv_cl_ins
>                 }
>               if (*asarg != '\0')
>                 break;
> +
>               /* Successful assembly.  */
>               error.msg = NULL;
>               insn_with_csr = false;
> +
> +             /* Commit deferred symbols, if any.  */
> +             while (deferred_sym_rootP)
> +               {
> +                 symbolS *sym = deferred_sym_rootP;
> +
> +                 symbol_remove (sym, &deferred_sym_rootP,
> +                                &deferred_sym_lastP);
> +                 symbol_append (sym, symbol_lastP, &symbol_rootP,
> +                                &symbol_lastP);
> +                 symbol_table_insert (sym);
> +               }
>               goto out;
>
>             case 'C': /* RVC */
> @@ -2764,8 +2795,6 @@ riscv_ip (char *str, struct riscv_cl_ins
>                 case 'p':
>                   goto branch;
>                 case 'a':
> -                 if (oparg == insn->args + 1)
> -                   goto jump_check_gpr;
>                   goto jump;
>                 case 'S': /* Floating-point RS1 x8-x15.  */
>                   if (!reg_lookup (&asarg, RCLASS_FPR, &regno)
> @@ -3270,18 +3299,6 @@ riscv_ip (char *str, struct riscv_cl_ins
>               continue;
>
>             case 'a': /* 20-bit PC-relative offset.  */
> -             /* Like in my_getSmallExpression() we need to avoid emitting
> -                a stray undefined symbol if the 1st JAL entry doesn't match,
> -                but the 2nd (with 2 operands) might.  */
> -             if (oparg == insn->args)
> -               {
> -           jump_check_gpr:
> -                 asargStart = asarg;
> -                 if (reg_lookup (&asarg, RCLASS_GPR, NULL)
> -                     && (*asarg == ',' || (ISSPACE (*asarg) && asarg[1] == ',')))
> -                   break;
> -                 asarg = asargStart;
> -               }
>             jump:
>               my_getExpression (imm_expr, asarg);
>               asarg = expr_parse_end;
> @@ -3504,6 +3521,8 @@ riscv_ip (char *str, struct riscv_cl_ins
>    if (save_c)
>      *(asargStart  - 1) = save_c;
>
> +  probing_insn_operands = false;
> +
>    return error;
>  }
>
> @@ -3800,6 +3819,53 @@ riscv_after_parse_args (void)
>      flag_dwarf_cie_version = 3;
>  }
>
> +bool riscv_parse_name (const char *name, struct expressionS *ep,
> +                      enum expr_mode mode)
> +{
> +  unsigned int regno;
> +  symbolS *sym;
> +
> +  if (!probing_insn_operands)
> +    return false;
> +
> +  gas_assert (mode == expr_normal);
> +
> +  regno = reg_lookup_internal (name, RCLASS_GPR);
> +  if (regno == (unsigned int)-1)
> +    return false;
> +
> +  if (symbol_find (name) != NULL)
> +    return false;
> +
> +  /* Create a symbol without adding it to the symbol table yet.
> +     Insertion will happen only once we commit to using the insn
> +     we're probing operands for.  */
> +  for (sym = deferred_sym_rootP; sym; sym = symbol_next (sym))
> +    if (strcmp (name, S_GET_NAME (sym)) == 0)
> +      break;
> +  if (!sym)
> +    {
> +      for (sym = orphan_sym_rootP; sym; sym = symbol_next (sym))
> +       if (strcmp (name, S_GET_NAME (sym)) == 0)
> +         {
> +           symbol_remove (sym, &orphan_sym_rootP, &orphan_sym_lastP);
> +           break;
> +         }
> +      if (!sym)
> +       sym = symbol_create (name, undefined_section,
> +                            &zero_address_frag, 0);
> +
> +      symbol_append (sym, deferred_sym_lastP, &deferred_sym_rootP,
> +                    &deferred_sym_lastP);
> +    }
> +
> +  ep->X_op = O_symbol;
> +  ep->X_add_symbol = sym;
> +  ep->X_add_number = 0;
> +
> +  return true;
> +}
> +
>  long
>  md_pcrel_from (fixS *fixP)
>  {
> --- a/gas/config/tc-riscv.h
> +++ b/gas/config/tc-riscv.h
> @@ -123,6 +123,10 @@ extern void riscv_elf_final_processing (
>  /* Adjust debug_line after relaxation.  */
>  #define DWARF2_USE_FIXED_ADVANCE_PC 1
>
> +#define md_parse_name(name, exp, mode, c) \
> +  riscv_parse_name (name, exp, mode)
> +bool riscv_parse_name (const char *, struct expressionS *, enum expr_mode);
> +
>  #define md_finish riscv_md_finish
>  #define CONVERT_SYMBOLIC_ATTRIBUTE riscv_convert_symbolic_attribute
>
>

  reply	other threads:[~2023-03-15 10:08 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-13  8:01 [PATCH 0/2] RISC-V/gas: re-work register named symbols avoidance logic Jan Beulich
2023-02-13  8:02 ` [PATCH 1/2] RISC-V: test for expected / no unexpected symbols Jan Beulich
2023-02-13  8:02 ` [PATCH 2/2] RISC-V: adjust logic to avoid register name symbols Jan Beulich
2023-02-15  3:09   ` Nelson Chu
2023-02-15  8:03     ` Jan Beulich
2023-02-15  8:48       ` Nelson Chu
2023-02-15  9:21         ` Jan Beulich
2023-02-16  9:36           ` Nelson Chu
2023-02-16 16:49             ` Jan Beulich
2023-02-16 16:54               ` Palmer Dabbelt
2023-03-03 13:52             ` Jan Beulich
2023-03-15 10:07               ` Nelson Chu [this message]
2023-03-15 11:49                 ` Jan Beulich
2023-03-16  0:30                   ` Nelson Chu
2023-04-19  9:37                     ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPpQWtA3xQtOZS6n9dMAWMhCzuFh_2gZ5q8gLkZoh31dxD5qvw@mail.gmail.com \
    --to=nelson@rivosinc.com \
    --cc=andrew@sifive.com \
    --cc=binutils@sourceware.org \
    --cc=jbeulich@suse.com \
    --cc=jim.wilson.gcc@gmail.com \
    --cc=palmer@dabbelt.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).