public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* Teaching expression() to treat some operations specially
@ 2022-07-09 16:47 Dmitry Selyutin
  2022-07-11  1:01 ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-09 16:47 UTC (permalink / raw)
  To: Binutils; +Cc: Luke Kenneth Casson Leighton, Alan Modra

Hi folks, I'm still working on SVP64 extension support[0]. I'm now
checking the SVP64 predicates[1].
Recently I stumbled upon the need to be able to resolve some complex
expressions as a single entity.
Here's the exhaustive list we're going to support: 1<<r3, r3, ~r3,
r10, ~r10, r30, ~r30, lt, nl, ge, gt, ng, le, eq, ne, so, un, ns, nu.
All of these can be represented as macros. For example, let's consider
the instruction `sv.setb/dm=~r3/sm=1<<r3 5, 31`.
When SVP64 parsing code sees dm= or sm=, it simply iterates over the
possible arguments[2].
However, we'd also like to have macros support here, like:

    .set DM, ~r3
    .set SM, 1<<r3
    sv.setb/dm=DM/sm=SM 5, 31

The obvious candidate was expression() routine, which I already
re-used for a simpler case when the value is always a constant[3].
However, things like 1<<r3 and ~r3 are more tricky, and I'm not sure
how to handle these.
expression() would, as it seems, defer the calculation, since r3 is a symbol.
Instead, I'd like to teach expression() the knowledge that 1<<r3
combination is special.

Two places I've been thinking of are ppc_parse_name() and register_name().
The first one "hooks" the name lookup algorithm, but it unsurprizingly
only gets "r3" from "1<<r3".
The second one seems like a minor hack, substituting expression()
logic (in fact, expression() is a fallback for it).

To be honest, I'm not sure how to achieve the goal. In fact, I must
teach expression() to treat some operations specially in some
contexts.
Could you, please, recommend me options to achieve the goal? I want to
keep the logic clear to anyone potentially reading this code later.

Thank you!

[0] https://libre-soc.org/
[1] https://libre-soc.org/openpower/sv/svp64/appendix/
[2] https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc-svp64.c;h=9dfafa20347f5085653439f7bd4d70c1710a205d;hb=refs/heads/svp64-ng#l144
[3] https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc-svp64.c;h=9dfafa20347f5085653439f7bd4d70c1710a205d;hb=refs/heads/svp64-ng#l349

--
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-09 16:47 Teaching expression() to treat some operations specially Dmitry Selyutin
@ 2022-07-11  1:01 ` Alan Modra
  2022-07-11  2:33   ` lkcl
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2022-07-11  1:01 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: Binutils, Luke Kenneth Casson Leighton

On Sat, Jul 09, 2022 at 07:47:26PM +0300, Dmitry Selyutin wrote:
> Hi folks, I'm still working on SVP64 extension support[0]. I'm now
> checking the SVP64 predicates[1].
> Recently I stumbled upon the need to be able to resolve some complex
> expressions as a single entity.
> Here's the exhaustive list we're going to support: 1<<r3, r3, ~r3,
> r10, ~r10, r30, ~r30, lt, nl, ge, gt, ng, le, eq, ne, so, un, ns, nu.
> All of these can be represented as macros. For example, let's consider
> the instruction `sv.setb/dm=~r3/sm=1<<r3 5, 31`.
> When SVP64 parsing code sees dm= or sm=, it simply iterates over the
> possible arguments[2].
> However, we'd also like to have macros support here, like:
> 
>     .set DM, ~r3
>     .set SM, 1<<r3
>     sv.setb/dm=DM/sm=SM 5, 31
> 
> The obvious candidate was expression() routine, which I already
> re-used for a simpler case when the value is always a constant[3].
> However, things like 1<<r3 and ~r3 are more tricky, and I'm not sure
> how to handle these.
> expression() would, as it seems, defer the calculation, since r3 is a symbol.
> Instead, I'd like to teach expression() the knowledge that 1<<r3
> combination is special.
> 
> Two places I've been thinking of are ppc_parse_name() and register_name().
> The first one "hooks" the name lookup algorithm, but it unsurprizingly
> only gets "r3" from "1<<r3".
> The second one seems like a minor hack, substituting expression()
> logic (in fact, expression() is a fallback for it).

PowerPC register operands are normally just plain numbers, or
expressions that resolve to a number.  "r3" is normally just a
symbol, in fact people often use a bunch of defines to have a
preprocessor replace "r3" with plain "3".  That said, the assembler
does recognise "%r3" as a register via register_name(), and plain "r3"
if you pass -mregnames to gas.  However, it only does that at the
start of an operand expression.  There's a good reason for that:  We
don't want to confuse the gas lexing code into smashing '%' into
following alphas.  We'd like '%' to have its normal meaning in a
modulus expression.

ppc_parse_name is a hook as you say (translating a register name to an
O_register expression) but only for CR regs and condition expresions
like "4*cr7+so".  It is only enabled in contexts where such an
expression is expected, again because gas lexing is simplistic.

> To be honest, I'm not sure how to achieve the goal. In fact, I must
> teach expression() to treat some operations specially in some
> contexts.

It seems to me you will firstly need to parse your example
"sv.setb/dm=~r3/sm=1<<r3" into "sv.setb", "dm=", "~r3", "sm=", and
"1<<r3".  You should modify ppc_parse_name to handle the new
register expressions in this context only (see cr_operand), and call
expression() on "~r3" and "1<<r3".  You may find it better to create
O_constant values instead of O_register in your modified
ppc_parse_name, if you want ~r3 to resolve to 0xff..fffc early.

> Could you, please, recommend me options to achieve the goal? I want to
> keep the logic clear to anyone potentially reading this code later.
> 
> Thank you!
> 
> [0] https://libre-soc.org/
> [1] https://libre-soc.org/openpower/sv/svp64/appendix/
> [2] https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc-svp64.c;h=9dfafa20347f5085653439f7bd4d70c1710a205d;hb=refs/heads/svp64-ng#l144
> [3] https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc-svp64.c;h=9dfafa20347f5085653439f7bd4d70c1710a205d;hb=refs/heads/svp64-ng#l349
> 
> --
> Best regards,
> Dmitry Selyutin

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11  1:01 ` Alan Modra
@ 2022-07-11  2:33   ` lkcl
  2022-07-11  3:08     ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: lkcl @ 2022-07-11  2:33 UTC (permalink / raw)
  To: Alan Modra, Dmitry Selyutin; +Cc: Binutils

alan thank you for the insights. i am using an unusual mailer which only topposts: rather than subject everyone to that i'm trimming all context.

some background on svp64 predication (which is not part of Power ISA): it is the same concept as RVV and NEC SX Aurora (Cray-style) predicate masks except a bit more sophisticated. basic concept of Predicated Cray Vectors: elements may be dynamically skipped at runtime based on VL bits read from the regiater file:

for i in range(VL):
   if get_mask()&(1<<i) == 0: continue
   // do operation

get_mask() above may be one of the following sources:

* all ones (disables predication)
* the contents of r3, r10, r31 or their bit-inverses
* 1<<r3 i.e. one single unary bit
* the concatenation (and inversion) of a selected CR Field
   from CR0, CR1, ..... CRn
   sm=eq means concatenate CR0.eq CR1.eq CR2.eq as the mask

i particularly like 1<<r3 because it is

for i in range(VL):
   if (1<<GPR(r3))&(1<<i) == 0: continue
   GPR(RT+i) = GPR(RA+i) + GPR(RB+i) # vector add

i.e. just

   GPR(RT+GPR(r3)) = GPR(RA+GPR(r3)) + GPR(RB+GPR(r3))


this should help clear up some potential confusion on the meaning of sm=~r3, it is not the inversion of a static constant "3" (to create 0xfff....fffc) but a mnemonic to put a bitpattern into the 24-bit prefix, "at *instruction execute* time please read then use contents of register r3 as the predicate mask, invert all its bits before passing it on".

we could have used "sm=0b0111" or other numeric code and simply dropped that directly into the 24-bit prefix. however this loses the opportunity to perform macro substitution.

another idea we discussed was to move the "~" to the lhs:

    ~sm=r3 / !sm=r3

or

     sm ~= r3

on the basis that only the rhs gets passed to expression().  in english:

      inverted source mask is r3

readability of 1<<r3 is compromised however by trying something aimilar.  we thought perhaps "sm=ur3" or such, for "source mask is unary from r3" but it is pushing things.

adapting 0_constant to recognise 1<<r3 as a special pattern would help preserve readability greatly. i take it you mean that after the lexer has recognised "1" "<<" "r3" that there is a tree-walk done by cr_operand() that *replaces* the patterns?

l.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11  2:33   ` lkcl
@ 2022-07-11  3:08     ` Alan Modra
  2022-07-11 14:28       ` Dmitry Selyutin
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2022-07-11  3:08 UTC (permalink / raw)
  To: lkcl; +Cc: Dmitry Selyutin, Binutils

On Mon, Jul 11, 2022 at 03:33:55AM +0100, lkcl wrote:
> alan thank you for the insights. i am using an unusual mailer which only topposts: rather than subject everyone to that i'm trimming all context.
> 
> some background on svp64 predication (which is not part of Power ISA): it is the same concept as RVV and NEC SX Aurora (Cray-style) predicate masks except a bit more sophisticated. basic concept of Predicated Cray Vectors: elements may be dynamically skipped at runtime based on VL bits read from the regiater file:
> 
> for i in range(VL):
>    if get_mask()&(1<<i) == 0: continue
>    // do operation
> 
> get_mask() above may be one of the following sources:
> 
> * all ones (disables predication)
> * the contents of r3, r10, r31 or their bit-inverses
> * 1<<r3 i.e. one single unary bit
> * the concatenation (and inversion) of a selected CR Field
>    from CR0, CR1, ..... CRn
>    sm=eq means concatenate CR0.eq CR1.eq CR2.eq as the mask
> 
> i particularly like 1<<r3 because it is
> 
> for i in range(VL):
>    if (1<<GPR(r3))&(1<<i) == 0: continue
>    GPR(RT+i) = GPR(RA+i) + GPR(RB+i) # vector add
> 
> i.e. just
> 
>    GPR(RT+GPR(r3)) = GPR(RA+GPR(r3)) + GPR(RB+GPR(r3))
> 
> 
> this should help clear up some potential confusion on the meaning of sm=~r3, it is not the inversion of a static constant "3" (to create 0xfff....fffc) but a mnemonic to put a bitpattern into the 24-bit prefix, "at *instruction execute* time please read then use contents of register r3 as the predicate mask, invert all its bits before passing it on".

OK, then you'll want ppc_parse_name to continue to create O_register
values.  I think in that case (haven't checked) that you'll get a tree
for ~r3, top node with X_op = O_bit_not and X_add_symbol an expression
symbol with value X_op = O_register, X_add_number = 3.  There is
opportunity in ppc_optimize_expr to fiddle that tree to something
nicer.

> we could have used "sm=0b0111" or other numeric code and simply dropped that directly into the 24-bit prefix. however this loses the opportunity to perform macro substitution.
> 
> another idea we discussed was to move the "~" to the lhs:
> 
>     ~sm=r3 / !sm=r3
> 
> or
> 
>      sm ~= r3
> 
> on the basis that only the rhs gets passed to expression().  in english:
> 
>       inverted source mask is r3
> 
> readability of 1<<r3 is compromised however by trying something aimilar.  we thought perhaps "sm=ur3" or such, for "source mask is unary from r3" but it is pushing things.
> 
> adapting 0_constant to recognise 1<<r3 as a special pattern would help preserve readability greatly. i take it you mean that after the lexer has recognised "1" "<<" "r3" that there is a tree-walk done by cr_operand() that *replaces* the patterns?
> 
> l.
> 

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11  3:08     ` Alan Modra
@ 2022-07-11 14:28       ` Dmitry Selyutin
  2022-07-11 19:13         ` Dmitry Selyutin
  2022-07-11 23:37         ` Alan Modra
  0 siblings, 2 replies; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-11 14:28 UTC (permalink / raw)
  To: Alan Modra; +Cc: lkcl, Binutils

On Mon, Jul 11, 2022 at 6:08 AM Alan Modra <amodra@gmail.com> wrote:
> On Mon, Jul 11, 2022 at 03:33:55AM +0100, lkcl wrote:
> > please read then use contents of register r3 as the predicate mask, invert all its bits before passing it on".
>
> OK, then you'll want ppc_parse_name to continue to create O_register
> values.

There is another difficulty with statements like "r3" or "~r3" in
macros. As such, PPC does not support register names in macros, at
all.

    .set REG, %r0
    extsw REG, 2

This code leads to "junk at end of line, first unrecognized character
is `r'" error. Also, using "r0" instead of "%r0" also fails due to
unknown relocation.
From the code around it looks that we could define a simple
md_operand() routine. This _almost_ works, but now I hit checks at
tc-ppc.c:3484.

I think we're being bitten by some discrepancies between these checks
and symbol expansion.
Alan, could you, please, help me with understanding these checks? I
understand it's been 5 years ago...

P.S. I've inlined the patch for a simple md_operand routine here.

diff --git a/gas/config/tc-ppc.c b/gas/config/tc-ppc.c
index ac61cd8f12..c04bde0d00 100644
--- a/gas/config/tc-ppc.c
+++ b/gas/config/tc-ppc.c
@@ -3275,8 +3275,14 @@ parse_tls_arg (char **str, const expressionS
*exp, struct ppc_fixup *tls_fix)
}
#endif

-/* This routine is called for each instruction to be assembled.  */
+void
+md_operand (expressionS *exp)
+{
+  if (!register_name (exp))
+    expression (exp);
+}

+/* This routine is called for each instruction to be assembled.  */
void
md_assemble (char *str)
{
diff --git a/gas/config/tc-ppc.h b/gas/config/tc-ppc.h
index ed06a29638..a65e51c875 100644
--- a/gas/config/tc-ppc.h
+++ b/gas/config/tc-ppc.h
@@ -328,8 +328,6 @@ extern int ppc_parse_name (const char *, struct
expressionS *);
#define md_optimize_expr(left, op, right) ppc_optimize_expr (left, op, right)
extern int ppc_optimize_expr (expressionS *, operatorT, expressionS *);

-#define md_operand(x)
-
#define md_cleanup() ppc_cleanup ()
extern void ppc_cleanup (void);

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11 14:28       ` Dmitry Selyutin
@ 2022-07-11 19:13         ` Dmitry Selyutin
  2022-07-11 23:37         ` Alan Modra
  1 sibling, 0 replies; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-11 19:13 UTC (permalink / raw)
  To: Alan Modra; +Cc: lkcl, Binutils

On Mon, Jul 11, 2022 at 5:28 PM Dmitry Selyutin <ghostmansd@gmail.com> wrote:
> I think we're being bitten by some discrepancies between these checks
> and symbol expansion.

OK, actually it's way simpler. Some fields are left uninitialized, and
this includes fields which are later checked.
This patch below simply assigns the expression from the symbol to the
expression we process.
However, I'm not sure whether this is the right thing to do; even if
so, this logic is hardly unique for registers.
Anyway, posting here to show that with this path things work (along
with the previous patch).

diff --git a/gas/expr.c b/gas/expr.c
index f4ea24717d..a46e220526 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -1350,8 +1350,7 @@ operand (expressionS *expressionP, enum expr_mode mode)
           }
         else if (mode != expr_defer && segment == reg_section)
           {
-             expressionP->X_op = O_register;
-             expressionP->X_add_number = S_GET_VALUE (symbolP);
+             *expressionP = *symbol_get_value_expression (symbolP);
           }
         else
           {



-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11 14:28       ` Dmitry Selyutin
  2022-07-11 19:13         ` Dmitry Selyutin
@ 2022-07-11 23:37         ` Alan Modra
  2022-07-12  4:18           ` Dmitry Selyutin
  1 sibling, 1 reply; 17+ messages in thread
From: Alan Modra @ 2022-07-11 23:37 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: lkcl, Binutils

On Mon, Jul 11, 2022 at 05:28:35PM +0300, Dmitry Selyutin wrote:
> +void
> +md_operand (expressionS *exp)
> +{
> +  if (!register_name (exp))
> +    expression (exp);
> +}
> 

You should not call expression like this without some syntactic
element being consumed.  The following is more reasonable.

void
md_operand (expressionS *e)
{
  (void) register_name (e);
}


-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-11 23:37         ` Alan Modra
@ 2022-07-12  4:18           ` Dmitry Selyutin
  2022-07-12  6:25             ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12  4:18 UTC (permalink / raw)
  To: Alan Modra; +Cc: lkcl, Binutils

On Tue, Jul 12, 2022 at 2:37 AM Alan Modra <amodra@gmail.com> wrote:
>
> You should not call expression like this without some syntactic
> element being consumed.

The expression() is not the culprit, it's the fact that operand() does
not set all the expression fields from the symbol.
The patch below "fixes" it, though I'm quite dubious regarding whether
this is a fix at all.

diff --git a/gas/expr.c b/gas/expr.c
index f4ea24717d..a46e220526 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -1350,8 +1350,7 @@ operand (expressionS *expressionP, enum expr_mode mode)
           }
         else if (mode != expr_defer && segment == reg_section)
           {
-             expressionP->X_op = O_register;
-             expressionP->X_add_number = S_GET_VALUE (symbolP);
+             *expressionP = *symbol_get_value_expression (symbolP);
           }
         else
           {

However, about using expression() in md_operand(), there's another question.
One of particular cases we're interested in is extending operands with
the vector notation.
For example, *%r3 would mean "vector register %r3, same as %r3, but
operating on a vector".
This is the code we're currently using; what'd be the safe way to
replace expression()?

void
md_operand (expressionS *exp)
{
  bool vector = false;

  if (*input_line_pointer == '*')
    {
      ++input_line_pointer;
      vector = true;
    }

  if (!register_name (exp))
    expression (exp);

  if (vector)
    {
      if (exp->X_op == O_register)
        exp->X_op = O_vector_register;
      else if (exp->X_op == O_constant)
        exp->X_op = O_vector_constant;
      else
        exp->X_op = O_absent;
    }
}

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  4:18           ` Dmitry Selyutin
@ 2022-07-12  6:25             ` Jan Beulich
  2022-07-12  6:47               ` Dmitry Selyutin
  2022-07-12  6:58               ` Dmitry Selyutin
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Beulich @ 2022-07-12  6:25 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: lkcl, Binutils, Alan Modra

On 12.07.2022 06:18, Dmitry Selyutin via Binutils wrote:
> On Tue, Jul 12, 2022 at 2:37 AM Alan Modra <amodra@gmail.com> wrote:
>>
>> You should not call expression like this without some syntactic
>> element being consumed.
> 
> The expression() is not the culprit, it's the fact that operand() does
> not set all the expression fields from the symbol.
> The patch below "fixes" it, though I'm quite dubious regarding whether
> this is a fix at all.
> 
> diff --git a/gas/expr.c b/gas/expr.c
> index f4ea24717d..a46e220526 100644
> --- a/gas/expr.c
> +++ b/gas/expr.c
> @@ -1350,8 +1350,7 @@ operand (expressionS *expressionP, enum expr_mode mode)
>            }
>          else if (mode != expr_defer && segment == reg_section)
>            {
> -             expressionP->X_op = O_register;
> -             expressionP->X_add_number = S_GET_VALUE (symbolP);
> +             *expressionP = *symbol_get_value_expression (symbolP);
>            }

May I ask which other struct fields are consumed and where? It is
my understanding that for O_register other fields simply are
invalid ...

> However, about using expression() in md_operand(), there's another question.
> One of particular cases we're interested in is extending operands with
> the vector notation.
> For example, *%r3 would mean "vector register %r3, same as %r3, but
> operating on a vector".
> This is the code we're currently using; what'd be the safe way to
> replace expression()?

Maybe you want to recognize * as a unary operator, assigning it
one of the O_md<N> values?

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  6:25             ` Jan Beulich
@ 2022-07-12  6:47               ` Dmitry Selyutin
  2022-07-12  6:57                 ` Jan Beulich
  2022-07-12  6:58               ` Dmitry Selyutin
  1 sibling, 1 reply; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12  6:47 UTC (permalink / raw)
  To: Jan Beulich; +Cc: lkcl, Binutils, Alan Modra

On Tue, Jul 12, 2022 at 9:25 AM Jan Beulich <jbeulich@suse.com> wrote:
> May I ask which other struct fields are consumed and where? It is
> my understanding that for O_register other fields simply are
> invalid ...

The check takes a look at the X_md field. I assume this is a
PPC-specific tweak, though.
The check we fail to pass after the call to expression() is here:
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l3468

We do indeed tune X_md field in several places to recognize the type
of the register.
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l857
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l915
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l938
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l952
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l966

I assume this logic is intended; X_md is listed as a machine-dependent
field, so PPC port took the liberty to re-use it for this information.

>
> > However, about using expression() in md_operand(), there's another question.
> > One of particular cases we're interested in is extending operands with
> > the vector notation.
> > For example, *%r3 would mean "vector register %r3, same as %r3, but
> > operating on a vector".
> > This is the code we're currently using; what'd be the safe way to
> > replace expression()?
>
> Maybe you want to recognize * as a unary operator, assigning it
> one of the O_md<N> values?

Yes, I think this would be much a better option! But I didn't manage
to get md_operator working with * symbol...
It seems by the time the execution arrives at md_operator call site, *
is already consumed by the operand() logic.
Perhaps I should tweak lex_type with LEX_NAME flag respectively?
This for sure should work somehow, considering that x86 has
expressions like "jmp *%rax".
Could you provide an example, please?

Anyway, even if we teach md_operator to recognize * operator, this
still leaves the question of macros open.
Regardless of the vector notation used, we're unable to do things like
`.set XXX, %r3`.

FWIW, the "better" version of the patch to expr.c would be this:

diff --git a/gas/expr.c b/gas/expr.c
index f4ea24717d..cf8fa961eb 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -1341,17 +1341,12 @@ operand (expressionS *expressionP, enum expr_mode mode)
         /* If we have an absolute symbol or a reg, then we know its
            value now.  */
         segment = S_GET_SEGMENT (symbolP);
-         if (mode != expr_defer
-             && segment == absolute_section
-             && !S_FORCE_RELOC (symbolP, 0))
+         if ((mode != expr_defer
+              && segment == absolute_section
+              && !S_FORCE_RELOC (symbolP, 0))
+            || (mode != expr_defer && segment == reg_section))
           {
-             expressionP->X_op = O_constant;
-             expressionP->X_add_number = S_GET_VALUE (symbolP);
-           }
-         else if (mode != expr_defer && segment == reg_section)
-           {
-             expressionP->X_op = O_register;
-             expressionP->X_add_number = S_GET_VALUE (symbolP);
+             *expressionP = *symbol_get_value_expression (symbolP);
           }
         else
           {

This is assuming that the patch is correct at all, of course.

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  6:47               ` Dmitry Selyutin
@ 2022-07-12  6:57                 ` Jan Beulich
  2022-07-12  8:50                   ` Dmitry Selyutin
  2022-07-12  9:01                   ` Dmitry Selyutin
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Beulich @ 2022-07-12  6:57 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: lkcl, Binutils, Alan Modra

On 12.07.2022 08:47, Dmitry Selyutin wrote:
> On Tue, Jul 12, 2022 at 9:25 AM Jan Beulich <jbeulich@suse.com> wrote:
>> May I ask which other struct fields are consumed and where? It is
>> my understanding that for O_register other fields simply are
>> invalid ...
> 
> The check takes a look at the X_md field. I assume this is a
> PPC-specific tweak, though.
> The check we fail to pass after the call to expression() is here:
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l3468
> 
> We do indeed tune X_md field in several places to recognize the type
> of the register.
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l857
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l915
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l938
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l952
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l966
> 
> I assume this logic is intended; X_md is listed as a machine-dependent
> field, so PPC port took the liberty to re-use it for this information.

I see. I have to admit that I don't know whether then it's the port's
or common code's responsibility to initialize X_md everywhere. I'm
sure Alan does ...

>>> However, about using expression() in md_operand(), there's another question.
>>> One of particular cases we're interested in is extending operands with
>>> the vector notation.
>>> For example, *%r3 would mean "vector register %r3, same as %r3, but
>>> operating on a vector".
>>> This is the code we're currently using; what'd be the safe way to
>>> replace expression()?
>>
>> Maybe you want to recognize * as a unary operator, assigning it
>> one of the O_md<N> values?
> 
> Yes, I think this would be much a better option! But I didn't manage
> to get md_operator working with * symbol...
> It seems by the time the execution arrives at md_operator call site, *
> is already consumed by the operand() logic.
> Perhaps I should tweak lex_type with LEX_NAME flag respectively?
> This for sure should work somehow, considering that x86 has
> expressions like "jmp *%rax".

No, that won't be parsed by expression(), the * will be taken off
elsewhere. AT&T syntax operand handling takes operands apart
"manually" (and uses expression() only on parts thereof), whereas
...

> Could you provide an example, please?

... you may find examples in Intel syntax handling (i386-intel.c),
which many years ago I did switch to actually going through
expression(). But: You may not get away without touching common
code, as unary and binary operators are treated separately, and
x86 uses it only for identifier-like unary operators. So it may
indeed be that * doesn't presently have a way to make it to
md_operator() when used in unary way.

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  6:25             ` Jan Beulich
  2022-07-12  6:47               ` Dmitry Selyutin
@ 2022-07-12  6:58               ` Dmitry Selyutin
  1 sibling, 0 replies; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12  6:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: lkcl, Binutils, Alan Modra

On Tue, Jul 12, 2022 at 9:25 AM Jan Beulich <jbeulich@suse.com> wrote:
> May I ask which other struct fields are consumed and where? It is
> my understanding that for O_register other fields simply are
> invalid ...

Also, it seems a bit strange that, instead of providing md_operand, we
rather try to parse with a custom routine, register_name:
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=5015777d60061e61ffe4c78591b793862919c769;hb=HEAD#l3444

I'd rather expect we have this logic in md_operand.
We could avoid duplicating some bits in ppc_parse_name.
Also, this way, we would cover macros as well, which we don't do now.

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  6:57                 ` Jan Beulich
@ 2022-07-12  8:50                   ` Dmitry Selyutin
  2022-07-12 10:44                     ` Jan Beulich
  2022-07-12  9:01                   ` Dmitry Selyutin
  1 sibling, 1 reply; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12  8:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: lkcl, Binutils, Alan Modra

On Tue, Jul 12, 2022 at 9:57 AM Jan Beulich <jbeulich@suse.com> wrote:
> x86 uses it only for identifier-like unary operators. So it may
> indeed be that * doesn't presently have a way to make it to
> md_operator() when used in unary way.

Frankly speaking, md_operator is somewhat confusing. What kind of
arguments does it expect?
I tried defining both -- ppc_parse_name, ppc_operand and ppc_operator.
The exact sequence of how these are called depends on many conditions.
Let's consider this code:

    .set REG, %r0
    extsw REG, 2

By default, I don't see any calls to ppc_operator at all, only calls
to ppc_operand.
If I update tc-ppc.h with `#define LEX_PCT (LEX_BEGIN_NAME)`,
ppc_operator is called before ppc_operand.
internals.texi lacks any information on ppc_operator, so I'm unsure
how to use it.
Also, it's especially confusing to see this call: `md_operator (NULL, 2, NULL)`.

Is there any documentation on how md_operator works? i386_operator
doesn't really explain it.
I think that, to make it work, lex_type must be updated with
LEX_BEGIN_NAME with % and * symbols.
But, considering the current depths of my knowledge for this code,
these are mostly guesses.


-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  6:57                 ` Jan Beulich
  2022-07-12  8:50                   ` Dmitry Selyutin
@ 2022-07-12  9:01                   ` Dmitry Selyutin
  1 sibling, 0 replies; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12  9:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: lkcl, Binutils, Alan Modra

On Tue, Jul 12, 2022 at 9:57 AM Jan Beulich <jbeulich@suse.com> wrote:
> I see. I have to admit that I don't know whether then it's the port's
> or common code's responsibility to initialize X_md everywhere. I'm
> sure Alan does ...

Updating X_md on port's side would be another option. For example,
this works too:

    expressionS ex;
    ex.X_md = (operand->flags &
        (PPC_OPERAND_GPR | PPC_OPERAND_FPR | PPC_OPERAND_VR
        | PPC_OPERAND_VSR | PPC_OPERAND_CR_BIT | PPC_OPERAND_CR_REG
        | PPC_OPERAND_SPR | PPC_OPERAND_GQR | PPC_OPERAND_ACC));
    expression (&ex);

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12  8:50                   ` Dmitry Selyutin
@ 2022-07-12 10:44                     ` Jan Beulich
  2022-07-12 11:17                       ` Dmitry Selyutin
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2022-07-12 10:44 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: lkcl, Binutils, Alan Modra

On 12.07.2022 10:50, Dmitry Selyutin wrote:
> On Tue, Jul 12, 2022 at 9:57 AM Jan Beulich <jbeulich@suse.com> wrote:
>> x86 uses it only for identifier-like unary operators. So it may
>> indeed be that * doesn't presently have a way to make it to
>> md_operator() when used in unary way.
> 
> Frankly speaking, md_operator is somewhat confusing. What kind of
> arguments does it expect?
> I tried defining both -- ppc_parse_name, ppc_operand and ppc_operator.
> The exact sequence of how these are called depends on many conditions.
> Let's consider this code:
> 
>     .set REG, %r0
>     extsw REG, 2
> 
> By default, I don't see any calls to ppc_operator at all, only calls
> to ppc_operand.
> If I update tc-ppc.h with `#define LEX_PCT (LEX_BEGIN_NAME)`,
> ppc_operator is called before ppc_operand.
> internals.texi lacks any information on ppc_operator, so I'm unsure
> how to use it.
> Also, it's especially confusing to see this call: `md_operator (NULL, 2, NULL)`.
> 
> Is there any documentation on how md_operator works? i386_operator
> doesn't really explain it.

I don't think there's any documentation.

> I think that, to make it work, lex_type must be updated with
> LEX_BEGIN_NAME with % and * symbols.

That might be an approach, but I'd suggest trying to avoid it. Instead
I think there's another place missing where md_operator() should be
called - when an otherwise binary operator char is used as a unary
operator. I think it is wrong to call md_operand() in that case, or at
least md_operator() may want trying first. Or wait - that's actually
the only call site of md_operand(), so why again can't you deal with *
there (converting the thing to an X_md<N> expression)?

Jan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12 10:44                     ` Jan Beulich
@ 2022-07-12 11:17                       ` Dmitry Selyutin
  2022-07-12 11:39                         ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Selyutin @ 2022-07-12 11:17 UTC (permalink / raw)
  To: Jan Beulich; +Cc: lkcl, Binutils, Alan Modra

On Tue, Jul 12, 2022 at 1:44 PM Jan Beulich <jbeulich@suse.com> wrote:
> why again can't you deal with *
> there (converting the thing to an X_md<N> expression)?

Actually this is what we already do. :-)
https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=989d1a29a56a364cdfcce6b670f7466ff8eaaeea;hb=refs/heads/svp64-ng#l4238

There are two issues:
1. Handle register names in macros in a uniform way.
2. Handle unary * operator.

The thread became a bit confusing due to these questions being
somewhat related (e.g. whether these should be parsed in the same
place).
My point was, since * is indeed an unary operator, it's kind of
natural for it to be handled at md_operator.
However, we can stick to md_operand, no problems; so the second
question is already solved.
As for the first one, I'm now preparing a patch for it (it's also
about md_operand, anyway).
I attempt to make the code somewhat more straightforward and generic,
and also handle X_md stuff at port's side.
I'll update on this. Thank you for your help and advice!

-- 
Best regards,
Dmitry Selyutin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Teaching expression() to treat some operations specially
  2022-07-12 11:17                       ` Dmitry Selyutin
@ 2022-07-12 11:39                         ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2022-07-12 11:39 UTC (permalink / raw)
  To: Dmitry Selyutin; +Cc: lkcl, Binutils, Alan Modra

On 12.07.2022 13:17, Dmitry Selyutin wrote:
> On Tue, Jul 12, 2022 at 1:44 PM Jan Beulich <jbeulich@suse.com> wrote:
>> why again can't you deal with *
>> there (converting the thing to an X_md<N> expression)?
> 
> Actually this is what we already do. :-)
> https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=989d1a29a56a364cdfcce6b670f7466ff8eaaeea;hb=refs/heads/svp64-ng#l4238
> 
> There are two issues:
> 1. Handle register names in macros in a uniform way.

Note that from all I can tell equates (which is what I think you
mean) are a problem for expression in a wider sense. Things get
especially "interesting" once using transitive equates ...

Jan

> 2. Handle unary * operator.
> 
> The thread became a bit confusing due to these questions being
> somewhat related (e.g. whether these should be parsed in the same
> place).
> My point was, since * is indeed an unary operator, it's kind of
> natural for it to be handled at md_operator.
> However, we can stick to md_operand, no problems; so the second
> question is already solved.
> As for the first one, I'm now preparing a patch for it (it's also
> about md_operand, anyway).
> I attempt to make the code somewhat more straightforward and generic,
> and also handle X_md stuff at port's side.
> I'll update on this. Thank you for your help and advice!
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-07-12 11:39 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-09 16:47 Teaching expression() to treat some operations specially Dmitry Selyutin
2022-07-11  1:01 ` Alan Modra
2022-07-11  2:33   ` lkcl
2022-07-11  3:08     ` Alan Modra
2022-07-11 14:28       ` Dmitry Selyutin
2022-07-11 19:13         ` Dmitry Selyutin
2022-07-11 23:37         ` Alan Modra
2022-07-12  4:18           ` Dmitry Selyutin
2022-07-12  6:25             ` Jan Beulich
2022-07-12  6:47               ` Dmitry Selyutin
2022-07-12  6:57                 ` Jan Beulich
2022-07-12  8:50                   ` Dmitry Selyutin
2022-07-12 10:44                     ` Jan Beulich
2022-07-12 11:17                       ` Dmitry Selyutin
2022-07-12 11:39                         ` Jan Beulich
2022-07-12  9:01                   ` Dmitry Selyutin
2022-07-12  6:58               ` Dmitry Selyutin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).