public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* RISC-V: mapping symbols vs "unimpl"
@ 2022-02-14  8:26 Jan Beulich
  2022-02-14 10:35 ` Andrew Waterman
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2022-02-14  8:26 UTC (permalink / raw)
  To: Nelson Chu; +Cc: Binutils

Nelson,

with your introduction of mapping symbols I have trouble expressing
certain spec-conforming deliberately illegal 32-bit instructions (low
16 bits all zero). I can't use "unimpl" itself, as that for whichever
reason resolves to "csrrw x0, cycle, x0" (and oddly enough using that
form as input goes through without even a warning, but that's just a
side note). I also can't use .insn, as that places requirements on
the low two bits of the main opcode. And now I also can't use .word
anymore, as that will cause mapping symbols to be inserted, which
therefore - by a disassembler honoring the mapping symbols - won't
disassemble as an instruction anymore.

Do you have any suggestion how to encode a spec-conforming "unimpl",
which I want to be part of my own disassembler's test cases (in
particular the 0xffff0000 form)? Since the commit introducing the
mapping symbols refers to Arm, I'd like to point out that their .insn
equivalents allow to encode entirely arbitrary instruction forms. But
of course I understand that RISC-V's insn length encoding scheme is
somewhat in conflict with this.

Two further remarks: Even ".insn ci ..." cannot be used, not even for
forms with the high 16 bits not all set (which again the main opcode
restriction would get in the way of): Already just temporarily
enabling RVC causes the RVC bit to be set in the ELF header flags.
Yet with that bit set 0xffff0000 is actually a 16-bit insn 0x0000
followed by a wider insn with the low 16 bits all set. IOW this
conflicts with the spec's wording of "minimal length insn with the
low 16 bits all zero".

And then I'm puzzled by the main opcode restriction related error
being raised a whopping 6 times for ".insn i 0, 0, x0, x30, ~0",
and still twice instead of just once for ".insn ci 3, 7, x31, ~0".

Jan


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: RISC-V: mapping symbols vs "unimpl"
  2022-02-14  8:26 RISC-V: mapping symbols vs "unimpl" Jan Beulich
@ 2022-02-14 10:35 ` Andrew Waterman
  2022-02-14 11:24   ` Nelson Chu
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Waterman @ 2022-02-14 10:35 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Nelson Chu, Binutils

On Mon, Feb 14, 2022 at 12:27 AM Jan Beulich via Binutils
<binutils@sourceware.org> wrote:
>
> Nelson,
>
> with your introduction of mapping symbols I have trouble expressing
> certain spec-conforming deliberately illegal 32-bit instructions (low
> 16 bits all zero). I can't use "unimpl" itself, as that for whichever
> reason resolves to "csrrw x0, cycle, x0"

I just wanted to remark on this seemingly arbitrary choice: we chose
it because it's both architecturally guaranteed to be 32 bits wide and
architecturally guaranteed to trap.  (The CSR in question is
definitionally read-only, and writing a read-only CSR must trap.)  I
realize it's of no help to your query, but I figured it was best not
to leave this point dangling.

> (and oddly enough using that
> form as input goes through without even a warning, but that's just a
> side note). I also can't use .insn, as that places requirements on
> the low two bits of the main opcode. And now I also can't use .word
> anymore, as that will cause mapping symbols to be inserted, which
> therefore - by a disassembler honoring the mapping symbols - won't
> disassemble as an instruction anymore.
>
> Do you have any suggestion how to encode a spec-conforming "unimpl",
> which I want to be part of my own disassembler's test cases (in
> particular the 0xffff0000 form)? Since the commit introducing the
> mapping symbols refers to Arm, I'd like to point out that their .insn
> equivalents allow to encode entirely arbitrary instruction forms. But
> of course I understand that RISC-V's insn length encoding scheme is
> somewhat in conflict with this.
>
> Two further remarks: Even ".insn ci ..." cannot be used, not even for
> forms with the high 16 bits not all set (which again the main opcode
> restriction would get in the way of): Already just temporarily
> enabling RVC causes the RVC bit to be set in the ELF header flags.
> Yet with that bit set 0xffff0000 is actually a 16-bit insn 0x0000
> followed by a wider insn with the low 16 bits all set. IOW this
> conflicts with the spec's wording of "minimal length insn with the
> low 16 bits all zero".
>
> And then I'm puzzled by the main opcode restriction related error
> being raised a whopping 6 times for ".insn i 0, 0, x0, x30, ~0",
> and still twice instead of just once for ".insn ci 3, 7, x31, ~0".
>
> Jan
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: RISC-V: mapping symbols vs "unimpl"
  2022-02-14 10:35 ` Andrew Waterman
@ 2022-02-14 11:24   ` Nelson Chu
  2022-02-14 12:49     ` Jan Beulich
  0 siblings, 1 reply; 4+ messages in thread
From: Nelson Chu @ 2022-02-14 11:24 UTC (permalink / raw)
  To: Andrew Waterman; +Cc: Jan Beulich, Binutils

Hi Jan,

On Mon, Feb 14, 2022 at 6:35 PM Andrew Waterman <andrew@sifive.com> wrote:
>
> On Mon, Feb 14, 2022 at 12:27 AM Jan Beulich via Binutils
> <binutils@sourceware.org> wrote:
> >
> > Nelson,
> >
> > with your introduction of mapping symbols I have trouble expressing
> > certain spec-conforming deliberately illegal 32-bit instructions (low
> > 16 bits all zero). I can't use "unimpl" itself, as that for whichever
> > reason resolves to "csrrw x0, cycle, x0"
>
> I just wanted to remark on this seemingly arbitrary choice: we chose
> it because it's both architecturally guaranteed to be 32 bits wide and
> architecturally guaranteed to trap.  (The CSR in question is
> definitionally read-only, and writing a read-only CSR must trap.)  I
> realize it's of no help to your query, but I figured it was best not
> to leave this point dangling.

I'm not really sure the history, but according the implementation,
https://github.com/riscv-collab/riscv-binutils-gdb/blob/riscv-binutils-2.38/opcodes/riscv-opc.c#L272
The unimp is encoded as "0x000" when rvc is enable, and is encoded as
"csrrw x0, cycle, x0" when rvc is disabled.

> > (and oddly enough using that
> > form as input goes through without even a warning, but that's just a
> > side note). I also can't use .insn, as that places requirements on
> > the low two bits of the main opcode. And now I also can't use .word
> > anymore, as that will cause mapping symbols to be inserted, which
> > therefore - by a disassembler honoring the mapping symbols - won't
> > disassemble as an instruction anymore.
> >
> > Do you have any suggestion how to encode a spec-conforming "unimpl",
> > which I want to be part of my own disassembler's test cases (in
> > particular the 0xffff0000 form)? Since the commit introducing the
> > mapping symbols refers to Arm, I'd like to point out that their .insn
> > equivalents allow to encode entirely arbitrary instruction forms. But
> > of course I understand that RISC-V's insn length encoding scheme is
> > somewhat in conflict with this.

We are used to checking if the instruction is valid or not for the
.insn directive.  So this made me think - maybe we should allow .insn
to encode any instruction, including illegal formats, just like what
Arm/AArch do as you mentioned.

> > Two further remarks: Even ".insn ci ..." cannot be used, not even for
> > forms with the high 16 bits not all set (which again the main opcode
> > restriction would get in the way of): Already just temporarily
> > enabling RVC causes the RVC bit to be set in the ELF header flags.
> > Yet with that bit set 0xffff0000 is actually a 16-bit insn 0x0000
> > followed by a wider insn with the low 16 bits all set. IOW this
> > conflicts with the spec's wording of "minimal length insn with the
> > low 16 bits all zero".

For now you probably can write 0xffff0000 by two .insn directives.
For example, .insn 0x0 + .insn 0xffff, and then you should get
something like,

Disassembly of section .text:

0000000000000000 <.text>:
   0:   0000                    unimp
   2:   ffff                    .2byte  0xffff

Anyway, this is just a workaround.  I will find relevant people to
discuss whether the .insn directive should be allowed to encode an
illegal format.

> > And then I'm puzzled by the main opcode restriction related error
> > being raised a whopping 6 times for ".insn i 0, 0, x0, x30, ~0",
> > and still twice instead of just once for ".insn ci 3, 7, x31, ~0".
> >
> > Jan
> >


Thanks
Nelson

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: RISC-V: mapping symbols vs "unimpl"
  2022-02-14 11:24   ` Nelson Chu
@ 2022-02-14 12:49     ` Jan Beulich
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2022-02-14 12:49 UTC (permalink / raw)
  To: Nelson Chu; +Cc: Binutils, Andrew Waterman

On 14.02.2022 12:24, Nelson Chu wrote:
> On Mon, Feb 14, 2022 at 6:35 PM Andrew Waterman <andrew@sifive.com> wrote:
>> On Mon, Feb 14, 2022 at 12:27 AM Jan Beulich via Binutils
>> <binutils@sourceware.org> wrote:
>>> Do you have any suggestion how to encode a spec-conforming "unimpl",
>>> which I want to be part of my own disassembler's test cases (in
>>> particular the 0xffff0000 form)? Since the commit introducing the
>>> mapping symbols refers to Arm, I'd like to point out that their .insn
>>> equivalents allow to encode entirely arbitrary instruction forms. But
>>> of course I understand that RISC-V's insn length encoding scheme is
>>> somewhat in conflict with this.
> 
> We are used to checking if the instruction is valid or not for the
> .insn directive.  So this made me think - maybe we should allow .insn
> to encode any instruction, including illegal formats, just like what
> Arm/AArch do as you mentioned.
> 
>>> Two further remarks: Even ".insn ci ..." cannot be used, not even for
>>> forms with the high 16 bits not all set (which again the main opcode
>>> restriction would get in the way of): Already just temporarily
>>> enabling RVC causes the RVC bit to be set in the ELF header flags.
>>> Yet with that bit set 0xffff0000 is actually a 16-bit insn 0x0000
>>> followed by a wider insn with the low 16 bits all set. IOW this
>>> conflicts with the spec's wording of "minimal length insn with the
>>> low 16 bits all zero".
> 
> For now you probably can write 0xffff0000 by two .insn directives.
> For example, .insn 0x0 + .insn 0xffff, and then you should get
> something like,
> 
> Disassembly of section .text:
> 
> 0000000000000000 <.text>:
>    0:   0000                    unimp
>    2:   ffff                    .2byte  0xffff
> 
> Anyway, this is just a workaround.  I will find relevant people to
> discuss whether the .insn directive should be allowed to encode an
> illegal format.

Hmm, interesting. I wasn't aware of these alternative forms of
.insn. But I wonder in how far I can actually rely on the present
behavior: ".insn 0xffff" emitting a 16-bit value does not really
form a valid 16-bit insn. I'd rather expect the "value conflicts
with instruction length" in this case. And if a 16-bit insn is
emitted, I'd expect a check for RVC being enabled, which in turn
would have led to RVC being recorded in the header (creating the
same problem as mentioned before again).

But yes, for the immediate purpose with 2.38, "utilizing" its
shortcomings, I should be able to use this approach.

Looking at riscv_ip_hardcode() I also notice that

	.insn 8, 0x3f

wrongly triggers that very error, due to the undefined behavior of
the shift in this case.

Jan


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-02-14 12:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14  8:26 RISC-V: mapping symbols vs "unimpl" Jan Beulich
2022-02-14 10:35 ` Andrew Waterman
2022-02-14 11:24   ` Nelson Chu
2022-02-14 12:49     ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).