* Problems with relocations for a custom ISA
@ 2023-08-07 23:08 MegaIng
2023-08-08 14:13 ` Michael Matz
0 siblings, 1 reply; 6+ messages in thread
From: MegaIng @ 2023-08-07 23:08 UTC (permalink / raw)
To: binutils
Hello,
I am currently in the process of porting binutils to a custom
architecture I am designing with a few others (Spec [1], Start of our Port
[2]). An interesting quirk of this ISA is that it's highly modular,
starting with fixed-size 16bit opcodes, but with extensions supporting
variable length instructions similar in power to what x86 has with its
addressing modes. The base ISA has a fixed 16bit word, but there are
extensions for 32 and 64bit words.
Most of the basics I already managed to implement, i.e. I can generate
simple workable ELF files. However, I am running into problems with
relocations for "load immediate" instructions. Without extensions, we
want to potentially emit long chains of instructions (3 to 8 instructions
is realistic), but with proper extensions it can get down to only 1
instruction of 3 or 4 bytes. I am unsure how to best represent such
variable length relocations in BFD and ELF. It seems like those always
assume fixed size relocations that get relaxed away in their entirety if
no longer needed. Is the best solution really to emit multiple
relocations and treat them as one in our custom
elf_relocate_section function?
In a similar vein I noticed that it seems impossible to teach
bfd_perform_relocation to correctly perform the non-trivial
transformation required to encode the signed offsets of jumps (since
they are non-consecutive bitfields), which means that we get garbage if
that function is called, for example when `--oformat` is not elf. Is
this really unavoidable? I tried using special_function, but since it's
also called from bfd_install_relocation, I couldn't figure out what the
correct behavior inside of it would be.
Thanks in advance,
MegaIng
1. https://github.com/ETC-A/etca-spec
2. https://github.com/ETC-A/etca-binutils-gdb
* Re: Problems with relocations for a custom ISA
2023-08-07 23:08 Problems with relocations for a custom ISA MegaIng
@ 2023-08-08 14:13 ` Michael Matz
2023-08-08 14:35 ` MegaIng
0 siblings, 1 reply; 6+ messages in thread
From: Michael Matz @ 2023-08-08 14:13 UTC (permalink / raw)
To: MegaIng; +Cc: binutils
Hello,
On Tue, 8 Aug 2023, MegaIng via Binutils wrote:
> I am currently in the process of porting binutils to a custom architecture I
> design with a few others (Spec [1], Start of our Port [2]). An interesting
> quirk of this ISA is that its highly modular, starting with fixed-size 16bit
> opcodes, but with extensions supporting variable length instructions similar
> in power to what x86 has with it's addressing modes. The base ISA is fixed
> 16bit word, but there are extensions for 32 and 64bit words.
>
> Most of the basics I already managed to implement, i.e. I can generate simple
> workable ELF files. However, I am running into problems with relocations for
> "load immediate" instructions. Without extensions, we want to potentially emit
> long chains of instruction (3 to 8 instructions is realistic), but with proper
> extensions in can get down to only 1 instruction of 3 or 4 bytes. I am unsure
> how to best represent such variable length relocations in BFD and ELF.
The normal way would be to not do that. It seems the assembler will
already see either a long chain of small insns, or a single large insn,
right? So at that point you can already emit the correct relocs. For
example, if I have three insns: setlo, sethi and setall, setting the low
16 bits, the high 16 bits, or all 32 bits of a 32bit immediate, then I
also would have three reloc types: LOW16, HIGH16 and ABS32, which the
assembler would appropriately emit:
setlo %r1, lo(sym) --> RELOC_LOW16, symbol 'sym'
sethi %r1, hi(sym) --> RELOC_HIGH16, symbol 'sym'
setall %r1, sym --> RELOC_ABS32, symbol 'sym'
(obviously details will differ, your 16bit insns won't be able to quite
set all 16 bits :) ).
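As a rough illustration of the three relocation types above, the following Python sketch computes what each one would write for a given symbol value. The function names are hypothetical, and the sketch assumes that setlo and sethi simply fill the lower and upper halves with no sign extension or carry between them, which a real ISA may not guarantee.

```python
def reloc_low16(value):
    """Bits 0..15 of the symbol value (what 'setlo' would carry)."""
    return value & 0xFFFF


def reloc_high16(value):
    """Bits 16..31 of the symbol value (what 'sethi' would carry)."""
    return (value >> 16) & 0xFFFF


def reloc_abs32(value):
    """All 32 bits of the symbol value (what 'setall' would carry)."""
    return value & 0xFFFFFFFF


def run_setlo_sethi(value):
    """Model the two-insn sequence: setlo fills the low half, sethi the high half."""
    reg = reloc_low16(value)
    reg |= reloc_high16(value) << 16
    return reg
```

For `sym = 0x12345678`, the low and high relocations yield `0x5678` and `0x1234`, and the two-instruction sequence ends up with the same register value as the single ABS32 form.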
If you really want to optimize these sequences also at link time (but
why?) then all of this becomes more complicated, but remains essentially
the same. The secret will then be in linking from one of the small relocs
(say, the high16 one) to the other, for the linker to easily recognize the
whole insn pair and appropriately do something about those byte sequences.
In that scheme you need to distinguish between relocations applied to
relaxable code and relocations applied to random non-relaxable data. E.g.
you probably need two variants of the RELOC_LOW16 relocation.
Some bfd targets choose to limit themselves to only simple sequences of
relaxable instructions, e.g. if the low16/high16 setters always come in
sequence directly after each other (the compiler or asm author will need
to ensure this if it wants to benefit from relaxation), then one
reloc doesn't need to link to the other.
I wouldn't go that way if I were you: it seems the assembler/compiler
needs to know if targeting the extended ISA or not anyway, so generating
the right instructions and relocations from the start in the assembler
seems the right choice, and then doesn't need any relax complications at
link time.
Ciao,
Michael.
* Re: Problems with relocations for a custom ISA
2023-08-08 14:13 ` Michael Matz
@ 2023-08-08 14:35 ` MegaIng
2023-08-08 14:55 ` Xi Ruoyao
2023-08-08 15:35 ` Michael Matz
0 siblings, 2 replies; 6+ messages in thread
From: MegaIng @ 2023-08-08 14:35 UTC (permalink / raw)
To: Michael Matz; +Cc: binutils
Hi,
On 2023-08-08 Michael Matz wrote:
> Hello,
>
> On Tue, 8 Aug 2023, MegaIng via Binutils wrote:
>
>> I am currently in the process of porting binutils to a custom architecture I
>> design with a few others (Spec [1], Start of our Port [2]). An interesting
>> quirk of this ISA is that its highly modular, starting with fixed-size 16bit
>> opcodes, but with extensions supporting variable length instructions similar
>> in power to what x86 has with it's addressing modes. The base ISA is fixed
>> 16bit word, but there are extensions for 32 and 64bit words.
>>
>> Most of the basics I already managed to implement, i.e. I can generate simple
>> workable ELF files. However, I am running into problems with relocations for
>> "load immediate" instructions. Without extensions, we want to potentially emit
>> long chains of instruction (3 to 8 instructions is realistic), but with proper
>> extensions in can get down to only 1 instruction of 3 or 4 bytes. I am unsure
>> how to best represent such variable length relocations in BFD and ELF.
> The normal way would be to not do that. It seems the assembler will
> already see either a long chain of small insns, or a single large insn,
> right?
Our idea was that the user can use a simple pseudo instruction to
represent the entire process of loading a symbol (or any immediate for
that matter). Maybe this is a misguided idea?
> So at that point you can already emit the correct relocs. For
> example, if I have three insns: setlo, sethi and setall, setting the low
> 16 bits, the high 16 bits, or all 32 bits of a 32bit immediate, then I
> also would have three reloc types: LOW16, HIGH16 and ABS32, which the
> assembler would appropriately emit:
>
> setlo %r1, lo(sym) --> RELOC_LOW16, symbol 'sym'
> sethi %r1, hi(sym) --> RELOC_HIGH16, symbol 'sym'
> setall %r1, sym --> RELOC_ABS32, symbol 'sym'
>
> (obviously details will differ, your 16bit insns won't be able to quite
> set all 16 bits :) ).
> If you really want to optimize these sequences also at link time (but
> why?) then all of this becomes more complicated, but remains essentially
> the same. The secret will then be in linking from one of the small relocs
> (say, the high16 one) to the other, for the linker to easily recognize the
> whole insn pair and appropriately do something about those byte sequences.
> In that scheme you need to differ between relocations applied to relaxable
> code and relocation applied to random non-relaxable data. E.g. you
> probably need two variants of the RELOC_LOW16 relocation.
Not sure if you took a look at our instruction set: The way you would
load an arbitrary 16bit word is via a sequence of `slo` (shift left 5
and or) instructions which use a 5bit immediate (the largest we have in
base). So breaking it up into two RELOC_LOW_16 or similar wouldn't quite
work. It would have to be 3-4 relocations like RELOC_BITS_0_4,
RELOC_BITS_5_9, RELOC_BITS_10_15 or something like that. And you couldn't
exactly remove one of those without changing the others. But of course,
we don't always need all 4 instructions, sometimes we can get away with
only two or three; for example, if it's only an 8bit value, we only need
2 instructions. We would like to optimize these cases somewhere.
After a bit more discussion we came to the idea of having many relocations
that potentially cover multiple instructions so that the entire load-immediate
sequence can be covered by one relocation, but this is quite a large amount of
relocations.
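The chain lengths mentioned above can be checked with a small Python model. It is only a sketch: it assumes the register starts at zero, that each step computes `reg = (reg << 5) | imm5`, and that values are unsigned. The real ETCA sequence (for example a sign-extending first instruction) may differ, and the names `slo_chunks` and `run_chain` are made up for the illustration.

```python
IMM_BITS = 5  # width of the base-ISA immediate, as stated in the thread


def slo_chunks(value, imm_bits=IMM_BITS):
    """Split a non-negative value into imm_bits-wide chunks, most significant first.

    Always returns at least one chunk, so loading 0 still takes one insn.
    """
    if value < 0:
        raise ValueError("this sketch only models unsigned values")
    mask = (1 << imm_bits) - 1
    chunks = [value & mask]
    value >>= imm_bits
    while value:
        chunks.append(value & mask)
        value >>= imm_bits
    chunks.reverse()
    return chunks


def run_chain(chunks, imm_bits=IMM_BITS):
    """Model the assumed 'shift left and or' chain, starting from a zero register."""
    reg = 0
    for imm in chunks:
        reg = (reg << imm_bits) | imm
    return reg
```

Under these assumptions an 8bit value such as `0xAB` needs two chunks (`[5, 11]`), a full 16bit value needs four, and a full 32bit value needs seven.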
> Some bfd targets chose to limit themself to only simple sequences of
> relaxable instructions, e.g. if the low16/high16 setter always comes in
> sequence directly after each other (the compiler or asm author will need
> to ensure this if it wants to benefit from relaxation then), then one
> reloc doesn't need to link to the other.
>
> I wouldn't go that way if I were you: it seems the assembler/compiler
> needs to know if targeting the extended ISA or not anyway, so generating
> the right instructions and relocations from the start in the assembler
> seems the right choice, and then doesn't need any relax complications at
> link time.
As long as the range (or even the exact value) of the symbol is known at
assembly time, this is of course true, but what about situations where
nothing about the range of the value is known? It seems like other
assembler targets truncate the values in those cases? If we went for the
minimal representation we would basically limit external symbols to 5bit,
which isn't exactly ideal. And from what I can tell, growing a relocation
also isn't really something bfd is designed to deal with, right?
Many thanks already,
MegaIng
* Re: Problems with relocations for a custom ISA
2023-08-08 14:35 ` MegaIng
@ 2023-08-08 14:55 ` Xi Ruoyao
2023-08-08 15:35 ` Michael Matz
1 sibling, 0 replies; 6+ messages in thread
From: Xi Ruoyao @ 2023-08-08 14:55 UTC (permalink / raw)
To: MegaIng, Michael Matz; +Cc: binutils
On Tue, 2023-08-08 at 16:35 +0200, MegaIng via Binutils wrote:
> Our idea was that the user can use a simple pseudo instruction to
> represent the
> entire process of loading a symbol (or any immediate for that matter).
> Maybe this is a misguided idea?
I'd say it's a bad idea. A stack-based pseudo instruction reloc
approach had been used for LoongArch. But the "pseudo instruction" has
never been implemented completely (doing so is just impossible unless
you rewrite the entire libbfd) so actually we could only handle some
special cases, and this approach caused much more trouble than the
benefit. Now we've made these nasty things deprecated and we will
remove the support of them in a future Binutils release.
See https://github.com/loongson/LoongArch-Documentation/issues/9 for the
"much more trouble".
> It would have to be 3-4 RELOC_BITS_0_4, RELOC_BITS_5_9
> RELOC_BITS_10_15 or something like that
Yes, now we use some traditional reloc types for LoongArch like them.
And this approach works much better than the previous stack-based one.
We now really wish we'd never tried the stack-based approach at all.
ELF allows 2^31-1 reloc types, so a larger reloc type set is not an
issue.
> but what about situations where nothing about the range of the value
> is known
You need to design some code models (sets of assumptions for the
ranges), like other BFD ports do. If you really need the marginal
performance gain by exploiting the range limitations, you can also
implement linker relaxation.
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
* Re: Problems with relocations for a custom ISA
2023-08-08 14:35 ` MegaIng
2023-08-08 14:55 ` Xi Ruoyao
@ 2023-08-08 15:35 ` Michael Matz
2023-08-08 17:26 ` MegaIng
1 sibling, 1 reply; 6+ messages in thread
From: Michael Matz @ 2023-08-08 15:35 UTC (permalink / raw)
To: MegaIng; +Cc: binutils
Hello,
On Tue, 8 Aug 2023, MegaIng wrote:
> > > Most of the basics I already managed to implement, i.e. I can generate
> > > simple
> > > workable ELF files. However, I am running into problems with relocations
> > > for
> > > "load immediate" instructions. Without extensions, we want to potentially
> > > emit
> > > long chains of instruction (3 to 8 instructions is realistic), but with
> > > proper
> > > extensions in can get down to only 1 instruction of 3 or 4 bytes. I am
> > > unsure
> > > how to best represent such variable length relocations in BFD and ELF.
> > The normal way would be to not do that. It seems the assembler will
> > already see either a long chain of small insns, or a single large insn,
> > right?
>
> Our idea was that the user can use a simple pseudo instruction to
> represent the entire process of loading a symbol (or any immediate for
> that matter).
Pseudo instruction makes sense. But then it would still be the assembler
that expands it to either a couple base insns or a single extended insn.
The linker would see only one or the other, and hence also only the base
or the extended relocs.
Or did you really want to reserve some specific byte encoding for this
pseudo instruction to transfer it from assembler via object file to linker
and let only the linker replace that by one or the other variant? That
seems an unnecessarily complicated scheme. It depends on whether the
assembler does or doesn't know if it can target the extended insns, or
only the base ones. I would definitely suggest that the assembler, at the
latest, should know this.
> > (obviously details will differ, your 16bit insns won't be able to quite
> > set all 16 bits :) ).
> > If you really want to optimize these sequences also at link time (but
> > why?) then all of this becomes more complicated, but remains essentially
> > the same. The secret will then be in linking from one of the small relocs
> > (say, the high16 one) to the other, for the linker to easily recognize the
> > whole insn pair and appropriately do something about those byte sequences.
> > In that scheme you need to differ between relocations applied to relaxable
> > code and relocation applied to random non-relaxable data. E.g. you
> > probably need two variants of the RELOC_LOW16 relocation.
>
> Not sure if you took a look at our instruction set: The way you would load an
> arbitrary 16bit word is via a sequence of `slo` (shift left 5 and or)
> instructions which use a 5bit immediate (the largest we have in base). So
> breaking it up into two RELOC_LOW_16 or similar wouldn't quite work.
Sure, as I said above: "obviously details will differ".
> It would have to be 3-4 RELOC_BITS_0_4, RELOC_BITS_5_9 RELOC_BITS_10_15
> or something like that. And you couldn't exactly remove one of those
> without changing the others.
Yes, this is the usual way to express that. There are many architectures
which have similar ISA restrictions and they all do it essentially the
same way: "select X bits from value, put them into Y bits of field", for
potentially many combinations of (not necessarily consecutive) X and Y.
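The "select X bits from value, put them into Y bits of field" rule can be sketched in a few lines of Python. Everything specific here is an assumption made for the illustration: the relocation names, the table `RELOCS`, and the placement of the 5-bit immediate at bits 0..4 of a 16-bit instruction word.

```python
def select_and_place(word, value, value_lsb, width, field_lsb):
    """Take 'width' bits of 'value' starting at 'value_lsb' and store them
    into 'word' at bit position 'field_lsb', leaving all other bits alone."""
    mask = (1 << width) - 1
    chunk = (value >> value_lsb) & mask
    return (word & ~(mask << field_lsb)) | (chunk << field_lsb)


# Hypothetical relocation types: name -> (value_lsb, width).
RELOCS = {
    "BITS_0_4": (0, 5),
    "BITS_5_9": (5, 5),
    "BITS_10_14": (10, 5),
}
IMM_FIELD_LSB = 0  # assumed position of the 5-bit immediate in the insn


def apply_reloc(word, reloc_name, sym_value):
    """Apply one bit-selection relocation to a single 16-bit insn word."""
    value_lsb, width = RELOCS[reloc_name]
    return select_and_place(word, sym_value, value_lsb, width, IMM_FIELD_LSB)
```

For a symbol value of `(22 << 10) | (13 << 5) | 25` and an instruction template of `0xABE0`, the three relocations produce `0xABF6`, `0xABED` and `0xABF9` respectively: the opcode bits stay untouched and only the immediate field changes.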
> But ofcourse, we don't always need all 4
> instructions, sometimes we can get away with only two or three, for
> example if it's only an 8bit value, we only need 2 instructions. We
> would like to optimize these cases somewhere.
I see. Yeah, that will ultimately need some linker relaxation as only
that one will know for sure which values symbols have, and hence if they
do or do not fit certain constraints.
> After a bit more
> discussion we came to the idea of having many relocations that
> potentially cover multiple instructions so that the entire
> load-immediate sequence can be covered by one relocation,
As you have only such a short immediate field in the base ISA this seems
like a sensible idea, as otherwise, as you say, you need 7 relocations
(and insns) for a full 32bit load.
> but this is quite a large amount of relocations.
Hmm? I don't understand this remark. If you cover a range of
instructions by one relocation you necessarily need fewer relocs than if
you use one reloc per insn?
> > I wouldn't go that way if I were you: it seems the assembler/compiler
> > needs to know if targeting the extended ISA or not anyway, so generating
> > the right instructions and relocations from the start in the assembler
> > seems the right choice, and then doesn't need any relax complications at
> > link time.
>
> As long as the range (or even the exact value) of the symbol is known at
> assembly time, this is ofcourse true, but what about situations where nothing
> about the range of the value is known?
The compiler/assembler would always emit the full sequence (e.g. assumes
that the symbol in question happens to be full 32bit). If you want to
optimize this use in case the symbol happens to need fewer bits, then yes,
you do need linker relaxation. As said, you then need a way in the linker
to recognize an insn sequence that "belongs" together, so that you can
appropriately optimize this, either by referring from one to the next
reloc in such a chain, or by simply assuming that such sequences are
always done in a certain order (i.e. a simple pattern match; unrecognized
patterns would remain unrelaxed/unoptimized).
The basic form of relocations doesn't depend on that, though. You still
need to distinguish between the lowest N bits of the requested value, the
next N bits, the next N bits, and so on, so you do need roundup(32/N) reloc
types either way.
By restricting certain insn sequences and flexibility you can get away
with fewer relocations than this. E.g. with your idea of covering
multiple insns with one reloc. Say, if you require that the low 10 bits
of a value are always set in this way (and given your ISA that makes
sense):
shiftset5 %r1, bit04(sym)
shiftset5 %r1, bit59(sym)
and never with another insn in between, and never in a different order,
then of course you can get away with a relocation (say) RELOC_SHIFTSET10,
that takes the low 10 bits of 'sym' and appropriately distributes those 10
bits into the right 5 bit fields of the instructions. It would implicitly
cover both instructions, i.e. a 32bit place in the code section.
If you extend this idea to cover seven instructions of the base ISA you
can get away with a single reloc that is able to set the whole 32bit of a
value (at the expense of not being able to place unrelated instructions
between those seven).
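A single relocation that covers several consecutive instructions could be modelled roughly as follows. This is a sketch under stated assumptions, not BFD code: instructions are taken to be 16-bit little-endian words with the 5-bit immediate at bits 0..4, and the first instruction is given the most significant chunk (which is what a shift-left-and-or chain would need). The name `apply_shiftset` is hypothetical.

```python
import struct

IMM_BITS = 5   # base-ISA immediate width, from the thread
IMM_LSB = 0    # assumed position of the immediate inside the insn
INSN_BYTES = 2


def apply_shiftset(code, offset, value, n_insns):
    """Distribute 'value' over the immediates of n_insns consecutive insns
    in the bytearray 'code', starting at byte 'offset'."""
    total_bits = IMM_BITS * n_insns
    if value < 0 or value >> total_bits:
        raise OverflowError("value does not fit in %d bits" % total_bits)
    mask = (1 << IMM_BITS) - 1
    for i in range(n_insns):
        shift = IMM_BITS * (n_insns - 1 - i)  # first insn gets the top chunk
        chunk = (value >> shift) & mask
        pos = offset + INSN_BYTES * i
        (insn,) = struct.unpack_from("<H", code, pos)
        insn = (insn & ~(mask << IMM_LSB) & 0xFFFF) | (chunk << IMM_LSB)
        struct.pack_into("<H", code, pos, insn)
```

With `n_insns=2` this plays the role of a RELOC_SHIFTSET10-style relocation over a 32bit place in the section; with `n_insns=7` the same routine would cover the full 32bit load, at the cost of requiring the seven instructions to be adjacent.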
> It seems like other assembler targets truncate the values in those
> cases? If we went for the minimal representation we would basically
> limit external symbols to 5bit, which isn't exactly ideal. And from what
> I can tell, growing a relocation also isn't really something bfd is
> designed to deal with, right?
I'm not super fluent in the actual implementation of bfd linker
relaxation. But I don't see why it can't also grow sections. It's true
that the usual relaxation shrinks sizes, and it's probably better to
follow that as well, but in principle enlarging is no problem either (if
you enlarge _and_ shrink in your relaxation you can run into
endless oscillation between the two, so that needs to be watched for).
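One way to avoid that oscillation is to make the relaxation pass shrink-only, so sizes can never go back up and the loop must terminate. The Python toy below sketches this on an abstract section made of labels and load-immediate sites. It assumes 2-byte instructions, 5-bit immediates, unsigned addresses, and a worst-case start of 7 instructions per load. None of the names correspond to actual BFD interfaces.

```python
INSN_BYTES = 2
IMM_BITS = 5
MAX_INSNS = 7  # ceil(32 / 5): the pessimistic size emitted by the assembler


def insns_needed(value):
    """Number of IMM_BITS-wide chunks needed to hold an unsigned value."""
    n = 1
    while value >> (IMM_BITS * n):
        n += 1
    return n


def relax(items, base=0):
    """items is a list of ("label", name) or ("load", target_label).

    Returns (sizes, labels): per-item sizes in insns (0 for labels) and the
    final label addresses. Loads start at MAX_INSNS and are only ever shrunk.
    """
    sizes = [MAX_INSNS if kind == "load" else 0 for kind, _ in items]
    while True:
        # Lay out the section using the current sizes.
        addr, labels = base, {}
        for (kind, name), size in zip(items, sizes):
            if kind == "label":
                labels[name] = addr
            addr += size * INSN_BYTES
        # Shrink-only step: never grow a site again once it has shrunk.
        changed = False
        for i, (kind, name) in enumerate(items):
            if kind == "load":
                need = insns_needed(labels[name])
                if need < sizes[i]:
                    sizes[i] = need
                    changed = True
        if not changed:
            return sizes, labels
```

Because shrinking a site can only move labels to lower addresses in this model, a size that was sufficient in one round stays sufficient in the next, so the loop converges without ever needing to enlarge anything.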
But one thing about terminology: relocations themselves don't grow or
shrink. A relocation in principle applies to a certain address without
range. The semantics of a specific relocation type will usually say that
these-and-those bits in a field will be changed by it, and you can say
that that's the size of a relocation. But not all relocations are like
that, and nothing really prevents you from either changing the relocation
type when you want something else (in linker relaxation), or even defining
a funny type that applies to either (say) a byte or a word, as needed.
You need to implement special functions for such relocs then, and can't
use the generic simple BFD reloc howto model, but still.
Just to expand on this: in principle one could invent a relocation type
that says "when the symbol has value '1' change the byte 45 bytes
from here to 42, when it has another value then encode that one into the
word 7 bytes from here". That's obviously a crazy semantics for a
relocation, but nothing inherently prevents you from that. (Of course,
making sure that there actually _is_ something 45 bytes from the relocs
place is a problem :) ) The "size" of such relocation wouldn't be
well-defined anymore (or be 46), but what I'm saying is, that this is
okayish.
What does grow or shrink is the section content, and hence distance
between labels might change during relaxation, which requires delaying
resolving jumps until relaxation time as well. This can get quite slow at
link time (riscv is plagued by this). Just to make you aware :)
One remark: you _really_ should think long and hard about your immediate
size in the base ISA. 5 bits is terribly small. Maybe you can snatch
away some bits here and there in your 16bit insns to make this 8 bits
(something that divides 32 would be ideal), but even 6 would bring the
full-32-bit sequence from 7 to 6 instructions.
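The instruction counts quoted here follow from a ceiling division, which the short Python sketch below spells out. The function name is made up, and the model assumes that every instruction in the chain contributes exactly one immediate of the given width.

```python
def insns_for_full_load(value_bits, imm_bits):
    """Ceiling of value_bits / imm_bits: insns needed if each one
    contributes exactly imm_bits of the final value."""
    return -(-value_bits // imm_bits)


# Immediate width -> insns for a full 32bit load, for the widths discussed.
TABLE = {n: insns_for_full_load(32, n) for n in (5, 6, 8)}
```

`TABLE` comes out as `{5: 7, 6: 6, 8: 4}`, matching the 7-to-6 step mentioned above and showing why a width that divides 32 wastes no bits.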
Ciao,
Michael.
* Re: Problems with relocations for a custom ISA
2023-08-08 15:35 ` Michael Matz
@ 2023-08-08 17:26 ` MegaIng
0 siblings, 0 replies; 6+ messages in thread
From: MegaIng @ 2023-08-08 17:26 UTC (permalink / raw)
To: Michael Matz; +Cc: binutils
On 2023-08-08 at 17:35, Michael Matz wrote:
> Hello,
>
> On Tue, 8 Aug 2023, MegaIng wrote:
>
>>>> Most of the basics I already managed to implement, i.e. I can generate
>>>> simple
>>>> workable ELF files. However, I am running into problems with relocations
>>>> for
>>>> "load immediate" instructions. Without extensions, we want to potentially
>>>> emit
>>>> long chains of instruction (3 to 8 instructions is realistic), but with
>>>> proper
>>>> extensions in can get down to only 1 instruction of 3 or 4 bytes. I am
>>>> unsure
>>>> how to best represent such variable length relocations in BFD and ELF.
>>> The normal way would be to not do that. It seems the assembler will
>>> already see either a long chain of small insns, or a single large insn,
>>> right?
>> Our idea was that the user can use a simple pseudo instruction to
>> represent the entire process of loading a symbol (or any immediate for
>> that matter).
> Pseudo instruction makes sense. But then it would still be the assembler
> that expands it to either a couple base insns or a single extended insn.
> The linker would see only one or the other, and hence also only the base
> or the extended relocs.
>
> Or did you really want to reserve some specific byte encoding for this
> pseudo instruction to transfer it from assembler via object file to linker
> and let only the linker replace that by one or the other variant? That
> seems an unnecessarily complicated scheme. It depends on if the assembler
> does or doesn't know if it can target the extended insns, or only the base
> ones. I would definitely suggest that the assembler at latest should know
> this.
It wasn't our idea to have a specific bit pattern reserved for that, that
would be quite weird, I agree :-) I think the linker needs knowledge about
which extensions are available, for that we would use an attributes section
similar to what RISC-V seems to use. (although, maybe we don't need it if we
have many relocation types)
>>> (obviously details will differ, your 16bit insns won't be able to quite
>>> set all 16 bits :) ).
>>> If you really want to optimize these sequences also at link time (but
>>> why?) then all of this becomes more complicated, but remains essentially
>>> the same. The secret will then be in linking from one of the small relocs
>>> (say, the high16 one) to the other, for the linker to easily recognize the
>>> whole insn pair and appropriately do something about those byte sequences.
>>> In that scheme you need to differ between relocations applied to relaxable
>>> code and relocation applied to random non-relaxable data. E.g. you
>>> probably need two variants of the RELOC_LOW16 relocation.
>> Not sure if you took a look at our instruction set: The way you would load an
>> arbitrary 16bit word is via a sequence of `slo` (shift left 5 and or)
>> instructions which use a 5bit immediate (the largest we have in base). So
>> breaking it up into two RELOC_LOW_16 or similar wouldn't quite work.
> Sure, as I said above: "obviously details will differ".
>
>> It would have to be 3-4 RELOC_BITS_0_4, RELOC_BITS_5_9 RELOC_BITS_10_15
>> or something like that. And you couldn't exactly remove one of those
>> without changing the others.
> Yes, this is the usual way to express that. There are many architectures
> which have similar ISA restrictions and they all do it essentially the
> same way: "select X bits from value, put them into Y bits of field", for
> potentially many combinations of (not necessarily consecutive) X and Y.
>
>> But ofcourse, we don't always need all 4
>> instructions, sometimes we can get away with only two or three, for
>> example if it's only an 8bit value, we only need 2 instructions. We
>> would like to optimize these cases somewhere.
> I see. Yeah, that will ultimately need some linker relaxation as only
> that one will know for sure which values symbols have, and hence if they
> do or do not fit certain constraints.
>
>> After a bit more
>> discussion we came to the idea of having many relocations that
>> potentially cover multiple instructions so that the entire
>> load-immediate sequence can be covered by one relocation,
> As you have only such a short immediate field in the base ISA this seems
> like a sensible idea, as otherwise, as you say, you need 7 relocations
> (and insns) for a full 32bit load.
>
>> but this is quite a large amount of relocations.
> Hmm? I don't understand this remark. If you cover a range of
> instructions by one relocation you necessarily need fewer relocs than if
> you use one reloc per insn?
I was considering a large number of relocation types as a drawback, but
I now realize that this can't be avoided no matter which path we choose.
We are now going to have the large multi-instruction relocations that can
be relaxed one instruction at a time instead of the bit-selection
relocations.
>>> I wouldn't go that way if I were you: it seems the assembler/compiler
>>> needs to know if targeting the extended ISA or not anyway, so generating
>>> the right instructions and relocations from the start in the assembler
>>> seems the right choice, and then doesn't need any relax complications at
>>> link time.
>> As long as the range (or even the exact value) of the symbol is known at
>> assembly time, this is ofcourse true, but what about situations where nothing
>> about the range of the value is known?
> The compiler/assembler would always emit the full sequence (e.g. assumes
> that the symbol in question happens to be full 32bit). If you want to
> optimize this use in case the symbol happens to need fewer bits, then yes,
> you do need linker relaxation. As said, you then need a way in the linker
> to recognize an insn sequence that "belongs" together, so that you can
> appropriately optimize this, either by referring from one to the next
> reloc in such a chain, or by simply assuming that such sequences are
> always done in a certain order (i.e. a simple pattern match; unrecognized
> patterns would remain unrelaxed/unoptimized).
>
> The basic form of relocations doesn't depend on that, though. You still
> need to differ between the lowest N bits of the requested value, the next
> N bits, the next N bits, and so on, so you do need roundup(32/N) reloc
> types either way.
>
> By restricting certain insn sequences and flexibility you can get away
> with fewer relocations than this. E.g. with your idea of covering
> multiple insns with one reloc. Say, if you require that the low 10 bits
> of a value are always set in this way (and given your ISA that makes
> sense):
>
> shiftset5 %r1, bit04(sym)
> shiftset5 %r1, bit59(sym)
>
> and never with another insn in between, and never in a difference order,
> then of course you can get away with a relocation (say) RELOC_SHIFTSET10,
> that takes the low 10 bits of 'sym' and appropriate distributes those 10
> bits into the right 5 bit field of the instruction. It would implicitely
> cover both instructions, i.e. a 32bit place in the code section.
>
> If you extend this idea to cover seven instructions of the base ISA you
> can get away with a single reloc that is able to set the whole 32bit of a
> value (at the expense of not being able to place unrelated instructions
> between those seven).
My primary interest is to support the load-immediate pseudo opcode, so
I am not going to worry about stuff users could manually write. I don't
think there could ever be a benefit to putting instructions in the middle
of that, so I am not gonna worry about that.
Although, we might have to split into multiple relocations since bfd sets
an upper limit on the number of bytes a relocation can cover by using a
4-wide bitfield for that.
>> It seems like other assembler targets truncate the values in those
>> cases? If we went for the minimal representation we would basically
>> limit external symbols to 5bit, which isn't exactly ideal. And from what
>> I can tell, growing a relocation also isn't really something bfd is
>> designed to deal with, right?
> I'm not super fluent in the actual implementation of bfd linker
> relaxation. But I don't see why it can't also grow sections. It's true
> that the usual relaxation shrinks sizes, and it's probably better to
> follow that as well, but in principle enlarging is no proble either (if
> you enlarge _and_ shrink in your relaxation you can run into
> endless oscillation between the two, so that needs to be watched for).
>
> But one thing about terminology: relocations themself don't grow or
> shrink. A relocation in principle applies to a certain address without
> range. The semantics of a specific relocation type will usually say that
> these-and-those bits in a field will be changed by it, and you can say
> that that's the size of a relocation. But not all relocations are like
> that, and nothing really prevents you from either changing the relocation
> type when you want something else (in linker relaxation), or even defining
> a funny type that applies to either (say) a byte or a word, as needed.
> You need to implement special functions for such relocs then, and can't
> use the generic simple BFD reloc howto model, but still.
>
> Just to expand on this: in principle one could invent a relocation type
> that says "when the symbol has value '1' change the byte 45 bytes
> from here to 42, when it has another value then encode that one into the
> word 7 bytes from here". That's obviously a crazy semantics for a
> relocation, but nothing inherently prevents you from that. (Of course,
> making sure that there actually _is_ something 45 bytes from the relocs
> place is a problem :) ) The "size" of such relocation wouldn't be
> well-defined anymore (or be 46), but what I'm saying is, that this is
> okayish.
>
> What does grow or shrink is the section content, and hence distance
> between labels might change during relaxation, which requires delaying
> resolving jumps until relaxation time as well. This can get quite slow at
> link time (riscv is plagued by this). Just to make you aware :)
Yeah, thank you, my word choice was a bit confused. The speed penalty is
something we are probably not gonna worry about for the moment, but we
will keep it in mind.
> One remark: you _really_ should think long and hard about your immediate
> size in the base ISA. 5 bits is terribly small. Maybe you can snatch
> away some bits here and there in your 16bit insns to make this 8 bits
> (something that divides 32 would be ideal), but even 6 would bring the
> full-32-bit sequence from 7 to 6 instructions.
This is something we had discussed a few times and came to the conclusion
that we prefer the current encoding. We wanted 16bit opcodes and
byte-aligned sections and from there the choices do get quite limited. We
also wanted a simple encoding, so we didn't want to have too many complex
tricks.
>
> Ciao,
> Michael.
Thank you for taking your time :-)