Re: [RFC] Allow linker scripts to specify multiple output regions for an output section?

public inbox for binutils@sourceware.org

From: Tejas Belagod <tejas.belagod@foss.arm.com>
To: dvalin@internode.on.net
Cc: Thomas Preudhomme <thomas.preudhomme@foss.arm.com>,
	binutils@sourceware.org
Subject: Re: [RFC] Allow linker scripts to specify multiple output regions for an output section?
Date: Tue, 07 Mar 2017 11:06:00 -0000
Message-ID: <58BE9421.50008@foss.arm.com>
In-Reply-To: <20170303102728.GB3635@ratatosk>

On 03/03/17 10:27, Erik Christiansen wrote:
> On 02.03.17 15:40, Tejas Belagod wrote:
>> On 02/03/17 04:32, Erik Christiansen wrote:
>>> I am led to wonder if it might not be less work to merely tweak ld to
>>> look for subsequent matching wildcard patterns in following output
>>> sections before issuing a region overflow error. I.e. ld merely
>>> redefines "first match" if a subsequent one is available when needed.
>>> That seems less intervention than adding new syntax to the script
>>> interpreter, and then grafting on the new capability.
>>>
>>> The overflowing input section needs to remain in the input queue during
>>> the output section bump, to complete its "go-around" on failed landing
>>> approach.
>>>
>>
>> It does seem like an interesting idea. Two things immediately spring to mind.
>>
>> 1. Will it break existing code?
>
> That's perhaps the most important question. At present any input section
> pattern repetitions in the linker script would only be nonfunctional
> baggage. They would only occur as harmless errors, disregarded by ld,
> through its "first match" policy. Adding a command-line option to enable
> flowing would however be a useful safeguard.
>
>> 2. How do we honor any ordering specified? E.g., if the above spec means
>> that raml will have .data first and .data.* later, and .ramu is expected to
>> start with .data sections, will this break the assumption if a .data.* jumps
>> into .ramu and starts the region with it?
>
> Re-using the existing code, an input section would not just fall over
> the edge to "start the region". Whether an input section is read from
> ld's input, or redirected from an overflowing output section makes no
> difference while the input section remains in the input queue,
> unallocated. On failing to land in the full output section, it needs to
> be redirected to a "second match" in a subsequent output section if
> provided, else the pending (existing code) overflow error comes to
> fruition. The existing allocation code (being unmodified) then continues
> to distribute the input section according to existing pattern matching
> behaviour, but using the "second match".
>
> The ordering of input sections into output sections is set out in ld
> info. The difference between "*(.text .rdata)" and "*(.text) *(.rdata)"
> is described in "3.6.4.1 Input Section Basics".
>
> Thus, if the user wants .ramu and .raml to have identical .data vs
> .data.* order, then it'll be copy/paste. But if a difference is desired,
> then copy/edit/paste is equally available. It was when one output
> section had to "> RAML,RAMU, RAMZ", that region-specific control over
> ordering was lost.
>
> It is not suggested to change any code other than to interpose rebasing
> of pattern allocation before erroring on output section overflow. If at
> that point, we look for "second match" wildcard patterns in subsequent
> output sections, then as each input section is read from ld's input, it
> will be allocated to the next output section with matching patterns -
> using the existing allocation code, influenced only to the extent of
> replacing the "first match" patterns from the full output section with
> subsequent substitutes.
>

Ah, yes! That makes a lot of sense. Thanks for clearing that up.
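
As a rough illustration of the fallback described above, here is a toy model
in Python. It is only a sketch under stated assumptions, not ld code: the
names allocate, OutputSection and RegionOverflow, the capacities, and the
.lowonly pattern are all made up for illustration, and it assumes that an
output section which has overflowed once is skipped from then on (one
possible reading of "rebasing" the first match).

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class OutputSection(NamedTuple):
    name: str        # e.g. ".raml" (hypothetical)
    patterns: tuple  # input section wildcard patterns, in script order
    capacity: int    # bytes available in the target region (made-up numbers)


class RegionOverflow(Exception):
    """Stands in for ld's region overflow error."""


def allocate(inputs, outputs, flow=False):
    """Place (name, size) input sections into output sections.

    flow=False models today's "first match" policy; flow=True lets an
    input section fall through to the next output section that also
    lists a matching pattern.
    """
    used = {out.name: 0 for out in outputs}
    placed = {out.name: [] for out in outputs}
    full = set()  # output sections that have already overflowed
    for name, size in inputs:
        matches = [out for out in outputs
                   if any(fnmatchcase(name, pat) for pat in out.patterns)]
        if not matches:
            continue  # orphan sections are out of scope for this sketch
        if not flow:
            matches = matches[:1]  # current behaviour: first match only
        for out in matches:
            if out.name in full:
                continue
            if used[out.name] + size <= out.capacity:
                used[out.name] += size
                placed[out.name].append(name)
                break
            full.add(out.name)  # assumption: no back-filling after overflow
        else:
            # No (further) matching output section had room.
            raise RegionOverflow(name)
    return placed


outputs = [
    OutputSection(".raml", (".lowonly", ".data", ".data.*"), 8),
    OutputSection(".ramu", (".data", ".data.*"), 16),
]
inputs = [(".lowonly", 2), (".data", 4), (".data.a", 6), (".data.b", 2)]
print(allocate(inputs, outputs, flow=True))
```

With flow=True, .data.a and .data.b land in .ramu once .raml is full. With
flow=False the same input raises RegionOverflow at .data.a. The .lowonly
section can never leave .raml, because no later output section lists a
pattern for it.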

>>> One significant advantage of this approach is that part of the
>>> established practice, i.e. constraining certain input sections to low,
>>> middle, or high RAM regions, remains both straightforward and explicit.
>>> If multiple output sections are directed to a region, even finer
>>> constraint is possible _simultaneous_ with inter-region flowing on
>>> overflow. On the other hand, what would happen if multiple "> RAML,
>>> RAMU, RAMZ" were aimed at these regions in an attempt to enforce a
>>> paging or proximity constraint while flowing?
>>>
>>
>> I'm not sure I understand this question.
>
> My word picture was a bit fuzzy, I must admit. The minimalist tweak
> without syntax extension is capable of constraining some input sections
> at the same time as flowing others. Input sections which need to be in
> low memory are made to match a wildcard pattern (or explicit file list)
> which is placed only in the first output section. Only input sections
> which match patterns in subsequent output section can flow. The
> mechanism thus sorts sheep from goats, while flowing. That is very
> useful, and should be present in any implementation of flowing, I think.
>
> There would undoubtedly be some real effort involved in tweaking ld to
> rebase input pattern "first match" on output section overflow - where a
> subsequent match is available. Whether that would best be done as a
> "second match" search when needed, or replacing "first match" with a
> list of matches at the outset, remains to be seen. The difference
> between theory and practice always looks smaller from this side.
>

I like the approach you've proposed, and I admit it is more practical than
extending the syntax for more regions. However, I see two disadvantages, both
more cosmetic than anything else:

1. Is duplicating patterns across multiple output sections as expressive of
the intent as using '> REGION1, REGION2, ..., REGIONX'? You could argue,
though, that if the subsequent-match flowing feature is controlled by a
command-line switch, the user knows what they're doing and the intent is
implicit.

2. If we have complex patterns matching input sections/filenames, duplicating
them over multiple output section statements is prone to copy-paste errors.
Keeping them consistent after changes means diligently replicating each change
everywhere, which adds to the maintenance overhead.

I agree that replacing the first-match rule with a subsequent-match rule
controlled by a command-line switch has a much lower implementation cost. It
will be interesting to hear a maintainer's view on the preferred approach.

Thanks,
Tejas.


> Erik
>
