From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 03 Mar 2017 10:27:00 -0000
From: Erik Christiansen <dvalin@internode.on.net>
To: Tejas Belagod <tejas.belagod@foss.arm.com>
Cc: Thomas Preudhomme <thomas.preudhomme@foss.arm.com>,	binutils@sourceware.org
Subject: Re: [RFC] Allow linker scripts to specify multiple output regions for an output section?
Message-ID: <20170303102728.GB3635@ratatosk>
Reply-To: dvalin@internode.on.net
References: <20170302043231.GA3942@ratatosk> <58B83CDA.5050000@foss.arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <58B83CDA.5050000@foss.arm.com>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-IsSubscribed: yes
X-SW-Source: 2017-03/txt/msg00039.txt.bz2

On 02.03.17 15:40, Tejas Belagod wrote:
> On 02/03/17 04:32, Erik Christiansen wrote:
> > I am led to wonder if it might not be less work to merely tweak ld to
> > look for subsequent matching wildcard patterns in following output
> > sections before issuing a region overflow error. I.e. ld merely
> > redefines "first match" if a subsequent one is available when needed.
> > That seems less intervention than adding new syntax to the script
> > interpreter, and then grafting on the new capability.
> > 
> > The overflowing input section needs to remain in the input queue during
> > the output section bump, to complete its "go-around" on failed landing
> > approach.
> > 
> 
> It does seem like an interesting idea. Two things immediately spring to mind.
> 
> 1. Will it break existing code?

That's perhaps the most important question. At present, any repeated
input section pattern in a linker script is nonfunctional baggage. Under
ld's "first match" policy a repeat never receives any input sections, so
it can only exist as a harmless error which ld disregards. Scripts
without such repeats would behave exactly as before. Adding a
command-line option to enable flowing would nonetheless be a useful
safeguard.
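
To make the "nonfunctional baggage" point concrete, here is a toy Python
model of the "first match" policy. It is only a sketch, not ld code: the
SCRIPT layout, the .raml/.ramu names and the first_match() helper are
assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical, simplified script model: (output section, input patterns).
SCRIPT = [
    (".raml", [".data", ".data.*"]),
    (".ramu", [".data", ".data.*"]),  # repeated patterns: never reached today
]

def first_match(input_section, script):
    """Return the first output section with a pattern matching input_section."""
    for output_section, patterns in script:
        if any(fnmatchcase(input_section, p) for p in patterns):
            return output_section
    return None  # unmatched (orphan) sections are out of scope here

for name in (".data", ".data.foo", ".data.bar"):
    print(name, "->", first_match(name, SCRIPT))
```

In this model every matching input section lands in .raml, so the
repeats under .ramu have no effect unless flowing is enabled.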

> 2. How do we honor any ordering specified? For eg. If the above spec means
> that raml will have .data first and .data.* later .ramu is expected to start
> with .data sections, will this break the assumption if a .data.* jumps into
> .ramu and starts the region with it?

Re-using the existing code, an input section would not just fall over
the edge to "start the region". Whether an input section is read from
ld's input or redirected from an overflowing output section makes no
difference while it remains in the input queue, unallocated. On failing
to land in the full output section, it needs to be redirected to a
"second match" in a subsequent output section, if one is provided;
otherwise the pending overflow error (existing code) is raised as
before. The existing allocation code, being unmodified, then continues
to distribute input sections according to the existing pattern matching
behaviour, but using the "second match".

The ordering of input sections into output sections is set out in ld
info. The difference between "*(.text .rdata)" and "*(.text) *(.rdata)"
is described in "3.6.4.1 Input Section Basics".
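
For anyone without the manual to hand, the documented difference can be
mimicked with a small Python sketch. This is a toy model, not ld's
implementation: the INPUTS list and the place() helper are assumptions,
and each inner list stands for the patterns inside one "*( ... )".

```python
from fnmatch import fnmatchcase

# Hypothetical linker input: (file, section) pairs in command-line order.
INPUTS = [("a.o", ".text"), ("a.o", ".rdata"),
          ("b.o", ".text"), ("b.o", ".rdata")]

def place(statements, inputs):
    """Order inputs the way a sequence of "*( ... )" statements would."""
    placed = []
    for patterns in statements:
        # Within one statement, matches keep their linker input order.
        for item in inputs:
            if item not in placed and any(fnmatchcase(item[1], p) for p in patterns):
                placed.append(item)
    return placed

intermingled = place([[".text", ".rdata"]], INPUTS)  # *(.text .rdata)
grouped = place([[".text"], [".rdata"]], INPUTS)     # *(.text) *(.rdata)
print(intermingled)
print(grouped)
```

The first form keeps .text and .rdata intermingled in input order; the
second puts all .text first, followed by all .rdata.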

Thus, if the user wants .ramu and .raml to have identical .data vs
.data.* order, then it's copy/paste. But if a difference is desired,
then copy/edit/paste is equally available. It is only when a single
output section has to be directed "> RAML, RAMU, RAMZ" that
region-specific control over ordering is lost.

The only code change suggested is to interpose a rebasing of pattern
allocation before erroring on output section overflow. If, at that
point, we look for "second match" wildcard patterns in subsequent
output sections, then each input section subsequently read from ld's
input will be allocated to the next output section with matching
patterns. That uses the existing allocation code, influenced only to the
extent that the "first match" patterns from the full output section are
replaced with their subsequent substitutes.
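
A rough Python sketch of that rebasing may help. It is a toy model under
stated assumptions, not a patch against ld: sizes are plain byte counts,
each output section has a fixed capacity standing in for its region, and
allocate() and its flow flag are hypothetical names (the flag plays the
part of the command-line safeguard).

```python
from fnmatch import fnmatchcase

def allocate(inputs, script, flow=False):
    """inputs: [(name, size)]; script: [(output, capacity, patterns)].

    Returns {output: [input names]}. OverflowError stands in for ld's
    region overflow error.
    """
    used = {out: 0 for out, _, _ in script}
    layout = {out: [] for out, _, _ in script}
    full = set()  # output sections whose patterns have been rebased away
    for name, size in inputs:
        matching = [(out, cap) for out, cap, pats in script
                    if any(fnmatchcase(name, p) for p in pats)]
        for out, cap in matching:
            if out in full:
                continue  # "first match" now means the next match
            if used[out] + size <= cap:
                used[out] += size
                layout[out].append(name)
                break
            if not flow:
                raise OverflowError(f"{out} overflows on {name}")
            full.add(out)  # go-around: retry the same input further on
        else:
            if matching:
                raise OverflowError(f"{name}: every matching output section is full")
    return layout

SCRIPT = [(".raml", 8, [".data", ".data.*"]),
          (".ramu", 16, [".data", ".data.*"])]
INPUTS = [(".data", 4), (".data.a", 6), (".data.b", 2), (".data.c", 4)]
print(allocate(INPUTS, SCRIPT, flow=True))
```

Note that once .raml has overflowed, later input sections go straight to
.ramu rather than back-filling .raml, which is one reading of "rebasing";
whether ld should back-fill is left open here.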

> > One significant advantage of this approach is that part of the
> > established practice, i.e. constraining certain input sections to low,
> > middle, or high RAM regions, remains both straightforward and explicit.
> > If multiple output sections are directed to a region, even finer
> > constraint is possible _simultaneous_ with inter-region flowing on
> > overflow. On the other hand, what would happen if multiple "> RAML,
> > RAMU, RAMZ" were aimed at these regions in an attempt to enforce a
> > paging or proximity constraint while flowing?
> > 
> 
> I'm not sure I understand this question.

My word picture was a bit fuzzy, I must admit. The minimalist tweak,
without syntax extension, is capable of constraining some input sections
at the same time as flowing others. Input sections which need to be in
low memory are made to match a wildcard pattern (or explicit file list)
which is placed only in the first output section. Only input sections
which match patterns in subsequent output sections can flow. The
mechanism thus sorts sheep from goats, while flowing. That is very
useful, and should be present in any implementation of flowing, I think.
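
The sheep/goats sorting can be pictured with another small Python
sketch. Again this is only an illustrative model: the .vectors section
name, the SCRIPT layout and the landing_sites()/can_flow() helpers are
all assumptions, not anything ld provides.

```python
from fnmatch import fnmatchcase

# Hypothetical script: .vectors is listed only under .raml, so it is
# pinned to low memory; the .data patterns are repeated, so they may flow.
SCRIPT = [
    (".raml", [".vectors", ".data", ".data.*"]),
    (".ramu", [".data", ".data.*"]),
]

def landing_sites(input_section, script):
    """All output sections, in script order, with a matching pattern."""
    return [out for out, pats in script
            if any(fnmatchcase(input_section, p) for p in pats)]

def can_flow(input_section, script):
    """An input section can flow only if a later match exists."""
    return len(landing_sites(input_section, script)) > 1

for name in (".vectors", ".data", ".data.x"):
    print(name, landing_sites(name, SCRIPT), can_flow(name, SCRIPT))
```

In this model an overflow caused by .vectors would still be an error,
since it has nowhere else to land, while the .data sections could move
on to .ramu.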

There would undoubtedly be some real effort involved in tweaking ld to
rebase input pattern "first match" on output section overflow - where a
subsequent match is available. Whether that would best be done as a
"second match" search when needed, or replacing "first match" with a
list of matches at the outset, remains to be seen. The difference
between theory and practice always looks smaller from this side.

Erik