public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/2] ld: On defsym, PROVIDE, and MEMORY regions
@ 2020-10-12 13:49 Andrew Burgess
  2020-10-12 13:49 ` [PATCH 1/2] ld: More documentation for --defsym Andrew Burgess
  2020-10-12 13:49 ` [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions Andrew Burgess
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Burgess @ 2020-10-12 13:49 UTC (permalink / raw)
  To: binutils

These two patches came from a simple idea, using --defsym to
optionally provide the origin and length for a MEMORY region, and
having a default origin and length written into the linker script
using PROVIDE, so:

  PROVIDE (the_origin = 0x....)
  PROVIDE (the_length = 0x....)

  MEMORY
  {
    REGION : ORIGIN = the_origin, LENGTH = the_length;
  }

and linking with one of:

  ld --defsym=the_origin=0x.... --defsym=the_length=0x.... -Tscript.ld ......
  ld -Tscript.ld ......

I was initially placing `--defsym` after `-Tscript.ld`, and it turns
out the ordering of these two options is important.  The documentation
doesn't really (I feel) make this clear, this is addressed in patch #1.

Then it turns out that you can't use PROVIDE with MEMORY regions like
I wanted, which is addressed in patch #2.

All feedback welcome.

Thanks,
Andrew


---

Andrew Burgess (2):
  ld: More documentation for --defsym
  ld: Allow symbols from PROVIDE to be use in MEMORY regions

 ld/ChangeLog                           | 20 +++++++++
 ld/ld.texi                             |  9 ++++
 ld/ldlang.c                            | 59 +++++++++++++++++---------
 ld/testsuite/ld-scripts/provide-10.d   |  3 ++
 ld/testsuite/ld-scripts/provide-10.map |  6 +++
 ld/testsuite/ld-scripts/provide-11.d   |  3 ++
 ld/testsuite/ld-scripts/provide-11.map |  6 +++
 ld/testsuite/ld-scripts/provide-12.d   |  3 ++
 ld/testsuite/ld-scripts/provide-12.map |  6 +++
 ld/testsuite/ld-scripts/provide-9.d    |  3 ++
 ld/testsuite/ld-scripts/provide-9.map  |  6 +++
 ld/testsuite/ld-scripts/provide-9.t    | 25 +++++++++++
 12 files changed, 129 insertions(+), 20 deletions(-)
 create mode 100644 ld/testsuite/ld-scripts/provide-10.d
 create mode 100644 ld/testsuite/ld-scripts/provide-10.map
 create mode 100644 ld/testsuite/ld-scripts/provide-11.d
 create mode 100644 ld/testsuite/ld-scripts/provide-11.map
 create mode 100644 ld/testsuite/ld-scripts/provide-12.d
 create mode 100644 ld/testsuite/ld-scripts/provide-12.map
 create mode 100644 ld/testsuite/ld-scripts/provide-9.d
 create mode 100644 ld/testsuite/ld-scripts/provide-9.map
 create mode 100644 ld/testsuite/ld-scripts/provide-9.t

-- 
2.25.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] ld: More documentation for --defsym
  2020-10-12 13:49 [PATCH 0/2] ld: On defsym, PROVIDE, and MEMORY regions Andrew Burgess
@ 2020-10-12 13:49 ` Andrew Burgess
  2020-10-14  4:54   ` Alan Modra
  2020-10-12 13:49 ` [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions Andrew Burgess
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Burgess @ 2020-10-12 13:49 UTC (permalink / raw)
  To: binutils

The ordering of command line options --defsym and -T is important,
however, the description of --defsym in the manual doesn't mention
this.

This commit adds more text to the description of --defsym to try and
explain this ordering requirement.

ld/ChangeLog:

	* ld.texi (Options): Extend the description of --defsym.
---
 ld/ChangeLog | 4 ++++
 ld/ld.texi   | 9 +++++++++
 2 files changed, 13 insertions(+)

diff --git a/ld/ld.texi b/ld/ld.texi
index 66bede283e8..2f3a2056f40 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -1705,6 +1705,15 @@
 @emph{Note:} there should be no white space between @var{symbol}, the
 equals sign (``@key{=}''), and @var{expression}.
 
+The linker processes @samp{--defsym} arguments and @samp{-T} arguments
+in order, placing @samp{--defsym} before @samp{-T} will define the
+symbol before the linker script from @samp{-T} is processed, while
+placing @samp{--defsym} after @samp{-T} will define the symbol after
+the linker script has been processed.  This difference has
+consequences for expressions within the linker script that use the
+@samp{--defsym} symbols, which order is correct will depend on what
+you are trying to achieve.
+
 @cindex demangling, from command line
 @kindex --demangle[=@var{style}]
 @kindex --no-demangle
-- 
2.25.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions
  2020-10-12 13:49 [PATCH 0/2] ld: On defsym, PROVIDE, and MEMORY regions Andrew Burgess
  2020-10-12 13:49 ` [PATCH 1/2] ld: More documentation for --defsym Andrew Burgess
@ 2020-10-12 13:49 ` Andrew Burgess
  2020-10-14  4:59   ` Alan Modra
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Burgess @ 2020-10-12 13:49 UTC (permalink / raw)
  To: binutils

I wanted to write a linker script like this:

  PROVIDE(mem_origin = 0x1000);
  PROVIDE(mem_length = 0x1000);

  MEMORY
  {
    REGION : ORIGIN = mem_origin, LENGTH = mem_length
  }

  ....

Then when I link using this script I can optionally supply:

  --defsym=mem_origin=..... --defsym=mem_length=....

to override the defaults.

And though passing `--defsym' does work, if I remove the use of
`--defsym' and just rely on the defaults I get an error:

  ld-new: invalid origin for memory region REGION

Interestingly, if I make the above error non-fatal and dump a linker
map file I see that (a) REGION has origin 0x0, and length 0xffff...,
and (b) the symbol from the PROVIDE is provided.

An examination of ldlang.c:lang_process shows us what the issue is,
the origin and length of all memory regions are set as a result of a
single call to lang_do_memory_regions, this call is done after calling
open_input_bfds.

During the open_input_bfds call provide statements can be converted to
provided statements if we know that the assigned symbol is needed, but
for symbols that are only used in the memory regions we are unaware
that we need these symbols.

What I propose in this patch is to make two calls to
lang_do_memory_regions, in the first call we process the expressions
for the origin and length fields of each region, however, errors,
especially undefined symbols, will be ignored.  The origin and length
values are not updated.  However, by evaluating the expressions any
symbols we need will be added to the symbol table.

Now when we call open_input_bfds, when we process the provide
statements, we will see that the assigned symbol is needed add its new
value to the symbol table.

Finally we reach the original call to lang_do_memory_regions, in
this (now second) call we again process the expressions, and this time
update the origin and length values.  Any errors encountered now are
reported to the user.

ld/ChangeLog:

	* ldlang.c (lang_process): Add extra call to lang_do_memory_regions, and pass
	phase parameter.
	(lang_do_memory_regions): Add parameter, only define origin and
	length when we are not in the first phase.  Reindent.
	* testsuite/ld-scripts/provide-10.d: New file.
	* testsuite/ld-scripts/provide-10.map: New file.
	* testsuite/ld-scripts/provide-11.d: New file.
	* testsuite/ld-scripts/provide-11.map: New file.
	* testsuite/ld-scripts/provide-12.d: New file.
	* testsuite/ld-scripts/provide-12.map: New file.
	* testsuite/ld-scripts/provide-9.d: New file.
	* testsuite/ld-scripts/provide-9.map: New file.
	* testsuite/ld-scripts/provide-9.t: New file.
---
 ld/ChangeLog                           | 16 +++++++
 ld/ldlang.c                            | 59 +++++++++++++++++---------
 ld/testsuite/ld-scripts/provide-10.d   |  3 ++
 ld/testsuite/ld-scripts/provide-10.map |  6 +++
 ld/testsuite/ld-scripts/provide-11.d   |  3 ++
 ld/testsuite/ld-scripts/provide-11.map |  6 +++
 ld/testsuite/ld-scripts/provide-12.d   |  3 ++
 ld/testsuite/ld-scripts/provide-12.map |  6 +++
 ld/testsuite/ld-scripts/provide-9.d    |  3 ++
 ld/testsuite/ld-scripts/provide-9.map  |  6 +++
 ld/testsuite/ld-scripts/provide-9.t    | 25 +++++++++++
 11 files changed, 116 insertions(+), 20 deletions(-)
 create mode 100644 ld/testsuite/ld-scripts/provide-10.d
 create mode 100644 ld/testsuite/ld-scripts/provide-10.map
 create mode 100644 ld/testsuite/ld-scripts/provide-11.d
 create mode 100644 ld/testsuite/ld-scripts/provide-11.map
 create mode 100644 ld/testsuite/ld-scripts/provide-12.d
 create mode 100644 ld/testsuite/ld-scripts/provide-12.map
 create mode 100644 ld/testsuite/ld-scripts/provide-9.d
 create mode 100644 ld/testsuite/ld-scripts/provide-9.map
 create mode 100644 ld/testsuite/ld-scripts/provide-9.t

diff --git a/ld/ldlang.c b/ld/ldlang.c
index 4249b3a045d..fd3d38a0166 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -98,7 +98,7 @@ static void lang_record_phdrs (void);
 static void lang_do_version_exports_section (void);
 static void lang_finalize_version_expr_head
   (struct bfd_elf_version_expr_head *);
-static void lang_do_memory_regions (void);
+static void lang_do_memory_regions (lang_phase_type phase);
 
 /* Exported variables.  */
 const char *output_target;
@@ -7909,13 +7909,22 @@ lang_process (void)
   if (!bfd_section_already_linked_table_init ())
     einfo (_("%F%P: can not create hash table: %E\n"));
 
+  /* A first pass through the memory regions ensures that if any region
+     references a symbol for its origin or length then this symbol will be
+     added to the symbol table.  Having these symbols in the symbol table
+     means that when we call open_input_bfds PROVIDE statements will
+     trigger to provide any needed symbols.  The values origins and
+     lengths are not assigned as a result of this call.  */
+  lang_do_memory_regions (lang_first_phase_enum);
+
   /* Create a bfd for each input file.  */
   current_target = default_target;
   lang_statement_iteration++;
   open_input_bfds (statement_list.head, OPEN_BFD_NORMAL);
-  /* open_input_bfds also handles assignments, so we can give values
-     to symbolic origin/length now.  */
-  lang_do_memory_regions ();
+
+  /* Now that open_input_bfds has processed assignments and provide
+     statements we can give values to symbolic origin/length now.  */
+  lang_do_memory_regions (lang_final_phase_enum);
 
 #if BFD_SUPPORTS_PLUGINS
   if (link_info.lto_plugin_active)
@@ -9373,10 +9382,14 @@ lang_do_version_exports_section (void)
 			   lang_new_vers_node (greg, lreg), NULL);
 }
 
-/* Evaluate LENGTH and ORIGIN parts of MEMORY spec */
+/* Evaluate LENGTH and ORIGIN parts of MEMORY spec.  When we are in the
+   initial phase no errors are thrown, however, references to symbols in
+   the origin and length fields will be pushed into the symbol table, this
+   allows PROVIDE statements to then provide these symbols when we are
+   called again in a later phase.  */
 
 static void
-lang_do_memory_regions (void)
+lang_do_memory_regions (lang_phase_type phase)
 {
   lang_memory_region_type *r = lang_memory_region_list;
 
@@ -9385,24 +9398,30 @@ lang_do_memory_regions (void)
       if (r->origin_exp)
 	{
 	  exp_fold_tree_no_dot (r->origin_exp);
-	  if (expld.result.valid_p)
-	    {
-	      r->origin = expld.result.value;
-	      r->current = r->origin;
-	    }
-	  else
-	    einfo (_("%F%P: invalid origin for memory region %s\n"),
-		   r->name_list.name);
+          if (phase != lang_first_phase_enum)
+            {
+              if (expld.result.valid_p)
+                {
+                  r->origin = expld.result.value;
+                  r->current = r->origin;
+                }
+              else
+                einfo (_("%P: invalid origin for memory region %s\n"),
+                       r->name_list.name);
+            }
 	}
       if (r->length_exp)
 	{
 	  exp_fold_tree_no_dot (r->length_exp);
-	  if (expld.result.valid_p)
-	    r->length = expld.result.value;
-	  else
-	    einfo (_("%F%P: invalid length for memory region %s\n"),
-		   r->name_list.name);
-	}
+          if (phase != lang_first_phase_enum)
+            {
+              if (expld.result.valid_p)
+                r->length = expld.result.value;
+              else
+                einfo (_("%P: invalid length for memory region %s\n"),
+                       r->name_list.name);
+            }
+        }
     }
 }
 
diff --git a/ld/testsuite/ld-scripts/provide-10.d b/ld/testsuite/ld-scripts/provide-10.d
new file mode 100644
index 00000000000..7481a92bbbd
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-10.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_origin=0x300 --defsym=mem_length=0x400 -T provide-9.t
+#map: provide-10.map
diff --git a/ld/testsuite/ld-scripts/provide-10.map b/ld/testsuite/ld-scripts/provide-10.map
new file mode 100644
index 00000000000..022b962b2d8
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-10.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+300 +0x[0-9a-f]+400
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-11.d b/ld/testsuite/ld-scripts/provide-11.d
new file mode 100644
index 00000000000..79bcfa62162
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-11.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_length=0x400 -T provide-9.t
+#map: provide-11.map
diff --git a/ld/testsuite/ld-scripts/provide-11.map b/ld/testsuite/ld-scripts/provide-11.map
new file mode 100644
index 00000000000..71763123123
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-11.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+100 +0x[0-9a-f]+400
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-12.d b/ld/testsuite/ld-scripts/provide-12.d
new file mode 100644
index 00000000000..41d95900fd3
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-12.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_origin=0x300 -T provide-9.t
+#map: provide-12.map
diff --git a/ld/testsuite/ld-scripts/provide-12.map b/ld/testsuite/ld-scripts/provide-12.map
new file mode 100644
index 00000000000..e76654b10fd
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-12.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+300 +0x[0-9a-f]+200
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-9.d b/ld/testsuite/ld-scripts/provide-9.d
new file mode 100644
index 00000000000..94dc02949b7
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: -T provide-9.t
+#map: provide-9.map
diff --git a/ld/testsuite/ld-scripts/provide-9.map b/ld/testsuite/ld-scripts/provide-9.map
new file mode 100644
index 00000000000..e35e3e25157
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+100 +0x[0-9a-f]+200
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-9.t b/ld/testsuite/ld-scripts/provide-9.t
new file mode 100644
index 00000000000..00d906aa80a
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.t
@@ -0,0 +1,25 @@
+PROVIDE (mem_origin = 0x100);
+PROVIDE (mem_length = 0x200);
+
+MEMORY
+{
+  FOO : ORIGIN = mem_origin, LENGTH = mem_length
+}
+
+SECTIONS
+{
+  .data : {
+    *(.data .data.*)
+  } >FOO
+
+  .text : {
+    *(.text .text.*)
+  } >FOO
+
+  .bss : {
+    *(.bss .bss.*)
+  } >FOO
+
+  /DISCARD/ : { *(.*) }
+}
+
-- 
2.25.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] ld: More documentation for --defsym
  2020-10-12 13:49 ` [PATCH 1/2] ld: More documentation for --defsym Andrew Burgess
@ 2020-10-14  4:54   ` Alan Modra
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Modra @ 2020-10-14  4:54 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: binutils

On Mon, Oct 12, 2020 at 02:49:14PM +0100, Andrew Burgess wrote:
> The ordering of command line options --defsym and -T is important,
> however, the description of --defsym in the manual doesn't mention
> this.
> 
> This commit adds more text to the description of --defsym to try and
> explain this ordering requirement.
> 
> ld/ChangeLog:
> 
> 	* ld.texi (Options): Extend the description of --defsym.

OK.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions
  2020-10-12 13:49 ` [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions Andrew Burgess
@ 2020-10-14  4:59   ` Alan Modra
  2020-10-15 12:58     ` Andrew Burgess
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Modra @ 2020-10-14  4:59 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: binutils

On Mon, Oct 12, 2020 at 02:49:15PM +0100, Andrew Burgess wrote:
> 	* ldlang.c (lang_process): Add extra call to lang_do_memory_regions, and pass
> 	phase parameter.
> 	(lang_do_memory_regions): Add parameter, only define origin and
> 	length when we are not in the first phase.  Reindent.
> 	* testsuite/ld-scripts/provide-10.d: New file.
> 	* testsuite/ld-scripts/provide-10.map: New file.
> 	* testsuite/ld-scripts/provide-11.d: New file.
> 	* testsuite/ld-scripts/provide-11.map: New file.
> 	* testsuite/ld-scripts/provide-12.d: New file.
> 	* testsuite/ld-scripts/provide-12.map: New file.
> 	* testsuite/ld-scripts/provide-9.d: New file.
> 	* testsuite/ld-scripts/provide-9.map: New file.
> 	* testsuite/ld-scripts/provide-9.t: New file.

OK, except I think it would be better in this case to omit the
lang_do_memory_regions parameter and just test
expld.phase != lang_first_phase_enum in lang_do_memory_regions.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions
  2020-10-14  4:59   ` Alan Modra
@ 2020-10-15 12:58     ` Andrew Burgess
  2020-10-15 14:19       ` Alan Modra
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Burgess @ 2020-10-15 12:58 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

* Alan Modra <amodra@gmail.com> [2020-10-14 15:29:18 +1030]:

> On Mon, Oct 12, 2020 at 02:49:15PM +0100, Andrew Burgess wrote:
> > 	* ldlang.c (lang_process): Add extra call to lang_do_memory_regions, and pass
> > 	phase parameter.
> > 	(lang_do_memory_regions): Add parameter, only define origin and
> > 	length when we are not in the first phase.  Reindent.
> > 	* testsuite/ld-scripts/provide-10.d: New file.
> > 	* testsuite/ld-scripts/provide-10.map: New file.
> > 	* testsuite/ld-scripts/provide-11.d: New file.
> > 	* testsuite/ld-scripts/provide-11.map: New file.
> > 	* testsuite/ld-scripts/provide-12.d: New file.
> > 	* testsuite/ld-scripts/provide-12.map: New file.
> > 	* testsuite/ld-scripts/provide-9.d: New file.
> > 	* testsuite/ld-scripts/provide-9.map: New file.
> > 	* testsuite/ld-scripts/provide-9.t: New file.
> 
> OK, except I think it would be better in this case to omit the
> lang_do_memory_regions parameter and just test
> expld.phase != lang_first_phase_enum in lang_do_memory_regions.

Hi,

Thanks for the review.

I looked at making the change you suggested, but ran into a bit of a
problem.  The two calls to lang_do_memory_regions appear in ldlang.c
like this:

  /* A first pass through the memory regions ensures that if any region
     references a symbol for its origin or length then this symbol will be
     added to the symbol table.  Having these symbols in the symbol table
     means that when we call open_input_bfds PROVIDE statements will
     trigger to provide any needed symbols.  The regions origins and
     lengths are not assigned as a result of this call.  */
  lang_do_memory_regions (...);

  /* Create a bfd for each input file.  */
  current_target = default_target;
  lang_statement_iteration++;
  open_input_bfds (statement_list.head, OPEN_BFD_NORMAL);

  /* Now that open_input_bfds has processed assignments and provide
     statements we can give values to symbolic origin/length now.  */
  lang_do_memory_regions (...);

Now at the location of the first call expld.phase is indeed set to
lang_first_phase_enum, so that's not an issue.  The problem is that
expld.phase is _still_ set to lang_first_phase_enum at the time of the
second call.

Now, sure, I can change its value, but what if some later call depends
on this being set to lang_first_phase_enum?  OK, so I should change it
back again after.  So now I have the second call like this:

  /* Now that open_input_bfds has processed assignments and provide
     statements we can give values to symbolic origin/length now.  */
  expld.phase = lang_final_phase_enum;
  lang_do_memory_regions ();
  expld.phase = lang_first_phase_enum;

That kind of feels like a really rubbish way to pass state to the
function call.

What I did do was move away from using the "phase" parameter, and
instead just passed a TRUE/FALSE 'update_regions_p' predicate.
Hopefully this removes possible confusion with "why a parameter and
not expld.phase?", but also avoids passing parameters through global
variables.

What do you think to this?

Thanks,
Andrew

---

commit 8c6592303d0dbd581fb779f50b4a22d477b3d647
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Wed Oct 7 14:27:56 2020 +0100

    ld: Allow symbols from PROVIDE to be use in MEMORY regions
    
    I wanted to write a linker script like this:
    
      PROVIDE(mem_origin = 0x1000);
      PROVIDE(mem_length = 0x1000);
    
      MEMORY
      {
        REGION : ORIGIN = mem_origin, LENGTH = mem_length
      }
    
      ....
    
    Then when I link using this script I can optionally supply:
    
      --defsym=mem_origin=..... --defsym=mem_length=....
    
    to override the defaults.
    
    And though passing `--defsym' does work, if I remove the use of
    `--defsym' and just rely on the defaults I get an error:
    
      ld-new: invalid origin for memory region REGION
    
    Interestingly, if I make the above error non-fatal and dump a linker
    map file I see that (a) REGION has origin 0x0, and length 0xffff...,
    and (b) the symbol from the PROVIDE is provided.
    
    An examination of ldlang.c:lang_process shows us what the issue is,
    the origin and length of all memory regions are set as a result of a
    single call to lang_do_memory_regions, this call is done after calling
    open_input_bfds.
    
    During the open_input_bfds call provide statements can be converted to
    provided statements if we know that the assigned symbol is needed, but
    for symbols that are only used in the memory regions we are unaware
    that we need these symbols.
    
    What I propose in this patch is to make two calls to
    lang_do_memory_regions, in the first call we process the expressions
    for the origin and length fields of each region, however, errors,
    especially undefined symbols, will be ignored.  The origin and length
    values are not updated.  However, by evaluating the expressions any
    symbols we need will be added to the symbol table.
    
    Now when we call open_input_bfds, when we process the provide
    statements, we will see that the assigned symbol is needed add its new
    value to the symbol table.
    
    Finally we reach the original call to lang_do_memory_regions, in
    this (now second) call we again process the expressions, and this time
    update the origin and length values.  Any errors encountered now are
    reported to the user.
    
    ld/ChangeLog:
    
            * ldlang.c (lang_process): Add extra call to
            lang_do_memory_regions, and pass parameter.
            (lang_do_memory_regions): Add parameter, only define origin and
            length when requested.  Reindent.
            * testsuite/ld-scripts/provide-10.d: New file.
            * testsuite/ld-scripts/provide-10.map: New file.
            * testsuite/ld-scripts/provide-11.d: New file.
            * testsuite/ld-scripts/provide-11.map: New file.
            * testsuite/ld-scripts/provide-12.d: New file.
            * testsuite/ld-scripts/provide-12.map: New file.
            * testsuite/ld-scripts/provide-9.d: New file.
            * testsuite/ld-scripts/provide-9.map: New file.
            * testsuite/ld-scripts/provide-9.t: New file.

diff --git a/ld/ldlang.c b/ld/ldlang.c
index 4249b3a045d..2073ac09ce4 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -98,7 +98,7 @@ static void lang_record_phdrs (void);
 static void lang_do_version_exports_section (void);
 static void lang_finalize_version_expr_head
   (struct bfd_elf_version_expr_head *);
-static void lang_do_memory_regions (void);
+static void lang_do_memory_regions (bfd_boolean);
 
 /* Exported variables.  */
 const char *output_target;
@@ -7909,13 +7909,22 @@ lang_process (void)
   if (!bfd_section_already_linked_table_init ())
     einfo (_("%F%P: can not create hash table: %E\n"));
 
+  /* A first pass through the memory regions ensures that if any region
+     references a symbol for its origin or length then this symbol will be
+     added to the symbol table.  Having these symbols in the symbol table
+     means that when we call open_input_bfds PROVIDE statements will
+     trigger to provide any needed symbols.  The regions origins and
+     lengths are not assigned as a result of this call.  */
+  lang_do_memory_regions (FALSE);
+
   /* Create a bfd for each input file.  */
   current_target = default_target;
   lang_statement_iteration++;
   open_input_bfds (statement_list.head, OPEN_BFD_NORMAL);
-  /* open_input_bfds also handles assignments, so we can give values
-     to symbolic origin/length now.  */
-  lang_do_memory_regions ();
+
+  /* Now that open_input_bfds has processed assignments and provide
+     statements we can give values to symbolic origin/length now.  */
+  lang_do_memory_regions (TRUE);
 
 #if BFD_SUPPORTS_PLUGINS
   if (link_info.lto_plugin_active)
@@ -9373,10 +9382,16 @@ lang_do_version_exports_section (void)
 			   lang_new_vers_node (greg, lreg), NULL);
 }
 
-/* Evaluate LENGTH and ORIGIN parts of MEMORY spec */
+/* Evaluate LENGTH and ORIGIN parts of MEMORY spec.  This is initially
+   called with UPDATE_REGIONS_P set to FALSE, in this case no errors are
+   thrown, however, references to symbols in the origin and length fields
+   will be pushed into the symbol table, this allows PROVIDE statements to
+   then provide these symbols.  This function is called a second time with
+   UPDATE_REGIONS_P set to TRUE, this time the we update the actual region
+   data structures, and throw errors if missing symbols are encountered.  */
 
 static void
-lang_do_memory_regions (void)
+lang_do_memory_regions (bfd_boolean update_regions_p)
 {
   lang_memory_region_type *r = lang_memory_region_list;
 
@@ -9385,24 +9400,30 @@ lang_do_memory_regions (void)
       if (r->origin_exp)
 	{
 	  exp_fold_tree_no_dot (r->origin_exp);
-	  if (expld.result.valid_p)
-	    {
-	      r->origin = expld.result.value;
-	      r->current = r->origin;
-	    }
-	  else
-	    einfo (_("%F%P: invalid origin for memory region %s\n"),
-		   r->name_list.name);
+          if (update_regions_p)
+            {
+              if (expld.result.valid_p)
+                {
+                  r->origin = expld.result.value;
+                  r->current = r->origin;
+                }
+              else
+                einfo (_("%P: invalid origin for memory region %s\n"),
+                       r->name_list.name);
+            }
 	}
       if (r->length_exp)
 	{
 	  exp_fold_tree_no_dot (r->length_exp);
-	  if (expld.result.valid_p)
-	    r->length = expld.result.value;
-	  else
-	    einfo (_("%F%P: invalid length for memory region %s\n"),
-		   r->name_list.name);
-	}
+          if (update_regions_p)
+            {
+              if (expld.result.valid_p)
+                r->length = expld.result.value;
+              else
+                einfo (_("%P: invalid length for memory region %s\n"),
+                       r->name_list.name);
+            }
+        }
     }
 }
 
diff --git a/ld/testsuite/ld-scripts/provide-10.d b/ld/testsuite/ld-scripts/provide-10.d
new file mode 100644
index 00000000000..7481a92bbbd
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-10.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_origin=0x300 --defsym=mem_length=0x400 -T provide-9.t
+#map: provide-10.map
diff --git a/ld/testsuite/ld-scripts/provide-10.map b/ld/testsuite/ld-scripts/provide-10.map
new file mode 100644
index 00000000000..022b962b2d8
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-10.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+300 +0x[0-9a-f]+400
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-11.d b/ld/testsuite/ld-scripts/provide-11.d
new file mode 100644
index 00000000000..79bcfa62162
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-11.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_length=0x400 -T provide-9.t
+#map: provide-11.map
diff --git a/ld/testsuite/ld-scripts/provide-11.map b/ld/testsuite/ld-scripts/provide-11.map
new file mode 100644
index 00000000000..71763123123
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-11.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+100 +0x[0-9a-f]+400
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-12.d b/ld/testsuite/ld-scripts/provide-12.d
new file mode 100644
index 00000000000..41d95900fd3
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-12.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: --defsym=mem_origin=0x300 -T provide-9.t
+#map: provide-12.map
diff --git a/ld/testsuite/ld-scripts/provide-12.map b/ld/testsuite/ld-scripts/provide-12.map
new file mode 100644
index 00000000000..e76654b10fd
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-12.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+300 +0x[0-9a-f]+200
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-9.d b/ld/testsuite/ld-scripts/provide-9.d
new file mode 100644
index 00000000000..94dc02949b7
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.d
@@ -0,0 +1,3 @@
+#source: provide-5.s
+#ld: -T provide-9.t
+#map: provide-9.map
diff --git a/ld/testsuite/ld-scripts/provide-9.map b/ld/testsuite/ld-scripts/provide-9.map
new file mode 100644
index 00000000000..e35e3e25157
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.map
@@ -0,0 +1,6 @@
+#...
+Memory Configuration
+
+Name             Origin             Length             Attributes
+FOO              0x[0-9a-f]+100 +0x[0-9a-f]+200
+#pass
diff --git a/ld/testsuite/ld-scripts/provide-9.t b/ld/testsuite/ld-scripts/provide-9.t
new file mode 100644
index 00000000000..00d906aa80a
--- /dev/null
+++ b/ld/testsuite/ld-scripts/provide-9.t
@@ -0,0 +1,25 @@
+PROVIDE (mem_origin = 0x100);
+PROVIDE (mem_length = 0x200);
+
+MEMORY
+{
+  FOO : ORIGIN = mem_origin, LENGTH = mem_length
+}
+
+SECTIONS
+{
+  .data : {
+    *(.data .data.*)
+  } >FOO
+
+  .text : {
+    *(.text .text.*)
+  } >FOO
+
+  .bss : {
+    *(.bss .bss.*)
+  } >FOO
+
+  /DISCARD/ : { *(.*) }
+}
+

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions
  2020-10-15 12:58     ` Andrew Burgess
@ 2020-10-15 14:19       ` Alan Modra
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Modra @ 2020-10-15 14:19 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: binutils

On Thu, Oct 15, 2020 at 01:58:10PM +0100, Andrew Burgess wrote:
> Now at the location of the first call expld.phase is indeed set to
> lang_first_phase_enum, so that's not an issue.  The problem is that
> expld.phase is _still_ set to lang_first_phase_enum at the time of the
> second call.

Oops, indeed it is.  We don't get to lang_mark_phase_enum until later.

> What I did do was move away from using the "phase" parameter, and
> instead just passed a TRUE/FALSE 'update_regions_p' predicate.
> Hopefully this removes possible confusion with "why a parameter and
> not expld.phase?", but also avoids passing parameters through global
> variables.

Yes, this is better.

> What do you think to this?

It's fine to apply.  Thanks for your patience.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-10-15 14:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-12 13:49 [PATCH 0/2] ld: On defsym, PROVIDE, and MEMORY regions Andrew Burgess
2020-10-12 13:49 ` [PATCH 1/2] ld: More documentation for --defsym Andrew Burgess
2020-10-14  4:54   ` Alan Modra
2020-10-12 13:49 ` [PATCH 2/2] ld: Allow symbols from PROVIDE to be use in MEMORY regions Andrew Burgess
2020-10-14  4:59   ` Alan Modra
2020-10-15 12:58     ` Andrew Burgess
2020-10-15 14:19       ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).