Re: binutils-doc 2.15-5: glitches in ld.info

public inbox for binutils@sourceware.org

* Re: binutils-doc 2.15-5: glitches in ld.info
       [not found] <E1Crf48-0008TJ-00@whorl.oslo.opera.com>
@ 2005-01-23  7:43 ` Alan Modra
  2005-01-24 10:53   ` Edward Welbourne
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Modra @ 2005-01-23  7:43 UTC
  To: Edward Welbourne; +Cc: bug-binutils, binutils

On Thu, Jan 20, 2005 at 05:21:48PM +0100, Edward Welbourne wrote:
> I have version 2.15-5 of the binutils-doc package as shipped with
> Debian/sarge.  I was reading up on Linker Scripts in the ld info pages
> and noticed what I suppose to be some errors;

Thanks.  The CONSTRUCTOR paragraph is wrong, fixed by the following.

	* ld.texinfo (Output Section Keywords <CONSTRUCTORS>): Correct
	__DTOR_LIST__ description.

>        extern char __load_start_text1, __load_stop_text1;
>        memcpy ((char *) 0x1000, &__load_start_text1,
>                &__load_stop_text1 - &__load_start_text1);

However, this example is correct, and your suggested change

>        extern char *__load_start_text1, *__load_stop_text1;
>        memcpy ((char *) 0x1000, __load_start_text1,
>                __load_stop_text1 - __load_start_text1);

won't work like you think it will.
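
A minimal sketch of the difference, in plain C only.  The array
`load_image` is a hypothetical stand-in for the bytes found at the
section's load address (in a real link the linker script, not C, would
attach a name to that spot), and the helper names are illustrative
rather than part of any toolchain.  It also assumes `uintptr_t` has the
size of a pointer, as it does on common platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the bytes found at the section's load address.  */
static unsigned char load_image[32];

/* Fill the stand-in region with a recognisable byte pattern.  */
void fill_image (unsigned char byte)
{
  memset (load_image, byte, sizeof load_image);
}

/* Models `extern char sym;' used as `&sym': the result is the address.  */
uintptr_t seen_as_char (void)
{
  return (uintptr_t) &load_image[0];
}

/* Models `extern char *sym;' used as `sym': the result is whatever bytes
   are stored at that address, read as if they were a pointer.  */
uintptr_t seen_as_pointer (void)
{
  uintptr_t bits;

  assert (sizeof bits <= sizeof load_image);
  memcpy (&bits, load_image, sizeof bits);
  return bits;
}
```

After `fill_image (0x41)`, `seen_as_pointer ()` returns a word built
from the region's contents, while `seen_as_char ()` still returns where
the region is, which is what memcpy needs as its source argument.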

Index: ld.texinfo
===================================================================
RCS file: /cvs/src/src/ld/ld.texinfo,v
retrieving revision 1.136
diff -u -p -r1.136 ld.texinfo
--- ld.texinfo	23 Jan 2005 05:36:37 -0000	1.136
+++ ld.texinfo	23 Jan 2005 07:37:22 -0000
@@ -3396,7 +3396,9 @@ linker to place constructor information 
 ignored for other object file formats.
 
 The symbol @w{@code{__CTOR_LIST__}} marks the start of the global
-constructors, and the symbol @w{@code{__DTOR_LIST}} marks the end.  The
+constructors, and the symbol @w{@code{__CTOR_END__}} marks the end.
+Similarly, @w{@code{__DTOR_LIST__}} and @w{@code{__DTOR_END__}} mark
+the start and end of the global destructors.  The
 first word in the list is the number of entries, followed by the address
 of each constructor or destructor, followed by a zero word.  The
 compiler must arrange to actually run the code.  For these object file

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-01-23  7:43 ` binutils-doc 2.15-5: glitches in ld.info Alan Modra
@ 2005-01-24 10:53   ` Edward Welbourne
  2005-01-24 19:03     ` Ian Lance Taylor
  0 siblings, 1 reply; 7+ messages in thread
From: Edward Welbourne @ 2005-01-24 10:53 UTC
  To: amodra; +Cc: bug-binutils, binutils

(ld) Overlay Description

> However, this example is correct, and your suggested change
> won't work like you think it will..

OK, so I infer that I've mis-understood the accompanying text; which
hints that perhaps some clarification would be prudent, though it
might not belong in this page.  Two paragraphs earlier:

    For each section within the `OVERLAY', the linker automatically
  defines two symbols.  The symbol `__load_start_SECNAME' is defined
  as the starting load address of the section.  The symbol
  `__load_stop_SECNAME' is defined as the final load address of the
  section.

shortly after the example:

       .text0 0x1000 : AT (0x4000) { o1/*.o(.text) }
       __load_start_text0 = LOADADDR (.text0);
       __load_stop_text0 = LOADADDR (.text0) + SIZEOF (.text0);
       .text1 0x1000 : AT (0x4000 + SIZEOF (.text0)) { o2/*.o(.text) }
       __load_start_text1 = LOADADDR (.text1);
       __load_stop_text1 = LOADADDR (.text1) + SIZEOF (.text1);
       . = 0x1000 + MAX (SIZEOF (.text0), SIZEOF (.text1));

and I've been interpreting (ld) Assignments and (ld) Simple
Assignments as saying that assignments work like they do in C - that
is, if the linker script says
	 SYMBOL = EXPRESSION
then C code compiled into one of the objects being linked (c.f. the
example you assure me is correct) can use SYMBOL as an lvalue and
thereby get the value of EXPRESSION; while &SYMBOL will get a memory
address at which this value is being held.  I conclude that this is
not how it works.

If I have understood what you are saying correctly, then a linker
script assignment actually creates an alias - an assignment
	 SYMBOL = EXPRESSION
causes my C code to be able to reference &SYMBOL to get the value of
EXPRESSION, and my C code won't be able to reference the place where
that value is stored.

If such is really the case, either (ld) Assignments or possibly "Basic
Linker Script Concepts" should explain it - since the naive C/C++
programmer will be apt to make the same mistake as I did in
understanding the meaning of assignment.  Alternatively, if in fact my
initial reading of how assignments work is correct, then there are
problems with the (ld) Overlay Description text quoted above.

	Eddy.


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-01-24 10:53   ` Edward Welbourne
@ 2005-01-24 19:03     ` Ian Lance Taylor
  2005-01-24 22:47       ` Edward Welbourne
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Lance Taylor @ 2005-01-24 19:03 UTC
  To: eddy; +Cc: amodra, bug-binutils, binutils

Edward Welbourne <eddy@opera.com> writes:

> and I've been interpreting (ld) Assignments and (ld) Simple
> Assignments as saying that assignments work like they do in C - that
> is, if the linker script says
> 	 SYMBOL = EXPRESSION
> then C code compiled into one of the objects being linked (c.f. the
> example you assure me is correct) can use SYMBOL as an lvalue and
> thereby get the value of EXPRESSION; while &SYMBOL will get a memory
> address at which this value is being held.  I conclude that this is
> not how it works.

You are correct that that is not how it works.  In a linker script,
assigning to SYMBOL just affects the symbol table.

> If I have understood what you are saying correctly, then a linker
> script assignment actually creates an alias - an assignment
> 	 SYMBOL = EXPRESSION
> causes my C code to be able to reference &SYMBOL to get the value of
> EXPRESSION, and my C code won't be able to reference the place where
> that value is stored.

Well, it depends upon how you write your C code.  If you want to play
with this kind of thing in linker scripts, you need to understand the
interaction between C code and the linker symbol table.

First I'll note that from the point of view of the program there is no
"place where that value is stored."  The value is stored in the symbol
table, but the symbol table is not (normally) loaded into memory at
runtime.

If you write
    extern int VAR;
then in the simple case C code will expect the symbol VAR to be
defined in the symbol table.  The value that VAR will have in the
symbol table is the address of the location where the variable's value
will be stored.  So if in your linker script you write "VAR = VAL"
then you are in essence saying that the address where the value of the
C variable VAR will be stored is VAL.  So if your C code wants to get
VAL, you need to write "&VAR".
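
As a rough illustration, here is a toy model of that relationship.  It
is only a sketch: `struct toy_symbol` and the helper functions are
hypothetical names invented for this example, and a real symbol table
lives in the object file and is consumed by the linker, not by the
running program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one symbol table entry.  */
struct toy_symbol
{
  const char *name;
  uintptr_t value;      /* For a variable: the address of its storage.  */
};

/* Defined in C, so the compiler reserves storage for it.  */
int counter = 42;

/* Entry contributed by the C definition above: the value is the
   address of the storage, not the 42 held there.  */
struct toy_symbol entry_for_c_definition (void)
{
  struct toy_symbol s = { "counter", (uintptr_t) &counter };
  return s;
}

/* Entry created by a linker script `VAR = VAL': the value is VAL
   itself and no storage is reserved anywhere.  */
struct toy_symbol entry_for_script_assignment (const char *name, uintptr_t val)
{
  struct toy_symbol s;

  assert (name != NULL);
  s.name = name;
  s.value = val;
  return s;
}

/* What `&VAR' evaluates to in C once the reference has been resolved.  */
uintptr_t address_of (struct toy_symbol s)
{
  return s.value;
}
```

In this model `address_of (entry_for_script_assignment ("VAR", 0x1000))`
gives back 0x1000, which mirrors why the C code has to write "&VAR" to
recover VAL.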

Or, you can write
    extern int VAR[];
Once again, in the simple case, C code will expect VAR to be defined
in the symbol table, and to be the address of the array.  However,
now if you simply write VAR in C, you will get VAL, because in C
a simple reference to an array is the same as a reference to the
address of the first element, which is VAR.
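
The C rule being relied on can be checked with an ordinary array.  In
this sketch `demo_var` is a hypothetical array defined in C, standing in
for a symbol that a linker script would otherwise define; the function
names are likewise made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a symbol declared as `extern char VAR[];'.  */
char demo_var[4];

/* A bare reference to the array converts to the address of element 0.  */
uintptr_t by_array_name (void)
{
  return (uintptr_t) (void *) demo_var;
}

/* The same address, spelled out explicitly.  */
uintptr_t by_first_element (void)
{
  return (uintptr_t) (void *) &demo_var[0];
}

/* Taking the address of the whole array also lands on the same byte.  */
uintptr_t by_address_of (void)
{
  return (uintptr_t) (void *) &demo_var;
}

/* All three spellings name the same location.  */
void check_array_forms (void)
{
  assert (by_array_name () == by_first_element ());
  assert (by_array_name () == by_address_of ());
}
```

So with the array declaration the `&` is optional, whereas with
`extern int VAR;` leaving it out reads the memory at that address
instead.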

> If such is really the case, either (ld) Assignments or possibly "Basic
> Linker Script Concepts" should explain it - since the naive C/C++
> programmer will be apt to make the same mistake as I did in
> understanding the meaning of assignment.  Alternatively, if in fact my
> initial reading of how assignments work is correct, then there are
> problems with the (ld) Overlay Description text quoted above.

Documentation updates are always welcome.

Ian


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-01-24 19:03     ` Ian Lance Taylor
@ 2005-01-24 22:47       ` Edward Welbourne
  2005-02-01 12:34         ` Nick Clifton
  0 siblings, 1 reply; 7+ messages in thread
From: Edward Welbourne @ 2005-01-24 22:47 UTC
  To: ian; +Cc: amodra, bug-binutils, binutils

> Documentation updates are always welcome.

OK.  (ld) Assigning Values to Symbols
===========================
seems like the place where I believe something is worth saying about
this.  I am poorly qualified to write it, since the foregoing
discussion is all I know about it.  None the less ...

The section presently comprises a short paragraph <quote>

You may assign a value to a symbol in a linker script.  This will define
the symbol as a global symbol.

</quote> followed by a menu of two sub-topics.  I suggest adding to
that paragraph, so as to set context, and adding a third sub-section,
given below as after the existing two.  I am sure it can be greatly
improved, if only because I'm not much good at terseness.  A patch
follows, but I do not know enough about .info files to know what to do
about its Tag Table.  Hmm ... .info files are generated, but I don't
have the source file for it.  If the simple text of the modified
section would be more useful, please let me know; or point me to a
copy of the source form and I'll send a patch for it.

	Eddy.
-- 
diff -bu ld.info.orig ld.info
--- ld.info.orig	2004-11-25 01:24:28.000000000 +0100
+++ ld.info		2005-01-24 23:18:39.000000000 +0100
@@ -2063,12 +2063,14 @@
 ===========================
 
 You may assign a value to a symbol in a linker script.  This will define
-the symbol as a global symbol.
+the symbol as a global symbol: it creates an entry in the symbol table,
+which the linker uses to resolve references to the symbol.
 
 * Menu:
 
 * Simple Assignments::		Simple Assignments
 * PROVIDE::			PROVIDE
+* Source Code Reference::	How the symbol appears to the program.
 
 \x1f
 File: ld.info,  Node: Simple Assignments,  Next: PROVIDE,  Up: Assignments
@@ -2128,7 +2130,7 @@
 boundary.
 
 \x1f
-File: ld.info,  Node: PROVIDE,  Prev: Simple Assignments,  Up: Assignments
+File: ld.info, Node: PROVIDE, Next: Source Code Reference, Prev: Simple Assignments, Up: Assignments
 
 PROVIDE
 -------
@@ -2160,6 +2162,80 @@
 will use the definition in the linker script.
 
 \x1f
+File: ld.info,  Node: Source Code Reference, Prev: PROVIDE, Up: Assignments
+
+Source Code Reference
+---------------------
+
+When it is necessary for a program to reference symbols assigned by
+its linker script, it is important to understand the relationship
+between entries in the symbol table and the semantics of variables in
+your program's source code.  The short story is that the linker's
+symbol table records the addresses of objects with external linkage: a
+linker script assignment says where an object is in memory.
+
+Before going further, we must note that the compiler for your
+programming language may transform the names in your source code, so
+that their names in the resulting symbol tables will not match the
+names you use in your source code; for example, Fortran compilers
+commonly prepend or append an underscore, and C++ performs extensive
+"name mangling".  You should consult your compiler's manual for
+details; you shall need to use the correctly transformed name in your
+linker script, to match the name you use in your source code.
+
+However, it is usual for C compilers to use source names verbatim in
+the symbol table, so we shall use C in what follows.  In C++, it is
+possible to have a name treated "as if" in C (i.e. without
+name-mangling) by declaring it in the extern "C" namespace.
+
+Only symbols your program declares with "external linkage" will appear
+in (or be resolved via) the symbol table: objects with "local linkage"
+(those declared static in file scope, in C) are not visible to the
+linker.  Objects your program defines (i.e. functions for which it
+provides bodies, and variables it declares at file scope, but does not
+declare "static") provide entries to the symbol table; your linker
+script cannot assign to their names - doing so would produce a
+duplicate symbol warning.  However, if your program declares a symbol
+with external linkage, but does not define it, then the linker will
+have to provide that symbol with a value.  Normally, this happens by
+some other object file providing a definition for the symbol, which
+the linker makes your use of the symbol refer to.
+
+However, the linker can also resolve your use of a symbol using a
+linker script assignment.  If your C program declares
+
+     extern int VAR;
+
+then the symbol table needs to say where VAR is to be stored in
+memory, so that your program's uses of VAR can be turned into read and
+write operations accessing that memory.  The symbol table records the
+start address of the piece of memory to be accessed: it knows nothing
+about its size (let alone its type), though the compiler will have
+caused your uses of VAR to access the right number of bytes following
+the start address.  If your linker script says
+
+     VAR = EXPRESSION
+
+then the value it computes for EXPRESSION will be used as the address
+at which VAR is stored: this is what your C program would see as &VAR.
+For an example of a C program referring to symbols set by a link
+script, see *Note Overlay Description.
+
+If the EXPRESSION evaluates to some address which your program is
+using in some other way, its references to VAR may well conflict with
+that use: the linker assumes you mean what you say, so it is up to you
+to ensure your link script provides a sensible EXPRESSION.
+
+Due to some quirks of C's type system (treating functions and arrays
+as synonyms for their addresses, in some respects - but not all) it is
+possible to declare an extern variable and have its value as seen by
+your C code coincide with the value assigned to it by your linker
+script.  However, thanks to related complications of C's type system,
+use of such quirks should be approached with caution.  It is safer to
+specify an ordinary type and have the fact that you are taking the
+variable's address be evident to anyone reading your program.
+
+\x1f
 File: ld.info,  Node: SECTIONS,  Next: MEMORY,  Prev: Assignments,  Up: Scripts
 
 SECTIONS Command

Diff finished at Mon Jan 24 23:19:03


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-01-24 22:47       ` Edward Welbourne
@ 2005-02-01 12:34         ` Nick Clifton
  2005-02-01 14:49           ` Edward Welbourne
  0 siblings, 1 reply; 7+ messages in thread
From: Nick Clifton @ 2005-02-01 12:34 UTC
  To: eddy; +Cc: ian, amodra, binutils

[-- Attachment #1: Type: text/plain, Size: 425 bytes --]

Hi Edward,

Thanks for the suggested documentation update.  I have taken the liberty 
of rewriting it slightly, err, well actually rather a lot, but I think 
that I have managed to retain the feel and intent of your original version.

What do you think of this version then ?  If you were coming to this 
subject for the first time, do you think that this new section would 
explain the linker's behaviour ?

Cheers
   Nick


[-- Attachment #2: ld.texinfo.patch --]
[-- Type: text/plain, Size: 5308 bytes --]

Index: ld/ld.texinfo
===================================================================
RCS file: /cvs/src/src/ld/ld.texinfo,v
retrieving revision 1.139
diff -c -3 -p -r1.139 ld.texinfo
*** ld/ld.texinfo	1 Feb 2005 01:11:27 -0000	1.139
--- ld/ld.texinfo	1 Feb 2005 12:31:35 -0000
*************** the @samp{-f} option.
*** 2741,2751 ****
  @cindex symbol definition, scripts
  @cindex variables, defining
  You may assign a value to a symbol in a linker script.  This will define
! the symbol as a global symbol.
  
  @menu
  * Simple Assignments::		Simple Assignments
  * PROVIDE::			PROVIDE
  @end menu
  
  @node Simple Assignments
--- 2741,2752 ----
  @cindex symbol definition, scripts
  @cindex variables, defining
  You may assign a value to a symbol in a linker script.  This will define
! the symbol and place it into the symbol table with a global scope.
  
  @menu
  * Simple Assignments::		Simple Assignments
  * PROVIDE::			PROVIDE
+ * Source Code Reference::	How to use a linker script defined symbol in source code
  @end menu
  
  @node Simple Assignments
*************** underscore), the linker will silently us
*** 2838,2843 ****
--- 2839,2951 ----
  If the program references @samp{etext} but does not define it, the
  linker will use the definition in the linker script.
  
+ @node Source Code Reference
+ @subsection Source Code Reference
+ 
+ Accessing a linker script defined variable from source code is not
+ intuitive.  In particular a linker script symbol is not equivalent to
+ a variable declaration in a high level language, it is instead a
+ symbol that does not have a value.
+ 
+ Before going further, it is important to note that compilers often
+ transform names in the source code into different names when they are
+ stored in the symbol table.  For example, Fortran compilers commonly
+ prepend or append an underscore, and C++ performs extensive @samp{name
+ mangling}.  Therefore there might be a discrepancy between the name
+ of a variable as it is used in source code and the name of the same
+ variable as it is defined in a linker script.  For example in C a
+ linker script variable might be referred to as:
+ 
+ @smallexample
+   extern int foo;
+ @end smallexample
+ 
+ But in the linker script it might be defined as:
+ 
+ @smallexample
+   _foo = 1000;
+ @end smallexample
+ 
+ In the remaining examples however it is assumed that no name
+ transformation has taken place.
+ 
+ When a symbol is declared in a high level language such as C, two
+ things happen.  The first is that the compiler reserves enough space
+ in the program's memory to hold the @emph{value} of the symbol.  The
+ second is that the compiler creates an entry in the program's symbol
+ table which holds the symbol's @emph{address}.  ie the symbol table
+ contains the address of the block of memory holding the symbol's
+ value.  So for example the C declaration:
+ 
+ @smallexample
+   int foo = 1000;
+ @end smallexample
+ 
+ Creates a entry called @samp{foo} in the symbol table.  This entry
+ holds the address of an @samp{int} sized block of memory where the
+ number 1000 is currently stored.
+ 
+ When a program references a symbol the compiler generates code that
+ first accesses the symbol table to find the address of the symbol's
+ memory block and then code to read the value from that memory block.
+ So:
+ 
+ @smallexample
+   foo = 1;
+ @end smallexample
+ 
+ Looks up the symbol @samp{foo} in the symbol table, gets the address
+ associated with this symbol and then writes the value 1 into that
+ address.  Whereas:
+ 
+ @smallexample
+   int * a = & foo;
+ @end smallexample
+ 
+ Looks up the symbol @samp{foo} in the symbol table, gets it address
+ and then copies this address into the block of memory associated with
+ the variable @samp{a}.
+ 
+ Linker scripts symbol declarations by contrast, create an entry in
+ the symbol table but do not assign any memory to them.  Thus they are
+ an address without a value.  So for example the linker script definition:
+ 
+ @smallexample
+   foo = 1000;
+ @end smallexample
+ 
+ Creates an entry in the symbol table called @samp{foo} which contains
+ the address of memory location 1000, but nothing special is stored at
+ address 1000.  This means that you cannot access the @emph{value} of a
+ linker script defined symbol - it has no value - all you can do is
+ access the @emph{address} of a linker script defined symbol, 
+ 
+ Hence when you are using a linker script defined symbol in source code
+ you should always take the address of the symbol, and never attempt to
+ use its value.  For example suppose you want to copy the contents of a
+ section of memory called .ROM into a section called .FLASH and the
+ linker script contains these declarations:
+ 
+ @smallexample
+ @group
+   start_of_ROM   = .ROM;
+   end_of_ROM     = .ROM + sizeof (.ROM) - 1;
+   start_of_FLASH = .FLASH;
+ @end group
+ @end smallexample
+ 
+ Then the C source code to perform the copy would be:
+ 
+ @smallexample
+ @group
+   extern char start_of_ROM, end_of_ROM, start_of_FLASH;
+   
+   memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM);
+ @end group
+ @end smallexample
+ 
+ Note the use of the @samp{&} operators.  These are correct.
+ 
  @node SECTIONS
  @section SECTIONS Command
  @kindex SECTIONS


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-02-01 12:34         ` Nick Clifton
@ 2005-02-01 14:49           ` Edward Welbourne
  2005-02-01 17:30             ` Nick Clifton
  0 siblings, 1 reply; 7+ messages in thread
From: Edward Welbourne @ 2005-02-01 14:49 UTC
  To: nickc; +Cc: ian, amodra, binutils

> I have taken the liberty 
> of rewriting it slightly, err, well actually rather a lot,

Good.  I do not have delusions of expertise ...
Ah, that's how you write info files.
TeX with @ in place of \ ...

> What do you think of this version then ?
Looks good to me.  A few comments below ...

	Eddy.
-- 
+ contains the address of the block of memory holding the symbol's
+ value.  So for example the C declaration:
+ 
+ @smallexample
+   int foo = 1000;
+ @end smallexample

needs to be clear about it being an extern decl; e.g. by phrasing it as

+ value.  So for example (at file scope) the C declaration:

Subsequently:

+ Creates a entry called @samp{foo} in the symbol table.  This entry
+ holds the address of an @samp{int} sized block of memory where the
+ number 1000 is currently stored.

initially <-     ^^^^^^^^^

Later:

+ Linker scripts symbol declarations by contrast, create an entry in
+ the symbol table but do not assign any memory to them.  Thus they are
+ an address without a value.  So for example the linker script definition:
+ 
+ @smallexample
+   foo = 1000;
+ @end smallexample
+ 
+ Creates an entry in the symbol table called @samp{foo} which contains
+ the address of memory location 1000, but nothing special is stored at
+ address 1000.  This means that you cannot access the @emph{value} of a
+ linker script defined symbol - it has no value - all you can do is
+ access the @emph{address} of a linker script defined symbol, 

(i) punctuation: 2nd para begins mid-sentence, so "Creates" isn't
first word of a sentence and doesn't deserve to be capitalised; and this
para ends in a comma, "symbol," rather than "symbol."

(ii) but from a linker script's point of view the above is unfair, IIUC.
I would suggest

+ Linker scripts symbol declarations, by contrast, create an entry in
+ the symbol table but do not assign any memory to them.  Thus they
+ name an address rather than the value recorded at an address.  So for
+ example the linker script definition:
+ 
+ @smallexample
+   foo = 1000;
+ @end smallexample
+ 
+ creates an entry in the symbol table which gives name @samp{foo} to
+ memory location 1000, but stores nothing at address 1000 and provides
+ no means to alter @samp{foo} at run-time.

The subsequent paragraph's first sentence would then need rephrasing,
something like

+ Hence when you reference a linker script defined symbol in source
+ code, the @emph{address} of the symbol, rather than its value, is
+ what the linker script has defined.


* Re: binutils-doc 2.15-5: glitches in ld.info
  2005-02-01 14:49           ` Edward Welbourne
@ 2005-02-01 17:30             ` Nick Clifton
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Clifton @ 2005-02-01 17:30 UTC
  To: eddy; +Cc: ian, amodra, binutils

[-- Attachment #1: Type: text/plain, Size: 620 bytes --]

Hi Edward,

> Looks good to me.  A few comments below ...

Thanks - I have taken most of these on board, apart from the rewording 
of what a linker script declaration does when it creates an entry in the 
symbol table.  Call me pedantic, but I think that my version was 
technically more accurate.

Anyway I have checked the attached revised patch in with this ChangeLog 
entry.

Cheers
   Nick

ld/ChangeLog
2005-02-01  Edward Welbourne  <eddy@opera.com>
	    Nick Clifton  <nickc@redhat.com>

	* ld.texinfo (Source Code Reference): New node describing how to
	access linker script defined variables from source code.


[-- Attachment #2: ld.texinfo.patch --]
[-- Type: text/plain, Size: 5330 bytes --]

Index: ld/ld.texinfo
===================================================================
RCS file: /cvs/src/src/ld/ld.texinfo,v
retrieving revision 1.139
diff -c -3 -p -r1.139 ld.texinfo
*** ld/ld.texinfo	1 Feb 2005 01:11:27 -0000	1.139
--- ld/ld.texinfo	1 Feb 2005 17:27:43 -0000
*************** the @samp{-f} option.
*** 2741,2751 ****
  @cindex symbol definition, scripts
  @cindex variables, defining
  You may assign a value to a symbol in a linker script.  This will define
! the symbol as a global symbol.
  
  @menu
  * Simple Assignments::		Simple Assignments
  * PROVIDE::			PROVIDE
  @end menu
  
  @node Simple Assignments
--- 2741,2752 ----
  @cindex symbol definition, scripts
  @cindex variables, defining
  You may assign a value to a symbol in a linker script.  This will define
! the symbol and place it into the symbol table with a global scope.
  
  @menu
  * Simple Assignments::		Simple Assignments
  * PROVIDE::			PROVIDE
+ * Source Code Reference::	How to use a linker script defined symbol in source code
  @end menu
  
  @node Simple Assignments
*************** underscore), the linker will silently us
*** 2838,2843 ****
--- 2839,2951 ----
  If the program references @samp{etext} but does not define it, the
  linker will use the definition in the linker script.
  
+ @node Source Code Reference
+ @subsection Source Code Reference
+ 
+ Accessing a linker script defined variable from source code is not
+ intuitive.  In particular a linker script symbol is not equivalent to
+ a variable declaration in a high level language, it is instead a
+ symbol that does not have a value.
+ 
+ Before going further, it is important to note that compilers often
+ transform names in the source code into different names when they are
+ stored in the symbol table.  For example, Fortran compilers commonly
+ prepend or append an underscore, and C++ performs extensive @samp{name
+ mangling}.  Therefore there might be a discrepancy between the name
+ of a variable as it is used in source code and the name of the same
+ variable as it is defined in a linker script.  For example in C a
+ linker script variable might be referred to as:
+ 
+ @smallexample
+   extern int foo;
+ @end smallexample
+ 
+ But in the linker script it might be defined as:
+ 
+ @smallexample
+   _foo = 1000;
+ @end smallexample
+ 
+ In the remaining examples however it is assumed that no name
+ transformation has taken place.
+ 
+ When a symbol is declared in a high level language such as C, two
+ things happen.  The first is that the compiler reserves enough space
+ in the program's memory to hold the @emph{value} of the symbol.  The
+ second is that the compiler creates an entry in the program's symbol
+ table which holds the symbol's @emph{address}.  ie the symbol table
+ contains the address of the block of memory holding the symbol's
+ value.  So for example the following C declaration, at file scope:
+ 
+ @smallexample
+   int foo = 1000;
+ @end smallexample
+ 
+ creates an entry called @samp{foo} in the symbol table.  This entry
+ holds the address of an @samp{int} sized block of memory where the
+ number 1000 is initially stored.
+ 
+ When a program references a symbol the compiler generates code that
+ first accesses the symbol table to find the address of the symbol's
+ memory block and then code to read the value from that memory block.
+ So:
+ 
+ @smallexample
+   foo = 1;
+ @end smallexample
+ 
+ looks up the symbol @samp{foo} in the symbol table, gets the address
+ associated with this symbol and then writes the value 1 into that
+ address.  Whereas:
+ 
+ @smallexample
+   int * a = & foo;
+ @end smallexample
+ 
+ looks up the symbol @samp{foo} in the symbol table, gets its address
+ and then copies this address into the block of memory associated with
+ the variable @samp{a}.
+ 
+ Linker scripts symbol declarations, by contrast, create an entry in
+ the symbol table but do not assign any memory to them.  Thus they are
+ an address without a value.  So for example the linker script definition:
+ 
+ @smallexample
+   foo = 1000;
+ @end smallexample
+ 
+ creates an entry in the symbol table called @samp{foo} which holds
+ the address of memory location 1000, but nothing special is stored at
+ address 1000.  This means that you cannot access the @emph{value} of a
+ linker script defined symbol - it has no value - all you can do is
+ access the @emph{address} of a linker script defined symbol.
+ 
+ Hence when you are using a linker script defined symbol in source code
+ you should always take the address of the symbol, and never attempt to
+ use its value.  For example suppose you want to copy the contents of a
+ section of memory called .ROM into a section called .FLASH and the
+ linker script contains these declarations:
+ 
+ @smallexample
+ @group
+   start_of_ROM   = .ROM;
+   end_of_ROM     = .ROM + sizeof (.ROM) - 1;
+   start_of_FLASH = .FLASH;
+ @end group
+ @end smallexample
+ 
+ Then the C source code to perform the copy would be:
+ 
+ @smallexample
+ @group
+   extern char start_of_ROM, end_of_ROM, start_of_FLASH;
+   
+   memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM);
+ @end group
+ @end smallexample
+ 
+ Note the use of the @samp{&} operators.  These are correct.
+ 
  @node SECTIONS
  @section SECTIONS Command
  @kindex SECTIONS
