public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [RFC] PR ld/13621
@ 2012-01-30  0:57 Richard Henderson
  2012-01-30  3:26 ` Alan Modra
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Richard Henderson @ 2012-01-30  0:57 UTC (permalink / raw)
  To: binutils; +Cc: Mark Wielaard, Alan Modra

[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

This PR is about elflint warning about symbols outside of the section
with which they are associated.

The problem can be seen with just about any zero-sized section, nearly
all of which are removed by strip_excluded_output_sections.  Then, the
symbols are relocated to a "near by" section by _bfd_fix_excluded_sec_syms.

This elflint warning will affect essentially all binaries built with
gcc4.7 because of the crt symbol __TMC_END__, present in the normally
empty section .tm_clone_table.  Thus it seems worthwhile to fix this.

At first I attempted to SEC_KEEP any section with global symbols.  But
this ran into all sorts of testsuite failures re .got.  In those cases
the _GLOBAL_OFFSET_TABLE_ symbol still exists, which causes .got to 
be kept instead of removed as unused later in the link.

A second best solution would seem to be to merely move the offending
symbols to the ABS section rather than associate them with some random
section.

Thoughts?


r~

[-- Attachment #2: z1 --]
[-- Type: text/plain, Size: 1778 bytes --]

bfd/
	* linker.c (fix_syms): Don't add a symbol to a section if the
	symbol does not lie inclusively within the section.
ld/
	* testsuite/ld-elf/warn2.d: Expect Foo in ABS.
	* testsuite/ld-elf/zerosize1.d: New file.
	* testsuite/ld-elf/zerosize1.s: New file.


diff --git a/bfd/linker.c b/bfd/linker.c
index 7a01e11..1d9e66a 100644
--- a/bfd/linker.c
+++ b/bfd/linker.c
@@ -3201,6 +3201,12 @@ fix_syms (struct bfd_link_hash_entry *h, void *data)
 		op = op1;
 	    }
 
+	  /* Refuse to choose a section for which we are out of bounds.  */
+	  /* ??? This may make most of the above moot.  */
+	  if (h->u.def.value < op->vma
+	      || h->u.def.value > op->vma + op->size)
+	    op = bfd_abs_section_ptr;
+
 	  h->u.def.value -= op->vma;
 	  h->u.def.section = op;
 	}
diff --git a/ld/testsuite/ld-elf/warn2.d b/ld/testsuite/ld-elf/warn2.d
index 95b7ef4..a9c05f9 100644
--- a/ld/testsuite/ld-elf/warn2.d
+++ b/ld/testsuite/ld-elf/warn2.d
@@ -13,5 +13,5 @@
 # construct and that the symbol still appears as expected.
 
 #...
- +[0-9]+: +[0-9a-f]+ +20 +OBJECT +GLOBAL +DEFAULT +[1-9] Foo
+ +[0-9]+: +[0-9a-f]+ +20 +OBJECT +GLOBAL +DEFAULT +ABS Foo
 #pass
diff --git a/ld/testsuite/ld-elf/zerosize1.d b/ld/testsuite/ld-elf/zerosize1.d
new file mode 100644
index 0000000..0da5171
--- /dev/null
+++ b/ld/testsuite/ld-elf/zerosize1.d
@@ -0,0 +1,10 @@
+#source: start.s
+#source: zerosize1.s
+#ld:
+#readelf: -s
+
+# Check that xyzzy is not placed in the ABS section.
+
+#...
+ +[0-9]+: +[0-9a-f]+ +0 +NOTYPE +GLOBAL +DEFAULT +ABS xyzzy
+#pass
diff --git a/ld/testsuite/ld-elf/zerosize1.s b/ld/testsuite/ld-elf/zerosize1.s
new file mode 100644
index 0000000..4fc8198
--- /dev/null
+++ b/ld/testsuite/ld-elf/zerosize1.s
@@ -0,0 +1,3 @@
+	.section "zerosize","aw"
+	.globl	xyzzy
+xyzzy:

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  0:57 [RFC] PR ld/13621 Richard Henderson
@ 2012-01-30  3:26 ` Alan Modra
  2012-01-30  8:28   ` Mark Wielaard
  2012-01-30 19:51   ` Richard Henderson
  2012-01-30  8:23 ` Mark Wielaard
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 17+ messages in thread
From: Alan Modra @ 2012-01-30  3:26 UTC (permalink / raw)
  To: Richard Henderson; +Cc: binutils, Mark Wielaard

On Mon, Jan 30, 2012 at 11:56:49AM +1100, Richard Henderson wrote:
> This PR is about elflint warning about symbols outside of the section
> with which they are associated.

Huh.  So what's wrong with a symbol belonging to a section but defined
outside of 0..section_size-1?  Particularly if it happens to be
defined at section_size.  That's quite a natural way to mark out the
bounds of an array.  I think this is just elflint being silly.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  0:57 [RFC] PR ld/13621 Richard Henderson
  2012-01-30  3:26 ` Alan Modra
@ 2012-01-30  8:23 ` Mark Wielaard
  2012-03-01  3:58 ` Hans-Peter Nilsson
  2012-05-05 14:42 ` H.J. Lu
  3 siblings, 0 replies; 17+ messages in thread
From: Mark Wielaard @ 2012-01-30  8:23 UTC (permalink / raw)
  To: Richard Henderson; +Cc: binutils, Alan Modra

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

On Mon, Jan 30, 2012 at 11:56:49AM +1100, Richard Henderson wrote:
> At first I attempted to SEC_KEEP any section with global symbols.  But
> this ran into all sorts of testsuite failures re .got.  In those cases
> the _GLOBAL_OFFSET_TABLE_ symbol still exists, which causes .got to 
> be kept instead of removed as unused later in the link.

My attempt was using a separate flag to indicate global symbols
were present for the an input section. This works for the example
given in the bug report, but doesn't for the actual gc 4.7 case.
I assume because multiple input files define the same symbol, some
as undefined, and the output section gets associated with the "wrong"
input section. Maybe a solution could be to do te registration in
_bfd_elf_merge_symbol?

Cheers,

Mark

[-- Attachment #2: section_syms.patch --]
[-- Type: text/plain, Size: 4587 bytes --]

diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 1523660..fae3bfd 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -1392,6 +1392,9 @@ typedef struct bfd_section
   /* Nonzero if this section uses RELA relocations, rather than REL.  */
   unsigned int use_rela_p:1;
 
+  /* Nonzero if there are global symbols associated with this section.  */
+  unsigned int global_syms_p:1;
+
   /* Bits used by various backends.  The generic code doesn't touch
      these fields.  */
 
@@ -1674,8 +1677,8 @@ extern asection bfd_ind_section;
   /* linker_mark, linker_has_input, gc_mark, decompress_status,    */  \
      0,           0,                1,       0,                        \
                                                                        \
-  /* segment_mark, sec_info_type, use_rela_p,                      */  \
-     0,            0,             0,                                   \
+  /* segment_mark, sec_info_type, use_rela_p, global_syms_p,       */  \
+     0,            0,             0,          0,                       \
                                                                        \
   /* sec_flg0, sec_flg1, sec_flg2, sec_flg3, sec_flg4, sec_flg5,   */  \
      0,        0,        0,        0,        0,        0,              \
diff --git a/bfd/ecoff.c b/bfd/ecoff.c
index b76266d..50c7890 100644
--- a/bfd/ecoff.c
+++ b/bfd/ecoff.c
@@ -57,8 +57,8 @@ static asection bfd_debug_section =
      "*DEBUG*", 0,   0,     NULL, NULL, 0,     0,
   /* linker_mark, linker_has_input, gc_mark, compress_status,      */
      0,           0,                1,       0,
-  /* segment_mark, sec_info_type, use_rela_p,                      */
-     0,            0,             0,
+  /* segment_mark, sec_info_type, use_rela_p, global_syms_p,       */
+     0,            0,             0,          0,
   /* sec_flg0, sec_flg1, sec_flg2, sec_flg3, sec_flg4, sec_flg5,   */
      0,        0,        0,        0,        0,        0,
   /* vma, lma, size, rawsize, compressed_size, relax, relax_count, */
diff --git a/bfd/elf.c b/bfd/elf.c
index 9c9ba75..8f4e595 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -6320,6 +6320,7 @@ _bfd_elf_init_private_section_data (bfd *ibfd,
     }
 
   osec->use_rela_p = isec->use_rela_p;
+  osec->global_syms_p = isec->global_syms_p;
 
   return TRUE;
 }
diff --git a/bfd/elflink.c b/bfd/elflink.c
index 1d1ca0b..47191cf 100644
--- a/bfd/elflink.c
+++ b/bfd/elflink.c
@@ -3903,8 +3903,20 @@ error_free_dyn:
 	      sec = bfd_und_section_ptr;
 	      isym->st_shndx = SHN_UNDEF;
 	    }
-	  else if ((abfd->flags & (EXEC_P | DYNAMIC)) != 0)
-	    value -= sec->vma;
+	  else
+	    {
+	      /* If we have a global symbol for a section, then we need
+		 to keep it around, even if the section ends up empty.
+		 Except for .gnu.warning sections which are "special"
+		 (see above).  */
+	      if (flags == BSF_GLOBAL
+		  && ! CONST_STRNEQ (bfd_get_section_name (abfd, sec),
+				     ".gnu.warning"))
+		sec->global_syms_p = 1;
+
+	      if ((abfd->flags & (EXEC_P | DYNAMIC)) != 0)
+		value -= sec->vma;
+	    }
 	}
 
       name = bfd_elf_string_from_elf_section (abfd, hdr->sh_link,
diff --git a/bfd/section.c b/bfd/section.c
index 7c1f750..803de74 100644
--- a/bfd/section.c
+++ b/bfd/section.c
@@ -391,6 +391,9 @@ CODE_FRAGMENT
 .  {* Nonzero if this section uses RELA relocations, rather than REL.  *}
 .  unsigned int use_rela_p:1;
 .
+.  {* Nonzero if there are global symbols associated with this section.  *}
+.  unsigned int global_syms_p:1;
+.
 .  {* Bits used by various backends.  The generic code doesn't touch
 .     these fields.  *}
 .
@@ -673,8 +676,8 @@ CODE_FRAGMENT
 .  {* linker_mark, linker_has_input, gc_mark, decompress_status,    *}	\
 .     0,           0,                1,       0,			\
 .									\
-.  {* segment_mark, sec_info_type, use_rela_p,                      *}	\
-.     0,            0,             0,					\
+.  {* segment_mark, sec_info_type, use_rela_p, global_syms_p,       *}	\
+.     0,            0,             0,	       0,			\
 .									\
 .  {* sec_flg0, sec_flg1, sec_flg2, sec_flg3, sec_flg4, sec_flg5,   *}	\
 .     0,        0,        0,        0,        0,        0,		\
diff --git a/ld/ldlang.c b/ld/ldlang.c
index 5e01fc0..c7b24a2 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -3903,6 +3903,7 @@ strip_excluded_output_sections (void)
 	continue;
 
       exclude = (output_section->rawsize == 0
+		 && output_section->global_syms_p == 0
 		 && (output_section->flags & SEC_KEEP) == 0
 		 && !bfd_section_removed_from_list (link_info.output_bfd,
 						    output_section));

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  3:26 ` Alan Modra
@ 2012-01-30  8:28   ` Mark Wielaard
  2012-01-30 13:17     ` Alan Modra
  2012-01-30 19:51   ` Richard Henderson
  1 sibling, 1 reply; 17+ messages in thread
From: Mark Wielaard @ 2012-01-30  8:28 UTC (permalink / raw)
  To: Richard Henderson, binutils

On Mon, Jan 30, 2012 at 01:56:26PM +1030, Alan Modra wrote:
> On Mon, Jan 30, 2012 at 11:56:49AM +1100, Richard Henderson wrote:
> > This PR is about elflint warning about symbols outside of the section
> > with which they are associated.
> 
> Huh.  So what's wrong with a symbol belonging to a section but defined
> outside of 0..section_size-1?  Particularly if it happens to be
> defined at section_size.  That's quite a natural way to mark out the
> bounds of an array.  I think this is just elflint being silly.

elflint allows the sillyness of marking out the bounds of the section.
That is precisely what happens here, the symbol is used to mark the
end of the section (section_size), but then the whole section that
the symbol is associated with is removed. So now the symbol is
associated with some other random section, and points into some other
one that now is associated with the address of the symbol. In that
case it seems natural to at least mark the symbol as being associated
with SHN_ABS as Richard did, but it would be even better if ld didn't
just toss out a section that has symbols associated with it, so that
the address of the symbol at least matches up.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  8:28   ` Mark Wielaard
@ 2012-01-30 13:17     ` Alan Modra
  2012-01-30 14:14       ` Mark Wielaard
  2012-01-30 20:00       ` Richard Henderson
  0 siblings, 2 replies; 17+ messages in thread
From: Alan Modra @ 2012-01-30 13:17 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Richard Henderson, binutils

On Mon, Jan 30, 2012 at 09:28:40AM +0100, Mark Wielaard wrote:
> On Mon, Jan 30, 2012 at 01:56:26PM +1030, Alan Modra wrote:
> > On Mon, Jan 30, 2012 at 11:56:49AM +1100, Richard Henderson wrote:
> > > This PR is about elflint warning about symbols outside of the section
> > > with which they are associated.
> > 
> > Huh.  So what's wrong with a symbol belonging to a section but defined
> > outside of 0..section_size-1?  Particularly if it happens to be
> > defined at section_size.  That's quite a natural way to mark out the
> > bounds of an array.  I think this is just elflint being silly.
> 
> elflint allows the sillyness of marking out the bounds of the section.
> That is precisely what happens here, the symbol is used to mark the
> end of the section (section_size), but then the whole section that
> the symbol is associated with is removed.

There seems to be some misunderstanding about linker operation here,
and below when you talk about ld "tossing out" sections.

> So now the symbol is
> associated with some other random section, and points into some other
> one that now is associated with the address of the symbol.

The symbol is associated with some input file section.  This section
gets mapped to some output file section, along with any symbols.  The
output file section may not have the same name, for example, when a
linker script maps your .tm_clone_table input section to .data output
section.  I don't see why automatically mapping empty input sections
to some other named section in a sensible manner should cause any more
consternation than doing so via a linker script, except that in the
past we've made separate output sections for empty input sections not
mentioned in a linker script.  Current behaviour of not making empty
orphan output sections has existed since 2005.

> In that
> case it seems natural to at least mark the symbol as being associated
> with SHN_ABS as Richard did, but it would be even better if ld didn't
> just toss out a section that has symbols associated with it, so that
> the address of the symbol at least matches up.

I'd rather you just fixed elflint's spurious warning.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 13:17     ` Alan Modra
@ 2012-01-30 14:14       ` Mark Wielaard
  2012-01-30 23:16         ` Alan Modra
  2012-01-30 20:00       ` Richard Henderson
  1 sibling, 1 reply; 17+ messages in thread
From: Mark Wielaard @ 2012-01-30 14:14 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, binutils

On Mon, 2012-01-30 at 23:47 +1030, Alan Modra wrote:
> The symbol is associated with some input file section.  This section
> gets mapped to some output file section, along with any symbols.  The
> output file section may not have the same name, for example, when a
> linker script maps your .tm_clone_table input section to .data output
> section.  I don't see why automatically mapping empty input sections
> to some other named section in a sensible manner should cause any more
> consternation than doing so via a linker script, except that in the
> past we've made separate output sections for empty input sections not
> mentioned in a linker script.  Current behaviour of not making empty
> orphan output sections has existed since 2005.

I might be missing something, but according to elf/gabi associating the
symbol, through the section index member, makes sure that the symbol's
value refers to a specific location within that section, so that
references to the symbol continue to point to the same location in the
program. So, it seems to me you cannot just remove a section which has a
symbol associated with it, because if you do you can no longer make sure
the value will refer to the right location. So two different symbols
associated with different sections would have values that pointed to
different locations in the program, which I don't see how you guarantee
with the current behavior. Maybe I am just missing the (sensible) manner
how ld does the automatic mapping of empty sections?

Note that gold has the behavior that I would expect, keep the sections
if there are symbols associated with it.

> > In that
> > case it seems natural to at least mark the symbol as being associated
> > with SHN_ABS as Richard did, but it would be even better if ld didn't
> > just toss out a section that has symbols associated with it, so that
> > the address of the symbol at least matches up.
> 
> I'd rather you just fixed elflint's spurious warning.

I am not sure the warning is spurious, it checks that a symbols value
refers to a specific location within a section using its section index
member, which seems to be how ELF intended the values to be used (except
of course where it says that there are other semantics for special
section indexes like abs, common and undef).

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  3:26 ` Alan Modra
  2012-01-30  8:28   ` Mark Wielaard
@ 2012-01-30 19:51   ` Richard Henderson
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2012-01-30 19:51 UTC (permalink / raw)
  To: binutils, Mark Wielaard

On 01/30/2012 02:26 PM, Alan Modra wrote:
> On Mon, Jan 30, 2012 at 11:56:49AM +1100, Richard Henderson wrote:
>> This PR is about elflint warning about symbols outside of the section
>> with which they are associated.
> 
> Huh.  So what's wrong with a symbol belonging to a section but defined
> outside of 0..section_size-1?  Particularly if it happens to be
> defined at section_size.  That's quite a natural way to mark out the
> bounds of an array.  I think this is just elflint being silly.
> 

It isn't at section_size.  The case mjw saw had the symbol at 
section_size + 8.  The test case I created has the symbol at
section_size + 2M (or whatever the x86_64 maxpagesize is).


r~

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 13:17     ` Alan Modra
  2012-01-30 14:14       ` Mark Wielaard
@ 2012-01-30 20:00       ` Richard Henderson
  2012-01-30 23:41         ` Alan Modra
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2012-01-30 20:00 UTC (permalink / raw)
  To: Mark Wielaard, binutils

On 01/31/2012 12:17 AM, Alan Modra wrote:
> The symbol is associated with some input file section.  This section
> gets mapped to some output file section, along with any symbols.  The
> output file section may not have the same name, for example, when a
> linker script maps your .tm_clone_table input section to .data output
> section. 

Misconception alert.

In this case there is no entry for .tm_clone_table in the linker script.
It gets placed after .data by normal orphan placement rules.
The final post-link size of .tm_clone_table is zero.
Zero sized output sections are removed by the linker.
The symbols are arbitrarily moved to .data simply because it is "nearby".

> I don't see why automatically mapping empty input sections
> to some other named section in a sensible manner should cause any more
> consternation than doing so via a linker script, except that in the
> past we've made separate output sections for empty input sections not
> mentioned in a linker script.  Current behaviour of not making empty
> orphan output sections has existed since 2005.

If we actually extended the range of .data to include the range of
.tm_clone_table, that would be fine too in my book.

But in the cases we're looking at, alignment constraints for 
.tm_clone_table have left padding between the end of .data and the
beginning of .tm_clone_table.  Which means that the symbols that
are associated with .data wind up outside of the section.


r~

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 14:14       ` Mark Wielaard
@ 2012-01-30 23:16         ` Alan Modra
  2012-01-31  9:58           ` Mark Wielaard
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-01-30 23:16 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Richard Henderson, binutils

On Mon, Jan 30, 2012 at 03:13:57PM +0100, Mark Wielaard wrote:
> On Mon, 2012-01-30 at 23:47 +1030, Alan Modra wrote:
> > The symbol is associated with some input file section.  This section
> > gets mapped to some output file section, along with any symbols.  The
> > output file section may not have the same name, for example, when a
> > linker script maps your .tm_clone_table input section to .data output
> > section.  I don't see why automatically mapping empty input sections
> > to some other named section in a sensible manner should cause any more
> > consternation than doing so via a linker script, except that in the
> > past we've made separate output sections for empty input sections not
> > mentioned in a linker script.  Current behaviour of not making empty
> > orphan output sections has existed since 2005.
> 
> I might be missing something, but according to elf/gabi associating the
> symbol, through the section index member, makes sure that the symbol's
> value refers to a specific location within that section, so that
> references to the symbol continue to point to the same location in the
> program. So, it seems to me you cannot just remove a section which has a
> symbol associated with it, because if you do you can no longer make sure
> the value will refer to the right location. So two different symbols
> associated with different sections would have values that pointed to
> different locations in the program, which I don't see how you guarantee
> with the current behavior. Maybe I am just missing the (sensible) manner
> how ld does the automatic mapping of empty sections?

The point is that *none* of the input sections are really kept.
Instead, input sections are *mapped* to output sections.  You'll see
that global symbols within the input sections have their section index
translated to the appropriate output section index, often a different
index to that in the input file.

Internally, GNU ld does create output sections and later remove them,
but this is just an implementation detail.  ld could, if it were made
a little more clever, not create these sections in the first place.

As far as symbols go, the ELF spec says that st_value for (non-common)
symbols in relocatable object files is an offset from the beginning of
a section.  There is nothing to say that values should be within the
section bounds.  In an executable or shared object the ELF spec
says of symbol values that "the section number is irrelevant", since
st_value is no longer a section offset but a virtual memory address.
(That doesn't mean to say the section is totally irrelevant, just that
it is irrelevant so far as the value is concerned.)  So that's why I
claim that the elflint warning is a little silly.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 20:00       ` Richard Henderson
@ 2012-01-30 23:41         ` Alan Modra
  2012-01-31  0:34           ` Richard Henderson
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Modra @ 2012-01-30 23:41 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Wielaard, binutils

On Tue, Jan 31, 2012 at 07:00:02AM +1100, Richard Henderson wrote:
> On 01/31/2012 12:17 AM, Alan Modra wrote:
> > The symbol is associated with some input file section.  This section
> > gets mapped to some output file section, along with any symbols.  The
> > output file section may not have the same name, for example, when a
> > linker script maps your .tm_clone_table input section to .data output
> > section. 
> 
> Misconception alert.
> 
> In this case there is no entry for .tm_clone_table in the linker script.

I didn't say there was. ;)

> It gets placed after .data by normal orphan placement rules.
> The final post-link size of .tm_clone_table is zero.
> Zero sized output sections are removed by the linker.
> The symbols are arbitrarily moved to .data simply because it is "nearby".

The choice of .data isn't arbitrary, but based on section flags as
well as adjacency.  The idea being to hopefully put symbols in a
section that belongs to the same segment (or overlay if such exists).
Using SHN_ABS loses this info.  That might matter for a hypothetical
system that dynamically relocates the data segment separately to the
code segment, something I looked at doing for powerpc64 at one stage.
It might also matter for analysis and optimisation tools like
FDPRpro.

Also, I dislike SHN_ABS symbols in executables and shared objects.
They're supposed to be absolute, yet glibc ld.so dynamically relocates
them, necessary for compatibility with ancient linkers and loaders..

> > I don't see why automatically mapping empty input sections
> > to some other named section in a sensible manner should cause any more
> > consternation than doing so via a linker script, except that in the
> > past we've made separate output sections for empty input sections not
> > mentioned in a linker script.  Current behaviour of not making empty
> > orphan output sections has existed since 2005.
> 
> If we actually extended the range of .data to include the range of
> .tm_clone_table, that would be fine too in my book.
> 
> But in the cases we're looking at, alignment constraints for 
> .tm_clone_table have left padding between the end of .data and the
> beginning of .tm_clone_table.  Which means that the symbols that
> are associated with .data wind up outside of the section.

I missed the alignment effect, but I still don't see any problem
whatsoever with a symbol "outside" of a section in an executable.  Is
there some reason other than an elflint warning that this matters?

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 23:41         ` Alan Modra
@ 2012-01-31  0:34           ` Richard Henderson
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2012-01-31  0:34 UTC (permalink / raw)
  To: Mark Wielaard, binutils

On 01/31/2012 10:40 AM, Alan Modra wrote:
> The choice of .data isn't arbitrary, but based on section flags as
> well as adjacency.  The idea being to hopefully put symbols in a
> section that belongs to the same segment (or overlay if such exists).

Well, it is, kinda.

Have a look at the minimal testcase in the patch.  There, we also
eliminate .data as empty, and the symbol gets assigned to the .text
section.  Different segment, very large "out of range" offset.

> Also, I dislike SHN_ABS symbols in executables and shared objects.
> They're supposed to be absolute, yet glibc ld.so dynamically relocates
> them, necessary for compatibility with ancient linkers and loaders..

Fair enough.

> Is there some reason other than an elflint warning that this matters?

I don't know of a real execution problem, no.  It's just... weird.


r~

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30 23:16         ` Alan Modra
@ 2012-01-31  9:58           ` Mark Wielaard
  2012-02-07 11:57             ` Alan Modra
  0 siblings, 1 reply; 17+ messages in thread
From: Mark Wielaard @ 2012-01-31  9:58 UTC (permalink / raw)
  To: Alan Modra; +Cc: Richard Henderson, binutils

On Tue, 2012-01-31 at 09:46 +1030, Alan Modra wrote:
> As far as symbols go, the ELF spec says that st_value for (non-common)
> symbols in relocatable object files is an offset from the beginning of
> a section.  There is nothing to say that values should be within the
> section bounds.

But it does also say "If a symbol's value refers to a specific location
within a section, its section index member, st_shndx, holds an index
into the section header table." and again when defining SHN_XINDEX it
says "It indicates that the symbol refers to a specific location within
a section...". So I would interpret that as saying the offset shouldn't
be outside the section.

IMHO it doesn't really make sense to have a symbol value outside a
section (except maybe to mark the end of it). At least I wouldn't know
what it represents then (given that you are dealing with section tables,
see below, and it isn't one of the special section index values).

>   In an executable or shared object the ELF spec
> says of symbol values that "the section number is irrelevant", since
> st_value is no longer a section offset but a virtual memory address.
> (That doesn't mean to say the section is totally irrelevant, just that
> it is irrelevant so far as the value is concerned.)  So that's why I
> claim that the elflint warning is a little silly.

I see. But that only really holds if you wouldn't output the section
table in the first place (which you don't have to for executables, which
only require a program table). If you do output a section table then it
is IMHO reasonable to assume that the section indexes used in the symbol
tables are consistent with the section table description.

For example some analyzers/debuggers use it to provide the user with the
section name that a symbol address falls within, that can be really
helpful if the section has a useful name (like in this
case .tm_clone_table would immediately give someone a hint of what
structure one is dealing with).

So IMHO as long as you are dealing with sections, and even for
executables ld does output sections, they should be consistent with the
usage of section indexes in symbol tables. That is what elflint checks,
that the internal structure of the elf file is consistent.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-31  9:58           ` Mark Wielaard
@ 2012-02-07 11:57             ` Alan Modra
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Modra @ 2012-02-07 11:57 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Richard Henderson, binutils

On Tue, Jan 31, 2012 at 10:57:43AM +0100, Mark Wielaard wrote:
> On Tue, 2012-01-31 at 09:46 +1030, Alan Modra wrote:
> > As far as symbols go, the ELF spec says that st_value for (non-common)
> > symbols in relocatable object files is an offset from the beginning of
> > a section.  There is nothing to say that values should be within the
> > section bounds.
> 
> But it does also say "If a symbol's value refers to a specific location
> within a section, its section index member, st_shndx, holds an index
> into the section header table." and again when defining SHN_XINDEX it
> says "It indicates that the symbol refers to a specific location within
> a section...". So I would interpret that as saying the offset shouldn't
> be outside the section.

You do have a point.  I'll not throw a fit and revert rth's patch
if he still wants to commit it.  :)

You can of course easily create symbols outside of a sections.  For
example:

 .data
 .global x
x:
 .byte 2, 3, 5, 7, 11, 13, 17, 19, 23
 .global y
y = x - '1'

But, yeah, I can't think of a really good reason why you would want to
do this.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  0:57 [RFC] PR ld/13621 Richard Henderson
  2012-01-30  3:26 ` Alan Modra
  2012-01-30  8:23 ` Mark Wielaard
@ 2012-03-01  3:58 ` Hans-Peter Nilsson
  2012-03-01  4:19   ` Alan Modra
  2012-05-05 14:42 ` H.J. Lu
  3 siblings, 1 reply; 17+ messages in thread
From: Hans-Peter Nilsson @ 2012-03-01  3:58 UTC (permalink / raw)
  To: rth; +Cc: binutils, mjw, amodra

> From: Richard Henderson <rth@redhat.com>
> Date: Mon, 30 Jan 2012 01:56:49 +0100


> 	  * testsuite/ld-elf/zerosize1.d: New file.
> 	  * testsuite/ld-elf/zerosize1.s: New file.

> diff --git a/ld/testsuite/ld-elf/zerosize1.d b/ld/testsuite/ld-elf/zerosize1.d
> new file mode 100644
> index 0000000..0da5171
> --- /dev/null
> +++ b/ld/testsuite/ld-elf/zerosize1.d
> @@ -0,0 +1,10 @@
> +#source: start.s
> +#source: zerosize1.s
> +#ld:
> +#readelf: -s
> +
> +# Check that xyzzy is not placed in the ABS section.
> +
> +#...
> + +[0-9]+: +[0-9a-f]+ +0 +NOTYPE +GLOBAL +DEFAULT +ABS xyzzy
> +#pass

Would it be right to change this as follows, making it pass for
MIPS targets too?  See the PR.  I don't know the MIPS ELF rules well
enough to call it obvious.

ld/testsuite:
	PR ld/13789
	* ld-elf/zerosize1.d: Allow xyzzy to be either OBJECT or NOTYPE.

diff --git a/ld/testsuite/ld-elf/zerosize1.d b/ld/testsuite/ld-elf/zerosize1.d
index ee69592..43187f0 100644
--- a/ld/testsuite/ld-elf/zerosize1.d
+++ b/ld/testsuite/ld-elf/zerosize1.d
@@ -6,5 +6,5 @@
 # Check that xyzzy is not placed in the .text section.
 
 #...
- +[0-9]+: +[0-9a-f]+ +0 +NOTYPE +GLOBAL +DEFAULT +ABS xyzzy
+ +[0-9]+: +[0-9a-f]+ +0 +(OBJECT|NOTYPE) +GLOBAL +DEFAULT +ABS xyzzy
 #pass

brgds, H-P

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-03-01  3:58 ` Hans-Peter Nilsson
@ 2012-03-01  4:19   ` Alan Modra
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Modra @ 2012-03-01  4:19 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: rth, binutils, mjw

On Thu, Mar 01, 2012 at 04:57:57AM +0100, Hans-Peter Nilsson wrote:
> 	PR ld/13789
> 	* ld-elf/zerosize1.d: Allow xyzzy to be either OBJECT or NOTYPE.

OK.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-01-30  0:57 [RFC] PR ld/13621 Richard Henderson
                   ` (2 preceding siblings ...)
  2012-03-01  3:58 ` Hans-Peter Nilsson
@ 2012-05-05 14:42 ` H.J. Lu
  2012-05-07  6:20   ` Alan Modra
  3 siblings, 1 reply; 17+ messages in thread
From: H.J. Lu @ 2012-05-05 14:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: binutils, Mark Wielaard, Alan Modra

[-- Attachment #1: Type: text/plain, Size: 1219 bytes --]

On Sun, Jan 29, 2012 at 4:56 PM, Richard Henderson <rth@redhat.com> wrote:
> This PR is about elflint warning about symbols outside of the section
> with which they are associated.
>
> The problem can be seen with just about any zero-sized section, nearly
> all of which are removed by strip_excluded_output_sections.  Then, the
> symbols are relocated to a "near by" section by _bfd_fix_excluded_sec_syms.
>
> This elflint warning will affect essentially all binaries built with
> gcc4.7 because of the crt symbol __TMC_END__, present in the normally
> empty section .tm_clone_table.  Thus it seems worthwhile to fix this.
>
> At first I attempted to SEC_KEEP any section with global symbols.  But
> this ran into all sorts of testsuite failures re .got.  In those cases
> the _GLOBAL_OFFSET_TABLE_ symbol still exists, which causes .got to
> be kept instead of removed as unused later in the link.
>
> A second best solution would seem to be to merely move the offending
> symbols to the ABS section rather than associate them with some random
> section.
>
> Thoughts?
>

This patch has been reverted.  Here is a different patch to keep
empty section if a definition uses it.


-- 
H.J.

[-- Attachment #2: binutils-pr13621.patch --]
[-- Type: application/octet-stream, Size: 3862 bytes --]

bfd/

	PR ld/13621
	* bfd-in.h (_bfd_keep_sections_with_sym): New.
	* bfd-in2.h: Regnerated.

	* linker.c (keep_sections_with_sym): New.
	(_bfd_keep_sections_with_sym): Likewise.

ld/

	PR ld/13621
	* ldlang.c (strip_excluded_output_sections): Call
	_bfd_keep_sections_with_sym if needed.

ld/testsuite/

	PR ld/13621
	* ld-elf/zerosize1.d: New file.
	* ld-elf/zerosize1.s: Likewise.
	* ld-elf/zerosize1.t: Likewise.

diff --git a/bfd/bfd-in.h b/bfd/bfd-in.h
index bff5f34..8fe20bb 100644
--- a/bfd/bfd-in.h
+++ b/bfd/bfd-in.h
@@ -703,6 +703,9 @@ _bfd_nearby_section (bfd *, struct bfd_section *, bfd_vma);
 extern void _bfd_fix_excluded_sec_syms
   (bfd *, struct bfd_link_info *);
 
+extern void _bfd_keep_sections_with_sym
+  (bfd *, struct bfd_link_info *);
+
 extern unsigned bfd_m68k_mach_to_features (int);
 
 extern int bfd_m68k_features_to_mach (unsigned);
diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index a66c74f..cc8b380 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -710,6 +710,9 @@ _bfd_nearby_section (bfd *, struct bfd_section *, bfd_vma);
 extern void _bfd_fix_excluded_sec_syms
   (bfd *, struct bfd_link_info *);
 
+extern void _bfd_keep_sections_with_sym
+  (bfd *, struct bfd_link_info *);
+
 extern unsigned bfd_m68k_mach_to_features (int);
 
 extern int bfd_m68k_features_to_mach (unsigned);
diff --git a/bfd/linker.c b/bfd/linker.c
index 3caec96..50eab5e 100644
--- a/bfd/linker.c
+++ b/bfd/linker.c
@@ -3235,6 +3235,36 @@ _bfd_fix_excluded_sec_syms (bfd *obfd, struct bfd_link_info *info)
   bfd_link_hash_traverse (info->hash, fix_syms, obfd);
 }
 
+/* Keep sections with symbols.  */
+
+static bfd_boolean
+keep_sections_with_sym (struct bfd_link_hash_entry *h, void *data)
+{
+  bfd *obfd = (bfd *) data;
+
+  if (h->type == bfd_link_hash_defined
+      || h->type == bfd_link_hash_defweak)
+    {
+      asection *s = h->u.def.section;
+      if (s != NULL
+	  && (s->flags
+	      & (SEC_LINKER_CREATED | SEC_EXCLUDE | SEC_KEEP)) == 0
+	  && s->output_section != NULL
+	  && (s->output_section->flags
+	      & (SEC_LINKER_CREATED | SEC_EXCLUDE | SEC_KEEP)) == 0
+	  && !bfd_section_removed_from_list (obfd, s->output_section))
+	s->output_section->flags |= SEC_KEEP;
+    }
+
+  return TRUE;
+}
+
+void
+_bfd_keep_sections_with_sym (bfd *obfd, struct bfd_link_info *info)
+{
+  bfd_link_hash_traverse (info->hash, keep_sections_with_sym, obfd);
+}
+
 /*
 FUNCTION
 	bfd_generic_define_common_symbol
diff --git a/ld/ldlang.c b/ld/ldlang.c
index beb84d3..3e79ca0 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -3858,6 +3858,9 @@ strip_excluded_output_sections (void)
       lang_reset_memory_regions ();
     }
 
+  if (!link_info.relocatable)
+    _bfd_keep_sections_with_sym (link_info.output_bfd, &link_info);
+
   for (os = &lang_output_section_statement.head->output_section_statement;
        os != NULL;
        os = os->next)
diff --git a/ld/testsuite/ld-elf/zerosize1.d b/ld/testsuite/ld-elf/zerosize1.d
new file mode 100644
index 0000000..36d1433
--- /dev/null
+++ b/ld/testsuite/ld-elf/zerosize1.d
@@ -0,0 +1,12 @@
+#source: start.s
+#source: zerosize1.s
+#ld: -T zerosize1.t
+#readelf: -sS --wide
+
+# Check that xyzzy is not placed in the .text section.
+
+#...
+  \[[ 2]+\] zerosize[ \t]+PROGBITS[ \t0-9a-f]+WA.*
+#...
+ +[0-9]+: +[0-9a-f]+ +0 +(OBJECT|NOTYPE) +GLOBAL +DEFAULT +2 xyzzy
+#pass
diff --git a/ld/testsuite/ld-elf/zerosize1.s b/ld/testsuite/ld-elf/zerosize1.s
new file mode 100644
index 0000000..4fc8198
--- /dev/null
+++ b/ld/testsuite/ld-elf/zerosize1.s
@@ -0,0 +1,3 @@
+	.section "zerosize","aw"
+	.globl	xyzzy
+xyzzy:
diff --git a/ld/testsuite/ld-elf/zerosize1.t b/ld/testsuite/ld-elf/zerosize1.t
new file mode 100644
index 0000000..0be32cd
--- /dev/null
+++ b/ld/testsuite/ld-elf/zerosize1.t
@@ -0,0 +1,5 @@
+SECTIONS {
+	.text : { *(.text) }
+	zerosize : { *(zerosize) }
+	/DISCARD/ : { *(.*) }
+}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC] PR ld/13621
  2012-05-05 14:42 ` H.J. Lu
@ 2012-05-07  6:20   ` Alan Modra
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Modra @ 2012-05-07  6:20 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Richard Henderson, binutils, Mark Wielaard

On Sat, May 05, 2012 at 07:42:40AM -0700, H.J. Lu wrote:
> On Sun, Jan 29, 2012 at 4:56 PM, Richard Henderson <rth@redhat.com> wrote:
> > This PR is about elflint warning about symbols outside of the section
> > with which they are associated.
> >
> > The problem can be seen with just about any zero-sized section, nearly
> > all of which are removed by strip_excluded_output_sections.  Then, the
> > symbols are relocated to a "near by" section by _bfd_fix_excluded_sec_syms.
> >
> > This elflint warning will affect essentially all binaries built with
> > gcc4.7 because of the crt symbol __TMC_END__, present in the normally
> > empty section .tm_clone_table.  Thus it seems worthwhile to fix this.
> >
> > At first I attempted to SEC_KEEP any section with global symbols.  But
> > this ran into all sorts of testsuite failures re .got.  In those cases
> > the _GLOBAL_OFFSET_TABLE_ symbol still exists, which causes .got to
> > be kept instead of removed as unused later in the link.
> >
> > A second best solution would seem to be to merely move the offending
> > symbols to the ABS section rather than associate them with some random
> > section.
> >
> > Thoughts?
> >
> 
> This patch has been reverted.  Here is a different patch to keep
> empty section if a definition uses it.

As Richard said, this results in testsuite failures.  If you or anyone
else wants this patch, you get to fix the testsuite..  Here's another
idea.  Put symbols that don't belong in any of the usual sections into
a non-alloc section that elflint can recognise and so silence the
dubious warning.

I'm not really enthusiastic about the following, but it does seem to
work.

Index: bfd/elf.c
===================================================================
RCS file: /cvs/src/src/bfd/elf.c,v
retrieving revision 1.552
diff -u -p -r1.552 elf.c
--- bfd/elf.c	24 Apr 2012 05:12:30 -0000	1.552
+++ bfd/elf.c	7 May 2012 04:38:55 -0000
@@ -2620,7 +2620,8 @@ elf_fake_sections (bfd *abfd, asection *
   else
     sh_type = bfd_elf_get_default_section_type (asect->flags);
 
-  if (this_hdr->sh_type == SHT_NULL)
+  if (this_hdr->sh_type == SHT_NULL
+      && asect->name[0] != 0)
     this_hdr->sh_type = sh_type;
   else if (this_hdr->sh_type == SHT_NOBITS
 	   && sh_type == SHT_PROGBITS
@@ -2640,15 +2641,6 @@ elf_fake_sections (bfd *abfd, asection *
     default:
       break;
 
-    case SHT_STRTAB:
-    case SHT_INIT_ARRAY:
-    case SHT_FINI_ARRAY:
-    case SHT_PREINIT_ARRAY:
-    case SHT_NOTE:
-    case SHT_NOBITS:
-    case SHT_PROGBITS:
-      break;
-
     case SHT_HASH:
       this_hdr->sh_entsize = bed->s->sizeof_hash_entry;
       break;
@@ -2951,7 +2943,8 @@ assign_section_numbers (bfd *abfd, struc
     {
       d = elf_section_data (sec);
 
-      if (d->this_hdr.sh_type != SHT_GROUP)
+      if (d->this_hdr.sh_type != SHT_GROUP
+	  && d->this_hdr.sh_type != SHT_NULL)
 	d->this_idx = section_number++;
       _bfd_elf_strtab_addref (elf_shstrtab (abfd), d->this_hdr.sh_name);
       if (d->rel.hdr)
@@ -2999,6 +2992,17 @@ assign_section_numbers (bfd *abfd, struc
   _bfd_elf_strtab_finalize (elf_shstrtab (abfd));
   t->shstrtab_hdr.sh_size = _bfd_elf_strtab_size (elf_shstrtab (abfd));
 
+  sec = bfd_get_section_by_name (abfd, "");
+  if (sec != NULL)
+    {
+      d = elf_section_data (sec);
+
+      /* This chooses .symtab for the section we've used for symbols
+	 that belong in no other section.  */
+      if (d->this_hdr.sh_type == SHT_NULL)
+	d->this_idx = section_number - 2;
+    }
+
   elf_numsections (abfd) = section_number;
   elf_elfheader (abfd)->e_shnum = section_number;
 
@@ -3039,7 +3043,8 @@ assign_section_numbers (bfd *abfd, struc
 
       d = elf_section_data (sec);
 
-      i_shdrp[d->this_idx] = &d->this_hdr;
+      if (d->this_hdr.sh_type != SHT_NULL)
+	i_shdrp[d->this_idx] = &d->this_hdr;
       if (d->rel.idx != 0)
 	i_shdrp[d->rel.idx] = d->rel.hdr;
       if (d->rela.idx != 0)
Index: bfd/linker.c
===================================================================
RCS file: /cvs/src/src/bfd/linker.c,v
retrieving revision 1.95
diff -u -p -r1.95 linker.c
--- bfd/linker.c	5 May 2012 04:51:15 -0000	1.95
+++ bfd/linker.c	7 May 2012 04:39:09 -0000
@@ -3161,10 +3160,7 @@ _bfd_nearby_section (bfd *obfd, asection
      as S would have been if it was kept.  */
   best = next;
   if (prev == NULL)
-    {
-      if (next == NULL)
-	best = bfd_abs_section_ptr;
-    }
+    ;
   else if (next == NULL)
     best = prev;
   else if (((prev->flags ^ next->flags)
@@ -3198,6 +3194,28 @@ _bfd_nearby_section (bfd *obfd, asection
 	best = prev;
     }
 
+  if (bfd_get_flavour (obfd) == bfd_target_elf_flavour)
+    {
+      if (best == NULL
+	  || addr < best->vma
+	  || addr > best->vma + best->size)
+	{
+	  /* For ELF, if symbols don't belong in one of the normal
+	     sections, don't put them in the absolute section as this
+	     will result in SHN_ABS symbols.  Instead, we'll make them
+	     belong to the .symtab section via this dummy section
+	     (which will not be output).  */
+	  best = bfd_get_section_by_name (obfd, "");
+	  if (best == NULL)
+	    {
+	      best = bfd_make_section_anyway_with_flags (obfd, "", SEC_READONLY);
+	      best->output_section = best;
+	    }
+	}
+    }
+  else if (best == NULL)
+    best = bfd_abs_section_ptr;
+
   return best;
 }
 

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-05-07  6:20 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-30  0:57 [RFC] PR ld/13621 Richard Henderson
2012-01-30  3:26 ` Alan Modra
2012-01-30  8:28   ` Mark Wielaard
2012-01-30 13:17     ` Alan Modra
2012-01-30 14:14       ` Mark Wielaard
2012-01-30 23:16         ` Alan Modra
2012-01-31  9:58           ` Mark Wielaard
2012-02-07 11:57             ` Alan Modra
2012-01-30 20:00       ` Richard Henderson
2012-01-30 23:41         ` Alan Modra
2012-01-31  0:34           ` Richard Henderson
2012-01-30 19:51   ` Richard Henderson
2012-01-30  8:23 ` Mark Wielaard
2012-03-01  3:58 ` Hans-Peter Nilsson
2012-03-01  4:19   ` Alan Modra
2012-05-05 14:42 ` H.J. Lu
2012-05-07  6:20   ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).