public inbox for binutils@sourceware.org
* [RFC PATCH] Don't put local or undefined symbols into .hash
@ 2006-06-26 13:02 Jakub Jelinek
  2006-06-26 13:10 ` Daniel Jacobowitz
  0 siblings, 1 reply; 2+ messages in thread
From: Jakub Jelinek @ 2006-06-26 13:02 UTC
  To: binutils

Hi!

It seems that in an average shared library roughly 25% of the symbols
in the .dynsym section are undefined.
Currently, the linker puts all dynamic symbols (except the special ones like
section symbols or symbol 0) into the .hash section, but the dynamic linker
will always skip STB_LOCAL symbols or undefined symbols (except those with
st_value != 0) - it is always looking for symbol definitions.
Is there any reason why we put even the undefined or local symbols into
.hash?  It just makes the .hash buckets array larger and/or the hash chains
longer than necessary.
Or is there something that relies on all symbols being there?
So far I have noticed just readelf -Ds (but in that case it is questionable
if readelf -Ds would need to change or stay as is).
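
To make the idea concrete, here is a minimal standalone sketch (not the
BFD code) of filling SysV-style bucket and chain arrays while leaving out
local and undefined symbols.  The hash function is the standard System V
ELF hash; struct toy_sym and toy_build_hash are hypothetical names used
only for illustration, and the sketch leaves out every undefined symbol,
including those with st_value != 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard System V ELF hash.  The top 4 bits are cleared on every
   iteration, so a value of all ones can never be returned.  */
static uint32_t sysv_elf_hash(const char *name)
{
    uint32_t h = 0, g;

    while (*name != '\0') {
        h = (h << 4) + (unsigned char) *name++;
        g = h & 0xf0000000u;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Hypothetical, simplified stand-in for a .dynsym entry.  */
struct toy_sym {
    const char *name;
    int defined;    /* nonzero if the symbol has a definition */
    int local;      /* nonzero if the symbol has local binding */
};

/* Fill bucket[] (nbuckets entries) and chain[] (nsyms entries).
   Index 0 is the reserved null symbol and doubles as the end-of-chain
   marker.  Returns the number of symbols actually hashed.  */
static size_t toy_build_hash(const struct toy_sym *syms, size_t nsyms,
                             uint32_t *bucket, size_t nbuckets,
                             uint32_t *chain)
{
    size_t i, hashed = 0;

    assert(nbuckets != 0);
    for (i = 0; i < nbuckets; i++)
        bucket[i] = 0;
    for (i = 0; i < nsyms; i++)
        chain[i] = 0;

    for (i = 1; i < nsyms; i++) {
        uint32_t b;

        /* A lookup only wants definitions, so skip everything else.  */
        if (syms[i].local || !syms[i].defined)
            continue;

        /* Push the symbol onto the front of its bucket's chain.  */
        b = sysv_elf_hash(syms[i].name) % nbuckets;
        chain[i] = bucket[b];
        bucket[b] = (uint32_t) i;
        hashed++;
    }
    return hashed;
}
```

Skipped symbols keep their slot in chain[] (it stays 0), so the chain
array is still indexed by dynamic symbol number; only the chains that a
lookup walks get shorter.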

2006-06-26  Jakub Jelinek  <jakub@redhat.com>

	* elflink.c (elf_collect_hash_codes): Set u.elf_hash_value to ~0
	for local or undefined symbols, and don't include them in the
	hashcodes array.
	(elf_link_output_extsym): Don't put symbols with u.elf_hash_value
	~0 into the .hash section.

--- bfd/elflink.c.jj	2006-06-20 18:34:53.000000000 +0200
+++ bfd/elflink.c	2006-06-26 14:04:49.000000000 +0200
@@ -4785,6 +4785,17 @@ elf_collect_hash_codes (struct elf_link_
   if (h->dynindx == -1)
     return TRUE;
 
+  /* Also ignore local symbols and undefined symbols.  */
+  if (h->forced_local
+      || h->root.type == bfd_link_hash_undefined
+      || h->root.type == bfd_link_hash_undefweak)
+    {
+      /* bfd_elf_hash values never have the topmost 4 bits set,
+	 so ~0 doesn't collide with any valid hash value.  */
+      h->u.elf_hash_value = ~(unsigned long) 0;
+      return TRUE;
+    }
+
   name = h->root.root.string;
   p = strchr (name, ELF_VER_CHR);
   if (p != NULL)
@@ -6678,16 +6689,24 @@ elf_link_output_extsym (struct elf_link_
       bed->s->swap_symbol_out (finfo->output_bfd, &sym, esym, 0);
 
       bucketcount = elf_hash_table (finfo->info)->bucketcount;
-      bucket = h->u.elf_hash_value % bucketcount;
-      hash_entry_size
-	= elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
-      bucketpos = ((bfd_byte *) finfo->hash_sec->contents
-		   + (bucket + 2) * hash_entry_size);
-      chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
-	       ((bfd_byte *) finfo->hash_sec->contents
-		+ (bucketcount + 2 + h->dynindx) * hash_entry_size));
+      if (h->u.elf_hash_value == ~(unsigned long) 0)
+	BFD_ASSERT (h->forced_local
+		    || h->root.type == bfd_link_hash_undefined
+		    || h->root.type == bfd_link_hash_undefweak);
+      else
+	{
+	  bucket = h->u.elf_hash_value % bucketcount;
+	  hash_entry_size
+	    = elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
+	  bucketpos = ((bfd_byte *) finfo->hash_sec->contents
+		      + (bucket + 2) * hash_entry_size);
+	  chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx,
+		   bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
+		   ((bfd_byte *) finfo->hash_sec->contents
+		    + (bucketcount + 2 + h->dynindx) * hash_entry_size));
+	}
 
       if (finfo->symver_sec != NULL && finfo->symver_sec->contents != NULL)
 	{

	Jakub


* Re: [RFC PATCH] Don't put local or undefined symbols into .hash
  2006-06-26 13:02 [RFC PATCH] Don't put local or undefined symbols into .hash Jakub Jelinek
@ 2006-06-26 13:10 ` Daniel Jacobowitz
  0 siblings, 0 replies; 2+ messages in thread
From: Daniel Jacobowitz @ 2006-06-26 13:10 UTC
  To: Jakub Jelinek; +Cc: binutils

On Mon, Jun 26, 2006 at 02:32:38PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> It seems that in an average shared library roughly 25% of the symbols
> in the .dynsym section are undefined.
> Currently, the linker puts all dynamic symbols (except the special ones like
> section symbols or symbol 0) into the .hash section, but the dynamic linker
> will always skip STB_LOCAL symbols or undefined symbols (except those with
> st_value != 0) - it is always looking for symbol definitions.
> Is there any reason why we put even the undefined or local symbols into
> .hash?  It just makes the .hash buckets array larger and/or the hash chains
> longer than necessary.
> Or is there something that relies on all symbols being there?
> So far I have noticed just readelf -Ds (but in that case it is questionable
> if readelf -Ds would need to change or stay as is).

Regardless of whether any other software would break, please don't let
readelf -Ds just ignore the undefined symbols.  If you want to leave
them out of the hash table, maybe readelf -Ds should make a second pass
to dump out things from .dynsym which weren't in the hash.

I often use this to work out a library's dependencies.
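
A rough sketch of what such a second pass could look like, as standalone
C rather than actual readelf code: walk every bucket and chain, mark the
symbol indices that are reachable, then report the rest.  The function
name toy_unhashed_symbols and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk a SysV-style hash table and collect the dynamic symbol indices
   (1 .. nsyms-1) that no bucket or chain reaches.  seen[] is scratch
   space of nsyms bytes; missing[] must have room for nsyms entries.
   Returns the number of indices stored in missing[].  */
static size_t toy_unhashed_symbols(const uint32_t *bucket, size_t nbuckets,
                                   const uint32_t *chain, size_t nsyms,
                                   unsigned char *seen, uint32_t *missing)
{
    size_t b, i, n = 0;

    assert(bucket != NULL && chain != NULL);
    assert(seen != NULL && missing != NULL);
    memset(seen, 0, nsyms);

    /* First pass: mark everything reachable through the hash table.
       The bounds and seen[] checks stop the walk on a corrupt or
       cyclic chain.  */
    for (b = 0; b < nbuckets; b++) {
        uint32_t idx = bucket[b];

        while (idx != 0 && idx < nsyms && !seen[idx]) {
            seen[idx] = 1;
            idx = chain[idx];
        }
    }

    /* Second pass: whatever was not marked is only in .dynsym.  */
    for (i = 1; i < nsyms; i++)
        if (!seen[i])
            missing[n++] = (uint32_t) i;

    return n;
}
```

With local and undefined symbols left out of the hash table, the indices
returned in missing[] would be exactly those symbols, so a dump could
print them after the hashed ones.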


-- 
Daniel Jacobowitz
CodeSourcery
