From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 03 Jul 2006 21:05:00 -0000
From: Jakub Jelinek <jakub@redhat.com>
To: binutils@sources.redhat.com
Cc: libc-alpha@sources.redhat.com, Ulrich Drepper <drepper@redhat.com>,         Michael Meeks <michael.meeks@novell.com>
Subject: [PATCH] DT_GNU_HASH: reducing working set ... (take 2)
Message-ID: <20060703210446.GF3823@sunsite.mff.cuni.cz>
Reply-To: Jakub Jelinek <jakub@redhat.com>
References: <20060628170900.GX3823@sunsite.mff.cuni.cz> <1151605626.20187.72.camel@t60p.site> <20060629193926.GZ3823@sunsite.mff.cuni.cz> <1151939720.17892.98.camel@t60p.site> <20060703155925.GE3823@sunsite.mff.cuni.cz> <1151950882.17892.146.camel@t60p.site>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1151950882.17892.146.camel@t60p.site>
User-Agent: Mutt/1.4.1i
X-IsSubscribed: yes
Mailing-List: contact binutils-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:binutils-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/binutils/>
List-Post: <mailto:binutils@sourceware.org>
List-Help: <mailto:binutils-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: binutils-owner@sourceware.org
X-SW-Source: 2006-07/txt/msg00023.txt.bz2

On Mon, Jul 03, 2006 at 07:21:22PM +0100, Michael Meeks wrote:
> > I will rework the patch now.

Here is the updated patch.  Ulrich has the corresponding glibc bits, and
it has passed all the testing we have done on it so far.

There is still room for improvement in the chain reordering, to minimize
the number of chains crossing cacheline boundaries, but that optimization
doesn't change anything in the section format, so I think it can be left
for later.
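
As a rough illustration of the section format the patch below emits, here
is a minimal standalone sketch.  It is not code from the patch or from
glibc: the names build_table, lookup, tab and sorted are hypothetical, the
table is a plain in-memory array of 32-bit words rather than a real
.gnu.hash section, and the lookup loop is only an assumption about how a
consumer could walk the chains the linker writes.  The layout assumed is:
word 0 is the bucket count, word 1 is the index of the first hashed
dynamic symbol, then one word per bucket (0xffffffff if empty, otherwise
an offset into the chain array), then one chain word per hashed symbol
holding its hash value with the low bit reused as an end-of-chain marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same computation as bfd_elf_gnu_hash in the patch: start at 5381,
   then h = h * 33 + c for each byte, truncated to 32 bits.  */
static uint32_t
gnu_hash (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;

  while (*p != '\0')
    h = (h << 5) + h + *p++;
  return h;
}

/* Hypothetical builder.  TAB needs room for 2 + NBUCKETS + NSYMS words.
   SORTED stands in for the reordered tail of .dynsym: sorted[k] is the
   name of dynamic symbol SYMINDX + k.  The quadratic loop is only for
   brevity; the patch counts bucket usage first instead.  */
static void
build_table (const char **names, size_t nsyms, uint32_t nbuckets,
	     uint32_t symindx, uint32_t *tab, const char **sorted)
{
  uint32_t *buckets = tab + 2;
  uint32_t *chain = buckets + nbuckets;
  size_t pos = 0;

  assert (nbuckets > 0);
  tab[0] = nbuckets;
  tab[1] = symindx;
  for (uint32_t b = 0; b < nbuckets; ++b)
    {
      size_t first = pos;

      for (size_t i = 0; i < nsyms; ++i)
	if (gnu_hash (names[i]) % nbuckets == b)
	  {
	    sorted[pos] = names[i];
	    /* Store the hash with its low bit cleared.  */
	    chain[pos++] = gnu_hash (names[i]) & ~(uint32_t) 1;
	  }
      if (pos == first)
	buckets[b] = 0xffffffff;	/* Empty bucket.  */
      else
	{
	  buckets[b] = (uint32_t) first;
	  chain[pos - 1] |= 1;		/* Last element ends the chain.  */
	}
    }
  assert (pos == nsyms);
}

/* Hypothetical lookup.  Returns the dynamic symbol index of NAME, or 0
   if it is not in the table.  */
static uint32_t
lookup (const uint32_t *tab, const char **sorted, const char *name)
{
  uint32_t nbuckets = tab[0], symindx = tab[1];
  const uint32_t *buckets = tab + 2;
  const uint32_t *chain = buckets + nbuckets;
  uint32_t h = gnu_hash (name);
  uint32_t off = buckets[h % nbuckets];

  if (off == 0xffffffff)
    return 0;
  for (;; ++off)
    {
      /* Compare all bits except the terminator bit before touching
	 the symbol name at all.  */
      if (((chain[off] ^ h) >> 1) == 0 && strcmp (sorted[off], name) == 0)
	return symindx + off;
      if (chain[off] & 1)
	return 0;
    }
}
```

Because the symbols of one bucket sit next to each other both in the chain
array and in .dynsym, a lookup in this sketch reads one bucket word and
then a contiguous run of chain words, and only compares names whose stored
hash already matches.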

Ok for trunk?

2006-07-03  Jakub Jelinek  <jakub@redhat.com>

include/
	* bfdlink.h (struct bfd_link_info): Add emit_hash and
	emit_gnu_hash bitfields.
include/elf/
	* common.h (SHT_GNU_HASH, DT_GNU_HASH): Define.
ld/
	* scripttempl/elf.sc: Add .gnu.hash section.
	* emultempl/elf32.em (OPTION_HASH_STYLE): Define.
	(gld${EMULATION_NAME}_add_options): Register --hash-style option.
	(gld${EMULATION_NAME}_handle_option): Handle it.
	(gld${EMULATION_NAME}_list_options): Document it.
	* ldmain.c (main): Initialize emit_hash and emit_gnu_hash.
	* ld.texinfo: Document --hash-style option.
bfd/
	* elf.c (_bfd_elf_print_private_bfd_data): Handle DT_GNU_HASH.
	(bfd_section_from_shdr, elf_fake_sections, assign_section_numbers):
	Handle SHT_GNU_HASH.
	(special_sections_g): Include .gnu.hash section.
	(bfd_elf_gnu_hash): New function.
	* elf-bfd.h (bfd_elf_gnu_hash): New prototype.
	* elflink.c (_bfd_elf_link_create_dynamic_sections): Create .hash
	only if info->emit_hash, create .gnu.hash section if
	info->emit_gnu_hash.
	(struct collect_gnu_hash_codes): New type.
	(elf_collect_gnu_hash_codes, elf_renumber_gnu_hash_syms): New
	functions.
	(compute_bucket_count): Don't compute HASHCODES array, instead add
	that and NSYMS as arguments.  Use bed->s->sizeof_hash_entry
	instead of bed->s->arch_size / 8.  Fix .hash size estimation.
	When not optimizing, use the number of hashed symbols rather than
	dynsymcount.
	(bfd_elf_size_dynamic_sections): Only add DT_HASH if info->emit_hash,
	and add DT_GNU_HASH if info->emit_gnu_hash.
	(bfd_elf_size_dynsym_hash_dynstr): Size .hash only if info->emit_hash,
	adjust compute_bucket_count caller.  Create and populate .gnu.hash
	section if info->emit_gnu_hash.
	(elf_link_output_extsym): Only populate .hash section if
	finfo->hash_sec != NULL.
	(bfd_elf_final_link): Adjust assertion.  Handle DT_GNU_HASH.
binutils/
	* readelf.c (get_dynamic_type): Handle DT_GNU_HASH.
	(get_section_type_name): Handle SHT_GNU_HASH.
	(dynamic_info_DT_GNU_HASH): New variable.
	(process_dynamic_section): Handle DT_GNU_HASH.
	(process_symbol_table): Print also DT_GNU_HASH histogram.
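
For the readelf histogram entry above, the per-bucket chain length can be
recovered from the terminator bits alone.  The following is a minimal
sketch of that idea, not the readelf code itself: chain_length and
bucket_lengths are hypothetical helper names, and the buckets and chain
arrays are assumed to be already loaded into memory as host-order 32-bit
words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the chain that starts at chain offset OFF: walk forward
   until an entry with the low (terminator) bit set is reached.  */
static size_t
chain_length (const uint32_t *chain, uint32_t off)
{
  size_t len = 1;

  while ((chain[off++] & 1) == 0)
    ++len;
  return len;
}

/* Fill LENGTHS[b] with the chain length of bucket b (0 for an empty
   bucket, marked by 0xffffffff) and return the longest length seen.  */
static size_t
bucket_lengths (const uint32_t *buckets, uint32_t nbuckets,
		const uint32_t *chain, size_t *lengths)
{
  size_t maxlength = 0;

  assert (buckets != NULL && chain != NULL && lengths != NULL);
  for (uint32_t b = 0; b < nbuckets; ++b)
    {
      lengths[b] = (buckets[b] == 0xffffffff
		    ? 0 : chain_length (chain, buckets[b]));
      if (lengths[b] > maxlength)
	maxlength = lengths[b];
    }
  return maxlength;
}
```

A histogram then only needs to count how many buckets have each length
from 0 up to the returned maximum.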

--- ld/scripttempl/elf.sc.jj	2006-01-01 01:02:16.000000000 +0100
+++ ld/scripttempl/elf.sc	2006-06-22 11:11:53.000000000 +0200
@@ -260,6 +260,7 @@ SECTIONS
   ${INITIAL_READONLY_SECTIONS}
   ${TEXT_DYNAMIC+${DYNAMIC}}
   .hash         ${RELOCATING-0} : { *(.hash) }
+  .gnu.hash     ${RELOCATING-0} : { *(.gnu.hash) }
   .dynsym       ${RELOCATING-0} : { *(.dynsym) }
   .dynstr       ${RELOCATING-0} : { *(.dynstr) }
   .gnu.version  ${RELOCATING-0} : { *(.gnu.version) }
--- ld/ldmain.c.jj	2006-06-01 15:50:33.000000000 +0200
+++ ld/ldmain.c	2006-06-22 11:21:11.000000000 +0200
@@ -304,6 +304,8 @@ main (int argc, char **argv)
   link_info.create_object_symbols_section = NULL;
   link_info.gc_sym_list = NULL;
   link_info.base_file = NULL;
+  link_info.emit_hash = TRUE;
+  link_info.emit_gnu_hash = FALSE;
   /* SVR4 linkers seem to set DT_INIT and DT_FINI based on magic _init
      and _fini symbols.  We are compatible.  */
   link_info.init_function = "_init";
--- ld/ld.texinfo.jj	2006-06-15 14:31:06.000000000 +0200
+++ ld/ld.texinfo	2006-06-22 14:03:21.000000000 +0200
@@ -1883,6 +1883,14 @@ time it takes the linker to perform its 
 increasing the linker's memory requirements.  Similarly reducing this
 value can reduce the memory requirements at the expense of speed.
 
+@kindex --hash-style=@var{style}
+@item --hash-style=@var{style}
+Set the type of linker's hash table(s).  @var{style} can be either
+@code{sysv} for classic ELF @code{.hash} section, @code{gnu} for
+new style GNU @code{.gnu.hash} section or @code{both} for both
+the classic ELF @code{.hash} and new style GNU @code{.gnu.hash}
+hash tables.  The default is @code{sysv}.
+
 @kindex --reduce-memory-overheads
 @item --reduce-memory-overheads
 This option reduces memory requirements at ld runtime, at the expense of
--- ld/emultempl/elf32.em.jj	2006-06-20 18:34:24.000000000 +0200
+++ ld/emultempl/elf32.em	2006-06-22 14:39:25.000000000 +0200
@@ -1719,6 +1719,7 @@ cat >>e${EMULATION_NAME}.c <<EOF
 #define OPTION_GROUP			(OPTION_ENABLE_NEW_DTAGS + 1)
 #define OPTION_EH_FRAME_HDR		(OPTION_GROUP + 1)
 #define OPTION_EXCLUDE_LIBS		(OPTION_EH_FRAME_HDR + 1)
+#define OPTION_HASH_STYLE		(OPTION_EXCLUDE_LIBS + 1)
 
 static void
 gld${EMULATION_NAME}_add_options
@@ -1735,6 +1736,7 @@ cat >>e${EMULATION_NAME}.c <<EOF
     {"enable-new-dtags", no_argument, NULL, OPTION_ENABLE_NEW_DTAGS},
     {"eh-frame-hdr", no_argument, NULL, OPTION_EH_FRAME_HDR},
     {"exclude-libs", required_argument, NULL, OPTION_EXCLUDE_LIBS},
+    {"hash-style", required_argument, NULL, OPTION_HASH_STYLE},
     {"Bgroup", no_argument, NULL, OPTION_GROUP},
 EOF
 fi
@@ -1791,6 +1793,22 @@ cat >>e${EMULATION_NAME}.c <<EOF
       add_excluded_libs (optarg);
       break;
 
+    case OPTION_HASH_STYLE:
+      link_info.emit_hash = FALSE;
+      link_info.emit_gnu_hash = FALSE;
+      if (strcmp (optarg, "sysv") == 0)
+	link_info.emit_hash = TRUE;
+      else if (strcmp (optarg, "gnu") == 0)
+	link_info.emit_gnu_hash = TRUE;
+      else if (strcmp (optarg, "both") == 0)
+	{
+	  link_info.emit_hash = TRUE;
+	  link_info.emit_gnu_hash = TRUE;
+	}
+      else
+	einfo (_("%P%F: invalid hash style \`%s'\n"), optarg);
+      break;
+
     case 'z':
       if (strcmp (optarg, "initfirst") == 0)
 	link_info.flags_1 |= (bfd_vma) DF_1_INITFIRST;
@@ -1894,6 +1912,7 @@ cat >>e${EMULATION_NAME}.c <<EOF
   fprintf (file, _("  --disable-new-dtags\tDisable new dynamic tags\n"));
   fprintf (file, _("  --enable-new-dtags\tEnable new dynamic tags\n"));
   fprintf (file, _("  --eh-frame-hdr\tCreate .eh_frame_hdr section\n"));
+  fprintf (file, _("  --hash-style=STYLE\tSet hash style to sysv, gnu or both\n"));
   fprintf (file, _("  -z combreloc\t\tMerge dynamic relocs into one section and sort\n"));
   fprintf (file, _("  -z defs\t\tReport unresolved symbols in object files.\n"));
   fprintf (file, _("  -z execstack\t\tMark executable as requiring executable stack\n"));
--- bfd/elf-bfd.h.jj	2006-06-20 18:34:24.000000000 +0200
+++ bfd/elf-bfd.h	2006-06-26 16:17:53.000000000 +0200
@@ -1481,6 +1481,8 @@ extern bfd_vma _bfd_elf_section_offset
 
 extern unsigned long bfd_elf_hash
   (const char *);
+extern unsigned long bfd_elf_gnu_hash
+  (const char *);
 
 extern bfd_reloc_status_type bfd_elf_generic_reloc
   (bfd *, arelent *, asymbol *, void *, asection *, bfd *, char **);
--- bfd/elf.c.jj	2006-06-20 18:34:24.000000000 +0200
+++ bfd/elf.c	2006-06-26 16:17:28.000000000 +0200
@@ -206,6 +206,21 @@ bfd_elf_hash (const char *namearg)
   return h & 0xffffffff;
 }
 
+/* DT_GNU_HASH hash function.  Do not change this function; you will
+   cause invalid hash tables to be generated.  */
+
+unsigned long
+bfd_elf_gnu_hash (const char *namearg)
+{
+  const unsigned char *name = (const unsigned char *) namearg;
+  unsigned long h = 5381;
+  unsigned char ch;
+
+  while ((ch = *name++) != '\0')
+    h = (h << 5) + h + ch;
+  return h & 0xffffffff;
+}
+
 bfd_boolean
 bfd_elf_mkobject (bfd *abfd)
 {
@@ -1239,6 +1254,7 @@ _bfd_elf_print_private_bfd_data (bfd *ab
 	    case DT_AUXILIARY: name = "AUXILIARY"; stringp = TRUE; break;
 	    case DT_USED: name = "USED"; break;
 	    case DT_FILTER: name = "FILTER"; stringp = TRUE; break;
+	    case DT_GNU_HASH: name = "GNU_HASH"; break;
 	    }
 
 	  fprintf (f, "  %-11s ", name);
@@ -1823,6 +1839,7 @@ bfd_section_from_shdr (bfd *abfd, unsign
     case SHT_FINI_ARRAY:	/* .fini_array section.  */
     case SHT_PREINIT_ARRAY:	/* .preinit_array section.  */
     case SHT_GNU_LIBLIST:	/* .gnu.liblist section.  */
+    case SHT_GNU_HASH:		/* .gnu.hash section.  */
       return _bfd_elf_make_section_from_shdr (abfd, hdr, name, shindex);
 
     case SHT_DYNAMIC:	/* Dynamic linking information.  */
@@ -2295,6 +2312,7 @@ static const struct bfd_elf_special_sect
   { ".gnu.version_r", 14,  0, SHT_GNU_verneed, 0 },
   { ".gnu.liblist",   12,  0, SHT_GNU_LIBLIST, SHF_ALLOC },
   { ".gnu.conflict",  13,  0, SHT_RELA,     SHF_ALLOC },
+  { ".gnu.hash",       9,  0, SHT_GNU_HASH, SHF_ALLOC },
   { NULL,              0,  0, 0,            0 }
 };
 
@@ -2811,6 +2829,10 @@ elf_fake_sections (bfd *abfd, asection *
     case SHT_GROUP:
       this_hdr->sh_entsize = 4;
       break;
+
+    case SHT_GNU_HASH:
+      this_hdr->sh_entsize = 4;
+      break;
     }
 
   if ((asect->flags & SEC_ALLOC) != 0)
@@ -3256,6 +3278,7 @@ assign_section_numbers (bfd *abfd, struc
 	  break;
 
 	case SHT_HASH:
+	case SHT_GNU_HASH:
 	case SHT_GNU_versym:
 	  /* sh_link is the section header index of the symbol table
 	     this hash table or version table is for.  */
--- bfd/elflink.c.jj	2006-06-20 18:34:53.000000000 +0200
+++ bfd/elflink.c	2006-07-03 18:26:47.000000000 +0200
@@ -240,12 +240,24 @@ _bfd_elf_link_create_dynamic_sections (b
   if (!_bfd_elf_define_linkage_sym (abfd, info, s, "_DYNAMIC"))
     return FALSE;
 
-  s = bfd_make_section_with_flags (abfd, ".hash",
-				   flags | SEC_READONLY);
-  if (s == NULL
-      || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
-    return FALSE;
-  elf_section_data (s)->this_hdr.sh_entsize = bed->s->sizeof_hash_entry;
+  if (info->emit_hash)
+    {
+      s = bfd_make_section_with_flags (abfd, ".hash", flags | SEC_READONLY);
+      if (s == NULL
+	  || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
+	return FALSE;
+      elf_section_data (s)->this_hdr.sh_entsize = bed->s->sizeof_hash_entry;
+    }
+
+  if (info->emit_gnu_hash)
+    {
+      s = bfd_make_section_with_flags (abfd, ".gnu.hash",
+				       flags | SEC_READONLY);
+      if (s == NULL
+	  || ! bfd_set_section_alignment (abfd, s, bed->s->log_file_align))
+	return FALSE;
+      elf_section_data (s)->this_hdr.sh_entsize = 4;
+    }
 
   /* Let the backend create the rest of the sections.  This lets the
      backend set the right flags.  The backend will normally create
@@ -4811,6 +4823,118 @@ elf_collect_hash_codes (struct elf_link_
   return TRUE;
 }
 
+struct collect_gnu_hash_codes
+{
+  bfd *output_bfd;
+  unsigned long int nsyms;
+  unsigned long int *hashcodes;
+  unsigned long int *hashval;
+  unsigned long int *indx;
+  unsigned long int *counts;
+  bfd_byte *contents;
+  long int min_dynindx;
+  unsigned long int bucketcount;
+  unsigned long int symindx;
+  long int local_indx;
+};
+
+/* This function will be called through elf_link_hash_traverse to store
+   all hash values of the exported symbols in an array.  */
+
+static bfd_boolean
+elf_collect_gnu_hash_codes (struct elf_link_hash_entry *h, void *data)
+{
+  struct collect_gnu_hash_codes *s = data;
+  const char *name;
+  char *p;
+  unsigned long ha;
+  char *alc = NULL;
+
+  if (h->root.type == bfd_link_hash_warning)
+    h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+  /* Ignore indirect symbols.  These are added by the versioning code.  */
+  if (h->dynindx == -1)
+    return TRUE;
+
+  /* Ignore also local symbols and undefined symbols.  */
+  if (h->forced_local
+      || h->root.type == bfd_link_hash_undefined
+      || h->root.type == bfd_link_hash_undefweak
+      || ((h->root.type == bfd_link_hash_defined
+	   || h->root.type == bfd_link_hash_defweak)
+	  && h->root.u.def.section->output_section == NULL))
+    return TRUE;
+
+  name = h->root.root.string;
+  p = strchr (name, ELF_VER_CHR);
+  if (p != NULL)
+    {
+      alc = bfd_malloc (p - name + 1);
+      memcpy (alc, name, p - name);
+      alc[p - name] = '\0';
+      name = alc;
+    }
+
+  /* Compute the hash value.  */
+  ha = bfd_elf_gnu_hash (name);
+
+  /* Store the found hash value in the array for compute_bucket_count,
+     and also for .dynsym reordering purposes.  */
+  s->hashcodes[s->nsyms] = ha;
+  s->hashval[h->dynindx] = ha;
+  ++s->nsyms;
+  if (s->min_dynindx < 0 || s->min_dynindx > h->dynindx)
+    s->min_dynindx = h->dynindx;
+
+  if (alc != NULL)
+    free (alc);
+
+  return TRUE;
+}
+
+/* This function will be called through elf_link_hash_traverse to do
+   final dynamic symbol renumbering.  */
+
+static bfd_boolean
+elf_renumber_gnu_hash_syms (struct elf_link_hash_entry *h, void *data)
+{
+  struct collect_gnu_hash_codes *s = data;
+  unsigned long int bucket;
+  unsigned long int val;
+
+  if (h->root.type == bfd_link_hash_warning)
+    h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+  /* Ignore indirect symbols.  */
+  if (h->dynindx == -1)
+    return TRUE;
+
+  /* Ignore also local symbols and undefined symbols.  */
+  if (h->forced_local
+      || h->root.type == bfd_link_hash_undefined
+      || h->root.type == bfd_link_hash_undefweak
+      || ((h->root.type == bfd_link_hash_defined
+	   || h->root.type == bfd_link_hash_defweak)
+	  && h->root.u.def.section->output_section == NULL))
+    {
+      if (h->dynindx >= s->min_dynindx)
+	h->dynindx = s->local_indx++;
+      return TRUE;
+    }
+
+  bucket = s->hashval[h->dynindx] % s->bucketcount;
+  val = s->hashval[h->dynindx] & ~(unsigned long int) 1;
+  if (s->counts[bucket] == 1)
+    /* Last element terminates the chain.  */
+    val |= 1;
+  bfd_put_32 (s->output_bfd, val,
+	      s->contents + (s->indx[bucket] - s->symindx) * 4);
+  --s->counts[bucket];
+  h->dynindx = s->indx[bucket]++;
+  return TRUE;
+}
+
 /* Array used to determine the number of hash table buckets to use
    based on the number of symbols there are.  If there are fewer than
    3 symbols we use 1 bucket, fewer than 17 symbols we use 3 buckets,
@@ -4832,42 +4956,26 @@ static const size_t elf_buckets[] =
    Therefore the result is always a good payoff between few collisions
    (= short chain lengths) and table size.  */
 static size_t
-compute_bucket_count (struct bfd_link_info *info)
+compute_bucket_count (struct bfd_link_info *info, unsigned long int *hashcodes,
+		      unsigned long int nsyms)
 {
   size_t dynsymcount = elf_hash_table (info)->dynsymcount;
   size_t best_size = 0;
-  unsigned long int *hashcodes;
-  unsigned long int *hashcodesp;
   unsigned long int i;
   bfd_size_type amt;
 
-  /* Compute the hash values for all exported symbols.  At the same
-     time store the values in an array so that we could use them for
-     optimizations.  */
-  amt = dynsymcount;
-  amt *= sizeof (unsigned long int);
-  hashcodes = bfd_malloc (amt);
-  if (hashcodes == NULL)
-    return 0;
-  hashcodesp = hashcodes;
-
-  /* Put all hash values in HASHCODES.  */
-  elf_link_hash_traverse (elf_hash_table (info),
-			  elf_collect_hash_codes, &hashcodesp);
-
   /* We have a problem here.  The following code to optimize the table
      size requires an integer type with more the 32 bits.  If
      BFD_HOST_U_64_BIT is set we know about such a type.  */
 #ifdef BFD_HOST_U_64_BIT
   if (info->optimize)
     {
-      unsigned long int nsyms = hashcodesp - hashcodes;
       size_t minsize;
       size_t maxsize;
       BFD_HOST_U_64_BIT best_chlen = ~((BFD_HOST_U_64_BIT) 0);
-      unsigned long int *counts ;
       bfd *dynobj = elf_hash_table (info)->dynobj;
       const struct elf_backend_data *bed = get_elf_backend_data (dynobj);
+      unsigned long int *counts;
 
       /* Possible optimization parameters: if we have NSYMS symbols we say
 	 that the hashing table must at least have NSYMS/4 and at most
@@ -4883,10 +4991,7 @@ compute_bucket_count (struct bfd_link_in
       amt *= sizeof (unsigned long int);
       counts = bfd_malloc (amt);
       if (counts == NULL)
-	{
-	  free (hashcodes);
-	  return 0;
-	}
+	return 0;
 
       /* Compute the "optimal" size for the hash table.  The criteria is a
 	 minimal chain length.  The minor criteria is (of course) the size
@@ -4913,9 +5018,9 @@ compute_bucket_count (struct bfd_link_in
 #  define BFD_TARGET_PAGESIZE	(4096)
 # endif
 
-	  /* We in any case need 2 + NSYMS entries for the size values and
-	     the chains.  */
-	  max = (2 + nsyms) * (bed->s->arch_size / 8);
+	  /* We in any case need 2 + DYNSYMCOUNT entries for the size values
+	     and the chains.  */
+	  max = (2 + dynsymcount) * bed->s->sizeof_hash_entry;
 
 # if 1
 	  /* Variant 1: optimize for short chains.  We add the squares
@@ -4925,7 +5030,7 @@ compute_bucket_count (struct bfd_link_in
 	    max += counts[j] * counts[j];
 
 	  /* This adds penalties for the overall size of the table.  */
-	  fact = i / (BFD_TARGET_PAGESIZE / (bed->s->arch_size / 8)) + 1;
+	  fact = i / (BFD_TARGET_PAGESIZE / bed->s->sizeof_hash_entry) + 1;
 	  max *= fact * fact;
 # else
 	  /* Variant 2: Optimize a lot more for small table.  Here we
@@ -4936,7 +5041,7 @@ compute_bucket_count (struct bfd_link_in
 
 	  /* The overall size of the table is considered, but not as
 	     strong as in variant 1, where it is squared.  */
-	  fact = i / (BFD_TARGET_PAGESIZE / (bed->s->arch_size / 8)) + 1;
+	  fact = i / (BFD_TARGET_PAGESIZE / bed->s->sizeof_hash_entry) + 1;
 	  max *= fact;
 # endif
 
@@ -4959,14 +5064,11 @@ compute_bucket_count (struct bfd_link_in
       for (i = 0; elf_buckets[i] != 0; i++)
 	{
 	  best_size = elf_buckets[i];
-	  if (dynsymcount < elf_buckets[i + 1])
+	  if (nsyms < elf_buckets[i + 1])
 	    break;
 	}
     }
 
-  /* Free the arrays we needed.  */
-  free (hashcodes);
-
   return best_size;
 }
 
@@ -5324,7 +5426,10 @@ bfd_elf_size_dynamic_sections (bfd *outp
 	  bfd_size_type strsize;
 
 	  strsize = _bfd_elf_strtab_size (elf_hash_table (info)->dynstr);
-	  if (!_bfd_elf_add_dynamic_entry (info, DT_HASH, 0)
+	  if ((info->emit_hash
+	       && !_bfd_elf_add_dynamic_entry (info, DT_HASH, 0))
+	      || (info->emit_gnu_hash
+		  && !_bfd_elf_add_dynamic_entry (info, DT_GNU_HASH, 0))
 	      || !_bfd_elf_add_dynamic_entry (info, DT_STRTAB, 0)
 	      || !_bfd_elf_add_dynamic_entry (info, DT_SYMTAB, 0)
 	      || !_bfd_elf_add_dynamic_entry (info, DT_STRSZ, strsize)
@@ -5726,8 +5831,6 @@ bfd_elf_size_dynsym_hash_dynstr (bfd *ou
       asection *s;
       bfd_size_type dynsymcount;
       unsigned long section_sym_count;
-      size_t bucketcount = 0;
-      size_t hash_entry_size;
       unsigned int dtagcount;
 
       dynobj = elf_hash_table (info)->dynobj;
@@ -5778,23 +5881,150 @@ bfd_elf_size_dynsym_hash_dynstr (bfd *ou
 	  memset (s->contents, 0, section_sym_count * bed->s->sizeof_sym);
 	}
 
+      elf_hash_table (info)->bucketcount = 0;
+
       /* Compute the size of the hashing table.  As a side effect this
 	 computes the hash values for all the names we export.  */
-      bucketcount = compute_bucket_count (info);
+      if (info->emit_hash)
+	{
+	  unsigned long int *hashcodes;
+	  unsigned long int *hashcodesp;
+	  bfd_size_type amt;
+	  unsigned long int nsyms;
+	  size_t bucketcount;
+	  size_t hash_entry_size;
+
+	  /* Compute the hash values for all exported symbols.  At the same
+	     time store the values in an array so that we could use them for
+	     optimizations.  */
+	  amt = dynsymcount * sizeof (unsigned long int);
+	  hashcodes = bfd_malloc (amt);
+	  if (hashcodes == NULL)
+	    return FALSE;
+	  hashcodesp = hashcodes;
 
-      s = bfd_get_section_by_name (dynobj, ".hash");
-      BFD_ASSERT (s != NULL);
-      hash_entry_size = elf_section_data (s)->this_hdr.sh_entsize;
-      s->size = ((2 + bucketcount + dynsymcount) * hash_entry_size);
-      s->contents = bfd_zalloc (output_bfd, s->size);
-      if (s->contents == NULL)
-	return FALSE;
+	  /* Put all hash values in HASHCODES.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_collect_hash_codes, &hashcodesp);
+
+	  nsyms = hashcodesp - hashcodes;
+	  bucketcount
+	    = compute_bucket_count (info, hashcodes, nsyms);
+	  free (hashcodes);
+
+	  if (bucketcount == 0)
+	    return FALSE;
+
+	  elf_hash_table (info)->bucketcount = bucketcount;
+
+	  s = bfd_get_section_by_name (dynobj, ".hash");
+	  BFD_ASSERT (s != NULL);
+	  hash_entry_size = elf_section_data (s)->this_hdr.sh_entsize;
+	  s->size = ((2 + bucketcount + dynsymcount) * hash_entry_size);
+	  s->contents = bfd_zalloc (output_bfd, s->size);
+	  if (s->contents == NULL)
+	    return FALSE;
 
-      bfd_put (8 * hash_entry_size, output_bfd, bucketcount, s->contents);
-      bfd_put (8 * hash_entry_size, output_bfd, dynsymcount,
-	       s->contents + hash_entry_size);
+	  bfd_put (8 * hash_entry_size, output_bfd, bucketcount, s->contents);
+	  bfd_put (8 * hash_entry_size, output_bfd, dynsymcount,
+		   s->contents + hash_entry_size);
+	}
+
+      if (info->emit_gnu_hash)
+	{
+	  size_t i, cnt;
+	  unsigned char *contents;
+	  struct collect_gnu_hash_codes cinfo;
+	  bfd_size_type amt;
+	  size_t bucketcount;
+
+	  memset (&cinfo, 0, sizeof (cinfo));
 
-      elf_hash_table (info)->bucketcount = bucketcount;
+	  /* Compute the hash values for all exported symbols.  At the same
+	     time store the values in an array so that we could use them for
+	     optimizations.  */
+	  amt = dynsymcount * 2 * sizeof (unsigned long int);
+	  cinfo.hashcodes = bfd_malloc (amt);
+	  if (cinfo.hashcodes == NULL)
+	    return FALSE;
+
+	  cinfo.hashval = cinfo.hashcodes + dynsymcount;
+	  cinfo.min_dynindx = -1;
+	  cinfo.output_bfd = output_bfd;
+
+	  /* Put all hash values in HASHCODES.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_collect_gnu_hash_codes, &cinfo);
+
+	  bucketcount
+	    = compute_bucket_count (info, cinfo.hashcodes, cinfo.nsyms);
+
+	  if (bucketcount == 0)
+	    {
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  amt = bucketcount * sizeof (unsigned long int) * 2;
+	  cinfo.counts = bfd_malloc (amt);
+	  if (cinfo.counts == NULL)
+	    {
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  /* Determine how often each hash bucket is used.  */
+	  memset (cinfo.counts, 0, bucketcount * sizeof (cinfo.counts[0]));
+	  for (i = 0; i < cinfo.nsyms; ++i)
+	    ++cinfo.counts[cinfo.hashcodes[i] % bucketcount];
+
+	  s = bfd_get_section_by_name (dynobj, ".gnu.hash");
+	  BFD_ASSERT (s != NULL);
+	  cinfo.indx = cinfo.counts + bucketcount;
+	  cinfo.symindx = dynsymcount - cinfo.nsyms;
+	  for (i = 0, cnt = cinfo.symindx; i < bucketcount; ++i)
+	    if (cinfo.counts[i] != 0)
+	      {
+		cinfo.indx[i] = cnt;
+		cnt += cinfo.counts[i];
+	      }
+	  BFD_ASSERT (cnt == dynsymcount);
+	  cinfo.bucketcount = bucketcount;
+	  cinfo.local_indx = cinfo.min_dynindx;
+
+	  s->size = (2 + bucketcount + cinfo.nsyms) * 4;
+	  contents = bfd_zalloc (output_bfd, s->size);
+	  if (contents == NULL)
+	    {
+	      free (cinfo.counts);
+	      free (cinfo.hashcodes);
+	      return FALSE;
+	    }
+
+	  s->contents = contents;
+	  bfd_put_32 (output_bfd, bucketcount, contents);
+	  bfd_put_32 (output_bfd, cinfo.symindx, contents + 4);
+	  contents += 8;
+
+	  for (i = 0; i < bucketcount; ++i)
+	    {
+	      if (cinfo.counts[i] == 0)
+		bfd_put_32 (output_bfd, ~0, contents);
+	      else
+		bfd_put_32 (output_bfd, cinfo.indx[i] - cinfo.symindx,
+			    contents);
+	      contents += 4;
+	    }
+
+	  cinfo.contents = contents;
+
+	  /* Renumber dynamic symbols, populate .gnu.hash section.  */
+	  elf_link_hash_traverse (elf_hash_table (info),
+				  elf_renumber_gnu_hash_syms, &cinfo);
+
+	  free (cinfo.counts);
+	  free (cinfo.hashcodes);
+	}
 
       s = bfd_get_section_by_name (dynobj, ".dynstr");
       BFD_ASSERT (s != NULL);
@@ -6663,9 +6893,6 @@ elf_link_output_extsym (struct elf_link_
     {
       size_t bucketcount;
       size_t bucket;
-      size_t hash_entry_size;
-      bfd_byte *bucketpos;
-      bfd_vma chain;
       bfd_byte *esym;
 
       sym.st_name = h->dynstr_index;
@@ -6679,15 +6906,23 @@ elf_link_output_extsym (struct elf_link_
 
       bucketcount = elf_hash_table (finfo->info)->bucketcount;
       bucket = h->u.elf_hash_value % bucketcount;
-      hash_entry_size
-	= elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
-      bucketpos = ((bfd_byte *) finfo->hash_sec->contents
-		   + (bucket + 2) * hash_entry_size);
-      chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx, bucketpos);
-      bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
-	       ((bfd_byte *) finfo->hash_sec->contents
-		+ (bucketcount + 2 + h->dynindx) * hash_entry_size));
+
+      if (finfo->hash_sec != NULL)
+	{
+	  size_t hash_entry_size;
+	  bfd_byte *bucketpos;
+	  bfd_vma chain;
+
+	  hash_entry_size
+	    = elf_section_data (finfo->hash_sec)->this_hdr.sh_entsize;
+	  bucketpos = ((bfd_byte *) finfo->hash_sec->contents
+		       + (bucket + 2) * hash_entry_size);
+	  chain = bfd_get (8 * hash_entry_size, finfo->output_bfd, bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, h->dynindx, bucketpos);
+	  bfd_put (8 * hash_entry_size, finfo->output_bfd, chain,
+		   ((bfd_byte *) finfo->hash_sec->contents
+		    + (bucketcount + 2 + h->dynindx) * hash_entry_size));
+	}
 
       if (finfo->symver_sec != NULL && finfo->symver_sec->contents != NULL)
 	{
@@ -7861,7 +8096,7 @@ bfd_elf_final_link (bfd *abfd, struct bf
     {
       finfo.dynsym_sec = bfd_get_section_by_name (dynobj, ".dynsym");
       finfo.hash_sec = bfd_get_section_by_name (dynobj, ".hash");
-      BFD_ASSERT (finfo.dynsym_sec != NULL && finfo.hash_sec != NULL);
+      BFD_ASSERT (finfo.dynsym_sec != NULL);
       finfo.symver_sec = bfd_get_section_by_name (dynobj, ".gnu.version");
       /* Note that it is OK if symver_sec is NULL.  */
     }
@@ -8621,6 +8856,9 @@ bfd_elf_final_link (bfd *abfd, struct bf
 	    case DT_HASH:
 	      name = ".hash";
 	      goto get_vma;
+	    case DT_GNU_HASH:
+	      name = ".gnu.hash";
+	      goto get_vma;
 	    case DT_STRTAB:
 	      name = ".dynstr";
 	      goto get_vma;
--- include/elf/common.h.jj	2006-02-17 15:36:26.000000000 +0100
+++ include/elf/common.h	2006-06-22 10:43:21.000000000 +0200
@@ -338,6 +338,7 @@
 #define SHT_LOOS	0x60000000	/* First of OS specific semantics */
 #define SHT_HIOS	0x6fffffff	/* Last of OS specific semantics */
 
+#define SHT_GNU_HASH	0x6ffffff6	/* GNU style symbol hash table */
 #define SHT_GNU_LIBLIST	0x6ffffff7	/* List of prelink dependencies */
 
 /* The next three section types are defined by Solaris, and are named
@@ -577,6 +578,7 @@
 #define DT_VALRNGHI	0x6ffffdff
 
 #define DT_ADDRRNGLO	0x6ffffe00
+#define DT_GNU_HASH	0x6ffffef5
 #define DT_TLSDESC_PLT	0x6ffffef6
 #define DT_TLSDESC_GOT	0x6ffffef7
 #define DT_GNU_CONFLICT	0x6ffffef8
--- include/bfdlink.h.jj	2006-04-07 17:17:29.000000000 +0200
+++ include/bfdlink.h	2006-06-22 11:11:20.000000000 +0200
@@ -324,6 +324,12 @@ struct bfd_link_info
   /* TRUE if unreferenced sections should be removed.  */
   unsigned int gc_sections: 1;
 
+  /* TRUE if .hash section should be created.  */
+  unsigned int emit_hash: 1;
+
+  /* TRUE if .gnu.hash section should be created.  */
+  unsigned int emit_gnu_hash: 1;
+
   /* What to do with unresolved symbols in an object file.
      When producing executables the default is GENERATE_ERROR.
      When producing shared libraries the default is IGNORE.  The
--- binutils/readelf.c.jj	2006-05-30 16:13:54.000000000 +0200
+++ binutils/readelf.c	2006-07-03 19:30:54.000000000 +0200
@@ -135,6 +135,7 @@ static unsigned long dynamic_syminfo_off
 static unsigned int dynamic_syminfo_nent;
 static char program_interpreter[64];
 static bfd_vma dynamic_info[DT_JMPREL + 1];
+static bfd_vma dynamic_info_DT_GNU_HASH;
 static bfd_vma version_info[16];
 static Elf_Internal_Ehdr elf_header;
 static Elf_Internal_Shdr *section_headers;
@@ -1501,6 +1502,7 @@ get_dynamic_type (unsigned long type)
     case DT_GNU_CONFLICTSZ: return "GNU_CONFLICTSZ";
     case DT_GNU_LIBLIST: return "GNU_LIBLIST";
     case DT_GNU_LIBLISTSZ: return "GNU_LIBLISTSZ";
+    case DT_GNU_HASH:	return "GNU_HASH";
 
     default:
       if ((type >= DT_LOPROC) && (type <= DT_HIPROC))
@@ -2571,6 +2573,7 @@ get_section_type_name (unsigned int sh_t
     case SHT_INIT_ARRAY:	return "INIT_ARRAY";
     case SHT_FINI_ARRAY:	return "FINI_ARRAY";
     case SHT_PREINIT_ARRAY:	return "PREINIT_ARRAY";
+    case SHT_GNU_HASH:		return "GNU_HASH";
     case SHT_GROUP:		return "GROUP";
     case SHT_SYMTAB_SHNDX:	return "SYMTAB SECTION INDICIES";
     case SHT_GNU_verdef:	return "VERDEF";
@@ -6228,6 +6231,15 @@ process_dynamic_section (FILE *file)
 	    }
 	  break;
 
+	case DT_GNU_HASH:
+	  dynamic_info_DT_GNU_HASH = entry->d_un.d_val;
+	  if (do_dynamic)
+	    {
+	      print_vma (entry->d_un.d_val, PREFIX_HEX);
+	      putchar ('\n');
+	    }
+	  break;
+
 	default:
 	  if ((entry->d_tag >= DT_VERSYM) && (entry->d_tag <= DT_VERNEEDNUM))
 	    version_info[DT_VERSIONTAGIDX (entry->d_tag)] =
@@ -6903,6 +6915,9 @@ process_symbol_table (FILE *file)
   bfd_vma nchains = 0;
   bfd_vma *buckets = NULL;
   bfd_vma *chains = NULL;
+  bfd_vma ngnubuckets = 0;
+  bfd_vma *gnubuckets = NULL;
+  bfd_vma *gnuchains = NULL;
 
   if (! do_syms && !do_histogram)
     return 1;
@@ -7282,6 +7297,145 @@ process_symbol_table (FILE *file)
       free (chains);
     }
 
+  if (do_histogram && dynamic_info_DT_GNU_HASH)
+    {
+      unsigned char nb[8];
+      bfd_vma i, maxchain = 0xffffffff;
+      unsigned long *lengths;
+      unsigned long *counts;
+      unsigned long hn;
+      unsigned long maxlength = 0;
+      unsigned long nzero_counts = 0;
+      unsigned long nsyms = 0;
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file, dynamic_info_DT_GNU_HASH,
+				     sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      if (fread (nb, 8, 1, file) != 1)
+	{
+	  error (_("Failed to read in number of buckets\n"));
+	  return 0;
+	}
+
+      ngnubuckets = byte_get (nb, 4);
+
+      gnubuckets = get_dynamic_data (file, ngnubuckets, 4);
+
+      if (gnubuckets == NULL)
+	return 0;
+
+      for (i = 0; i < ngnubuckets; i++)
+	if (gnubuckets[i] != 0xffffffff
+	    && (maxchain == 0xffffffff || gnubuckets[i] > maxchain))
+	  maxchain = gnubuckets[i];
+
+      if (maxchain == 0xffffffff)
+	return 0;
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file,
+				     dynamic_info_DT_GNU_HASH
+				     + 4 * (2 + ngnubuckets + maxchain),
+				     sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      do
+	{
+	  if (fread (nb, 4, 1, file) != 1)
+	    {
+	      error (_("Failed to determine last chain length\n"));
+	      return 0;
+	    }
+
+	  if (maxchain + 1 == 0)
+	    return 0;
+
+	  ++maxchain;
+	}
+      while ((byte_get (nb, 4) & 1) == 0);
+
+      if (fseek (file,
+		 (archive_file_offset
+		  + offset_from_vma (file,
+				     dynamic_info_DT_GNU_HASH
+				     + 4 * (2 + ngnubuckets), sizeof nb)),
+		 SEEK_SET))
+	{
+	  error (_("Unable to seek to start of dynamic information"));
+	  return 0;
+	}
+
+      gnuchains = get_dynamic_data (file, maxchain, 4);
+
+      if (gnuchains == NULL)
+	return 0;
+
+      lengths = calloc (ngnubuckets, sizeof (*lengths));
+      if (lengths == NULL)
+	{
+	  error (_("Out of memory"));
+	  return 0;
+	}
+
+      printf (_("\nHistogram for `.gnu.hash' bucket list length (total of %lu buckets):\n"),
+	      (unsigned long) ngnubuckets);
+      printf (_(" Length  Number     %% of total  Coverage\n"));
+
+      for (hn = 0; hn < ngnubuckets; ++hn)
+	if (gnubuckets[hn] != 0xffffffff)
+	  {
+	    bfd_vma off, length = 1;
+
+	    for (off = gnubuckets[hn]; (gnuchains[off] & 1) == 0; ++off)
+	      ++length;
+	    lengths[hn] = length;
+	    if (length > maxlength)
+	      maxlength = length;
+	    nsyms += length;
+	  }
+
+      counts = calloc (maxlength + 1, sizeof (*counts));
+      if (counts == NULL)
+	{
+	  error (_("Out of memory"));
+	  return 0;
+	}
+
+      for (hn = 0; hn < ngnubuckets; ++hn)
+	++counts[lengths[hn]];
+
+      if (ngnubuckets > 0)
+	{
+	  unsigned long j;
+	  printf ("      0  %-10lu (%5.1f%%)\n",
+		  counts[0], (counts[0] * 100.0) / ngnubuckets);
+	  for (j = 1; j <= maxlength; ++j)
+	    {
+	      nzero_counts += counts[j] * j;
+	      printf ("%7lu  %-10lu (%5.1f%%)    %5.1f%%\n",
+		      j, counts[j], (counts[j] * 100.0) / ngnubuckets,
+		      (nzero_counts * 100.0) / nsyms);
+	    }
+	}
+
+      free (counts);
+      free (lengths);
+      free (gnubuckets);
+      free (gnuchains);
+    }
+
   return 1;
 }
 

	Jakub