public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* Re: PATCH: Use a hash table for 26X linker speedup
@ 2005-09-29 16:46 Steven Bosscher
  2005-09-30  7:31 ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: Steven Bosscher @ 2005-09-29 16:46 UTC (permalink / raw)
  To: binutils; +Cc: amodra, hjl

Hi,

Can anyone tell what happened to this linker speedup patch?
http://sources.redhat.com/ml/binutils/2005-05/msg00720.html

Gr.
Steven

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-09-29 16:46 PATCH: Use a hash table for 26X linker speedup Steven Bosscher
@ 2005-09-30  7:31 ` Alan Modra
  2005-09-30 15:47   ` Nick Clifton
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Modra @ 2005-09-30  7:31 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: binutils, hjl

On Thu, Sep 29, 2005 at 04:23:05PM +0200, Steven Bosscher wrote:
> Can anyone tell what happened to this linker speedup patch?
> http://sources.redhat.com/ml/binutils/2005-05/msg00720.html

I looked at rearranging the orphan code so that this wouldn't be
necessary, and found I needed to pull apart half the linker.  So I left
that job for a rainy day.  Since I haven't provided a more elegant
solution, I suppose HJ's patch ought to go in.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-09-30  7:31 ` Alan Modra
@ 2005-09-30 15:47   ` Nick Clifton
  2005-10-01  1:45     ` Paul Brook
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Clifton @ 2005-09-30 15:47 UTC (permalink / raw)
  To: Alan Modra; +Cc: Steven Bosscher, binutils, hjl

Hi H.J.

   Just to make this explicit:

 > Alan Modra wrote:
>>Can anyone tell what happened to this linker speedup patch?
>>http://sources.redhat.com/ml/binutils/2005-05/msg00720.html
> 
> I looked at rearranging the orphan code so that this wouldn't be
> necessary, and found I needed to pull apart half the linker.  So I left
> that job for a rainy day.  Since I haven't provided a more elegant
> solution, I suppose HJ's patch ought to go in.

ld/
2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ldlang.c (output_statement_hash_entry): New type.
	(output_statement_table): New variable for hash table.
	(output_statement_newfunc): New function.
	(output_statement_table_init): Likewise.
	(output_statement_table_free): Likewise.
	(lang_init): Call output_statement_table_init.
	(lang_finish): Renamed to ...
	(lang_end): This.
	(lang_process): Updated.
	(lang_finish): New function.
	(lang_output_section_find_1): Use hash table.
	(lang_output_section_statement_lookup_1): Likewise.

	* ldlang.h (lang_finish): New.

	* ldmain.c (main): Call lang_finish.

ld/testsuite/
2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ld-elf/sec64k.exp: Enabled for all ELF targets.

Approved - please apply.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-09-30 15:47   ` Nick Clifton
@ 2005-10-01  1:45     ` Paul Brook
  2005-10-04  7:23       ` Nick Clifton
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Brook @ 2005-10-01  1:45 UTC (permalink / raw)
  To: binutils; +Cc: Nick Clifton, Alan Modra, hjl

[-- Attachment #1: Type: text/plain, Size: 826 bytes --]

> ld/testsuite/
> 2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>
>
> 	* ld-elf/sec64k.exp: Enabled for all ELF targets.

On an arm-none-eabi cross this test takes about 8 minutes to run on my 
machine. This is about 25 times longer than the whole of the rest of the 
binutils+gas+gdb testsuites put together.

Tests run on a 2GHz amd64, ie. not a slow machine.

A native i386 linker completes the test in a more reasonable time (~20 
seconds), so this looks like an arm specific problem.

We should probably figure out what causing this slowness. Until that happens, 
the patch below disable the test on arm targets.
Ok?

I'll file a bug in bugzilla to remind me/us that there's something slow in the 
arm linker.

Paul

2005-10-01  Paul Brook  <paul@codesourcery.com>

	* ld-elf/sec64k.exp: Skip test on arm targets (too slow).

[-- Attachment #2: patch.arm_sec64k --]
[-- Type: text/x-diff, Size: 685 bytes --]

Index: ld/testsuite/ld-elf/sec64k.exp
===================================================================
RCS file: /var/cvsroot/src-cvs/src/ld/testsuite/ld-elf/sec64k.exp,v
retrieving revision 1.7
diff -u -p -r1.7 sec64k.exp
--- ld/testsuite/ld-elf/sec64k.exp	30 Sep 2005 17:45:54 -0000	1.7
+++ ld/testsuite/ld-elf/sec64k.exp	1 Oct 2005 01:38:20 -0000
@@ -24,6 +24,11 @@ if ![is_elf_format] {
     return
 }
 
+# For some reason this is painfully slow on arm targets, so skip it.
+if { [istarget arm*-*-*] } {
+    return
+}
+
 # Test >64k sections, with and without -r.  First, create the assembly
 # files.  Have a relocation to another section and one within the local
 # section.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-10-01  1:45     ` Paul Brook
@ 2005-10-04  7:23       ` Nick Clifton
  2005-10-19  0:59         ` Paul Brook
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Clifton @ 2005-10-04  7:23 UTC (permalink / raw)
  To: Paul Brook; +Cc: binutils, Alan Modra, hjl

Hi Paul,

>>	* ld-elf/sec64k.exp: Enabled for all ELF targets.

> On an arm-none-eabi cross this test takes about 8 minutes to run on my 
> machine. This is about 25 times longer than the whole of the rest of the 
> binutils+gas+gdb testsuites put together.

> We should probably figure out what causing this slowness. Until that happens, 
> the patch below disable the test on arm targets.
> Ok?

Better I think to find out the cause of the slow down.  I did this - it is
the get_arm_elf_section_data() function in elf32-arm.c, which is repeatedly
scanning over a list of ~64K entries trying to match up section 
pointers.  Since I wrote this function I felt obliged to try to fix it 
and I am committing the patch below which does improve things.  It is 
not the best fix (which would be to use a hash function), but it does 
help by noticing that the typical usage is to add sections to the list 
in forwards order and then scan for them in backwards order.

Cheers
   Nick

bfd/ChangeLog
2005-10-04  Nick Clifton  <nickc@redhat.com>

	* elf32-arm.c (get_arm_elf_section_data): Cache the last pointer
	matched so that the typical case of scanning for the previous
	section to last one can be handled quickly.

Index: bfd/elf32-arm.c
===================================================================
RCS file: /cvs/src/src/bfd/elf32-arm.c,v
retrieving revision 1.55
diff -c -3 -p -r1.55 elf32-arm.c
*** bfd/elf32-arm.c	9 Sep 2005 13:10:01 -0000	1.55
--- bfd/elf32-arm.c	4 Oct 2005 07:18:12 -0000
*************** static _arm_elf_section_data *
*** 6563,6572 ****
   get_arm_elf_section_data (asection * sec)
   {
     struct section_list * entry;

     for (entry = sections_with_arm_elf_section_data; entry; entry = 
entry->next)
       if (entry->sec == sec)
!       return elf32_arm_section_data (sec);
     return NULL;
   }

--- 6563,6594 ----
   get_arm_elf_section_data (asection * sec)
   {
     struct section_list * entry;
+   static struct section_list * last_entry = NULL;

+   /* This is a short cut for the typical case where the sections are added
+      to the sections_with_arm_elf_section_data list in forward order and
+      then looked up here in backwards order.  This makes a real difference
+      to the ld-srec/sec64k.exp linker test.  */
+   if (last_entry != NULL)
+     {
+       if (last_entry->sec == sec)
+ 	return elf32_arm_section_data (sec);
+
+       if (last_entry->prev != NULL
+ 	  && last_entry->prev->sec == sec)
+ 	{
+ 	  last_entry = last_entry->prev;
+ 	  return elf32_arm_section_data (sec);
+ 	}
+     }
+
     for (entry = sections_with_arm_elf_section_data; entry; entry = 
entry->next)
       if (entry->sec == sec)
!       {
! 	last_entry = entry;
! 	return elf32_arm_section_data (sec);
!       }
!
     return NULL;
   }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-10-04  7:23       ` Nick Clifton
@ 2005-10-19  0:59         ` Paul Brook
  2005-10-19 15:39           ` Nick Clifton
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Brook @ 2005-10-19  0:59 UTC (permalink / raw)
  To: binutils; +Cc: Nick Clifton

[-- Attachment #1: Type: text/plain, Size: 702 bytes --]

> bfd/ChangeLog
> 2005-10-04  Nick Clifton  <nickc@redhat.com>
>
> 	* elf32-arm.c (get_arm_elf_section_data): Cache the last pointer
> 	matched so that the typical case of scanning for the previous
> 	section to last one can be handled quickly.

It's still quite slow with this patch.  
unrecord_section_with_arm_elf_section_data has the same problem.

The attached patch uses the cache for this as well, and avoids leaving 
last_entry pointing a a free'd object.

Tested with cross to arm-none-eabi.
Ok?

Paul

2005-10-19  Paul Brook  <paul@codesourcery.com>

	* elf32-arm.c (find_arm_elf_section_entry): New function.
	(get_arm_elf_section_data,
	unrecord_section_with_arm_elf_section_data): Use it.

[-- Attachment #2: patch.64ksec_speedup --]
[-- Type: text/x-diff, Size: 2868 bytes --]

Index: bfd/elf32-arm.c
===================================================================
RCS file: /var/cvsroot/src-cvs/src/bfd/elf32-arm.c,v
retrieving revision 1.58
diff -u -p -r1.58 elf32-arm.c
--- bfd/elf32-arm.c	8 Oct 2005 17:07:15 -0000	1.58
+++ bfd/elf32-arm.c	19 Oct 2005 00:16:11 -0000
@@ -7311,8 +7311,8 @@ record_section_with_arm_elf_section_data
   sections_with_arm_elf_section_data = entry;
 }
 
-static _arm_elf_section_data *
-get_arm_elf_section_data (asection * sec)
+static struct section_list *
+find_arm_elf_section_entry (asection * sec)
 {
   struct section_list * entry;
   static struct section_list * last_entry = NULL;
@@ -7321,27 +7321,37 @@ get_arm_elf_section_data (asection * sec
      to the sections_with_arm_elf_section_data list in forward order and
      then looked up here in backwards order.  This makes a real difference
      to the ld-srec/sec64k.exp linker test.  */
+  entry = sections_with_arm_elf_section_data;
   if (last_entry != NULL)
     {
       if (last_entry->sec == sec)
-	return elf32_arm_section_data (sec);
-
-      if (last_entry->prev != NULL
-	  && last_entry->prev->sec == sec)
-	{
-	  last_entry = last_entry->prev;
-	  return elf32_arm_section_data (sec);
-	}
+	entry = last_entry;
+      else if (last_entry->next != NULL
+	       && last_entry->next->sec == sec)
+	entry = last_entry->next;
     }
- 
-  for (entry = sections_with_arm_elf_section_data; entry; entry = entry->next)
+
+  for (; entry; entry = entry->next)
     if (entry->sec == sec)
-      {
-	last_entry = entry;
-	return elf32_arm_section_data (sec);
-      }
+      break;
+
+  if (entry)
+    last_entry = entry->prev;
+
+  return entry;
+}
+
+static _arm_elf_section_data *
+get_arm_elf_section_data (asection * sec)
+{
+  struct section_list * entry;
+
+  entry = find_arm_elf_section_entry (sec);
 
-  return NULL;
+  if (entry)
+    return elf32_arm_section_data (entry->sec);
+  else
+    return NULL;
 }
 
 static void
@@ -7349,18 +7359,18 @@ unrecord_section_with_arm_elf_section_da
 {
   struct section_list * entry;
 
-  for (entry = sections_with_arm_elf_section_data; entry; entry = entry->next)
-    if (entry->sec == sec)
-      {
-	if (entry->prev != NULL)
-	  entry->prev->next = entry->next;
-	if (entry->next != NULL)
-	  entry->next->prev = entry->prev;
-	if (entry == sections_with_arm_elf_section_data)
-	  sections_with_arm_elf_section_data = entry->next;
-	free (entry);
-	break;
-      }
+  entry = find_arm_elf_section_entry (sec);
+
+  if (entry)
+    {
+      if (entry->prev != NULL)
+	entry->prev->next = entry->next;
+      if (entry->next != NULL)
+	entry->next->prev = entry->prev;
+      if (entry == sections_with_arm_elf_section_data)
+	sections_with_arm_elf_section_data = entry->next;
+      free (entry);
+    }
 }
 
 /* Called for each symbol.  Builds a section map based on mapping symbols.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-10-19  0:59         ` Paul Brook
@ 2005-10-19 15:39           ` Nick Clifton
  0 siblings, 0 replies; 11+ messages in thread
From: Nick Clifton @ 2005-10-19 15:39 UTC (permalink / raw)
  To: Paul Brook; +Cc: binutils

Hi Paul,

> It's still quite slow with this patch.  
> unrecord_section_with_arm_elf_section_data has the same problem.

Good point.

> The attached patch uses the cache for this as well, and avoids leaving 
> last_entry pointing a free'd object.

Also a very good point.

I have applied your patch together with an extra comment describing why 
we set last_found to entry->prev, since I felt that it was not entirely 
obvious.

Cheers
   Nick

> 2005-10-19  Paul Brook  <paul@codesourcery.com>
> 
> 	* elf32-arm.c (find_arm_elf_section_entry): New function.
> 	(get_arm_elf_section_data,
> 	unrecord_section_with_arm_elf_section_data): Use it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-05-27 16:42     ` H. J. Lu
@ 2005-05-29 11:16       ` Alan Modra
  0 siblings, 0 replies; 11+ messages in thread
From: Alan Modra @ 2005-05-29 11:16 UTC (permalink / raw)
  To: H. J. Lu; +Cc: Christian Joensson, binutils

On Fri, May 27, 2005 at 09:41:02AM -0700, H. J. Lu wrote:
> > > > I will work on lang_output_section_find_1 next. I am planning to use
> > > > a hash table. I hope to enable 64k section on all ELF targets.
> 
> Here is the updated patch. I have been using this for a while. Is there
> any objection?

I think I made this comment before:  It seems wasteful to have two hash
tables working with virtually the same set of data.  We have a section
hash table holding output section names, and you're adding another hash
table to hold output_section_statement names.  I realize the two aren't
identical, but all orphans will be in both tables (and it is our orphan
section handling that makes the output_section_statement list so long).
I'd prefer if you don't commit this patch, at least until I've had time
to look at changing orphan section handling.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-05-21 11:05   ` Christian Joensson
@ 2005-05-27 16:42     ` H. J. Lu
  2005-05-29 11:16       ` Alan Modra
  0 siblings, 1 reply; 11+ messages in thread
From: H. J. Lu @ 2005-05-27 16:42 UTC (permalink / raw)
  To: Christian Joensson; +Cc: binutils

On Sat, May 21, 2005 at 08:40:55AM +0200, Christian Joensson wrote:
> Whatever happened with this?
> 
> /ChJ
> 
> On 5/1/05, H. J. Lu <hjl@lucon.org> wrote:
> > On Sun, May 01, 2005 at 09:57:07AM -0700, H. J. Lu wrote:
> > > Linker is very slow on input files with many sections. The 64k section
> > > test is only enabled for CRIS. I did a profile. 70% of linker time is
> > > spent in lang_check_section_addresses:
> > >
> > > Flat profile:
> > >
> > > Each sample counts as 0.01 seconds.
> > >   %   cumulative   self              self     total
> > >  time   seconds   seconds    calls  Ks/call  Ks/call  name
> > >  72.92    948.31   948.31        1     0.95     0.95 lang_check_section_addresses
> > >  22.37   1239.21   290.90   132089     0.00     0.00 lang_output_section_find_1
> > >
> > > The main problem is we use a single section list. We have to scan the
> > > whole list for anything. In case of address overlap check, we only
> > > need to check the previous section. There are many other places in
> > > bfd, assembler and linker where a double section list will help. With
> > > this patch, I got 30% linker speed up in 64k section test:
> > >
> > > The old linker:
> > >
> > > sh 1  502.74s user 0.90s system 99% cpu 8:23.73 total
> > >
> > > The new linker:
> > >
> > > sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total
> > >
> > > The profile data shows:
> > >
> > > Flat profile:
> > >
> > > Each sample counts as 0.01 seconds.
> > >   %   cumulative   self              self     total
> > >  time   seconds   seconds    calls   s/call   s/call  name
> > >  81.20    256.42   256.42   132089     0.00     0.00 lang_output_section_find_1
> > >  13.27    298.33    41.91   673985     0.00     0.00  bfd_hash_lookup
> > >
> > > I will work on lang_output_section_find_1 next. I am planning to use
> > > a hash table. I hope to enable 64k section on all ELF targets.
> > >
> > 
> > This is the patch. I got
> > 
> > sh 1 13.39s user 1.28s system 95% cpu 15.431 total
> > 
> > vs.
> > 
> > sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total
> > 
> > That is a 26X speed up. I enabled 64k section test for all ELF targets.
> > 


Here is the updated patch. I have been using this for a while. Is there
any objection?


H.J.
-----
ld/

2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ldlang.c (output_statement_hash_entry): New type.
	(output_statement_table): New variable for hash table.
	(output_statement_newfunc): New function.
	(output_statement_table_init): Likewise.
	(output_statement_table_free): Likewise.
	(lang_init): Call output_statement_table_init.
	(lang_finish): Renamed to ...
	(lang_end): This.
	(lang_process): Updated.
	(lang_finish): New function.
	(lang_output_section_find_1): Use hash table.
	(lang_output_section_statement_lookup_1): Likewise.

	* ldlang.h (lang_finish): New.

	* ldmain.c (main): Call lang_finish.

ld/testsuite/

2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ld-elf/sec64k.exp: Enabled for all ELF targets.

--- ld/ldlang.c.hash	2005-05-11 08:12:08.000000000 -0700
+++ ld/ldlang.c	2005-05-11 09:14:33.000000000 -0700
@@ -864,6 +864,45 @@ lang_add_input_file (const char *name,
   return new_afile (name, file_type, target, TRUE);
 }
 
+struct output_statement_hash_entry
+{
+  struct bfd_hash_entry root;
+  lang_output_section_statement_type *entry;
+};
+
+/* The hash table.  */
+
+static struct bfd_hash_table output_statement_table;
+
+/* Support routines for the hash table used by lang_output_section_find_1,
+   initialize the table, fill in an entry and remove the table.  */
+
+static struct bfd_hash_entry *
+output_statement_newfunc (struct bfd_hash_entry *entry ATTRIBUTE_UNUSED, 
+			  struct bfd_hash_table *table,
+			  const char *string ATTRIBUTE_UNUSED)
+{
+  struct output_statement_hash_entry *ret
+    = bfd_hash_allocate (table,
+			 sizeof (struct output_statement_hash_entry));
+  ret->entry = NULL;
+  return (struct bfd_hash_entry *) ret;
+}
+
+static void
+output_statement_table_init (void)
+{
+  if (! bfd_hash_table_init_n (&output_statement_table,
+			       output_statement_newfunc, 61))
+    einfo (_("%P%F: Failed to create hash table\n"));
+}
+
+static void
+output_statement_table_free (void)
+{
+  bfd_hash_table_free (&output_statement_table);
+}
+
 /* Build enough state so that the parser can build its tree.  */
 
 void
@@ -873,6 +912,8 @@ lang_init (void)
 
   stat_ptr = &statement_list;
 
+  output_statement_table_init ();
+
   lang_list_init (stat_ptr);
 
   lang_list_init (&input_file_chain);
@@ -899,6 +940,12 @@ lang_init (void)
   lang_statement_iteration = 0;
 }
 
+void
+lang_finish (void)
+{
+  output_statement_table_free ();
+}
+
 /*----------------------------------------------------------------------
   A region is an area of memory declared with the
   MEMORY {  name:org=exp, len=exp ... }
@@ -984,18 +1031,30 @@ static lang_output_section_statement_typ
 lang_output_section_find_1 (const char *const name, int constraint)
 {
   lang_output_section_statement_type *lookup;
+  struct output_statement_hash_entry *entry;
+  unsigned long hash;
 
-  for (lookup = &lang_output_section_statement.head->output_section_statement;
-       lookup != NULL;
-       lookup = lookup->next)
+  entry = ((struct output_statement_hash_entry *)
+	   bfd_hash_lookup (&output_statement_table, name, FALSE,
+			    FALSE));
+  if (entry == NULL || (lookup = entry->entry) == NULL)
+    return NULL;
+
+  hash = entry->root.hash;
+  do
     {
-      if (strcmp (name, lookup->name) == 0
-	  && lookup->constraint != -1
+      if (lookup->constraint != -1
 	  && (constraint == 0
 	      || (constraint == lookup->constraint
 		  && constraint != SPECIAL)))
 	return lookup;
+      entry = (struct output_statement_hash_entry *) entry->root.next;
+      lookup = entry ? entry->entry : NULL;
     }
+  while (entry != NULL
+	 && entry->root.hash == hash
+	 && strcmp (name, lookup->name) == 0);
+
   return NULL;
 }
 
@@ -1013,6 +1072,8 @@ lang_output_section_statement_lookup_1 (
   lookup = lang_output_section_find_1 (name, constraint);
   if (lookup == NULL)
     {
+      struct output_statement_hash_entry *entry;
+
       lookup = new_stat (lang_output_section_statement, stat_ptr);
       lookup->region = NULL;
       lookup->lma_region = NULL;
@@ -1036,6 +1097,15 @@ lang_output_section_statement_lookup_1 (
       lookup->update_dot_tree = NULL;
       lookup->phdrs = NULL;
 
+      entry = ((struct output_statement_hash_entry *)
+	       bfd_hash_lookup (&output_statement_table, name, TRUE,
+				FALSE));
+      if (entry == NULL)
+	einfo (_("%P%F: bfd_hash_lookup failed creating section `%s'\n"),
+	       name);
+
+      entry->entry = lookup;
+
       lang_statement_append (&lang_output_section_statement,
 			     (lang_statement_union_type *) lookup,
 			     (lang_statement_union_type **) &lookup->next);
@@ -4546,7 +4616,7 @@ lang_set_startof (void)
 }
 
 static void
-lang_finish (void)
+lang_end (void)
 {
   struct bfd_link_hash_entry *h;
   bfd_boolean warn;
@@ -5332,7 +5402,7 @@ lang_process (void)
   /* Final stuffs.  */
   lang_mark_used_section ();
   ldemul_finish ();
-  lang_finish ();
+  lang_end ();
 }
 
 /* EXPORTED TO YACC */
--- ld/ldlang.h.hash	2005-05-04 08:52:34.000000000 -0700
+++ ld/ldlang.h	2005-05-11 08:13:47.000000000 -0700
@@ -448,6 +448,8 @@ extern int lang_statement_iteration;
 
 extern void lang_init
   (void);
+extern void lang_finish
+  (void);
 extern lang_memory_region_type *lang_memory_region_lookup
   (const char *const, bfd_boolean);
 extern lang_memory_region_type *lang_memory_region_default
--- ld/ldmain.c.hash	2005-05-11 08:12:08.000000000 -0700
+++ ld/ldmain.c	2005-05-11 08:13:47.000000000 -0700
@@ -474,6 +474,8 @@ main (int argc, char **argv)
   if (nocrossref_list != NULL)
     check_nocrossrefs ();
 
+  lang_finish ();
+
   /* Even if we're producing relocatable output, some non-fatal errors should
      be reported in the exit status.  (What non-fatal errors, if any, do we
      want to ignore for relocatable output?)  */
--- ld/testsuite/ld-elf/sec64k.exp.hash	2005-03-03 08:56:35.000000000 -0800
+++ ld/testsuite/ld-elf/sec64k.exp	2005-05-11 08:13:47.000000000 -0700
@@ -24,12 +24,6 @@ if ![is_elf_format] {
     return
 }
 
-# Per-port excludes, since this test takes an overwhelmingly long time
-# currently.
-if { ![istarget cris-*-*] } {
-    return
-}
-
 # Test >64k sections, with and without -r.  First, create the assembly
 # files.  Have a relocation to another section and one within the local
 # section.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: PATCH: Use a hash table for 26X linker speedup
  2005-05-01 19:09 ` PATCH: Use a hash table for 26X linker speedup H. J. Lu
@ 2005-05-21 11:05   ` Christian Joensson
  2005-05-27 16:42     ` H. J. Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Joensson @ 2005-05-21 11:05 UTC (permalink / raw)
  To: H. J. Lu; +Cc: binutils

Whatever happened with this?

/ChJ

On 5/1/05, H. J. Lu <hjl@lucon.org> wrote:
> On Sun, May 01, 2005 at 09:57:07AM -0700, H. J. Lu wrote:
> > Linker is very slow on input files with many sections. The 64k section
> > test is only enabled for CRIS. I did a profile. 70% of linker time is
> > spent in lang_check_section_addresses:
> >
> > Flat profile:
> >
> > Each sample counts as 0.01 seconds.
> >   %   cumulative   self              self     total
> >  time   seconds   seconds    calls  Ks/call  Ks/call  name
> >  72.92    948.31   948.31        1     0.95     0.95 lang_check_section_addresses
> >  22.37   1239.21   290.90   132089     0.00     0.00 lang_output_section_find_1
> >
> > The main problem is we use a single section list. We have to scan the
> > whole list for anything. In case of address overlap check, we only
> > need to check the previous section. There are many other places in
> > bfd, assembler and linker where a double section list will help. With
> > this patch, I got 30% linker speed up in 64k section test:
> >
> > The old linker:
> >
> > sh 1  502.74s user 0.90s system 99% cpu 8:23.73 total
> >
> > The new linker:
> >
> > sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total
> >
> > The profile data shows:
> >
> > Flat profile:
> >
> > Each sample counts as 0.01 seconds.
> >   %   cumulative   self              self     total
> >  time   seconds   seconds    calls   s/call   s/call  name
> >  81.20    256.42   256.42   132089     0.00     0.00 lang_output_section_find_1
> >  13.27    298.33    41.91   673985     0.00     0.00  bfd_hash_lookup
> >
> > I will work on lang_output_section_find_1 next. I am planning to use
> > a hash table. I hope to enable 64k section on all ELF targets.
> >
> 
> This is the patch. I got
> 
> sh 1 13.39s user 1.28s system 95% cpu 15.431 total
> 
> vs.
> 
> sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total
> 
> That is a 26X speed up. I enabled 64k section test for all ELF targets.
> 
> 
> H.J.
> ----
> ld/
> 
> 2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>
> 
>        * ldlang.c (output_statement_hash_entry): New type.
>        (output_statement_table): New variable for hash table.
>        (output_statement_newfunc): New function.
>        (output_statement_table_init): Likewise.
>        (output_statement_table_free): Likewise.
>        (lang_init): Call output_statement_table_init.
>        (lang_finish): Renamed to ...
>        (lang_end): This.
>        (lang_process): Updated.
>        (lang_finish): New function.
>        (lang_output_section_find_1): Use hash table.
>        (lang_output_section_statement_lookup_1): Likewise.
> 
>        * ldlang.h (lang_finish): New.
> 
>        * ldmain.c (main): Call lang_finish.
> 
> ld/testsuite/
> 
> 2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>
> 
>        * ld-elf/sec64k.exp: Enabled for all ELF targets.
> 
> --- ld/ldlang.c.hash    2005-05-01 09:00:10.000000000 -0700
> +++ ld/ldlang.c 2005-05-01 12:05:32.000000000 -0700
> @@ -863,6 +863,45 @@ lang_add_input_file (const char *name,
>   return new_afile (name, file_type, target, TRUE);
>  }
> 
> +struct output_statement_hash_entry
> +{
> +  struct bfd_hash_entry root;
> +  lang_output_section_statement_type *entry;
> +};
> +
> +/* The hash table.  */
> +
> +static struct bfd_hash_table output_statement_table;
> +
> +/* Support routines for the hash table used by lang_output_section_find_1,
> +   initialize the table, fill in an entry and remove the table.  */
> +
> +static struct bfd_hash_entry *
> +output_statement_newfunc (struct bfd_hash_entry *entry ATTRIBUTE_UNUSED,
> +                         struct bfd_hash_table *table,
> +                         const char *string ATTRIBUTE_UNUSED)
> +{
> +  struct output_statement_hash_entry *ret
> +    = bfd_hash_allocate (table,
> +                        sizeof (struct output_statement_hash_entry));
> +  ret->entry = NULL;
> +  return (struct bfd_hash_entry *) ret;
> +}
> +
> +static void
> +output_statement_table_init (void)
> +{
> +  if (! bfd_hash_table_init_n (&output_statement_table,
> +                              output_statement_newfunc, 61))
> +    einfo (_("%P%F: Failed to create hash table\n"));
> +}
> +
> +static void
> +output_statement_table_free (void)
> +{
> +  bfd_hash_table_free (&output_statement_table);
> +}
> +
>  /* Build enough state so that the parser can build its tree.  */
> 
>  void
> @@ -872,6 +911,8 @@ lang_init (void)
> 
>   stat_ptr = &statement_list;
> 
> +  output_statement_table_init ();
> +
>   lang_list_init (stat_ptr);
> 
>   lang_list_init (&input_file_chain);
> @@ -898,6 +939,12 @@ lang_init (void)
>   lang_statement_iteration = 0;
>  }
> 
> +void
> +lang_finish (void)
> +{
> +  output_statement_table_free ();
> +}
> +
>  /*----------------------------------------------------------------------
>   A region is an area of memory declared with the
>   MEMORY {  name:org=exp, len=exp ... }
> @@ -983,16 +1030,28 @@ static lang_output_section_statement_typ
>  lang_output_section_find_1 (const char *const name, int constraint)
>  {
>   lang_output_section_statement_type *lookup;
> +  struct output_statement_hash_entry *entry;
> +  unsigned long hash;
> 
> -  for (lookup = &lang_output_section_statement.head->output_section_statement;
> -       lookup != NULL;
> -       lookup = lookup->next)
> +  entry = ((struct output_statement_hash_entry *)
> +          bfd_hash_lookup (&output_statement_table, name, FALSE,
> +                           FALSE));
> +  if (entry == NULL || (lookup = entry->entry) == NULL)
> +    return NULL;
> +
> +  hash = entry->root.hash;
> +  do
>     {
> -      if (strcmp (name, lookup->name) == 0
> -         && lookup->constraint != -1
> +      if (lookup->constraint != -1
>          && (constraint == 0 || constraint == lookup->constraint))
>        return lookup;
> +      entry = (struct output_statement_hash_entry *) entry->root.next;
> +      lookup = entry ? entry->entry : NULL;
>     }
> +  while (entry != NULL
> +        && entry->root.hash == hash
> +        && strcmp (name, lookup->name) == 0);
> +
>   return NULL;
>  }
> 
> @@ -1010,6 +1069,8 @@ lang_output_section_statement_lookup_1 (
>   lookup = lang_output_section_find_1 (name, constraint);
>   if (lookup == NULL)
>     {
> +      struct output_statement_hash_entry *entry;
> +
>       lookup = new_stat (lang_output_section_statement, stat_ptr);
>       lookup->region = NULL;
>       lookup->lma_region = NULL;
> @@ -1034,6 +1095,15 @@ lang_output_section_statement_lookup_1 (
>       lookup->update_dot_tree = NULL;
>       lookup->phdrs = NULL;
> 
> +      entry = ((struct output_statement_hash_entry *)
> +              bfd_hash_lookup (&output_statement_table, name, TRUE,
> +                               FALSE));
> +      if (entry == NULL)
> +       einfo (_("%P%F: bfd_hash_lookup failed creating section `%s'\n"),
> +              name);
> +
> +      entry->entry = lookup;
> +
>       lang_statement_append (&lang_output_section_statement,
>                             (lang_statement_union_type *) lookup,
>                             (lang_statement_union_type **) &lookup->next);
> @@ -4495,7 +4565,7 @@ lang_set_startof (void)
>  }
> 
>  static void
> -lang_finish (void)
> +lang_end (void)
>  {
>   struct bfd_link_hash_entry *h;
>   bfd_boolean warn;
> @@ -5294,7 +5364,7 @@ lang_process (void)
>   /* Final stuffs.  */
> 
>   ldemul_finish ();
> -  lang_finish ();
> +  lang_end ();
>  }
> 
>  /* EXPORTED TO YACC */
> --- ld/ldlang.h.hash    2005-05-01 09:00:10.000000000 -0700
> +++ ld/ldlang.h 2005-05-01 11:39:36.000000000 -0700
> @@ -449,6 +449,8 @@ extern int lang_statement_iteration;
> 
>  extern void lang_init
>   (void);
> +extern void lang_finish
> +  (void);
>  extern lang_memory_region_type *lang_memory_region_lookup
>   (const char *const, bfd_boolean);
>  extern lang_memory_region_type *lang_memory_region_default
> --- ld/ldmain.c.hash    2005-03-16 17:37:00.000000000 -0800
> +++ ld/ldmain.c 2005-05-01 11:40:39.000000000 -0700
> @@ -474,6 +474,8 @@ main (int argc, char **argv)
>   if (nocrossref_list != NULL)
>     check_nocrossrefs ();
> 
> +  lang_finish ();
> +
>   /* Even if we're producing relocatable output, some non-fatal errors should
>      be reported in the exit status.  (What non-fatal errors, if any, do we
>      want to ignore for relocatable output?)  */
> --- ld/testsuite/ld-elf/sec64k.exp.hash 2005-03-03 08:56:35.000000000 -0800
> +++ ld/testsuite/ld-elf/sec64k.exp      2005-05-01 11:50:34.000000000 -0700
> @@ -24,12 +24,6 @@ if ![is_elf_format] {
>     return
>  }
> 
> -# Per-port excludes, since this test takes an overwhelmingly long time
> -# currently.
> -if { ![istarget cris-*-*] } {
> -    return
> -}
> -
>  # Test >64k sections, with and without -r.  First, create the assembly
>  # files.  Have a relocation to another section and one within the local
>  # section.
> 


-- 
Cheers,

/ChJ

^ permalink raw reply	[flat|nested] 11+ messages in thread

* PATCH: Use a hash table for 26X linker speedup
  2005-05-01 16:57 PATCH: Use a double section list to speed up linker by 30% H. J. Lu
@ 2005-05-01 19:09 ` H. J. Lu
  2005-05-21 11:05   ` Christian Joensson
  0 siblings, 1 reply; 11+ messages in thread
From: H. J. Lu @ 2005-05-01 19:09 UTC (permalink / raw)
  To: binutils

On Sun, May 01, 2005 at 09:57:07AM -0700, H. J. Lu wrote:
> Linker is very slow on input files with many sections. The 64k section
> test is only enabled for CRIS. I did a profile. 70% of linker time is
> spent in lang_check_section_addresses:
> 
> Flat profile:
> 
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  Ks/call  Ks/call  name
>  72.92    948.31   948.31        1     0.95     0.95 lang_check_section_addresses
>  22.37   1239.21   290.90   132089     0.00     0.00 lang_output_section_find_1
> 
> The main problem is we use a single section list. We have to scan the
> whole list for anything. In case of address overlap check, we only
> need to check the previous section. There are many other places in
> bfd, assembler and linker where a double section list will help. With
> this patch, I got 30% linker speed up in 64k section test:
> 
> The old linker:
> 
> sh 1  502.74s user 0.90s system 99% cpu 8:23.73 total
> 
> The new linker:
> 
> sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total
> 
> The profile data shows:
> 
> Flat profile:
> 
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  81.20    256.42   256.42   132089     0.00     0.00 lang_output_section_find_1
>  13.27    298.33    41.91   673985     0.00     0.00  bfd_hash_lookup 
> 
> I will work on lang_output_section_find_1 next. I am planning to use
> a hash table. I hope to enable 64k section on all ELF targets.
> 

This is the patch. I got

sh 1 13.39s user 1.28s system 95% cpu 15.431 total

vs.

sh 1  340.58s user 0.90s system 99% cpu 5:41.55 total

That is a 26X speed up. I enabled 64k section test for all ELF targets.


H.J.
----
ld/

2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ldlang.c (output_statement_hash_entry): New type.
	(output_statement_table): New variable for hash table.
	(output_statement_newfunc): New function.
	(output_statement_table_init): Likewise.
	(output_statement_table_free): Likewise.
	(lang_init): Call output_statement_table_init.
	(lang_finish): Renamed to ...
	(lang_end): This.
	(lang_process): Updated.
	(lang_finish): New function.
	(lang_output_section_find_1): Use hash table.
	(lang_output_section_statement_lookup_1): Likewise.

	* ldlang.h (lang_finish): New.

	* ldmain.c (main): Call lang_finish.

ld/testsuite/

2005-05-01  H.J. Lu  <hongjiu.lu@intel.com>

	* ld-elf/sec64k.exp: Enabled for all ELF targets.

--- ld/ldlang.c.hash	2005-05-01 09:00:10.000000000 -0700
+++ ld/ldlang.c	2005-05-01 12:05:32.000000000 -0700
@@ -863,6 +863,45 @@ lang_add_input_file (const char *name,
   return new_afile (name, file_type, target, TRUE);
 }
 
+struct output_statement_hash_entry
+{
+  struct bfd_hash_entry root;
+  lang_output_section_statement_type *entry;
+};
+
+/* The hash table.  */
+
+static struct bfd_hash_table output_statement_table;
+
+/* Support routines for the hash table used by lang_output_section_find_1,
+   initialize the table, fill in an entry and remove the table.  */
+
+static struct bfd_hash_entry *
+output_statement_newfunc (struct bfd_hash_entry *entry ATTRIBUTE_UNUSED, 
+			  struct bfd_hash_table *table,
+			  const char *string ATTRIBUTE_UNUSED)
+{
+  struct output_statement_hash_entry *ret
+    = bfd_hash_allocate (table,
+			 sizeof (struct output_statement_hash_entry));
+  ret->entry = NULL;
+  return (struct bfd_hash_entry *) ret;
+}
+
+static void
+output_statement_table_init (void)
+{
+  if (! bfd_hash_table_init_n (&output_statement_table,
+			       output_statement_newfunc, 61))
+    einfo (_("%P%F: Failed to create hash table\n"));
+}
+
+static void
+output_statement_table_free (void)
+{
+  bfd_hash_table_free (&output_statement_table);
+}
+
 /* Build enough state so that the parser can build its tree.  */
 
 void
@@ -872,6 +911,8 @@ lang_init (void)
 
   stat_ptr = &statement_list;
 
+  output_statement_table_init ();
+
   lang_list_init (stat_ptr);
 
   lang_list_init (&input_file_chain);
@@ -898,6 +939,12 @@ lang_init (void)
   lang_statement_iteration = 0;
 }
 
+void
+lang_finish (void)
+{
+  output_statement_table_free ();
+}
+
 /*----------------------------------------------------------------------
   A region is an area of memory declared with the
   MEMORY {  name:org=exp, len=exp ... }
@@ -983,16 +1030,28 @@ static lang_output_section_statement_typ
 lang_output_section_find_1 (const char *const name, int constraint)
 {
   lang_output_section_statement_type *lookup;
+  struct output_statement_hash_entry *entry;
+  unsigned long hash;
 
-  for (lookup = &lang_output_section_statement.head->output_section_statement;
-       lookup != NULL;
-       lookup = lookup->next)
+  entry = ((struct output_statement_hash_entry *)
+	   bfd_hash_lookup (&output_statement_table, name, FALSE,
+			    FALSE));
+  if (entry == NULL || (lookup = entry->entry) == NULL)
+    return NULL;
+
+  hash = entry->root.hash;
+  do
     {
-      if (strcmp (name, lookup->name) == 0
-	  && lookup->constraint != -1
+      if (lookup->constraint != -1
 	  && (constraint == 0 || constraint == lookup->constraint))
 	return lookup;
+      entry = (struct output_statement_hash_entry *) entry->root.next;
+      lookup = entry ? entry->entry : NULL;
     }
+  while (entry != NULL
+	 && entry->root.hash == hash
+	 && strcmp (name, lookup->name) == 0);
+
   return NULL;
 }
 
@@ -1010,6 +1069,8 @@ lang_output_section_statement_lookup_1 (
   lookup = lang_output_section_find_1 (name, constraint);
   if (lookup == NULL)
     {
+      struct output_statement_hash_entry *entry;
+
       lookup = new_stat (lang_output_section_statement, stat_ptr);
       lookup->region = NULL;
       lookup->lma_region = NULL;
@@ -1034,6 +1095,15 @@ lang_output_section_statement_lookup_1 (
       lookup->update_dot_tree = NULL;
       lookup->phdrs = NULL;
 
+      entry = ((struct output_statement_hash_entry *)
+	       bfd_hash_lookup (&output_statement_table, name, TRUE,
+				FALSE));
+      if (entry == NULL)
+	einfo (_("%P%F: bfd_hash_lookup failed creating section `%s'\n"),
+	       name);
+
+      entry->entry = lookup;
+
       lang_statement_append (&lang_output_section_statement,
 			     (lang_statement_union_type *) lookup,
 			     (lang_statement_union_type **) &lookup->next);
@@ -4495,7 +4565,7 @@ lang_set_startof (void)
 }
 
 static void
-lang_finish (void)
+lang_end (void)
 {
   struct bfd_link_hash_entry *h;
   bfd_boolean warn;
@@ -5294,7 +5364,7 @@ lang_process (void)
   /* Final stuffs.  */
 
   ldemul_finish ();
-  lang_finish ();
+  lang_end ();
 }
 
 /* EXPORTED TO YACC */
--- ld/ldlang.h.hash	2005-05-01 09:00:10.000000000 -0700
+++ ld/ldlang.h	2005-05-01 11:39:36.000000000 -0700
@@ -449,6 +449,8 @@ extern int lang_statement_iteration;
 
 extern void lang_init
   (void);
+extern void lang_finish
+  (void);
 extern lang_memory_region_type *lang_memory_region_lookup
   (const char *const, bfd_boolean);
 extern lang_memory_region_type *lang_memory_region_default
--- ld/ldmain.c.hash	2005-03-16 17:37:00.000000000 -0800
+++ ld/ldmain.c	2005-05-01 11:40:39.000000000 -0700
@@ -474,6 +474,8 @@ main (int argc, char **argv)
   if (nocrossref_list != NULL)
     check_nocrossrefs ();
 
+  lang_finish ();
+
   /* Even if we're producing relocatable output, some non-fatal errors should
      be reported in the exit status.  (What non-fatal errors, if any, do we
      want to ignore for relocatable output?)  */
--- ld/testsuite/ld-elf/sec64k.exp.hash	2005-03-03 08:56:35.000000000 -0800
+++ ld/testsuite/ld-elf/sec64k.exp	2005-05-01 11:50:34.000000000 -0700
@@ -24,12 +24,6 @@ if ![is_elf_format] {
     return
 }
 
-# Per-port excludes, since this test takes an overwhelmingly long time
-# currently.
-if { ![istarget cris-*-*] } {
-    return
-}
-
 # Test >64k sections, with and without -r.  First, create the assembly
 # files.  Have a relocation to another section and one within the local
 # section.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-10-19 15:39 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-09-29 16:46 PATCH: Use a hash table for 26X linker speedup Steven Bosscher
2005-09-30  7:31 ` Alan Modra
2005-09-30 15:47   ` Nick Clifton
2005-10-01  1:45     ` Paul Brook
2005-10-04  7:23       ` Nick Clifton
2005-10-19  0:59         ` Paul Brook
2005-10-19 15:39           ` Nick Clifton
  -- strict thread matches above, loose matches on Subject: below --
2005-05-01 16:57 PATCH: Use a double section list to speed up linker by 30% H. J. Lu
2005-05-01 19:09 ` PATCH: Use a hash table for 26X linker speedup H. J. Lu
2005-05-21 11:05   ` Christian Joensson
2005-05-27 16:42     ` H. J. Lu
2005-05-29 11:16       ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).