public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* bfd: optimize bfd_elf_hash
@ 2023-04-09 22:55 Nathan Sidwell
  2023-04-11 12:36 ` Alan Modra
  0 siblings, 1 reply; 3+ messages in thread
From: Nathan Sidwell @ 2023-04-09 22:55 UTC (permalink / raw)
  To: Nick Clifton, binutils

[-- Attachment #1: Type: text/plain, Size: 999 bytes --]

I happened to be poking at llvm's implementation of the sysv hash and notice 
binutils' version had similar optimization issues.

neither gcc nor llvm can spot these xforms, and the if (...) at least obscures 
the data flow IMHO anyway.

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

nathan

-- 
Nathan Sidwell

[-- Attachment #2: 0001-bfd-optimize-bfd_elf_hash.patch --]
[-- Type: text/x-patch, Size: 2620 bytes --]

From 4fc6668fcbee2945d79b40f51cdcaeac78f08aa3 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell <nathan@acm.org>
Date: Sun, 9 Apr 2023 18:44:07 -0400
Subject: [PATCH] bfd: optimize bfd_elf_hash

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

	bfd/
	* elf.c (bfd_elf_hash): Refactor to optimize loop.
	(bfd_elf_gnu_hash): Refactor to use 32-bit type.
---
 bfd/elf.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index 87ec1623313..bda83469edd 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -196,23 +196,15 @@ _bfd_elf_swap_versym_out (bfd *abfd,
 unsigned long
 bfd_elf_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 0;
-  unsigned long g;
-  int ch;
+  uint32_t h = 0;
 
-  while ((ch = *name++) != '\0')
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
     {
-      h = (h << 4) + ch;
-      if ((g = (h & 0xf0000000)) != 0)
-	{
-	  h ^= g >> 24;
-	  /* The ELF ABI says `h &= ~g', but this is equivalent in
-	     this case and on some machines one insn instead of two.  */
-	  h ^= g;
-	}
+      h = (h << 4) + *name;
+      h ^= (h >> 24) & 0xf0;
     }
-  return h & 0xffffffff;
+  return h & 0x0fffffff;
 }
 
 /* DT_GNU_HASH hash function.  Do not change this function; you will
@@ -221,13 +213,12 @@ bfd_elf_hash (const char *namearg)
 unsigned long
 bfd_elf_gnu_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 5381;
-  unsigned char ch;
+  uint32_t h = 5381;
 
-  while ((ch = *name++) != '\0')
-    h = (h << 5) + h + ch;
-  return h & 0xffffffff;
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
+    h = (h << 5) + h + *name;
+  return h;
 }
 
 /* Create a tdata field OBJECT_SIZE bytes in length, zeroed out and with
-- 
2.39.2


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: bfd: optimize bfd_elf_hash
  2023-04-09 22:55 bfd: optimize bfd_elf_hash Nathan Sidwell
@ 2023-04-11 12:36 ` Alan Modra
  2023-04-12  5:33   ` Fangrui Song
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Modra @ 2023-04-11 12:36 UTC (permalink / raw)
  To: Nathan Sidwell; +Cc: Nick Clifton, binutils

On Sun, Apr 09, 2023 at 06:55:13PM -0400, Nathan Sidwell via Binutils wrote:
> 	* elf.c (bfd_elf_hash): Refactor to optimize loop.
> 	(bfd_elf_gnu_hash): Refactor to use 32-bit type.

Looks good to me.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: bfd: optimize bfd_elf_hash
  2023-04-11 12:36 ` Alan Modra
@ 2023-04-12  5:33   ` Fangrui Song
  0 siblings, 0 replies; 3+ messages in thread
From: Fangrui Song @ 2023-04-12  5:33 UTC (permalink / raw)
  To: Nathan Sidwell; +Cc: Alan Modra, Nick Clifton, binutils

On Tue, Apr 11, 2023 at 5:36 AM Alan Modra via Binutils
<binutils@sourceware.org> wrote:
>
> On Sun, Apr 09, 2023 at 06:55:13PM -0400, Nathan Sidwell via Binutils wrote:
> >       * elf.c (bfd_elf_hash): Refactor to optimize loop.
> >       (bfd_elf_gnu_hash): Refactor to use 32-bit type.
>
> Looks good to me.
>
> --
> Alan Modra
> Australia Development Lab, IBM

Thanks for the patch! The generic ABI code fragment may have an oversight.
When long represents a 64-bit integer, elf_hash((const unsigned char
*)"\xff\x0f\x0f\x0f\x0f\x0f\x12") returns 0x100000002, larger than
UINT32_MAX.
It is unclear whether a value larger than UINT32_MAX is intended.

I then investigated the binutils-gdb implementation and found that in
a 2003 commit (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=32dfa85d9015feea4a06d423fe58f6eaf841456e),
Andrew Haley appeared to have noticed the issue and made a change to
"Mask lower 32 bits of hash."
This change made me more convinced that the generic ABI code fragment
had an oversight.

If the generic ABI code fragment is indeed an oversight, this
optimized version looks good to me.

I asked https://groups.google.com/g/generic-abi/c/8J_jtjsonrE ("What
if the result of elf_hash is larger than UINT32_MAX?").

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-04-12  5:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-09 22:55 bfd: optimize bfd_elf_hash Nathan Sidwell
2023-04-11 12:36 ` Alan Modra
2023-04-12  5:33   ` Fangrui Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).