bfd: optimize bfd_elf_hash

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

From: Nathan Sidwell <nathan@acm.org>
To: Nick Clifton <nickc@redhat.com>, binutils <binutils@sourceware.org>
Subject: bfd: optimize bfd_elf_hash
Date: Sun, 9 Apr 2023 18:55:13 -0400	[thread overview]
Message-ID: <745eb337-facd-244a-bd02-d9b8b1f653a5@acm.org> (raw)

[-- Attachment #1: Type: text/plain, Size: 999 bytes --]

I happened to be poking at llvm's implementation of the sysv hash and notice 
binutils' version had similar optimization issues.

neither gcc nor llvm can spot these xforms, and the if (...) at least obscures 
the data flow IMHO anyway.

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

nathan

-- 
Nathan Sidwell

[-- Attachment #2: 0001-bfd-optimize-bfd_elf_hash.patch --]
[-- Type: text/x-patch, Size: 2620 bytes --]

From 4fc6668fcbee2945d79b40f51cdcaeac78f08aa3 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell <nathan@acm.org>
Date: Sun, 9 Apr 2023 18:44:07 -0400
Subject: [PATCH] bfd: optimize bfd_elf_hash

The bfd_elf_hash loop is taken straight from the sysV document, but it
is poorly optimized. This refactoring removes about 5 x86 insns from
the 15 insn loop.

1) The if (..) is meaningless -- we're xoring with that value, and of
course xor 0 is a nop. On x86 (at least) we actually compute the xor'd
value and then cmov.  Removing the if test removes the cmov.

2) The 'h ^ g' to clear the top 4 bits is not needed, as those 4 bits
will be shifted out in the next iteration.  All we need to do is sink
a mask of those 4 bits out of the loop.

3) anding with 0xf0 after shifting by 24 bits can allow betterin
encoding on RISC ISAs than masking with '0xf0 << 24' before shifting.
RISC ISAs often require materializing larger constants.

	bfd/
	* elf.c (bfd_elf_hash): Refactor to optimize loop.
	(bfd_elf_gnu_hash): Refactor to use 32-bit type.
---
 bfd/elf.c | 31 +++++++++++--------------------
 1 file changed, 11 insertions(+), 20 deletions(-)

diff --git a/bfd/elf.c b/bfd/elf.c
index 87ec1623313..bda83469edd 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -196,23 +196,15 @@ _bfd_elf_swap_versym_out (bfd *abfd,
 unsigned long
 bfd_elf_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 0;
-  unsigned long g;
-  int ch;
+  uint32_t h = 0;

-  while ((ch = *name++) != '\0')
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
     {
-      h = (h << 4) + ch;
-      if ((g = (h & 0xf0000000)) != 0)
-	{
-	  h ^= g >> 24;
-	  /* The ELF ABI says `h &= ~g', but this is equivalent in
-	     this case and on some machines one insn instead of two.  */
-	  h ^= g;
-	}
+      h = (h << 4) + *name;
+      h ^= (h >> 24) & 0xf0;
     }
-  return h & 0xffffffff;
+  return h & 0x0fffffff;
 }

 /* DT_GNU_HASH hash function.  Do not change this function; you will
@@ -221,13 +213,12 @@ bfd_elf_hash (const char *namearg)
 unsigned long
 bfd_elf_gnu_hash (const char *namearg)
 {
-  const unsigned char *name = (const unsigned char *) namearg;
-  unsigned long h = 5381;
-  unsigned char ch;
+  uint32_t h = 5381;

-  while ((ch = *name++) != '\0')
-    h = (h << 5) + h + ch;
-  return h & 0xffffffff;
+  for (const unsigned char *name = (const unsigned char *) namearg;
+       *name; name++)
+    h = (h << 5) + h + *name;
+  return h;
 }

 /* Create a tdata field OBJECT_SIZE bytes in length, zeroed out and with
-- 
2.39.2

next             reply	other threads:[~2023-04-09 22:55 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-09 22:55 Nathan Sidwell [this message]
2023-04-11 12:36 ` Alan Modra
2023-04-12  5:33   ` Fangrui Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=745eb337-facd-244a-bd02-d9b8b1f653a5@acm.org \
    --to=nathan@acm.org \
    --cc=binutils@sourceware.org \
    --cc=nickc@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).