public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/4] RISCV: Improve linker time complexity
@ 2022-04-12 16:25 Patrick O'Neill
  2022-04-12 16:25 ` [PATCH 1/4] RISCV: Add linker relaxation tests Patrick O'Neill
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-12 16:25 UTC (permalink / raw)
  To: binutils; +Cc: gnu-toolchain, palmer, andrew, Patrick O'Neill

The current linker has an O(n^2) time complexity when it comes to
deleting bytes. By deferring the deletion of bytes, we can achieve O(n)
deletion runtime.

Patrick O'Neill (4):
  RISCV: Add linker relaxation tests
  RISCV: Arrange DELETE pass after .align pass
  RISCV: Implement piecewise deletion
  RISCV: Improve runtime of align directives

 bfd/elfnn-riscv.c                          | 343 ++++++++++++---------
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |   6 +
 ld/testsuite/ld-riscv-elf/relax-call-1.d   |  17 +
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |   7 +
 ld/testsuite/ld-riscv-elf/relax-call-2.d   |  21 ++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   |  10 +
 ld/testsuite/ld-riscv-elf/relax-call-3.d   |  25 ++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   |  13 +
 ld/testsuite/ld-riscv-elf/relax-call-4.d   |  19 ++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |   8 +
 ld/testsuite/ld-riscv-elf/relax-call-5.d   |  23 ++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   |  11 +
 ld/testsuite/ld-riscv-elf/relax-call-6.d   |  22 ++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   |  11 +
 14 files changed, 389 insertions(+), 147 deletions(-)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/4] RISCV: Add linker relaxation tests
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
@ 2022-04-12 16:25 ` Patrick O'Neill
  2022-04-12 16:25 ` [PATCH 2/4] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-12 16:25 UTC (permalink / raw)
  To: binutils; +Cc: gnu-toolchain, palmer, andrew, Patrick O'Neill

Add basic tests for the behavior of relaxing function calls.
This covers behavior like relaxing a call, not relaxing a call, and
relaxing a call that is enabled by relaxing another call.

2022-04-11 Patrick O'Neill <patrick@rivosinc.com>

	* ld-riscv-elf.exp: Add testcases.
	* relax-call-1.d: Add new testcase.
	* relax-call-1.s: Likewise.
	* relax-call-2.d: Likewise.
	* relax-call-2.s: Likewise.
	* relax-call-3.d: Likewise.
	* relax-call-3.s: Likewise.
	* relax-call-4.d: Likewise.
	* relax-call-4.s: Likewise.
	* relax-call-5.d: Likewise.
	* relax-call-5.s: Likewise.
	* relax-call-6.d: Likewise.
	* relax-call-6.s: Likewise.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |  6 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.d   | 17 +++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |  7 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.d   | 21 ++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   | 10 +++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.d   | 25 ++++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   | 13 +++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.d   | 19 ++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |  8 +++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.d   | 23 ++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   | 11 ++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.d   | 22 +++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   | 11 ++++++++++
 13 files changed, 193 insertions(+)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

diff --git a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
index 272424b33e..60f7cc5f20 100644
--- a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
+++ b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
@@ -119,6 +119,12 @@ proc run_relax_twice_test {} {
 }
 
 if [istarget "riscv*-*-*"] {
+    run_dump_test "relax-call-1"
+    run_dump_test "relax-call-2"
+    run_dump_test "relax-call-3"
+    run_dump_test "relax-call-4"
+    run_dump_test "relax-call-5"
+    run_dump_test "relax-call-6"
     run_dump_test "align-small-region"
     run_dump_test "call-relax"
     run_dump_test "pcgp-relax-01"
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.d b/ld/testsuite/ld-riscv-elf/relax-call-1.d
new file mode 100644
index 0000000000..3e7bfb500c
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.d
@@ -0,0 +1,17 @@
+#source: relax-call-1.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.s b/ld/testsuite/ld-riscv-elf/relax-call-1.s
new file mode 100644
index 0000000000..7b81e28f99
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.s
@@ -0,0 +1,7 @@
+	.text
+foo:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.d b/ld/testsuite/ld-riscv-elf/relax-call-2.d
new file mode 100644
index 0000000000..7481bea8c1
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.d
@@ -0,0 +1,21 @@
+#source: relax-call-2.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.s b/ld/testsuite/ld-riscv-elf/relax-call-2.s
new file mode 100644
index 0000000000..7f5844a678
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.s
@@ -0,0 +1,10 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.d b/ld/testsuite/ld-riscv-elf/relax-call-3.d
new file mode 100644
index 0000000000..5c49222e05
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.d
@@ -0,0 +1,25 @@
+#source: relax-call-3.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.*<baz>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*jal.* <baz>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.s b/ld/testsuite/ld-riscv-elf/relax-call-3.s
new file mode 100644
index 0000000000..66e7c02ca1
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.s
@@ -0,0 +1,13 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+baz:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	call	baz
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.d b/ld/testsuite/ld-riscv-elf/relax-call-4.d
new file mode 100644
index 0000000000..09e939ca9b
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.d
@@ -0,0 +1,19 @@
+#source: relax-call-4.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.s b/ld/testsuite/ld-riscv-elf/relax-call-4.s
new file mode 100644
index 0000000000..44eed0bc12
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.s
@@ -0,0 +1,8 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048570
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.d b/ld/testsuite/ld-riscv-elf/relax-call-5.d
new file mode 100644
index 0000000000..d9f44e5f8a
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.d
@@ -0,0 +1,23 @@
+#source: relax-call-5.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.s b/ld/testsuite/ld-riscv-elf/relax-call-5.s
new file mode 100644
index 0000000000..5605fb14e7
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048566
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.d b/ld/testsuite/ld-riscv-elf/relax-call-6.d
new file mode 100644
index 0000000000..c385ccb551
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.d
@@ -0,0 +1,22 @@
+#source: relax-call-6.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <bar>
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.s b/ld/testsuite/ld-riscv-elf/relax-call-6.s
new file mode 100644
index 0000000000..6df73e108f
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048560
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	bar
+	call	foo
+	jr	ra
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/4] RISCV: Arrange DELETE pass after .align pass
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
  2022-04-12 16:25 ` [PATCH 1/4] RISCV: Add linker relaxation tests Patrick O'Neill
@ 2022-04-12 16:25 ` Patrick O'Neill
  2022-04-12 16:26 ` [PATCH 3/4] RISCV: Implement piecewise deletion Patrick O'Neill
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-12 16:25 UTC (permalink / raw)
  To: binutils; +Cc: gnu-toolchain, palmer, andrew, Patrick O'Neill

By moving the deletion pass after the align pass, the linker can touch
each byte once via piecewise deletion. Otherwise, it may need to move
the same bytes twice.

2022-04-11 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c (_bfd_riscv_relax_align): Count to-be-deleted
	  bytes so the alignments are still accurate.
	* elfnn-riscv.c (_bfd_riscv_relax_section): Move DELETE pass
	  after ALIGN pass.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This gives us an O(n^2) runtime when the align pass counts the number of 
deleted bytes preceding it, but this will be fixed in Patch 4/4.
---
 bfd/elfnn-riscv.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 8f9f0d8a86..17f9607744 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4435,7 +4435,7 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 static bool
 _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			asection *sym_sec,
-			struct bfd_link_info *link_info,
+			struct bfd_link_info *link_info ATTRIBUTE_UNUSED,
 			Elf_Internal_Rela *rel,
 			bfd_vma symval,
 			bfd_vma max_alignment ATTRIBUTE_UNUSED,
@@ -4449,6 +4449,18 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   while (alignment <= rel->r_addend)
     alignment *= 2;
 
+  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
+  for (unsigned int i = 0; i < sec->reloc_count; i++)
+    {
+      Elf_Internal_Rela *reloc = relocs + i;
+      /* Ignore annotations after this alignment directive.  */
+      if (reloc == rel)
+	break;
+      /* Account for to-be-deleted bytes  */
+      else if (ELFNN_R_TYPE (reloc->r_info) == R_RISCV_DELETE)
+	symval -= reloc->r_addend;
+    }
+
   symval -= rel->r_addend;
   bfd_vma aligned_addr = ((symval - 1) & ~(alignment - 1)) + alignment;
   bfd_vma nop_bytes = aligned_addr - symval;
@@ -4468,12 +4480,12 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
       return false;
     }
 
-  /* Delete the reloc.  */
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
-
-  /* If the number of NOPs is already correct, there's nothing to do.  */
+  /* If the number of NOPs is already correct, delete the reloc.  */
   if (nop_bytes == rel->r_addend)
-    return true;
+    {
+      rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
+      return true;
+    }
 
   /* Write as many RISC-V NOPs as we need.  */
   for (pos = 0; pos < (nop_bytes & -4); pos += 4)
@@ -4483,10 +4495,11 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   if (nop_bytes % 4 != 0)
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
-  /* Delete the excess bytes.  */
-  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + nop_bytes,
-				   rel->r_addend - nop_bytes, link_info,
-				   NULL);
+  /* Mark the excess bytes for deletion.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
+  rel->r_addend = rel->r_addend - nop_bytes;
+  rel->r_offset = rel->r_offset + nop_bytes;
+  return true;
 }
 
 /* Relax PC-relative references to GP-relative references.  */
@@ -4677,8 +4690,8 @@ bfd_elfNN_riscv_set_data_segment_info (struct bfd_link_info *info,
 /* Relax a section.
 
    Pass 0: Shortens code sequences for LUI/CALL/TPREL/PCREL relocs.
-   Pass 1: Deletes the bytes that PCREL relaxation in pass 0 made obsolete.
-   Pass 2: Which cannot be disabled, handles code alignment directives.  */
+   Pass 1: Which cannot be disabled, handles code alignment directives.
+   Pass 2: Handle DELETE directives.  */
 
 static bool
 _bfd_riscv_relax_section (bfd *abfd, asection *sec,
@@ -4697,7 +4710,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
   *again = false;
 
   if (bfd_link_relocatable (info)
-      || sec->sec_flg0
+      || (sec->sec_flg0 && info->relax_pass == 0)
       || (sec->flags & SEC_RELOC) == 0
       || sec->reloc_count == 0
       || (info->disable_target_specific_optimizations
@@ -4771,10 +4784,10 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	  /* Skip over the R_RISCV_RELAX.  */
 	  i++;
 	}
-      else if (info->relax_pass == 1 && type == R_RISCV_DELETE)
-	relax_func = _bfd_riscv_relax_delete;
-      else if (info->relax_pass == 2 && type == R_RISCV_ALIGN)
+      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
 	relax_func = _bfd_riscv_relax_align;
+      else if (info->relax_pass == 2 && type == R_RISCV_DELETE)
+	relax_func = _bfd_riscv_relax_delete;
       else
 	continue;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 3/4] RISCV: Implement piecewise deletion
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
  2022-04-12 16:25 ` [PATCH 1/4] RISCV: Add linker relaxation tests Patrick O'Neill
  2022-04-12 16:25 ` [PATCH 2/4] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
@ 2022-04-12 16:26 ` Patrick O'Neill
  2022-04-12 16:26 ` [PATCH 4/4] RISCV: Improve runtime of align directives Patrick O'Neill
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-12 16:26 UTC (permalink / raw)
  To: binutils; +Cc: gnu-toolchain, palmer, andrew, Patrick O'Neill

Rather than deleting bytes immediately, we can first mark the bytes for
deletion. Then, using piecewise deletion, we reduce the runtime from
O(n^2) to O(n).

By moving the logic from the riscv_relax_delete_bytes function into the
RISCV_DELETE reloc handler, we can delete the bytes in a piecewise
manner.

2022-04-11 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c: Move deletion logic into DELETE reloc handler
	  and replace with DELETE reloc creation logic.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This change overwrites the R_RISCV_RELAX with a R_RISCV_DELETE reloc
when a relaxation wants to delete bytes. If it was going to delete a
reloc, we use that reloc rather than deleting it.
---
 bfd/elfnn-riscv.c | 302 +++++++++++++++++++++++++---------------------
 1 file changed, 167 insertions(+), 135 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 17f9607744..5ebf7fe578 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4057,117 +4057,14 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
 /* Delete some bytes from a section while relaxing.  */
 
 static bool
-riscv_relax_delete_bytes (bfd *abfd,
-			  asection *sec,
-			  bfd_vma addr,
+riscv_relax_delete_bytes (bfd_vma addr,
 			  size_t count,
-			  struct bfd_link_info *link_info,
-			  riscv_pcgp_relocs *p)
+			  Elf_Internal_Rela *rel)
 {
-  unsigned int i, symcount;
-  bfd_vma toaddr = sec->size;
-  struct elf_link_hash_entry **sym_hashes = elf_sym_hashes (abfd);
-  Elf_Internal_Shdr *symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
-  unsigned int sec_shndx = _bfd_elf_section_from_bfd_section (abfd, sec);
-  struct bfd_elf_section_data *data = elf_section_data (sec);
-  bfd_byte *contents = data->this_hdr.contents;
-
-  /* Actually delete the bytes.  */
-  sec->size -= count;
-  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
-
-  /* Adjust the location of all of the relocs.  Note that we need not
-     adjust the addends, since all PC-relative references must be against
-     symbols, which we will adjust below.  */
-  for (i = 0; i < sec->reloc_count; i++)
-    if (data->relocs[i].r_offset > addr && data->relocs[i].r_offset < toaddr)
-      data->relocs[i].r_offset -= count;
-
-  /* Adjust the hi_sec_off, and the hi_addr of any entries in the pcgp relocs
-     table for which these values occur after the deleted bytes.  */
-  if (p)
-    riscv_update_pcgp_relocs (p, sec, addr, count);
-
-  /* Adjust the local symbols defined in this section.  */
-  for (i = 0; i < symtab_hdr->sh_info; i++)
-    {
-      Elf_Internal_Sym *sym = (Elf_Internal_Sym *) symtab_hdr->contents + i;
-      if (sym->st_shndx == sec_shndx)
-	{
-	  /* If the symbol is in the range of memory we just moved, we
-	     have to adjust its value.  */
-	  if (sym->st_value > addr && sym->st_value <= toaddr)
-	    sym->st_value -= count;
-
-	  /* If the symbol *spans* the bytes we just deleted (i.e. its
-	     *end* is in the moved bytes but its *start* isn't), then we
-	     must adjust its size.
-
-	     This test needs to use the original value of st_value, otherwise
-	     we might accidentally decrease size when deleting bytes right
-	     before the symbol.  But since deleted relocs can't span across
-	     symbols, we can't have both a st_value and a st_size decrease,
-	     so it is simpler to just use an else.  */
-	  else if (sym->st_value <= addr
-		   && sym->st_value + sym->st_size > addr
-		   && sym->st_value + sym->st_size <= toaddr)
-	    sym->st_size -= count;
-	}
-    }
-
-  /* Now adjust the global symbols defined in this section.  */
-  symcount = ((symtab_hdr->sh_size / sizeof (ElfNN_External_Sym))
-	      - symtab_hdr->sh_info);
-
-  for (i = 0; i < symcount; i++)
-    {
-      struct elf_link_hash_entry *sym_hash = sym_hashes[i];
-
-      /* The '--wrap SYMBOL' option is causing a pain when the object file,
-	 containing the definition of __wrap_SYMBOL, includes a direct
-	 call to SYMBOL as well. Since both __wrap_SYMBOL and SYMBOL reference
-	 the same symbol (which is __wrap_SYMBOL), but still exist as two
-	 different symbols in 'sym_hashes', we don't want to adjust
-	 the global symbol __wrap_SYMBOL twice.
-
-	 The same problem occurs with symbols that are versioned_hidden, as
-	 foo becomes an alias for foo@BAR, and hence they need the same
-	 treatment.  */
-      if (link_info->wrap_hash != NULL
-	  || sym_hash->versioned != unversioned)
-	{
-	  struct elf_link_hash_entry **cur_sym_hashes;
-
-	  /* Loop only over the symbols which have already been checked.  */
-	  for (cur_sym_hashes = sym_hashes; cur_sym_hashes < &sym_hashes[i];
-	       cur_sym_hashes++)
-	    {
-	      /* If the current symbol is identical to 'sym_hash', that means
-		 the symbol was already adjusted (or at least checked).  */
-	      if (*cur_sym_hashes == sym_hash)
-		break;
-	    }
-	  /* Don't adjust the symbol again.  */
-	  if (cur_sym_hashes < &sym_hashes[i])
-	    continue;
-	}
-
-      if ((sym_hash->root.type == bfd_link_hash_defined
-	   || sym_hash->root.type == bfd_link_hash_defweak)
-	  && sym_hash->root.u.def.section == sec)
-	{
-	  /* As above, adjust the value if needed.  */
-	  if (sym_hash->root.u.def.value > addr
-	      && sym_hash->root.u.def.value <= toaddr)
-	    sym_hash->root.u.def.value -= count;
-
-	  /* As above, adjust the size if needed.  */
-	  else if (sym_hash->root.u.def.value <= addr
-		   && sym_hash->root.u.def.value + sym_hash->size > addr
-		   && sym_hash->root.u.def.value + sym_hash->size <= toaddr)
-	    sym_hash->size -= count;
-	}
-    }
+  /* Mark bytes for deletion.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
+  rel->r_offset = addr;
+  rel->r_addend = count;
 
   return true;
 }
@@ -4177,7 +4074,7 @@ typedef bool (*relax_func_t) (bfd *, asection *, asection *,
 			      Elf_Internal_Rela *,
 			      bfd_vma, bfd_vma, bfd_vma, bool *,
 			      riscv_pcgp_relocs *,
-			      bool undefined_weak);
+			      bool undefined_weak, bfd_vma *);
 
 /* Relax AUIPC + JALR into JAL.  */
 
@@ -4190,7 +4087,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
 		       bfd_vma reserve_size ATTRIBUTE_UNUSED,
 		       bool *again,
 		       riscv_pcgp_relocs *pcgp_relocs,
-		       bool undefined_weak ATTRIBUTE_UNUSED)
+		       bool undefined_weak ATTRIBUTE_UNUSED,
+		       bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma foff = symval - (sec_addr (sec) + rel->r_offset);
@@ -4252,8 +4150,7 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
 
   /* Delete unnecessary JALR.  */
   *again = true;
-  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
-				   link_info, pcgp_relocs);
+  return riscv_relax_delete_bytes (rel->r_offset + len, 8 - len, rel + 1);
 }
 
 /* Traverse all output sections and return the max alignment.  */
@@ -4286,7 +4183,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
 		      bfd_vma reserve_size,
 		      bool *again,
 		      riscv_pcgp_relocs *pcgp_relocs,
-		      bool undefined_weak)
+		      bool undefined_weak,
+		      bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma gp = riscv_global_pointer_value (link_info);
@@ -4343,11 +4241,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
 	  return true;
 
 	case R_RISCV_HI20:
-	  /* We can delete the unnecessary LUI and reloc.  */
+	  /* We can delete the unnecessary LUI and reuse the reloc.  */
 	  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
 	  *again = true;
-	  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
-					   link_info, pcgp_relocs);
+	  return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
 
 	default:
 	  abort ();
@@ -4379,8 +4276,7 @@ _bfd_riscv_relax_lui (bfd *abfd,
       rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
 
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
-				       link_info, pcgp_relocs);
+      return riscv_relax_delete_bytes (rel->r_offset + 2, 2, rel + 1);
     }
 
   return true;
@@ -4399,7 +4295,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
 			 bool *again,
 			 riscv_pcgp_relocs *pcgp_relocs,
-			 bool undefined_weak ATTRIBUTE_UNUSED)
+			 bool undefined_weak ATTRIBUTE_UNUSED,
+			 bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   /* See if this symbol is in range of tp.  */
   if (RISCV_CONST_HIGH_PART (tpoff (link_info, symval)) != 0)
@@ -4418,11 +4315,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 
     case R_RISCV_TPREL_HI20:
     case R_RISCV_TPREL_ADD:
-      /* We can delete the unnecessary instruction and reloc.  */
+      /* We can delete the unnecessary instruction and reuse the reloc.  */
       rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
-				       pcgp_relocs);
+      return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
 
     default:
       abort ();
@@ -4442,7 +4338,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			bfd_vma reserve_size ATTRIBUTE_UNUSED,
 			bool *again ATTRIBUTE_UNUSED,
 			riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
-			bool undefined_weak ATTRIBUTE_UNUSED)
+			bool undefined_weak ATTRIBUTE_UNUSED,
+			bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma alignment = 1, pos;
@@ -4496,10 +4393,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
   /* Mark the excess bytes for deletion.  */
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
-  rel->r_addend = rel->r_addend - nop_bytes;
-  rel->r_offset = rel->r_offset + nop_bytes;
-  return true;
+  return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
+				   rel->r_addend - nop_bytes, rel);
 }
 
 /* Relax PC-relative references to GP-relative references.  */
@@ -4515,7 +4410,8 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
 		     bfd_vma reserve_size,
 		     bool *again ATTRIBUTE_UNUSED,
 		     riscv_pcgp_relocs *pcgp_relocs,
-		     bool undefined_weak)
+		     bool undefined_weak,
+		     bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma gp = riscv_global_pointer_value (link_info);
@@ -4666,13 +4562,148 @@ _bfd_riscv_relax_delete (bfd *abfd,
 			 bfd_vma max_alignment ATTRIBUTE_UNUSED,
 			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
 			 bool *again ATTRIBUTE_UNUSED,
-			 riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
-			 bool undefined_weak ATTRIBUTE_UNUSED)
+			 riscv_pcgp_relocs *pcgp_relocs,
+			 bool undefined_weak ATTRIBUTE_UNUSED,
+			 bfd_vma *delete_total)
 {
-  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
-				 link_info, NULL))
-    return false;
+  unsigned int i, symcount;
+  bfd_vma toaddr = sec->size;
+  struct elf_link_hash_entry **sym_hashes = elf_sym_hashes (abfd);
+  Elf_Internal_Shdr *symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
+  unsigned int sec_shndx = _bfd_elf_section_from_bfd_section (abfd, sec);
+  struct bfd_elf_section_data *data = elf_section_data (sec);
+  bfd_byte *contents = data->this_hdr.contents;
+
+  bfd_vma addr = rel->r_offset;
+  size_t count = rel->r_addend;
+  riscv_pcgp_relocs *p = pcgp_relocs;
+
+  /* Find the next DELETE reloc (if one exists).  */
+  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
+  Elf_Internal_Rela *next_delete = NULL;
+  /* Since we only replace existing relocs and don't add new relocs, the
+     relocs are in sequential order. We can skip the relocs prior to this one,
+     making this search linear time.  */
+  int start = rel - relocs;
+  for (unsigned int i = start; i < sec->reloc_count; i++)
+    {
+      next_delete = relocs + i;
+      if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
+	  && next_delete->r_offset > rel->r_offset)
+	break;
+      else
+	next_delete = NULL;
+    }
+
+  size_t bytes_to_move;
+  /* Make this a piecewise deletion.  */
+  if (next_delete == NULL)
+    bytes_to_move = toaddr - addr - count;
+  else
+    bytes_to_move = next_delete->r_offset - addr - count;
+
+  /* Actually delete the bytes.  */
+  sec->size -= count;
+  memmove (contents + addr, contents + addr + count + *delete_total, bytes_to_move);
+
+  *delete_total += count;
+
+  /* Delete the reloc.  */
   rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
+
+  /* Adjust the location of all of the relocs.  Note that we need not
+     adjust the addends, since all PC-relative references must be against
+     symbols, which we will adjust below.  */
+  for (i = 0; i < sec->reloc_count; i++)
+    if (data->relocs[i].r_offset > addr && data->relocs[i].r_offset < toaddr)
+      data->relocs[i].r_offset -= count;
+
+  /* Adjust the hi_sec_off, and the hi_addr of any entries in the pcgp relocs
+     table for which these values occur after the deleted bytes.  */
+  if (p)
+    riscv_update_pcgp_relocs (p, sec, addr, count);
+
+  /* Adjust the local symbols defined in this section.  */
+  for (i = 0; i < symtab_hdr->sh_info; i++)
+    {
+      Elf_Internal_Sym *sym = (Elf_Internal_Sym *) symtab_hdr->contents + i;
+      if (sym->st_shndx == sec_shndx)
+	{
+	  /* If the symbol is in the range of memory we just moved, we
+	     have to adjust its value.  */
+	  if (sym->st_value > addr && sym->st_value <= toaddr)
+	    sym->st_value -= count;
+
+	  /* If the symbol *spans* the bytes we just deleted (i.e. its
+	     *end* is in the moved bytes but its *start* isn't), then we
+	     must adjust its size.
+
+	     This test needs to use the original value of st_value, otherwise
+	     we might accidentally decrease size when deleting bytes right
+	     before the symbol.  But since deleted relocs can't span across
+	     symbols, we can't have both a st_value and a st_size decrease,
+	     so it is simpler to just use an else.  */
+	  else if (sym->st_value <= addr
+		   && sym->st_value + sym->st_size > addr
+		   && sym->st_value + sym->st_size <= toaddr)
+	    sym->st_size -= count;
+	}
+    }
+
+  /* Now adjust the global symbols defined in this section.  */
+  symcount = ((symtab_hdr->sh_size / sizeof (ElfNN_External_Sym))
+	      - symtab_hdr->sh_info);
+
+  for (i = 0; i < symcount; i++)
+    {
+      struct elf_link_hash_entry *sym_hash = sym_hashes[i];
+
+      /* The '--wrap SYMBOL' option is causing a pain when the object file,
+	 containing the definition of __wrap_SYMBOL, includes a direct
+	 call to SYMBOL as well. Since both __wrap_SYMBOL and SYMBOL reference
+	 the same symbol (which is __wrap_SYMBOL), but still exist as two
+	 different symbols in 'sym_hashes', we don't want to adjust
+	 the global symbol __wrap_SYMBOL twice.
+
+	 The same problem occurs with symbols that are versioned_hidden, as
+	 foo becomes an alias for foo@BAR, and hence they need the same
+	 treatment.  */
+      if (link_info->wrap_hash != NULL
+	  || sym_hash->versioned != unversioned)
+	{
+	  struct elf_link_hash_entry **cur_sym_hashes;
+
+	  /* Loop only over the symbols which have already been checked.  */
+	  for (cur_sym_hashes = sym_hashes; cur_sym_hashes < &sym_hashes[i];
+	       cur_sym_hashes++)
+	    {
+	      /* If the current symbol is identical to 'sym_hash', that means
+		 the symbol was already adjusted (or at least checked).  */
+	      if (*cur_sym_hashes == sym_hash)
+		break;
+	    }
+	  /* Don't adjust the symbol again.  */
+	  if (cur_sym_hashes < &sym_hashes[i])
+	    continue;
+	}
+
+      if ((sym_hash->root.type == bfd_link_hash_defined
+	   || sym_hash->root.type == bfd_link_hash_defweak)
+	  && sym_hash->root.u.def.section == sec)
+	{
+	  /* As above, adjust the value if needed.  */
+	  if (sym_hash->root.u.def.value > addr
+	      && sym_hash->root.u.def.value <= toaddr)
+	    sym_hash->root.u.def.value -= count;
+
+	  /* As above, adjust the size if needed.  */
+	  else if (sym_hash->root.u.def.value <= addr
+		   && sym_hash->root.u.def.value + sym_hash->size > addr
+		   && sym_hash->root.u.def.value + sym_hash->size <= toaddr)
+	    sym_hash->size -= count;
+	}
+    }
+
   return true;
 }
 
@@ -4706,6 +4737,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
   unsigned int i;
   bfd_vma max_alignment, reserve_size = 0;
   riscv_pcgp_relocs pcgp_relocs;
+  bfd_vma delete_bytes = 0;
 
   *again = false;
 
@@ -4941,7 +4973,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 
       if (!relax_func (abfd, sec, sym_sec, info, rel, symval,
 		       max_alignment, reserve_size, again,
-		       &pcgp_relocs, undefined_weak))
+		       &pcgp_relocs, undefined_weak, &delete_bytes))
 	goto fail;
     }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 4/4] RISCV: Improve runtime of align directives
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
                   ` (2 preceding siblings ...)
  2022-04-12 16:26 ` [PATCH 3/4] RISCV: Implement piecewise deletion Patrick O'Neill
@ 2022-04-12 16:26 ` Patrick O'Neill
  2022-04-13  0:58 ` [PATCH 0/4] RISCV: Improve linker time complexity Kito Cheng
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-12 16:26 UTC (permalink / raw)
  To: binutils; +Cc: gnu-toolchain, palmer, andrew, Patrick O'Neill

Align directives need an accurate count of the number of bytes to be
deleted. This can be computed using a running sum, giving us an O(n),
rather than O(n^2) runtime.

2022-04-11 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c (_bfd_riscv_relax_align): Rely on running
	  delete_bytes count when calculating alignment.
	* elfnn-riscv.c (_bfd_riscv_relax_section): Calculate running
	  deletion total during align pass.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This fixes the O(n^2) runtime introduced in patch 2/4
---
 bfd/elfnn-riscv.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 5ebf7fe578..e08d60edb0 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4339,24 +4339,15 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			bool *again ATTRIBUTE_UNUSED,
 			riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
 			bool undefined_weak ATTRIBUTE_UNUSED,
-			bfd_vma *delete_total ATTRIBUTE_UNUSED)
+			bfd_vma *delete_total)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma alignment = 1, pos;
   while (alignment <= rel->r_addend)
     alignment *= 2;
 
-  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
-  for (unsigned int i = 0; i < sec->reloc_count; i++)
-    {
-      Elf_Internal_Rela *reloc = relocs + i;
-      /* Ignore annotations after this alignment directive.  */
-      if (reloc == rel)
-	break;
-      /* Account for to-be-deleted bytes  */
-      else if (ELFNN_R_TYPE (reloc->r_info) == R_RISCV_DELETE)
-	symval -= reloc->r_addend;
-    }
+  /* Account for to-be-deleted bytes.  */
+  symval -= *delete_total;
 
   symval -= rel->r_addend;
   bfd_vma aligned_addr = ((symval - 1) & ~(alignment - 1)) + alignment;
@@ -4392,6 +4383,9 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   if (nop_bytes % 4 != 0)
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
+  /* Account for the soon-to-be marked bytes.  */
+  *delete_total += rel->r_addend - nop_bytes;
+
   /* Mark the excess bytes for deletion.  */
   return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
 				   rel->r_addend - nop_bytes, rel);
@@ -4816,8 +4810,18 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	  /* Skip over the R_RISCV_RELAX.  */
 	  i++;
 	}
-      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
-	relax_func = _bfd_riscv_relax_align;
+      else if (info->relax_pass == 1)
+        {
+	  if (type == R_RISCV_ALIGN)
+	    relax_func = _bfd_riscv_relax_align;
+	  else if (type == R_RISCV_DELETE)
+	    {
+	      delete_bytes += rel->r_addend;
+	      continue;
+	    }
+	  else
+	    continue;
+	}
       else if (info->relax_pass == 2 && type == R_RISCV_DELETE)
 	relax_func = _bfd_riscv_relax_delete;
       else
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] RISCV: Improve linker time complexity
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
                   ` (3 preceding siblings ...)
  2022-04-12 16:26 ` [PATCH 4/4] RISCV: Improve runtime of align directives Patrick O'Neill
@ 2022-04-13  0:58 ` Kito Cheng
  2022-04-13  2:23   ` Palmer Dabbelt
  2022-04-13  5:12   ` Alan Modra
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
  5 siblings, 2 replies; 22+ messages in thread
From: Kito Cheng @ 2022-04-13  0:58 UTC (permalink / raw)
  To: Patrick O'Neill
  Cc: Binutils, gnu-toolchain, Palmer Dabbelt, Nelson Chu, Jim Wilson

Hi Patrick:

Thanks for your amazing work! that's really good to see here is O(N)
implementation here,

And I have a suggestion here is - does it possible to let co-exist with current
implementation and having a command line option to select the linker
relaxation, of course we
could default to using the new implementation, but that gives us an
emergency fallback option to use the old implementation :)

Thanks





On Wed, Apr 13, 2022 at 12:27 AM Patrick O'Neill <patrick@rivosinc.com> wrote:
>
> The current linker has an O(n^2) time complexity when it comes to
> deleting bytes. By deferring the deletion of bytes, we can achieve O(n)
> deletion runtime.
>
> Patrick O'Neill (4):
>   RISCV: Add linker relaxation tests
>   RISCV: Arrange DELETE pass after .align pass
>   RISCV: Implement piecewise deletion
>   RISCV: Improve runtime of align directives
>
>  bfd/elfnn-riscv.c                          | 343 ++++++++++++---------
>  ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |   6 +
>  ld/testsuite/ld-riscv-elf/relax-call-1.d   |  17 +
>  ld/testsuite/ld-riscv-elf/relax-call-1.s   |   7 +
>  ld/testsuite/ld-riscv-elf/relax-call-2.d   |  21 ++
>  ld/testsuite/ld-riscv-elf/relax-call-2.s   |  10 +
>  ld/testsuite/ld-riscv-elf/relax-call-3.d   |  25 ++
>  ld/testsuite/ld-riscv-elf/relax-call-3.s   |  13 +
>  ld/testsuite/ld-riscv-elf/relax-call-4.d   |  19 ++
>  ld/testsuite/ld-riscv-elf/relax-call-4.s   |   8 +
>  ld/testsuite/ld-riscv-elf/relax-call-5.d   |  23 ++
>  ld/testsuite/ld-riscv-elf/relax-call-5.s   |  11 +
>  ld/testsuite/ld-riscv-elf/relax-call-6.d   |  22 ++
>  ld/testsuite/ld-riscv-elf/relax-call-6.s   |  11 +
>  14 files changed, 389 insertions(+), 147 deletions(-)
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s
>
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] RISCV: Improve linker time complexity
  2022-04-13  0:58 ` [PATCH 0/4] RISCV: Improve linker time complexity Kito Cheng
@ 2022-04-13  2:23   ` Palmer Dabbelt
  2022-04-13  5:12   ` Alan Modra
  1 sibling, 0 replies; 22+ messages in thread
From: Palmer Dabbelt @ 2022-04-13  2:23 UTC (permalink / raw)
  To: Kito Cheng
  Cc: Patrick O'Neill, binutils, gnu-toolchain, Nelson Chu, Jim Wilson

On Tue, 12 Apr 2022 17:58:38 PDT (-0700), Kito Cheng wrote:
> Hi Patrick:
>
> Thanks for your amazing work! that's really good to see here is O(N)
> implementation here,
>
> And I have a suggestion here is - does it possible to let co-exist with current
> implementation and having a command line option to select the linker
> relaxation, of course we
> could default to using the new implementation, but that gives us an
> emergency fallback option to use the old implementation :)

I haven't actually looked at the code, but based on the design there 
should be almost no difference between the two implementations: 
essentially it's just an extra half-iteration of latency between the 
deletion and address calculation.  In theory it wouldn't be that hard to 
have a flag that just turns the "mark for deletion" step into an 
"actually delete" step, which would result in exactly the same behavior 
we had before, but I'm not sure how that maps to the actual 
implementation as these patches don't look like I expected them to.

I'll try and find some time to sort out why the diff is bigger than I 
expected it to be.

>
> Thanks
>
>
>
>
>
> On Wed, Apr 13, 2022 at 12:27 AM Patrick O'Neill <patrick@rivosinc.com> wrote:
>>
>> The current linker has an O(n^2) time complexity when it comes to
>> deleting bytes. By deferring the deletion of bytes, we can achieve O(n)
>> deletion runtime.
>>
>> Patrick O'Neill (4):
>>   RISCV: Add linker relaxation tests
>>   RISCV: Arrange DELETE pass after .align pass
>>   RISCV: Implement piecewise deletion
>>   RISCV: Improve runtime of align directives
>>
>>  bfd/elfnn-riscv.c                          | 343 ++++++++++++---------
>>  ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |   6 +
>>  ld/testsuite/ld-riscv-elf/relax-call-1.d   |  17 +
>>  ld/testsuite/ld-riscv-elf/relax-call-1.s   |   7 +
>>  ld/testsuite/ld-riscv-elf/relax-call-2.d   |  21 ++
>>  ld/testsuite/ld-riscv-elf/relax-call-2.s   |  10 +
>>  ld/testsuite/ld-riscv-elf/relax-call-3.d   |  25 ++
>>  ld/testsuite/ld-riscv-elf/relax-call-3.s   |  13 +
>>  ld/testsuite/ld-riscv-elf/relax-call-4.d   |  19 ++
>>  ld/testsuite/ld-riscv-elf/relax-call-4.s   |   8 +
>>  ld/testsuite/ld-riscv-elf/relax-call-5.d   |  23 ++
>>  ld/testsuite/ld-riscv-elf/relax-call-5.s   |  11 +
>>  ld/testsuite/ld-riscv-elf/relax-call-6.d   |  22 ++
>>  ld/testsuite/ld-riscv-elf/relax-call-6.s   |  11 +
>>  14 files changed, 389 insertions(+), 147 deletions(-)
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
>>  create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s
>>
>> --
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] RISCV: Improve linker time complexity
  2022-04-13  0:58 ` [PATCH 0/4] RISCV: Improve linker time complexity Kito Cheng
  2022-04-13  2:23   ` Palmer Dabbelt
@ 2022-04-13  5:12   ` Alan Modra
  2022-04-13 18:11     ` Palmer Dabbelt
  1 sibling, 1 reply; 22+ messages in thread
From: Alan Modra @ 2022-04-13  5:12 UTC (permalink / raw)
  To: Kito Cheng; +Cc: Patrick O'Neill, Binutils, gnu-toolchain

On Wed, Apr 13, 2022 at 08:58:38AM +0800, Kito Cheng via Binutils wrote:
> And I have a suggestion here is - does it possible to let co-exist with current
> implementation and having a command line option to select the linker
> relaxation, of course we
> could default to using the new implementation, but that gives us an
> emergency fallback option to use the old implementation :)

You already have an emergency fallback, use an older binutils or
revert the patchset.  IMO you do not want two implementations of any
given feature.   Doing so just makes it more likely that neither
implementation is good.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] RISCV: Improve linker time complexity
  2022-04-13  5:12   ` Alan Modra
@ 2022-04-13 18:11     ` Palmer Dabbelt
  2022-04-25 17:26       ` Patrick O'Neill
  0 siblings, 1 reply; 22+ messages in thread
From: Palmer Dabbelt @ 2022-04-13 18:11 UTC (permalink / raw)
  To: binutils; +Cc: Kito Cheng, Patrick O'Neill, binutils, gnu-toolchain

On Tue, 12 Apr 2022 22:12:22 PDT (-0700), binutils@sourceware.org wrote:
> On Wed, Apr 13, 2022 at 08:58:38AM +0800, Kito Cheng via Binutils wrote:
>> And I have a suggestion here is - does it possible to let co-exist with current
>> implementation and having a command line option to select the linker
>> relaxation, of course we
>> could default to using the new implementation, but that gives us an
>> emergency fallback option to use the old implementation :)
>
> You already have an emergency fallback, use an older binutils or
> revert the patchset.  IMO you do not want two implementations of any
> given feature.   Doing so just makes it more likely that neither
> implementation is good.

IMO a key point here is that the hueristics are subtly different, the 
linear-time algorithm will fail at both forwards and backwards targets 
where relaxation enables relaxation (as opposed to just failing at the 
forwards targets, like the old one did).  The theory is that there 
aren't any pathological cases in the wild, but it's hard to know for 
sure.  I think it should just be a few lines of code to match the old 
behavior (ie, just eagerly delete instead of deferring it to after all 
relocations are processed), but I'm not sure -- the change around 
alignment handling is tripping me up, as that was unexpected on my end.

That said, there's certainly enough complexity here so I don't think 
it's a big deal to just only support the new flavor.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] RISCV: Improve linker time complexity
  2022-04-13 18:11     ` Palmer Dabbelt
@ 2022-04-25 17:26       ` Patrick O'Neill
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-04-25 17:26 UTC (permalink / raw)
  To: Palmer Dabbelt, binutils; +Cc: Kito Cheng, gnu-toolchain

On 4/13/22 11:11, Palmer Dabbelt wrote:
> On Tue, 12 Apr 2022 22:12:22 PDT (-0700), binutils@sourceware.org wrote:
>> On Wed, Apr 13, 2022 at 08:58:38AM +0800, Kito Cheng via Binutils wrote:
>>> And I have a suggestion here is - does it possible to let co-exist 
>>> with current
>>> implementation and having a command line option to select the linker
>>> relaxation, of course we
>>> could default to using the new implementation, but that gives us an
>>> emergency fallback option to use the old implementation :)
>>
>> You already have an emergency fallback, use an older binutils or
>> revert the patchset.  IMO you do not want two implementations of any
>> given feature.   Doing so just makes it more likely that neither
>> implementation is good.
>
> IMO a key point here is that the hueristics are subtly different, the 
> linear-time algorithm will fail at both forwards and backwards targets 
> where relaxation enables relaxation (as opposed to just failing at the 
> forwards targets, like the old one did).  The theory is that there 
> aren't any pathological cases in the wild, but it's hard to know for 
> sure.  I think it should just be a few lines of code to match the old 
> behavior (ie, just eagerly delete instead of deferring it to after all 
> relocations are processed), but I'm not sure -- the change around 
> alignment handling is tripping me up, as that was unexpected on my end.
>
> That said, there's certainly enough complexity here so I don't think 
> it's a big deal to just only support the new flavor.

If we're only concerned about the backwards targets then it should be
relatively easy to have it evaluate the relaxation deletions
immediately.

There's a challenge in that the current method expects all deletions to
occur one after another with a running tally of deleted bytes. This
causes issues when a deletion depends on a later deletion to clear up
any slack introduced. To solve this, I think it'd be 2 if statements.
One to delete immediately, and one to not depend on a running tally when
deleting immediately.

Regarding the change to alignment/reordered passes, those changes just
defer the deletions related to alignment to one pass at the end. Other
than concerns around correctness, I don't see a reason to keep the old
flavor of alignment around. The old flavor has O(n^2) time complexity
and doesn't change the underlying behavior.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 0/5] RISCV: Improve linker time complexity
  2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
                   ` (4 preceding siblings ...)
  2022-04-13  0:58 ` [PATCH 0/4] RISCV: Improve linker time complexity Kito Cheng
@ 2022-05-02 13:50 ` Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 1/5] RISCV: Add linker relaxation tests Patrick O'Neill
                     ` (5 more replies)
  5 siblings, 6 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

The current linker has an O(n^2) time complexity when it comes to
deleting bytes. By deferring the deletion of bytes, we can achieve O(n)
deletion runtime.

There is a pathological case that could cause this to have worse
performance, so the --no-defer-deletion flag is added to allow users
to opt-out.

Patrick O'Neill (4):
  RISCV: Add linker relaxation tests
  RISCV: Arrange DELETE pass after .align pass
  RISCV: Implement piecewise deletion
  RISCV: Improve runtime of align directives
  RISCV: Add --defer-deletion flag

 bfd/elfnn-riscv.c                          | 212 ++++++++++++++-------
 include/bfdlink.h                          |   4 +
 ld/ld.texi                                 |  10 +
 ld/ldlex.h                                 |   2 +
 ld/lexsup.c                                |  10 +
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |   6 +
 ld/testsuite/ld-riscv-elf/relax-call-1.d   |  17 ++
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |   7 +
 ld/testsuite/ld-riscv-elf/relax-call-2.d   |  21 ++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   |  10 +
 ld/testsuite/ld-riscv-elf/relax-call-3.d   |  25 +++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   |  13 ++
 ld/testsuite/ld-riscv-elf/relax-call-4.d   |  19 ++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |   8 +
 ld/testsuite/ld-riscv-elf/relax-call-5.d   |  23 +++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   |  11 ++
 ld/testsuite/ld-riscv-elf/relax-call-6.d   |  22 +++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   |  11 ++
 18 files changed, 363 insertions(+), 68 deletions(-)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/5] RISCV: Add linker relaxation tests
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
@ 2022-05-02 13:50   ` Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 2/5] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

Add basic tests for the behavior of relaxing function calls.
This covers behavior like relaxing a call, not relaxing a call, and
relaxing a call that is enabled by relaxing another call.

2022-04-29 Patrick O'Neill <patrick@rivosinc.com>

	* ld-riscv-elf.exp: Add testcases.
	* relax-call-1.d: Add new testcase.
	* relax-call-1.s: Likewise.
	* relax-call-2.d: Likewise.
	* relax-call-2.s: Likewise.
	* relax-call-3.d: Likewise.
	* relax-call-3.s: Likewise.
	* relax-call-4.d: Likewise.
	* relax-call-4.s: Likewise.
	* relax-call-5.d: Likewise.
	* relax-call-5.s: Likewise.
	* relax-call-6.d: Likewise.
	* relax-call-6.s: Likewise.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |  6 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.d   | 17 +++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |  7 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.d   | 21 ++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   | 10 +++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.d   | 25 ++++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   | 13 +++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.d   | 19 ++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |  8 +++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.d   | 23 ++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   | 11 ++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.d   | 22 +++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   | 11 ++++++++++
 13 files changed, 193 insertions(+)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

diff --git a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
index 272424b33e3..60f7cc5f208 100644
--- a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
+++ b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
@@ -119,6 +119,12 @@ proc run_relax_twice_test {} {
 }
 
 if [istarget "riscv*-*-*"] {
+    run_dump_test "relax-call-1"
+    run_dump_test "relax-call-2"
+    run_dump_test "relax-call-3"
+    run_dump_test "relax-call-4"
+    run_dump_test "relax-call-5"
+    run_dump_test "relax-call-6"
     run_dump_test "align-small-region"
     run_dump_test "call-relax"
     run_dump_test "pcgp-relax-01"
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.d b/ld/testsuite/ld-riscv-elf/relax-call-1.d
new file mode 100644
index 00000000000..3e7bfb500c9
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.d
@@ -0,0 +1,17 @@
+#source: relax-call-1.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.s b/ld/testsuite/ld-riscv-elf/relax-call-1.s
new file mode 100644
index 00000000000..7b81e28f99f
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.s
@@ -0,0 +1,7 @@
+	.text
+foo:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.d b/ld/testsuite/ld-riscv-elf/relax-call-2.d
new file mode 100644
index 00000000000..7481bea8c1e
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.d
@@ -0,0 +1,21 @@
+#source: relax-call-2.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.s b/ld/testsuite/ld-riscv-elf/relax-call-2.s
new file mode 100644
index 00000000000..7f5844a6789
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.s
@@ -0,0 +1,10 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.d b/ld/testsuite/ld-riscv-elf/relax-call-3.d
new file mode 100644
index 00000000000..5c49222e056
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.d
@@ -0,0 +1,25 @@
+#source: relax-call-3.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.*<baz>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*jal.* <baz>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.s b/ld/testsuite/ld-riscv-elf/relax-call-3.s
new file mode 100644
index 00000000000..66e7c02ca19
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.s
@@ -0,0 +1,13 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+baz:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	call	baz
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.d b/ld/testsuite/ld-riscv-elf/relax-call-4.d
new file mode 100644
index 00000000000..09e939ca9b4
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.d
@@ -0,0 +1,19 @@
+#source: relax-call-4.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.s b/ld/testsuite/ld-riscv-elf/relax-call-4.s
new file mode 100644
index 00000000000..44eed0bc128
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.s
@@ -0,0 +1,8 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048570
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.d b/ld/testsuite/ld-riscv-elf/relax-call-5.d
new file mode 100644
index 00000000000..d9f44e5f8a2
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.d
@@ -0,0 +1,23 @@
+#source: relax-call-5.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.s b/ld/testsuite/ld-riscv-elf/relax-call-5.s
new file mode 100644
index 00000000000..5605fb14e78
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048566
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.d b/ld/testsuite/ld-riscv-elf/relax-call-6.d
new file mode 100644
index 00000000000..c385ccb5511
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.d
@@ -0,0 +1,22 @@
+#source: relax-call-6.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <bar>
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.s b/ld/testsuite/ld-riscv-elf/relax-call-6.s
new file mode 100644
index 00000000000..6df73e108f4
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048560
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	bar
+	call	foo
+	jr	ra
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 2/5] RISCV: Arrange DELETE pass after .align pass
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 1/5] RISCV: Add linker relaxation tests Patrick O'Neill
@ 2022-05-02 13:50   ` Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 3/5] RISCV: Implement piecewise deletion Patrick O'Neill
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

By moving the deletion pass after the align pass, the linker can touch
each byte once via piecewise deletion. Otherwise, it may need to move
the same bytes twice.

2022-04-29 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c (_bfd_riscv_relax_align): Count to-be-deleted
	  bytes so the alignments are still accurate.
	* elfnn-riscv.c (_bfd_riscv_relax_section): Move DELETE pass
	  after ALIGN pass.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This gives us an O(n^2) runtime when the align pass counts the number of 
deleted bytes preceding it, but this will be fixed in Patch 4/5.
---
 bfd/elfnn-riscv.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 8f9f0d8a86a..17f9607744f 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4435,7 +4435,7 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 static bool
 _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			asection *sym_sec,
-			struct bfd_link_info *link_info,
+			struct bfd_link_info *link_info ATTRIBUTE_UNUSED,
 			Elf_Internal_Rela *rel,
 			bfd_vma symval,
 			bfd_vma max_alignment ATTRIBUTE_UNUSED,
@@ -4449,6 +4449,18 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   while (alignment <= rel->r_addend)
     alignment *= 2;
 
+  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
+  for (unsigned int i = 0; i < sec->reloc_count; i++)
+    {
+      Elf_Internal_Rela *reloc = relocs + i;
+      /* Ignore annotations after this alignment directive.  */
+      if (reloc == rel)
+	break;
+      /* Account for to-be-deleted bytes  */
+      else if (ELFNN_R_TYPE (reloc->r_info) == R_RISCV_DELETE)
+	symval -= reloc->r_addend;
+    }
+
   symval -= rel->r_addend;
   bfd_vma aligned_addr = ((symval - 1) & ~(alignment - 1)) + alignment;
   bfd_vma nop_bytes = aligned_addr - symval;
@@ -4468,12 +4480,12 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
       return false;
     }
 
-  /* Delete the reloc.  */
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
-
-  /* If the number of NOPs is already correct, there's nothing to do.  */
+  /* If the number of NOPs is already correct, delete the reloc.  */
   if (nop_bytes == rel->r_addend)
-    return true;
+    {
+      rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
+      return true;
+    }
 
   /* Write as many RISC-V NOPs as we need.  */
   for (pos = 0; pos < (nop_bytes & -4); pos += 4)
@@ -4483,10 +4495,11 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   if (nop_bytes % 4 != 0)
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
-  /* Delete the excess bytes.  */
-  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + nop_bytes,
-				   rel->r_addend - nop_bytes, link_info,
-				   NULL);
+  /* Mark the excess bytes for deletion.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
+  rel->r_addend = rel->r_addend - nop_bytes;
+  rel->r_offset = rel->r_offset + nop_bytes;
+  return true;
 }
 
 /* Relax PC-relative references to GP-relative references.  */
@@ -4677,8 +4690,8 @@ bfd_elfNN_riscv_set_data_segment_info (struct bfd_link_info *info,
 /* Relax a section.
 
    Pass 0: Shortens code sequences for LUI/CALL/TPREL/PCREL relocs.
-   Pass 1: Deletes the bytes that PCREL relaxation in pass 0 made obsolete.
-   Pass 2: Which cannot be disabled, handles code alignment directives.  */
+   Pass 1: Which cannot be disabled, handles code alignment directives.
+   Pass 2: Handle DELETE directives.  */
 
 static bool
 _bfd_riscv_relax_section (bfd *abfd, asection *sec,
@@ -4697,7 +4710,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
   *again = false;
 
   if (bfd_link_relocatable (info)
-      || sec->sec_flg0
+      || (sec->sec_flg0 && info->relax_pass == 0)
       || (sec->flags & SEC_RELOC) == 0
       || sec->reloc_count == 0
       || (info->disable_target_specific_optimizations
@@ -4771,10 +4784,10 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	  /* Skip over the R_RISCV_RELAX.  */
 	  i++;
 	}
-      else if (info->relax_pass == 1 && type == R_RISCV_DELETE)
-	relax_func = _bfd_riscv_relax_delete;
-      else if (info->relax_pass == 2 && type == R_RISCV_ALIGN)
+      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
 	relax_func = _bfd_riscv_relax_align;
+      else if (info->relax_pass == 2 && type == R_RISCV_DELETE)
+	relax_func = _bfd_riscv_relax_delete;
       else
 	continue;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 3/5] RISCV: Implement piecewise deletion
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 1/5] RISCV: Add linker relaxation tests Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 2/5] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
@ 2022-05-02 13:50   ` Patrick O'Neill
  2022-05-20 10:48     ` Nelson Chu
  2022-05-02 13:50   ` [PATCH v2 4/5] RISCV: Improve runtime of align directives Patrick O'Neill
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

Rather than deleting bytes immediately, we can first mark the bytes for
deletion. Then, using piecewise deletion, we reduce the runtime from
O(n^2) to O(n).

By moving the logic from the riscv_relax_delete_bytes function into the
RISCV_DELETE reloc handler, we can delete the bytes in a piecewise
manner.

2022-04-29 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c: Move deletion logic into DELETE reloc handler
	  and replace with DELETE reloc creation logic.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This change overwrites the R_RISCV_RELAX with a R_RISCV_DELETE reloc
when a relaxation wants to delete bytes. If it was going to delete a
reloc, we use that reloc rather than deleting it.
---
v2 Changelog:
- Arrange functions as to reduce the diff of each patch and make
  reviewing more manageable
---
 bfd/elfnn-riscv.c | 144 ++++++++++++++++++++++++++++------------------
 1 file changed, 88 insertions(+), 56 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 17f9607744f..10e76e8628d 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4054,15 +4054,28 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
     }
 }
 
-/* Delete some bytes from a section while relaxing.  */
+typedef bool (*relax_func_t) (bfd *, asection *, asection *,
+			      struct bfd_link_info *,
+			      Elf_Internal_Rela *,
+			      bfd_vma, bfd_vma, bfd_vma, bool *,
+			      riscv_pcgp_relocs *,
+			      bool undefined_weak, bfd_vma *);
+
+/* Delete the bytes for R_RISCV_DELETE.  */
 
 static bool
-riscv_relax_delete_bytes (bfd *abfd,
-			  asection *sec,
-			  bfd_vma addr,
-			  size_t count,
-			  struct bfd_link_info *link_info,
-			  riscv_pcgp_relocs *p)
+_bfd_riscv_relax_delete (bfd *abfd,
+			 asection *sec,
+			 asection *sym_sec ATTRIBUTE_UNUSED,
+			 struct bfd_link_info *link_info,
+			 Elf_Internal_Rela *rel,
+			 bfd_vma symval ATTRIBUTE_UNUSED,
+			 bfd_vma max_alignment ATTRIBUTE_UNUSED,
+			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
+			 bool *again ATTRIBUTE_UNUSED,
+			 riscv_pcgp_relocs *pcgp_relocs,
+			 bool undefined_weak ATTRIBUTE_UNUSED,
+			 bfd_vma *delete_total)
 {
   unsigned int i, symcount;
   bfd_vma toaddr = sec->size;
@@ -4072,9 +4085,42 @@ riscv_relax_delete_bytes (bfd *abfd,
   struct bfd_elf_section_data *data = elf_section_data (sec);
   bfd_byte *contents = data->this_hdr.contents;
 
+  bfd_vma addr = rel->r_offset;
+  size_t count = rel->r_addend;
+  riscv_pcgp_relocs *p = pcgp_relocs;
+
+  /* Find the next DELETE reloc (if one exists).  */
+  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
+  Elf_Internal_Rela *next_delete = NULL;
+  /* Since we only replace existing relocs and don't add new relocs, the
+     relocs are in sequential order. We can skip the relocs prior to this one,
+     making this search linear time.  */
+  int start = rel - relocs;
+  for (unsigned int i = start; i < sec->reloc_count; i++)
+    {
+      next_delete = relocs + i;
+      if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
+	  && next_delete->r_offset > rel->r_offset)
+	break;
+      else
+	next_delete = NULL;
+    }
+
+  size_t bytes_to_move;
+  /* Make this a piecewise deletion.  */
+  if (next_delete == NULL)
+    bytes_to_move = toaddr - addr - count;
+  else
+    bytes_to_move = next_delete->r_offset - addr - count;
+
   /* Actually delete the bytes.  */
   sec->size -= count;
-  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
+  memmove (contents + addr, contents + addr + count + *delete_total, bytes_to_move);
+
+  *delete_total += count;
+
+  /* Delete the reloc.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
 
   /* Adjust the location of all of the relocs.  Note that we need not
      adjust the addends, since all PC-relative references must be against
@@ -4172,12 +4218,20 @@ riscv_relax_delete_bytes (bfd *abfd,
   return true;
 }
 
-typedef bool (*relax_func_t) (bfd *, asection *, asection *,
-			      struct bfd_link_info *,
-			      Elf_Internal_Rela *,
-			      bfd_vma, bfd_vma, bfd_vma, bool *,
-			      riscv_pcgp_relocs *,
-			      bool undefined_weak);
+/* Delete some bytes from a section while relaxing.  */
+
+static bool
+riscv_relax_delete_bytes (bfd_vma addr,
+			  size_t count,
+			  Elf_Internal_Rela *rel)
+{
+  /* Mark bytes for deletion.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
+  rel->r_offset = addr;
+  rel->r_addend = count;
+
+  return true;
+}
 
 /* Relax AUIPC + JALR into JAL.  */
 
@@ -4190,7 +4244,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
 		       bfd_vma reserve_size ATTRIBUTE_UNUSED,
 		       bool *again,
 		       riscv_pcgp_relocs *pcgp_relocs,
-		       bool undefined_weak ATTRIBUTE_UNUSED)
+		       bool undefined_weak ATTRIBUTE_UNUSED,
+		       bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma foff = symval - (sec_addr (sec) + rel->r_offset);
@@ -4252,8 +4307,7 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
 
   /* Delete unnecessary JALR.  */
   *again = true;
-  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
-				   link_info, pcgp_relocs);
+  return riscv_relax_delete_bytes (rel->r_offset + len, 8 - len, rel + 1);
 }
 
 /* Traverse all output sections and return the max alignment.  */
@@ -4286,7 +4340,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
 		      bfd_vma reserve_size,
 		      bool *again,
 		      riscv_pcgp_relocs *pcgp_relocs,
-		      bool undefined_weak)
+		      bool undefined_weak,
+		      bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma gp = riscv_global_pointer_value (link_info);
@@ -4343,11 +4398,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
 	  return true;
 
 	case R_RISCV_HI20:
-	  /* We can delete the unnecessary LUI and reloc.  */
+	  /* We can delete the unnecessary LUI and reuse the reloc.  */
 	  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
 	  *again = true;
-	  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
-					   link_info, pcgp_relocs);
+	  return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
 
 	default:
 	  abort ();
@@ -4379,8 +4433,7 @@ _bfd_riscv_relax_lui (bfd *abfd,
       rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
 
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
-				       link_info, pcgp_relocs);
+      return riscv_relax_delete_bytes (rel->r_offset + 2, 2, rel + 1);
     }
 
   return true;
@@ -4399,7 +4452,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
 			 bool *again,
 			 riscv_pcgp_relocs *pcgp_relocs,
-			 bool undefined_weak ATTRIBUTE_UNUSED)
+			 bool undefined_weak ATTRIBUTE_UNUSED,
+			 bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   /* See if this symbol is in range of tp.  */
   if (RISCV_CONST_HIGH_PART (tpoff (link_info, symval)) != 0)
@@ -4418,11 +4472,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 
     case R_RISCV_TPREL_HI20:
     case R_RISCV_TPREL_ADD:
-      /* We can delete the unnecessary instruction and reloc.  */
+      /* We can delete the unnecessary instruction and reuse the reloc.  */
       rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
-				       pcgp_relocs);
+      return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
 
     default:
       abort ();
@@ -4442,7 +4495,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			bfd_vma reserve_size ATTRIBUTE_UNUSED,
 			bool *again ATTRIBUTE_UNUSED,
 			riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
-			bool undefined_weak ATTRIBUTE_UNUSED)
+			bool undefined_weak ATTRIBUTE_UNUSED,
+			bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma alignment = 1, pos;
@@ -4496,10 +4550,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
   /* Mark the excess bytes for deletion.  */
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
-  rel->r_addend = rel->r_addend - nop_bytes;
-  rel->r_offset = rel->r_offset + nop_bytes;
-  return true;
+  return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
+				   rel->r_addend - nop_bytes, rel);
 }
 
 /* Relax PC-relative references to GP-relative references.  */
@@ -4515,7 +4567,8 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
 		     bfd_vma reserve_size,
 		     bool *again ATTRIBUTE_UNUSED,
 		     riscv_pcgp_relocs *pcgp_relocs,
-		     bool undefined_weak)
+		     bool undefined_weak,
+		     bfd_vma *delete_total ATTRIBUTE_UNUSED)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma gp = riscv_global_pointer_value (link_info);
@@ -4654,28 +4707,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
   return true;
 }
 
-/* Delete the bytes for R_RISCV_DELETE.  */
-
-static bool
-_bfd_riscv_relax_delete (bfd *abfd,
-			 asection *sec,
-			 asection *sym_sec ATTRIBUTE_UNUSED,
-			 struct bfd_link_info *link_info,
-			 Elf_Internal_Rela *rel,
-			 bfd_vma symval ATTRIBUTE_UNUSED,
-			 bfd_vma max_alignment ATTRIBUTE_UNUSED,
-			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
-			 bool *again ATTRIBUTE_UNUSED,
-			 riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
-			 bool undefined_weak ATTRIBUTE_UNUSED)
-{
-  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
-				 link_info, NULL))
-    return false;
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
-  return true;
-}
-
 /* Called by after_allocation to set the information of data segment
    before relaxing.  */
 
@@ -4706,6 +4737,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
   unsigned int i;
   bfd_vma max_alignment, reserve_size = 0;
   riscv_pcgp_relocs pcgp_relocs;
+  bfd_vma delete_bytes = 0;
 
   *again = false;
 
@@ -4941,7 +4973,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 
       if (!relax_func (abfd, sec, sym_sec, info, rel, symval,
 		       max_alignment, reserve_size, again,
-		       &pcgp_relocs, undefined_weak))
+		       &pcgp_relocs, undefined_weak, &delete_bytes))
 	goto fail;
     }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 4/5] RISCV: Improve runtime of align directives
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
                     ` (2 preceding siblings ...)
  2022-05-02 13:50   ` [PATCH v2 3/5] RISCV: Implement piecewise deletion Patrick O'Neill
@ 2022-05-02 13:50   ` Patrick O'Neill
  2022-05-02 13:50   ` [PATCH v2 5/5] RISCV: Add --defer-deletion flag Patrick O'Neill
  2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

Align directives need an accurate count of the number of bytes to be
deleted. This can be computed using a running sum, giving us an O(n),
rather than O(n^2) runtime.

2022-04-29 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c (_bfd_riscv_relax_align): Rely on running
	  delete_bytes count when calculating alignment.
	* elfnn-riscv.c (_bfd_riscv_relax_section): Calculate running
	  deletion total during align pass.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
 bfd/elfnn-riscv.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 10e76e8628d..0e94021d39c 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4496,24 +4496,15 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 			bool *again ATTRIBUTE_UNUSED,
 			riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
 			bool undefined_weak ATTRIBUTE_UNUSED,
-			bfd_vma *delete_total ATTRIBUTE_UNUSED)
+			bfd_vma *delete_total)
 {
   bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
   bfd_vma alignment = 1, pos;
   while (alignment <= rel->r_addend)
     alignment *= 2;
 
-  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
-  for (unsigned int i = 0; i < sec->reloc_count; i++)
-    {
-      Elf_Internal_Rela *reloc = relocs + i;
-      /* Ignore annotations after this alignment directive.  */
-      if (reloc == rel)
-	break;
-      /* Account for to-be-deleted bytes  */
-      else if (ELFNN_R_TYPE (reloc->r_info) == R_RISCV_DELETE)
-	symval -= reloc->r_addend;
-    }
+  /* Account for to-be-deleted bytes.  */
+  symval -= *delete_total;
 
   symval -= rel->r_addend;
   bfd_vma aligned_addr = ((symval - 1) & ~(alignment - 1)) + alignment;
@@ -4549,6 +4540,9 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   if (nop_bytes % 4 != 0)
     bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
 
+  /* Account for the soon-to-be marked bytes.  */
+  *delete_total += rel->r_addend - nop_bytes;
+
   /* Mark the excess bytes for deletion.  */
   return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
 				   rel->r_addend - nop_bytes, rel);
@@ -4816,8 +4810,18 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	  /* Skip over the R_RISCV_RELAX.  */
 	  i++;
 	}
-      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
-	relax_func = _bfd_riscv_relax_align;
+      else if (info->relax_pass == 1)
+        {
+	  if (type == R_RISCV_ALIGN)
+	    relax_func = _bfd_riscv_relax_align;
+	  else if (type == R_RISCV_DELETE)
+	    {
+	      delete_bytes += rel->r_addend;
+	      continue;
+	    }
+	  else
+	    continue;
+	}
       else if (info->relax_pass == 2 && type == R_RISCV_DELETE)
 	relax_func = _bfd_riscv_relax_delete;
       else
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 5/5] RISCV: Add --defer-deletion flag
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
                     ` (3 preceding siblings ...)
  2022-05-02 13:50   ` [PATCH v2 4/5] RISCV: Improve runtime of align directives Patrick O'Neill
@ 2022-05-02 13:50   ` Patrick O'Neill
  2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
  5 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-02 13:50 UTC (permalink / raw)
  To: binutils; +Cc: kito.cheng, palmer, gnu-toolchain, Patrick O'Neill

Previously the linker failed to relax only the backwards case (where a
relaxation enables another relaxation that has already been checked).
With patches 1-4, the behavior has been changed to fail in both the
forward and backwards case. It's possible that this will lead to worse
performance, so the --no-defer-deletion flag allows you to use the
non-linear immediate deletion method.
By default, the linear deletion method is enabled as this is expected to
be more performant.

2022-04-29 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c: Add if statements and expand function calls to
	  enable immediately deleting bytes.
	* bfdlink.h: Add delete_immediately flag.
	* ld/ld.texi: Document the new --defer-deletion flag.
	* ld/ldlex.h: Add DEFER_DELETION and NO_DEFER_DELETION options.
	* ld/lexsup.c: Parse --defer-deletion and --no-defer-deletion
	  flags.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
I think that there may be a better way to phrase the --defer-deletion
flag. I'm also unsure if this is the correct way of expressing
a target-specific linker flag.
---
v2 Changelog:
- This is a new patch.
---
 bfd/elfnn-riscv.c | 61 ++++++++++++++++++++++++++++++++++-------------
 include/bfdlink.h |  4 ++++
 ld/ld.texi        | 10 ++++++++
 ld/ldlex.h        |  2 ++
 ld/lexsup.c       | 10 ++++++++
 5 files changed, 70 insertions(+), 17 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 0e94021d39c..4f57e9bef03 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4092,18 +4092,27 @@ _bfd_riscv_relax_delete (bfd *abfd,
   /* Find the next DELETE reloc (if one exists).  */
   Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
   Elf_Internal_Rela *next_delete = NULL;
-  /* Since we only replace existing relocs and don't add new relocs, the
-     relocs are in sequential order. We can skip the relocs prior to this one,
-     making this search linear time.  */
-  int start = rel - relocs;
-  for (unsigned int i = start; i < sec->reloc_count; i++)
+  if (link_info->immediate_deletion && link_info->relax_pass == 0)
     {
-      next_delete = relocs + i;
-      if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
-	  && next_delete->r_offset > rel->r_offset)
-	break;
-      else
-	next_delete = NULL;
+      /* When performing non-linear time deletions, disable piecewise
+	 deletion.  */
+      *delete_total = 0;
+    }
+  else
+    {
+      /* Since we only replace existing relocs and don't add new relocs, the
+	 relocs are in sequential order. We can skip the relocs prior to this one,
+	 making this search linear time.  */
+      int start = rel - relocs;
+      for (unsigned int i = start; i < sec->reloc_count; i++)
+	{
+	  next_delete = relocs + i;
+	  if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
+	      && next_delete->r_offset > rel->r_offset)
+	    break;
+	  else
+	    next_delete = NULL;
+	}
     }
 
   size_t bytes_to_move;
@@ -4221,7 +4230,11 @@ _bfd_riscv_relax_delete (bfd *abfd,
 /* Delete some bytes from a section while relaxing.  */
 
 static bool
-riscv_relax_delete_bytes (bfd_vma addr,
+riscv_relax_delete_bytes (bfd *abfd,
+			  asection *sec,
+			  struct bfd_link_info *link_info,
+			  riscv_pcgp_relocs *pcgp_relocs,
+			  bfd_vma addr,
 			  size_t count,
 			  Elf_Internal_Rela *rel)
 {
@@ -4230,6 +4243,15 @@ riscv_relax_delete_bytes (bfd_vma addr,
   rel->r_offset = addr;
   rel->r_addend = count;
 
+  if (link_info != NULL && link_info->immediate_deletion
+      && link_info->relax_pass == 0)
+    {
+      /* Delete the bytes immediately (non-linear time).  */
+      bfd_vma count = 0;
+      _bfd_riscv_relax_delete (abfd, sec, NULL, link_info, rel, NULL, NULL, NULL,
+			       NULL, pcgp_relocs, NULL, &count);
+    }
+
   return true;
 }
 
@@ -4307,7 +4329,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
 
   /* Delete unnecessary JALR.  */
   *again = true;
-  return riscv_relax_delete_bytes (rel->r_offset + len, 8 - len, rel + 1);
+  return riscv_relax_delete_bytes (abfd, sec, link_info, pcgp_relocs,
+				   rel->r_offset + len, 8 - len, rel + 1);
 }
 
 /* Traverse all output sections and return the max alignment.  */
@@ -4401,7 +4424,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
 	  /* We can delete the unnecessary LUI and reuse the reloc.  */
 	  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
 	  *again = true;
-	  return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
+	  return riscv_relax_delete_bytes (abfd, sec, link_info, pcgp_relocs,
+					   rel->r_offset, 4, rel);
 
 	default:
 	  abort ();
@@ -4433,7 +4457,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
       rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
 
       *again = true;
-      return riscv_relax_delete_bytes (rel->r_offset + 2, 2, rel + 1);
+      return riscv_relax_delete_bytes (abfd, sec, link_info, pcgp_relocs,
+				       rel->r_offset + 2, 2, rel + 1);
     }
 
   return true;
@@ -4475,7 +4500,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
       /* We can delete the unnecessary instruction and reuse the reloc.  */
       rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
       *again = true;
-      return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
+      return riscv_relax_delete_bytes (abfd, sec, link_info, pcgp_relocs,
+				       rel->r_offset, 4, rel);
 
     default:
       abort ();
@@ -4544,7 +4570,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
   *delete_total += rel->r_addend - nop_bytes;
 
   /* Mark the excess bytes for deletion.  */
-  return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
+  return riscv_relax_delete_bytes (NULL, NULL, NULL, NULL,
+				   rel->r_offset + nop_bytes,
 				   rel->r_addend - nop_bytes, rel);
 }
 
diff --git a/include/bfdlink.h b/include/bfdlink.h
index 69fc9d33ff4..f81a00c9f94 100644
--- a/include/bfdlink.h
+++ b/include/bfdlink.h
@@ -568,6 +568,10 @@ struct bfd_link_info
        args_type structure in ldmain.c:main.  */
   signed int disable_target_specific_optimizations;
 
+  /* Enable or disable deferred deletion of linker relaxations.
+     This option is enabled my default.  */
+  unsigned int immediate_deletion: 1;
+
   /* Function callbacks.  */
   const struct bfd_link_callbacks *callbacks;
 
diff --git a/ld/ld.texi b/ld/ld.texi
index fc75e9b3625..393ce28e5e3 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -2186,6 +2186,16 @@ This option is ignored for Linux compatibility.
 @item -Qy
 This option is ignored for SVR4 compatibility.
 
+@cindex defer relaxation deletion
+@kindex --defer-relaxation-deletions
+@kindex --no-defer-relaxation-deletions
+@item --defer-relaxation-deletions
+@itemx --no-defer-relaxation-deletions
+Enable (or don't enable) linear time linker relaxations by deferring byte
+deletion. This assumes the pathologial case is rare. If the pathological case
+is common, disabling linear linking will improve performance.
+Default to linear time linker relaxation.
+
 @kindex --relax
 @cindex synthesizing linker
 @cindex relaxing addressing modes
diff --git a/ld/ldlex.h b/ld/ldlex.h
index bc58fea73cc..dc2375da096 100644
--- a/ld/ldlex.h
+++ b/ld/ldlex.h
@@ -54,6 +54,8 @@ enum option_values
   OPTION_OFORMAT,
   OPTION_RELAX,
   OPTION_NO_RELAX,
+  OPTION_DEFER_DELETION,
+  OPTION_NO_DEFER_DELETION,
   OPTION_NO_SYMBOLIC,
   OPTION_RETAIN_SYMBOLS_FILE,
   OPTION_RPATH,
diff --git a/ld/lexsup.c b/ld/lexsup.c
index 5acc47ed5a0..5067cad5d09 100644
--- a/ld/lexsup.c
+++ b/ld/lexsup.c
@@ -442,6 +442,10 @@ static const struct ld_option ld_options[] =
     '\0', NULL, N_("Reduce code size by using target specific optimizations"), TWO_DASHES },
   { {"no-relax", no_argument, NULL, OPTION_NO_RELAX},
     '\0', NULL, N_("Do not use relaxation techniques to reduce code size"), TWO_DASHES },
+  { {"defer-deletion", no_argument, NULL, OPTION_DEFER_DELETION},
+    '\0', NULL, N_("Defer deletions caused by linker relaxation"), TWO_DASHES },
+  { {"no-defer-deletion", no_argument, NULL, OPTION_NO_DEFER_DELETION},
+    '\0', NULL, N_("Do not defer deletions caused by linker relaxation"), TWO_DASHES },
   { {"retain-symbols-file", required_argument, NULL,
      OPTION_RETAIN_SYMBOLS_FILE},
     '\0', N_("FILE"), N_("Keep only symbols listed in FILE"), TWO_DASHES },
@@ -915,6 +919,12 @@ parse_args (unsigned argc, char **argv)
 	case OPTION_NON_CONTIGUOUS_REGIONS_WARNINGS:
 	  link_info.non_contiguous_regions_warnings = true;
 	  break;
+	case OPTION_DEFER_DELETION:
+	  link_info.immediate_deletion = false;
+	  break;
+	case OPTION_NO_DEFER_DELETION:
+	  link_info.immediate_deletion = true;
+	  break;
 	case 'e':
 	  lang_add_entry (optarg, true);
 	  break;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] RISCV: Implement piecewise deletion
  2022-05-02 13:50   ` [PATCH v2 3/5] RISCV: Implement piecewise deletion Patrick O'Neill
@ 2022-05-20 10:48     ` Nelson Chu
  2022-05-20 17:36       ` Patrick O'Neill
  0 siblings, 1 reply; 22+ messages in thread
From: Nelson Chu @ 2022-05-20 10:48 UTC (permalink / raw)
  To: Patrick O'Neill
  Cc: Binutils, gnu-toolchain, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Kito Cheng

Hi Patrick,

Thanks for giving us this great idea.  I would suggest that we don't
consider the alignment issues for now, and focus on how to reduce the
time of the normal relaxations, which includes lui, auipc, call and
tls le relaxations.  The current relax behavior are listed as follows,

Pass 1: For lui, call and tls le relaxations, delete the redundant
bytes immediately;  For auipc, we build the table (pcgp_relocs) to
chain the %pcrel_lo insn to the corresponding %pcrel_hi insn, and then
actually delete the bytes until next relax pass.

Pass 2: Delete the bytes for each R_RISCV_DELETE  which are marked for
auipc relaxations.

Pass 3: Handle alignment, delete the bytes immediately for each R_RISCV_ALIGN.

So the current relax order should be,
Pass 1 -> Pass 1 -> ... -> Pass 1 -> Pass 2 -> Pass 3.  Since we
delete the bytes for lui/call/tls_le relaxations immediately, the Pass
1 will be done multiple times until there are no any bytes deleted.
There is a problem here - the auipc will not be deleted until Pass 2,
so we will need to update the pcgp_relocs table every time.  You can
see the PR28410 for details,
https://sourceware.org/bugzilla/show_bug.cgi?id=28410.

According to your patches, the relax order will be updated to,
Pass 1 (Do not delete the bytes immediately) -> Pass 2 (Handle
alignment) -> Pass 3 (Piecewise delete bytes for
lui/auipc/call/tls_le/alignment)

I would suggest we continue to put the alignment relaxations at the
last relax pass, that would reduce a lot of problems, including losing
too many relax opportunities in pass 1.  Therefore, the order will be,

Pass 1 (Do not delete the bytes immediately) -> Pass 2 (Piecewise
delete bytes for lui/auipc/call/tls_le) -> Pass 3 (Handle alignment)

However, the order doesn't help to increase the chance to relax more,
since we actually delete the bytes in Pass 2, so we won't be doing the
Pass 1 as much as we could, we just do it once.  However, if we want
to delete the bytes in the separate relax pass, then we probably need
the "restart_relax" again, to do the "Pass 1 -> Pass 2" many times.
Please see the following deprecated patch,
https://sourceware.org/pipermail/binutils-cvs/2021-March/056052.html.

Or we probably have another solution without adding the restart_relax
flag.  If my understanding is correct (please feel free to tell me if
I'm wrong), we always delete the bytes and adjust the tables (reloc,
local and global symbol tables) only for "one input section", and
won't cross sections.  That means, we could create another table to
record the deleted bytes rather than marking them as R_RISCV_DELETE,
and then actually delete them at the end of the
_bfd_riscv_relax_section, for each input section.  So that the order
will be,

Pass 1 (record the deleted bytes for lui/auipc/call/tls_le, and then
actually piecewise delete them at the end of _bfd_riscv_relax_section)
-> Pass 1 -> ... -> Pass 1 -> Pass 2 (Handle alignment).

As for the improvement of alignment, we could discuss it in the later
patches.  I would like to accept the piecewise deletion for the
lui/auipc/call/tls_le relaxations first, so let us start by making
some progress here.

BTW, we don't need to update the pcgp_relocs tables in the
_bfd_riscv_relax_delete anymore, since we don't delete bytes
immediately for each relax pattern.

Thanks
Nelson

On Mon, May 2, 2022 at 9:54 PM Patrick O'Neill <patrick@rivosinc.com> wrote:
>
> Rather than deleting bytes immediately, we can first mark the bytes for
> deletion. Then, using piecewise deletion, we reduce the runtime from
> O(n^2) to O(n).
>
> By moving the logic from the riscv_relax_delete_bytes function into the
> RISCV_DELETE reloc handler, we can delete the bytes in a piecewise
> manner.
>
> 2022-04-29 Patrick O'Neill <patrick@rivosinc.com>
>
>         * elfnn-riscv.c: Move deletion logic into DELETE reloc handler
>           and replace with DELETE reloc creation logic.
>
> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
> ---
> This change overwrites the R_RISCV_RELAX with a R_RISCV_DELETE reloc
> when a relaxation wants to delete bytes. If it was going to delete a
> reloc, we use that reloc rather than deleting it.
> ---
> v2 Changelog:
> - Arrange functions as to reduce the diff of each patch and make
>   reviewing more manageable
> ---
>  bfd/elfnn-riscv.c | 144 ++++++++++++++++++++++++++++------------------
>  1 file changed, 88 insertions(+), 56 deletions(-)
>
> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
> index 17f9607744f..10e76e8628d 100644
> --- a/bfd/elfnn-riscv.c
> +++ b/bfd/elfnn-riscv.c
> @@ -4054,15 +4054,28 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
>      }
>  }
>
> -/* Delete some bytes from a section while relaxing.  */
> +typedef bool (*relax_func_t) (bfd *, asection *, asection *,
> +                             struct bfd_link_info *,
> +                             Elf_Internal_Rela *,
> +                             bfd_vma, bfd_vma, bfd_vma, bool *,
> +                             riscv_pcgp_relocs *,
> +                             bool undefined_weak, bfd_vma *);
> +
> +/* Delete the bytes for R_RISCV_DELETE.  */
>
>  static bool
> -riscv_relax_delete_bytes (bfd *abfd,
> -                         asection *sec,
> -                         bfd_vma addr,
> -                         size_t count,
> -                         struct bfd_link_info *link_info,
> -                         riscv_pcgp_relocs *p)
> +_bfd_riscv_relax_delete (bfd *abfd,
> +                        asection *sec,
> +                        asection *sym_sec ATTRIBUTE_UNUSED,
> +                        struct bfd_link_info *link_info,
> +                        Elf_Internal_Rela *rel,
> +                        bfd_vma symval ATTRIBUTE_UNUSED,
> +                        bfd_vma max_alignment ATTRIBUTE_UNUSED,
> +                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
> +                        bool *again ATTRIBUTE_UNUSED,
> +                        riscv_pcgp_relocs *pcgp_relocs,
> +                        bool undefined_weak ATTRIBUTE_UNUSED,
> +                        bfd_vma *delete_total)
>  {
>    unsigned int i, symcount;
>    bfd_vma toaddr = sec->size;
> @@ -4072,9 +4085,42 @@ riscv_relax_delete_bytes (bfd *abfd,
>    struct bfd_elf_section_data *data = elf_section_data (sec);
>    bfd_byte *contents = data->this_hdr.contents;
>
> +  bfd_vma addr = rel->r_offset;
> +  size_t count = rel->r_addend;
> +  riscv_pcgp_relocs *p = pcgp_relocs;
> +
> +  /* Find the next DELETE reloc (if one exists).  */
> +  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
> +  Elf_Internal_Rela *next_delete = NULL;
> +  /* Since we only replace existing relocs and don't add new relocs, the
> +     relocs are in sequential order. We can skip the relocs prior to this one,
> +     making this search linear time.  */
> +  int start = rel - relocs;
> +  for (unsigned int i = start; i < sec->reloc_count; i++)
> +    {
> +      next_delete = relocs + i;
> +      if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
> +         && next_delete->r_offset > rel->r_offset)
> +       break;
> +      else
> +       next_delete = NULL;
> +    }
> +
> +  size_t bytes_to_move;
> +  /* Make this a piecewise deletion.  */
> +  if (next_delete == NULL)
> +    bytes_to_move = toaddr - addr - count;
> +  else
> +    bytes_to_move = next_delete->r_offset - addr - count;
> +
>    /* Actually delete the bytes.  */
>    sec->size -= count;
> -  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
> +  memmove (contents + addr, contents + addr + count + *delete_total, bytes_to_move);
> +
> +  *delete_total += count;
> +
> +  /* Delete the reloc.  */
> +  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>
>    /* Adjust the location of all of the relocs.  Note that we need not
>       adjust the addends, since all PC-relative references must be against
> @@ -4172,12 +4218,20 @@ riscv_relax_delete_bytes (bfd *abfd,
>    return true;
>  }
>
> -typedef bool (*relax_func_t) (bfd *, asection *, asection *,
> -                             struct bfd_link_info *,
> -                             Elf_Internal_Rela *,
> -                             bfd_vma, bfd_vma, bfd_vma, bool *,
> -                             riscv_pcgp_relocs *,
> -                             bool undefined_weak);
> +/* Delete some bytes from a section while relaxing.  */
> +
> +static bool
> +riscv_relax_delete_bytes (bfd_vma addr,
> +                         size_t count,
> +                         Elf_Internal_Rela *rel)
> +{
> +  /* Mark bytes for deletion.  */
> +  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> +  rel->r_offset = addr;
> +  rel->r_addend = count;
> +
> +  return true;
> +}
>
>  /* Relax AUIPC + JALR into JAL.  */
>
> @@ -4190,7 +4244,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
>                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
>                        bool *again,
>                        riscv_pcgp_relocs *pcgp_relocs,
> -                      bool undefined_weak ATTRIBUTE_UNUSED)
> +                      bool undefined_weak ATTRIBUTE_UNUSED,
> +                      bfd_vma *delete_total ATTRIBUTE_UNUSED)
>  {
>    bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>    bfd_vma foff = symval - (sec_addr (sec) + rel->r_offset);
> @@ -4252,8 +4307,7 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
>
>    /* Delete unnecessary JALR.  */
>    *again = true;
> -  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
> -                                  link_info, pcgp_relocs);
> +  return riscv_relax_delete_bytes (rel->r_offset + len, 8 - len, rel + 1);
>  }
>
>  /* Traverse all output sections and return the max alignment.  */
> @@ -4286,7 +4340,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
>                       bfd_vma reserve_size,
>                       bool *again,
>                       riscv_pcgp_relocs *pcgp_relocs,
> -                     bool undefined_weak)
> +                     bool undefined_weak,
> +                     bfd_vma *delete_total ATTRIBUTE_UNUSED)
>  {
>    bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>    bfd_vma gp = riscv_global_pointer_value (link_info);
> @@ -4343,11 +4398,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
>           return true;
>
>         case R_RISCV_HI20:
> -         /* We can delete the unnecessary LUI and reloc.  */
> +         /* We can delete the unnecessary LUI and reuse the reloc.  */
>           rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>           *again = true;
> -         return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
> -                                          link_info, pcgp_relocs);
> +         return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
>
>         default:
>           abort ();
> @@ -4379,8 +4433,7 @@ _bfd_riscv_relax_lui (bfd *abfd,
>        rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
>
>        *again = true;
> -      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
> -                                      link_info, pcgp_relocs);
> +      return riscv_relax_delete_bytes (rel->r_offset + 2, 2, rel + 1);
>      }
>
>    return true;
> @@ -4399,7 +4452,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>                          bfd_vma reserve_size ATTRIBUTE_UNUSED,
>                          bool *again,
>                          riscv_pcgp_relocs *pcgp_relocs,
> -                        bool undefined_weak ATTRIBUTE_UNUSED)
> +                        bool undefined_weak ATTRIBUTE_UNUSED,
> +                        bfd_vma *delete_total ATTRIBUTE_UNUSED)
>  {
>    /* See if this symbol is in range of tp.  */
>    if (RISCV_CONST_HIGH_PART (tpoff (link_info, symval)) != 0)
> @@ -4418,11 +4472,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>
>      case R_RISCV_TPREL_HI20:
>      case R_RISCV_TPREL_ADD:
> -      /* We can delete the unnecessary instruction and reloc.  */
> +      /* We can delete the unnecessary instruction and reuse the reloc.  */
>        rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>        *again = true;
> -      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
> -                                      pcgp_relocs);
> +      return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
>
>      default:
>        abort ();
> @@ -4442,7 +4495,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
>                         bfd_vma reserve_size ATTRIBUTE_UNUSED,
>                         bool *again ATTRIBUTE_UNUSED,
>                         riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
> -                       bool undefined_weak ATTRIBUTE_UNUSED)
> +                       bool undefined_weak ATTRIBUTE_UNUSED,
> +                       bfd_vma *delete_total ATTRIBUTE_UNUSED)
>  {
>    bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>    bfd_vma alignment = 1, pos;
> @@ -4496,10 +4550,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
>      bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
>
>    /* Mark the excess bytes for deletion.  */
> -  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> -  rel->r_addend = rel->r_addend - nop_bytes;
> -  rel->r_offset = rel->r_offset + nop_bytes;
> -  return true;
> +  return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
> +                                  rel->r_addend - nop_bytes, rel);
>  }
>
>  /* Relax PC-relative references to GP-relative references.  */
> @@ -4515,7 +4567,8 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>                      bfd_vma reserve_size,
>                      bool *again ATTRIBUTE_UNUSED,
>                      riscv_pcgp_relocs *pcgp_relocs,
> -                    bool undefined_weak)
> +                    bool undefined_weak,
> +                    bfd_vma *delete_total ATTRIBUTE_UNUSED)
>  {
>    bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>    bfd_vma gp = riscv_global_pointer_value (link_info);
> @@ -4654,28 +4707,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>    return true;
>  }
>
> -/* Delete the bytes for R_RISCV_DELETE.  */
> -
> -static bool
> -_bfd_riscv_relax_delete (bfd *abfd,
> -                        asection *sec,
> -                        asection *sym_sec ATTRIBUTE_UNUSED,
> -                        struct bfd_link_info *link_info,
> -                        Elf_Internal_Rela *rel,
> -                        bfd_vma symval ATTRIBUTE_UNUSED,
> -                        bfd_vma max_alignment ATTRIBUTE_UNUSED,
> -                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
> -                        bool *again ATTRIBUTE_UNUSED,
> -                        riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
> -                        bool undefined_weak ATTRIBUTE_UNUSED)
> -{
> -  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
> -                                link_info, NULL))
> -    return false;
> -  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> -  return true;
> -}
> -
>  /* Called by after_allocation to set the information of data segment
>     before relaxing.  */
>
> @@ -4706,6 +4737,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>    unsigned int i;
>    bfd_vma max_alignment, reserve_size = 0;
>    riscv_pcgp_relocs pcgp_relocs;
> +  bfd_vma delete_bytes = 0;
>
>    *again = false;
>
> @@ -4941,7 +4973,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>
>        if (!relax_func (abfd, sec, sym_sec, info, rel, symval,
>                        max_alignment, reserve_size, again,
> -                      &pcgp_relocs, undefined_weak))
> +                      &pcgp_relocs, undefined_weak, &delete_bytes))
>         goto fail;
>      }
>
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/5] RISCV: Implement piecewise deletion
  2022-05-20 10:48     ` Nelson Chu
@ 2022-05-20 17:36       ` Patrick O'Neill
  0 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-20 17:36 UTC (permalink / raw)
  To: Nelson Chu
  Cc: Binutils, gnu-toolchain, Jim Wilson, Palmer Dabbelt,
	Andrew Waterman, Kito Cheng

On 5/20/22 03:48, Nelson Chu wrote:

> Hi Patrick,
>
> Thanks for giving us this great idea.  I would suggest that we don't
> consider the alignment issues for now, and focus on how to reduce the
> time of the normal relaxations, which includes lui, auipc, call and
> tls le relaxations.  The current relax behavior are listed as follows,
>
> Pass 1: For lui, call and tls le relaxations, delete the redundant
> bytes immediately;  For auipc, we build the table (pcgp_relocs) to
> chain the %pcrel_lo insn to the corresponding %pcrel_hi insn, and then
> actually delete the bytes until next relax pass.
>
> Pass 2: Delete the bytes for each R_RISCV_DELETE  which are marked for
> auipc relaxations.
>
> Pass 3: Handle alignment, delete the bytes immediately for each R_RISCV_ALIGN.
>
> So the current relax order should be,
> Pass 1 -> Pass 1 -> ... -> Pass 1 -> Pass 2 -> Pass 3.  Since we
> delete the bytes for lui/call/tls_le relaxations immediately, the Pass
> 1 will be done multiple times until there are no any bytes deleted.
> There is a problem here - the auipc will not be deleted until Pass 2,
> so we will need to update the pcgp_relocs table every time.  You can
> see the PR28410 for details,
> https://sourceware.org/bugzilla/show_bug.cgi?id=28410.
>
> According to your patches, the relax order will be updated to,
> Pass 1 (Do not delete the bytes immediately) -> Pass 2 (Handle
> alignment) -> Pass 3 (Piecewise delete bytes for
> lui/auipc/call/tls_le/alignment)
>
> I would suggest we continue to put the alignment relaxations at the
> last relax pass, that would reduce a lot of problems, including losing
> too many relax opportunities in pass 1.  Therefore, the order will be,
>
> Pass 1 (Do not delete the bytes immediately) -> Pass 2 (Piecewise
> delete bytes for lui/auipc/call/tls_le) -> Pass 3 (Handle alignment)

Hi Nelson,

Thanks for reviewing the patch! I had misunderstood the flow of the
passes. I had assumed that the flow was:

Pass 1 (again=true) -> Pass 2 (skipped when again is true) -> Pass 3
-> Pass 1 (again=false) -> Pass 2 (performed) -> Pass 3

Basically trying the whole sequence until again=false.

When I put a print statement to display the pass that is being performed
I get this output (isolating print statements from a single section id):

(relax-call-2.s from Patch 1/5)
Pass 0
Pass 0
Pass 1
Pass 2
Pass 0
Pass 1
Pass 2

The second pass through was what had tripped me up. Thanks for the
clarification.

> However, the order doesn't help to increase the chance to relax more,
> since we actually delete the bytes in Pass 2, so we won't be doing the
> Pass 1 as much as we could, we just do it once.  However, if we want
> to delete the bytes in the separate relax pass, then we probably need
> the "restart_relax" again, to do the "Pass 1 -> Pass 2" many times.
> Please see the following deprecated patch,
> https://sourceware.org/pipermail/binutils-cvs/2021-March/056052.html.
>
> Or we probably have another solution without adding the restart_relax
> flag.  If my understanding is correct (please feel free to tell me if
> I'm wrong), we always delete the bytes and adjust the tables (reloc,
> local and global symbol tables) only for "one input section", and
> won't cross sections.  That means, we could create another table to
> record the deleted bytes rather than marking them as R_RISCV_DELETE,
> and then actually delete them at the end of the
> _bfd_riscv_relax_section, for each input section.  So that the order
> will be,
>
> Pass 1 (record the deleted bytes for lui/auipc/call/tls_le, and then
> actually piecewise delete them at the end of _bfd_riscv_relax_section)
> -> Pass 1 -> ... -> Pass 1 -> Pass 2 (Handle alignment).

I agree that that would be better. I'll try implementing this - it
has the added benefit of being reusable for the alignment directives
in future patches. It's fundamentally the same issue - deleting bytes
piecewise per pass rather than immediately.

> As for the improvement of alignment, we could discuss it in the later
> patches.  I would like to accept the piecewise deletion for the
> lui/auipc/call/tls_le relaxations first, so let us start by making
> some progress here.

Sounds good.

> BTW, we don't need to update the pcgp_relocs tables in the
> _bfd_riscv_relax_delete anymore, since we don't delete bytes
> immediately for each relax pattern.

Thanks, I'll remove that for v3.

Thank you again for looking at this, linkers are complicated and I'm
still learning ;)

Thanks,
Patrick

>
> Thanks
> Nelson
>
> On Mon, May 2, 2022 at 9:54 PM Patrick O'Neill<patrick@rivosinc.com>  wrote:
>> Rather than deleting bytes immediately, we can first mark the bytes for
>> deletion. Then, using piecewise deletion, we reduce the runtime from
>> O(n^2) to O(n).
>>
>> By moving the logic from the riscv_relax_delete_bytes function into the
>> RISCV_DELETE reloc handler, we can delete the bytes in a piecewise
>> manner.
>>
>> 2022-04-29 Patrick O'Neill<patrick@rivosinc.com>
>>
>>          * elfnn-riscv.c: Move deletion logic into DELETE reloc handler
>>            and replace with DELETE reloc creation logic.
>>
>> Signed-off-by: Patrick O'Neill<patrick@rivosinc.com>
>> ---
>> This change overwrites the R_RISCV_RELAX with a R_RISCV_DELETE reloc
>> when a relaxation wants to delete bytes. If it was going to delete a
>> reloc, we use that reloc rather than deleting it.
>> ---
>> v2 Changelog:
>> - Arrange functions as to reduce the diff of each patch and make
>>    reviewing more manageable
>> ---
>>   bfd/elfnn-riscv.c | 144 ++++++++++++++++++++++++++++------------------
>>   1 file changed, 88 insertions(+), 56 deletions(-)
>>
>> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
>> index 17f9607744f..10e76e8628d 100644
>> --- a/bfd/elfnn-riscv.c
>> +++ b/bfd/elfnn-riscv.c
>> @@ -4054,15 +4054,28 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
>>       }
>>   }
>>
>> -/* Delete some bytes from a section while relaxing.  */
>> +typedef bool (*relax_func_t) (bfd *, asection *, asection *,
>> +                             struct bfd_link_info *,
>> +                             Elf_Internal_Rela *,
>> +                             bfd_vma, bfd_vma, bfd_vma, bool *,
>> +                             riscv_pcgp_relocs *,
>> +                             bool undefined_weak, bfd_vma *);
>> +
>> +/* Delete the bytes for R_RISCV_DELETE.  */
>>
>>   static bool
>> -riscv_relax_delete_bytes (bfd *abfd,
>> -                         asection *sec,
>> -                         bfd_vma addr,
>> -                         size_t count,
>> -                         struct bfd_link_info *link_info,
>> -                         riscv_pcgp_relocs *p)
>> +_bfd_riscv_relax_delete (bfd *abfd,
>> +                        asection *sec,
>> +                        asection *sym_sec ATTRIBUTE_UNUSED,
>> +                        struct bfd_link_info *link_info,
>> +                        Elf_Internal_Rela *rel,
>> +                        bfd_vma symval ATTRIBUTE_UNUSED,
>> +                        bfd_vma max_alignment ATTRIBUTE_UNUSED,
>> +                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
>> +                        bool *again ATTRIBUTE_UNUSED,
>> +                        riscv_pcgp_relocs *pcgp_relocs,
>> +                        bool undefined_weak ATTRIBUTE_UNUSED,
>> +                        bfd_vma *delete_total)
>>   {
>>     unsigned int i, symcount;
>>     bfd_vma toaddr = sec->size;
>> @@ -4072,9 +4085,42 @@ riscv_relax_delete_bytes (bfd *abfd,
>>     struct bfd_elf_section_data *data = elf_section_data (sec);
>>     bfd_byte *contents = data->this_hdr.contents;
>>
>> +  bfd_vma addr = rel->r_offset;
>> +  size_t count = rel->r_addend;
>> +  riscv_pcgp_relocs *p = pcgp_relocs;
>> +
>> +  /* Find the next DELETE reloc (if one exists).  */
>> +  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
>> +  Elf_Internal_Rela *next_delete = NULL;
>> +  /* Since we only replace existing relocs and don't add new relocs, the
>> +     relocs are in sequential order. We can skip the relocs prior to this one,
>> +     making this search linear time.  */
>> +  int start = rel - relocs;
>> +  for (unsigned int i = start; i < sec->reloc_count; i++)
>> +    {
>> +      next_delete = relocs + i;
>> +      if (ELFNN_R_TYPE (next_delete->r_info) == R_RISCV_DELETE
>> +         && next_delete->r_offset > rel->r_offset)
>> +       break;
>> +      else
>> +       next_delete = NULL;
>> +    }
>> +
>> +  size_t bytes_to_move;
>> +  /* Make this a piecewise deletion.  */
>> +  if (next_delete == NULL)
>> +    bytes_to_move = toaddr - addr - count;
>> +  else
>> +    bytes_to_move = next_delete->r_offset - addr - count;
>> +
>>     /* Actually delete the bytes.  */
>>     sec->size -= count;
>> -  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
>> +  memmove (contents + addr, contents + addr + count + *delete_total, bytes_to_move);
>> +
>> +  *delete_total += count;
>> +
>> +  /* Delete the reloc.  */
>> +  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>>
>>     /* Adjust the location of all of the relocs.  Note that we need not
>>        adjust the addends, since all PC-relative references must be against
>> @@ -4172,12 +4218,20 @@ riscv_relax_delete_bytes (bfd *abfd,
>>     return true;
>>   }
>>
>> -typedef bool (*relax_func_t) (bfd *, asection *, asection *,
>> -                             struct bfd_link_info *,
>> -                             Elf_Internal_Rela *,
>> -                             bfd_vma, bfd_vma, bfd_vma, bool *,
>> -                             riscv_pcgp_relocs *,
>> -                             bool undefined_weak);
>> +/* Delete some bytes from a section while relaxing.  */
>> +
>> +static bool
>> +riscv_relax_delete_bytes (bfd_vma addr,
>> +                         size_t count,
>> +                         Elf_Internal_Rela *rel)
>> +{
>> +  /* Mark bytes for deletion.  */
>> +  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
>> +  rel->r_offset = addr;
>> +  rel->r_addend = count;
>> +
>> +  return true;
>> +}
>>
>>   /* Relax AUIPC + JALR into JAL.  */
>>
>> @@ -4190,7 +4244,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
>>                         bfd_vma reserve_size ATTRIBUTE_UNUSED,
>>                         bool *again,
>>                         riscv_pcgp_relocs *pcgp_relocs,
>> -                      bool undefined_weak ATTRIBUTE_UNUSED)
>> +                      bool undefined_weak ATTRIBUTE_UNUSED,
>> +                      bfd_vma *delete_total ATTRIBUTE_UNUSED)
>>   {
>>     bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>>     bfd_vma foff = symval - (sec_addr (sec) + rel->r_offset);
>> @@ -4252,8 +4307,7 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
>>
>>     /* Delete unnecessary JALR.  */
>>     *again = true;
>> -  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
>> -                                  link_info, pcgp_relocs);
>> +  return riscv_relax_delete_bytes (rel->r_offset + len, 8 - len, rel + 1);
>>   }
>>
>>   /* Traverse all output sections and return the max alignment.  */
>> @@ -4286,7 +4340,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
>>                        bfd_vma reserve_size,
>>                        bool *again,
>>                        riscv_pcgp_relocs *pcgp_relocs,
>> -                     bool undefined_weak)
>> +                     bool undefined_weak,
>> +                     bfd_vma *delete_total ATTRIBUTE_UNUSED)
>>   {
>>     bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>>     bfd_vma gp = riscv_global_pointer_value (link_info);
>> @@ -4343,11 +4398,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
>>            return true;
>>
>>          case R_RISCV_HI20:
>> -         /* We can delete the unnecessary LUI and reloc.  */
>> +         /* We can delete the unnecessary LUI and reuse the reloc.  */
>>            rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>>            *again = true;
>> -         return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
>> -                                          link_info, pcgp_relocs);
>> +         return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
>>
>>          default:
>>            abort ();
>> @@ -4379,8 +4433,7 @@ _bfd_riscv_relax_lui (bfd *abfd,
>>         rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
>>
>>         *again = true;
>> -      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
>> -                                      link_info, pcgp_relocs);
>> +      return riscv_relax_delete_bytes (rel->r_offset + 2, 2, rel + 1);
>>       }
>>
>>     return true;
>> @@ -4399,7 +4452,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>>                           bfd_vma reserve_size ATTRIBUTE_UNUSED,
>>                           bool *again,
>>                           riscv_pcgp_relocs *pcgp_relocs,
>> -                        bool undefined_weak ATTRIBUTE_UNUSED)
>> +                        bool undefined_weak ATTRIBUTE_UNUSED,
>> +                        bfd_vma *delete_total ATTRIBUTE_UNUSED)
>>   {
>>     /* See if this symbol is in range of tp.  */
>>     if (RISCV_CONST_HIGH_PART (tpoff (link_info, symval)) != 0)
>> @@ -4418,11 +4472,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>>
>>       case R_RISCV_TPREL_HI20:
>>       case R_RISCV_TPREL_ADD:
>> -      /* We can delete the unnecessary instruction and reloc.  */
>> +      /* We can delete the unnecessary instruction and reuse the reloc.  */
>>         rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>>         *again = true;
>> -      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
>> -                                      pcgp_relocs);
>> +      return riscv_relax_delete_bytes (rel->r_offset, 4, rel);
>>
>>       default:
>>         abort ();
>> @@ -4442,7 +4495,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
>>                          bfd_vma reserve_size ATTRIBUTE_UNUSED,
>>                          bool *again ATTRIBUTE_UNUSED,
>>                          riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
>> -                       bool undefined_weak ATTRIBUTE_UNUSED)
>> +                       bool undefined_weak ATTRIBUTE_UNUSED,
>> +                       bfd_vma *delete_total ATTRIBUTE_UNUSED)
>>   {
>>     bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>>     bfd_vma alignment = 1, pos;
>> @@ -4496,10 +4550,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
>>       bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
>>
>>     /* Mark the excess bytes for deletion.  */
>> -  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
>> -  rel->r_addend = rel->r_addend - nop_bytes;
>> -  rel->r_offset = rel->r_offset + nop_bytes;
>> -  return true;
>> +  return riscv_relax_delete_bytes (rel->r_offset + nop_bytes,
>> +                                  rel->r_addend - nop_bytes, rel);
>>   }
>>
>>   /* Relax PC-relative references to GP-relative references.  */
>> @@ -4515,7 +4567,8 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>>                       bfd_vma reserve_size,
>>                       bool *again ATTRIBUTE_UNUSED,
>>                       riscv_pcgp_relocs *pcgp_relocs,
>> -                    bool undefined_weak)
>> +                    bool undefined_weak,
>> +                    bfd_vma *delete_total ATTRIBUTE_UNUSED)
>>   {
>>     bfd_byte *contents = elf_section_data (sec)->this_hdr.contents;
>>     bfd_vma gp = riscv_global_pointer_value (link_info);
>> @@ -4654,28 +4707,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>>     return true;
>>   }
>>
>> -/* Delete the bytes for R_RISCV_DELETE.  */
>> -
>> -static bool
>> -_bfd_riscv_relax_delete (bfd *abfd,
>> -                        asection *sec,
>> -                        asection *sym_sec ATTRIBUTE_UNUSED,
>> -                        struct bfd_link_info *link_info,
>> -                        Elf_Internal_Rela *rel,
>> -                        bfd_vma symval ATTRIBUTE_UNUSED,
>> -                        bfd_vma max_alignment ATTRIBUTE_UNUSED,
>> -                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
>> -                        bool *again ATTRIBUTE_UNUSED,
>> -                        riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
>> -                        bool undefined_weak ATTRIBUTE_UNUSED)
>> -{
>> -  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
>> -                                link_info, NULL))
>> -    return false;
>> -  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
>> -  return true;
>> -}
>> -
>>   /* Called by after_allocation to set the information of data segment
>>      before relaxing.  */
>>
>> @@ -4706,6 +4737,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>>     unsigned int i;
>>     bfd_vma max_alignment, reserve_size = 0;
>>     riscv_pcgp_relocs pcgp_relocs;
>> +  bfd_vma delete_bytes = 0;
>>
>>     *again = false;
>>
>> @@ -4941,7 +4973,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>>
>>         if (!relax_func (abfd, sec, sym_sec, info, rel, symval,
>>                         max_alignment, reserve_size, again,
>> -                      &pcgp_relocs, undefined_weak))
>> +                      &pcgp_relocs, undefined_weak, &delete_bytes))
>>          goto fail;
>>       }
>>
>> --
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 0/3] RISCV: Improve linker time complexity
  2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
                     ` (4 preceding siblings ...)
  2022-05-02 13:50   ` [PATCH v2 5/5] RISCV: Add --defer-deletion flag Patrick O'Neill
@ 2022-05-27 21:20   ` Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 1/3] RISCV: Add linker relaxation tests Patrick O'Neill
                       ` (2 more replies)
  5 siblings, 3 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-27 21:20 UTC (permalink / raw)
  To: binutils, nelson.chu
  Cc: gnu-toolchain, jim.wilson.gcc, palmer, andrew, kito.cheng,
	Patrick O'Neill

The current linker has an O(n^2) time complexity when it comes to
deleting bytes. By deferring the deletion of bytes, we can achieve O(n)
deletion runtime.

There is a pathological case that could cause this to have worse
performance, so the --no-defer-deletion flag is added to allow users
to opt-out.

Patrick O'Neill (3):
  RISCV: Add linker relaxation tests
  RISCV: Implement piecewise deletion
  RISCV: Add --defer-deletion flag

 bfd/elfnn-riscv.c                          | 193 ++++++++++++++++-----
 include/bfdlink.h                          |   4 +
 ld/ld.texi                                 |  10 ++
 ld/ldlex.h                                 |   2 +
 ld/lexsup.c                                |  10 ++
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |   6 +
 ld/testsuite/ld-riscv-elf/relax-call-1.d   |  17 ++
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |   7 +
 ld/testsuite/ld-riscv-elf/relax-call-2.d   |  21 +++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   |  10 ++
 ld/testsuite/ld-riscv-elf/relax-call-3.d   |  25 +++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   |  13 ++
 ld/testsuite/ld-riscv-elf/relax-call-4.d   |  19 ++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |   8 +
 ld/testsuite/ld-riscv-elf/relax-call-5.d   |  23 +++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   |  11 ++
 ld/testsuite/ld-riscv-elf/relax-call-6.d   |  22 +++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   |  11 ++
 18 files changed, 366 insertions(+), 46 deletions(-)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 1/3] RISCV: Add linker relaxation tests
  2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
@ 2022-05-27 21:20     ` Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 2/3] RISCV: Implement piecewise deletion Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 3/3] RISCV: Add --defer-deletion flag Patrick O'Neill
  2 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-27 21:20 UTC (permalink / raw)
  To: binutils, nelson.chu
  Cc: gnu-toolchain, jim.wilson.gcc, palmer, andrew, kito.cheng,
	Patrick O'Neill

Add basic tests for the behavior of relaxing function calls.
This covers behavior like relaxing a call, not relaxing a call, and
relaxing a call that is enabled by relaxing another call.

2022-05-27 Patrick O'Neill <patrick@rivosinc.com>

	* ld-riscv-elf.exp: Add testcases.
	* relax-call-1.d: Add new testcase.
	* relax-call-1.s: Likewise.
	* relax-call-2.d: Likewise.
	* relax-call-2.s: Likewise.
	* relax-call-3.d: Likewise.
	* relax-call-3.s: Likewise.
	* relax-call-4.d: Likewise.
	* relax-call-4.s: Likewise.
	* relax-call-5.d: Likewise.
	* relax-call-5.s: Likewise.
	* relax-call-6.d: Likewise.
	* relax-call-6.s: Likewise.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
 ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp |  6 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.d   | 17 +++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-1.s   |  7 ++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.d   | 21 ++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-2.s   | 10 +++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.d   | 25 ++++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-3.s   | 13 +++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.d   | 19 ++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-4.s   |  8 +++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.d   | 23 ++++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-5.s   | 11 ++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.d   | 22 +++++++++++++++++++
 ld/testsuite/ld-riscv-elf/relax-call-6.s   | 11 ++++++++++
 13 files changed, 193 insertions(+)
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-1.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-2.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-3.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-4.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-5.s
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.d
 create mode 100644 ld/testsuite/ld-riscv-elf/relax-call-6.s

diff --git a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
index 272424b33e3..60f7cc5f208 100644
--- a/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
+++ b/ld/testsuite/ld-riscv-elf/ld-riscv-elf.exp
@@ -119,6 +119,12 @@ proc run_relax_twice_test {} {
 }
 
 if [istarget "riscv*-*-*"] {
+    run_dump_test "relax-call-1"
+    run_dump_test "relax-call-2"
+    run_dump_test "relax-call-3"
+    run_dump_test "relax-call-4"
+    run_dump_test "relax-call-5"
+    run_dump_test "relax-call-6"
     run_dump_test "align-small-region"
     run_dump_test "call-relax"
     run_dump_test "pcgp-relax-01"
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.d b/ld/testsuite/ld-riscv-elf/relax-call-1.d
new file mode 100644
index 00000000000..3e7bfb500c9
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.d
@@ -0,0 +1,17 @@
+#source: relax-call-1.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-1.s b/ld/testsuite/ld-riscv-elf/relax-call-1.s
new file mode 100644
index 00000000000..7b81e28f99f
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-1.s
@@ -0,0 +1,7 @@
+	.text
+foo:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.d b/ld/testsuite/ld-riscv-elf/relax-call-2.d
new file mode 100644
index 00000000000..7481bea8c1e
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.d
@@ -0,0 +1,21 @@
+#source: relax-call-2.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-2.s b/ld/testsuite/ld-riscv-elf/relax-call-2.s
new file mode 100644
index 00000000000..7f5844a6789
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-2.s
@@ -0,0 +1,10 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.d b/ld/testsuite/ld-riscv-elf/relax-call-3.d
new file mode 100644
index 00000000000..5c49222e056
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.d
@@ -0,0 +1,25 @@
+#source: relax-call-3.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+
+.*<bar>:
+.*ret
+
+.*<baz>:
+.*ret
+
+.* <_start>:
+.*jal.* <foo>
+.*jal.* <bar>
+.*jal.* <baz>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-3.s b/ld/testsuite/ld-riscv-elf/relax-call-3.s
new file mode 100644
index 00000000000..66e7c02ca19
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-3.s
@@ -0,0 +1,13 @@
+	.text
+foo:
+	jr	ra
+bar:
+	jr	ra
+baz:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	call	baz
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.d b/ld/testsuite/ld-riscv-elf/relax-call-4.d
new file mode 100644
index 00000000000..09e939ca9b4
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.d
@@ -0,0 +1,19 @@
+#source: relax-call-4.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-4.s b/ld/testsuite/ld-riscv-elf/relax-call-4.s
new file mode 100644
index 00000000000..44eed0bc128
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-4.s
@@ -0,0 +1,8 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048570
+	.globl	_start
+_start:
+	call	foo
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.d b/ld/testsuite/ld-riscv-elf/relax-call-5.d
new file mode 100644
index 00000000000..d9f44e5f8a2
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.d
@@ -0,0 +1,23 @@
+#source: relax-call-5.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*auipc.*
+.*jalr.* <foo>
+.*jal.* <bar>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-5.s b/ld/testsuite/ld-riscv-elf/relax-call-5.s
new file mode 100644
index 00000000000..5605fb14e78
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-5.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048566
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	foo
+	call	bar
+	jr	ra
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.d b/ld/testsuite/ld-riscv-elf/relax-call-6.d
new file mode 100644
index 00000000000..c385ccb5511
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.d
@@ -0,0 +1,22 @@
+#source: relax-call-6.s
+#ld: --relax
+#objdump: -d
+
+
+.*:     file format .*
+
+
+Disassembly of section \.text:
+
+.*<foo>:
+.*ret
+.*\.\.\.
+
+.*<bar>:
+.*ret
+
+.* <_start>:
+.*jal.* <bar>
+.*jal.* <foo>
+.*ret
+#pass
\ No newline at end of file
diff --git a/ld/testsuite/ld-riscv-elf/relax-call-6.s b/ld/testsuite/ld-riscv-elf/relax-call-6.s
new file mode 100644
index 00000000000..6df73e108f4
--- /dev/null
+++ b/ld/testsuite/ld-riscv-elf/relax-call-6.s
@@ -0,0 +1,11 @@
+	.text
+foo:
+	jr	ra
+	.zero 1048560
+bar:
+	jr	ra
+	.globl	_start
+_start:
+	call	bar
+	call	foo
+	jr	ra
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 2/3] RISCV: Implement piecewise deletion
  2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 1/3] RISCV: Add linker relaxation tests Patrick O'Neill
@ 2022-05-27 21:20     ` Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 3/3] RISCV: Add --defer-deletion flag Patrick O'Neill
  2 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-27 21:20 UTC (permalink / raw)
  To: binutils, nelson.chu
  Cc: gnu-toolchain, jim.wilson.gcc, palmer, andrew, kito.cheng,
	Patrick O'Neill

Rather than deleting bytes immediately, we can first mark the bytes for
deletion. Then, using piecewise deletion, we reduce the runtime from
O(n^2) to O(n).

2022-05-27 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c: Add piecewise deletion option (Disabled for
	  .align directives).

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
This change overwrites the R_RISCV_RELAX with a R_RISCV_DELETE reloc
when a relaxation wants to delete bytes. If it was going to delete a
reloc, we use that reloc rather than deleting it.
---
v2 Changelog:
- Arrange functions as to reduce the diff of each patch and make
  reviewing more manageable
v3 Changelog:
- Remove alignment-directive related code and condense into 2 passes
---
 bfd/elfnn-riscv.c | 189 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 143 insertions(+), 46 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 8f9f0d8a86a..4b6ea179442 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4054,15 +4054,17 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
     }
 }
 
-/* Delete some bytes from a section while relaxing.  */
+/* Delete the bytes for R_RISCV_DELETE.  */
 
 static bool
-riscv_relax_delete_bytes (bfd *abfd,
-			  asection *sec,
-			  bfd_vma addr,
-			  size_t count,
-			  struct bfd_link_info *link_info,
-			  riscv_pcgp_relocs *p)
+riscv_relax_delete (bfd *abfd,
+		    asection *sec,
+		    struct bfd_link_info *link_info,
+		    riscv_pcgp_relocs *p,
+		    Elf_Internal_Rela *rel,
+		    bfd_vma *delete_total,
+		    Elf_Internal_Rela **next_delete,
+		    bool piecewise)
 {
   unsigned int i, symcount;
   bfd_vma toaddr = sec->size;
@@ -4072,9 +4074,49 @@ riscv_relax_delete_bytes (bfd *abfd,
   struct bfd_elf_section_data *data = elf_section_data (sec);
   bfd_byte *contents = data->this_hdr.contents;
 
+  bfd_vma addr = rel->r_offset;
+  size_t count = rel->r_addend;
+
+  /* Find the next DELETE reloc (if one exists).  */
+  Elf_Internal_Rela *relocs = elf_section_data (sec)->relocs;
+  *next_delete = NULL;
+  if (piecewise)
+    {
+       /* Since we only replace existing relocs and don't add new relocs, the
+	  relocs are in sequential order. We can skip the relocs prior to this one,
+	  making this search linear time.  */
+	int start = rel - relocs;
+	for (i = start; i < sec->reloc_count; i++)
+	{
+	  *next_delete = relocs + i;
+	  if (ELFNN_R_TYPE ((*next_delete)->r_info) == R_RISCV_DELETE
+	      && (*next_delete)->r_offset > rel->r_offset)
+	    break;
+	  else
+	    *next_delete = NULL;
+	}
+    }
+  else
+    {
+      /* When performing immediate deletions, disable the running total.  */
+      *delete_total = 0;
+    }
+
+  size_t bytes_to_move;
+  /* Make this a piecewise deletion.  */
+  if (*next_delete == NULL)
+    bytes_to_move = toaddr - addr - count;
+  else
+    bytes_to_move = (*next_delete)->r_offset - addr - count;
+
   /* Actually delete the bytes.  */
   sec->size -= count;
-  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
+  memmove (contents + addr, contents + addr + count + *delete_total, bytes_to_move);
+
+  *delete_total += count;
+
+  /* Delete the reloc.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
 
   /* Adjust the location of all of the relocs.  Note that we need not
      adjust the addends, since all PC-relative references must be against
@@ -4085,7 +4127,7 @@ riscv_relax_delete_bytes (bfd *abfd,
 
   /* Adjust the hi_sec_off, and the hi_addr of any entries in the pcgp relocs
      table for which these values occur after the deleted bytes.  */
-  if (p)
+  if (!piecewise && p)
     riscv_update_pcgp_relocs (p, sec, addr, count);
 
   /* Adjust the local symbols defined in this section.  */
@@ -4172,6 +4214,35 @@ riscv_relax_delete_bytes (bfd *abfd,
   return true;
 }
 
+/* Delete some bytes from a section while relaxing.  */
+
+static bool
+riscv_relax_delete_bytes (bfd *abfd,
+			  asection *sec,
+			  bfd_vma addr,
+			  size_t count,
+			  Elf_Internal_Rela *rel,
+			  struct bfd_link_info *link_info,
+			  riscv_pcgp_relocs *p,
+			  bool piecewise)
+{
+  /* Mark bytes for deletion.  */
+  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
+  rel->r_offset = addr;
+  rel->r_addend = count;
+
+  /* Delete immediately.  */
+  if (!piecewise)
+    {
+      bfd_vma delete_total = 0;
+      Elf_Internal_Rela *next_delete = NULL;
+      riscv_relax_delete(abfd, sec, link_info, p, rel,
+			 &delete_total, &next_delete, false);
+    }
+
+  return true;
+}
+
 typedef bool (*relax_func_t) (bfd *, asection *, asection *,
 			      struct bfd_link_info *,
 			      Elf_Internal_Rela *,
@@ -4253,7 +4324,7 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
   /* Delete unnecessary JALR.  */
   *again = true;
   return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
-				   link_info, pcgp_relocs);
+				   rel + 1, link_info, pcgp_relocs, true);
 }
 
 /* Traverse all output sections and return the max alignment.  */
@@ -4343,11 +4414,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
 	  return true;
 
 	case R_RISCV_HI20:
-	  /* We can delete the unnecessary LUI and reloc.  */
-	  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
+	  /* We can delete the unnecessary LUI and reuse the reloc.  */
 	  *again = true;
-	  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
-					   link_info, pcgp_relocs);
+	  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, rel,
+					   link_info, pcgp_relocs, true);
 
 	default:
 	  abort ();
@@ -4379,8 +4449,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
       rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
 
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
-				       link_info, pcgp_relocs);
+      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2, rel + 1,
+				       link_info, pcgp_relocs, true);
     }
 
   return true;
@@ -4418,11 +4488,11 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
 
     case R_RISCV_TPREL_HI20:
     case R_RISCV_TPREL_ADD:
-      /* We can delete the unnecessary instruction and reloc.  */
+      /* We can delete the unnecessary instruction and reuse the reloc.  */
       rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
       *again = true;
-      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
-				       pcgp_relocs);
+      return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, rel,
+				       link_info, pcgp_relocs, true);
 
     default:
       abort ();
@@ -4485,8 +4555,8 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
 
   /* Delete the excess bytes.  */
   return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + nop_bytes,
-				   rel->r_addend - nop_bytes, link_info,
-				   NULL);
+				   rel->r_addend - nop_bytes, rel, link_info,
+				   NULL, false);
 }
 
 /* Relax PC-relative references to GP-relative references.  */
@@ -4641,28 +4711,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
   return true;
 }
 
-/* Delete the bytes for R_RISCV_DELETE.  */
-
-static bool
-_bfd_riscv_relax_delete (bfd *abfd,
-			 asection *sec,
-			 asection *sym_sec ATTRIBUTE_UNUSED,
-			 struct bfd_link_info *link_info,
-			 Elf_Internal_Rela *rel,
-			 bfd_vma symval ATTRIBUTE_UNUSED,
-			 bfd_vma max_alignment ATTRIBUTE_UNUSED,
-			 bfd_vma reserve_size ATTRIBUTE_UNUSED,
-			 bool *again ATTRIBUTE_UNUSED,
-			 riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
-			 bool undefined_weak ATTRIBUTE_UNUSED)
-{
-  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
-				 link_info, NULL))
-    return false;
-  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
-  return true;
-}
-
 /* Called by after_allocation to set the information of data segment
    before relaxing.  */
 
@@ -4771,9 +4819,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	  /* Skip over the R_RISCV_RELAX.  */
 	  i++;
 	}
-      else if (info->relax_pass == 1 && type == R_RISCV_DELETE)
-	relax_func = _bfd_riscv_relax_delete;
-      else if (info->relax_pass == 2 && type == R_RISCV_ALIGN)
+      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
 	relax_func = _bfd_riscv_relax_align;
       else
 	continue;
@@ -4932,6 +4978,57 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
 	goto fail;
     }
 
+  bfd_vma delete_total = 0;
+
+  /* Resolve DELETE relocs.  */
+  if (info->relax_pass == 0)
+    {
+      for (i = 0; i < sec->reloc_count; i++)
+	{
+	  Elf_Internal_Rela *rel = relocs + i;
+	  int type = ELFNN_R_TYPE (rel->r_info);
+
+	  if (type != R_RISCV_DELETE)
+	    continue;
+
+	  /* Get the value of the symbol referred to by the reloc.  */
+	  if (!(ELFNN_R_SYM (rel->r_info) < symtab_hdr->sh_info))
+	    {
+	      unsigned long indx;
+	      struct elf_link_hash_entry *h;
+
+	      indx = ELFNN_R_SYM (rel->r_info) - symtab_hdr->sh_info;
+	      h = elf_sym_hashes (abfd)[indx];
+
+	      while (h->root.type == bfd_link_hash_indirect
+		     || h->root.type == bfd_link_hash_warning)
+		h = (struct elf_link_hash_entry *) h->root.u.i.link;
+
+	      /* Disable the relaxation for ifunc.  */
+	      if (h != NULL && h->type == STT_GNU_IFUNC)
+		continue;
+
+	      if (!((h->root.type == bfd_link_hash_defined
+		     || h->root.type == bfd_link_hash_defweak)
+		    && h->root.u.def.section != NULL
+		    && h->root.u.def.section->output_section != NULL))
+		continue;
+	    }
+
+	  Elf_Internal_Rela *next_delete = NULL;
+
+	  if (!riscv_relax_delete (abfd, sec, info, &pcgp_relocs, rel,
+				   &delete_total, &next_delete, true))
+	    goto fail;
+
+	  /* Skip ahead to the next delete reloc.  */
+	  if (next_delete != NULL)
+	    i = next_delete - relocs - 1;
+	  else
+	    i = sec->reloc_count;
+	}
+    }
+
   ret = true;
 
  fail:
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3 3/3] RISCV: Add --defer-deletion flag
  2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 1/3] RISCV: Add linker relaxation tests Patrick O'Neill
  2022-05-27 21:20     ` [PATCH v3 2/3] RISCV: Implement piecewise deletion Patrick O'Neill
@ 2022-05-27 21:20     ` Patrick O'Neill
  2 siblings, 0 replies; 22+ messages in thread
From: Patrick O'Neill @ 2022-05-27 21:20 UTC (permalink / raw)
  To: binutils, nelson.chu
  Cc: gnu-toolchain, jim.wilson.gcc, palmer, andrew, kito.cheng,
	Patrick O'Neill

Previously the linker failed to relax only the backwards case (where a
relaxation enables another relaxation that has already been checked).
With patches 1-4, the behavior has been changed to fail in both the
forward and backwards case. It's possible that this will lead to worse
performance, so the --no-defer-deletion flag allows you to use the
non-linear immediate deletion method.
By default, the linear deletion method is enabled as this is expected to
be more performant.

2022-05-27 Patrick O'Neill <patrick@rivosinc.com>

	* elfnn-riscv.c: Pass flag to deletion function to immediately
	  delete bytes (or not).
	* bfdlink.h: Add delete_immediately flag.
	* ld/ld.texi: Document the new --defer-deletion flag.
	* ld/ldlex.h: Add DEFER_DELETION and NO_DEFER_DELETION options.
	* ld/lexsup.c: Parse --defer-deletion and --no-defer-deletion
	  flags.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
I think that there may be a better way to phrase the --defer-deletion
flag. I'm also unsure if this is the correct way of expressing
a target-specific linker flag.
---
v2 Changelog:
- This is a new patch.
---
v3 Changelog:
- Rebase with changes to elfnn-riscv.c from patch 2/3.
---
 bfd/elfnn-riscv.c | 12 ++++++++----
 include/bfdlink.h |  4 ++++
 ld/ld.texi        | 10 ++++++++++
 ld/ldlex.h        |  2 ++
 ld/lexsup.c       | 10 ++++++++++
 5 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
index 4b6ea179442..0a750773d47 100644
--- a/bfd/elfnn-riscv.c
+++ b/bfd/elfnn-riscv.c
@@ -4324,7 +4324,8 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
   /* Delete unnecessary JALR.  */
   *again = true;
   return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
-				   rel + 1, link_info, pcgp_relocs, true);
+				   rel + 1, link_info, pcgp_relocs,
+				   !link_info->immediate_deletion);
 }
 
 /* Traverse all output sections and return the max alignment.  */
@@ -4417,7 +4418,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
 	  /* We can delete the unnecessary LUI and reuse the reloc.  */
 	  *again = true;
 	  return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, rel,
-					   link_info, pcgp_relocs, true);
+					   link_info, pcgp_relocs,
+					   !link_info->immediate_deletion);
 
 	default:
 	  abort ();
@@ -4450,7 +4452,8 @@ _bfd_riscv_relax_lui (bfd *abfd,
 
       *again = true;
       return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2, rel + 1,
-				       link_info, pcgp_relocs, true);
+				       link_info, pcgp_relocs,
+				       !link_info->immediate_deletion);
     }
 
   return true;
@@ -4492,7 +4495,8 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
       rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
       *again = true;
       return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, rel,
-				       link_info, pcgp_relocs, true);
+				       link_info, pcgp_relocs,
+				       !link_info->immediate_deletion);
 
     default:
       abort ();
diff --git a/include/bfdlink.h b/include/bfdlink.h
index 09a3ec01685..f6c87fbe911 100644
--- a/include/bfdlink.h
+++ b/include/bfdlink.h
@@ -587,6 +587,10 @@ struct bfd_link_info
        args_type structure in ldmain.c:main.  */
   signed int disable_target_specific_optimizations;
 
+  /* Enable or disable deferred deletion of linker relaxations.
+     This option is enabled my default.  */
+  unsigned int immediate_deletion: 1;
+
   /* Function callbacks.  */
   const struct bfd_link_callbacks *callbacks;
 
diff --git a/ld/ld.texi b/ld/ld.texi
index eabbec8faa9..c536eed4162 100644
--- a/ld/ld.texi
+++ b/ld/ld.texi
@@ -2186,6 +2186,16 @@ This option is ignored for Linux compatibility.
 @item -Qy
 This option is ignored for SVR4 compatibility.
 
+@cindex defer relaxation deletion
+@kindex --defer-relaxation-deletions
+@kindex --no-defer-relaxation-deletions
+@item --defer-relaxation-deletions
+@itemx --no-defer-relaxation-deletions
+Enable (or don't enable) linear time linker relaxations by deferring byte
+deletion. This assumes the pathologial case is rare. If the pathological case
+is common, disabling linear linking will improve performance.
+Default to linear time linker relaxation.
+
 @kindex --relax
 @cindex synthesizing linker
 @cindex relaxing addressing modes
diff --git a/ld/ldlex.h b/ld/ldlex.h
index 57ade1f754b..ce9f5d5aafe 100644
--- a/ld/ldlex.h
+++ b/ld/ldlex.h
@@ -54,6 +54,8 @@ enum option_values
   OPTION_OFORMAT,
   OPTION_RELAX,
   OPTION_NO_RELAX,
+  OPTION_DEFER_DELETION,
+  OPTION_NO_DEFER_DELETION,
   OPTION_NO_SYMBOLIC,
   OPTION_RETAIN_SYMBOLS_FILE,
   OPTION_RPATH,
diff --git a/ld/lexsup.c b/ld/lexsup.c
index 9225f71b3ce..1a0d9a31fef 100644
--- a/ld/lexsup.c
+++ b/ld/lexsup.c
@@ -442,6 +442,10 @@ static const struct ld_option ld_options[] =
     '\0', NULL, N_("Reduce code size by using target specific optimizations"), TWO_DASHES },
   { {"no-relax", no_argument, NULL, OPTION_NO_RELAX},
     '\0', NULL, N_("Do not use relaxation techniques to reduce code size"), TWO_DASHES },
+  { {"defer-deletion", no_argument, NULL, OPTION_DEFER_DELETION},
+    '\0', NULL, N_("Defer deletions caused by linker relaxation"), TWO_DASHES },
+  { {"no-defer-deletion", no_argument, NULL, OPTION_NO_DEFER_DELETION},
+    '\0', NULL, N_("Do not defer deletions caused by linker relaxation"), TWO_DASHES },
   { {"retain-symbols-file", required_argument, NULL,
      OPTION_RETAIN_SYMBOLS_FILE},
     '\0', N_("FILE"), N_("Keep only symbols listed in FILE"), TWO_DASHES },
@@ -935,6 +939,12 @@ parse_args (unsigned argc, char **argv)
 	case OPTION_NO_WARN_RWX_SEGMENTS:
 	  link_info.no_warn_rwx_segments = 1;
 	  break;
+	case OPTION_DEFER_DELETION:
+	  link_info.immediate_deletion = false;
+	  break;
+	case OPTION_NO_DEFER_DELETION:
+	  link_info.immediate_deletion = true;
+	  break;
 	case 'e':
 	  lang_add_entry (optarg, true);
 	  break;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-05-27 21:20 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-12 16:25 [PATCH 0/4] RISCV: Improve linker time complexity Patrick O'Neill
2022-04-12 16:25 ` [PATCH 1/4] RISCV: Add linker relaxation tests Patrick O'Neill
2022-04-12 16:25 ` [PATCH 2/4] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
2022-04-12 16:26 ` [PATCH 3/4] RISCV: Implement piecewise deletion Patrick O'Neill
2022-04-12 16:26 ` [PATCH 4/4] RISCV: Improve runtime of align directives Patrick O'Neill
2022-04-13  0:58 ` [PATCH 0/4] RISCV: Improve linker time complexity Kito Cheng
2022-04-13  2:23   ` Palmer Dabbelt
2022-04-13  5:12   ` Alan Modra
2022-04-13 18:11     ` Palmer Dabbelt
2022-04-25 17:26       ` Patrick O'Neill
2022-05-02 13:50 ` [PATCH v2 0/5] " Patrick O'Neill
2022-05-02 13:50   ` [PATCH v2 1/5] RISCV: Add linker relaxation tests Patrick O'Neill
2022-05-02 13:50   ` [PATCH v2 2/5] RISCV: Arrange DELETE pass after .align pass Patrick O'Neill
2022-05-02 13:50   ` [PATCH v2 3/5] RISCV: Implement piecewise deletion Patrick O'Neill
2022-05-20 10:48     ` Nelson Chu
2022-05-20 17:36       ` Patrick O'Neill
2022-05-02 13:50   ` [PATCH v2 4/5] RISCV: Improve runtime of align directives Patrick O'Neill
2022-05-02 13:50   ` [PATCH v2 5/5] RISCV: Add --defer-deletion flag Patrick O'Neill
2022-05-27 21:20   ` [PATCH v3 0/3] RISCV: Improve linker time complexity Patrick O'Neill
2022-05-27 21:20     ` [PATCH v3 1/3] RISCV: Add linker relaxation tests Patrick O'Neill
2022-05-27 21:20     ` [PATCH v3 2/3] RISCV: Implement piecewise deletion Patrick O'Neill
2022-05-27 21:20     ` [PATCH v3 3/3] RISCV: Add --defer-deletion flag Patrick O'Neill

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).