From: Nelson Chu <nelson@rivosinc.com>
To: binutils@sourceware.org, Jim Wilson <jim.wilson.gcc@gmail.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Andrew Waterman <andrew@sifive.com>
Cc: "Patrick O'Neill" <patrick@rivosinc.com>
Subject: Re: [committed 1/2] RISC-V: Improve link time complexity.
Date: Tue, 25 Oct 2022 12:36:01 +0800 [thread overview]
Message-ID: <CAPpQWtBGs4quaqyD9TpxSt9ptk-B7OrsMdVfPZHX283FnX_Q+A@mail.gmail.com> (raw)
In-Reply-To: <20221025013347.68282-1-nelson@rivosinc.com>
Btw, for some corner cases that the relaxations may have dependency to
others (relax A needs to relax B in advance), though the piecewise
deletion reduce the time of memove, but it seems like the time is
still wasted by the more relax rounds since the `again' flag. Not
sure if it is worth it to support multiple levels of deletions, for
example, to limit the relax round to 1 when using the piecewise
deletion, so that people who don't care much about the relaxed results
will have a faster linker.
Thanks
Nelson
On Tue, Oct 25, 2022 at 9:33 AM Nelson Chu <nelson@rivosinc.com> wrote:
>
> From: Patrick O'Neill <patrick@rivosinc.com>
>
> The riscv port does deletion and symbol table update for each relocation
> while relaxing, so we are moving section bytes and scanning symbol table once
> for each relocation. Compared to microblaze port, they record the relaxation
> changes into a table, then do the deletion and symbol table update once per
> section, rather than per relocation. Therefore, they should have better link
> time complexity than us.
>
> To improve the link time complexity, this patch try to make the deletion in
> linear time. Compared to record the relaxation changes into a table, we
> replace the unused relocation with R_RISCV_DELETE for the deleted bytes, and
> then resolve them at the end of the section. Assuming the number of
> R_RISCV_DELETE is m, and the number of symbols is n, the total link complexity
> should be O(m) for moving section bytes, and O(m*n^2) for symbol table update.
> If we record the relaxation changes into the table, and then sort the symbol
> table by values, then probably can reduce the time complexity to O(m*n*log(n))
> for updating symbol table, but it doesn't seem worth it for now.
>
> bfd/
> * elfnn-riscv.c (_riscv_relax_delete_bytes): Renamed from
> riscv_relax_delete_bytes, updated to reduce the tiem complexity to O(m)
> for memmove.
> (typedef relax_delete_t): Function pointer declaration of delete functions.
> (riscv_relax_delete_bytes): Can choose to use _riscv_relax_delete_piecewise
> or _riscv_relax_delete_immediate for deletion.
> (_riscv_relax_delete_piecewise): Just mark the deleted bytes as R_RISCV_DELETE.
> (_riscv_relax_delete_immediate): Delete some bytes from a section while
> relaxing.
> (riscv_relax_resolve_delete_relocs): Delete the bytes for R_RISCV_DELETE
> relocations from a section, at the end of _bfd_riscv_relax_section.
> (_bfd_riscv_relax_call): Mark deleted bytes as R_RISCV_DELETE by reusing
> R_RISCV_RELAX.
> (_bfd_riscv_relax_lui): Likewise, but reuse R_RISCV_HI20 for lui, and reuse
> R_RISCV_RELAX for c.lui.
> (_bfd_riscv_relax_tls_le): Likewise, but resue R_RISCV_TPREL_HI20 and
> R_RISCV_TPREL_ADD.
> (_bfd_riscv_relax_pc): Likewise, but resue R_RISCV_PCREL_HI20 for auipc.
> (_bfd_riscv_relax_align): Updated, don't need to resue relocation since
> calling _riscv_relax_delete_immediate.
> (_bfd_riscv_relax_delete): Removed.
> (_bfd_riscv_relax_section): Set riscv_relax_delete_bytes for each relax_func,
> to delete bytes immediately or later. Call riscv_relax_resolve_delete_relocs
> to delete bytes for DELETE relocations from a section.
> ---
> bfd/elfnn-riscv.c | 180 +++++++++++++++++++++++++++++++++-------------
> 1 file changed, 131 insertions(+), 49 deletions(-)
>
> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
> index 3d2ddf4e651..e4064313724 100644
> --- a/bfd/elfnn-riscv.c
> +++ b/bfd/elfnn-riscv.c
> @@ -4043,27 +4043,32 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
> }
> }
>
> -/* Delete some bytes from a section while relaxing. */
> +/* Delete some bytes, adjust relcocations and symbol table from a section. */
>
> static bool
> -riscv_relax_delete_bytes (bfd *abfd,
> - asection *sec,
> - bfd_vma addr,
> - size_t count,
> - struct bfd_link_info *link_info,
> - riscv_pcgp_relocs *p)
> +_riscv_relax_delete_bytes (bfd *abfd,
> + asection *sec,
> + bfd_vma addr,
> + size_t count,
> + struct bfd_link_info *link_info,
> + riscv_pcgp_relocs *p,
> + bfd_vma delete_total,
> + bfd_vma toaddr)
> {
> unsigned int i, symcount;
> - bfd_vma toaddr = sec->size;
> struct elf_link_hash_entry **sym_hashes = elf_sym_hashes (abfd);
> Elf_Internal_Shdr *symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
> unsigned int sec_shndx = _bfd_elf_section_from_bfd_section (abfd, sec);
> struct bfd_elf_section_data *data = elf_section_data (sec);
> bfd_byte *contents = data->this_hdr.contents;
> + size_t bytes_to_move = toaddr - addr - count;
>
> /* Actually delete the bytes. */
> sec->size -= count;
> - memmove (contents + addr, contents + addr + count, toaddr - addr - count);
> + memmove (contents + addr, contents + addr + count + delete_total, bytes_to_move);
> +
> + /* Still adjust relocations and symbols in non-linear times. */
> + toaddr = sec->size + count;
>
> /* Adjust the location of all of the relocs. Note that we need not
> adjust the addends, since all PC-relative references must be against
> @@ -4161,6 +4166,99 @@ riscv_relax_delete_bytes (bfd *abfd,
> return true;
> }
>
> +typedef bool (*relax_delete_t) (bfd *, asection *,
> + bfd_vma, size_t,
> + struct bfd_link_info *,
> + riscv_pcgp_relocs *,
> + Elf_Internal_Rela *);
> +
> +static relax_delete_t riscv_relax_delete_bytes;
> +
> +/* Do not delete some bytes from a section while relaxing.
> + Just mark the deleted bytes as R_RISCV_DELETE. */
> +
> +static bool
> +_riscv_relax_delete_piecewise (bfd *abfd ATTRIBUTE_UNUSED,
> + asection *sec ATTRIBUTE_UNUSED,
> + bfd_vma addr,
> + size_t count,
> + struct bfd_link_info *link_info ATTRIBUTE_UNUSED,
> + riscv_pcgp_relocs *p ATTRIBUTE_UNUSED,
> + Elf_Internal_Rela *rel)
> +{
> + if (rel == NULL)
> + return false;
> + rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> + rel->r_offset = addr;
> + rel->r_addend = count;
> + return true;
> +}
> +
> +/* Delete some bytes from a section while relaxing. */
> +
> +static bool
> +_riscv_relax_delete_immediate (bfd *abfd,
> + asection *sec,
> + bfd_vma addr,
> + size_t count,
> + struct bfd_link_info *link_info,
> + riscv_pcgp_relocs *p,
> + Elf_Internal_Rela *rel)
> +{
> + if (rel != NULL)
> + rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> + return _riscv_relax_delete_bytes (abfd, sec, addr, count,
> + link_info, p, 0, sec->size);
> +}
> +
> +/* Delete the bytes for R_RISCV_DELETE relocs. */
> +
> +static bool
> +riscv_relax_resolve_delete_relocs (bfd *abfd,
> + asection *sec,
> + struct bfd_link_info *link_info,
> + Elf_Internal_Rela *relocs)
> +{
> + bfd_vma delete_total = 0;
> + unsigned int i;
> +
> + for (i = 0; i < sec->reloc_count; i++)
> + {
> + Elf_Internal_Rela *rel = relocs + i;
> + if (ELFNN_R_TYPE (rel->r_info) != R_RISCV_DELETE)
> + continue;
> +
> + /* Find the next R_RISCV_DELETE reloc if possible. */
> + Elf_Internal_Rela *rel_next = NULL;
> + unsigned int start = rel - relocs;
> + for (i = start; i < sec->reloc_count; i++)
> + {
> + /* Since we only replace existing relocs and don't add new relocs, the
> + relocs are in sequential order. We can skip the relocs prior to this
> + one, making this search linear time. */
> + rel_next = relocs + i;
> + if (ELFNN_R_TYPE ((rel_next)->r_info) == R_RISCV_DELETE
> + && (rel_next)->r_offset > rel->r_offset)
> + break;
> + else
> + rel_next = NULL;
> + }
> +
> + bfd_vma toaddr = rel_next == NULL ? sec->size : rel_next->r_offset;
> + if (!_riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
> + link_info, NULL, delete_total, toaddr))
> + return false;
> +
> + delete_total += rel->r_addend;
> + rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> +
> + /* Skip ahead to the next delete reloc. */
> + i = rel_next != NULL ? rel_next - relocs - 1 : sec->reloc_count;
> + }
> +
> + return true;
> +}
> +
> typedef bool (*relax_func_t) (bfd *, asection *, asection *,
> struct bfd_link_info *,
> Elf_Internal_Rela *,
> @@ -4239,10 +4337,10 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
> /* Replace the AUIPC. */
> riscv_put_insn (8 * len, auipc, contents + rel->r_offset);
>
> - /* Delete unnecessary JALR. */
> + /* Delete unnecessary JALR and reuse the R_RISCV_RELAX reloc. */
> *again = true;
> return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
> - link_info, pcgp_relocs);
> + link_info, pcgp_relocs, rel + 1);
> }
>
> /* Traverse all output sections and return the max alignment. */
> @@ -4332,11 +4430,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
> return true;
>
> case R_RISCV_HI20:
> - /* We can delete the unnecessary LUI and reloc. */
> - rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> + /* Delete unnecessary LUI and reuse the reloc. */
> *again = true;
> return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
> - link_info, pcgp_relocs);
> + link_info, pcgp_relocs, rel);
>
> default:
> abort ();
> @@ -4367,9 +4464,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
> /* Replace the R_RISCV_HI20 reloc. */
> rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
>
> + /* Delete extra bytes and reuse the R_RISCV_RELAX reloc. */
> *again = true;
> return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
> - link_info, pcgp_relocs);
> + link_info, pcgp_relocs, rel + 1);
> }
>
> return true;
> @@ -4407,11 +4505,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>
> case R_RISCV_TPREL_HI20:
> case R_RISCV_TPREL_ADD:
> - /* We can delete the unnecessary instruction and reloc. */
> - rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> + /* Delete unnecessary instruction and reuse the reloc. */
> *again = true;
> return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
> - pcgp_relocs);
> + pcgp_relocs, rel);
>
> default:
> abort ();
> @@ -4472,10 +4569,10 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
> if (nop_bytes % 4 != 0)
> bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
>
> - /* Delete the excess bytes. */
> + /* Delete excess bytes. */
> return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + nop_bytes,
> rel->r_addend - nop_bytes, link_info,
> - NULL);
> + NULL, NULL);
> }
>
> /* Relax PC-relative references to GP-relative references. */
> @@ -4617,9 +4714,9 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
> ELFNN_R_SYM(rel->r_info),
> sym_sec,
> undefined_weak);
> - /* We can delete the unnecessary AUIPC and reloc. */
> - rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> - rel->r_addend = 4;
> + /* Delete unnecessary AUIPC and reuse the reloc. */
> + riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
> + pcgp_relocs, rel);
> return true;
>
> default:
> @@ -4630,28 +4727,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
> return true;
> }
>
> -/* Delete the bytes for R_RISCV_DELETE. */
> -
> -static bool
> -_bfd_riscv_relax_delete (bfd *abfd,
> - asection *sec,
> - asection *sym_sec ATTRIBUTE_UNUSED,
> - struct bfd_link_info *link_info,
> - Elf_Internal_Rela *rel,
> - bfd_vma symval ATTRIBUTE_UNUSED,
> - bfd_vma max_alignment ATTRIBUTE_UNUSED,
> - bfd_vma reserve_size ATTRIBUTE_UNUSED,
> - bool *again ATTRIBUTE_UNUSED,
> - riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
> - bool undefined_weak ATTRIBUTE_UNUSED)
> -{
> - if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
> - link_info, NULL))
> - return false;
> - rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> - return true;
> -}
> -
> /* Called by after_allocation to set the information of data segment
> before relaxing. */
>
> @@ -4729,6 +4804,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
> bool undefined_weak = false;
>
> relax_func = NULL;
> + riscv_relax_delete_bytes = NULL;
> if (info->relax_pass == 0)
> {
> if (type == R_RISCV_CALL
> @@ -4750,6 +4826,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
> relax_func = _bfd_riscv_relax_pc;
> else
> continue;
> + riscv_relax_delete_bytes = _riscv_relax_delete_piecewise;
>
> /* Only relax this reloc if it is paired with R_RISCV_RELAX. */
> if (i == sec->reloc_count - 1
> @@ -4760,10 +4837,11 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
> /* Skip over the R_RISCV_RELAX. */
> i++;
> }
> - else if (info->relax_pass == 1 && type == R_RISCV_DELETE)
> - relax_func = _bfd_riscv_relax_delete;
> - else if (info->relax_pass == 2 && type == R_RISCV_ALIGN)
> - relax_func = _bfd_riscv_relax_align;
> + else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
> + {
> + relax_func = _bfd_riscv_relax_align;
> + riscv_relax_delete_bytes = _riscv_relax_delete_immediate;
> + }
> else
> continue;
>
> @@ -4921,6 +4999,10 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
> goto fail;
> }
>
> + /* Resolve R_RISCV_DELETE relocations. */
> + if (!riscv_relax_resolve_delete_relocs (abfd, sec, info, relocs))
> + goto fail;
> +
> ret = true;
>
> fail:
> --
> 2.37.0 (Apple Git-136)
>
next prev parent reply other threads:[~2022-10-25 4:36 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-25 1:33 Nelson Chu
2022-10-25 1:33 ` [committed 2/2] RISC-V: Should reset `again' flag for _bfd_riscv_relax_pc Nelson Chu
2022-10-25 4:36 ` Nelson Chu [this message]
2022-10-25 10:10 ` [committed 1/2] RISC-V: Improve link time complexity Luis Machado
2022-10-26 2:06 ` Nelson Chu
2022-10-26 8:30 ` Luis Machado
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAPpQWtBGs4quaqyD9TpxSt9ptk-B7OrsMdVfPZHX283FnX_Q+A@mail.gmail.com \
--to=nelson@rivosinc.com \
--cc=andrew@sifive.com \
--cc=binutils@sourceware.org \
--cc=jim.wilson.gcc@gmail.com \
--cc=palmer@dabbelt.com \
--cc=patrick@rivosinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).