public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Nelson Chu <nelson@rivosinc.com>
To: binutils@sourceware.org, Jim Wilson <jim.wilson.gcc@gmail.com>,
	 Palmer Dabbelt <palmer@dabbelt.com>,
	Andrew Waterman <andrew@sifive.com>
Cc: "Patrick O'Neill" <patrick@rivosinc.com>
Subject: Re: [committed 1/2] RISC-V: Improve link time complexity.
Date: Tue, 25 Oct 2022 12:36:01 +0800	[thread overview]
Message-ID: <CAPpQWtBGs4quaqyD9TpxSt9ptk-B7OrsMdVfPZHX283FnX_Q+A@mail.gmail.com> (raw)
In-Reply-To: <20221025013347.68282-1-nelson@rivosinc.com>

Btw, for some corner cases that the relaxations may have dependency to
others (relax A needs to relax B in advance), though the piecewise
deletion reduce the time of memove, but it seems like the time is
still wasted by the more relax rounds since the `again' flag.  Not
sure if it is worth it to support multiple levels of deletions, for
example, to limit the relax round to 1 when using the piecewise
deletion, so that people who don't care much about the relaxed results
will have a faster linker.

Thanks
Nelson

On Tue, Oct 25, 2022 at 9:33 AM Nelson Chu <nelson@rivosinc.com> wrote:
>
> From: Patrick O'Neill <patrick@rivosinc.com>
>
> The riscv port does deletion and symbol table update for each relocation
> while relaxing, so we are moving section bytes and scanning symbol table once
> for each relocation.  Compared to microblaze port, they record the relaxation
> changes into a table, then do the deletion and symbol table update once per
> section, rather than per relocation.  Therefore, they should have better link
> time complexity than us.
>
> To improve the link time complexity, this patch try to make the deletion in
> linear time.  Compared to record the relaxation changes into a table, we
> replace the unused relocation with R_RISCV_DELETE for the deleted bytes, and
> then resolve them at the end of the section.  Assuming the number of
> R_RISCV_DELETE is m, and the number of symbols is n, the total link complexity
> should be O(m) for moving section bytes, and O(m*n^2) for symbol table update.
> If we record the relaxation changes into the table, and then sort the symbol
> table by values, then probably can reduce the time complexity to O(m*n*log(n))
> for updating symbol table, but it doesn't seem worth it for now.
>
> bfd/
>     * elfnn-riscv.c (_riscv_relax_delete_bytes): Renamed from
>     riscv_relax_delete_bytes, updated to reduce the tiem complexity to O(m)
>     for memmove.
>     (typedef relax_delete_t): Function pointer declaration of delete functions.
>     (riscv_relax_delete_bytes): Can choose to use _riscv_relax_delete_piecewise
>     or _riscv_relax_delete_immediate for deletion.
>     (_riscv_relax_delete_piecewise): Just mark the deleted bytes as R_RISCV_DELETE.
>     (_riscv_relax_delete_immediate): Delete some bytes from a section while
>     relaxing.
>     (riscv_relax_resolve_delete_relocs): Delete the bytes for R_RISCV_DELETE
>     relocations from a section, at the end of _bfd_riscv_relax_section.
>     (_bfd_riscv_relax_call): Mark deleted bytes as R_RISCV_DELETE by reusing
>     R_RISCV_RELAX.
>     (_bfd_riscv_relax_lui): Likewise, but reuse R_RISCV_HI20 for lui, and reuse
>     R_RISCV_RELAX for c.lui.
>     (_bfd_riscv_relax_tls_le): Likewise, but resue R_RISCV_TPREL_HI20 and
>     R_RISCV_TPREL_ADD.
>     (_bfd_riscv_relax_pc): Likewise, but resue R_RISCV_PCREL_HI20 for auipc.
>     (_bfd_riscv_relax_align): Updated, don't need to resue relocation since
>     calling _riscv_relax_delete_immediate.
>     (_bfd_riscv_relax_delete): Removed.
>     (_bfd_riscv_relax_section): Set riscv_relax_delete_bytes for each relax_func,
>     to delete bytes immediately or later.  Call riscv_relax_resolve_delete_relocs
>     to delete bytes for DELETE relocations from a section.
> ---
>  bfd/elfnn-riscv.c | 180 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 131 insertions(+), 49 deletions(-)
>
> diff --git a/bfd/elfnn-riscv.c b/bfd/elfnn-riscv.c
> index 3d2ddf4e651..e4064313724 100644
> --- a/bfd/elfnn-riscv.c
> +++ b/bfd/elfnn-riscv.c
> @@ -4043,27 +4043,32 @@ riscv_update_pcgp_relocs (riscv_pcgp_relocs *p, asection *deleted_sec,
>      }
>  }
>
> -/* Delete some bytes from a section while relaxing.  */
> +/* Delete some bytes, adjust relcocations and symbol table from a section.  */
>
>  static bool
> -riscv_relax_delete_bytes (bfd *abfd,
> -                         asection *sec,
> -                         bfd_vma addr,
> -                         size_t count,
> -                         struct bfd_link_info *link_info,
> -                         riscv_pcgp_relocs *p)
> +_riscv_relax_delete_bytes (bfd *abfd,
> +                          asection *sec,
> +                          bfd_vma addr,
> +                          size_t count,
> +                          struct bfd_link_info *link_info,
> +                          riscv_pcgp_relocs *p,
> +                          bfd_vma delete_total,
> +                          bfd_vma toaddr)
>  {
>    unsigned int i, symcount;
> -  bfd_vma toaddr = sec->size;
>    struct elf_link_hash_entry **sym_hashes = elf_sym_hashes (abfd);
>    Elf_Internal_Shdr *symtab_hdr = &elf_tdata (abfd)->symtab_hdr;
>    unsigned int sec_shndx = _bfd_elf_section_from_bfd_section (abfd, sec);
>    struct bfd_elf_section_data *data = elf_section_data (sec);
>    bfd_byte *contents = data->this_hdr.contents;
> +  size_t bytes_to_move = toaddr - addr - count;
>
>    /* Actually delete the bytes.  */
>    sec->size -= count;
> -  memmove (contents + addr, contents + addr + count, toaddr - addr - count);
> +  memmove (contents + addr, contents + addr + count + delete_total, bytes_to_move);
> +
> +  /* Still adjust relocations and symbols in non-linear times.  */
> +  toaddr = sec->size + count;
>
>    /* Adjust the location of all of the relocs.  Note that we need not
>       adjust the addends, since all PC-relative references must be against
> @@ -4161,6 +4166,99 @@ riscv_relax_delete_bytes (bfd *abfd,
>    return true;
>  }
>
> +typedef bool (*relax_delete_t) (bfd *, asection *,
> +                               bfd_vma, size_t,
> +                               struct bfd_link_info *,
> +                               riscv_pcgp_relocs *,
> +                               Elf_Internal_Rela *);
> +
> +static relax_delete_t riscv_relax_delete_bytes;
> +
> +/* Do not delete some bytes from a section while relaxing.
> +   Just mark the deleted bytes as R_RISCV_DELETE.  */
> +
> +static bool
> +_riscv_relax_delete_piecewise (bfd *abfd ATTRIBUTE_UNUSED,
> +                              asection *sec ATTRIBUTE_UNUSED,
> +                              bfd_vma addr,
> +                              size_t count,
> +                              struct bfd_link_info *link_info ATTRIBUTE_UNUSED,
> +                              riscv_pcgp_relocs *p ATTRIBUTE_UNUSED,
> +                              Elf_Internal_Rela *rel)
> +{
> +  if (rel == NULL)
> +    return false;
> +  rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> +  rel->r_offset = addr;
> +  rel->r_addend = count;
> +  return true;
> +}
> +
> +/* Delete some bytes from a section while relaxing.  */
> +
> +static bool
> +_riscv_relax_delete_immediate (bfd *abfd,
> +                              asection *sec,
> +                              bfd_vma addr,
> +                              size_t count,
> +                              struct bfd_link_info *link_info,
> +                              riscv_pcgp_relocs *p,
> +                              Elf_Internal_Rela *rel)
> +{
> +  if (rel != NULL)
> +    rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> +  return _riscv_relax_delete_bytes (abfd, sec, addr, count,
> +                                   link_info, p, 0, sec->size);
> +}
> +
> +/* Delete the bytes for R_RISCV_DELETE relocs.  */
> +
> +static bool
> +riscv_relax_resolve_delete_relocs (bfd *abfd,
> +                                  asection *sec,
> +                                  struct bfd_link_info *link_info,
> +                                  Elf_Internal_Rela *relocs)
> +{
> +  bfd_vma delete_total = 0;
> +  unsigned int i;
> +
> +  for (i = 0; i < sec->reloc_count; i++)
> +    {
> +      Elf_Internal_Rela *rel = relocs + i;
> +      if (ELFNN_R_TYPE (rel->r_info) != R_RISCV_DELETE)
> +       continue;
> +
> +      /* Find the next R_RISCV_DELETE reloc if possible.  */
> +      Elf_Internal_Rela *rel_next = NULL;
> +      unsigned int start = rel - relocs;
> +      for (i = start; i < sec->reloc_count; i++)
> +       {
> +         /* Since we only replace existing relocs and don't add new relocs, the
> +            relocs are in sequential order. We can skip the relocs prior to this
> +            one, making this search linear time.  */
> +         rel_next = relocs + i;
> +         if (ELFNN_R_TYPE ((rel_next)->r_info) == R_RISCV_DELETE
> +             && (rel_next)->r_offset > rel->r_offset)
> +           break;
> +         else
> +           rel_next = NULL;
> +       }
> +
> +      bfd_vma toaddr = rel_next == NULL ? sec->size : rel_next->r_offset;
> +      if (!_riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
> +                                     link_info, NULL, delete_total, toaddr))
> +       return false;
> +
> +      delete_total += rel->r_addend;
> +      rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> +
> +      /* Skip ahead to the next delete reloc.  */
> +      i = rel_next != NULL ? rel_next - relocs - 1 : sec->reloc_count;
> +    }
> +
> +  return true;
> +}
> +
>  typedef bool (*relax_func_t) (bfd *, asection *, asection *,
>                               struct bfd_link_info *,
>                               Elf_Internal_Rela *,
> @@ -4239,10 +4337,10 @@ _bfd_riscv_relax_call (bfd *abfd, asection *sec, asection *sym_sec,
>    /* Replace the AUIPC.  */
>    riscv_put_insn (8 * len, auipc, contents + rel->r_offset);
>
> -  /* Delete unnecessary JALR.  */
> +  /* Delete unnecessary JALR and reuse the R_RISCV_RELAX reloc.  */
>    *again = true;
>    return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + len, 8 - len,
> -                                  link_info, pcgp_relocs);
> +                                  link_info, pcgp_relocs, rel + 1);
>  }
>
>  /* Traverse all output sections and return the max alignment.  */
> @@ -4332,11 +4430,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
>           return true;
>
>         case R_RISCV_HI20:
> -         /* We can delete the unnecessary LUI and reloc.  */
> -         rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> +         /* Delete unnecessary LUI and reuse the reloc.  */
>           *again = true;
>           return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4,
> -                                          link_info, pcgp_relocs);
> +                                          link_info, pcgp_relocs, rel);
>
>         default:
>           abort ();
> @@ -4367,9 +4464,10 @@ _bfd_riscv_relax_lui (bfd *abfd,
>        /* Replace the R_RISCV_HI20 reloc.  */
>        rel->r_info = ELFNN_R_INFO (ELFNN_R_SYM (rel->r_info), R_RISCV_RVC_LUI);
>
> +      /* Delete extra bytes and reuse the R_RISCV_RELAX reloc.  */
>        *again = true;
>        return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + 2, 2,
> -                                      link_info, pcgp_relocs);
> +                                      link_info, pcgp_relocs, rel + 1);
>      }
>
>    return true;
> @@ -4407,11 +4505,10 @@ _bfd_riscv_relax_tls_le (bfd *abfd,
>
>      case R_RISCV_TPREL_HI20:
>      case R_RISCV_TPREL_ADD:
> -      /* We can delete the unnecessary instruction and reloc.  */
> -      rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> +      /* Delete unnecessary instruction and reuse the reloc.  */
>        *again = true;
>        return riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
> -                                      pcgp_relocs);
> +                                      pcgp_relocs, rel);
>
>      default:
>        abort ();
> @@ -4472,10 +4569,10 @@ _bfd_riscv_relax_align (bfd *abfd, asection *sec,
>    if (nop_bytes % 4 != 0)
>      bfd_putl16 (RVC_NOP, contents + rel->r_offset + pos);
>
> -  /* Delete the excess bytes.  */
> +  /* Delete excess bytes.  */
>    return riscv_relax_delete_bytes (abfd, sec, rel->r_offset + nop_bytes,
>                                    rel->r_addend - nop_bytes, link_info,
> -                                  NULL);
> +                                  NULL, NULL);
>  }
>
>  /* Relax PC-relative references to GP-relative references.  */
> @@ -4617,9 +4714,9 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>                                       ELFNN_R_SYM(rel->r_info),
>                                       sym_sec,
>                                       undefined_weak);
> -         /* We can delete the unnecessary AUIPC and reloc.  */
> -         rel->r_info = ELFNN_R_INFO (0, R_RISCV_DELETE);
> -         rel->r_addend = 4;
> +         /* Delete unnecessary AUIPC and reuse the reloc.  */
> +         riscv_relax_delete_bytes (abfd, sec, rel->r_offset, 4, link_info,
> +                                   pcgp_relocs, rel);
>           return true;
>
>         default:
> @@ -4630,28 +4727,6 @@ _bfd_riscv_relax_pc (bfd *abfd ATTRIBUTE_UNUSED,
>    return true;
>  }
>
> -/* Delete the bytes for R_RISCV_DELETE.  */
> -
> -static bool
> -_bfd_riscv_relax_delete (bfd *abfd,
> -                        asection *sec,
> -                        asection *sym_sec ATTRIBUTE_UNUSED,
> -                        struct bfd_link_info *link_info,
> -                        Elf_Internal_Rela *rel,
> -                        bfd_vma symval ATTRIBUTE_UNUSED,
> -                        bfd_vma max_alignment ATTRIBUTE_UNUSED,
> -                        bfd_vma reserve_size ATTRIBUTE_UNUSED,
> -                        bool *again ATTRIBUTE_UNUSED,
> -                        riscv_pcgp_relocs *pcgp_relocs ATTRIBUTE_UNUSED,
> -                        bool undefined_weak ATTRIBUTE_UNUSED)
> -{
> -  if (!riscv_relax_delete_bytes (abfd, sec, rel->r_offset, rel->r_addend,
> -                                link_info, NULL))
> -    return false;
> -  rel->r_info = ELFNN_R_INFO (0, R_RISCV_NONE);
> -  return true;
> -}
> -
>  /* Called by after_allocation to set the information of data segment
>     before relaxing.  */
>
> @@ -4729,6 +4804,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>        bool undefined_weak = false;
>
>        relax_func = NULL;
> +      riscv_relax_delete_bytes = NULL;
>        if (info->relax_pass == 0)
>         {
>           if (type == R_RISCV_CALL
> @@ -4750,6 +4826,7 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>             relax_func = _bfd_riscv_relax_pc;
>           else
>             continue;
> +         riscv_relax_delete_bytes = _riscv_relax_delete_piecewise;
>
>           /* Only relax this reloc if it is paired with R_RISCV_RELAX.  */
>           if (i == sec->reloc_count - 1
> @@ -4760,10 +4837,11 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>           /* Skip over the R_RISCV_RELAX.  */
>           i++;
>         }
> -      else if (info->relax_pass == 1 && type == R_RISCV_DELETE)
> -       relax_func = _bfd_riscv_relax_delete;
> -      else if (info->relax_pass == 2 && type == R_RISCV_ALIGN)
> -       relax_func = _bfd_riscv_relax_align;
> +      else if (info->relax_pass == 1 && type == R_RISCV_ALIGN)
> +       {
> +         relax_func = _bfd_riscv_relax_align;
> +         riscv_relax_delete_bytes = _riscv_relax_delete_immediate;
> +       }
>        else
>         continue;
>
> @@ -4921,6 +4999,10 @@ _bfd_riscv_relax_section (bfd *abfd, asection *sec,
>         goto fail;
>      }
>
> +  /* Resolve R_RISCV_DELETE relocations.  */
> +  if (!riscv_relax_resolve_delete_relocs (abfd, sec, info, relocs))
> +    goto fail;
> +
>    ret = true;
>
>   fail:
> --
> 2.37.0 (Apple Git-136)
>

  parent reply	other threads:[~2022-10-25  4:36 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-25  1:33 Nelson Chu
2022-10-25  1:33 ` [committed 2/2] RISC-V: Should reset `again' flag for _bfd_riscv_relax_pc Nelson Chu
2022-10-25  4:36 ` Nelson Chu [this message]
2022-10-25 10:10 ` [committed 1/2] RISC-V: Improve link time complexity Luis Machado
2022-10-26  2:06   ` Nelson Chu
2022-10-26  8:30     ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPpQWtBGs4quaqyD9TpxSt9ptk-B7OrsMdVfPZHX283FnX_Q+A@mail.gmail.com \
    --to=nelson@rivosinc.com \
    --cc=andrew@sifive.com \
    --cc=binutils@sourceware.org \
    --cc=jim.wilson.gcc@gmail.com \
    --cc=palmer@dabbelt.com \
    --cc=patrick@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).