public inbox for debugedit@sourceware.org
 help / color / mirror / Atom feed
From: Mark Wielaard <mark@klomp.org>
To: Nikita Popov <npopov@redhat.com>, debugedit@sourceware.org
Subject: Re: [PATCH] Optimize do_read_32_relocated using binary search
Date: Wed, 17 Apr 2024 16:49:55 +0200	[thread overview]
Message-ID: <c38ed85a6be447b77634ebd4c1b1271f6bfc4abd.camel@klomp.org> (raw)
In-Reply-To: <CAFUfLSV-RtBkOcq0C1xCenAf_-rEL6g72y-hsV1PDz5QS7OR8w@mail.gmail.com>

Hi Nikita,

On Wed, 2024-04-17 at 15:34 +0900, Nikita Popov wrote:
> debugedit is currently very slow when processing DWARF 5 debuginfo
> produced by clang. For some kernel modules, debugedit processing
> takes hours.
> 
> The root cause of the issue is the loop for finding the correct
> REL entry in do_read_32_relocated. This is currently a simple
> linear scan. For large objects, it may loop for hundreds of
> thousands of iterations.
> 
> As the relocations are sorted, we can use a binary search instead,
> which is what this patch implements. The time to run debugedit on
> a large kernel module (nouveau.ko) drops down to 3 seconds with
> this change.

Very nice. I missed that in the original commit that added support for
strx. Commit 3e7aeeab4f744ad15108775685db68d3a35b0735 does explain a
bit, but I didn't realize that this change made things do a linear
search over and over again...

  The tricky part is the ET_REL (object files or kernel modules)
  support. Relocation reading is "global" per section and we expect to
  read a relocation only once. But we need to read the
  DW_AT_str_offsets_base before reading any strx form attributes. So we
  read that first, then reset the relptr. And when we read from the
  .debug_str_offsets section we need to save and restore the
  .debug_info relptr.


> Signed-off-by: Nikita Popov <npopov@redhat.com>
> ---
>  tools/debugedit.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/debugedit.c b/tools/debugedit.c
> index f16eecd..d678673 100644
> --- a/tools/debugedit.c
> +++ b/tools/debugedit.c
> @@ -335,12 +335,27 @@ strptr (DSO *dso, size_t sec, size_t offset)
>  REL *relptr, *relend;
>  int reltype;
>  
> +static inline REL *find_rel_for_ptr(unsigned char *xptr)
> +{
> +  size_t l = 0, r = relend - relptr;
> +  while (l < r)
> +  {
> +    size_t m = (l + r) / 2;
> +    if (relptr[m].ptr < xptr)
> +      l = m + 1;
> +    else if (relptr[m].ptr > xptr)
> +      r = m;
> +    else
> +      return &relptr[m];
> +  }
> +  return relend;
> +}
> +
>  #define do_read_32_relocated(xptr) ({ \
>    uint32_t dret = do_read_32 (xptr); \
>    if (relptr) \
>      { \
> -      while (relptr < relend && relptr->ptr < (xptr)) \
> - ++relptr; \
> +      relptr = find_rel_for_ptr((xptr)); \
>        if (relptr < relend && relptr->ptr == (xptr)) \
>   { \
>    if (reltype == SHT_REL) \

I think this might make the code slightly slower for the case where
there are no .debug_str_offsets since all other sections are parsed
linearly. But that might not really be such a problem. Since this is
only for the .ko cases.

But I now see a few errors in the testsuite (before and after your
patch, so it is a pre-existing failure) when configuring with CC=clang
CXX=clang++ for clang18, but also for clang17 with CC="clang -gdwarf-5"
CXX="clang++ -gdwarf-5".

  1: debugedit help                                  ok
  2: debugedit usage                                 ok
  3: debugedit executable                            ok
  4: debugedit .debug_str objects DWARF4             ok
  5: debugedit .debug_str/line_str objects DWARF5    ok
  6: debugedit .debug_str partial DWARF4             ok
  7: debugedit .debug_str/line_str partial DWARF5    ok
  8: debugedit .debug_str exe DWARF4                 ok
  9: debugedit .debug_str/line_str exe DWARF5        ok
 10: debugedit .debug_info objects                   FAILED
(debugedit.at:307)
 11: debugedit .debug_info partial                   FAILED
(debugedit.at:330)
 12: debugedit .debug_info exe                       ok
 13: debugedit .debug_types objects                  skipped
(debugedit.at:367)
 14: debugedit .debug_types partial                  skipped
(debugedit.at:405)
 15: debugedit .debug_types exe                      skipped
(debugedit.at:435)
 16: debugedit .debug_line objects DWARF4            ok
 17: debugedit .debug_line objects DWARF5            ok
 18: debugedit .debug_line partial DWARF4            ok
 19: debugedit .debug_line partial DWARF5            ok
 20: debugedit .debug_line exe DWARF4                ok
 21: debugedit .debug_line exe DWARF5                ok
 22: debugedit .debug_macro objects                  FAILED
(debugedit.at:619)
 23: debugedit .debug_macro partial                  FAILED
(debugedit.at:644)
 24: debugedit .debug_macro exe                      FAILED
(debugedit.at:667)
 25: debugedit --list-file DWARF4                    ok
 26: debugedit --list-file DWARF5                    ok


--- expout      2024-04-17 16:45:58.790260019 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/10/stdout  
2024-04-
17 16:45:58.814260130 +0200
@@ -1,3 +1,2 @@
 /foo/bar/baz
-/foo/bar/baz/baz.c
 /foo/bar/baz/subdir_bar
10. debugedit.at:294: 10. debugedit .debug_info objects
(debugedit.at:294): FAILED (debugedit.at:307)

#                             -*- compilation -*-
11. debugedit.at:319: testing debugedit .debug_info partial ...
./debugedit.at:329: debugedit -b $(pwd) -d /foo/bar/baz
./foobarbaz.part.o
./debugedit.at:330:
$READELF --debug-dump=info ./foobarbaz.part.o \
        | grep -E 'DW_AT_(name|comp_dir)' \
        | rev | cut -d: -f1 | rev | cut -c2- | grep ^/foo/bar/baz |
sort -u

--- /dev/null   2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/11/stderr  
2024-04-17 16:45:58.997260972 +0200
@@ -0,0 +1 @@
+readelf: ./foobarbaz.part.o: Warning: indirect offset too big: 0x1a9
11. debugedit.at:319: 11. debugedit .debug_info partial
(debugedit.at:319): FAILED (debugedit.at:330)

#                             -*- compilation -*-
22. debugedit.at:608: testing debugedit .debug_macro objects ...
./debugedit.at:619: debugedit -b $(pwd) -d /foo/bar/baz ./foo.o
--- /dev/null   2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/22/stderr  
2024-04-17 16:46:00.516267966 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:619: exit code was 1, expected 0
22. debugedit.at:608: 22. debugedit .debug_macro objects
(debugedit.at:608): FAILED (debugedit.at:619)

#                             -*- compilation -*-
23. debugedit.at:633: testing debugedit .debug_macro partial ...
./debugedit.at:644: debugedit -b $(pwd) -d /foo/bar/baz
./foobarbaz.part.o
--- /dev/null   2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/23/stderr  
2024-04-17 16:46:00.631268496 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:644: exit code was 1, expected 0
23. debugedit.at:633: 23. debugedit .debug_macro partial
(debugedit.at:633): FAILED (debugedit.at:644)

#                             -*- compilation -*-
24. debugedit.at:656: testing debugedit .debug_macro exe ...
./debugedit.at:667: debugedit -b $(pwd) -d /foo/bar/baz ./foobarbaz.exe
--- /dev/null   2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/24/stderr  
2024-04-17 16:46:00.746269025 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:667: exit code was 1, expected 0
24. debugedit.at:656: 24. debugedit .debug_macro exe
(debugedit.at:656): FAILED (debugedit.at:667)

I'll look into that and try to add CI to explicitly test against clang.

Thanks,

Mark

  reply	other threads:[~2024-04-17 14:49 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-17  6:34 Nikita Popov
2024-04-17 14:49 ` Mark Wielaard [this message]
2024-05-15 13:38   ` Mark Wielaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c38ed85a6be447b77634ebd4c1b1271f6bfc4abd.camel@klomp.org \
    --to=mark@klomp.org \
    --cc=debugedit@sourceware.org \
    --cc=npopov@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).