* [PATCH] Optimize do_read_32_relocated using binary search
@ 2024-04-17 6:34 Nikita Popov
2024-04-17 14:49 ` Mark Wielaard
0 siblings, 1 reply; 2+ messages in thread
From: Nikita Popov @ 2024-04-17 6:34 UTC (permalink / raw)
To: debugedit
[-- Attachment #1: Type: text/plain, Size: 1586 bytes --]
debugedit is currently very slow when processing DWARF 5 debuginfo
produced by clang. For some kernel modules, debugedit processing
takes hours.
The root cause of the issue is the loop for finding the correct
REL entry in do_read_32_relocated. This is currently a simple
linear scan. For large objects, it may loop for hundreds of
thousands of iterations.
As the relocations are sorted, we can use a binary search instead,
which is what this patch implements. The time to run debugedit on
a large kernel module (nouveau.ko) drops down to 3 seconds with
this change.
Signed-off-by: Nikita Popov <npopov@redhat.com>
---
tools/debugedit.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/tools/debugedit.c b/tools/debugedit.c
index f16eecd..d678673 100644
--- a/tools/debugedit.c
+++ b/tools/debugedit.c
@@ -335,12 +335,27 @@ strptr (DSO *dso, size_t sec, size_t offset)
REL *relptr, *relend;
int reltype;
+static inline REL *find_rel_for_ptr(unsigned char *xptr)
+{
+ size_t l = 0, r = relend - relptr;
+ while (l < r)
+ {
+ size_t m = (l + r) / 2;
+ if (relptr[m].ptr < xptr)
+ l = m + 1;
+ else if (relptr[m].ptr > xptr)
+ r = m;
+ else
+ return &relptr[m];
+ }
+ return relend;
+}
+
#define do_read_32_relocated(xptr) ({ \
uint32_t dret = do_read_32 (xptr); \
if (relptr) \
{ \
- while (relptr < relend && relptr->ptr < (xptr)) \
- ++relptr; \
+ relptr = find_rel_for_ptr((xptr)); \
if (relptr < relend && relptr->ptr == (xptr)) \
{ \
if (reltype == SHT_REL) \
--
2.44.0
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [PATCH] Optimize do_read_32_relocated using binary search
2024-04-17 6:34 [PATCH] Optimize do_read_32_relocated using binary search Nikita Popov
@ 2024-04-17 14:49 ` Mark Wielaard
0 siblings, 0 replies; 2+ messages in thread
From: Mark Wielaard @ 2024-04-17 14:49 UTC (permalink / raw)
To: Nikita Popov, debugedit
Hi Nikita,
On Wed, 2024-04-17 at 15:34 +0900, Nikita Popov wrote:
> debugedit is currently very slow when processing DWARF 5 debuginfo
> produced by clang. For some kernel modules, debugedit processing
> takes hours.
>
> The root cause of the issue is the loop for finding the correct
> REL entry in do_read_32_relocated. This is currently a simple
> linear scan. For large objects, it may loop for hundreds of
> thousands of iterations.
>
> As the relocations are sorted, we can use a binary search instead,
> which is what this patch implements. The time to run debugedit on
> a large kernel module (nouveau.ko) drops down to 3 seconds with
> this change.
Very nice. I missed that in the original commit that added support for
strx. Commit 3e7aeeab4f744ad15108775685db68d3a35b0735 does explain a
bit, but I didn't realize that this change made things do a linear
search over and over again...
The tricky part is the ET_REL (object files or kernel modules)
support. Relocation reading is "global" per section and we expect to
read a relocation only once. But we need to read the
DW_AT_str_offsets_base before reading any strx form attributes. So we
read that first, then reset the relptr. And when we read from the
.debug_str_offsets section we need to save and restore the
.debug_info relptr.
> Signed-off-by: Nikita Popov <npopov@redhat.com>
> ---
> tools/debugedit.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/tools/debugedit.c b/tools/debugedit.c
> index f16eecd..d678673 100644
> --- a/tools/debugedit.c
> +++ b/tools/debugedit.c
> @@ -335,12 +335,27 @@ strptr (DSO *dso, size_t sec, size_t offset)
> REL *relptr, *relend;
> int reltype;
>
> +static inline REL *find_rel_for_ptr(unsigned char *xptr)
> +{
> + size_t l = 0, r = relend - relptr;
> + while (l < r)
> + {
> + size_t m = (l + r) / 2;
> + if (relptr[m].ptr < xptr)
> + l = m + 1;
> + else if (relptr[m].ptr > xptr)
> + r = m;
> + else
> + return &relptr[m];
> + }
> + return relend;
> +}
> +
> #define do_read_32_relocated(xptr) ({ \
> uint32_t dret = do_read_32 (xptr); \
> if (relptr) \
> { \
> - while (relptr < relend && relptr->ptr < (xptr)) \
> - ++relptr; \
> + relptr = find_rel_for_ptr((xptr)); \
> if (relptr < relend && relptr->ptr == (xptr)) \
> { \
> if (reltype == SHT_REL) \
I think this might make the code slightly slower for the case where
there are no .debug_str_offsets since all other sections are parsed
linearly. But that might not really be such a problem. Since this is
only for the .ko cases.
But I now see a few errors in the testsuite (before and after your
patch, so it is a pre-existing failure) when configuring with CC=clang
CXX=clang++ for clang18, but also for clang17 with CC="clang -gdwarf-5"
CXX="clang++ -gdwarf-5".
1: debugedit help ok
2: debugedit usage ok
3: debugedit executable ok
4: debugedit .debug_str objects DWARF4 ok
5: debugedit .debug_str/line_str objects DWARF5 ok
6: debugedit .debug_str partial DWARF4 ok
7: debugedit .debug_str/line_str partial DWARF5 ok
8: debugedit .debug_str exe DWARF4 ok
9: debugedit .debug_str/line_str exe DWARF5 ok
10: debugedit .debug_info objects FAILED
(debugedit.at:307)
11: debugedit .debug_info partial FAILED
(debugedit.at:330)
12: debugedit .debug_info exe ok
13: debugedit .debug_types objects skipped
(debugedit.at:367)
14: debugedit .debug_types partial skipped
(debugedit.at:405)
15: debugedit .debug_types exe skipped
(debugedit.at:435)
16: debugedit .debug_line objects DWARF4 ok
17: debugedit .debug_line objects DWARF5 ok
18: debugedit .debug_line partial DWARF4 ok
19: debugedit .debug_line partial DWARF5 ok
20: debugedit .debug_line exe DWARF4 ok
21: debugedit .debug_line exe DWARF5 ok
22: debugedit .debug_macro objects FAILED
(debugedit.at:619)
23: debugedit .debug_macro partial FAILED
(debugedit.at:644)
24: debugedit .debug_macro exe FAILED
(debugedit.at:667)
25: debugedit --list-file DWARF4 ok
26: debugedit --list-file DWARF5 ok
--- expout 2024-04-17 16:45:58.790260019 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/10/stdout
2024-04-
17 16:45:58.814260130 +0200
@@ -1,3 +1,2 @@
/foo/bar/baz
-/foo/bar/baz/baz.c
/foo/bar/baz/subdir_bar
10. debugedit.at:294: 10. debugedit .debug_info objects
(debugedit.at:294): FAILED (debugedit.at:307)
# -*- compilation -*-
11. debugedit.at:319: testing debugedit .debug_info partial ...
./debugedit.at:329: debugedit -b $(pwd) -d /foo/bar/baz
./foobarbaz.part.o
./debugedit.at:330:
$READELF --debug-dump=info ./foobarbaz.part.o \
| grep -E 'DW_AT_(name|comp_dir)' \
| rev | cut -d: -f1 | rev | cut -c2- | grep ^/foo/bar/baz |
sort -u
--- /dev/null 2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/11/stderr
2024-04-17 16:45:58.997260972 +0200
@@ -0,0 +1 @@
+readelf: ./foobarbaz.part.o: Warning: indirect offset too big: 0x1a9
11. debugedit.at:319: 11. debugedit .debug_info partial
(debugedit.at:319): FAILED (debugedit.at:330)
# -*- compilation -*-
22. debugedit.at:608: testing debugedit .debug_macro objects ...
./debugedit.at:619: debugedit -b $(pwd) -d /foo/bar/baz ./foo.o
--- /dev/null 2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/22/stderr
2024-04-17 16:46:00.516267966 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:619: exit code was 1, expected 0
22. debugedit.at:608: 22. debugedit .debug_macro objects
(debugedit.at:608): FAILED (debugedit.at:619)
# -*- compilation -*-
23. debugedit.at:633: testing debugedit .debug_macro partial ...
./debugedit.at:644: debugedit -b $(pwd) -d /foo/bar/baz
./foobarbaz.part.o
--- /dev/null 2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/23/stderr
2024-04-17 16:46:00.631268496 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:644: exit code was 1, expected 0
23. debugedit.at:633: 23. debugedit .debug_macro partial
(debugedit.at:633): FAILED (debugedit.at:644)
# -*- compilation -*-
24. debugedit.at:656: testing debugedit .debug_macro exe ...
./debugedit.at:667: debugedit -b $(pwd) -d /foo/bar/baz ./foobarbaz.exe
--- /dev/null 2024-04-15 17:25:16.496344492 +0200
+++ /home/mark/src/debugedit/tests/testsuite.dir/at-groups/24/stderr
2024-04-17 16:46:00.746269025 +0200
@@ -0,0 +1 @@
+debugedit: Unhandled DW_MACRO op 0xb
./debugedit.at:667: exit code was 1, expected 0
24. debugedit.at:656: 24. debugedit .debug_macro exe
(debugedit.at:656): FAILED (debugedit.at:667)
I'll look into that and try to add CI to explicitly test against clang.
Thanks,
Mark
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-04-17 14:49 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17 6:34 [PATCH] Optimize do_read_32_relocated using binary search Nikita Popov
2024-04-17 14:49 ` Mark Wielaard
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).