* Hitting g dwfl->lookup_elts limit in report_r_debug, so not all modules show up and backtracing fails
From: Luke Diamand @ 2023-04-25 19:00 UTC
To: elfutils-devel

I've got a few cores where report_r_debug() in link_map.c fails to
find all of the modules - for example I had libc.so missing. This
obviously meant that elfutils could not backtrace my core.

It seems to be related to this code:

  /* There can't be more elements in the link_map list than there are
     segments. DWFL->lookup_elts is probably twice that number, so it
     is certainly above the upper bound. If we iterate too many times,
     there must be a loop in the pointers due to link_map clobberation. */
  size_t iterations = 0;

  while (next != 0 && ++iterations < dwfl->lookup_elts)

I've changed this to just keep going until it reaches
dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
a better fix!

The most recent core I saw with this had lookup_elts=36, and hit 109
iterations of the loop and then backtraced just fine.

Thanks!
Luke
* Re: Hitting g dwfl->lookup_elts limit in report_r_debug, so not all modules show up and backtracing fails
From: Florian Weimer @ 2023-05-02 7:57 UTC
To: Luke Diamand via Elfutils-devel; +Cc: Luke Diamand

* Luke Diamand via Elfutils-devel:

> I've got a few cores where report_r_debug() in link_map.c fails to
> find all of the modules - for example I had libc.so missing. This
> obviously meant that elfutils could not backtrace my core.
>
> It seems to be related to this code:
>
>   /* There can't be more elements in the link_map list than there are
>      segments. DWFL->lookup_elts is probably twice that number, so it
>      is certainly above the upper bound. If we iterate too many times,
>      there must be a loop in the pointers due to link_map clobberation. */
>   size_t iterations = 0;
>
>   while (next != 0 && ++iterations < dwfl->lookup_elts)
>
> I've changed this to just keep going until it reaches
> dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
> a better fix!
>
> The most recent core I saw with this had lookup_elts=36, and hit 109
> iterations of the loop and then backtraced just fine.

It's probably another fallout from -z separate-code, which tends to
create four LOAD segments. The magic number 5 sounds about right, as
gold also has -z text-unlikely-segment, which might result in creating
that number of load segments (but I haven't tried).

Thanks,
Florian
* Re: Hitting g dwfl->lookup_elts limit in report_r_debug, so not all modules show up and backtracing fails
From: Mark Wielaard @ 2023-05-08 16:35 UTC
To: Florian Weimer, Luke Diamand via Elfutils-devel; +Cc: Luke Diamand

Hi Florian, Hi Luke,

On Tue, 2023-05-02 at 09:57 +0200, Florian Weimer via Elfutils-devel
wrote:
> * Luke Diamand via Elfutils-devel:
>
> > I've got a few cores where report_r_debug() in link_map.c fails to
> > find all of the modules - for example I had libc.so missing. This
> > obviously meant that elfutils could not backtrace my core.
> >
> > It seems to be related to this code:
> >
> >   /* There can't be more elements in the link_map list than there are
> >      segments. DWFL->lookup_elts is probably twice that number, so it
> >      is certainly above the upper bound. If we iterate too many times,
> >      there must be a loop in the pointers due to link_map clobberation. */
> >   size_t iterations = 0;
> >
> >   while (next != 0 && ++iterations < dwfl->lookup_elts)
> >
> > I've changed this to just keep going until it reaches
> > dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
> > a better fix!
> >
> > The most recent core I saw with this had lookup_elts=36, and hit 109
> > iterations of the loop and then backtraced just fine.
>
> It's probably another fallout from -z separate-code, which tends to
> create four LOAD segments. The magic number 5 sounds about right, as
> gold also has -z text-unlikely-segment, which might result in creating
> that number of load segments (but I haven't tried).

Wow, that had never occurred to me. Thanks.

Luke does the binary/libraries from which your core file was generated
contain multiple PT_LOAD segments?

We could add something like:

diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
index 06d85eb6..76f23354 100644
--- a/libdwfl/link_map.c
+++ b/libdwfl/link_map.c
@@ -331,11 +331,17 @@ report_r_debug (uint_fast8_t elfclass, uint_fast8_t elfdata,
   int result = 0;
 
   /* There can't be more elements in the link_map list than there are
-     segments. DWFL->lookup_elts is probably twice that number, so it
-     is certainly above the upper bound. If we iterate too many times,
-     there must be a loop in the pointers due to link_map clobberation. */
+     segments. A segment is created for each PT_LOAD and there can be
+     up to 5 per module (-z separate-code, tends to create four LOAD
+     segments, gold has -z text-unlikely-segment, which might result
+     in creating that number of load segments) DWFL->lookup_elts is
+     probably twice the number of modules, so that multiplied by max
+     PT_LOADs is certainly above the upper bound. If we iterate too
+     many times, there must be a loop in the pointers due to link_map
+     clobberation. */
+#define MAX_PT_LOAD 5
   size_t iterations = 0;
-  while (next != 0 && ++iterations < dwfl->lookup_elts)
+  while (next != 0 && ++iterations < dwfl->lookup_elts * MAX_PT_LOAD)
     {
       if (read_addrs (&memory_closure, elfclass, elfdata,
                       &buffer, &buffer_available, next, &read_vaddr,

Does that sound reasonable?

Thanks,

Mark
* Re: [EXTERNAL] Re: Hitting g dwfl->lookup_elts limit in report_r_debug, so not all modules show up and backtracing fails
From: Luke Diamand @ 2023-05-12 16:55 UTC
To: Mark Wielaard, Florian Weimer, Luke Diamand via Elfutils-devel

On 08/05/2023 17:35, Mark Wielaard wrote:
> Hi Florian, Hi Luke,
>
> On Tue, 2023-05-02 at 09:57 +0200, Florian Weimer via Elfutils-devel
> wrote:
>> * Luke Diamand via Elfutils-devel:
>>
>>> I've got a few cores where report_r_debug() in link_map.c fails to
>>> find all of the modules - for example I had libc.so missing. This
>>> obviously meant that elfutils could not backtrace my core.
>>>
>>> It seems to be related to this code:
>>>
>>>   /* There can't be more elements in the link_map list than there are
>>>      segments. DWFL->lookup_elts is probably twice that number, so it
>>>      is certainly above the upper bound. If we iterate too many times,
>>>      there must be a loop in the pointers due to link_map clobberation. */
>>>   size_t iterations = 0;
>>>
>>>   while (next != 0 && ++iterations < dwfl->lookup_elts)
>>>
>>> I've changed this to just keep going until it reaches
>>> dwfl->lookup_elts*5, which seems to "fix" it, but I feel there must be
>>> a better fix!
>>>
>>> The most recent core I saw with this had lookup_elts=36, and hit 109
>>> iterations of the loop and then backtraced just fine.
>>
>> It's probably another fallout from -z separate-code, which tends to
>> create four LOAD segments. The magic number 5 sounds about right, as
>> gold also has -z text-unlikely-segment, which might result in creating
>> that number of load segments (but I haven't tried).
>
> Wow, that had never occurred to me. Thanks.
>
> Luke does the binary/libraries from which your core file was generated
> contain multiple PT_LOAD segments?
>
> We could add something like:
>
> diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
> index 06d85eb6..76f23354 100644
> --- a/libdwfl/link_map.c
> +++ b/libdwfl/link_map.c
> @@ -331,11 +331,17 @@ report_r_debug (uint_fast8_t elfclass, uint_fast8_t elfdata,
>    int result = 0;
>
>    /* There can't be more elements in the link_map list than there are
> -     segments. DWFL->lookup_elts is probably twice that number, so it
> -     is certainly above the upper bound. If we iterate too many times,
> -     there must be a loop in the pointers due to link_map clobberation. */
> +     segments. A segment is created for each PT_LOAD and there can be
> +     up to 5 per module (-z separate-code, tends to create four LOAD
> +     segments, gold has -z text-unlikely-segment, which might result
> +     in creating that number of load segments) DWFL->lookup_elts is
> +     probably twice the number of modules, so that multiplied by max
> +     PT_LOADs is certainly above the upper bound. If we iterate too
> +     many times, there must be a loop in the pointers due to link_map
> +     clobberation. */
> +#define MAX_PT_LOAD 5
>    size_t iterations = 0;
> -  while (next != 0 && ++iterations < dwfl->lookup_elts)
> +  while (next != 0 && ++iterations < dwfl->lookup_elts * MAX_PT_LOAD)
>      {
>        if (read_addrs (&memory_closure, elfclass, elfdata,
>                        &buffer, &buffer_available, next, &read_vaddr,
>
> Does that sound reasonable?

Sorry - I did not see this until just after sending in my patch!

Let me try it with this change and I will re-roll it.

Luke