public inbox for gdb-prs@sourceware.org
* [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf
@ 2021-05-05 10:38 doko at debian dot org
  2021-05-05 11:54 ` [Bug gdb/27826] " luis.machado at linaro dot org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: doko at debian dot org @ 2021-05-05 10:38 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

            Bug ID: 27826
           Summary: 10000 timeouts running the testsuite on
                    arm-linux-gnueabihf
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: gdb
          Assignee: unassigned at sourceware dot org
          Reporter: doko at debian dot org
  Target Milestone: ---

Running the testsuite on arm-linux-gnueabihf for GDB 10.2 and trunk 20210502,
I see around 10000 test failures due to timeouts. These are not seen on other
architectures using the same build environment (amd64, arm64, i386, ppc64el,
s390x).

See https://launchpad.net/ubuntu/+source/gdb/10.2-0ubuntu3

$ zcat buildlog_ubuntu-impish-armhf.gdb_10.2-0ubuntu3_BUILDING.txt.gz \
    | fgrep '(timeout)' | wc -l
10451

The build environment is binutils 2.36.1, gcc 10.3, glibc 2.33 (updated to the
2.33 branch).

The build log shows all the test cases that are timing out. At least one more
test still seems to be running at the end, as the whole build times out.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
@ 2021-05-05 11:54 ` luis.machado at linaro dot org
  2021-05-07 12:40 ` luis.machado at linaro dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: luis.machado at linaro dot org @ 2021-05-05 11:54 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

Luis Machado <luis.machado at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luis.machado at linaro dot org

--- Comment #1 from Luis Machado <luis.machado at linaro dot org> ---
Could you please attach the log file that contains all the output/input from
GDB and the tests?


* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
  2021-05-05 11:54 ` [Bug gdb/27826] " luis.machado at linaro dot org
@ 2021-05-07 12:40 ` luis.machado at linaro dot org
  2021-05-07 16:14 ` doko at debian dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: luis.machado at linaro dot org @ 2021-05-07 12:40 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

Luis Machado <luis.machado at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
           Assignee|unassigned at sourceware dot org   |luis.machado at linaro dot org

--- Comment #2 from Luis Machado <luis.machado at linaro dot org> ---
After investigating this, the root cause of the timeouts is GDB using the wrong
type of breakpoint (it must choose between a 4-byte ARM breakpoint and a 2-byte
Thumb breakpoint), which causes unexpected results.

The reason why this is happening is a bit more complex, though. GDB has two
mechanisms for tracking the loading and unloading of shared libraries in
dynamically-linked binaries: via _dl_debug_state and r_brk, and via stap probes.

Up until Ubuntu 18.04 (glibc 2.27), GDB could not use the stap probes mechanism
because it ran into a bug when parsing stap expressions, thus failing the check
and falling back to the old _dl_debug_state and r_brk mechanism.

The _dl_debug_state/r_brk mechanism works because we have an entry for
_dl_debug_state in the .dynsym section of ld.so.  Even though ld.so is
completely stripped of mapping symbols (another way to tell ARM and Thumb modes
apart), which are only available via the debug symbols file, GDB can still tell
whether _dl_debug_state is ARM or Thumb code because the ELF symbol carries a
flag indicating so. That's why this fallback mechanism works.
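
As a rough illustration of the flag mentioned above: under the ARM ELF ABI, a
function symbol (type STT_FUNC) whose st_value has bit 0 set refers to Thumb
code. The Python sketch below decodes one little-endian Elf32_Sym entry and
reports the mode. The helper names and the sample entry are made up for
illustration; this is not GDB's actual implementation.

```python
import struct

STT_FUNC = 2  # ELF symbol type for functions


def parse_elf32_sym(entry):
    """Unpack one 16-byte little-endian Elf32_Sym entry into a small dict."""
    st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack(
        "<IIIBBH", entry
    )
    # The low 4 bits of st_info hold the symbol type.
    return {"value": st_value, "size": st_size, "type": st_info & 0xF}


def symbol_mode(sym):
    """Return 'thumb' or 'arm' for a function symbol, None for anything else."""
    if sym["type"] != STT_FUNC:
        return None
    # Bit 0 of the address marks Thumb code; the code itself starts at value & ~1.
    return "thumb" if sym["value"] & 1 else "arm"


# Hypothetical entry: a global function whose st_value is 0x1235 (bit 0 set).
sample = struct.pack("<IIIBBH", 0, 0x1235, 8, (1 << 4) | STT_FUNC, 0, 1)
print(symbol_mode(parse_elf32_sym(sample)))
```

Because this information lives in the symbol entry itself, it is lost once the
symbol is stripped from the file.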

On Ubuntu 20.04, running glibc 2.31, GDB no longer runs into problems with stap
probes. Thus GDB decides to use this mechanism instead of the old
_dl_debug_state/r_brk one.

Both mechanisms function by having GDB insert breakpoints at specific locations
so shared library events can be tracked. But in the stap probes case there are
no real symbols.

What we have is metadata that contains the name of the probe and its address.
This address falls within a particular function. For example, init_start and
init_complete are probe points that fall within dl_main. The probe points do
not seem to carry any information about whether we have arm or thumb mode.

As before, the mapping symbols should tell us what the mode is, but ld.so is
stripped and doesn't carry those. GDB could instead look at the ELF symbol of
the function the probe sits in, except that these symbols (not considered
special in any way) have been stripped as well. So the ARM/Thumb information is
completely gone and GDB can no longer make the correct decision.

So GDB defaults to assuming ARM mode when choosing the breakpoint, which is
obviously wrong for Thumb code.
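
The decision order described above can be sketched roughly as follows. This is
a simplified model, not GDB code: the function name, the input shapes, and the
treatment of mapping symbols (only "$a" and "$t" are considered) are
assumptions made for illustration.

```python
ARM_BREAKPOINT_SIZE = 4    # bytes, as described in this comment
THUMB_BREAKPOINT_SIZE = 2  # bytes, as described in this comment


def breakpoint_size(addr, mapping_symbols, function_symbols):
    """Pick a breakpoint size for addr.

    mapping_symbols:  list of (address, name), name being "$a" (ARM) or "$t" (Thumb)
    function_symbols: list of (start, size, is_thumb)
    Both are hypothetical, simplified inputs; either list is empty when the
    corresponding symbols have been stripped.
    """
    # 1. Mapping symbols: the closest one at or below addr decides.
    preceding = [(a, n) for a, n in mapping_symbols
                 if a <= addr and n in ("$a", "$t")]
    if preceding:
        _, name = max(preceding)
        return THUMB_BREAKPOINT_SIZE if name == "$t" else ARM_BREAKPOINT_SIZE

    # 2. ELF symbol of the function that contains addr.
    for start, size, is_thumb in function_symbols:
        if start <= addr < start + size:
            return THUMB_BREAKPOINT_SIZE if is_thumb else ARM_BREAKPOINT_SIZE

    # 3. Nothing known: assume ARM, which is wrong if the code is Thumb.
    return ARM_BREAKPOINT_SIZE
```

With both lists empty, as for a fully stripped ld.so, the function always
returns the ARM size, which mirrors the failure described here.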

There are two possible solutions:

1 - Fall back to using _dl_debug_state/r_brk for armhf in GDB. This is
considered bad by GDB's maintainers, because it means using an outdated
mechanism instead of better interfaces.

2 - Don't strip glibc/ld.so function symbols that have stap probes installed in
them.

Right now, these are the functions that contain probes and in which GDB wants
to place breakpoints in a special way:

dl_main, _dl_map_object_from_fd, lose, dl_open_worker and _dl_close_worker


* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
  2021-05-05 11:54 ` [Bug gdb/27826] " luis.machado at linaro dot org
  2021-05-07 12:40 ` luis.machado at linaro dot org
@ 2021-05-07 16:14 ` doko at debian dot org
  2021-05-07 20:56 ` luis.machado at linaro dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: doko at debian dot org @ 2021-05-07 16:14 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

--- Comment #3 from Matthias Klose <doko at debian dot org> ---
or 3) make sure that the detached debug information for ld.so is available and
can be found.


* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
                   ` (2 preceding siblings ...)
  2021-05-07 16:14 ` doko at debian dot org
@ 2021-05-07 20:56 ` luis.machado at linaro dot org
  2021-05-07 21:44 ` sergiodj at sergiodj dot net
  2022-05-13 23:41 ` luis.machado at arm dot com
  5 siblings, 0 replies; 7+ messages in thread
From: luis.machado at linaro dot org @ 2021-05-07 20:56 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

--- Comment #4 from Luis Machado <luis.machado at linaro dot org> ---
3) is fine, but can you guarantee the detached information will always be
available?

Having the required data in the ld.so file itself would be best.


* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
                   ` (3 preceding siblings ...)
  2021-05-07 20:56 ` luis.machado at linaro dot org
@ 2021-05-07 21:44 ` sergiodj at sergiodj dot net
  2022-05-13 23:41 ` luis.machado at arm dot com
  5 siblings, 0 replies; 7+ messages in thread
From: sergiodj at sergiodj dot net @ 2021-05-07 21:44 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

Sergio Durigan Junior <sergiodj at sergiodj dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sergiodj at sergiodj dot net


* [Bug gdb/27826] 10000 timeouts running the testsuite on arm-linux-gnueabihf
  2021-05-05 10:38 [Bug gdb/27826] New: 10000 timeouts running the testsuite on arm-linux-gnueabihf doko at debian dot org
                   ` (4 preceding siblings ...)
  2021-05-07 21:44 ` sergiodj at sergiodj dot net
@ 2022-05-13 23:41 ` luis.machado at arm dot com
  5 siblings, 0 replies; 7+ messages in thread
From: luis.machado at arm dot com @ 2022-05-13 23:41 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

Luis Machado <luis.machado at arm dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|WAITING                     |RESOLVED

--- Comment #5 from Luis Machado <luis.machado at arm dot com> ---
Fixed based on https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192


