To: libc-alpha@sourceware.org, "gdb@sourceware.org"
Cc: doko@debian.org, Adhemerval Zanella
From: Luis Machado
Subject: GDB shared library tracking with stap probes x _dl_debug_state
Date: Fri, 7 May 2021 16:42:03 -0300
List-Id: Gdb mailing list

Hi,

I'm cc-ing the GDB ML as well, as this might be an issue for other architectures that, like armhf, store flags in ELF symbols.

Matthias (cc-ed) reported the following ticket on GDB's bugzilla:

https://sourceware.org/bugzilla/show_bug.cgi?id=27826

This is related to how GDB tracks shared library loads/unloads in dynamically-linked executables. GDB is aided by some hooks provided by the dynamic linker.

There are two ways GDB will track these shared library events:

* _dl_debug_state mechanism

This is a dummy function that gets called by the dynamic linker's dl_main (...) function so debuggers can place a breakpoint on it and track shared library events. _dl_debug_state is a real ELF symbol that lives in .dynsym. This is a fallback mechanism in GDB these days.

* stap probes

This is a more recent approach where some probe points are provided by the ELF file and GDB places breakpoints on a list of known probes instead. There are no real ELF symbols here, just probe names and addresses where debuggers should insert breakpoints. This is the preferred way to track shared library events in GDB nowadays.

Going back to bz27826, up until Ubuntu 18.04 (glibc 2.27) on armhf, GDB used the _dl_debug_state mechanism to track shared library events. This was due to a bug in stap that made GDB fail a check, thus falling back to using the _dl_debug_state mechanism.

With Ubuntu 20.04 (glibc 2.31), this check no longer fails and GDB decides to use the new stap mechanism instead.

That's all fine, but there is one small detail that doesn't work for armhf, and that is discovering whether we're dealing with a PC that is in arm mode or thumb mode. armhf's GDB uses a few strategies to figure out the mode: mapping symbols, LSB of the PC and ELF symbol flags.

Given distros usually strip binaries (ld.so is also stripped), only a few symbols are left in the executable file itself, and _dl_debug_state is one of them. GDB can still peek at the _dl_debug_state ELF symbol and retrieve the flag that indicates whether we have an arm or thumb mode function. That way GDB can place the proper arm/thumb breakpoint at the address pointed to by r_brk.

With the stap probes approach, this is not possible. As was said before, the probe points are not real ELF symbols. They're just metadata with a name and an address. Of course we could look up which symbol contains a particular probe address, but those symbols are not available in stripped binaries (_dl_main, for example).

So GDB is left with no useful information to insert the right kind of breakpoint for the specified address. It defaults to arm mode, and, since dl_main is thumb mode, things just break.

I believe this may also be a problem for MIPS, since it has to determine the ISA bit for some operations.

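To make the failure concrete, here is a rough sketch of the mode decision described above. It is a simplified model and not GDB's actual code: mapping symbols are left out, and the class name, helper names, addresses and sizes are all made up for illustration. The only ABI fact it relies on is that, on ARM, an ELF function symbol for Thumb code has bit 0 of its value set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ElfSymbol:
    """Hypothetical, simplified view of an ELF function symbol."""
    name: str
    st_value: int  # raw symbol value; bit 0 set marks Thumb code on ARM
    size: int

    @property
    def is_thumb(self) -> bool:
        return bool(self.st_value & 1)

    @property
    def address(self) -> int:
        return self.st_value & ~1


def find_containing_symbol(symbols: List[ElfSymbol], addr: int) -> Optional[ElfSymbol]:
    """Return the symbol whose [address, address + size) range covers addr."""
    for sym in symbols:
        if sym.address <= addr < sym.address + sym.size:
            return sym
    return None


def breakpoint_mode(symbols: List[ElfSymbol], addr: int) -> str:
    """Pick 'arm' or 'thumb' for a breakpoint at addr (simplified model)."""
    if addr & 1:  # LSB of the address already says Thumb
        return "thumb"
    sym = find_containing_symbol(symbols, addr)
    if sym is not None:  # ELF symbol flag is available
        return "thumb" if sym.is_thumb else "arm"
    return "arm"  # nothing to go on: default to arm mode


# Made-up layout: a stripped ld.so keeps only _dl_debug_state, while an
# unstripped one also has the function that contains the probe address.
STRIPPED = [ElfSymbol("_dl_debug_state", 0x1000 | 1, 2)]
UNSTRIPPED = STRIPPED + [ElfSymbol("dl_main", 0x2000 | 1, 0x400)]
PROBE_ADDR = 0x2100  # hypothetical probe address inside dl_main

print(breakpoint_mode(STRIPPED, 0x1000))        # thumb
print(breakpoint_mode(STRIPPED, PROBE_ADDR))    # arm (wrong for thumb code)
print(breakpoint_mode(UNSTRIPPED, PROBE_ADDR))  # thumb
```

In this model the probe address only gets the right mode when the containing function's symbol survives stripping, which is the situation the options below try to address.
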
Now, three possible solutions exist:

1 - Force GDB to fall back to using _dl_debug_state for armhf (and possibly other architectures). This is considered bad because the affected architectures can't take advantage of a more advanced mechanism for tracking shared library events.

2 - Not stripping ld.so/glibc. I can't determine the impact of this choice, but distros strip binaries for a reason. Having to carry all symbols for a particular library may not be desirable. It is also not desirable to force users to install a dbg package for ld.so/glibc just to be able to use a debugger.

3 - Strip symbols from ld.so/glibc, but keep a few select critical symbols that debuggers will want to use. I've been told this may be a bit undesirable from glibc's perspective. I noticed the probe points fall into the following functions: _dl_main, _dl_map_object_from_fd, lose, dl_open_worker and _dl_close_worker. If we keep those symbols, GDB will be able to figure out what mode we have and the proper breakpoint to use for each of those symbols.

Before making a decision, it sounds best to discuss this and come up with the best solution for both projects and the distros.

Thoughts?