From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 28148 invoked by alias); 29 Dec 2019 18:58:51 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 28130 invoked by uid 89); 29 Dec 2019 18:58:51 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-3.9 required=5.0 tests=AWL,BAYES_00,LOTS_OF_MONEY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,SPF_HELO_PASS,SPF_PASS autolearn=no version=3.3.1 spammy=observe, enhanced, UD:m, aoc X-HELO: simark.ca Received: from simark.ca (HELO simark.ca) (158.69.221.121) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 29 Dec 2019 18:58:49 +0000 Received: from [10.0.0.11] (unknown [192.222.164.54]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 6D43E1E5F7; Sun, 29 Dec 2019 13:58:47 -0500 (EST) Subject: Re: [PATCH] Fix handling of null stap semaphores To: George Barrett , gdb-patches@sourceware.org References: <5we87igzwt5_kr.5b-38floyexzwmozuj6vb-.hmx8r4u3r41_sy@mail.bob131.so> From: Simon Marchi Message-ID: Date: Sun, 29 Dec 2019 18:58:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0 MIME-Version: 1.0 In-Reply-To: <5we87igzwt5_kr.5b-38floyexzwmozuj6vb-.hmx8r4u3r41_sy@mail.bob131.so> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-SW-Source: 2019-12/txt/msg01061.txt.bz2 On 2019-12-03 2:59 p.m., George Barrett wrote: > According to the SystemTap documentation on user-space probes[0], stap > probe points without semaphores are denoted by setting the semaphore > address in the probe's note to zero. At present the code does do a > comparison of the semaphore address against zero, but only after it's > been relocated; as such it will (almost?) always fail, commonly > resulting in GDB trying to overwrite the ELF magic located at the > image's base address. > > This commit tests the address as specified in the SDT note rather than > the relocated value in order to correctly detect absent probe > semaphores. > > [0]: https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation > > gdb/Changelog: > > * stap-probe.c: Fix handling of null stap semaphores Hi George, Thanks, the patch looks good to me. Though the ChangeLog should mention the modified functions, I propose to use this: * stap-probe.c (stap_modify_semaphore): Don't check for null semaphores. (stap_probe::set_semaphore, stap_probe::clear_semaphore): Check for null semaphores. I'd really like if we could have a test for this, so that eventual refactors don't re-introduce this bug. Perhaps the gdb.base/stap-probe.exp test could be enhanced to test that the ELF magic number hasn't been changed? One difficulty is finding out where it is, I don't know if there's a GDB command that will compute that directly. One way is to take the address of a global variable before and after starting the process, and see how it has been relocated, that would be the base of the image: (gdb) p &some_global $1 = (int *) 0x402c (gdb) start Temporary breakpoint 1 at 0x111d: file test.c, line 9. Starting program: /home/simark/src/aoc/08/p2/a.out Temporary breakpoint 1, main () at test.c:9 9 for (i = 0; i < 1000; i++) { (gdb) p &some_global $2 = (int *) 0x55555555802c (gdb) p/x 0x55555555802c - 0x402c $3 = 0x555555554000 (gdb) p (*(char*) 0x555555554000)@4 $4 = "\177ELF" Also, the semaphore is set when the breakpoint is inserted and cleared when the breakpoint is removed. By default, GDB removes the breakpoints while the inferior is stopped, so if we inspect the magic number while the inferior is stopped, it will appear OK. However, we can use "set breakpoints always-inserted on" to make GDB leave the breakpoint inserted when the inferior is stopped. When doing this, we can observe the overwritten magic number: (gdb) set breakpoint always-inserted 1 (gdb) b -probe provider:name Breakpoint 1 at 0x1137 (gdb) p &some_global $1 = (int *) 0x402c (gdb) start Temporary breakpoint 2 at 0x111d: file test.c, line 9. Starting program: /home/simark/src/aoc/08/p2/a.out Temporary breakpoint 2, main () at test.c:9 9 for (i = 0; i < 1000; i++) { (gdb) p &some_global $2 = (int *) 0x55555555802c (gdb) p/x 0x55555555802c - 0x402c $3 = 0x555555554000 (gdb) p ((unsigned char[4]) *0x555555554000) $4 = "\200ELF" I think all this only applies if the main program is a position-independent executable, so this test should probably be skipped if we detect the relocation is 0. So with all this it should be pretty straightforward to improve the test to check for this. Note that I found what looks like to be a GDB bug while doing this: https://sourceware.org/bugzilla/show_bug.cgi?id=25321 Simon