Date: Mon, 24 Jul 2023 23:32:07 -0700
From: Kevin Buettner
To: "Yan, Zhiyong"
Cc: gdb-patches@sourceware.org, "luis.machado@arm.com", "tom@tromey.com"
Subject: Re: [PATCH] gdbserver: Install single-step
 breakpoint for a pending thread whose last_resume_kind is resume_step
Message-ID: <20230724233207.59d9bca1@f37-zws-nv>
References: <20230712032540.3110113-1-zhiyong.yan@windriver.com> <20230721134940.1ee4be68@f37-zws-nv> <20230724203650.43ddd754@f37-zws-nv>
Organization: Red Hat

Hi Zhiyong,

One problem that I encountered on my Pi, which may explain the behavior
that you're seeing, is that recent 32-bit versions of the Raspberry Pi OS
are running a 64-bit/aarch64 kernel, but the userland is 32-bit.

root@rpi4-2:~# /usr/bin/uname -m
aarch64
root@rpi4-2:~# file /usr/bin/ls
/usr/bin/ls: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, BuildID[sha1]=81004d065160807541b79235b23eea0e00a2d44e, for GNU/Linux 3.2.0, stripped

Note that uname -m returns aarch64, but that "ls" and other executables
are "ELF 32-bit ...".

The binutils-gdb configury uses uname -m to figure out for what gdbserver
host/target to build.  (host and target must be the same, otherwise
gdbserver won't build.)  So it may be the case that you built an aarch64
gdbserver instead of an arm gdbserver.
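
The kernel/userland mismatch described above can also be detected from a
script.  This is only a sketch: effective_machine is a made-up helper name
(it is not part of binutils-gdb), and it assumes that "getconf LONG_BIT"
reports the word size of the userland rather than that of the kernel.

```shell
#!/bin/bash
# Hypothetical helper (not part of binutils-gdb): given the kernel's
# machine name and the userland word size, print the machine name that
# a native 32-bit ARM build would want to see.
effective_machine() {
    local kernel_machine=$1
    local userland_bits=$2
    if [ "$kernel_machine" = "aarch64" ] && [ "$userland_bits" = "32" ]; then
        # 64-bit kernel with a 32-bit userland: treat the machine as arm.
        echo arm
    else
        echo "$kernel_machine"
    fi
}

# On a live system: machine name from uname, word size from getconf.
# Assumption: getconf LONG_BIT reflects the userland ABI.
effective_machine "$(uname -m)" "$(getconf LONG_BIT 2>/dev/null || echo unknown)"
```

On a Pi set up like the one above, the last line should print "arm" if the
assumption about getconf holds; on a pure 64-bit system it prints whatever
uname -m reports.
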
I think you can check this as follows:

kev@rpi4-2:/mesquite2/sourceware-git/rpi-master/bld/gdbserver$ ls linux-{arm,aarch}*
linux-aarch32-low.o  linux-aarch32-tdesc.o  linux-arm-low.o  linux-arm-tdesc.o

If you also/instead see linux-aarch64-low.o and linux-aarch64-tdesc.o in
that list, then you probably have an aarch64 gdbserver.

When I tried my build with uname -m returning aarch64, the build errored
out because (I think) I was missing certain aarch64 header files.  But I
knew that I didn't want to build for aarch64, so I abandoned that build.

What I ended up doing was making a wrapper for uname which substituted
'arm' for 'aarch64'.  I put it in /usr/local/bin, and /usr/local/bin is
early in my PATH, so the configury finds it first...

root@rpi4-2:~# uname -m
arm

Here's my /usr/local/bin/uname script:

- - - -
root@rpi4-2:~# cat /usr/local/bin/uname
#!/bin/bash
/usr/bin/uname $* | sed -e s/aarch64/arm/
- - - -

[ Yes, this is a hack, but I couldn't think of a cleaner way to do it.
  I tried a configure line with "--host=arm-linux --target=host-linux",
  but that didn't work because something in the build wanted arm-linux-ar
  to exist and it didn't.  I could have made some symlinks, e.g.
  "ln -s /usr/bin/ar /usr/local/bin/arm-linux-ar", with similar symlinks
  for gcc, g++, ln, ranlib, etc, but that seemed like more work than my
  uname wrapper hack.]

I just checked my gdbserver build.  It's definitely getting into
arm_target::supports_hardware_single_step:

Breakpoint 1, linux_process_target::maybe_hw_step (
    this=0x8645c , thread=0x9df38)
    at /mesquite2/sourceware-git/rpi-master/bld/../../worktree-gdbserver/gdbserver/linux-low.cc:2442
2442      if (supports_hardware_single_step ())
(gdb) s
arm_target::supports_hardware_single_step (this=0x8645c )
    at /mesquite2/sourceware-git/rpi-master/bld/../../worktree-gdbserver/gdbserver/linux-arm-low.cc:1042
1042      return false;

Kevin

On Tue, 25 Jul 2023 04:21:00 +0000
"Yan, Zhiyong" wrote:

> Hi Kevin,
> I test gdb11 on RaspBerry Pi4.
> As you said, I can't produce this assert issue.
> The direct reason is because supports_hardware_single_step () returns on RaspBerry Pi4, not like xilinux-zynq.
> Please see attached pictures, we can see arm_target::supports_hardware_single_step () is never entered.
> This assert only happens when supports_hardware_single_step () returns 'false'. On Raspberry Pi4, when I hardcoded supports_hardware_single_step () returns 'false', then assert happened.
> For more information about " This assert only happens when supports_hardware_single_step () returns 'false'".
> You can check https://sourceware.org/bugzilla/show_bug.cgi?id=30387
>
> So, the new question is why arm_target::supports_hardware_single_step () is never entered on Raspberry Pi4.
>
> Best Regards.
> Zhiyong
>
>
> -----Original Message-----
> From: Kevin Buettner
> Sent: Tuesday, July 25, 2023 11:37 AM
> To: Yan, Zhiyong
> Cc: gdb-patches@sourceware.org; luis.machado@arm.com; tom@tromey.com
> Subject: Re: [PATCH] gdbserver: Install single-step breakpoint for a pending thread whose last_resume_kind is resume_step
>
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> Hi Zhiyong,
>
> I looked at the backtrace that you provided and see that maybe_hw_step() is being called from linux_process_target::resume_stopped_resumed_lwps,
> which is the one location where I wasn't able to convince myself that the assert should hold.
>
> I was running your test case executable (osm) as an unprivileged user, so neither the syslog calls nor the sudo were working. (Sudo could perhaps work, but it wanted to prompt for a password and stdin and stdout were closed.) I've since modified it so that sudo isn't used and I'm using 'fprintf(stderr, ...)' instead of syslog - which is how I discovered that sudo wasn't working. I've tried next'ing quite a lot, but so far I haven't reproduced the bug.
> (Hopefully, the sudo isn't required to reproduce the problem.)
>
> If you manage to reproduce the bug on a Raspberry Pi 4 (and tell me how to do it), that'd be great!
>
> So, what I'm doing, using three separate terminals, in an attempt to reproduce the bug is:
>
> 1) Run osm in terminal 1. (I didn't want to mess with systemd.) Once I start running it, I see a bunch of messages from the dd command.
>
> 2) In terminal 2, I run:
>
>   /path/to/gdbserver --debug --debug-format=all --remote-debug --event-loop-debug --once --attach :1234 $(pgrep osm)
>
> 3) In terminal 3, I run:
>
>   /path/to/gdb osm -x ./gdbx2
>
> (I've changed the target remote command in gdbx2 to refer to localhost.)
>
> I'm also attaching my hacked lupdated.c. If you see anything wrong with what I'm trying, please let me know.
>
> Kevin
>
> On Mon, 24 Jul 2023 13:36:24 +0000
> "Yan, Zhiyong" wrote:
>
> > Hi Kevin,
> > The callstack of assert is attached.
> > Please see attached gdbx2 which add more 'n' commands, on arm platform, keep execute 'n' command, this test case can trigger assert error.
> >
> > Today, I didn't finish setting up test environments on RaspBerry Pi4. Before I produced this issue on Xilinx arm platform.
> >
> > Best Regards.
> > Zhiyong
> >
> > -----Original Message-----
> > From: Kevin Buettner
> > Sent: Saturday, July 22, 2023 4:50 AM
> > To: Yan, Zhiyong
> > Cc: gdb-patches@sourceware.org; luis.machado@arm.com; tom@tromey.com
> > Subject: Re: [PATCH] gdbserver: Install single-step breakpoint for a
> > pending thread whose last_resume_kind is resume_step
> >
> > CAUTION: This email comes from a non Wind River email account!
> > Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> > Hi Zhiyong,
> >
> > I set up a Raspberry Pi running a recent 32-bit Raspberry Pi OS so that I could test your patch. I was able to build and run your test case, but I could not reproduce the bug on the Pi.
> >
> > I tested gdb.threads/*.exp using --target_board=native-gdbserver both
> > with and without your patch. Some of these tests are racy, but my
> > conclusion from just looking at the PASSes and FAILs (after many test
> > runs) is that there are no regressions.
> >
> > But then I remembered to enable core dumps on the Pi and after running
> > gdb.threads/pending-fork-event-detach/pending-fork-event-detach-main-vfork
> > by itself, I saw that it left a core file...
> >
> > $ make check RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.threads/pending-fork-event-detach.exp
> > ...
> > === gdb Summary ===
> >
> > # of unexpected core files 1
> > # of expected passes 240
> >
> > The core file was from the running test case, not gdbserver, nor gdb.
> >
> > Looking at the core file in GDB shows...
> >
> > Program terminated with signal SIGTRAP, Trace/breakpoint trap.
> > #0 0x00010624 in break_here () at /mesquite2/sourceware-git/rpi-gdbserver/bld/../../worktree-gdbserver/gdb/testsuite/gdb.threads/pending-fork-event-detach.c:29
> > 29 x++;
> > [Current thread is 1 (Thread 0xf7e10440 (LWP 4835))]
> > (gdb) x/i $pc
> > => 0x10624 : udf #16
> > (gdb) x/x $pc
> > 0x10624 : 0xe7f001f0
> >
> > ...and in gdbserver/linux-aarch32-low.cc:
> >
> > #define arm_eabi_breakpoint 0xe7f001f0UL
> >
> > I think what's happened here is that the breakpoint added by your patch is left in place when GDB detaches the test case. When it starts running again, it hits the software single step breakpoint and, since it's no longer under GDB control, it dies with a SIGTRAP.
> >
> > This core file is not created when I run the test using a gdbserver without your patch.
> >
> > I'm suspicious of the assert in linux_process_target::maybe_hw_step.
> > Currently, it looks like this:
> >
> > bool
> > linux_process_target::maybe_hw_step (thread_info *thread)
> > {
> >   if (supports_hardware_single_step ())
> >     return true;
> >   else
> >     {
> >       /* GDBserver must insert single-step breakpoint for software
> >          single step.  */
> >       gdb_assert (has_single_step_breakpoints (thread));
> >       return false;
> >     }
> > }
> >
> > But, when Yao Qi introduced it back in June, 2016, it looked like
> > this:
> >
> > static int
> > maybe_hw_step (struct thread_info *thread)
> > {
> >   if (can_hardware_single_step ())
> >     return 1;
> >   else
> >     {
> >       struct process_info *proc = get_thread_process (thread);
> >
> >       /* GDBserver must insert reinsert breakpoint for software
> >          single step.  */
> >       gdb_assert (has_reinsert_breakpoints (proc));
> >       return 0;
> >     }
> > }
> >
> > So, back in 2016, when it was introduced, it's clear that the assert was referring to breakpoints which needed to be reinserted. Now, that's not at all obvious.
> >
> > Also, back in 2016, maybe_hw_step() was only called from two
> > locations; in each case it was in a block in which the condition
> > lwp->bp_reinsert != 0 was true. But now there are two other
> > calls; in one case, the software single step breakpoints have just been inserted, so that should be okay, but for the other case, in linux_process_target::resume_stopped_resumed_lwps, I'm less certain.
> >
> > In any case, could you comment out (or delete) the assert in a version of the source without your patch and let me know what happens?
> >
> > Also, if possible, I'd like to see a backtrace from where the assert occurs so that I can see which call to maybe_hw_step is responsible for triggering the failing assert.
> >
> > Kevin
> >