public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Kevin Buettner <kevinb@redhat.com>
To: "Yan, Zhiyong" <Zhiyong.Yan@windriver.com>
Cc: gdb-patches@sourceware.org,
	"luis.machado@arm.com" <luis.machado@arm.com>,
	"tom@tromey.com" <tom@tromey.com>
Subject: Re: [PATCH] gdbserver: Install single-step breakpoint for a pending thread whose last_resume_kind is resume_step
Date: Mon, 24 Jul 2023 20:36:50 -0700	[thread overview]
Message-ID: <20230724203650.43ddd754@f37-zws-nv> (raw)
In-Reply-To: <DS0PR11MB64476202F13833871C26D5DAE702A@DS0PR11MB6447.namprd11.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 6177 bytes --]

Hi Zhiyong,

I looked at the backtrace that you provided and see that maybe_hw_step()
is being called from linux_process_target::resume_stopped_resumed_lwps,
which is the one location where I wasn't able to convince myself that
the assert should hold.

I was running your test case executable (osm) as an unprivileged user,
so neither the syslog calls nor the sudo were working.  (Sudo could
perhaps work, but it wanted to prompt for a password and stdin and
stdout were closed.)  I've since modified it so that sudo isn't used
and I'm using 'fprintf(stderr, ...)' instead of syslog - which is how
I discovered that sudo wasn't working.  I've tried next'ing quite a
lot, but so far I haven't reproduced the bug.  (Hopefully, the sudo
isn't required to reproduce the problem.)

If you manage to reproduce the bug on a Raspberry Pi 4 (and tell me
how to do it), that'd be great!

So, what I'm doing, using three separate terminals, in an attempt to
reproduce the bug is:

1) Run osm in terminal 1.  (I didn't want to mess with systemd.)  Once
I start running it, I see a bunch of messages from the dd command.

2) In terminal 2, I run:

   /path/to/gdbserver --debug --debug-format=all --remote-debug --event-loop-debug --once --attach :1234 $(pgrep osm)

3) In terminal 3, I run:

   /path/to/gdb osm -x ./gdbx2

(I've changed the target remote command in gdbx2 to refer to localhost.)

I'm also attaching my hacked lupdated.c.  If you see anything wrong
with what I'm trying, please let me know.

Kevin

On Mon, 24 Jul 2023 13:36:24 +0000
"Yan, Zhiyong" <Zhiyong.Yan@windriver.com> wrote:

> Hi Kevin,
>     The callstack of assert is attached.
>     Please see attached gdbx2 which add more 'n' commands, on arm platform, keep execute 'n' command, this test case can trigger assert error.
> 
>     Today, I didn't finish setting up test environments on RaspBerry Pi4. Before I produced this issue on Xilinx arm platform.
> 
> Best Regards.
> Zhiyong
> 
> -----Original Message-----
> From: Kevin Buettner <kevinb@redhat.com> 
> Sent: Saturday, July 22, 2023 4:50 AM
> To: Yan, Zhiyong <Zhiyong.Yan@windriver.com>
> Cc: gdb-patches@sourceware.org; luis.machado@arm.com; tom@tromey.com
> Subject: Re: [PATCH] gdbserver: Install single-step breakpoint for a pending thread whose last_resume_kind is resume_step
> 
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> Hi Zhiyong,
> 
> I set up a Raspberry Pi running a recent 32-bit Raspberry Pi OS so that I could test your patch.  I was able to build and run your test case, but I could not reproduce the bug on the Pi.
> 
> I tested gdb.threads/*.exp using --target_board=native-gdbserver both with and without your patch.  Some of these tests are racy, but my conclusion from just looking at the PASSes and FAILs (after many test
> runs) is that there are no regressions.
> 
> But then I remembered to enable core dumps on the Pi and after running gdb.threads/pending-fork-event-detach/pending-fork-event-detach-main-vfork
> by itself, I saw that it left a core file...
> 
> $ make check RUNTESTFLAGS="--target_board=native-gdbserver" TESTS=gdb.threads/pending-fork-event-detach.exp
> ...
>                 === gdb Summary ===
> 
> # of unexpected core files      1
> # of expected passes            240
> 
> The core file was from the running test case, not gdbserver, nor gdb.
> 
> Looking at the core file in GDB shows...
> 
> Program terminated with signal SIGTRAP, Trace/breakpoint trap.
> #0  0x00010624 in break_here () at /mesquite2/sourceware-git/rpi-gdbserver/bld/../../worktree-gdbserver/gdb/testsuite/gdb.threads/pending-fork-event-detach.c:29
> 29        x++;
> [Current thread is 1 (Thread 0xf7e10440 (LWP 4835))]
> (gdb) x/i $pc
> => 0x10624 <break_here+12>:     udf     #16  
> (gdb) x/x $pc
> 0x10624 <break_here+12>:        0xe7f001f0
> 
> ...and in gdbserver/linux-aarch32-low.cc:
> 
> #define arm_eabi_breakpoint 0xe7f001f0UL
> 
> I think what's happened here is that the breakpoint added by your patch is left in place when GDB detaches the test case.  When it starts running again, it hits the software single step breakpoint and, since it's no longer under GDB control, it dies with a SIGTRAP.
> 
> This core file is not created when I run the test using a gdbserver without your patch.
> 
> I'm suspicious of the assert in linux_process_target::maybe_hw_step.
> Currently, it looks like this:
> 
> bool
> linux_process_target::maybe_hw_step (thread_info *thread) {
>   if (supports_hardware_single_step ())
>     return true;
>   else
>     {
>       /* GDBserver must insert single-step breakpoint for software
>          single step.  */
>       gdb_assert (has_single_step_breakpoints (thread));
>       return false;
>     }
> }
> 
> But, when Yao Qi introduced it back in June, 2016, it looked like
> this:
> 
> static int
> maybe_hw_step (struct thread_info *thread) {
>   if (can_hardware_single_step ())
>     return 1;
>   else
>     {
>       struct process_info *proc = get_thread_process (thread);
> 
>       /* GDBserver must insert reinsert breakpoint for software
>      single step.  */
>       gdb_assert (has_reinsert_breakpoints (proc));
>       return 0;
>     }
> }
> 
> So, back is 2016, when it was introduced, it's clear that the assert was referring to breakpoints which needed to be reinserted.  Now, that's not at all obvious.
> 
> Also, back in 2016, maybe_hw_step() was only called from two locations; in each case it was in a block in which the condition
> lwp->bp_reinsert != 0 was true.  But now there are two other
> calls; in one case, the software single step breakpoints have just been inserted, so that should be okay, but for the other case, in linux_process_target::resume_stopped_resumed_lwps, I'm less certain.
> 
> In any case, could you comment out (or delete) the assert in a version of the source without your patch and let me know what happens?
> 
> Also, if possible, I'd like to see a backtrace from where the assert occurs so that I can see which call to maybe_hw_step is responsible for triggering the failing assert.
> 
> Kevin
> 

[-- Attachment #2: lpudated.c --]
[-- Type: text/x-c++src, Size: 4636 bytes --]

/*
* lpudated.c - Simple timestamping daemon
*/

#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <time.h>
#include <stddef.h>
//
#include <signal.h>
#include <pthread.h>
#include <sys/wait.h>

#define WORKERCOUNT 1
#define WORKERSLEEPSECS 5
sigset_t g_stSigSet;
pthread_t g_stPid;

int bRun = 1;

int _osal_task_pause (unsigned short sleep_time)
{
	struct timespec ts;

	ts.tv_sec = sleep_time / 1000;
	ts.tv_nsec = (sleep_time % 1000) * 1000000;

	if (nanosleep(&ts, NULL) != 0)
    	return -1;

	return 0;
}

static void* sigthread_routine(void *pArg)
{
    sigset_t *pstSigSet = &g_stSigSet;
    int iSigNum = 0;
    pthread_t stPid = pthread_self();
    int iRet = -1;
    /* pthread_detach(stPid); */
    while(1){
        iRet = sigwait(pstSigSet, &iSigNum);
        if(-1 == iRet){
             fprintf(stderr, "Signal error: %s", strerror(errno));
        }
        bRun = 0;
        break;
    }
    return NULL;
}

static void* worker_a(void *pArg)
{
	pid_t sm_cid;
    pthread_t stPid = pthread_self();
    pthread_detach(stPid);
    /* Finally do out work */
    // char *sm_args[] = {"sudo", "swManager", "swpactivation", "activate", "forced", "standby", NULL };
    char *sm_args[] = {"dd", "if=/dev/zero", "of=/dev/null", "bs=1k", "count=1024", NULL };
    while(bRun){
    	sm_cid = fork();
    	if( sm_cid == 0 ) { //child
    		if (execvp(sm_args[0], sm_args) == -1){
//    			LOGGER(LCON|LRAM, SWD, SDL, ERR, "%s() - ERROR activating sw package - errno=%d \n", __FUNCTION__, errno);
    		}
    	}

    	/* wait for the SW Manager activation execution */
    	waitpid(sm_cid, NULL, 0);
        _osal_task_pause(WORKERSLEEPSECS);
     }
}

static void* worker_b(void *pArg)
{
    int num = (int)pArg;
    int fd, len;
    time_t timebuf;
    char fname[1024] = {0};
    sprintf(fname, "/dev/null");
    pthread_t stPid = pthread_self();
    pthread_detach(stPid);
    /* Finally do out work */
    len = strlen(ctime(&timebuf));
    while(bRun){
        char *buf = malloc(sizeof(char)*(len+1));
        if(buf == NULL){
           fprintf(stderr, "malloc");
           exit(EXIT_FAILURE);
        }
        if((fd == open(fname, O_WRONLY|O_APPEND, 0600)) < 0){
            fprintf(stderr, "open");
            exit(EXIT_FAILURE);
        }
        time(&timebuf);
        strncpy(buf, ctime(&timebuf), len+1);
        write(fd, buf, len+1);
        free(buf);
        close(fd);
        _osal_task_pause(WORKERSLEEPSECS);
     }
}

int main(void)
{
    pid_t pid, sid;
    time_t timebuf;
    int fd, len;
    pthread_t worker_stPid[WORKERCOUNT];
    pthread_t worker_b_stPid[WORKERCOUNT];
    void *pRet;
    int i;
    fprintf(stderr, "zyan1 lpudated is starting.\n");
    pid = fork();
    if(pid < 0) {
        fprintf(stderr, "%s\n", strerror(errno));
        exit(EXIT_FAILURE);
    }
    if(pid > 0){
       /* In the parent, let's bail */
       exit(EXIT_SUCCESS);
    }

    /* In the child ... */
    /* Open the system log */
    //openlog("lpudated", LOG_PID, LOG_DAEMON);

    /* First, start a new session */
    if((sid = setsid())<0){
        fprintf(stderr, "%s\n", "setsid");
        exit(EXIT_FAILURE);
    }
#if 0
    if ((pid = fork()) < 0) {
    	fprintf(stderr, "%s %d, %s\n", __func__, __LINE__, strerror(errno));
        exit(EXIT_FAILURE);
    }else if (0!=pid) { //parent
    	exit(EXIT_SUCCESS);
    }
#endif
    /* Next, make / the current directory */
    if(chdir("/")<0){
       fprintf(stderr, "%s\n", "chdir");
       exit(EXIT_FAILURE);
    }
 
    /* Reset the file mode */
    umask(0);

    /* close the unneeded file descriptors */
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    //close(STDERR_FILENO);

    sigset_t *pstSigSet = &g_stSigSet;
    sigemptyset(pstSigSet);
    sigaddset(pstSigSet, SIGHUP);
    sigaddset(pstSigSet, SIGINT);
    sigaddset(pstSigSet, SIGTERM);
    sigaddset(pstSigSet, SIGKILL);
    if(pthread_sigmask(SIG_BLOCK, pstSigSet, NULL)<0){
	fprintf(stderr, "Cannot set signal mask\n");
        exit(EXIT_FAILURE);
    }

    for(i=0;i < WORKERCOUNT; i++){
    	pthread_create(&worker_stPid[i], NULL, worker_a, (void*)i);
    }

    for(i=0;i < WORKERCOUNT; i++){
        pthread_create(&worker_b_stPid[i], NULL, worker_b, (void*)i);
    }

    if(pthread_create(&g_stPid, NULL, sigthread_routine, NULL)<0){
	fprintf(stderr, "Can not start signal handler thread.\n");
        exit(EXIT_FAILURE);
    }

    pthread_join(g_stPid, &pRet);

    /* Close the system log and scram */
    //closelog();
    exit(EXIT_SUCCESS);
}

  reply	other threads:[~2023-07-25  3:37 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-12  3:25 zhiyong.yan
2023-07-21 20:49 ` Kevin Buettner
2023-07-24 13:36   ` Yan, Zhiyong
2023-07-25  3:36     ` Kevin Buettner [this message]
     [not found]       ` <DS0PR11MB6447E74277EE65464375E086E703A@DS0PR11MB6447.namprd11.prod.outlook.com>
2023-07-25  6:32         ` Kevin Buettner
2023-07-25  6:50           ` Yan, Zhiyong
2023-07-25  7:05             ` Yan, Zhiyong
2023-07-26  3:58               ` Kevin Buettner
2023-07-26  6:30                 ` Yan, Zhiyong
  -- strict thread matches above, loose matches on Subject: below --
2023-07-12  3:25 zhiyong.yan
2023-07-12  3:19 zhiyong.yan
2023-07-03  3:04 zhiyong.yan
2023-07-10 12:46 ` Alexandra Petlanova Hajkova
2023-07-11  1:49   ` Yan, Zhiyong
2023-07-11 18:42 ` Kevin Buettner
2023-07-12  2:17   ` Yan, Zhiyong
     [not found]   ` <DS0PR11MB644704609A8C731570939D96E736A@DS0PR11MB6447.namprd11.prod.outlook.com>
2023-07-12  3:29     ` Yan, Zhiyong
2023-07-17  9:35   ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230724203650.43ddd754@f37-zws-nv \
    --to=kevinb@redhat.com \
    --cc=Zhiyong.Yan@windriver.com \
    --cc=gdb-patches@sourceware.org \
    --cc=luis.machado@arm.com \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).