Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Tom de Vries <tdevries@suse.de>
To: Simon Marchi <simark@simark.ca>, gdb-patches@sourceware.org
Subject: Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
Date: Tue, 21 Nov 2023 12:08:53 +0100	[thread overview]
Message-ID: <0297df07-9331-4911-becc-e7aba0e92a7b@suse.de> (raw)
In-Reply-To: <cda460c6-a11b-41d7-8cd2-91b342f61ca8@simark.ca>

On 11/20/23 17:12, Simon Marchi wrote:
> On 11/20/23 10:37, Tom de Vries wrote:
>> When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
>> into an xfail:
>> ...
>> (gdb) catch syscall execve^M
>> Catchpoint 18 (syscall 'execve' [11])^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>>    catch syscall with arguments (execve)
>>    ...
>> continue^M
>> Continuing.^M
>> ^M
>> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
>>    /lib64/libc.so.6^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
>> continue^M
>> Continuing.^M
>> process 60484 is executing new program: catch-syscall^M
>> ^M
>> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
>> 54              char buf1[2] = "a";^M
>> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
>> ...
>>
>> The problem is that the catchpoint "(return from syscall execve)" doesn't
>> trigger.
>>
>> This is caused by ppc_linux_get_syscall_number returning 0 at execve
>> syscall-exit-stop, while it should return 11.
>>
>> This is a problem that was fixed in linux kernel version v5.19, by commit
>> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
>> machine I'm running the tests on has v4.18.0.
>>
>> An approach was discussed in the PR where ppc_linux_get_syscall_number would
>> try to detect an execve syscall-exit-stop based on the register state, but
>> that was considered too fragile.
>>
>> Fix this by caching the syscall number at syscall-enter-stop, and reusing it
>> at syscall-exit-stop.
>>
>> This is sufficient to stop triggering the xfail, so remove it.
>>
>> It's good to point out that this doesn't always eliminate the need to get the
>> syscall number at a syscall-exit-stop.
>>
>> The test-case has an example called mid-vfork, where we do:
>> - catch vfork
>> - continue
>> - catch syscall
>> - continue.
>>
>> The following things happen:
>> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
>> - the first continue runs into the event
>> - the "catch syscall" specifies that we capture syscall-enter-stop and
>>    syscall-exit-stop events.
>> - the second continue runs into the syscall-exit-stop.  At that point there's
>>    no syscall number value cached, because no corresponding syscall-enter-stop
>>    was observed.
> 
> Thanks for this example, it answers a question I had in the PR.
> 

Great.  It doesn't answer your question related to whether this can 
happen during attaching, but I've mentioned it in the kernel feature 
request ( https://bugzilla.kernel.org/show_bug.cgi?id=218170 ), so 
perhaps we'll get some feedback there.

>> We can address this issue somewhat by translating events into syscalls.  A
>> followup patch in this series use this approach (though not for vfork).
>>
>> This is an RFC at this point.
>>
>> I think there's an open issue with this patch: the cache needs to be
>> invalidated when we stop tracking syscalls.  I wonder if a generation_counter
>> scheme would be a good approach here.
>>
>> Perhaps we can do a per-thread approach where when continuing a thread we
>> reset the cached value unless PTRACE_SYSCALL is used to continue the thread.
> 
> I think that last idea makes sense.  I am not sure I undertsand the
> generation_counter idea.
> 

Well, my first idea was to hook into disabling syscall tracing, and then 
find all threads in the inferior and reset the cached value.  But I was 
not sure how to do this (I'm sure it's trivial, I just couldn't find the 
right code examples), so I figured it would be easier to implement 
incrementing a generation counter that is incremented when disabling 
syscall tracing, and store the generation alongside the cached value, 
allowing us to check whether the cached value is current.

Anyway, I'll try the per-thread approach.

>>
>> PR tdep/28623
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
>> ---
>>   gdb/linux-nat.c                          | 35 +++++++++++++++++++++---
>>   gdb/linux-nat.h                          |  3 ++
>>   gdb/testsuite/gdb.base/catch-syscall.exp |  8 +-----
>>   3 files changed, 35 insertions(+), 11 deletions(-)
>>
>> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
>> index d3e9560c2fc..efc5053cf85 100644
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -1762,7 +1762,25 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
>>     struct target_waitstatus *ourstatus = &lp->waitstatus;
>>     struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
>>     thread_info *thread = linux_target->find_thread (lp->ptid);
>> -  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
>> +
>> +  enum target_waitkind new_syscall_state
>> +    = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
>> +       ? TARGET_WAITKIND_SYSCALL_RETURN
>> +       : TARGET_WAITKIND_SYSCALL_ENTRY);
>> +
>> +  int syscall_number;
>> +  if (new_syscall_state == TARGET_WAITKIND_SYSCALL_RETURN
>> +      && lp->syscall_number != -1)
>> +    {
>> +      /* Calling gdbarch_get_syscall_number for TARGET_WAITKIND_SYSCALL_RETURN
>> +	 is unreliable on some targets for some syscalls, use the syscall
>> +	 detected at TARGET_WAITKIND_SYSCALL_ENTRY instead.  */
>> +      syscall_number = lp->syscall_number;
>> +    }
>> +  else
>> +    {
>> +      syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
> 
> (unnecessary braces)
> 

Ack, will fix.

> I like this change, since (if we do things right, i.e. invalidate
> syscall_number at the right plcaes), lp->syscall_number is more
> trustworthy than gdbarch_get_syscall_number.
> 

Yes, that is also my hope.

> But, this overlaps a bit with your other patch "Use
> PTRACE_GET_SYSCALL_INFO for syscall number".  I'm not sure how you would
> reconcile the two patches, 

Me neither yet.

> but I think the general approach (since we
> don't have a one size fits all solution) should be to try the most
> trustworthy things first.  So that would be:
> 
>   1. PTRACE_GET_SYSCALL_INFO, if available
>   2. lp->syscall, if available
>   3. gdbarch_get_syscall_number
> 

I was browsing kernel commit logs and also came across a bug in 
PTRACE_GET_SYSCALL_INFO: commit 85cc91e2ba42 ("mips: fix syscall_get_nr").

So I guess YMMV depending on kernel version and architecture.

> In that other patch, you wrap the gdbarch_get_syscall_number call to try
> PTRACE_GET_SYSCALL_INFO before gdbarch_get_syscall_number, so depending
> on how the two patches are merged, the code might not match the order
> above.
> 

True, that remains to be handled.  My expectation is that the 
PTRACE_GET_SYSCALL_INFO will go in first, since it's fairly 
non-controversial, and then the priority question will have to be 
handled in this series.

> And if gdbarch_get_syscall_number could somehow detect when it can't
> provide an answer, that would be really nice.
> 

Yes, though Ulrich already indicated that he prefers a solution that 
doesn't rely on this.

> Side-note: I think that either syscall_number should be made a LONGEST,
> or gdbarch_get_syscall_number should be changed to return an int.
> There's no point in returning a LONGEST and then truncating it.

Perhaps, though I'd rather leave that issue for a separate patch.

Thanks,
- Tom

next prev parent reply	other threads:[~2023-11-21 11:06 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-20 15:37 Tom de Vries
2023-11-20 15:37 ` [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall Tom de Vries
2023-11-20 15:37 ` [RFC 3/3] [gdb] Use ptrace events to get current syscall Tom de Vries
2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
2023-11-21  0:29   ` John Baldwin
2023-11-21 10:26     ` Tom de Vries
2023-11-21 17:54       ` John Baldwin
2023-11-21  2:43   ` Simon Marchi
2023-11-21 11:08   ` Tom de Vries [this message]
2023-11-21 18:03     ` John Baldwin
2023-11-21 21:19       ` Simon Marchi
2023-11-21 22:17         ` John Baldwin
2023-11-22 12:58           ` Tom de Vries

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0297df07-9331-4911-becc-e7aba0e92a7b@suse.de \
    --to=tdevries@suse.de \
    --cc=gdb-patches@sourceware.org \
    --cc=simark@simark.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).