[RFC 1/3] [gdb] Call gdbarch_get_syscall

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
@ 2023-11-20 15:37 Tom de Vries
  2023-11-20 15:37 ` [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall Tom de Vries
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Tom de Vries @ 2023-11-20 15:37 UTC (permalink / raw)
  To: gdb-patches

When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
into an xfail:
...
(gdb) catch syscall execve^M
Catchpoint 18 (syscall 'execve' [11])^M
(gdb) PASS: gdb.base/catch-syscall.exp: execve: \
  catch syscall with arguments (execve)
  ...
continue^M
Continuing.^M
^M
Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
  /lib64/libc.so.6^M
(gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
continue^M
Continuing.^M
process 60484 is executing new program: catch-syscall^M
^M
Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
54              char buf1[2] = "a";^M
(gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
...

The problem is that the catchpoint "(return from syscall execve)" doesn't
trigger.

This is caused by ppc_linux_get_syscall_number returning 0 at execve
syscall-exit-stop, while it should return 11.

This is a problem that was fixed in linux kernel version v5.19, by commit
ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
machine I'm running the tests on has v4.18.0.

An approach was discussed in the PR where ppc_linux_get_syscall_number would
try to detect an execve syscall-exit-stop based on the register state, but
that was considered too fragile.

Fix this by caching the syscall number at syscall-enter-stop, and reusing it
at syscall-exit-stop.

This is sufficient to stop triggering the xfail, so remove it.

It's good to point out that this doesn't always eliminate the need to get the
syscall number at a syscall-exit-stop.

The test-case has an example called mid-vfork, where we do:
- catch vfork
- continue
- catch syscall
- continue.

The following things happen:
- the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
- the first continue runs into the event
- the "catch syscall" specifies that we capture syscall-enter-stop and
  syscall-exit-stop events.
- the second continue runs into the syscall-exit-stop.  At that point there's
  no syscall number value cached, because no corresponding syscall-enter-stop
  was observed.

We can address this issue somewhat by translating events into syscalls.  A
followup patch in this series use this approach (though not for vfork).

This is an RFC at this point.

I think there's an open issue with this patch: the cache needs to be
invalidated when we stop tracking syscalls.  I wonder if a generation_counter
scheme would be a good approach here.

Perhaps we can do a per-thread approach where when continuing a thread we
reset the cached value unless PTRACE_SYSCALL is used to continue the thread.

PR tdep/28623
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
---
 gdb/linux-nat.c                          | 35 +++++++++++++++++++++---
 gdb/linux-nat.h                          |  3 ++
 gdb/testsuite/gdb.base/catch-syscall.exp |  8 +-----
 3 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index d3e9560c2fc..efc5053cf85 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1762,7 +1762,25 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
   struct target_waitstatus *ourstatus = &lp->waitstatus;
   struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
   thread_info *thread = linux_target->find_thread (lp->ptid);
-  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
+
+  enum target_waitkind new_syscall_state
+    = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+       ? TARGET_WAITKIND_SYSCALL_RETURN
+       : TARGET_WAITKIND_SYSCALL_ENTRY);
+
+  int syscall_number;
+  if (new_syscall_state == TARGET_WAITKIND_SYSCALL_RETURN
+      && lp->syscall_number != -1)
+    {
+      /* Calling gdbarch_get_syscall_number for TARGET_WAITKIND_SYSCALL_RETURN
+	 is unreliable on some targets for some syscalls, use the syscall
+	 detected at TARGET_WAITKIND_SYSCALL_ENTRY instead.  */
+      syscall_number = lp->syscall_number;
+    }
+  else
+    {
+      syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
+    }

   if (stopping)
     {
@@ -1801,9 +1819,18 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
      the user could install a new catchpoint for this syscall
      between syscall enter/return, and we'll need to know to
      report a syscall return if that happens.  */
-  lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
-		       ? TARGET_WAITKIND_SYSCALL_RETURN
-		       : TARGET_WAITKIND_SYSCALL_ENTRY);
+  lp->syscall_state = new_syscall_state;
+
+  if (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY)
+    {
+      /* Save to use in TARGET_WAITKIND_SYSCALL_RETURN.  */
+      lp->syscall_number = syscall_number;
+    }
+  else
+    {
+      /* Reset to prevent stale values.  */
+      lp->syscall_number = -1;
+    }

   if (catch_syscall_enabled ())
     {
diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h
index cce8bb3ddcc..d53bd8bda5b 100644
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -277,6 +277,9 @@ struct lwp_info : intrusive_list_node<lwp_info>
      - TARGET_WAITKIND_SYSCALL_RETURN */
   enum target_waitkind syscall_state;

+  /* Syscall number corresponding to syscall_state.  */
+  int syscall_number = -1;
+
   /* The processor core this LWP was last seen on.  */
   int core = -1;

diff --git a/gdb/testsuite/gdb.base/catch-syscall.exp b/gdb/testsuite/gdb.base/catch-syscall.exp
index 0588cb35d87..d8ea466cf00 100644
--- a/gdb/testsuite/gdb.base/catch-syscall.exp
+++ b/gdb/testsuite/gdb.base/catch-syscall.exp
@@ -134,13 +134,7 @@ proc check_return_from_syscall { syscall { pattern "" } } {
 		return 1
 	    }
 	    -re -wrap ".*Breakpoint $decimal, main .*" {
-		# On Powerpc the kernel does not report the returned from
-		# syscall as expected by the test.  GDB bugzilla 28623.
-		if { [istarget "powerpc64*-linux*"] } {
-		    xfail $thistest
-		} else {
-		    fail $thistest
-		}
+		fail $thistest
 		return 0
 	    }
 	}

base-commit: fdb4c2e02e6600b51f38a60cd70882887007cbdf
-- 
2.35.3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall
  2023-11-20 15:37 [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Tom de Vries
@ 2023-11-20 15:37 ` Tom de Vries
  2023-11-20 15:37 ` [RFC 3/3] [gdb] Use ptrace events to get current syscall Tom de Vries
  2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
  2 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2023-11-20 15:37 UTC (permalink / raw)
  To: gdb-patches

Add a new gdbarch hook, that translates ptrace events like PTRACE_EVENT_EXEC
into the system call number that triggers the event.

No implementation and no use of the hook yet, this will be added in the
following patch.
---
 gdb/arch-utils.c          |  8 ++++++++
 gdb/arch-utils.h          |  3 +++
 gdb/gdbarch-gen.h         |  7 +++++++
 gdb/gdbarch.c             | 22 ++++++++++++++++++++++
 gdb/gdbarch_components.py | 12 ++++++++++++
 5 files changed, 52 insertions(+)

diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
index c64584c5760..27ed580b265 100644
--- a/gdb/arch-utils.c
+++ b/gdb/arch-utils.c
@@ -1492,6 +1492,14 @@ gdbarch_initialized_p (gdbarch *arch)
   return arch->initialized_p;
 }
 
+/* See arch-utils.h.  */
+
+int
+default_extended_event_to_syscall (struct gdbarch *, int)
+{
+  return -1;
+}
+
 void _initialize_gdbarch_utils ();
 void
 _initialize_gdbarch_utils ()
diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index e53950c4408..e41d31321a5 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -325,4 +325,7 @@ extern enum return_value_convention default_gdbarch_return_value
       struct regcache *regcache, struct value **read_value,
       const gdb_byte *writebuf);
 
+/* Default implementation of gdbarch extended_event_to_syscall method.  */
+extern int default_extended_event_to_syscall (struct gdbarch *, int);
+
 #endif /* ARCH_UTILS_H */
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index 9f468bd1f61..62ad52f5da8 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1283,6 +1283,13 @@ typedef LONGEST (gdbarch_get_syscall_number_ftype) (struct gdbarch *gdbarch, thr
 extern LONGEST gdbarch_get_syscall_number (struct gdbarch *gdbarch, thread_info *thread);
 extern void set_gdbarch_get_syscall_number (struct gdbarch *gdbarch, gdbarch_get_syscall_number_ftype *get_syscall_number);
 
+/* Function for the 'catch syscall' feature.
+   Translate extended wait event to architecture-specific system call number. */
+
+typedef int (gdbarch_extended_event_to_syscall_ftype) (struct gdbarch *gdbarch, int event);
+extern int gdbarch_extended_event_to_syscall (struct gdbarch *gdbarch, int event);
+extern void set_gdbarch_extended_event_to_syscall (struct gdbarch *gdbarch, gdbarch_extended_event_to_syscall_ftype *extended_event_to_syscall);
+
 /* The filename of the XML syscall for this architecture. */
 
 extern const char * gdbarch_xml_syscall_file (struct gdbarch *gdbarch);
diff --git a/gdb/gdbarch.c b/gdb/gdbarch.c
index ea6e4c647b1..f4661b8b4cd 100644
--- a/gdb/gdbarch.c
+++ b/gdb/gdbarch.c
@@ -207,6 +207,7 @@ struct gdbarch
   gdbarch_get_siginfo_type_ftype *get_siginfo_type = nullptr;
   gdbarch_record_special_symbol_ftype *record_special_symbol = nullptr;
   gdbarch_get_syscall_number_ftype *get_syscall_number = nullptr;
+  gdbarch_extended_event_to_syscall_ftype *extended_event_to_syscall = default_extended_event_to_syscall;
   const char * xml_syscall_file = 0;
   struct syscalls_info * syscalls_info = 0;
   const char *const * stap_integer_prefixes = 0;
@@ -475,6 +476,7 @@ verify_gdbarch (struct gdbarch *gdbarch)
   /* Skip verify of get_siginfo_type, has predicate.  */
   /* Skip verify of record_special_symbol, has predicate.  */
   /* Skip verify of get_syscall_number, has predicate.  */
+  /* Skip verify of extended_event_to_syscall, invalid_p == 0 */
   /* Skip verify of xml_syscall_file, invalid_p == 0 */
   /* Skip verify of syscalls_info, invalid_p == 0 */
   /* Skip verify of stap_integer_prefixes, invalid_p == 0 */
@@ -1198,6 +1200,9 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file)
   gdb_printf (file,
 	      "gdbarch_dump: get_syscall_number = <%s>\n",
 	      host_address_to_string (gdbarch->get_syscall_number));
+  gdb_printf (file,
+	      "gdbarch_dump: extended_event_to_syscall = <%s>\n",
+	      host_address_to_string (gdbarch->extended_event_to_syscall));
   gdb_printf (file,
 	      "gdbarch_dump: xml_syscall_file = %s\n",
 	      pstring (gdbarch->xml_syscall_file));
@@ -4512,6 +4517,23 @@ set_gdbarch_get_syscall_number (struct gdbarch *gdbarch,
   gdbarch->get_syscall_number = get_syscall_number;
 }
 
+int
+gdbarch_extended_event_to_syscall (struct gdbarch *gdbarch, int event)
+{
+  gdb_assert (gdbarch != NULL);
+  gdb_assert (gdbarch->extended_event_to_syscall != NULL);
+  if (gdbarch_debug >= 2)
+    gdb_printf (gdb_stdlog, "gdbarch_extended_event_to_syscall called\n");
+  return gdbarch->extended_event_to_syscall (gdbarch, event);
+}
+
+void
+set_gdbarch_extended_event_to_syscall (struct gdbarch *gdbarch,
+				       gdbarch_extended_event_to_syscall_ftype extended_event_to_syscall)
+{
+  gdbarch->extended_event_to_syscall = extended_event_to_syscall;
+}
+
 const char *
 gdbarch_xml_syscall_file (struct gdbarch *gdbarch)
 {
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 694ac366023..0292c91d29f 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -2064,6 +2064,18 @@ Get architecture-specific system calls information from registers.
     predicate=True,
 )
 
+Method(
+    comment="""
+Function for the 'catch syscall' feature.
+Translate extended wait event to architecture-specific system call number.
+""",
+    type="int",
+    name="extended_event_to_syscall",
+    params=[("int", "event")],
+    predefault="default_extended_event_to_syscall",
+    invalid=False,
+)
+
 Value(
     comment="""
 The filename of the XML syscall for this architecture.
-- 
2.35.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC 3/3] [gdb] Use ptrace events to get current syscall
  2023-11-20 15:37 [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Tom de Vries
  2023-11-20 15:37 ` [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall Tom de Vries
@ 2023-11-20 15:37 ` Tom de Vries
  2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
  2 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2023-11-20 15:37 UTC (permalink / raw)
  To: gdb-patches

Add a powerpc implementation of gdbarch_extended_event_to_syscall, for the
moment only mapping PTRACE_EVENT_EXEC to execve.

Use gdbarch_extended_event_to_syscall in linux_handle_extended_wait to
update the cached syscall number.
---
 gdb/linux-nat.c      |  5 +++++
 gdb/ppc-linux-tdep.c | 17 +++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index efc5053cf85..6122050a982 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1959,6 +1959,11 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
      you have to be using PTRACE_SEIZE to get that.  */
   lp->syscall_state = TARGET_WAITKIND_SYSCALL_ENTRY;
 
+  struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
+  int syscall_number = gdbarch_extended_event_to_syscall (gdbarch, event);
+  if (syscall_number != -1)
+    lp->syscall_number = syscall_number;
+
   if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK
       || event == PTRACE_EVENT_CLONE)
     {
diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
index 1cc26ed69a5..b4ca660a9fb 100644
--- a/gdb/ppc-linux-tdep.c
+++ b/gdb/ppc-linux-tdep.c
@@ -64,6 +64,7 @@
 #include "elf-bfd.h"
 #include "producer.h"
 #include "target-float.h"
+#include "nat/linux-ptrace.h"
 
 #include "features/rs6000/powerpc-32l.c"
 #include "features/rs6000/powerpc-altivec32l.c"
@@ -1373,6 +1374,19 @@ ppc_linux_get_syscall_number (struct gdbarch *gdbarch,
 static struct linux_record_tdep ppc_linux_record_tdep;
 static struct linux_record_tdep ppc64_linux_record_tdep;
 
+static int
+ppc_linux_extended_event_to_syscall (struct gdbarch *gdbarch, int event)
+{
+  switch (event)
+    {
+    case PTRACE_EVENT_EXEC:
+      /* Execve syscall.  */
+      return 11;
+    }
+
+  return -1;
+}
+
 /* ppc_canonicalize_syscall maps from the native PowerPC Linux set of
    syscall ids into a canonical set of syscall ids used by process
    record.  (See arch/powerpc/include/uapi/asm/unistd.h in kernel tree.)
@@ -2173,6 +2187,9 @@ ppc_linux_init_abi (struct gdbarch_info info,
   /* Get the syscall number from the arch's register.  */
   set_gdbarch_get_syscall_number (gdbarch, ppc_linux_get_syscall_number);
 
+  set_gdbarch_extended_event_to_syscall (gdbarch,
+					 ppc_linux_extended_event_to_syscall);
+
   /* SystemTap functions.  */
   set_gdbarch_stap_integer_prefixes (gdbarch, stap_integer_prefixes);
   set_gdbarch_stap_register_indirection_prefixes (gdbarch,
-- 
2.35.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-20 15:37 [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Tom de Vries
  2023-11-20 15:37 ` [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall Tom de Vries
  2023-11-20 15:37 ` [RFC 3/3] [gdb] Use ptrace events to get current syscall Tom de Vries
@ 2023-11-20 16:12 ` Simon Marchi
  2023-11-21  0:29   ` John Baldwin
                     ` (2 more replies)
  2 siblings, 3 replies; 13+ messages in thread
From: Simon Marchi @ 2023-11-20 16:12 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 11/20/23 10:37, Tom de Vries wrote:
> When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
> into an xfail:
> ...
> (gdb) catch syscall execve^M
> Catchpoint 18 (syscall 'execve' [11])^M
> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>   catch syscall with arguments (execve)
>   ...
> continue^M
> Continuing.^M
> ^M
> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
>   /lib64/libc.so.6^M
> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
> continue^M
> Continuing.^M
> process 60484 is executing new program: catch-syscall^M
> ^M
> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
> 54              char buf1[2] = "a";^M
> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
> ...
> 
> The problem is that the catchpoint "(return from syscall execve)" doesn't
> trigger.
> 
> This is caused by ppc_linux_get_syscall_number returning 0 at execve
> syscall-exit-stop, while it should return 11.
> 
> This is a problem that was fixed in linux kernel version v5.19, by commit
> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
> machine I'm running the tests on has v4.18.0.
> 
> An approach was discussed in the PR where ppc_linux_get_syscall_number would
> try to detect an execve syscall-exit-stop based on the register state, but
> that was considered too fragile.
> 
> Fix this by caching the syscall number at syscall-enter-stop, and reusing it
> at syscall-exit-stop.
> 
> This is sufficient to stop triggering the xfail, so remove it.
> 
> It's good to point out that this doesn't always eliminate the need to get the
> syscall number at a syscall-exit-stop.
> 
> The test-case has an example called mid-vfork, where we do:
> - catch vfork
> - continue
> - catch syscall
> - continue.
> 
> The following things happen:
> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
> - the first continue runs into the event
> - the "catch syscall" specifies that we capture syscall-enter-stop and
>   syscall-exit-stop events.
> - the second continue runs into the syscall-exit-stop.  At that point there's
>   no syscall number value cached, because no corresponding syscall-enter-stop
>   was observed.

Thanks for this example, it answers a question I had in the PR.

> We can address this issue somewhat by translating events into syscalls.  A
> followup patch in this series use this approach (though not for vfork).
> 
> This is an RFC at this point.
> 
> I think there's an open issue with this patch: the cache needs to be
> invalidated when we stop tracking syscalls.  I wonder if a generation_counter
> scheme would be a good approach here.
> 
> Perhaps we can do a per-thread approach where when continuing a thread we
> reset the cached value unless PTRACE_SYSCALL is used to continue the thread.

I think that last idea makes sense.  I am not sure I undertsand the
generation_counter idea.

> 
> PR tdep/28623
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
> ---
>  gdb/linux-nat.c                          | 35 +++++++++++++++++++++---
>  gdb/linux-nat.h                          |  3 ++
>  gdb/testsuite/gdb.base/catch-syscall.exp |  8 +-----
>  3 files changed, 35 insertions(+), 11 deletions(-)
> 
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index d3e9560c2fc..efc5053cf85 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1762,7 +1762,25 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
>    struct target_waitstatus *ourstatus = &lp->waitstatus;
>    struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
>    thread_info *thread = linux_target->find_thread (lp->ptid);
> -  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
> +
> +  enum target_waitkind new_syscall_state
> +    = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
> +       ? TARGET_WAITKIND_SYSCALL_RETURN
> +       : TARGET_WAITKIND_SYSCALL_ENTRY);
> +
> +  int syscall_number;
> +  if (new_syscall_state == TARGET_WAITKIND_SYSCALL_RETURN
> +      && lp->syscall_number != -1)
> +    {
> +      /* Calling gdbarch_get_syscall_number for TARGET_WAITKIND_SYSCALL_RETURN
> +	 is unreliable on some targets for some syscalls, use the syscall
> +	 detected at TARGET_WAITKIND_SYSCALL_ENTRY instead.  */
> +      syscall_number = lp->syscall_number;
> +    }
> +  else
> +    {
> +      syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);

(unnecessary braces)

I like this change, since (if we do things right, i.e. invalidate
syscall_number at the right plcaes), lp->syscall_number is more
trustworthy than gdbarch_get_syscall_number.

But, this overlaps a bit with your other patch "Use
PTRACE_GET_SYSCALL_INFO for syscall number".  I'm not sure how you would
reconcile the two patches, but I think the general approach (since we
don't have a one size fits all solution) should be to try the most
trustworthy things first.  So that would be:

 1. PTRACE_GET_SYSCALL_INFO, if available
 2. lp->syscall, if available
 3. gdbarch_get_syscall_number

In that other patch, you wrap the gdbarch_get_syscall_number call to try
PTRACE_GET_SYSCALL_INFO before gdbarch_get_syscall_number, so depending
on how the two patches are merged, the code might not match the order
above.

And if gdbarch_get_syscall_number could somehow detect when it can't
provide an answer, that would be really nice.

Side-note: I think that either syscall_number should be made a LONGEST,
or gdbarch_get_syscall_number should be changed to return an int.
There's no point in returning a LONGEST and then truncating it.

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
@ 2023-11-21  0:29   ` John Baldwin
  2023-11-21 10:26     ` Tom de Vries
  2023-11-21  2:43   ` Simon Marchi
  2023-11-21 11:08   ` Tom de Vries
  2 siblings, 1 reply; 13+ messages in thread
From: John Baldwin @ 2023-11-21  0:29 UTC (permalink / raw)
  To: Simon Marchi, Tom de Vries, gdb-patches

On 11/20/23 8:12 AM, Simon Marchi wrote:
> On 11/20/23 10:37, Tom de Vries wrote:
>> When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
>> into an xfail:
>> ...
>> (gdb) catch syscall execve^M
>> Catchpoint 18 (syscall 'execve' [11])^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>>    catch syscall with arguments (execve)
>>    ...
>> continue^M
>> Continuing.^M
>> ^M
>> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
>>    /lib64/libc.so.6^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
>> continue^M
>> Continuing.^M
>> process 60484 is executing new program: catch-syscall^M
>> ^M
>> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
>> 54              char buf1[2] = "a";^M
>> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
>> ...
>>
>> The problem is that the catchpoint "(return from syscall execve)" doesn't
>> trigger.
>>
>> This is caused by ppc_linux_get_syscall_number returning 0 at execve
>> syscall-exit-stop, while it should return 11.
>>
>> This is a problem that was fixed in linux kernel version v5.19, by commit
>> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
>> machine I'm running the tests on has v4.18.0.
>>
>> An approach was discussed in the PR where ppc_linux_get_syscall_number would
>> try to detect an execve syscall-exit-stop based on the register state, but
>> that was considered too fragile.
>>
>> Fix this by caching the syscall number at syscall-enter-stop, and reusing it
>> at syscall-exit-stop.
>>
>> This is sufficient to stop triggering the xfail, so remove it.
>>
>> It's good to point out that this doesn't always eliminate the need to get the
>> syscall number at a syscall-exit-stop.
>>
>> The test-case has an example called mid-vfork, where we do:
>> - catch vfork
>> - continue
>> - catch syscall
>> - continue.
>>
>> The following things happen:
>> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
>> - the first continue runs into the event
>> - the "catch syscall" specifies that we capture syscall-enter-stop and
>>    syscall-exit-stop events.
>> - the second continue runs into the syscall-exit-stop.  At that point there's
>>    no syscall number value cached, because no corresponding syscall-enter-stop
>>    was observed.
> 
> Thanks for this example, it answers a question I had in the PR.
> 
>> We can address this issue somewhat by translating events into syscalls.  A
>> followup patch in this series use this approach (though not for vfork).
>>
>> This is an RFC at this point.
>>
>> I think there's an open issue with this patch: the cache needs to be
>> invalidated when we stop tracking syscalls.  I wonder if a generation_counter
>> scheme would be a good approach here.
>>
>> Perhaps we can do a per-thread approach where when continuing a thread we
>> reset the cached value unless PTRACE_SYSCALL is used to continue the thread.
> 
> I think that last idea makes sense.  I am not sure I undertsand the
> generation_counter idea.

Regarding the generation counter, my understanding is that the native Linux
target never disables syscall tracing once it is enabled, or at least that is
what the comment in linux-nat.c implies to me:

int
linux_nat_target::set_syscall_catchpoint (int pid, bool needed, int any_count,
					  gdb::array_view<const int> syscall_counts)
{
   /* On GNU/Linux, we ignore the arguments.  It means that we only
      enable the syscall catchpoints, but do not disable them.

      Also, we do not use the `syscall_counts' information because we do not
      filter system calls here.  We let GDB do the logic for us.  */
   return 0;
}

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
  2023-11-21  0:29   ` John Baldwin
@ 2023-11-21  2:43   ` Simon Marchi
  2023-11-21 11:08   ` Tom de Vries
  2 siblings, 0 replies; 13+ messages in thread
From: Simon Marchi @ 2023-11-21  2:43 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches



On 2023-11-20 11:12, Simon Marchi wrote:
> I like this change, since (if we do things right, i.e. invalidate
> syscall_number at the right plcaes), lp->syscall_number is more
> trustworthy than gdbarch_get_syscall_number.
> 
> But, this overlaps a bit with your other patch "Use
> PTRACE_GET_SYSCALL_INFO for syscall number".  I'm not sure how you would
> reconcile the two patches, but I think the general approach (since we
> don't have a one size fits all solution) should be to try the most
> trustworthy things first.  So that would be:
> 
>  1. PTRACE_GET_SYSCALL_INFO, if available
>  2. lp->syscall, if available
>  3. gdbarch_get_syscall_number
> 
> In that other patch, you wrap the gdbarch_get_syscall_number call to try
> PTRACE_GET_SYSCALL_INFO before gdbarch_get_syscall_number, so depending
> on how the two patches are merged, the code might not match the order
> above.

So I hadn't read your commit message fully for that other patch, but
John' reply made me realize that PTRACE_GET_SYSCALL_INFO doesn't give
the syscall number for exit... that's a bummer.

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21  0:29   ` John Baldwin
@ 2023-11-21 10:26     ` Tom de Vries
  2023-11-21 17:54       ` John Baldwin
  0 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2023-11-21 10:26 UTC (permalink / raw)
  To: John Baldwin, Simon Marchi, gdb-patches

On 11/21/23 01:29, John Baldwin wrote:
> On 11/20/23 8:12 AM, Simon Marchi wrote:
>> On 11/20/23 10:37, Tom de Vries wrote:
>>> When running test-case gdb.base/catch-syscall.exp on 
>>> powerpc64le-linux, we run
>>> into an xfail:
>>> ...
>>> (gdb) catch syscall execve^M
>>> Catchpoint 18 (syscall 'execve' [11])^M
>>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>>>    catch syscall with arguments (execve)
>>>    ...
>>> continue^M
>>> Continuing.^M
>>> ^M
>>> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve 
>>> () from \
>>>    /lib64/libc.so.6^M
>>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called 
>>> execve
>>> continue^M
>>> Continuing.^M
>>> process 60484 is executing new program: catch-syscall^M
>>> ^M
>>> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at 
>>> catch-syscall.c:54^M
>>> 54              char buf1[2] = "a";^M
>>> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has 
>>> returned
>>> ...
>>>
>>> The problem is that the catchpoint "(return from syscall execve)" 
>>> doesn't
>>> trigger.
>>>
>>> This is caused by ppc_linux_get_syscall_number returning 0 at execve
>>> syscall-exit-stop, while it should return 11.
>>>
>>> This is a problem that was fixed in linux kernel version v5.19, by 
>>> commit
>>> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
>>> machine I'm running the tests on has v4.18.0.
>>>
>>> An approach was discussed in the PR where 
>>> ppc_linux_get_syscall_number would
>>> try to detect an execve syscall-exit-stop based on the register 
>>> state, but
>>> that was considered too fragile.
>>>
>>> Fix this by caching the syscall number at syscall-enter-stop, and 
>>> reusing it
>>> at syscall-exit-stop.
>>>
>>> This is sufficient to stop triggering the xfail, so remove it.
>>>
>>> It's good to point out that this doesn't always eliminate the need to 
>>> get the
>>> syscall number at a syscall-exit-stop.
>>>
>>> The test-case has an example called mid-vfork, where we do:
>>> - catch vfork
>>> - continue
>>> - catch syscall
>>> - continue.
>>>
>>> The following things happen:
>>> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK 
>>> event.
>>> - the first continue runs into the event
>>> - the "catch syscall" specifies that we capture syscall-enter-stop and
>>>    syscall-exit-stop events.
>>> - the second continue runs into the syscall-exit-stop.  At that point 
>>> there's
>>>    no syscall number value cached, because no corresponding 
>>> syscall-enter-stop
>>>    was observed.
>>
>> Thanks for this example, it answers a question I had in the PR.
>>
>>> We can address this issue somewhat by translating events into 
>>> syscalls.  A
>>> followup patch in this series use this approach (though not for vfork).
>>>
>>> This is an RFC at this point.
>>>
>>> I think there's an open issue with this patch: the cache needs to be
>>> invalidated when we stop tracking syscalls.  I wonder if a 
>>> generation_counter
>>> scheme would be a good approach here.
>>>
>>> Perhaps we can do a per-thread approach where when continuing a 
>>> thread we
>>> reset the cached value unless PTRACE_SYSCALL is used to continue the 
>>> thread.
>>
>> I think that last idea makes sense.  I am not sure I undertsand the
>> generation_counter idea.
> 
> Regarding the generation counter, my understanding is that the native Linux
> target never disables syscall tracing once it is enabled, or at least 
> that is
> what the comment in linux-nat.c implies to me:
> 
> int
> linux_nat_target::set_syscall_catchpoint (int pid, bool needed, int 
> any_count,
>                        gdb::array_view<const int> syscall_counts)
> {
>    /* On GNU/Linux, we ignore the arguments.  It means that we only
>       enable the syscall catchpoints, but do not disable them.
> 
>       Also, we do not use the `syscall_counts' information because we do 
> not
>       filter system calls here.  We let GDB do the logic for us.  */
>    return 0;
> }
> 

I made a simple example, with gdb.in:
...
catch syscall
continue
delete breakpoints
continue
...

In this case, I run into one PTRACE_SYSCALL (observed using strace).

If I comment out "delete breakpoint", I run into two.

So I think it's possible to disable syscall tracing.

AFAIU the logic is in inf_ptrace_target::resume, were we choose between 
PT_SYSCALL and PT_CONTINUE.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
  2023-11-21  0:29   ` John Baldwin
  2023-11-21  2:43   ` Simon Marchi
@ 2023-11-21 11:08   ` Tom de Vries
  2023-11-21 18:03     ` John Baldwin
  2 siblings, 1 reply; 13+ messages in thread
From: Tom de Vries @ 2023-11-21 11:08 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 11/20/23 17:12, Simon Marchi wrote:
> On 11/20/23 10:37, Tom de Vries wrote:
>> When running test-case gdb.base/catch-syscall.exp on powerpc64le-linux, we run
>> into an xfail:
>> ...
>> (gdb) catch syscall execve^M
>> Catchpoint 18 (syscall 'execve' [11])^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>>    catch syscall with arguments (execve)
>>    ...
>> continue^M
>> Continuing.^M
>> ^M
>> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve () from \
>>    /lib64/libc.so.6^M
>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called execve
>> continue^M
>> Continuing.^M
>> process 60484 is executing new program: catch-syscall^M
>> ^M
>> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at catch-syscall.c:54^M
>> 54              char buf1[2] = "a";^M
>> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has returned
>> ...
>>
>> The problem is that the catchpoint "(return from syscall execve)" doesn't
>> trigger.
>>
>> This is caused by ppc_linux_get_syscall_number returning 0 at execve
>> syscall-exit-stop, while it should return 11.
>>
>> This is a problem that was fixed in linux kernel version v5.19, by commit
>> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
>> machine I'm running the tests on has v4.18.0.
>>
>> An approach was discussed in the PR where ppc_linux_get_syscall_number would
>> try to detect an execve syscall-exit-stop based on the register state, but
>> that was considered too fragile.
>>
>> Fix this by caching the syscall number at syscall-enter-stop, and reusing it
>> at syscall-exit-stop.
>>
>> This is sufficient to stop triggering the xfail, so remove it.
>>
>> It's good to point out that this doesn't always eliminate the need to get the
>> syscall number at a syscall-exit-stop.
>>
>> The test-case has an example called mid-vfork, where we do:
>> - catch vfork
>> - continue
>> - catch syscall
>> - continue.
>>
>> The following things happen:
>> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK event.
>> - the first continue runs into the event
>> - the "catch syscall" specifies that we capture syscall-enter-stop and
>>    syscall-exit-stop events.
>> - the second continue runs into the syscall-exit-stop.  At that point there's
>>    no syscall number value cached, because no corresponding syscall-enter-stop
>>    was observed.
> 
> Thanks for this example, it answers a question I had in the PR.
> 

Great.  It doesn't answer your question related to whether this can 
happen during attaching, but I've mentioned it in the kernel feature 
request ( https://bugzilla.kernel.org/show_bug.cgi?id=218170 ), so 
perhaps we'll get some feedback there.

>> We can address this issue somewhat by translating events into syscalls.  A
>> followup patch in this series use this approach (though not for vfork).
>>
>> This is an RFC at this point.
>>
>> I think there's an open issue with this patch: the cache needs to be
>> invalidated when we stop tracking syscalls.  I wonder if a generation_counter
>> scheme would be a good approach here.
>>
>> Perhaps we can do a per-thread approach where when continuing a thread we
>> reset the cached value unless PTRACE_SYSCALL is used to continue the thread.
> 
> I think that last idea makes sense.  I am not sure I undertsand the
> generation_counter idea.
> 

Well, my first idea was to hook into disabling syscall tracing, and then 
find all threads in the inferior and reset the cached value.  But I was 
not sure how to do this (I'm sure it's trivial, I just couldn't find the 
right code examples), so I figured it would be easier to implement 
incrementing a generation counter that is incremented when disabling 
syscall tracing, and store the generation alongside the cached value, 
allowing us to check whether the cached value is current.

Anyway, I'll try the per-thread approach.

>>
>> PR tdep/28623
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
>> ---
>>   gdb/linux-nat.c                          | 35 +++++++++++++++++++++---
>>   gdb/linux-nat.h                          |  3 ++
>>   gdb/testsuite/gdb.base/catch-syscall.exp |  8 +-----
>>   3 files changed, 35 insertions(+), 11 deletions(-)
>>
>> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
>> index d3e9560c2fc..efc5053cf85 100644
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -1762,7 +1762,25 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
>>     struct target_waitstatus *ourstatus = &lp->waitstatus;
>>     struct gdbarch *gdbarch = target_thread_architecture (lp->ptid);
>>     thread_info *thread = linux_target->find_thread (lp->ptid);
>> -  int syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
>> +
>> +  enum target_waitkind new_syscall_state
>> +    = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
>> +       ? TARGET_WAITKIND_SYSCALL_RETURN
>> +       : TARGET_WAITKIND_SYSCALL_ENTRY);
>> +
>> +  int syscall_number;
>> +  if (new_syscall_state == TARGET_WAITKIND_SYSCALL_RETURN
>> +      && lp->syscall_number != -1)
>> +    {
>> +      /* Calling gdbarch_get_syscall_number for TARGET_WAITKIND_SYSCALL_RETURN
>> +	 is unreliable on some targets for some syscalls, use the syscall
>> +	 detected at TARGET_WAITKIND_SYSCALL_ENTRY instead.  */
>> +      syscall_number = lp->syscall_number;
>> +    }
>> +  else
>> +    {
>> +      syscall_number = (int) gdbarch_get_syscall_number (gdbarch, thread);
> 
> (unnecessary braces)
> 

Ack, will fix.

> I like this change, since (if we do things right, i.e. invalidate
> syscall_number at the right plcaes), lp->syscall_number is more
> trustworthy than gdbarch_get_syscall_number.
> 

Yes, that is also my hope.

> But, this overlaps a bit with your other patch "Use
> PTRACE_GET_SYSCALL_INFO for syscall number".  I'm not sure how you would
> reconcile the two patches, 

Me neither yet.

> but I think the general approach (since we
> don't have a one size fits all solution) should be to try the most
> trustworthy things first.  So that would be:
> 
>   1. PTRACE_GET_SYSCALL_INFO, if available
>   2. lp->syscall, if available
>   3. gdbarch_get_syscall_number
> 

I was browsing kernel commit logs and also came across a bug in 
PTRACE_GET_SYSCALL_INFO: commit 85cc91e2ba42 ("mips: fix syscall_get_nr").

So I guess YMMV depending on kernel version and architecture.

> In that other patch, you wrap the gdbarch_get_syscall_number call to try
> PTRACE_GET_SYSCALL_INFO before gdbarch_get_syscall_number, so depending
> on how the two patches are merged, the code might not match the order
> above.
> 

True, that remains to be handled.  My expectation is that the 
PTRACE_GET_SYSCALL_INFO will go in first, since it's fairly 
non-controversial, and then the priority question will have to be 
handled in this series.

> And if gdbarch_get_syscall_number could somehow detect when it can't
> provide an answer, that would be really nice.
> 

Yes, though Ulrich already indicated that he prefers a solution that 
doesn't rely on this.

> Side-note: I think that either syscall_number should be made a LONGEST,
> or gdbarch_get_syscall_number should be changed to return an int.
> There's no point in returning a LONGEST and then truncating it.

Perhaps, though I'd rather leave that issue for a separate patch.

Thanks,
- Tom
	

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21 10:26     ` Tom de Vries
@ 2023-11-21 17:54       ` John Baldwin
  0 siblings, 0 replies; 13+ messages in thread
From: John Baldwin @ 2023-11-21 17:54 UTC (permalink / raw)
  To: Tom de Vries, Simon Marchi, gdb-patches

On 11/21/23 2:26 AM, Tom de Vries wrote:
> On 11/21/23 01:29, John Baldwin wrote:
>> On 11/20/23 8:12 AM, Simon Marchi wrote:
>>> On 11/20/23 10:37, Tom de Vries wrote:
>>>> When running test-case gdb.base/catch-syscall.exp on
>>>> powerpc64le-linux, we run
>>>> into an xfail:
>>>> ...
>>>> (gdb) catch syscall execve^M
>>>> Catchpoint 18 (syscall 'execve' [11])^M
>>>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: \
>>>>     catch syscall with arguments (execve)
>>>>     ...
>>>> continue^M
>>>> Continuing.^M
>>>> ^M
>>>> Catchpoint 18 (call to syscall execve), 0x00007ffff7d7f18c in execve
>>>> () from \
>>>>     /lib64/libc.so.6^M
>>>> (gdb) PASS: gdb.base/catch-syscall.exp: execve: program has called
>>>> execve
>>>> continue^M
>>>> Continuing.^M
>>>> process 60484 is executing new program: catch-syscall^M
>>>> ^M
>>>> Breakpoint 17, main (argc=1, argv=0x7fffffffe618) at
>>>> catch-syscall.c:54^M
>>>> 54              char buf1[2] = "a";^M
>>>> (gdb) XFAIL: gdb.base/catch-syscall.exp: execve: syscall execve has
>>>> returned
>>>> ...
>>>>
>>>> The problem is that the catchpoint "(return from syscall execve)"
>>>> doesn't
>>>> trigger.
>>>>
>>>> This is caused by ppc_linux_get_syscall_number returning 0 at execve
>>>> syscall-exit-stop, while it should return 11.
>>>>
>>>> This is a problem that was fixed in linux kernel version v5.19, by
>>>> commit
>>>> ec6d0dde71d7 ("powerpc: Enable execve syscall exit tracepoint"), but the
>>>> machine I'm running the tests on has v4.18.0.
>>>>
>>>> An approach was discussed in the PR where
>>>> ppc_linux_get_syscall_number would
>>>> try to detect an execve syscall-exit-stop based on the register
>>>> state, but
>>>> that was considered too fragile.
>>>>
>>>> Fix this by caching the syscall number at syscall-enter-stop, and
>>>> reusing it
>>>> at syscall-exit-stop.
>>>>
>>>> This is sufficient to stop triggering the xfail, so remove it.
>>>>
>>>> It's good to point out that this doesn't always eliminate the need to
>>>> get the
>>>> syscall number at a syscall-exit-stop.
>>>>
>>>> The test-case has an example called mid-vfork, where we do:
>>>> - catch vfork
>>>> - continue
>>>> - catch syscall
>>>> - continue.
>>>>
>>>> The following things happen:
>>>> - the "catch vfork" specifies that we capture the PTRACE_EVENT_VFORK
>>>> event.
>>>> - the first continue runs into the event
>>>> - the "catch syscall" specifies that we capture syscall-enter-stop and
>>>>     syscall-exit-stop events.
>>>> - the second continue runs into the syscall-exit-stop.  At that point
>>>> there's
>>>>     no syscall number value cached, because no corresponding
>>>> syscall-enter-stop
>>>>     was observed.
>>>
>>> Thanks for this example, it answers a question I had in the PR.
>>>
>>>> We can address this issue somewhat by translating events into
>>>> syscalls.  A
>>>> followup patch in this series use this approach (though not for vfork).
>>>>
>>>> This is an RFC at this point.
>>>>
>>>> I think there's an open issue with this patch: the cache needs to be
>>>> invalidated when we stop tracking syscalls.  I wonder if a
>>>> generation_counter
>>>> scheme would be a good approach here.
>>>>
>>>> Perhaps we can do a per-thread approach where when continuing a
>>>> thread we
>>>> reset the cached value unless PTRACE_SYSCALL is used to continue the
>>>> thread.
>>>
>>> I think that last idea makes sense.  I am not sure I undertsand the
>>> generation_counter idea.
>>
>> Regarding the generation counter, my understanding is that the native Linux
>> target never disables syscall tracing once it is enabled, or at least
>> that is
>> what the comment in linux-nat.c implies to me:
>>
>> int
>> linux_nat_target::set_syscall_catchpoint (int pid, bool needed, int
>> any_count,
>>                         gdb::array_view<const int> syscall_counts)
>> {
>>     /* On GNU/Linux, we ignore the arguments.  It means that we only
>>        enable the syscall catchpoints, but do not disable them.
>>
>>        Also, we do not use the `syscall_counts' information because we do
>> not
>>        filter system calls here.  We let GDB do the logic for us.  */
>>     return 0;
>> }
>>
> 
> I made a simple example, with gdb.in:
> ...
> catch syscall
> continue
> delete breakpoints
> continue
> ...
> 
> In this case, I run into one PTRACE_SYSCALL (observed using strace).
> 
> If I comment out "delete breakpoint", I run into two.
> 
> So I think it's possible to disable syscall tracing.
> 
> AFAIU the logic is in inf_ptrace_target::resume, were we choose between
> PT_SYSCALL and PT_CONTINUE.

Ah, that might be true.  On the BSD's PT_SYSCALL is "sticky" so that even if you
do PT_CONTINUE, you continue to get syscall events once PT_SYSCALL has been used
at least once.  It might be worth clarifying the above comment in linux-nat.c
though to be clearer about how syscall tracing works.

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21 11:08   ` Tom de Vries
@ 2023-11-21 18:03     ` John Baldwin
  2023-11-21 21:19       ` Simon Marchi
  0 siblings, 1 reply; 13+ messages in thread
From: John Baldwin @ 2023-11-21 18:03 UTC (permalink / raw)
  To: Tom de Vries, Simon Marchi, gdb-patches

On 11/21/23 3:08 AM, Tom de Vries wrote:
> On 11/20/23 17:12, Simon Marchi wrote:
>> On 11/20/23 10:37, Tom de Vries wrote:
>>> I think there's an open issue with this patch: the cache needs to be
>>> invalidated when we stop tracking syscalls.  I wonder if a generation_counter
>>> scheme would be a good approach here.
>>>
>>> Perhaps we can do a per-thread approach where when continuing a thread we
>>> reset the cached value unless PTRACE_SYSCALL is used to continue the thread.
>>
>> I think that last idea makes sense.  I am not sure I undertsand the
>> generation_counter idea.
>>
> 
> Well, my first idea was to hook into disabling syscall tracing, and then
> find all threads in the inferior and reset the cached value.  But I was
> not sure how to do this (I'm sure it's trivial, I just couldn't find the
> right code examples), so I figured it would be easier to implement
> incrementing a generation counter that is incremented when disabling
> syscall tracing, and store the generation alongside the cached value,
> allowing us to check whether the cached value is current.
> 
> Anyway, I'll try the per-thread approach.

I think you can use linux_nat_target::set_syscall_catchpoint to do the
invalidation.  From my reading it is called each time a syscall catchpoint
is added or removed, and needed is set to false if there are no more
catchpoints (so syscall catches are disabled).   I think that means
you could do something like:

linux_nat_target::set_syscall_catchpoint ()
{
   if (!needed)
     iterate_over_lwps (ptid_t (pid), [] (struct lwp_info *lp)
                        {
                          lp->syscall_number = -1;
                        });
}

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21 18:03     ` John Baldwin
@ 2023-11-21 21:19       ` Simon Marchi
  2023-11-21 22:17         ` John Baldwin
  0 siblings, 1 reply; 13+ messages in thread
From: Simon Marchi @ 2023-11-21 21:19 UTC (permalink / raw)
  To: John Baldwin, Tom de Vries, gdb-patches

On 11/21/23 13:03, John Baldwin wrote:
> I think you can use linux_nat_target::set_syscall_catchpoint to do the
> invalidation.  From my reading it is called each time a syscall catchpoint
> is added or removed, and needed is set to false if there are no more
> catchpoints (so syscall catches are disabled).   I think that means
> you could do something like:
> 
> linux_nat_target::set_syscall_catchpoint ()
> {
>   if (!needed)
>     iterate_over_lwps (ptid_t (pid), [] (struct lwp_info *lp)
>                        {
>                          lp->syscall_number = -1;
>                        });
> }

What about this corner case of a corner case:

 1. you have a syscall catchpoint, you hit it (the syscall entry)
 2. you delete the syscall catchpoint
 3. you create a syscall catchpoint again
 4. you continue, expecting to hit the syscall exit

Will step 2 cause set_syscall_catchpoint to be called, and will we
therefore lose the information that would have been useful at step 4?

Simon

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21 21:19       ` Simon Marchi
@ 2023-11-21 22:17         ` John Baldwin
  2023-11-22 12:58           ` Tom de Vries
  0 siblings, 1 reply; 13+ messages in thread
From: John Baldwin @ 2023-11-21 22:17 UTC (permalink / raw)
  To: Simon Marchi, Tom de Vries, gdb-patches

On 11/21/23 1:19 PM, Simon Marchi wrote:
> On 11/21/23 13:03, John Baldwin wrote:
>> I think you can use linux_nat_target::set_syscall_catchpoint to do the
>> invalidation.  From my reading it is called each time a syscall catchpoint
>> is added or removed, and needed is set to false if there are no more
>> catchpoints (so syscall catches are disabled).   I think that means
>> you could do something like:
>>
>> linux_nat_target::set_syscall_catchpoint ()
>> {
>>    if (!needed)
>>      iterate_over_lwps (ptid_t (pid), [] (struct lwp_info *lp)
>>                         {
>>                           lp->syscall_number = -1;
>>                         });
>> }
> 
> What about this corner case of a corner case:
> 
>   1. you have a syscall catchpoint, you hit it (the syscall entry)
>   2. you delete the syscall catchpoint
>   3. you create a syscall catchpoint again
>   4. you continue, expecting to hit the syscall exit
> 
> Will step 2 cause set_syscall_catchpoint to be called, and will we
> therefore lose the information that would have been useful at step 4?

Yes, but I think the generation counter idea is subject to the same issue
unless in both cases you actually defer the geneation bump (or clearing
syscall_number to -1) until you resume the lwp.  For that approach
perhaps you could reset syscall_number to -1 in linux_resume_one_lwp_throw
if catch_syscalls_enabled is zero?

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often
  2023-11-21 22:17         ` John Baldwin
@ 2023-11-22 12:58           ` Tom de Vries
  0 siblings, 0 replies; 13+ messages in thread
From: Tom de Vries @ 2023-11-22 12:58 UTC (permalink / raw)
  To: John Baldwin, Simon Marchi, gdb-patches

On 11/21/23 23:17, John Baldwin wrote:
> On 11/21/23 1:19 PM, Simon Marchi wrote:
>> On 11/21/23 13:03, John Baldwin wrote:
>>> I think you can use linux_nat_target::set_syscall_catchpoint to do the
>>> invalidation.  From my reading it is called each time a syscall 
>>> catchpoint
>>> is added or removed, and needed is set to false if there are no more
>>> catchpoints (so syscall catches are disabled).   I think that means
>>> you could do something like:
>>>
>>> linux_nat_target::set_syscall_catchpoint ()
>>> {
>>>    if (!needed)
>>>      iterate_over_lwps (ptid_t (pid), [] (struct lwp_info *lp)
>>>                         {
>>>                           lp->syscall_number = -1;
>>>                         });
>>> }
>>
>> What about this corner case of a corner case:
>>
>>   1. you have a syscall catchpoint, you hit it (the syscall entry)
>>   2. you delete the syscall catchpoint
>>   3. you create a syscall catchpoint again
>>   4. you continue, expecting to hit the syscall exit
>>
>> Will step 2 cause set_syscall_catchpoint to be called, and will we
>> therefore lose the information that would have been useful at step 4?
> 
> Yes, but I think the generation counter idea is subject to the same issue
> unless in both cases you actually defer the geneation bump (or clearing
> syscall_number to -1) until you resume the lwp.  For that approach
> perhaps you could reset syscall_number to -1 in linux_resume_one_lwp_throw
> if catch_syscalls_enabled is zero?
> 

Thanks for the suggestion, but I went with a solution in 
linux_resume_one_lwp_throw ( 
https://sourceware.org/pipermail/gdb-patches/2023-November/204407.html ) .

AFAIU it should handle the corner case Simon mentions.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-11-22 12:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-20 15:37 [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Tom de Vries
2023-11-20 15:37 ` [RFC 2/3] [gdb/tdep] Add gdbarch_extended_event_to_syscall Tom de Vries
2023-11-20 15:37 ` [RFC 3/3] [gdb] Use ptrace events to get current syscall Tom de Vries
2023-11-20 16:12 ` [RFC 1/3] [gdb] Call gdbarch_get_syscall_number less often Simon Marchi
2023-11-21  0:29   ` John Baldwin
2023-11-21 10:26     ` Tom de Vries
2023-11-21 17:54       ` John Baldwin
2023-11-21  2:43   ` Simon Marchi
2023-11-21 11:08   ` Tom de Vries
2023-11-21 18:03     ` John Baldwin
2023-11-21 21:19       ` Simon Marchi
2023-11-21 22:17         ` John Baldwin
2023-11-22 12:58           ` Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).