[Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break

public inbox for gdb-prs@sourceware.org

* [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
@ 2020-07-22 19:08 vries at gcc dot gnu.org
  2020-07-22 19:08 ` [Bug threads/26286] " vries at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-07-22 19:08 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

            Bug ID: 26286
           Summary: FAIL: gdb.threads/attach-many-short-lived-threads.exp:
                    iter 1: break at break_fn: 1 (SIGTRAP)
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: threads
          Assignee: unassigned at sourceware dot org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

(gdb) PASS: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break
break_fn
continue
Continuing.
[LWP 13327 exited]
[LWP 13324 exited]
[LWP 13295 exited]
[LWP 13301 exited]
[LWP 13300 exited]
[LWP 13286 exited]
[LWP 13249 exited]
[LWP 13248 exited]
[LWP 13236 exited]
[LWP 13233 exited]
[LWP 13231 exited]
[LWP 13242 exited]
[LWP 13240 exited]
[LWP 13221 exited]
[LWP 13215 exited]
[LWP 13213 exited]
[LWP 13210 exited]
[LWP 13161 exited]
[LWP 13155 exited]
[LWP 13124 exited]
[LWP 13120 exited]
[LWP 13117 exited]
[LWP 13115 exited]
[LWP 13113 exited]
[LWP 13111 exited]
[LWP 13110 exited]
[LWP 13108 exited]
[LWP 13105 exited]
[LWP 13104 exited]
[LWP 13143 exited]
[LWP 13140 exited]
[LWP 13137 exited]
[LWP 13136 exited]
[LWP 13133 exited]
[LWP 13131 exited]
[LWP 13128 exited]
[LWP 13127 exited]
[LWP 13125 exited]
[LWP 13099 exited]
[LWP 13091 exited]
[LWP 13089 exited]
[LWP 13085 exited]
[LWP 13083 exited]
[LWP 13081 exited]
[LWP 13079 exited]
[LWP 13078 exited]
[LWP 13076 exited]
[LWP 13073 exited]
[LWP 13071 exited]
[LWP 13070 exited]
[LWP 13065 exited]
[LWP 12948 exited]
[LWP 12946 exited]
[LWP 12945 exited]
[LWP 12943 exited]
[LWP 12940 exited]
[LWP 12937 exited]
[LWP 12934 exited]
[LWP 12931 exited]
[LWP 12930 exited]
[LWP 12927 exited]
[LWP 12923 exited]
[LWP 12921 exited]
[LWP 12918 exited]
[LWP 12912 exited]
[LWP 12909 exited]
[LWP 12906 exited]
[LWP 12903 exited]
[LWP 12900 exited]
[LWP 12886 exited]
[LWP 12823 exited]
[LWP 12820 exited]
[LWP 12816 exited]
[LWP 12813 exited]
[LWP 12811 exited]
[LWP 12808 exited]
[LWP 12798 exited]
[LWP 12737 exited]
[LWP 12735 exited]
[LWP 12733 exited]
[LWP 12727 exited]
[LWP 12724 exited]
[LWP 12720 exited]
[LWP 12717 exited]
[LWP 12606 exited]
[LWP 12532 exited]
[LWP 12522 exited]
[LWP 12518 exited]
[LWP 12509 exited]
[LWP 12503 exited]
[LWP 12500 exited]
[LWP 12496 exited]
[LWP 12493 exited]
[LWP 12490 exited]
[LWP 12485 exited]
[LWP 12482 exited]
[LWP 12477 exited]
[LWP 12475 exited]
[LWP 12473 exited]
[LWP 12469 exited]
[LWP 12468 exited]
[LWP 12465 exited]
[LWP 12464 exited]
[LWP 12460 exited]
[LWP 12457 exited]
[LWP 12455 exited]
[LWP 12453 exited]
[LWP 12451 exited]
[LWP 12445 exited]
[LWP 12443 exited]
[LWP 12440 exited]
[LWP 12438 exited]
[LWP 12433 exited]
[LWP 12418 exited]
[LWP 12409 exited]
[LWP 12387 exited]
[LWP 12348 exited]
[LWP 12011 exited]

Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.
(gdb) FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at
break_fn: 1

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
@ 2020-07-22 19:08 ` vries at gcc dot gnu.org
  2020-07-27 14:17 ` vries at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-07-22 19:08 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 12718
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12718&action=edit
gdb.log


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
  2020-07-22 19:08 ` [Bug threads/26286] " vries at gcc dot gnu.org
@ 2020-07-27 14:17 ` vries at gcc dot gnu.org
  2020-08-31  7:54 ` ianchi at andestech dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-07-27 14:17 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
Reproduced today on master, so it's not a fluke.

FTR: on openSUSE Leap 15.2 laptop.


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
  2020-07-22 19:08 ` [Bug threads/26286] " vries at gcc dot gnu.org
  2020-07-27 14:17 ` vries at gcc dot gnu.org
@ 2020-08-31  7:54 ` ianchi at andestech dot com
  2020-08-31  8:30 ` vries at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: ianchi at andestech dot com @ 2020-08-31  7:54 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

Chungyi Chi <ianchi at andestech dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ianchi at andestech dot com

--- Comment #3 from Chungyi Chi <ianchi at andestech dot com> ---
This is not a bug but a security restriction. Because of ptrace protection,
attaching to another process without a "parent-child relationship" is not
permitted.

There are two ways to work around this:
1. Run as root.
2. Set "/proc/sys/kernel/yama/ptrace_scope" to 0.
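
To check which mode is in effect before changing anything, the setting can be
read programmatically. The following is only a minimal sketch: the function
names parse_ptrace_scope and read_ptrace_scope are hypothetical, and it assumes
the file holds a single digit from 0 to 3, as the kernel's Yama documentation
describes. A return value of -1 means the file is missing or unreadable (for
example because Yama is not built in) or its contents were not understood.

```cpp
#include <fstream>
#include <string>

// Parse the text of a ptrace_scope file such as "0\n" or "1".
// Returns the mode (0..3), or -1 if the text is not a single digit 0..3.
int
parse_ptrace_scope (std::string text)
{
  // Strip the trailing newline or spaces left over from reading the file.
  while (!text.empty () && (text.back () == '\n' || text.back () == ' '))
    text.pop_back ();
  if (text.size () != 1 || text[0] < '0' || text[0] > '3')
    return -1;
  return text[0] - '0';
}

// Read the current setting from PATH.  Returns -1 when the file cannot be
// opened or read, which is the case on kernels without Yama.
int
read_ptrace_scope (const std::string &path
                   = "/proc/sys/kernel/yama/ptrace_scope")
{
  std::ifstream in (path);
  std::string text;
  if (!in || !std::getline (in, text))
    return -1;
  return parse_ptrace_scope (text);
}
```

A result of 0 corresponds to the classic ptrace permissions mentioned in
option 2 above.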


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-08-31  7:54 ` ianchi at andestech dot com
@ 2020-08-31  8:30 ` vries at gcc dot gnu.org
  2024-03-19 17:44 ` thiago.bauermann at linaro dot org
  2024-05-17 18:08 ` pedro at palves dot net
  5 siblings, 0 replies; 7+ messages in thread
From: vries at gcc dot gnu.org @ 2020-08-31  8:30 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

--- Comment #4 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Chungyi Chi from comment #3)
> This is not a bug but a security restriction. Because of ptrace protection,
> attaching to another process without a "parent-child relationship" is not
> permitted.
> 
> There are two ways to work around this:
> 1. Run as root.
> 2. Set "/proc/sys/kernel/yama/ptrace_scope" to 0.

On my system, there's no Yama:
...
$ cat /sys/kernel/security/lsm 
lockdown,capability,apparmor
$
...

Also, I don't understand how Yama would cause the specific failure reported in
this PR.  If Yama were active, wouldn't things fail much earlier, and in many
more tests?
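
For anyone who wants to script the same check, here is a small sketch that
tests whether a given module name appears in the comma-separated list shown
above. The helper name lsm_list_contains is hypothetical, and the sketch
assumes the list has the same format as the contents of
/sys/kernel/security/lsm quoted in this comment.

```cpp
#include <sstream>
#include <string>

// Return true if NAME appears as a whole entry in LSM_LIST, a
// comma-separated list such as "lockdown,capability,apparmor".
bool
lsm_list_contains (const std::string &lsm_list, const std::string &name)
{
  std::istringstream in (lsm_list);
  std::string entry;
  while (std::getline (in, entry, ','))
    {
      // Drop a trailing newline or space on the last entry.
      while (!entry.empty ()
             && (entry.back () == '\n' || entry.back () == ' '))
        entry.pop_back ();
      if (entry == name)
        return true;
    }
  return false;
}
```

With the list from this system, lsm_list_contains (list, "yama") is false,
which matches the observation that Yama is not active here.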


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2020-08-31  8:30 ` vries at gcc dot gnu.org
@ 2024-03-19 17:44 ` thiago.bauermann at linaro dot org
  2024-05-17 18:08 ` pedro at palves dot net
  5 siblings, 0 replies; 7+ messages in thread
From: thiago.bauermann at linaro dot org @ 2024-03-19 17:44 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

Thiago Jung Bauermann <thiago.bauermann at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thiago.bauermann at linaro dot org

--- Comment #5 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
I also encountered this issue with current master branch on 3 machines: two
x86_64-linux and one aarch64-linux. Carl Love also reported in bug #31312
that he encountered the issue on a powerpc64le-linux system. The aarch64
and powerpc64le machines had a patch to fix bug #31312 applied.

In all cases it's necessary to keep running
attach-many-short-lived-threads.exp in a loop to reproduce the problem. On
one of the x86_64-linux machines it takes anywhere from 30 to 500
iterations to hit the problem, while on the other it took between 120 and
900 iterations. The aarch64-linux machine took ~2500 iterations. On
powerpc64le-linux, the problem happened in 3 iterations out of 500.

I do have Yama present in those machines, but it is disabled in all of
them:

  $ sysctl kernel.yama.ptrace_scope
  kernel.yama.ptrace_scope = 0

I also agree with Tom that if Yama were the problem, it would affect the
testcase in a different way.

The issue arises from this loop in linux_proc_attach_tgid_threads ():

  /* Scan the task list for existing threads.  While we go through the
     threads, new threads may be spawned.  Cycle through the list of
     threads until we have done two iterations without finding new
     threads.  */
  for (iterations = 0; iterations < 2; iterations++)
    {
      struct dirent *dp;

      new_threads_found = 0;
      while ((dp = readdir (dir.get ())) != NULL)
        {
          unsigned long lwp;

          /* Fetch one lwp.  */
          lwp = strtoul (dp->d_name, NULL, 10);
          if (lwp != 0)
            {
              ptid_t ptid = ptid_t (pid, lwp);

              if (attach_lwp (ptid))
                new_threads_found = 1;
            }
        }

      if (new_threads_found)
        {
          /* Start over.  */
          iterations = -1;
        }

      rewinddir (dir.get ());
    }

What happens is that two iterations without seeing new threads in
linux_proc_attach_tgid_threads () isn't always enough for GDB to know that
it has attached to all inferior threads. So sometimes after this function
returns, an unattached inferior thread trips on the breakpoint instruction
that GDB put in the inferior.
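
The race can be illustrated with a small stand-alone model. This is only a
sketch and not GDB code: attach_until_quiet and list_threads_fn are
hypothetical names, and the callback stands in for one pass over the
/proc/PID/task directory. The model stops after a configurable number of
consecutive passes that show no new thread, which is what the loop above does
with a count of two.

```cpp
#include <functional>
#include <set>
#include <vector>

// Hypothetical stand-in for one readdir pass: returns the thread IDs
// that are visible at the time of the call.
using list_threads_fn = std::function<std::vector<unsigned long> ()>;

// Keep listing threads until QUIET_PASSES consecutive passes show no
// thread that was not already recorded.  Returns the recorded set.
std::set<unsigned long>
attach_until_quiet (const list_threads_fn &list_threads,
                    int quiet_passes = 2)
{
  std::set<unsigned long> attached;
  int quiet = 0;
  while (quiet < quiet_passes)
    {
      bool new_threads_found = false;
      for (unsigned long lwp : list_threads ())
        if (attached.insert (lwp).second)
          new_threads_found = true;

      // A pass that found something new restarts the count.
      quiet = new_threads_found ? 0 : quiet + 1;
    }
  return attached;
}
```

If the callback reports threads 10 and 11 on its first three calls and only
reports a third thread from the fourth call onwards, the model with two quiet
passes returns after the third call without ever seeing the third thread. That
is the situation in which a thread is left unattached.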

I'm not sure I would consider this a bug. Rather, it's an issue that
arises from the way attach-many-short-lived-threads.c behaves: since it's
constantly creating new threads, it's impossible for GDB to know when it has
attached to all of them so that it can stop looking for new threads to
attach.

The only way I can see to improve GDB's behaviour is to increase the number
of iterations of the loop that checks for new threads.

I suspected that the inferior's ability to create new threads was
proportional to the number of CPUs in the system, so I was going to make
the number of iterations in linux_proc_attach_tgid_threads () proportional
to the number of CPUs. But on the machines I have at hand, the one where it
takes longest to reproduce the problem has the most CPUs (160, vs 8 CPUs on
the other machines). So maybe we just have to find a magical iteration
number that works well for everybody who can reproduce the issue?
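
To experiment with different iteration counts outside of GDB, something along
these lines could be used. It is a sketch under assumptions, not a proposed
patch: collect_task_ids is a hypothetical name, the quiet-pass count is a plain
parameter, and the function only records the numeric directory entries instead
of attaching to anything.

```cpp
#include <dirent.h>
#include <cstdlib>
#include <set>
#include <string>

// Collect the numeric entries of TASK_DIR (e.g. "/proc/1234/task"),
// rescanning until QUIET_PASSES consecutive passes find no new entry.
// Returns an empty set if the directory cannot be opened.
std::set<unsigned long>
collect_task_ids (const std::string &task_dir, int quiet_passes)
{
  std::set<unsigned long> seen;
  DIR *dir = opendir (task_dir.c_str ());
  if (dir == nullptr)
    return seen;

  int quiet = 0;
  while (quiet < quiet_passes)
    {
      bool new_found = false;
      while (struct dirent *dp = readdir (dir))
        {
          // Names such as "." and ".." parse as 0 and are skipped.
          unsigned long lwp = std::strtoul (dp->d_name, nullptr, 10);
          if (lwp != 0 && seen.insert (lwp).second)
            new_found = true;
        }
      quiet = new_found ? 0 : quiet + 1;
      rewinddir (dir);
    }

  closedir (dir);
  return seen;
}
```

Calling it with different values for quiet_passes against the task directory
of a process that spawns many short-lived threads would show how many distinct
IDs each setting observes. A larger count can only narrow the window, not
close it.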


* [Bug threads/26286] FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP)
  2020-07-22 19:08 [Bug threads/26286] New: FAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 1: break at break_fn: 1 (SIGTRAP) vries at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-03-19 17:44 ` thiago.bauermann at linaro dot org
@ 2024-05-17 18:08 ` pedro at palves dot net
  5 siblings, 0 replies; 7+ messages in thread
From: pedro at palves dot net @ 2024-05-17 18:08 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=26286

Pedro Alves <pedro at palves dot net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pedro at palves dot net

--- Comment #6 from Pedro Alves <pedro at palves dot net> ---
The theory was that if we attached to a parent thread just after it
forked/cloned a child, then we may not have seen the new child in the current
iteration, but should see it in the next.

I had assumed that when the clone syscall returns in the parent, the child
process is already created and listed.  Thus by the time we attach to the
parent, the child will be guaranteed to show up in the next /proc directory
listing.

But I suppose it may happen that the kernel takes a bit to actually create the
child, and thus gdb could iterate over the list twice _before_ the child is
created?  If so, maybe we can call that a kernel bug?

What other scenarios / races could break the "iterate twice" logic?

