public inbox for glibc-bugs@sourceware.org
From: "wuxu.wu at huawei dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/26104] New forked process __reclaim_stacks endless loop
Date: Fri, 12 Jun 2020 02:38:24 +0000	[thread overview]
Message-ID: <bug-26104-131-MtOnukjK7D@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-26104-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=26104

--- Comment #3 from buque <wuxu.wu at huawei dot com> ---
Hi, your analysis is exactly what I was thinking. We have a device running
glibc 2.17 (CentOS 7.5) on which process B loops endlessly in lines 2-6
below, pinning a CPU core at 100%. At first it looked like a crash: the bad
address 0x11940177fe9c0 is really 0x7fd9177fe9c0, a pointer that the store
in line 5 damaged afterwards (self->pid = 72000). It is surprising that it
does not crash; I guess the damaged list still forms a ring. As you say, the
problem is very hard to reproduce: glibc 2.17 has been in use here for
several years and this has happened only once. I will try to reproduce it
with a white-box test in the next few days. I think it is hard to fix this
bug in a lock-free manner, since we cannot stop the concurrent reads and
writes, and those are what produce the intermediate state. Maybe you have a
better way.

(gdb) p 0x11940
$1 = 72000

1  /* Reset the PIDs in any cached stacks.  */
2  list_for_each (runp, &stack_cache)
3    {
4      struct pthread *curp = list_entry (runp, struct pthread, list);
5      curp->pid = self->pid;
6    }

Detaching from program: /usr/bin/sysmonitor, process 72000
[root@cn-north-4b-CloudDataCompassSurfer-010077236019 ~]#

(gdb) info r
rax            0x7fd9167fc9c0   140570362104256
rbx            0x7fd93f30f010   140571044802576
rcx            0x7fd9177fe9c0   140570378889664
rdx            0x11940          72000
rsi            0x7fd915ffb9c0   140570353711552
rdi            0x7fd93eeef5c0   140571040478656
rbp            0x7fd93f30f020   0x7fd93f30f020 <stack_cache>
rsp            0x7fd9357f9608   0x7fd9357f9608
r8             0x7fd93f30f010   140571044802576
r9             0x159a           5530
r10            0x7fd93eb23700   140571036497664
r11            0x7fd9357fa700   140570882189056
r12            0x0              0
r13            0x0              0
r14            0x7fd93f749000   140571049234432
r15            0x7fd9357f99e0   140570882185696
rip            0x7fd93f0ff3aa   0x7fd93f0ff3aa <__reclaim_stacks+538>
eflags         0x287            [ CF PF SF IF ]
cs             0x33             51
ss             0x2b             43
ds             0x0              0
es             0x0              0
fs             0x0              0
gs             0x0              0
(gdb) n
900           curp->pid = self->pid;
(gdb) p curp
$3 = (struct pthread *) 0x7fd916ffd700
(gdb) p stack_cache
$4 = {next = 0x11940177fe9c0, prev = 0x7fd9177fe9c0}
(gdb) p stack_cache.prev
$5 = (struct list_head *) 0x7fd9177fe9c0
(gdb) p stack_cache.prev->prev
$6 = (struct list_head *) 0x7fd93f30f020 <stack_cache>
(gdb) p curp
$7 = (struct pthread *) 0x7fd916ffd700
(gdb) i r
rax            0x7fd916ffd9c0   140570370496960
rbx            0x7fd93f30f010   140571044802576
rcx            0x7fd9177fe9c0   140570378889664
rdx            0x11940          72000
rsi            0x7fd915ffb9c0   140570353711552
rdi            0x7fd93eeef5c0   140571040478656
rbp            0x7fd93f30f020   0x7fd93f30f020 <stack_cache>
rsp            0x7fd9357f9608   0x7fd9357f9608
r8             0x7fd93f30f010   140571044802576
r9             0x159a           5530
r10            0x7fd93eb23700   140571036497664
r11            0x7fd9357fa700   140570882189056
r12            0x0              0
r13            0x0              0
r14            0x7fd93f749000   140571049234432
r15            0x7fd9357f99e0   140570882185696
rip            0x7fd93f0ff3a0   0x7fd93f0ff3a0 <__reclaim_stacks+528>
eflags         0x287            [ CF PF SF IF ]
cs             0x33             51
ss             0x2b             43
ds             0x0              0
es             0x0              0
fs             0x0              0
gs             0x0              0
(gdb) n
897       list_for_each (runp, &stack_cache)
(gdb) n
900           curp->pid = self->pid;
(gdb) p curp
$8 = (struct pthread *) 0x7fd90f7fe700
(gdb) n
897       list_for_each (runp, &stack_cache)
(gdb) n
900           curp->pid = self->pid;
(gdb) p curp
$9 = (struct pthread *) 0x7fd917fff700
(gdb) n
897       list_for_each (runp, &stack_cache)
(gdb) n
900           curp->pid = self->pid;
(gdb) p curp
$10 = (struct pthread *) 0x7fd934ff9700
(gdb) n
897       list_for_each (runp, &stack_cache)
(gdb)
900           curp->pid = self->pid;
(gdb) p curp
$11 = (struct pthread *) 0x7fd9357fa700
(gdb) p stack_cache
$12 = {next = 0x11940177fe9c0, prev = 0x7fd9177fe9c0}
(gdb) q
A debugging session is active.

	Inferior 1 [process 72000] will be detached.

(gdb) p stack_cache
$1 = {next = 0x11940177fe9c0, prev = 0x7fd9177fe9c0}
(gdb) p stack_cache.next
$2 = (struct list_head *) 0x11940177fe9c0
(gdb) p stack_cache.next->next
Cannot access memory at address 0x11940177fe9c0    // 0x11940177fe9c0 (0x7fd9177fe9c0)

--
You are receiving this mail because:
You are on the CC list for the bug.
Thread overview: 5+ messages
2020-06-10 11:20 [Bug nptl/26104] New: forked process __reclaim_stacks endless loop  wuxu.wu at huawei dot com
2020-06-11 17:57 ` [Bug nptl/26104] forked process __reclaim_stacks endless loop  carlos at redhat dot com
2020-06-11 17:58 ` carlos at redhat dot com
2020-06-12  2:38 ` wuxu.wu at huawei dot com  [this message]
2020-06-12  2:39 ` wuxu.wu at huawei dot com