public inbox for glibc-bugs@sourceware.org
* [Bug nptl/25765] New: Incorrect futex syscall in __pthread_disable_asynccancel for linux x86_64 leads to livelock
@ 2020-04-02 12:08 martin.lubich at gmx dot at
  2020-04-02 12:22 ` [Bug nptl/25765] " adhemerval.zanella at linaro dot org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: martin.lubich at gmx dot at @ 2020-04-02 12:08 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=25765

            Bug ID: 25765
           Summary: Incorrect futex syscall in
                    __pthread_disable_asynccancel for linux x86_64 leads
                    to livelock
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: martin.lubich at gmx dot at
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 12422
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12422&action=edit
Example code to reproduce and trigger the bug

There is a bug in the x86_64 specific implementation of
__pthread_disable_asynccancel.

When it detects an ongoing thread cancellation (CANCELING_BITMASK), the code
tries to block on a futex based on the cancelhandling member of the thread
structure.

The generic C code in nptl/cancellation.c does this correctly.

The x86_64-specific implementation in
sysdeps/unix/sysv/linux/x86_64/cancellation.S has an error in setting up the
futex syscall. The third parameter (the value against which the kernel futex
code checks the futex word) is passed in the edx register, but edx is never
set. It is therefore in an undefined state, and the futex call will typically
return immediately with EAGAIN. This leads to an endless loop.
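
To illustrate the futex behaviour described above: FUTEX_WAIT only blocks if
the futex word still equals the expected value passed as the third argument;
otherwise the kernel returns EAGAIN at once. The following is a minimal C
sketch, not glibc code. The helper name futex_wait_private is hypothetical and
exists only for this illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not a glibc function): block on *addr for as long
   as it still holds `expected`.  Returns 0 when woken, otherwise the errno
   value reported by the kernel.  A NULL timeout means "wait forever".  */
static int futex_wait_private(uint32_t *addr, uint32_t expected,
                              const struct timespec *timeout)
{
    long rc = syscall(SYS_futex, addr, FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
                      expected, timeout, NULL, 0);
    return rc == 0 ? 0 : errno;
}
```

Calling the helper with an expected value that differs from the current
contents of the word returns EAGAIN without sleeping, which is what happens
when the third syscall argument holds leftover register contents. Passing the
value that was just read from the word lets the kernel actually put the thread
to sleep.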

If the looping thread has a higher RT priority than the cancelling thread, the
loop will go on forever, consuming all available CPU cycles. With RT threads,
this will also cause complete system freezes.
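
The spinning can be modelled in user space. The sketch below is an
illustration under stated assumptions, not the glibc implementation: the bit
values are stand-ins for the real definitions in glibc's nptl headers, the
function count_immediate_returns is hypothetical, the loop is capped at
max_iters, and a 1 ms timeout is added so that the demonstration terminates
even though no other thread ever wakes the futex.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Illustrative values only; the real masks are defined by glibc.  */
#define CANCELING_BITMASK 0x04u
#define CANCELED_BITMASK  0x08u

/* Hypothetical model of the wait loop.  While the word says "canceling but
   not yet canceled", wait on it.  If pass_loaded_value is zero, a value
   that cannot match the word is passed instead, standing in for an
   uninitialised edx.  Returns how many waits came back at once with
   EAGAIN.  */
static unsigned count_immediate_returns(volatile uint32_t *word,
                                        int pass_loaded_value,
                                        unsigned max_iters)
{
    const struct timespec short_timeout = { 0, 1000000 }; /* 1 ms */
    unsigned immediate = 0;
    uint32_t val = *word;

    for (unsigned i = 0;
         i < max_iters
         && (val & (CANCELING_BITMASK | CANCELED_BITMASK)) == CANCELING_BITMASK;
         i++) {
        uint32_t expected = pass_loaded_value ? val : ~val;
        long rc = syscall(SYS_futex, word, FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
                          expected, &short_timeout, NULL, 0);
        if (rc == -1 && errno == EAGAIN)
            immediate++;
        val = *word; /* reload, as the loop in the report does */
    }
    return immediate;
}
```

With a mismatching expected value every iteration returns immediately, so the
thread never yields the CPU; with the loaded value each iteration sleeps in
the kernel instead.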

I have attached a simple test that shows the problem after some time.

The following patch fixes the problem.

The patch is based on glibc 2.27, but the bug is still present in the current
release, 2.31, as well as in the current development version.

--------------- snip ----------------------------

diff -Naur glibc-2.27/sysdeps/unix/sysv/linux/x86_64/cancellation.S glibc-2.27_patched/sysdeps/unix/sysv/linux/x86_64/cancellation.S
--- glibc-2.27/sysdeps/unix/sysv/linux/x86_64/cancellation.S    2018-02-01 17:17:18.000000000 +0100
+++ glibc-2.27_patched/sysdeps/unix/sysv/linux/x86_64/cancellation.S    2020-04-02 12:08:02.712851151 +0200
@@ -95,8 +95,8 @@
        cmpxchgl %r11d, %fs:CANCELHANDLING
        jnz     2b

-       movl    %r11d, %eax
-3:     andl    $(TCB_CANCELING_BITMASK|TCB_CANCELED_BITMASK), %eax
+3:     movl    %r11d, %eax
+       andl    $(TCB_CANCELING_BITMASK|TCB_CANCELED_BITMASK), %eax
        cmpl    $TCB_CANCELING_BITMASK, %eax
        je      4f
 1:     ret
@@ -104,12 +104,13 @@
        /* Performance doesn't matter in this loop.  We will
           delay until the thread is canceled.  And we will unlikely
           enter the loop twice.  */
-4:     mov     %fs:0, %RDI_LP
+4:      movl    %r11d, %edx
+        mov    %fs:0, %RDI_LP
        movl    $__NR_futex, %eax
        xorq    %r10, %r10
        addq    $CANCELHANDLING, %rdi
        LOAD_PRIVATE_FUTEX_WAIT (%esi)
        syscall
-       movl    %fs:CANCELHANDLING, %eax
+       movl    %fs:CANCELHANDLING, %edx
        jmp     3b
 END(__pthread_disable_asynccancel)

------------------- snip ---------------------------

This is a Linux x86_64-specific bug.


Thread overview: 6+ messages
2020-04-02 12:08 [Bug nptl/25765] New: Incorrect futex syscall in __pthread_disable_asynccancel for linux x86_64 leads to livelock martin.lubich at gmx dot at
2020-04-02 12:22 ` [Bug nptl/25765] " adhemerval.zanella at linaro dot org
2020-04-02 12:25 ` martin.lubich at gmx dot at
2020-04-02 12:27 ` adhemerval.zanella at linaro dot org
2020-04-03 14:17 ` cvs-commit at gcc dot gnu.org
2020-04-03 14:36 ` adhemerval.zanella at linaro dot org
