public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
@ 2022-04-05 21:32 aurelien at aurel32 dot net
  2022-04-05 22:28 ` [Bug nptl/29029] " adhemerval.zanella at linaro dot org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: aurelien at aurel32 dot net @ 2022-04-05 21:32 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

            Bug ID: 29029
           Summary: poll() spuriously returns EINTR during thread
                    cancellation and with cancellation disabled
           Product: glibc
           Version: 2.34
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: aurelien at aurel32 dot net
                CC: adhemerval.zanella at linaro dot org, drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 14049
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14049&action=edit
Reproducer

Rémi Denis-Courmont has reported a regression introduced in glibc 2.34, and I
am not sure if it is allowed by the standard or not. In short when thread
cancellation is disabled, calling pthread_setcancelstate() causes a spurious
EINTR error in poll() in the child.

It has been introduced by the following commit:
https://sourceware.org/git/?p=glibc.git;a=commit;h=26cfbb7162ad364d53d69f6d482f2d87b5950524

I have also attached the reproducer Rémi provided. More details can be found on
https://bugs.debian.org/1008174

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
@ 2022-04-05 22:28 ` adhemerval.zanella at linaro dot org
  2022-04-06 13:27 ` fweimer at redhat dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2022-04-05 22:28 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #1 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
I think what is happening is since there is no synchronization between
pthread_cancel and pthread pthread_setcancelstate, the internal SIGCANCEL is
sent to the created thread and since cancellation is not enabled, the signal
occurence is reported to application.

However it should not happen because we do install sigcancel_handler with
SA_RESTART to exactly avoid such issue.  From bf6749a7f87, it leads to
considere as a kernel issue.

As a side note, testing with a different implementation that aims to implement
pthread cancellation correctly (musl) it fails with exactly same issue.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
  2022-04-05 22:28 ` [Bug nptl/29029] " adhemerval.zanella at linaro dot org
@ 2022-04-06 13:27 ` fweimer at redhat dot com
  2022-04-06 13:31 ` fweimer at redhat dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: fweimer at redhat dot com @ 2022-04-06 13:27 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-
                 CC|                            |fweimer at redhat dot com

--- Comment #2 from Florian Weimer <fweimer at redhat dot com> ---
I agree that this looks like a kernel bug.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
  2022-04-05 22:28 ` [Bug nptl/29029] " adhemerval.zanella at linaro dot org
  2022-04-06 13:27 ` fweimer at redhat dot com
@ 2022-04-06 13:31 ` fweimer at redhat dot com
  2022-04-06 14:38 ` adhemerval.zanella at linaro dot org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: fweimer at redhat dot com @ 2022-04-06 13:31 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #3 from Florian Weimer <fweimer at redhat dot com> ---
Well, I spoke to soon, it's actually documented in signal(7):

“
       The following interfaces are never restarted after being
       interrupted by a signal handler, regardless of the use of
       SA_RESTART; they always fail with the error EINTR when
       interrupted by a signal handler:
”

<https://man7.org/linux/man-pages/man7/signal.7.html>

I think that explains why glibc used the old, more complex code: it may be
required for POSIX conformance.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (2 preceding siblings ...)
  2022-04-06 13:31 ` fweimer at redhat dot com
@ 2022-04-06 14:38 ` adhemerval.zanella at linaro dot org
  2022-04-06 14:41 ` fweimer at redhat dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2022-04-06 14:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #4 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
Maybe one option would be to block the SIGCANCEL on the function noted on
signal(7) that never restarted after being interrupted if the cancellation is
disabled, and then re-enable the signal after the function returns (on
SYSCALL_CANCEL).

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (3 preceding siblings ...)
  2022-04-06 14:38 ` adhemerval.zanella at linaro dot org
@ 2022-04-06 14:41 ` fweimer at redhat dot com
  2022-04-06 14:46 ` adhemerval.zanella at linaro dot org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: fweimer at redhat dot com @ 2022-04-06 14:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #5 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to Adhemerval Zanella from comment #4)
> Maybe one option would be to block the SIGCANCEL on the function noted on
> signal(7) that never restarted after being interrupted if the cancellation
> is disabled, and then re-enable the signal after the function returns (on
> SYSCALL_CANCEL).

Technically yes, but it optimizes the wrong case because cancellation is rare
in most programs (or might never happen). Two extra system calls per poll seems
excessive.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (4 preceding siblings ...)
  2022-04-06 14:41 ` fweimer at redhat dot com
@ 2022-04-06 14:46 ` adhemerval.zanella at linaro dot org
  2022-04-06 15:28 ` adhemerval.zanella at linaro dot org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2022-04-06 14:46 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #6 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Florian Weimer from comment #5)
> (In reply to Adhemerval Zanella from comment #4)
> > Maybe one option would be to block the SIGCANCEL on the function noted on
> > signal(7) that never restarted after being interrupted if the cancellation
> > is disabled, and then re-enable the signal after the function returns (on
> > SYSCALL_CANCEL).
> 
> Technically yes, but it optimizes the wrong case because cancellation is
> rare in most programs (or might never happen). Two extra system calls per
> poll seems excessive.

It seems that to fix the Linux behavious we will need to really avoid sending
the SIGCANCEL unless the asynchrnous mode is enabled. I am really not found of
the old code complexity...

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (5 preceding siblings ...)
  2022-04-06 14:46 ` adhemerval.zanella at linaro dot org
@ 2022-04-06 15:28 ` adhemerval.zanella at linaro dot org
  2022-04-14 18:43 ` adhemerval.zanella at linaro dot org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2022-04-06 15:28 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware dot org   |adhemerval.zanella at linaro dot o
                   |                            |rg

--- Comment #7 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
I will work on revert my changes on pthread cancellation.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (6 preceding siblings ...)
  2022-04-06 15:28 ` adhemerval.zanella at linaro dot org
@ 2022-04-14 18:43 ` adhemerval.zanella at linaro dot org
  2023-06-21  3:00 ` andres at anarazel dot de
  2023-06-21 14:25 ` adhemerval.zanella at linaro dot org
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2022-04-14 18:43 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |2.36
             Status|NEW                         |RESOLVED

--- Comment #8 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
Fixed on 2.36.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (7 preceding siblings ...)
  2022-04-14 18:43 ` adhemerval.zanella at linaro dot org
@ 2023-06-21  3:00 ` andres at anarazel dot de
  2023-06-21 14:25 ` adhemerval.zanella at linaro dot org
  9 siblings, 0 replies; 11+ messages in thread
From: andres at anarazel dot de @ 2023-06-21  3:00 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

Andres Freund <andres at anarazel dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andres at anarazel dot de

--- Comment #9 from Andres Freund <andres at anarazel dot de> ---
FWIW, the fix / revert (404656009b) seems to have lead to a noticeable
performance regression.

Comparing a process and a single thread doing 10M, on a "Intel(R) Xeon(R) Gold
5215 CPU @ 2.50GHz", I measured:
1) pread()s of 4096 bytes of data
2) close(-1)

glibc commit 404656009b:
testing without threads, performing 10000000 syscalls
        pread() via libc: 3035.157 ms
        pread() via syscall(): 3048.916 ms
        close(-1) via libc: 993.145 ms
        close(-1) via syscall(): 993.274 ms
testing with 1 threads, performing 10000000 syscalls each
        pread() via libc: 3400.858 ms
        pread() via syscall(): 3135.069 ms
        close(-1) via libc: 1203.851 ms
        close(-1) via syscall(): 996.809 ms

glibc commit 2376944b9e:
testing without threads, performing 10000000 syscalls
        pread() via libc: 3048.536 ms
        pread() via syscall(): 3049.653 ms
        close(-1) via libc: 968.820 ms
        close(-1) via syscall(): 950.485 ms
testing with 1 threads, performing 10000000 syscalls each
        pread() via libc: 3203.544 ms
        pread() via syscall(): 3142.467 ms
        close(-1) via libc: 1020.983 ms
        close(-1) via syscall(): 953.038 ms

To reduce variability this was run with turbo mode disabled and the test pinned
to an otherwise unused core and were run against both glibc in sequence,
without a reboot inbetween, with a few repetitions.

Looking at a profile it's obvious that the difference primarily is from
__GI___pthread_disable_asynccancel() and __GI___pthread_enable_asynccancel().
There's also some difference in kernel overhead, but that obviously applies to
both glibc versions.

It's one thing for something as pointless as close(-1) to regress ~18%, but the
fact that this also regresses workloads doing useful, non-trivial syscalls,
like pread() by ~5% seems worrisome.  

I have been seeing __GI___pthread_disable_asynccancel etc in more an more
production profiles for a while and finally spent the time analyzing the issue.


Do you want me to create a new issue or leave it just here?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug nptl/29029] poll() spuriously returns EINTR during thread cancellation and with cancellation disabled
  2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
                   ` (8 preceding siblings ...)
  2023-06-21  3:00 ` andres at anarazel dot de
@ 2023-06-21 14:25 ` adhemerval.zanella at linaro dot org
  9 siblings, 0 replies; 11+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2023-06-21 14:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=29029

--- Comment #10 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
(In reply to Andres Freund from comment #9)
> FWIW, the fix / revert (404656009b) seems to have lead to a noticeable
> performance regression.

The 404656009b was a bugfix, meaning that the changes
8c1c0aae20/2b51742531/26cfbb7162 that possibly optimized the cancellation path
are not correct, and any performance implications should not be considered.

However, I have patch that refactor the cancellation handling to avoid most of
atomic operations to enable/disable async on every cancellation entrypoint [1].
I am not sure about its performance implication though.

[1]
https://patchwork.sourceware.org/project/glibc/patch/20230524135126.3174670-1-adhemerval.zanella@linaro.org/

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-06-21 14:25 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-05 21:32 [Bug nptl/29029] New: poll() spuriously returns EINTR during thread cancellation and with cancellation disabled aurelien at aurel32 dot net
2022-04-05 22:28 ` [Bug nptl/29029] " adhemerval.zanella at linaro dot org
2022-04-06 13:27 ` fweimer at redhat dot com
2022-04-06 13:31 ` fweimer at redhat dot com
2022-04-06 14:38 ` adhemerval.zanella at linaro dot org
2022-04-06 14:41 ` fweimer at redhat dot com
2022-04-06 14:46 ` adhemerval.zanella at linaro dot org
2022-04-06 15:28 ` adhemerval.zanella at linaro dot org
2022-04-14 18:43 ` adhemerval.zanella at linaro dot org
2023-06-21  3:00 ` andres at anarazel dot de
2023-06-21 14:25 ` adhemerval.zanella at linaro dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).