public inbox for glibc-bugs@sourceware.org
* [Bug libc/30789] New: [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: sergiodj at sergiodj dot net @ 2023-08-23 14:20 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
Bug ID: 30789
Summary: [2.38 Regression] sem_open will fail on multithreaded
scenarios when semaphore file doesn't exist (O_CREAT)
Product: glibc
Version: 2.38
Status: NEW
Severity: normal
Priority: P2
Component: libc
Assignee: unassigned at sourceware dot org
Reporter: sergiodj at sergiodj dot net
CC: drepper.fsp at gmail dot com
Target Milestone: ---
Ref.: https://inbox.sourceware.org/libc-alpha/87cyzdrgh3.fsf@sergiodj.net/T/#t
Ref.: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
Due to 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places
(BZ #15722)") glibc 2.38's sem_open will fail on multithreaded scenarios when
the semaphore file doesn't exist but should be created (i.e., when it's invoked
with O_CREAT).
Detailed analysis of this problem can be found at
https://inbox.sourceware.org/libc-alpha/87cyzdrgh3.fsf@sergiodj.net/T/#t and
https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912/comments/7.
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: sergiodj at sergiodj dot net @ 2023-08-23 14:23 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
--- Comment #1 from Sergio Durigan Junior <sergiodj at sergiodj dot net> ---
The bug can be reproduced in Ubuntu Mantic (development version). Here are the
steps:
As root:
# apt update
# apt install -y mpi-default-bin python3-h5py-mpi
As a regular user:
$ cat > test.py << __EOF__
from h5py import File
from mpi4py import MPI
with File('/tmp/aaaa', 'w', driver='mpio', comm=MPI.COMM_WORLD) as f:
    print(f)
__EOF__
$ mpirun -n 2 python3 test.py
You will notice that the mpirun process hangs indefinitely and consumes 100% of
two CPUs.
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: sam at gentoo dot org @ 2023-08-24 12:38 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
Sam James <sam at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sam at gentoo dot org
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: sam at gentoo dot org @ 2023-08-24 12:39 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
Sam James <sam at gentoo dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |adhemerval.zanella at linaro dot org,
| |samuel.thibault@ens-lyon.org
--- Comment #2 from Sam James <sam at gentoo dot org> ---
Sergey Bugaev <bugaevc@gmail.com> doesn't seem to have a BZ account, so CCing
committer/reviewer.
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: freswa at archlinux dot org @ 2023-09-25 14:37 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
Frederik Schwan <freswa at archlinux dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |freswa at archlinux dot org
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: cvs-commit at gcc dot gnu.org @ 2023-11-03 18:46 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
--- Comment #3 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Adhemerval Zanella
<azanella@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=f957f47df75b9fab995754011491edebc6feb147
commit f957f47df75b9fab995754011491edebc6feb147
Author: Sergio Durigan Junior <sergiodj@sergiodj.net>
Date: Wed Nov 1 18:15:23 2023 -0400
sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist
[BZ #30789]
When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.
In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist. After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".
The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleared from open_flags; otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will cause the call to sem_open to fail as well.
This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.
Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.
A regression test for this issue would require complex,
CPU-time-consuming logic, since triggering the wrong code path is not
straightforward due to the race condition. There is a somewhat reliable
reproducer in the bug, but it requires using OpenMPI.
This resolves BZ #30789.
See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more
places (BZ #15722)")
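For illustration only (this is not the glibc source): a minimal C sketch of
the failure mode described in the commit message above. It assumes the
unexpected failure is EEXIST, caused by create-path flags (O_CREAT | O_EXCL)
being reused to open a file that already exists. The helper names try_open and
demo_stale_flags, and the scratch path under /tmp, are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not glibc code: try to open PATH with FLAGS.
   Returns 0 on success, or the errno value reported by open(2).  */
static int
try_open (const char *path, int flags)
{
  assert (path != NULL);
  int fd = open (path, flags, 0600);
  if (fd == -1)
    return errno;
  close (fd);
  return 0;
}

/* Hypothetical demo: create a scratch file (standing in for a semaphore
   file that another process has just created), then reopen it twice.
   The result of each attempt is stored in the out-parameters.
   Returns 0 if the scratch file could be set up, -1 otherwise.  */
int
demo_stale_flags (int *with_creat_excl, int *with_flags_cleared)
{
  assert (with_creat_excl != NULL && with_flags_cleared != NULL);

  char path[] = "/tmp/sem-open-demo-XXXXXX";
  int fd = mkstemp (path);
  if (fd == -1)
    return -1;
  close (fd);

  /* Flags left over from the "create" path.  */
  int open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;

  /* Reusing them on a file that already exists is rejected.  */
  *with_creat_excl = try_open (path, open_flags);

  /* With O_CREAT and O_EXCL cleared, the same file opens normally.  */
  *with_flags_cleared = try_open (path, open_flags & ~(O_CREAT | O_EXCL));

  unlink (path);
  return 0;
}
```

Calling demo_stale_flags (&a, &b) from a small main and printing a and b shows
the two outcomes side by side; the first attempt is the analogue of the
"unexpected" failure, the second of the behaviour the fix aims for.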
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: adhemerval.zanella at linaro dot org @ 2023-11-03 18:47 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Target Milestone|--- |2.39
Resolution|--- |FIXED
Assignee|unassigned at sourceware dot org |adhemerval.zanella at linaro dot org
--- Comment #4 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
Fixed in 2.39.
* [Bug libc/30789] [2.38 Regression] sem_open will fail on multithreaded scenarios when semaphore file doesn't exist (O_CREAT)
From: cvs-commit at gcc dot gnu.org @ 2023-12-03 12:11 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30789
--- Comment #5 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The release/2.38/master branch has been updated by Aurelien Jarno
<aurel32@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=63dbbc5c52f9823f86270f32fce20d1e91cdf484
commit 63dbbc5c52f9823f86270f32fce20d1e91cdf484
Author: Sergio Durigan Junior <sergiodj@sergiodj.net>
Date: Wed Nov 1 18:15:23 2023 -0400
sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist
[BZ #30789]
(cherry picked from commit f957f47df75b9fab995754011491edebc6feb147)