public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
@ 2023-08-23  4:21 Sergio Durigan Junior
  2023-08-23  7:40 ` Andreas Schwab
  2023-08-23 13:46 ` Adhemerval Zanella Netto
  0 siblings, 2 replies; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-08-23  4:21 UTC (permalink / raw)
  To: libc-alpha; +Cc: Sergio Durigan Junior

When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.

In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist.  After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".

The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.

This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.

The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
are clear after we enter "try_again".

See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
---
 sysdeps/pthread/sem_open.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
index e5db929d20..ba91f89d57 100644
--- a/sysdeps/pthread/sem_open.c
+++ b/sysdeps/pthread/sem_open.c
@@ -66,8 +66,8 @@ __sem_open (const char *name, int oflag, ...)
   if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
     {
       open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
-      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
     try_again:
+      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
       fd = __open (dirname.name, open_flags);
 
       if (fd == -1)
-- 
2.39.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23  4:21 [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open Sergio Durigan Junior
@ 2023-08-23  7:40 ` Andreas Schwab
  2023-08-23 14:10   ` Sergio Durigan Junior
  2023-08-23 13:46 ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2023-08-23  7:40 UTC (permalink / raw)
  To: Sergio Durigan Junior via Libc-alpha; +Cc: Sergio Durigan Junior

On Aug 23 2023, Sergio Durigan Junior via Libc-alpha wrote:

> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
> are clear after we enter "try_again".

That doesn't match what the patch does.  It only adds flags to
open_flags, and does not remove any.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23  4:21 [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open Sergio Durigan Junior
  2023-08-23  7:40 ` Andreas Schwab
@ 2023-08-23 13:46 ` Adhemerval Zanella Netto
  2023-08-23 14:16   ` Sergio Durigan Junior
  2023-10-28 20:30   ` Sergio Durigan Junior
  1 sibling, 2 replies; 15+ messages in thread
From: Adhemerval Zanella Netto @ 2023-08-23 13:46 UTC (permalink / raw)
  To: Sergio Durigan Junior, libc-alpha



On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
> When invoking sem_open with O_CREAT as one of its flags, we'll end up
> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
> & O_EXCL) == 0)", which means that we don't expect the semaphore file
> to exist.
> 
> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
> likely fail because it won't exist.  After that first (expected)
> failure, some cleanup is done and we go back to the label "try_again",
> which lives in the first part of the aforementioned "if".
> 
> The problem is that, in that part of the code, we expect the semaphore
> file to exist, and as such O_CREAT (this time the flag we pass to
> open(2)) needs to be cleaned from open_flags, otherwise we'll see
> another failure (this time unexpected) when trying to open the file,
> which will lead the call to sem_open to fail as well.
> 
> This can cause very strange bugs, especially with OpenMPI, which makes
> extensive use of semaphores.
> 
> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
> are clear after we enter "try_again".
> 
> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912

This need needs a bug report and, if possible, a regression check (I give
you that it might be tricky due it is a racy condition).

> ---
>  sysdeps/pthread/sem_open.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
> index e5db929d20..ba91f89d57 100644
> --- a/sysdeps/pthread/sem_open.c
> +++ b/sysdeps/pthread/sem_open.c
> @@ -66,8 +66,8 @@ __sem_open (const char *name, int oflag, ...)
>    if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
>      {
>        open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
> -      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>      try_again:
> +      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>        fd = __open (dirname.name, open_flags);
>  
>        if (fd == -1)

I still think this is not fully correct, because on second try it would
not use O_NOFOLLOW.  Also, O_RDWR will be always set since now it always
clear the O_ACCMODE.

So I think it does not actually need to keep the open_flags over the
iteration, we can simplify with something like that:

diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
index e5db929d20..7c189afbcf 100644
--- a/sysdeps/pthread/sem_open.c
+++ b/sysdeps/pthread/sem_open.c
@@ -32,11 +32,12 @@
 # define __unlink unlink
 #endif

+#define SEM_OPEN_FLAGS (O_RDWR | O_NOFOLLOW | O_CLOEXEC)
+
 sem_t *
 __sem_open (const char *name, int oflag, ...)
 {
   int fd;
-  int open_flags;
   sem_t *result;

   /* Check that shared futexes are supported.  */
@@ -65,10 +66,8 @@ __sem_open (const char *name, int oflag, ...)
   /* If the semaphore object has to exist simply open it.  */
   if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
     {
-      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
-      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
     try_again:
-      fd = __open (dirname.name, open_flags);
+      fd = __open (dirname.name, (oflag & O_EXCL) | SEM_OPEN_FLAGS);

       if (fd == -1)
        {
@@ -135,8 +134,7 @@ __sem_open (const char *name, int oflag, ...)
            }

          /* Open the file.  Make sure we do not overwrite anything.  */
-         open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
-         fd = __open (tmpfname, open_flags, mode);
+         fd = __open (tmpfname, O_CREAT | O_EXCL | SEM_OPEN_FLAGS, mode);
          if (fd == -1)
            {
              if (errno == EEXIST)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23  7:40 ` Andreas Schwab
@ 2023-08-23 14:10   ` Sergio Durigan Junior
  2023-08-23 14:45     ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-08-23 14:10 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Sergio Durigan Junior via Libc-alpha

On Wednesday, August 23 2023, Andreas Schwab wrote:

> On Aug 23 2023, Sergio Durigan Junior via Libc-alpha wrote:
>
>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>> are clear after we enter "try_again".
>
> That doesn't match what the patch does.  It only adds flags to
> open_flags, and does not remove any.

You're correct.  Simon also spotted the same problem.  Here's an updated
version of the patch.

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

From cb1a17590878955bcf6d7c2821ca95da3896608c Mon Sep 17 00:00:00 2001
From: Sergio Durigan Junior <sergiodj@sergiodj.net>
Date: Wed, 23 Aug 2023 00:10:44 -0400
Subject: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on
 sem_open

When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.

In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist.  After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".

The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.

This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.

The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
are clear after we enter "try_again".

See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912

Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places (BZ #15722)")
---
 sysdeps/pthread/sem_open.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
index e5db929d20..529c286541 100644
--- a/sysdeps/pthread/sem_open.c
+++ b/sysdeps/pthread/sem_open.c
@@ -36,7 +36,6 @@ sem_t *
 __sem_open (const char *name, int oflag, ...)
 {
   int fd;
-  int open_flags;
   sem_t *result;
 
   /* Check that shared futexes are supported.  */
@@ -65,10 +64,9 @@ __sem_open (const char *name, int oflag, ...)
   /* If the semaphore object has to exist simply open it.  */
   if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
     {
-      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
-      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
     try_again:
-      fd = __open (dirname.name, open_flags);
+      fd = __open (dirname.name,
+		   (oflag & ~(O_CREAT|O_ACCMODE)) | O_NOFOLLOW | O_RDWR | O_CLOEXEC);
 
       if (fd == -1)
 	{
@@ -135,8 +133,7 @@ __sem_open (const char *name, int oflag, ...)
 	    }
 
 	  /* Open the file.  Make sure we do not overwrite anything.  */
-	  open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
-	  fd = __open (tmpfname, open_flags, mode);
+	  fd = __open (tmpfname, O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC, mode);
 	  if (fd == -1)
 	    {
 	      if (errno == EEXIST)
-- 
2.39.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 13:46 ` Adhemerval Zanella Netto
@ 2023-08-23 14:16   ` Sergio Durigan Junior
  2023-08-23 14:23     ` Sergio Durigan Junior
  2023-10-28 20:30   ` Sergio Durigan Junior
  1 sibling, 1 reply; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-08-23 14:16 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:

> On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
>> When invoking sem_open with O_CREAT as one of its flags, we'll end up
>> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
>> & O_EXCL) == 0)", which means that we don't expect the semaphore file
>> to exist.
>> 
>> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
>> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
>> likely fail because it won't exist.  After that first (expected)
>> failure, some cleanup is done and we go back to the label "try_again",
>> which lives in the first part of the aforementioned "if".
>> 
>> The problem is that, in that part of the code, we expect the semaphore
>> file to exist, and as such O_CREAT (this time the flag we pass to
>> open(2)) needs to be cleaned from open_flags, otherwise we'll see
>> another failure (this time unexpected) when trying to open the file,
>> which will lead the call to sem_open to fail as well.
>> 
>> This can cause very strange bugs, especially with OpenMPI, which makes
>> extensive use of semaphores.
>> 
>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>> are clear after we enter "try_again".
>> 
>> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
>
> This need needs a bug report and, if possible, a regression check (I give
> you that it might be tricky due it is a racy condition).

Sure thing.  I can provide a regression check using the openmpi bug
we've been investigating, if that's acceptable.

>> ---
>>  sysdeps/pthread/sem_open.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
>> index e5db929d20..ba91f89d57 100644
>> --- a/sysdeps/pthread/sem_open.c
>> +++ b/sysdeps/pthread/sem_open.c
>> @@ -66,8 +66,8 @@ __sem_open (const char *name, int oflag, ...)
>>    if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
>>      {
>>        open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
>> -      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>>      try_again:
>> +      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>>        fd = __open (dirname.name, open_flags);
>>  
>>        if (fd == -1)
>
> I still think this is not fully correct, because on second try it would
> not use O_NOFOLLOW.  Also, O_RDWR will be always set since now it always
> clear the O_ACCMODE.

Yeah, Andreas and Simon pointed this out too.

> So I think it does not actually need to keep the open_flags over the
> iteration, we can simplify with something like that:
>
> diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
> index e5db929d20..7c189afbcf 100644
> --- a/sysdeps/pthread/sem_open.c
> +++ b/sysdeps/pthread/sem_open.c
> @@ -32,11 +32,12 @@
>  # define __unlink unlink
>  #endif
>
> +#define SEM_OPEN_FLAGS (O_RDWR | O_NOFOLLOW | O_CLOEXEC)
> +
>  sem_t *
>  __sem_open (const char *name, int oflag, ...)
>  {
>    int fd;
> -  int open_flags;
>    sem_t *result;
>
>    /* Check that shared futexes are supported.  */
> @@ -65,10 +66,8 @@ __sem_open (const char *name, int oflag, ...)
>    /* If the semaphore object has to exist simply open it.  */
>    if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
>      {
> -      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
> -      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>      try_again:
> -      fd = __open (dirname.name, open_flags);
> +      fd = __open (dirname.name, (oflag & O_EXCL) | SEM_OPEN_FLAGS);
>
>        if (fd == -1)
>         {
> @@ -135,8 +134,7 @@ __sem_open (const char *name, int oflag, ...)
>             }
>
>           /* Open the file.  Make sure we do not overwrite anything.  */
> -         open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
> -         fd = __open (tmpfname, open_flags, mode);
> +         fd = __open (tmpfname, O_CREAT | O_EXCL | SEM_OPEN_FLAGS, mode);
>           if (fd == -1)
>             {
>               if (errno == EEXIST)

Thanks.  I've just sent a v2 of the patch (before I saw your message),
which is pretty much the same thing you're proposing here (modulo the
SEM_OPEN_FLAGS define).

I've checked and it does fix the problem we are seeing.

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 14:16   ` Sergio Durigan Junior
@ 2023-08-23 14:23     ` Sergio Durigan Junior
  0 siblings, 0 replies; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-08-23 14:23 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

On Wednesday, August 23 2023, I wrote:

> On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:
>
>> On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
>>> When invoking sem_open with O_CREAT as one of its flags, we'll end up
>>> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
>>> & O_EXCL) == 0)", which means that we don't expect the semaphore file
>>> to exist.
>>> 
>>> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
>>> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
>>> likely fail because it won't exist.  After that first (expected)
>>> failure, some cleanup is done and we go back to the label "try_again",
>>> which lives in the first part of the aforementioned "if".
>>> 
>>> The problem is that, in that part of the code, we expect the semaphore
>>> file to exist, and as such O_CREAT (this time the flag we pass to
>>> open(2)) needs to be cleaned from open_flags, otherwise we'll see
>>> another failure (this time unexpected) when trying to open the file,
>>> which will lead the call to sem_open to fail as well.
>>> 
>>> This can cause very strange bugs, especially with OpenMPI, which makes
>>> extensive use of semaphores.
>>> 
>>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>>> are clear after we enter "try_again".
>>> 
>>> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
>>
>> This need needs a bug report and, if possible, a regression check (I give
>> you that it might be tricky due it is a racy condition).
>
> Sure thing.  I can provide a regression check using the openmpi bug
> we've been investigating, if that's acceptable.

https://sourceware.org/bugzilla/show_bug.cgi?id=30789

Let me know if that's enough.

Thanks,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 14:10   ` Sergio Durigan Junior
@ 2023-08-23 14:45     ` Adhemerval Zanella Netto
  2023-08-23 19:08       ` Sergio Durigan Junior
  0 siblings, 1 reply; 15+ messages in thread
From: Adhemerval Zanella Netto @ 2023-08-23 14:45 UTC (permalink / raw)
  To: Sergio Durigan Junior, Andreas Schwab
  Cc: Sergio Durigan Junior via Libc-alpha



On 23/08/23 11:10, Sergio Durigan Junior via Libc-alpha wrote:
> On Wednesday, August 23 2023, Andreas Schwab wrote:
> 
>> On Aug 23 2023, Sergio Durigan Junior via Libc-alpha wrote:
>>
>>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>>> are clear after we enter "try_again".
>>
>> That doesn't match what the patch does.  It only adds flags to
>> open_flags, and does not remove any.
> 
> You're correct.  Simon also spotted the same problem.  Here's an updated
> version of the patch.
> 

> From cb1a17590878955bcf6d7c2821ca95da3896608c Mon Sep 17 00:00:00 2001
> From: Sergio Durigan Junior <sergiodj@sergiodj.net>
> Date: Wed, 23 Aug 2023 00:10:44 -0400
> Subject: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on
>  sem_open
> 

Please open a bug report, even without a reproducer; and reference it
on the subject.

> When invoking sem_open with O_CREAT as one of its flags, we'll end up
> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
> & O_EXCL) == 0)", which means that we don't expect the semaphore file
> to exist.
> 
> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
> likely fail because it won't exist.  After that first (expected)
> failure, some cleanup is done and we go back to the label "try_again",
> which lives in the first part of the aforementioned "if".
> 
> The problem is that, in that part of the code, we expect the semaphore
> file to exist, and as such O_CREAT (this time the flag we pass to
> open(2)) needs to be cleaned from open_flags, otherwise we'll see
> another failure (this time unexpected) when trying to open the file,
> which will lead the call to sem_open to fail as well.
> 
> This can cause very strange bugs, especially with OpenMPI, which makes
> extensive use of semaphores.
> 
> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
> are clear after we enter "try_again".
> 
> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
> 
> Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
> Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
> Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places (BZ #15722)")
> ---
>  sysdeps/pthread/sem_open.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
> index e5db929d20..529c286541 100644
> --- a/sysdeps/pthread/sem_open.c
> +++ b/sysdeps/pthread/sem_open.c
> @@ -36,7 +36,6 @@ sem_t *
>  __sem_open (const char *name, int oflag, ...)
>  {
>    int fd;
> -  int open_flags;
>    sem_t *result;
>  
>    /* Check that shared futexes are supported.  */
> @@ -65,10 +64,9 @@ __sem_open (const char *name, int oflag, ...)
>    /* If the semaphore object has to exist simply open it.  */
>    if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
>      {
> -      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
> -      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>      try_again:
> -      fd = __open (dirname.name, open_flags);
> +      fd = __open (dirname.name,
> +		   (oflag & ~(O_CREAT|O_ACCMODE)) | O_NOFOLLOW | O_RDWR | O_CLOEXEC);

There is no need to support other flags than O_CREAT/O_EXCL, in fact I think
is QoI if sem_open ignore non-supported flags (since POSIX does not really
state an error for invalid flags).  That's why I added 'oflags & O_EXCL' on my
suggestion, since at this point either oflag does not have any supported
flags set, or any of O_CREAT/O_EXCL; and only need to check O_EXCL.

>  
>        if (fd == -1)
>  	{
> @@ -135,8 +133,7 @@ __sem_open (const char *name, int oflag, ...)
>  	    }
>  
>  	  /* Open the file.  Make sure we do not overwrite anything.  */
> -	  open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
> -	  fd = __open (tmpfname, open_flags, mode);
> +	  fd = __open (tmpfname, O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC, mode);
>  	  if (fd == -1)
>  	    {
>  	      if (errno == EEXIST)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 14:45     ` Adhemerval Zanella Netto
@ 2023-08-23 19:08       ` Sergio Durigan Junior
  2023-08-23 21:53         ` Joseph Myers
  0 siblings, 1 reply; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-08-23 19:08 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: Andreas Schwab, Sergio Durigan Junior via Libc-alpha

On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:

> On 23/08/23 11:10, Sergio Durigan Junior via Libc-alpha wrote:
>> On Wednesday, August 23 2023, Andreas Schwab wrote:
>> 
>>> On Aug 23 2023, Sergio Durigan Junior via Libc-alpha wrote:
>>>
>>>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>>>> are clear after we enter "try_again".
>>>
>>> That doesn't match what the patch does.  It only adds flags to
>>> open_flags, and does not remove any.
>> 
>> You're correct.  Simon also spotted the same problem.  Here's an updated
>> version of the patch.
>> 
>
>> From cb1a17590878955bcf6d7c2821ca95da3896608c Mon Sep 17 00:00:00 2001
>> From: Sergio Durigan Junior <sergiodj@sergiodj.net>
>> Date: Wed, 23 Aug 2023 00:10:44 -0400
>> Subject: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on
>>  sem_open
>> 
>
> Please open a bug report, even without a reproducer; and reference it
> on the subject.

Here it is.

Thanks,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

From 584ac8bf375903cf38d73703d2439b3b74cbcd41 Mon Sep 17 00:00:00 2001
From: Sergio Durigan Junior <sergiodj@sergiodj.net>
Date: Wed, 23 Aug 2023 00:10:44 -0400
Subject: [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is
 expected to exist [BZ #30789]

When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.

In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist.  After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".

The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.

This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.

Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.

This resolves BZ #30789.

See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912

Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places (BZ #15722)")
---
 sysdeps/pthread/sem_open.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
index e5db929d20..0e331a7445 100644
--- a/sysdeps/pthread/sem_open.c
+++ b/sysdeps/pthread/sem_open.c
@@ -32,11 +32,12 @@
 # define __unlink unlink
 #endif
 
+#define SEM_OPEN_FLAGS (O_RDWR | O_NOFOLLOW | O_CLOEXEC)
+
 sem_t *
 __sem_open (const char *name, int oflag, ...)
 {
   int fd;
-  int open_flags;
   sem_t *result;
 
   /* Check that shared futexes are supported.  */
@@ -65,10 +66,8 @@ __sem_open (const char *name, int oflag, ...)
   /* If the semaphore object has to exist simply open it.  */
   if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
     {
-      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
-      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
     try_again:
-      fd = __open (dirname.name, open_flags);
+      fd = __open (dirname.name, (oflag & O_EXCL) | SEM_OPEN_FLAGS);
 
       if (fd == -1)
 	{
@@ -135,8 +134,7 @@ __sem_open (const char *name, int oflag, ...)
 	    }
 
 	  /* Open the file.  Make sure we do not overwrite anything.  */
-	  open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
-	  fd = __open (tmpfname, open_flags, mode);
+	  fd = __open (tmpfname, O_CREAT | O_EXCL | SEM_OPEN_FLAGS, mode);
 	  if (fd == -1)
 	    {
 	      if (errno == EEXIST)
-- 
2.39.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 19:08       ` Sergio Durigan Junior
@ 2023-08-23 21:53         ` Joseph Myers
  0 siblings, 0 replies; 15+ messages in thread
From: Joseph Myers @ 2023-08-23 21:53 UTC (permalink / raw)
  To: Sergio Durigan Junior
  Cc: Adhemerval Zanella Netto, Andreas Schwab,
	Sergio Durigan Junior via Libc-alpha

I'd expect the patch to add a testcase to the glibc testsuite or explain 
why it's hard to test.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-08-23 13:46 ` Adhemerval Zanella Netto
  2023-08-23 14:16   ` Sergio Durigan Junior
@ 2023-10-28 20:30   ` Sergio Durigan Junior
  2023-11-01 13:27     ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-10-28 20:30 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:

> On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
>> When invoking sem_open with O_CREAT as one of its flags, we'll end up
>> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
>> & O_EXCL) == 0)", which means that we don't expect the semaphore file
>> to exist.
>> 
>> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
>> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
>> likely fail because it won't exist.  After that first (expected)
>> failure, some cleanup is done and we go back to the label "try_again",
>> which lives in the first part of the aforementioned "if".
>> 
>> The problem is that, in that part of the code, we expect the semaphore
>> file to exist, and as such O_CREAT (this time the flag we pass to
>> open(2)) needs to be cleaned from open_flags, otherwise we'll see
>> another failure (this time unexpected) when trying to open the file,
>> which will lead the call to sem_open to fail as well.
>> 
>> This can cause very strange bugs, especially with OpenMPI, which makes
>> extensive use of semaphores.
>> 
>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>> are clear after we enter "try_again".
>> 
>> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
>
> This need needs a bug report and, if possible, a regression check (I give
> you that it might be tricky due it is a racy condition).

Hi folks,

It took me much longer than I intended to get back to this thread, so I
apologize.  I'm afraid I don't have very exciting news either: I still
don't have a testcase to exercise the fix.

After talking to Adhemerval during the last Cauldron, we have agreed
that (a) creating a testcase for this bug is indeed tricky (and may even
introduce false positives), and (b) we should likely move forward as is.

I still would like to point out that it is possible have a reliable
reproducer if you follow the steps I outlined on
https://sourceware.org/bugzilla/show_bug.cgi?id=30789#c1, but
unfortunately this is not acceptable as a glibc test, so there's that.

Either way, I'd like to know if you consider it OK to proceed.  I can
replicate this explanation in the commit message if you think it's
necessary.

Thank you,

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-10-28 20:30   ` Sergio Durigan Junior
@ 2023-11-01 13:27     ` Adhemerval Zanella Netto
  2023-11-01 13:53       ` Joseph Myers
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Adhemerval Zanella Netto @ 2023-11-01 13:27 UTC (permalink / raw)
  To: Sergio Durigan Junior, Joseph Myers; +Cc: libc-alpha



On 28/10/23 17:30, Sergio Durigan Junior wrote:
> On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:
> 
>> On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
>>> When invoking sem_open with O_CREAT as one of its flags, we'll end up
>>> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
>>> & O_EXCL) == 0)", which means that we don't expect the semaphore file
>>> to exist.
>>>
>>> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
>>> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
>>> likely fail because it won't exist.  After that first (expected)
>>> failure, some cleanup is done and we go back to the label "try_again",
>>> which lives in the first part of the aforementioned "if".
>>>
>>> The problem is that, in that part of the code, we expect the semaphore
>>> file to exist, and as such O_CREAT (this time the flag we pass to
>>> open(2)) needs to be cleaned from open_flags, otherwise we'll see
>>> another failure (this time unexpected) when trying to open the file,
>>> which will lead the call to sem_open to fail as well.
>>>
>>> This can cause very strange bugs, especially with OpenMPI, which makes
>>> extensive use of semaphores.
>>>
>>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>>> are clear after we enter "try_again".
>>>
>>> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
>>
>> This need needs a bug report and, if possible, a regression check (I give
>> you that it might be tricky due it is a racy condition).
> 
> Hi folks,
> 
> It took me much longer than I intended to get back to this thread, so I
> apologize.  I'm afraid I don't have very exciting news either: I still
> don't have a testcase to exercise the fix.
> 
> After talking to Adhemerval during the last Cauldron, we have agreed
> that (a) creating a testcase for this bug is indeed tricky (and may even
> introduce false positives), and (b) we should likely move forward as is.
> 
> I still would like to point out that it is possible have a reliable
> reproducer if you follow the steps I outlined on
> https://sourceware.org/bugzilla/show_bug.cgi?id=30789#c1, but
> unfortunately this is not acceptable as a glibc test, so there's that.
> 
> Either way, I'd like to know if you consider it OK to proceed.  I can
> replicate this explanation in the commit message if you think it's
> necessary.

Joseph, would adding the following on commit message be enough to 
install this patch:

  A regression test for this issue would require a complex and cpu
  time consuming logic, since to trigger the wrong code path is not
  straightforward due the racy condition. 

Sergio, could you resend the patch either this following or more
extended explanation along with BZ# 30789 on the title?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-11-01 13:27     ` Adhemerval Zanella Netto
@ 2023-11-01 13:53       ` Joseph Myers
  2023-11-01 22:14       ` Sergio Durigan Junior
  2023-11-01 22:15       ` [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789] Sergio Durigan Junior
  2 siblings, 0 replies; 15+ messages in thread
From: Joseph Myers @ 2023-11-01 13:53 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Sergio Durigan Junior, libc-alpha

On Wed, 1 Nov 2023, Adhemerval Zanella Netto wrote:

> Joseph, would adding the following on commit message be enough to 
> install this patch:
> 
>   A regression test for this issue would require a complex and cpu
>   time consuming logic, since to trigger the wrong code path is not
>   straightforward due the racy condition. 

Yes, that seems reasonable as justification for not adding a test case.  
(Note: I have not reviewed the substance of the patch.)

The default assumption is that a test case is added for a bug fix (that 
isn't just fixing an issue shown by an existing testsuite failure, build 
failure, etc.), but this assumption may be overcome in a particular case 
by a reason the issue is hard to cover in the testsuite.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open
  2023-11-01 13:27     ` Adhemerval Zanella Netto
  2023-11-01 13:53       ` Joseph Myers
@ 2023-11-01 22:14       ` Sergio Durigan Junior
  2023-11-01 22:15       ` [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789] Sergio Durigan Junior
  2 siblings, 0 replies; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-11-01 22:14 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Joseph Myers, libc-alpha

On Wednesday, November 01 2023, Adhemerval Zanella Netto wrote:

> On 28/10/23 17:30, Sergio Durigan Junior wrote:
>> On Wednesday, August 23 2023, Adhemerval Zanella Netto wrote:
>> 
>>> On 23/08/23 01:21, Sergio Durigan Junior via Libc-alpha wrote:
>>>> When invoking sem_open with O_CREAT as one of its flags, we'll end up
>>>> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
>>>> & O_EXCL) == 0)", which means that we don't expect the semaphore file
>>>> to exist.
>>>>
>>>> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
>>>> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
>>>> likely fail because it won't exist.  After that first (expected)
>>>> failure, some cleanup is done and we go back to the label "try_again",
>>>> which lives in the first part of the aforementioned "if".
>>>>
>>>> The problem is that, in that part of the code, we expect the semaphore
>>>> file to exist, and as such O_CREAT (this time the flag we pass to
>>>> open(2)) needs to be cleaned from open_flags, otherwise we'll see
>>>> another failure (this time unexpected) when trying to open the file,
>>>> which will lead the call to sem_open to fail as well.
>>>>
>>>> This can cause very strange bugs, especially with OpenMPI, which makes
>>>> extensive use of semaphores.
>>>>
>>>> The fix here is to actually make sure that the O_CREAT|O_ACCMODE flags
>>>> are clear after we enter "try_again".
>>>>
>>>> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
>>>
>>> This need needs a bug report and, if possible, a regression check (I give
>>> you that it might be tricky due it is a racy condition).
>> 
>> Hi folks,
>> 
>> It took me much longer than I intended to get back to this thread, so I
>> apologize.  I'm afraid I don't have very exciting news either: I still
>> don't have a testcase to exercise the fix.
>> 
>> After talking to Adhemerval during the last Cauldron, we have agreed
>> that (a) creating a testcase for this bug is indeed tricky (and may even
>> introduce false positives), and (b) we should likely move forward as is.
>> 
>> I still would like to point out that it is possible have a reliable
>> reproducer if you follow the steps I outlined on
>> https://sourceware.org/bugzilla/show_bug.cgi?id=30789#c1, but
>> unfortunately this is not acceptable as a glibc test, so there's that.
>> 
>> Either way, I'd like to know if you consider it OK to proceed.  I can
>> replicate this explanation in the commit message if you think it's
>> necessary.
>
> Joseph, would adding the following on commit message be enough to 
> install this patch:
>
>   A regression test for this issue would require a complex and cpu
>   time consuming logic, since to trigger the wrong code path is not
>   straightforward due the racy condition. 
>
> Sergio, could you resend the patch either this following or more
> extended explanation along with BZ# 30789 on the title?

Sure.  I'll send it as a reply to your message.

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789]
  2023-11-01 13:27     ` Adhemerval Zanella Netto
  2023-11-01 13:53       ` Joseph Myers
  2023-11-01 22:14       ` Sergio Durigan Junior
@ 2023-11-01 22:15       ` Sergio Durigan Junior
  2023-11-03 13:04         ` Adhemerval Zanella Netto
  2 siblings, 1 reply; 15+ messages in thread
From: Sergio Durigan Junior @ 2023-11-01 22:15 UTC (permalink / raw)
  To: libc-alpha; +Cc: Sergio Durigan Junior, Simon Chopin, Adhemerval Zanella Netto

When invoking sem_open with O_CREAT as one of its flags, we'll end up
in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
& O_EXCL) == 0)", which means that we don't expect the semaphore file
to exist.

In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
| O_CLOEXEC" and there's an attempt to open(2) the file, which will
likely fail because it won't exist.  After that first (expected)
failure, some cleanup is done and we go back to the label "try_again",
which lives in the first part of the aforementioned "if".

The problem is that, in that part of the code, we expect the semaphore
file to exist, and as such O_CREAT (this time the flag we pass to
open(2)) needs to be cleaned from open_flags, otherwise we'll see
another failure (this time unexpected) when trying to open the file,
which will lead the call to sem_open to fail as well.

This can cause very strange bugs, especially with OpenMPI, which makes
extensive use of semaphores.

Fix the bug by simplifying the logic when choosing open(2) flags and
making sure O_CREAT is not set when the semaphore file is expected to
exist.

A regression test for this issue would require a complex and cpu time
consuming logic, since to trigger the wrong code path is not
straightforward due the racy condition.  There is a somewhat reliable
reproducer in the bug, but it requires using OpenMPI.

This resolves BZ #30789.

See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912

Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places (BZ #15722)")
---
 sysdeps/pthread/sem_open.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
index e5db929d20..0e331a7445 100644
--- a/sysdeps/pthread/sem_open.c
+++ b/sysdeps/pthread/sem_open.c
@@ -32,11 +32,12 @@
 # define __unlink unlink
 #endif
 
+#define SEM_OPEN_FLAGS (O_RDWR | O_NOFOLLOW | O_CLOEXEC)
+
 sem_t *
 __sem_open (const char *name, int oflag, ...)
 {
   int fd;
-  int open_flags;
   sem_t *result;
 
   /* Check that shared futexes are supported.  */
@@ -65,10 +66,8 @@ __sem_open (const char *name, int oflag, ...)
   /* If the semaphore object has to exist simply open it.  */
   if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
     {
-      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
-      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
     try_again:
-      fd = __open (dirname.name, open_flags);
+      fd = __open (dirname.name, (oflag & O_EXCL) | SEM_OPEN_FLAGS);
 
       if (fd == -1)
 	{
@@ -135,8 +134,7 @@ __sem_open (const char *name, int oflag, ...)
 	    }
 
 	  /* Open the file.  Make sure we do not overwrite anything.  */
-	  open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
-	  fd = __open (tmpfname, open_flags, mode);
+	  fd = __open (tmpfname, O_CREAT | O_EXCL | SEM_OPEN_FLAGS, mode);
 	  if (fd == -1)
 	    {
 	      if (errno == EEXIST)
-- 
2.40.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789]
  2023-11-01 22:15       ` [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789] Sergio Durigan Junior
@ 2023-11-03 13:04         ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 15+ messages in thread
From: Adhemerval Zanella Netto @ 2023-11-03 13:04 UTC (permalink / raw)
  To: Sergio Durigan Junior, libc-alpha; +Cc: Simon Chopin



On 01/11/23 19:15, Sergio Durigan Junior wrote:
> When invoking sem_open with O_CREAT as one of its flags, we'll end up
> in the second part of sem_open's "if ((oflag & O_CREAT) == 0 || (oflag
> & O_EXCL) == 0)", which means that we don't expect the semaphore file
> to exist.
> 
> In that part, open_flags is initialized as "O_RDWR | O_CREAT | O_EXCL
> | O_CLOEXEC" and there's an attempt to open(2) the file, which will
> likely fail because it won't exist.  After that first (expected)
> failure, some cleanup is done and we go back to the label "try_again",
> which lives in the first part of the aforementioned "if".
> 
> The problem is that, in that part of the code, we expect the semaphore
> file to exist, and as such O_CREAT (this time the flag we pass to
> open(2)) needs to be cleaned from open_flags, otherwise we'll see
> another failure (this time unexpected) when trying to open the file,
> which will lead the call to sem_open to fail as well.
> 
> This can cause very strange bugs, especially with OpenMPI, which makes
> extensive use of semaphores.
> 
> Fix the bug by simplifying the logic when choosing open(2) flags and
> making sure O_CREAT is not set when the semaphore file is expected to
> exist.
> 
> A regression test for this issue would require a complex and cpu time
> consuming logic, since to trigger the wrong code path is not
> straightforward due the racy condition.  There is a somewhat reliable
> reproducer in the bug, but it requires using OpenMPI.
> 
> This resolves BZ #30789.
> 
> See also: https://bugs.launchpad.net/ubuntu/+source/h5py/+bug/2031912
> 
> Signed-off-by: Sergio Durigan Junior <sergiodj@sergiodj.net>
> Co-Authored-By: Simon Chopin <simon.chopin@canonical.com>
> Co-Authored-By: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
> Fixes: 533deafbdf189f5fbb280c28562dd43ace2f4b0f ("Use O_CLOEXEC in more places (BZ #15722)")

LGTM, thanks.  I will install this shortly.

> ---
>  sysdeps/pthread/sem_open.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/sysdeps/pthread/sem_open.c b/sysdeps/pthread/sem_open.c
> index e5db929d20..0e331a7445 100644
> --- a/sysdeps/pthread/sem_open.c
> +++ b/sysdeps/pthread/sem_open.c
> @@ -32,11 +32,12 @@
>  # define __unlink unlink
>  #endif
>  
> +#define SEM_OPEN_FLAGS (O_RDWR | O_NOFOLLOW | O_CLOEXEC)
> +
>  sem_t *
>  __sem_open (const char *name, int oflag, ...)
>  {
>    int fd;
> -  int open_flags;
>    sem_t *result;
>  
>    /* Check that shared futexes are supported.  */
> @@ -65,10 +66,8 @@ __sem_open (const char *name, int oflag, ...)
>    /* If the semaphore object has to exist simply open it.  */
>    if ((oflag & O_CREAT) == 0 || (oflag & O_EXCL) == 0)
>      {
> -      open_flags = O_RDWR | O_NOFOLLOW | O_CLOEXEC;
> -      open_flags |= (oflag & ~(O_CREAT|O_ACCMODE));
>      try_again:
> -      fd = __open (dirname.name, open_flags);
> +      fd = __open (dirname.name, (oflag & O_EXCL) | SEM_OPEN_FLAGS);
>  
>        if (fd == -1)
>  	{
> @@ -135,8 +134,7 @@ __sem_open (const char *name, int oflag, ...)
>  	    }
>  
>  	  /* Open the file.  Make sure we do not overwrite anything.  */
> -	  open_flags = O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC;
> -	  fd = __open (tmpfname, open_flags, mode);
> +	  fd = __open (tmpfname, O_CREAT | O_EXCL | SEM_OPEN_FLAGS, mode);
>  	  if (fd == -1)
>  	    {
>  	      if (errno == EEXIST)

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-11-03 13:04 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-23  4:21 [PATCH] sysdeps: Clear O_CREAT|O_ACCMODE when trying again on sem_open Sergio Durigan Junior
2023-08-23  7:40 ` Andreas Schwab
2023-08-23 14:10   ` Sergio Durigan Junior
2023-08-23 14:45     ` Adhemerval Zanella Netto
2023-08-23 19:08       ` Sergio Durigan Junior
2023-08-23 21:53         ` Joseph Myers
2023-08-23 13:46 ` Adhemerval Zanella Netto
2023-08-23 14:16   ` Sergio Durigan Junior
2023-08-23 14:23     ` Sergio Durigan Junior
2023-10-28 20:30   ` Sergio Durigan Junior
2023-11-01 13:27     ` Adhemerval Zanella Netto
2023-11-01 13:53       ` Joseph Myers
2023-11-01 22:14       ` Sergio Durigan Junior
2023-11-01 22:15       ` [PATCH] sysdeps: sem_open: Clear O_CREAT when semaphore file is expected to exist [BZ #30789] Sergio Durigan Junior
2023-11-03 13:04         ` Adhemerval Zanella Netto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).