public inbox for glibc-bugs@sourceware.org
* [Bug libc/28957] New: pthread rwlocks Do Not Prevent Race Condition
@ 2022-03-10 22:45 gavin.d.howard at gmail dot com
  2022-03-15 19:02 ` [Bug libc/28957] " gavin.d.howard at gmail dot com
  2022-03-15 21:21 ` gavin.d.howard at gmail dot com
  0 siblings, 2 replies; 3+ messages in thread
From: gavin.d.howard at gmail dot com @ 2022-03-10 22:45 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28957

            Bug ID: 28957
           Summary: pthread rwlocks Do Not Prevent Race Condition
           Product: glibc
           Version: 2.33
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: gavin.d.howard at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---
            Target: Linux 5.15.26-gentoo-x86_64

Hello,

The write lock of the pthread_rwlock_t type in glibc appears to sometimes allow
multiple threads to have write access at the same time.

Background: I am building a build system that uses multiple threads.

When the build system is about to run a command, it takes a read lock, and then
it consults a map of command names to paths. For example, if the command was
`gcc -c -o test.o test.c`, then it would look up "gcc" in the map.

If an entry exists, the value is the full path to the executable, and that is
what is used for the argv[0], so if the value is found, it is copied, the read
lock is released, and all is good.

If no entry exists, then the read lock is released, and there is a lookup of
the executable in PATH. Once it is found and the entry is ready to be inserted
into the map, a write lock is taken. Then it checks again to see if an entry
exists. If not, it inserts the new entry.

At that point, the write lock is released, and it goes back to the beginning,
takes the read lock, gets the entry, releases the read lock, and then continues
on its merry way.

While I can't provide the actual code because of licensing issues, I can give
some pseudo-code:

```
do
{
  cont = false;

  r = pthread_rwlock_rdlock(&lock);
  <check r>

  if (map_exists(&map, cmd))
  {
    path = map_at(&map, cmd);

    r = pthread_rwlock_unlock(&lock);
    <check r>
  }
  else
  {
    r = pthread_rwlock_unlock(&lock);
    <check r>

    <Search PATH and prepare entry>

    r = pthread_rwlock_wrlock(&lock);
    <check r>

    if (!map_exists(&map, cmd))
    {
      err = map_insert(&map, cmd, entry);
      if (err == ELEM_EXISTS) abort();
    }

    r = pthread_rwlock_unlock(&lock);
    <check r>

    cont = true;
  }
}
while (cont);
```

With that code, I can consistently get my build system to `abort()` within 10
minutes by repeatedly running it to bootstrap itself.

If I change the `pthread_rwlock_rdlock()` to a `pthread_rwlock_wrlock()`, the
`abort()` can still be triggered within 10 minutes, so it appears that it is
not an issue of using a read lock in the wrong place.

In addition, if I change `lock` to be a mutex and update all of the locking and
unlocking calls accordingly, I cannot trigger the `abort()` even with 12+ hours
of trying. I think this means that I am not using the rwlocks incorrectly, but
that there is some bug in the write lock of the rwlock.

I'm attempting to create a small reproducer that I can publish, but so far, no
luck.

Any help would be greatly appreciated.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug libc/28957] pthread rwlocks Do Not Prevent Race Condition
From: gavin.d.howard at gmail dot com @ 2022-03-15 19:02 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28957

--- Comment #1 from Gavin Howard <gavin.d.howard at gmail dot com> ---
After a lot of sweating, I decided to make the repo that surfaced the bug
public after relicensing.

It's at https://git.yzena.com/Yzena/Yc . I've also made an easy reproducer. To
use it, you'll need CMake and Clang installed, but once you do, just run the
following:

```
git clone https://git.yzena.com/Yzena/Yc.git
cd Yc
./tools/rwlock_repro.sh
```

That script will set up the repo, build the build system with glibc, and then
repeatedly run the build system on itself until an error happens.

When running this on glibc, I consistently get this abort() message, which I
created to test this bug, after about 2-4 hours of testing on average:

```
Panic: More than one thread in the critical section
    Source:    /home/gavin/Yc/src/rig/build.c:555
    Function:  rig_searchPath()
```

This happens because there is a global variable that threads increment when
they enter the critical section of the write lock and decrement when they leave
it. That is the only place the global is used (it exists only to test this
bug), yet threads occasionally find that its value is 2 or more, meaning that
more than one thread was allowed into the critical section of the write lock.

Now, I may be wrong about the guarantees of glibc write locks, and if so,
please let me know, but as far as I know, write locks are only supposed to
allow one thread into the critical section at one time.

This is on glibc 2.33, installed on Gentoo. I looked at the Gentoo package, and
it seems that they have no patches applied to glibc, so I thought I would
report it here.

Also, I am using an AMD Ryzen Threadripper 1900X CPU, if that matters.


* [Bug libc/28957] pthread rwlocks Do Not Prevent Race Condition
From: gavin.d.howard at gmail dot com @ 2022-03-15 21:21 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=28957

Gavin Howard <gavin.d.howard at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Gavin Howard <gavin.d.howard at gmail dot com> ---
I am stupid.
