public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
@ 2020-07-12 13:00 ` federico.kircheis at gmail dot com
  2020-07-13  9:03 ` redi at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: federico.kircheis at gmail dot com @ 2020-07-12 13:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

Federico Kircheis <federico.kircheis at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|8.2.0                       |10.1.0

--- Comment #1 from Federico Kircheis <federico.kircheis at gmail dot com> ---
I've tested with g++10 (g++-10 (Debian 10.1.0-4) 10.1.0), and the situation is
unchanged.

Does anything speak against adding the proposed "make_lock" logic inside
std::lock_guard?
Are there any drawbacks?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
  2020-07-12 13:00 ` [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock federico.kircheis at gmail dot com
@ 2020-07-13  9:03 ` redi at gcc dot gnu.org
  2020-07-14 20:16 ` federico.kircheis at gmail dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2020-07-13  9:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Federico Kircheis from comment #0)
> In case helgrind is correct, it seems that there are some issues behind
> std::scoped_lock, since it was explicitly designed for solving issues with
> lock order.

Our implementation matches the specification in the standard, and the reference
implementation in
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0156r2.html

It was designed to avoid deadlock when locking, not necessarily by considering
lock order.

> In case helgrind (and possibly, for the same reason, other tools) is wrong,
> this would be a feature request.
> 
> 
> A possible fix (or improvement) would be for `std::scoped_lock` to sort its
> arguments by address (since std::mutex are not copyable or movable, and thus
> their address should remain constant):

The scoped_lock type just uses the std::lock algorithm for locking multiple
mutexes, and that is specified to avoid deadlock even if another thread is
locking the same mutexes in a different order. Ordering them in
std::scoped_lock doesn't help; you would need to do it in std::lock.

But there is no requirement that the same set of mutexes are always locked
using std::scoped_lock or std::lock, they can be manually locked using
individual std::lock_guard locks or bare calls to the mutex's lock() member
functions, in arbitrary orders. So ordering them in std::lock still doesn't
solve anything.

Also, I'm not even sure what the helgrind error really means. Just because they
were locked in one order at one time doesn't establish an invariant that they
must always be locked in that order elsewhere in the program. So I don't see
anything that needs to be fixed. If your specific program has an invariant that
the order is preserved, then don't pass them to the scoped_lock constructor in
different orders.


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
  2020-07-12 13:00 ` [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock federico.kircheis at gmail dot com
  2020-07-13  9:03 ` redi at gcc dot gnu.org
@ 2020-07-14 20:16 ` federico.kircheis at gmail dot com
  2020-07-15  9:48 ` redi at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: federico.kircheis at gmail dot com @ 2020-07-14 20:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #3 from Federico Kircheis <federico.kircheis at gmail dot com> ---
Thank you for the analysis, explanation, and references; I did not think of
testing std::lock directly.


I'm still unsure whether that means it is a bug in Valgrind; unfortunately, I
do not know of other third-party tools for doing such verification.

I also noticed an error in my initial report; the second snippet should have
been:

-----
#include <future>
#include <mutex>

int main(){
        std::mutex m1;
        std::mutex m2;
        int data1{};
        int data2{};
        auto f1 = std::async(std::launch::async, [&](){
                std::lock_guard lg1{m1};
                std::lock_guard lg2{m2};
                ++data1;
                ++data2;
                return data1;
        });
        auto f2 = std::async(std::launch::async, [&](){
                std::lock_guard lg2{m2};
                std::lock_guard lg1{m1};
                ++data1;
                ++data2;
                return data2;
        });
        return f1.get() + f2.get(); // result should be 3
}
-----

Otherwise it does not generate any warning in Valgrind, as the lock order is
the same.
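For completeness, the deadlock-free variant discussed in this report passes both mutexes to a single std::scoped_lock, so the opposite argument orders are harmless. This is an illustrative sketch (the `run_demo` wrapper is mine, not part of the original report):

```cpp
#include <future>
#include <mutex>

// Both tasks lock via one std::scoped_lock, so even opposite argument
// orders cannot deadlock (std::scoped_lock uses the std::lock algorithm).
int run_demo() {
    std::mutex m1, m2;
    int data1{}, data2{};
    auto f1 = std::async(std::launch::async, [&] {
        std::scoped_lock lk{m1, m2};
        ++data1;
        ++data2;
        return data1;
    });
    auto f2 = std::async(std::launch::async, [&] {
        std::scoped_lock lk{m2, m1}; // opposite order: still safe
        ++data1;
        ++data2;
        return data2;
    });
    return f1.get() + f2.get(); // whichever task runs second sees 2
}
```

Because the two critical sections are serialized, one task observes the counters at 1 and the other at 2, so the sum is always 3.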

> Also, I'm not even sure what the helgrind error really means.

From the Helgrind manual: https://www.valgrind.org/docs/manual/hg-manual.html

----
Helgrind monitors the order in which threads acquire locks. This allows it to
detect potential deadlocks which could arise from the formation of cycles of
locks. Detecting such inconsistencies is useful because, whilst actual
deadlocks are fairly obvious, potential deadlocks may never be discovered
during testing and could later lead to hard-to-diagnose in-service failures.
----



> Just because they were locked in one order at one time doesn't establish an invariant that they must always be locked in that order elsewhere in the program.

I'm not an expert, I've just made the report because I've tried out valgrind on
a bigger program and then noticed it also complained about this example.

To avoid deadlocks, shouldn't mutexes always be locked in the same order?

Otherwise thread 1 locks the first mutex, thread 2 locks the second, and now
both threads are waiting to lock the remaining one.

My guess, without having looked at the implementation of std::lock, is that the
algorithm uses try_lock, locking and unlocking the mutexes as needed until it
manages to lock both, in order to avoid deadlock.
I suppose this algorithm somehow also manages to avoid livelocks.
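As a side note, the try_lock-and-back-off idea guessed at here can be sketched for two mutexes roughly as follows. This is a hypothetical illustration, not the actual libstdc++ algorithm:

```cpp
#include <mutex>
#include <thread>

// Hypothetical sketch of a try-and-back-off algorithm for two mutexes,
// in the spirit of std::lock. The real libstdc++ implementation differs.
template <class M1, class M2>
void lock_both(M1& m1, M2& m2) {
    for (;;) {
        {
            std::unique_lock<M1> l1(m1);  // block on the first mutex
            if (m2.try_lock()) {          // then only *try* the second
                l1.release();             // keep m1 locked on success
                return;
            }
        }                                 // m2 was busy: release m1 ...
        std::this_thread::yield();        // ... and back off briefly
        {
            std::unique_lock<M2> l2(m2);  // retry in the opposite order,
            if (m1.try_lock()) {          // which helps avoid livelock
                l2.release();
                return;
            }
        }
        std::this_thread::yield();
    }
}
```

On return both mutexes are locked, regardless of the order in which another thread passes the same mutexes in.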


But even if std::lock is correct (and this is a false positive in Valgrind),
wouldn't sorting by address be an optimization?

As far as I understand, if the mutexes are always locked in the same order, one
does not need try_lock at all. Even if try_lock is kept, sorting would at least
avoid the dance of both threads locking and unlocking, since the first thread
to arrive manages to lock both.
And if someone locks them elsewhere in a different order, the current algorithm
still applies.

I'm therefore not saying the current algorithm should be dismissed, just asking
whether ordering before applying the algorithm could have any benefit, apart
from the fact that Helgrind would no longer complain, which IMHO would already
be a big win.

Maybe there are known drawbacks to sorting by address, but as mutexes cannot be
moved, the output of sorting should always be the same, and I suppose that
sorting should not cost much, especially compared to locking.


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2020-07-14 20:16 ` federico.kircheis at gmail dot com
@ 2020-07-15  9:48 ` redi at gcc dot gnu.org
  2020-07-15 16:24 ` federico.kircheis at gmail dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2020-07-15  9:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Federico Kircheis from comment #3)
> My guess, without having looked at the implementation of std::lock, is that
> the algorithm uses try_lock to eventually lock/unlock the mutexes and see if
> it manages to lock both, in order to avoid the deadlock.

Right. So the potential deadlock helgrind complains about can't happen.


> But even if std::lock would be correct (and it is a false positive of
> valgrind), wouldn't sorting by address be an optimization?

Not in the uncontended case, no, because it does extra work to sort them. Your
make_lock function doesn't work for more than two arguments so a more complex
algorithm is needed for the general case (it doesn't even work for fewer than
two arguments, although obviously that's trivial to support).

> As far as I understand, if the mutexes are always locked in the same order,
> one does not need to try_lock.

That's not true. As I already said above, another thread could be locking the
same mutexes without using std::lock (in any order), or it could be locking a
different but overlapping set of mutexes.


> I'm therefore not saying the current algorithm should be dismissed, just
> asking if ordering before applying the algorithm could have any benefit.
> Apart from the fact that helgrind would not complain anymore, which IMHO
> would already be a big win.

Changing the locking algorithm to avoid a false positive from a particular tool
seems unwise.

> Maybe there are known drawbacks to sorting by address, but as mutexes cannot
> be moved,

std::mutex cannot be moved, but std::lock and std::scoped_lock don't only work
with standard mutexes. You have no idea whether foo::Mutex or bar::Lockable can
be moved. You can't even assume that the address is meaningful, I could write a
copyable mutex where different copies share a lock:

class CopyableMutex
{
  std::shared_ptr<std::mutex> m = std::make_shared<std::mutex>();
public:
  CopyableMutex() = default;
  CopyableMutex(const CopyableMutex&) = default;

  void lock() { m->lock(); }
  bool try_lock() { return m->try_lock(); }
  void unlock() { m->unlock(); }
};
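To illustrate the point, the class above can be exercised as follows. This is a small demonstration (the `copies_share_lock` helper is mine, not part of the bug report):

```cpp
#include <memory>
#include <mutex>

// Same CopyableMutex as in the comment above, repeated so the snippet is
// self-contained: copies have distinct addresses but share one std::mutex.
class CopyableMutex
{
  std::shared_ptr<std::mutex> m = std::make_shared<std::mutex>();
public:
  CopyableMutex() = default;
  CopyableMutex(const CopyableMutex&) = default;

  void lock() { m->lock(); }
  bool try_lock() { return m->try_lock(); }
  void unlock() { m->unlock(); }
};

// Demonstrates why sorting by address is meaningless for such a type.
bool copies_share_lock()
{
  CopyableMutex m1;
  CopyableMutex m2 = m1;         // distinct object, same underlying mutex
  m1.lock();
  bool shared = !m2.try_lock();  // fails: the shared mutex is already held
  m1.unlock();
  return shared && &m1 != &m2;   // different addresses, one lock
}
```

Sorting `m1` and `m2` by address here would treat them as two mutexes when there is really only one.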


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2020-07-15  9:48 ` redi at gcc dot gnu.org
@ 2020-07-15 16:24 ` federico.kircheis at gmail dot com
  2020-07-15 17:41 ` federico.kircheis at gmail dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: federico.kircheis at gmail dot com @ 2020-07-15 16:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #5 from Federico Kircheis <federico.kircheis at gmail dot com> ---
(In reply to Jonathan Wakely from comment #4)
> (In reply to Federico Kircheis from comment #3)
> > My guess, without having looked at the implementation of std::lock, is that
> > the algorithm uses try_lock to eventually lock/unlock the mutexes and see if
> > it manages to lock both, in order to avoid the deadlock.
> 
> Right. So the potential deadlock helgrind complains about can't happen.

OK, I missed that confirmation.
Mine were only suppositions; I was not sure.
I would normally use a tool to validate such claims in a code base (as my
expertise with multithreading is very limited); unfortunately, it produced the
log I attached in my first report :-)

> > But even if std::lock would be correct (and it is a false positive of
> > valgrind), wouldn't sorting by address be an optimization?
> 
> Not in the uncontended case, no, because it does extra work to sort them.
> Your make_lock function doesn't work for more than two arguments so a more
> complex algorithm is needed for the general case (it doesn't even work for
> fewer than two arguments, although obviously that's trivial to support).

Well, for the generic case I thought a sort based on addresses would suffice.
My "implementation" was just the special case for two arguments.
As it would be abnormal to lock many (say >= 5) mutexes at once, I guessed the
overhead of sorting would be minimal compared to locking.

> > As far as I understand, if the mutexes are always locked in the same order,
> > one does not need to try_lock.
> 
> That's not true. As I already said above, another thread could be locking
> the same mutexes without using std::lock (in any order), or it could be
> locking a different but overlapping set of mutexes.
> 
> 
> > I'm therefore not saying the current algorithm should be dismissed, just
> > asking if ordering before applying the algorithm could have any benefit.
> > Apart from the fact that helgrind would not complain anymore, which IMHO
> > would already be a big win.
> 
> Changing the locking algorithm to avoid a false positive from a particular
> tool seems unwise.

Do you know of other independent tools that can diagnose possible
deadlocks/race conditions?
I'm genuinely curious (sorry if this is getting off-topic).
If Valgrind were the only one complaining, knowing that could have helped avoid
creating this issue (tools working on Windows are fine too).
On the other hand, if other tools complain too, it might make more sense (at
least from my perspective) to accommodate third-party analyzers (while of
course they should be fixed too).

> > Maybe there are known drawbacks to sorting by address, but as mutexes cannot
> > be moved,
> 
> std::mutex cannot be moved, but std::lock and std::scoped_lock don't only
> work with standard mutexes. You have no idea whether foo::Mutex or
> bar::Lockable can be moved. You can't even assume that the address is
> meaningful, I could write a copyable mutex where different copies share a
> lock:
> 
> class CopyableMutex
> {
>   std::shared_ptr<std::mutex> m = std::make_shared<std::mutex>();
> public:
>   CopyableMutex() = default;
>   CopyableMutex(const CopyableMutex&) = default;
> 
>   void lock() { m->lock(); }
>   bool try_lock() { return m->try_lock(); }
>   void unlock() { m->unlock(); }
> };

Oh, I did not realize that the locking classes work with custom types; I guess
that settles the discussion.

I could of course argue that it is possible to sort only for the standard
mutexes, but I see that it is more work than I thought with very little benefit
(especially if other tools do not complain).


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2020-07-15 16:24 ` federico.kircheis at gmail dot com
@ 2020-07-15 17:41 ` federico.kircheis at gmail dot com
  2020-07-15 19:28 ` redi at gcc dot gnu.org
  2021-06-21 13:47 ` redi at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: federico.kircheis at gmail dot com @ 2020-07-15 17:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #6 from Federico Kircheis <federico.kircheis at gmail dot com> ---
For what it's worth, I imagined the implementation for parameters of different
types, and for more or fewer than two arguments, to be similar to:

----
#include <algorithm>
#include <array>
#include <functional>
#include <iterator>
#include <mutex>
#include <numeric>

template <class... Args>
auto sorted_indexes(Args&... args) {
    const void* addresses[] = {&args...};
    std::array<int, std::size(addresses)> indexes{};
    std::iota(std::begin(indexes), std::end(indexes), 0);
    std::sort(std::begin(indexes), std::end(indexes), [&](int i1, int i2){
        return std::less<const void*>()(addresses[i1], addresses[i2]);
    });
    return indexes;
}

template <class... Args>
auto create_lock(Args&... args) {
    auto indexes = sorted_indexes(args...);
    return std::scoped_lock(args[indexes]...); // ??? does not compile
}
----

But it's probably a dead end: args is not an array, and I see no way to use the
computed indexes.


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2020-07-15 17:41 ` federico.kircheis at gmail dot com
@ 2020-07-15 19:28 ` redi at gcc dot gnu.org
  2021-06-21 13:47 ` redi at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2020-07-15 19:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Federico Kircheis from comment #6)
> For what is worth, I imagined the implementation for parameters of different
> type and more or less than two to be similar to
> 
> ----
> template <class...Args>
> auto sorted_indexes(Args&... args) {
>     const void* addresses[] = {&args...};
>     std::array<int,std::size(addresses)> indexes{};
>     std::iota(std::begin(indexes), std::end(indexes), 0);
>     std::sort(std::begin(indexes), std::end(indexes), [&](int i1, int i2){
>         return std::less<void>()(addresses[i1], addresses[i2]);
>     });
>     return indexes;
> }
> 
> 
> template <class...Args>
> auto create_lock(Args&... args){
>     auto indexes = sorted_indexes(args...);
>     return std::scoped_lock(args[indexes]...); // ???
> }
> ----
> 
> But it's probably a dead end, as args is not an array and I see no way to
> use the computed index.

args is not an array and indexes is not a pack.

You could make sorted_indexes a constexpr function that returns a
std::index_sequence holding the sorted indices, then:

template<class T, size_t... I>
auto create_lock_impl(std::index_sequence<I...>, T arg_tuple) {
  return std::scoped_lock(std::get<I>(arg_tuple)...);
}

template<class... Args>
auto create_lock(Args&... args) {
  return create_lock_impl(sorted_indexes(args...), std::tie(args...));
}

But I am still unconvinced it's worth changing anything here.
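As an aside, for the homogeneous case (all arguments are std::mutex) the address-ordering idea can at least be sketched at runtime, sidestepping the index_sequence machinery. This is a hypothetical illustration only, not something libstdc++ does:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <mutex>
#include <vector>

// Hypothetical: lock one or more std::mutex objects in increasing address
// order and hand back the guards. std::less on pointers gives a total
// order; runtime sorting avoids needing the addresses at compile time.
template <class... Mutexes>
std::vector<std::unique_lock<std::mutex>> lock_in_address_order(Mutexes&... ms) {
    static_assert(sizeof...(Mutexes) > 0, "need at least one mutex");
    std::mutex* ptrs[] = {&ms...};
    std::sort(std::begin(ptrs), std::end(ptrs), std::less<std::mutex*>());
    std::vector<std::unique_lock<std::mutex>> guards;
    guards.reserve(sizeof...(Mutexes));
    for (std::mutex* p : ptrs)
        guards.emplace_back(*p);   // locks each mutex on construction
    return guards;                 // everything unlocks when this dies
}
```

Of course this only works because the arguments are all std::mutex; as noted above, it cannot be generalized to arbitrary Lockable types.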


* [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock
       [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2020-07-15 19:28 ` redi at gcc dot gnu.org
@ 2021-06-21 13:47 ` redi at gcc dot gnu.org
  7 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2021-06-21 13:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Closing, because this is not a bug. Helgrind is warning about something that
isn't a problem. The order in which the locks are acquired by std::lock is
unspecified,
so that the implementation can guarantee no deadlock. The current std::lock
just tries to lock each argument in order, but I'm about to commit a new
implementation which can rotate the arguments before locking them. For such an
implementation, there is no way the destructor of std::scoped_lock can possibly
know in which order the locks were acquired.

Since all the unlocks happen in immediate succession, one after another, the
window of time where some are unlocked and the others are still locked is small
anyway.
And by using std::scoped_lock to lock and unlock, deadlock avoidance is
guaranteed.

You should suppress or ignore the helgrind warnings about this usage.


end of thread, other threads:[~2021-06-21 13:47 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
     [not found] <bug-89417-4@http.gcc.gnu.org/bugzilla/>
2020-07-12 13:00 ` [Bug libstdc++/89417] helgrind detects a lock order violation inside std::scoped_lock federico.kircheis at gmail dot com
2020-07-13  9:03 ` redi at gcc dot gnu.org
2020-07-14 20:16 ` federico.kircheis at gmail dot com
2020-07-15  9:48 ` redi at gcc dot gnu.org
2020-07-15 16:24 ` federico.kircheis at gmail dot com
2020-07-15 17:41 ` federico.kircheis at gmail dot com
2020-07-15 19:28 ` redi at gcc dot gnu.org
2021-06-21 13:47 ` redi at gcc dot gnu.org
