[PATCH] aio: fix newp->running data race

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] aio: fix newp->running data race
@ 2016-05-04 14:02 Samuel Thibault
  2016-05-18 13:55 ` Samuel Thibault
  2016-07-08 14:28 ` Samuel Thibault
  0 siblings, 2 replies; 6+ messages in thread
From: Samuel Thibault @ 2016-05-04 14:02 UTC (permalink / raw)
  To: GNU C Library

Hello,

As reported by helgrind:

==9363== Possible data race during write of size 4 at 0x6B45D18 by thread #5
==9363== Locks held: 2, at addresses 0x557A1E0 0x6A149E0
==9363==    at 0x5375E0D: __aio_enqueue_request (in /lib/x86_64-linux-gnu/librt-2.21.so)
==9363==    by 0x53764FD: aio_write (in /lib/x86_64-linux-gnu/librt-2.21.so)

which is the write in

  if (result == 0)
    newp->running = running;

==9363== This conflicts with a previous read of size 4 by thread #6
==9363== Locks held: none
==9363==    at 0x53758FE: handle_fildes_io (in /lib/x86_64-linux-gnu/librt-2.21.so)

which is the read in 

	  /* Hopefully this request is marked as running.  */
	  assert (runp->running == allocated);

So what is happening is that __aio_enqueue_request writes newp->running
after having possibly started a thread worker to handle the request,
thus in concurrence with the thread worker, which not only reads
runp->running as shown here, but also writes to it later (and does not
take __aio_requests_mutex since it's already given a request).

But actually when we do start a new thread worker, there is no need
to set newp->running, since it was already set by the "running =
newp->running = allocated;" line along the thread creation.  So to avoid
the race we can just set newp->running in the cases where we do not
start the thread.

Samuel


    	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
    	`running` field of `newp` when a thread was started to process it,
    	since that thread will not take `__aio_requests_mutex`, and the field
    	already has the proper value actually.

diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
index f55570d..faf139d 100644
--- a/sysdeps/pthread/aio_misc.c
+++ b/sysdeps/pthread/aio_misc.c
@@ -453,7 +453,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 		result = 0;
 	    }
 	}
+      else
+	newp->running = running;
     }
+  else
+    newp->running = running;
 
   /* Enqueue the request in the run queue if it is not yet running.  */
   if (running == yes && result == 0)
@@ -466,9 +470,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 	pthread_cond_signal (&__aio_new_request_notification);
     }
 
-  if (result == 0)
-    newp->running = running;
-  else
+  if (result != 0)
     {
       /* Something went wrong.  */
       __aio_free_request (newp);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix newp->running data race
  2016-05-04 14:02 [PATCH] aio: fix newp->running data race Samuel Thibault
@ 2016-05-18 13:55 ` Samuel Thibault
  2016-07-08 14:28 ` Samuel Thibault
  1 sibling, 0 replies; 6+ messages in thread
From: Samuel Thibault @ 2016-05-18 13:55 UTC (permalink / raw)
  To: GNU C Library, drepper, roland

Ping?

Samuel Thibault, on Wed 04 May 2016 16:02:17 +0200, wrote:
> Hello,
> 
> As reported by helgrind:
> 
> ==9363== Possible data race during write of size 4 at 0x6B45D18 by thread #5
> ==9363== Locks held: 2, at addresses 0x557A1E0 0x6A149E0
> ==9363==    at 0x5375E0D: __aio_enqueue_request (in /lib/x86_64-linux-gnu/librt-2.21.so)
> ==9363==    by 0x53764FD: aio_write (in /lib/x86_64-linux-gnu/librt-2.21.so)
> 
> which is the write in
> 
>   if (result == 0)
>     newp->running = running;
> 
> ==9363== This conflicts with a previous read of size 4 by thread #6
> ==9363== Locks held: none
> ==9363==    at 0x53758FE: handle_fildes_io (in /lib/x86_64-linux-gnu/librt-2.21.so)
> 
> which is the read in 
> 
> 	  /* Hopefully this request is marked as running.  */
> 	  assert (runp->running == allocated);
> 
> So what is happening is that __aio_enqueue_request writes newp->running
> after having possibly started a thread worker to handle the request,
> thus in concurrence with the thread worker, which not only reads
> runp->running as shown here, but also writes to it later (and does not
> take __aio_requests_mutex since it's already given a request).
> 
> But actually when we do start a new thread worker, there is no need
> to set newp->running, since it was already set by the "running =
> newp->running = allocated;" line along the thread creation.  So to avoid
> the race we can just set newp->running in the cases where we do not
> start the thread.
> 
> Samuel
> 
> 
>     	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
>     	`running` field of `newp` when a thread was started to process it,
>     	since that thread will not take `__aio_requests_mutex`, and the field
>     	already has the proper value actually.
> 
> diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
> index f55570d..faf139d 100644
> --- a/sysdeps/pthread/aio_misc.c
> +++ b/sysdeps/pthread/aio_misc.c
> @@ -453,7 +453,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
>  		result = 0;
>  	    }
>  	}
> +      else
> +	newp->running = running;
>      }
> +  else
> +    newp->running = running;
>  
>    /* Enqueue the request in the run queue if it is not yet running.  */
>    if (running == yes && result == 0)
> @@ -466,9 +470,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
>  	pthread_cond_signal (&__aio_new_request_notification);
>      }
>  
> -  if (result == 0)
> -    newp->running = running;
> -  else
> +  if (result != 0)
>      {
>        /* Something went wrong.  */
>        __aio_free_request (newp);

-- 
Samuel
<c> ya(ka|ma|to)* ca existe une fois sur 2 au japon, c'est facile ;-)
 -+- #ens-mim au japon -+-

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix newp->running data race
  2016-05-04 14:02 [PATCH] aio: fix newp->running data race Samuel Thibault
  2016-05-18 13:55 ` Samuel Thibault
@ 2016-07-08 14:28 ` Samuel Thibault
  2016-07-08 16:05   ` Torvald Riegel
  1 sibling, 1 reply; 6+ messages in thread
From: Samuel Thibault @ 2016-07-08 14:28 UTC (permalink / raw)
  To: GNU C Library

Ping?

Samuel Thibault, on Wed 04 May 2016 16:02:17 +0200, wrote:
> Hello,
> 
> As reported by helgrind:
> 
> ==9363== Possible data race during write of size 4 at 0x6B45D18 by thread #5
> ==9363== Locks held: 2, at addresses 0x557A1E0 0x6A149E0
> ==9363==    at 0x5375E0D: __aio_enqueue_request (in /lib/x86_64-linux-gnu/librt-2.21.so)
> ==9363==    by 0x53764FD: aio_write (in /lib/x86_64-linux-gnu/librt-2.21.so)
> 
> which is the write in
> 
>   if (result == 0)
>     newp->running = running;
> 
> ==9363== This conflicts with a previous read of size 4 by thread #6
> ==9363== Locks held: none
> ==9363==    at 0x53758FE: handle_fildes_io (in /lib/x86_64-linux-gnu/librt-2.21.so)
> 
> which is the read in 
> 
> 	  /* Hopefully this request is marked as running.  */
> 	  assert (runp->running == allocated);
> 
> So what is happening is that __aio_enqueue_request writes newp->running
> after having possibly started a thread worker to handle the request,
> thus in concurrence with the thread worker, which not only reads
> runp->running as shown here, but also writes to it later (and does not
> take __aio_requests_mutex since it's already given a request).
> 
> But actually when we do start a new thread worker, there is no need
> to set newp->running, since it was already set by the "running =
> newp->running = allocated;" line along the thread creation.  So to avoid
> the race we can just set newp->running in the cases where we do not
> start the thread.
> 
> Samuel
> 
> 
>     	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
>     	`running` field of `newp` when a thread was started to process it,
>     	since that thread will not take `__aio_requests_mutex`, and the field
>     	already has the proper value actually.
> 
> diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
> index f55570d..faf139d 100644
> --- a/sysdeps/pthread/aio_misc.c
> +++ b/sysdeps/pthread/aio_misc.c
> @@ -453,7 +453,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
>  		result = 0;
>  	    }
>  	}
> +      else
> +	newp->running = running;
>      }
> +  else
> +    newp->running = running;
>  
>    /* Enqueue the request in the run queue if it is not yet running.  */
>    if (running == yes && result == 0)
> @@ -466,9 +470,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
>  	pthread_cond_signal (&__aio_new_request_notification);
>      }
>  
> -  if (result == 0)
> -    newp->running = running;
> -  else
> +  if (result != 0)
>      {
>        /* Something went wrong.  */
>        __aio_free_request (newp);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix newp->running data race
  2016-07-08 14:28 ` Samuel Thibault
@ 2016-07-08 16:05   ` Torvald Riegel
  2016-08-17 13:36     ` Samuel Thibault
  0 siblings, 1 reply; 6+ messages in thread
From: Torvald Riegel @ 2016-07-08 16:05 UTC (permalink / raw)
  To: Samuel Thibault; +Cc: GNU C Library

On Fri, 2016-07-08 at 16:28 +0200, Samuel Thibault wrote:
> Ping?

I've just looked quickly at the patch.  Fixes for concurrency issues
should be accompanied with detailed comments, so that other developers
can easily grasp how synchronization is supposed to work.  In this case,
for example, it should be documented why runp->running can't be accessed
concurrently from several threads.
Doing this properly may add a lot of comments to the code if there are
missing elsewhere, so we might need to find some middle ground if this
patch is meant to become part of 2.24.  But it will need something.  See
the concurrency notes in ./stdlib/cxa_thread_atexit_impl.c for an
example, or the code comments in nptl/sem_waitcommon.c.

> Samuel Thibault, on Wed 04 May 2016 16:02:17 +0200, wrote:
> > Hello,
> > 
> > As reported by helgrind:
> > 
> > ==9363== Possible data race during write of size 4 at 0x6B45D18 by thread #5
> > ==9363== Locks held: 2, at addresses 0x557A1E0 0x6A149E0
> > ==9363==    at 0x5375E0D: __aio_enqueue_request (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > ==9363==    by 0x53764FD: aio_write (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > 
> > which is the write in
> > 
> >   if (result == 0)
> >     newp->running = running;
> > 
> > ==9363== This conflicts with a previous read of size 4 by thread #6
> > ==9363== Locks held: none
> > ==9363==    at 0x53758FE: handle_fildes_io (in /lib/x86_64-linux-gnu/librt-2.21.so)
> > 
> > which is the read in 
> > 
> > 	  /* Hopefully this request is marked as running.  */
> > 	  assert (runp->running == allocated);
> > 
> > So what is happening is that __aio_enqueue_request writes newp->running
> > after having possibly started a thread worker to handle the request,
> > thus in concurrence with the thread worker, which not only reads
> > runp->running as shown here, but also writes to it later (and does not
> > take __aio_requests_mutex since it's already given a request).
> > 
> > But actually when we do start a new thread worker, there is no need
> > to set newp->running, since it was already set by the "running =
> > newp->running = allocated;" line along the thread creation.  So to avoid
> > the race we can just set newp->running in the cases where we do not
> > start the thread.
> > 
> > Samuel
> > 
> > 
> >     	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
> >     	`running` field of `newp` when a thread was started to process it,
> >     	since that thread will not take `__aio_requests_mutex`, and the field
> >     	already has the proper value actually.
> > 
> > diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
> > index f55570d..faf139d 100644
> > --- a/sysdeps/pthread/aio_misc.c
> > +++ b/sysdeps/pthread/aio_misc.c
> > @@ -453,7 +453,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
> >  		result = 0;
> >  	    }
> >  	}
> > +      else
> > +	newp->running = running;
> >      }
> > +  else
> > +    newp->running = running;
> >  
> >    /* Enqueue the request in the run queue if it is not yet running.  */
> >    if (running == yes && result == 0)
> > @@ -466,9 +470,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
> >  	pthread_cond_signal (&__aio_new_request_notification);
> >      }
> >  
> > -  if (result == 0)
> > -    newp->running = running;
> > -  else
> > +  if (result != 0)
> >      {
> >        /* Something went wrong.  */
> >        __aio_free_request (newp);



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix newp->running data race
  2016-07-08 16:05   ` Torvald Riegel
@ 2016-08-17 13:36     ` Samuel Thibault
  2017-08-03 16:42       ` Samuel Thibault
  0 siblings, 1 reply; 6+ messages in thread
From: Samuel Thibault @ 2016-08-17 13:36 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: GNU C Library

Hello,

Torvald Riegel, on Fri 08 Jul 2016 18:05:15 +0200, wrote:
> On Fri, 2016-07-08 at 16:28 +0200, Samuel Thibault wrote:
> > Ping?
> 
> I've just looked quickly at the patch.  Fixes for concurrency issues
> should be accompanied with detailed comments, so that other developers
> can easily grasp how synchronization is supposed to work.

Ok.  How about this?

Samuel

	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
	`running` field of `newp` when a thread was started to process it,
	since that thread will not take `__aio_requests_mutex`, and the field
	already has the proper value actually.
	* sysdeps/pthread/aio_misc.h: Add a notes about concurrency of
	aio requests.


diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
index f55570d..00b2feb 100644
--- a/sysdeps/pthread/aio_misc.c
+++ b/sysdeps/pthread/aio_misc.c
@@ -435,7 +435,9 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 	  if (result == 0)
 	    /* We managed to enqueue the request.  All errors which can
 	       happen now can be recognized by calls to `aio_return' and
-	       `aio_error'.  */
+	       `aio_error'.  From here, newp must not be modified any more
+	       since a helper thread could be already working on it.
+	       */
 	    ++nthreads;
 	  else
 	    {
@@ -453,7 +455,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 		result = 0;
 	    }
 	}
+      else
+	newp->running = running;
     }
+  else
+    newp->running = running;
 
   /* Enqueue the request in the run queue if it is not yet running.  */
   if (running == yes && result == 0)
@@ -466,9 +472,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 	pthread_cond_signal (&__aio_new_request_notification);
     }
 
-  if (result == 0)
-    newp->running = running;
-  else
+  if (result != 0)
     {
       /* Something went wrong.  */
       __aio_free_request (newp);
diff --git a/sysdeps/pthread/aio_misc.h b/sysdeps/pthread/aio_misc.h
index e042998..66ef3b1 100644
--- a/sysdeps/pthread/aio_misc.h
+++ b/sysdeps/pthread/aio_misc.h
@@ -72,6 +72,18 @@ enum
   done
 };
 
+/* CONCURRENCY NOTES:
+
+   __aio_requests_mutex only protects the request list.  The requests
+   themselves are not protected, since the producer is alone producing it
+   before pushing it to the list, and the consumer is alone working on it after
+   popping from the list.
+
+   Notably, when __aio_create_helper_thread is used, it directly gives the
+   request to the thread, and thus no lock is taken.  Consequently, the
+   producer must not access the request after calling
+   __aio_create_helper_thread since the consumer will immediately work on it
+   without taking __aio_requests_mutex.  */
 
 /* Used to queue requests..  */
 struct requestlist

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] aio: fix newp->running data race
  2016-08-17 13:36     ` Samuel Thibault
@ 2017-08-03 16:42       ` Samuel Thibault
  0 siblings, 0 replies; 6+ messages in thread
From: Samuel Thibault @ 2017-08-03 16:42 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: GNU C Library

Hello,

Ping?

Samuel

Samuel Thibault, on mer. 17 aoÃ»t 2016 14:49:29 +0200, wrote:
> Torvald Riegel, on Fri 08 Jul 2016 18:05:15 +0200, wrote:
> > On Fri, 2016-07-08 at 16:28 +0200, Samuel Thibault wrote:
> > > Ping?
> > 
> > I've just looked quickly at the patch.  Fixes for concurrency issues
> > should be accompanied with detailed comments, so that other developers
> > can easily grasp how synchronization is supposed to work.
> 
> Ok.  How about this?

	* sysdeps/pthread/aio_misc.c (__aio_enqueue_request): Do not write
	`running` field of `newp` when a thread was started to process it,
	since that thread will not take `__aio_requests_mutex`, and the field
	already has the proper value actually.
	* sysdeps/pthread/aio_misc.h: Add a notes about concurrency of
	aio requests.


diff --git a/sysdeps/pthread/aio_misc.c b/sysdeps/pthread/aio_misc.c
index f55570d..00b2feb 100644
--- a/sysdeps/pthread/aio_misc.c
+++ b/sysdeps/pthread/aio_misc.c
@@ -435,7 +435,9 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 	  if (result == 0)
 	    /* We managed to enqueue the request.  All errors which can
 	       happen now can be recognized by calls to `aio_return' and
-	       `aio_error'.  */
+	       `aio_error'.  From here, newp must not be modified any more
+	       since a helper thread could be already working on it.
+	       */
 	    ++nthreads;
 	  else
 	    {
@@ -453,7 +455,11 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 		result = 0;
 	    }
 	}
+      else
+	newp->running = running;
     }
+  else
+    newp->running = running;
 
   /* Enqueue the request in the run queue if it is not yet running.  */
   if (running == yes && result == 0)
@@ -466,9 +472,7 @@ __aio_enqueue_request (aiocb_union *aiocbp, int operation)
 	pthread_cond_signal (&__aio_new_request_notification);
     }
 
-  if (result == 0)
-    newp->running = running;
-  else
+  if (result != 0)
     {
       /* Something went wrong.  */
       __aio_free_request (newp);
diff --git a/sysdeps/pthread/aio_misc.h b/sysdeps/pthread/aio_misc.h
index e042998..66ef3b1 100644
--- a/sysdeps/pthread/aio_misc.h
+++ b/sysdeps/pthread/aio_misc.h
@@ -72,6 +72,18 @@ enum
   done
 };
 
+/* CONCURRENCY NOTES:
+
+   __aio_requests_mutex only protects the request list.  The requests
+   themselves are not protected, since the producer is alone producing it
+   before pushing it to the list, and the consumer is alone working on it after
+   popping from the list.
+
+   Notably, when __aio_create_helper_thread is used, it directly gives the
+   request to the thread, and thus no lock is taken.  Consequently, the
+   producer must not access the request after calling
+   __aio_create_helper_thread since the consumer will immediately work on it
+   without taking __aio_requests_mutex.  */
 
 /* Used to queue requests..  */
 struct requestlist

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-03 16:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-04 14:02 [PATCH] aio: fix newp->running data race Samuel Thibault
2016-05-18 13:55 ` Samuel Thibault
2016-07-08 14:28 ` Samuel Thibault
2016-07-08 16:05   ` Torvald Riegel
2016-08-17 13:36     ` Samuel Thibault
2017-08-03 16:42       ` Samuel Thibault

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).