public inbox for systemtap@sourceware.org
* Re: 20090521 systemtap meeting notes
       [not found]       ` <4A1BE332.8070302@redhat.com>
@ 2009-05-26 19:06         ` Jim Keniston
  2009-06-01 17:49           ` task_finder holding 'mmap_sem' too long David Smith
From: Jim Keniston @ 2009-05-26 19:06 UTC
  To: David Smith
  Cc: ananth, RedHat_perftools, William Cohen, Roland McGrath, systemtap

On Tue, 2009-05-26 at 07:40 -0500, David Smith wrote:
> Ananth N Mavinakayanahalli wrote:
> > On Fri, May 22, 2009 at 06:56:51PM -0700, Roland McGrath wrote:
> >> (Why is this not on systemtap@?)
> > 
> > (This was a response to the Thursday MoM Will posted to perftools.
> > Should've been on systemtap@)
> > 
> >>> stapio/2796 is trying to acquire lock:
> >>>  (&mm->mmap_sem){++++++}, at: [<e181beab>] register_uprobe+0x24d/0x82a [uprobes]
> >>>
> >>> but task is already holding lock:
> >>>  (&mm->mmap_sem){++++++}, at: [<e18bdfd6>] __stp_utrace_task_finder_target_quiesce+0x211/0x2db [stap_722fa39772a3d7da10b7105c514a76be_1462]
> >> task_finder calls ->mmap_callback with mmap_sem held for reading.  But it
> >> can lead into register_uprobe, which can try to take it for either reading
> >> or writing.  The lockdep complaint about taking it again for reading could
> >> be avoided by using down_read_nested.  But the real problem is when
> >> register_uprobe gets into uprobe_setup_ssol_vma and tries to take it for
> >> writing.

uprobe_setup_ssol_vma() is not called from register_uprobe(), but rather
from uprobe_report_signal() the first time a breakpoint is hit.

> >>
> >> I think the task-finder callback plan just has to get more sophisticated.
> >> Callbacks with a lock like mmap_sem held is kind of dubious for any
> >> quasi-generic API, because of just this kind of complexity.
> > 
> > Maybe that's something for David Smith to take a first stab at. David?
> 
> Hmm.  Looking back through the task_finder code, I believe the mmap_sem
> is being held so that the vma list doesn't get deleted from underneath
> the task_finder.  However, I'm not sure that can really happen in the
> cases where it is done.  It might be possible that calling
> 'get_task_mm()' would be enough here.
> 
> It looks like the task_finder runs callbacks with mmap_sem held in 2 places:
> 
> 1) When initially attaching to an "interesting" thread, it gets stopped.
>  In the quiesce handler, the mmap callbacks are run for vma's that
> existed before task_finder attached to it.  (This is only done for the
> thread group leader.)  The entire vma list is processed in this manner.
> 
> Since the thread is stopped, how worried should the task_finder be that
> another thread in the same thread group might modify mm->map?

If you're introducing (say) 1000 probes into an existing multithreaded
app, probe #1 could get hit by one thread (thus triggering
uprobe_setup_ssol_vma()) while later probes are still being registered.
I don't see how that causes a deadlock, though.  Seems like
uprobe_setup_ssol_vma() would just block (NOT holding the
uprobe_process->rwsem, BTW) until task_finder released mmap_sem.

> 
> 2) At syscall exit, if the call is mmap or mmap2, the callbacks are
> called on the new vma.  In this case it would be possible to hold
> mmap_sem, get the information needed out of the new vma, release
> mmap_sem, then call the callbacks.
> 

Jim


* task_finder holding 'mmap_sem' too long
  2009-05-26 19:06         ` 20090521 systemtap meeting notes Jim Keniston
@ 2009-06-01 17:49           ` David Smith
From: David Smith @ 2009-06-01 17:49 UTC
  To: Jim Keniston
  Cc: ananth, William Cohen, Roland McGrath, systemtap

>> On Tue, 2009-05-26 at 07:40 -0500, David Smith wrote:
>> Hmm.  Looking back through the task_finder code, I believe the mmap_sem
>> is being held so that the vma list doesn't get deleted from underneath
>> the task_finder.  However, I'm not sure that can really happen in the
>> cases where it is done.  It might be possible that calling
>> 'get_task_mm()' would be enough here.
>>
>> It looks like the task_finder runs callbacks with mmap_sem held in 2 places:
>>
>> 1) When initially attaching to an "interesting" thread, it gets stopped.
>>  In the quiesce handler, the mmap callbacks are run for vma's that
>> existed before task_finder attached to it.  (This is only done for the
>> thread group leader.)  The entire vma list is processed in this manner.
>>
>> 2) At syscall exit, if the call is mmap or mmap2, the callbacks are
>> called on the new vma.  In this case it would be possible to hold
>> mmap_sem, get the information needed out of the new vma, release
>> mmap_sem, then call the callbacks.

After a bit of work, I've fixed these 2 issues (the fixes are in commits
9b59029 and bec8cf6 for the curious).  The task_finder no longer holds
the mmap_sem while making callbacks.

In case 1), the new code grabs the mmap_sem, caches information about
each vma, releases the mmap_sem, then makes the callbacks.
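
For readers who want the shape of that pattern, here is a minimal
userspace sketch.  It is not the code from the commits above.  Every
name in it (toy_rwsem, toy_vma, toy_mm, toy_run_mmap_callbacks, and so
on) is hypothetical, and the "semaphore" is a single-threaded model that
only tracks state, so a would-be self-deadlock shows up as a failed
trylock rather than a hang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy, single-threaded model of a reader/writer semaphore such as
 * mmap_sem.  It only tracks state; it is NOT a real lock. */
struct toy_rwsem { int readers; int writer; };

static void toy_down_read(struct toy_rwsem *s) { assert(!s->writer); s->readers++; }
static void toy_up_read(struct toy_rwsem *s) { assert(s->readers > 0); s->readers--; }

/* Returns 1 if the write side was taken, 0 if it is unavailable. */
static int toy_down_write_trylock(struct toy_rwsem *s)
{
    if (s->readers || s->writer)
        return 0;
    s->writer = 1;
    return 1;
}

static void toy_up_write(struct toy_rwsem *s) { assert(s->writer); s->writer = 0; }

/* Hypothetical stand-in for an entry in the vma list. */
struct toy_vma { unsigned long start, end; struct toy_vma *next; };

/* Cached copy handed to callbacks; holds no pointers into the list. */
struct toy_vma_info { unsigned long start, end; };

struct toy_mm { struct toy_rwsem sem; struct toy_vma *vmas; };

typedef void (*toy_mmap_cb)(struct toy_mm *mm,
                            const struct toy_vma_info *info, void *data);

/* Snapshot the list under the read lock, drop the lock, then call back.
 * Returns 0 on success, -1 if the cache cannot be allocated. */
static int toy_run_mmap_callbacks(struct toy_mm *mm, toy_mmap_cb cb, void *data)
{
    struct toy_vma_info *cache = NULL;
    struct toy_vma *v;
    size_t n = 0, i = 0;

    assert(cb != NULL);

    toy_down_read(&mm->sem);
    for (v = mm->vmas; v; v = v->next)
        n++;
    if (n > 0) {
        cache = malloc(n * sizeof(*cache));
        if (!cache) {
            toy_up_read(&mm->sem);
            return -1;
        }
        for (v = mm->vmas; v; v = v->next) {
            cache[i].start = v->start;
            cache[i].end = v->end;
            i++;
        }
    }
    toy_up_read(&mm->sem);

    /* The lock is no longer held, so a callback may take it itself. */
    for (i = 0; i < n; i++)
        cb(mm, &cache[i], data);

    free(cache);
    return 0;
}

struct toy_stats { int calls; int write_locks_taken; unsigned long last_end; };

/* Example callback: tries to take the write side, as a callback that
 * needs to modify the address space would. */
static void toy_example_cb(struct toy_mm *mm,
                           const struct toy_vma_info *info, void *data)
{
    struct toy_stats *st = data;

    st->calls++;
    st->last_end = info->end;
    if (toy_down_write_trylock(&mm->sem)) {
        st->write_locks_taken++;
        toy_up_write(&mm->sem);
    }
}
```

The trade-off is that the cached entries are only a snapshot: once the
lock is dropped, the real list can change before a callback runs, so a
callback has to tolerate information that may already be stale.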

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
