public inbox for systemtap@sourceware.org
* RE: FW: locking timeout error ?
@ 2006-01-19 22:35 Stone, Joshua I
  2006-01-28 13:51 ` Frank Ch. Eigler
  0 siblings, 1 reply; 9+ messages in thread
From: Stone, Joshua I @ 2006-01-19 22:35 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
>> [...] Imagine the performance if the probe handles perhaps 5
>> different locks in its body, which is well within reason.
> 
> If the locking scheme is reorganized as outlined in #2060, by pulling
> all lock/unlock operations to the outermost level of a probe, then the
> overall (rather than per-lock) timeout could be easily bounded.  Plus
> each global variable would be locked at most once per probe handler
> run, rather than around each appearance in an expression.

I really like this idea, because it treats probes as truly atomic - if
you don't get all of the locks you need, don't do anything.
Implementing this will also require full function analysis to propagate
the locks back to the probe.

> Actually, even without that, we could accumulate locking iteration
> counts in a new context variable, and limit the cumulative (rather
> than individual) total to MAXTRYLOCK.

I think this would be a much more meaningful metric, and it should be an
easy thing to implement...


Josh


* Re: FW: locking timeout error ?
  2006-01-19 22:35 FW: locking timeout error ? Stone, Joshua I
@ 2006-01-28 13:51 ` Frank Ch. Eigler
  0 siblings, 0 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2006-01-28 13:51 UTC (permalink / raw)
  To: systemtap


joshua.i.stone wrote:

> > If the locking scheme is reorganized as outlined in #2060, by pulling
> > all lock/unlock operations to the outermost level of a probe [...]
> 
> I really like this idea, because it treats probes as truly atomic [...]

This is now done.

- FChE


* Re: FW: locking timeout error ?
  2006-01-16 20:51 ` Frank Ch. Eigler
  2006-01-16 21:02   ` James Dickens
@ 2006-01-16 22:03   ` Martin Hunt
  1 sibling, 0 replies; 9+ messages in thread
From: Martin Hunt @ 2006-01-16 22:03 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Stone, Joshua I, systemtap

On Mon, 2006-01-16 at 15:50 -0500, Frank Ch. Eigler wrote:
> Hi -
> 
> > [...] Imagine the performance if the probe handles perhaps 5
> > different locks in its body, which is well within reason.
> 
> If the locking scheme is reorganized as outlined in #2060, by pulling
> all lock/unlock operations to the outermost level of a probe, then the
> overall (rather than per-lock) timeout could be easily bounded.  Plus
> each global variable would be locked at most once per probe handler
> run, rather than around each appearance in an expression.

If you reorganize the locks as proposed, you should be able to actually
reduce MAXTRYLOCK because the consequences of a lock timeout will simply
be an increment of the skipped probe counter instead of a fatal error.
Maps are not scalable anyway and pmaps will be unaffected, so I expect
performance will not suffer at all.

Martin
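
A minimal sketch of the all-or-nothing scheme discussed above, in which a
probe takes every lock it needs before its body runs and a failed attempt
only bumps a skipped-probe counter. All names here (var_lock, run_probe,
skipped_count, count_hit) are hypothetical and are not taken from the
systemtap runtime; the lock is a single-threaded stand-in, and the
retry/timeout polling is left out to keep the control flow visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock guarding one script global. */
struct var_lock { bool held; };

static bool var_trylock(struct var_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void var_unlock(struct var_lock *l) { l->held = false; }

/* Hypothetical counter of probe hits dropped because a lock was busy. */
static unsigned long skipped_count;

typedef void (*probe_body_fn)(void *data);

/* Take every lock before running the body. If any lock is busy, release
 * the ones already taken, count the probe as skipped, and do nothing. */
static bool run_probe(struct var_lock **locks, size_t n,
                      probe_body_fn body, void *data)
{
    size_t taken;

    for (taken = 0; taken < n; taken++) {
        if (!var_trylock(locks[taken])) {
            while (taken > 0)
                var_unlock(locks[--taken]);
            skipped_count++;
            return false;
        }
    }

    body(data);

    while (taken > 0)
        var_unlock(locks[--taken]);
    return true;
}

/* Example probe body: counts how many times it actually ran. */
static void count_hit(void *data) { ++*(int *)data; }
```

Because the body never runs with a partial set of locks, a timeout can be
treated as a dropped sample rather than a fatal error, which is the
property that would allow a smaller MAXTRYLOCK.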



* RE: FW: locking timeout error ?
@ 2006-01-16 21:07 Stone, Joshua I
  0 siblings, 0 replies; 9+ messages in thread
From: Stone, Joshua I @ 2006-01-16 21:07 UTC (permalink / raw)
  To: James Dickens, Frank Ch. Eigler; +Cc: systemtap

James Dickens wrote:
> perhaps it would be better if systemtap took no locks, other than
> private ones related to systemtap data structures, [...]

AFAIK this is already the case - all of the locks we are talking about
here are 'private' systemtap locks.

Josh


* Re: FW: locking timeout error ?
  2006-01-16 21:02   ` James Dickens
@ 2006-01-16 21:06     ` Frank Ch. Eigler
  0 siblings, 0 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2006-01-16 21:06 UTC (permalink / raw)
  To: James Dickens; +Cc: systemtap

Hi -

On Mon, Jan 16, 2006 at 03:02:20PM -0600, James Dickens wrote:

> [...]  perhaps it would be better if systemtap took no locks, other
> than private ones related to systemtap data structures [...]

It is exactly these locks we are talking about.

- FChE


* Re: FW: locking timeout error ?
  2006-01-16 20:51 ` Frank Ch. Eigler
@ 2006-01-16 21:02   ` James Dickens
  2006-01-16 21:06     ` Frank Ch. Eigler
  2006-01-16 22:03   ` Martin Hunt
  1 sibling, 1 reply; 9+ messages in thread
From: James Dickens @ 2006-01-16 21:02 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Stone, Joshua I, systemtap

On 1/16/06, Frank Ch. Eigler <fche@redhat.com> wrote:
> Hi -
>
> > [...] Imagine the performance if the probe handles perhaps 5
> > different locks in its body, which is well within reason.
>
> If the locking scheme is reorganized as outlined in #2060, by pulling
> all lock/unlock operations to the outermost level of a probe, then the
> overall (rather than per-lock) timeout could be easily bounded.  Plus
> each global variable would be locked at most once per probe handler
> run, rather than around each appearance in an expression.
>
> Actually, even without that, we could accumulate locking iteration
> counts in a new context variable, and limit the cumulative (rather
> than individual) total to MAXTRYLOCK.
>
>
> > Perhaps the better way to approach this is to estimate how long we might
> > expect the person holding the lock to take.  Do we have any
> > microbenchmarks on how long various operations take, like indexing a
> > map?
>
Perhaps it would be better if systemtap took no locks other than
private ones related to systemtap data structures.  Sure, it will be
harder to write, but you can be sure that your script will work long
term.  You would not have to worry about some part of the kernel taking
a lock you depend on and holding it or, worse yet, taking a lock and
then having a probe fire that wants the lock being held.

James Dickens
uadmin.blogspot.com



> Unfortunately, no.  Bug #1884 and #2060 could benefit from the
> development of a microbenchmark suite.
>
>
> - FChE
>


* Re: FW: locking timeout error ?
  2006-01-13  8:05 Stone, Joshua I
@ 2006-01-16 20:51 ` Frank Ch. Eigler
  2006-01-16 21:02   ` James Dickens
  2006-01-16 22:03   ` Martin Hunt
  0 siblings, 2 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2006-01-16 20:51 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: systemtap

Hi -

> [...] Imagine the performance if the probe handles perhaps 5
> different locks in its body, which is well within reason.

If the locking scheme is reorganized as outlined in #2060, by pulling
all lock/unlock operations to the outermost level of a probe, then the
overall (rather than per-lock) timeout could be easily bounded.  Plus
each global variable would be locked at most once per probe handler
run, rather than around each appearance in an expression.

Actually, even without that, we could accumulate locking iteration
counts in a new context variable, and limit the cumulative (rather
than individual) total to MAXTRYLOCK.


> Perhaps the better way to approach this is to estimate how long we might
> expect the person holding the lock to take.  Do we have any
> microbenchmarks on how long various operations take, like indexing a
> map?

Unfortunately, no.  Bug #1884 and #2060 could benefit from the
development of a microbenchmark suite.


- FChE
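
The cumulative limit suggested above could look roughly like the sketch
below: failed lock attempts are counted in a per-probe context, and the
total across all locks is capped at MAXTRYLOCK. The value of MAXTRYLOCK,
the probe_context and sim_lock types, and both function names are
assumptions made for illustration, not the actual runtime code.

```c
#include <stdbool.h>

#define MAXTRYLOCK 1000u  /* assumed value; the thread gives no number */

/* Hypothetical per-probe context carrying the shared retry count. */
struct probe_context { unsigned trylock_count; };

/* Hypothetical simulated lock: busy for the next `busy_polls` attempts. */
struct sim_lock { unsigned busy_polls; bool held; };

static bool sim_trylock(struct sim_lock *l)
{
    if (l->held)
        return false;
    if (l->busy_polls > 0) {
        l->busy_polls--;
        return false;
    }
    l->held = true;
    return true;
}

/* Poll one lock, charging each failed attempt to the probe-wide budget.
 * Returns false once the cumulative count reaches MAXTRYLOCK. */
static bool lock_with_budget(struct probe_context *c, struct sim_lock *l)
{
    while (!sim_trylock(l)) {
        if (++c->trylock_count >= MAXTRYLOCK)
            return false;
        /* a real implementation would delay briefly here */
    }
    return true;
}
```

With this arrangement three locks that each stay busy for 400 polls no
longer cost 1200 attempts: the first two succeed and the third gives up
when the shared count reaches the limit.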


* RE: FW: locking timeout error ?
@ 2006-01-13  8:05 Stone, Joshua I
  2006-01-16 20:51 ` Frank Ch. Eigler
  0 siblings, 1 reply; 9+ messages in thread
From: Stone, Joshua I @ 2006-01-13  8:05 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Frank Ch. Eigler wrote:
>> Actually, I meant to ask you about this.  Is it safe to be waiting so
>> long for a lock if the probe is in interrupt context?  If I
>> understand it, the lock will wait about MAXTRYLOCK*TRYLOCKDELAY
>> nanoseconds = 100 microseconds.  That seems like an awfully long
>> time to poll the lock. 
> 
> The current figures are not meant as definitive.  It may be wise to
> wait less if in_interrupt().  Do you have alternate numbers to
> suggest? 

I'll leave it to the experts to decide what is appropriate.  But even
when not in_interrupt, I think this may be too long.  Do we have
estimates for what a "reasonable" runtime for a probe is?  We can assume
that in the worst case, the lock will almost time out every time, but
still succeed.  Imagine the performance if the probe handles perhaps 5
different locks in its body, which is well within reason.

Perhaps the better way to approach this is to estimate how long we might
expect the person holding the lock to take.  Do we have any
microbenchmarks on how long various operations take, like indexing a
map?

Josh
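
The worst case described above can be put into numbers with a small
sketch. The thread only states that MAXTRYLOCK*TRYLOCKDELAY comes to about
100 microseconds; the individual factors below are assumed values chosen
to give that product, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed factors: only their product (about 100 microseconds) is
 * stated in the thread. */
#define MAXTRYLOCK   1000u  /* assumed: polling attempts per lock */
#define TRYLOCKDELAY 100u   /* assumed: nanoseconds between attempts */

/* Worst-case polling time for one lock that almost times out. */
static uint64_t worst_case_wait_ns(void)
{
    return (uint64_t)MAXTRYLOCK * TRYLOCKDELAY;
}

/* Worst-case polling time for a probe that takes nlocks locks, each of
 * which almost times out but still succeeds. */
static uint64_t worst_case_probe_wait_ns(unsigned nlocks)
{
    return (uint64_t)nlocks * worst_case_wait_ns();
}
```

Under these assumptions a probe handling 5 locks could spend up to 500
microseconds just polling, before doing any useful work.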


* Re: FW: locking timeout error ?
       [not found] <CBDB88BFD06F7F408399DBCF8776B3DC0606B0CA@scsmsx403.amr.corp.intel.com>
@ 2006-01-07  4:18 ` Frank Ch. Eigler
  0 siblings, 0 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2006-01-07  4:18 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: systemtap

Hi -

joshua.i.stone wrote:

> Actually, I meant to ask you about this.  Is it safe to be waiting so
> long for a lock if the probe is in interrupt context?  If I understand
> it, the lock will wait about MAXTRYLOCK*TRYLOCKDELAY nanoseconds = 100
> microseconds.  That seems like an awfully long time to poll the lock.

The current figures are not meant as definitive.  It may be wise to
wait less if in_interrupt().  Do you have alternate numbers to suggest?


- FChE

