public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* RE: offline elfutils processing committed
@ 2006-11-02 16:42 Stone, Joshua I
  0 siblings, 0 replies; 20+ messages in thread
From: Stone, Joshua I @ 2006-11-02 16:42 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Wednesday, November 01, 2006 2:22 PM, Frank Ch. Eigler wrote:
> Hi -
> 
> On Wed, Nov 01, 2006 at 02:02:23PM -0800, Stone, Joshua I wrote:
> 
>> [...]  The easy-out answer is that stp_random_pm deals with ints,
>> and I needed the randomization to be 64-bit.  [...]
> 
> Yeah, might as well upgrade the runtime's random_pm function.

Upgrade?  Or create a new one as a 64-bit alternative?

>> I also thought that using the kernel's get_random_bytes would yield
>> better randomization [...]
> 
> The problem there is that this kernel facility uses locks of its own,
> which would constitute another probing hazard area.

I think it's ok in this case.  It helps that we know what context we're
calling from -- timers are run in soft-irqs.  The locks via
get_random_bytes use irqsave, so I don't think there's any way for us to
trap here.

>> On a related note, you broke the randomization on jiffies timers when
>> used as ms.  [...]
> 
> Sorry about that.  I probably broke one or two other things by
> mistake -- or out of spite! :-)

Yes, spite would be just like you... ;)

> How about this: rework enter_hrtimer_probe, so that the
> timer expiration computation is done before the
> common_probe_entryfn_prologue.  "But you can't have statements so
> early!" I hear you cry.  Ahem!  Yes, you can - if it's nested within a
> new block.  So:
> 
>   enter_hrtimer_probe (...) {
>     timer->expires = [...];
>     { /* <<<< */
>        ... prologue etc.
> 
>        ... epilogue
>     }
>     return HRTIMER_RESTART;
>  }

Ok, and we'll need a similar thing for enter_timer_probe.  I'm worried
about this introducing races in shutdown though.  If we're trying to
remove the timers, then the timer body shouldn't restart itself.  I
suppose I could check for STAP_SESSION_STARTING/_RUNNING before
reloading the timer -- but we'll also need to change it so that when
init fails the cleanup takes us out of STARTING mode.


Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-07 16:39         ` Martin Hunt
@ 2006-11-08  1:26           ` Frank Ch. Eigler
  0 siblings, 0 replies; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-08  1:26 UTC (permalink / raw)
  To: Martin Hunt; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 439 bytes --]

Hi -

hunt wrote:
> [...]
> Plan B involves having the module detect staprun's death. Easiest way
> would probably be to add in to the worker thread poll loop which would
> mean it would detect that staprun had been killed within .01 seconds.

I actually don't see that there need to be a big rush to have the
probe module respond - definitely not to the point of adding any
measurable amount of system load dedicated to the task.

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-07 16:17       ` Frank Ch. Eigler
@ 2006-11-07 16:39         ` Martin Hunt
  2006-11-08  1:26           ` Frank Ch. Eigler
  0 siblings, 1 reply; 20+ messages in thread
From: Martin Hunt @ 2006-11-07 16:39 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Tue, 2006-11-07 at 10:43 -0500, Frank Ch. Eigler wrote:
> Martin Hunt <hunt@redhat.com> writes:
> 
> > Right now I have staprun getting SIGKILL from __oom_kill_task and it
> > signals the end probe functions [...]
> 
> How, when by its nature, no process can hook SIGKILL.

Doh! You are right of course. Must remember to drink caffeine in the
mornings before posting.

I checked and oom does use SIGKILL, but staprun only handles SIGINT and
SIGTERM.  

Plan B involves having the module detect staprun's death. Easiest way
would probably be to add in to the worker thread poll loop which would
mean it would detect that staprun had been killed within .01 seconds.

Martin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-07 15:43     ` Martin Hunt
@ 2006-11-07 16:17       ` Frank Ch. Eigler
  2006-11-07 16:39         ` Martin Hunt
  0 siblings, 1 reply; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-07 16:17 UTC (permalink / raw)
  To: Martin Hunt; +Cc: systemtap

Martin Hunt <hunt@redhat.com> writes:

> Right now I have staprun getting SIGKILL from __oom_kill_task and it
> signals the end probe functions [...]

How, when by its nature, no process can hook SIGKILL.

- FChE

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-07 15:13   ` Frank Ch. Eigler
@ 2006-11-07 15:43     ` Martin Hunt
  2006-11-07 16:17       ` Frank Ch. Eigler
  0 siblings, 1 reply; 20+ messages in thread
From: Martin Hunt @ 2006-11-07 15:43 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Tue, 2006-11-07 at 09:34 -0500, Frank Ch. Eigler wrote:
> Hi -
> 
> hunt wrote:
> 
> > [...]  What happens when systemtap's allocations succeed, but leave
> > the system in a low memory state such that other applications
> > trigger the oom killer when they try to allocate memory.  In this
> > case, we want staprun and the systemtap module to be first to be
> > killed. [...]
> 
> Since the systemtap module rather than staprun owns most of the
> memory, and because by its nature the module reacts relatively slowly
> to staprun's demise, biasing staprun for OOM targeting may not
> meaningfully assist the system in a time of need.

Right now I have staprun getting SIGKILL from __oom_kill_task and it
signals the end probe functions to run before unloading the module. If
we decide this is too slow a reaction, we can always just unload the
module immediately. We certainly don't want to depend on the code in the
module that periodically checks for staprun's existence.

> Also, preferring to kill staprun/etc. under such conditions is not
> obviously correct.  One might argue that once a systemtap script is
> running, it deserves to be kept alive no less than any other process:
> it may be running precisely because the sysadmin wanted to monitor the
> system.  Heck, it might be in the middle of debugging excessive memory
> consumption problems.

That's a good point, although it is hard to imagine sysadmins would
often prefer to trust oom-killer to randomly kill processes rather than
remove systemtap scripts.  Perhaps we need a command line option to set
that. 

Martin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-07  6:36 ` Martin Hunt
@ 2006-11-07 15:13   ` Frank Ch. Eigler
  2006-11-07 15:43     ` Martin Hunt
  0 siblings, 1 reply; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-07 15:13 UTC (permalink / raw)
  To: Martin Hunt; +Cc: systemtap

Hi -

hunt wrote:

> [...]  What happens when systemtap's allocations succeed, but leave
> the system in a low memory state such that other applications
> trigger the oom killer when they try to allocate memory.  In this
> case, we want staprun and the systemtap module to be first to be
> killed. [...]

Since the systemtap module rather than staprun owns most of the
memory, and because by its nature the module reacts relatively slowly
to staprun's demise, biasing staprun for OOM targeting may not
meaningfully assist the system in a time of need.

Also, preferring to kill staprun/etc. under such conditions is not
obviously correct.  One might argue that once a systemtap script is
running, it deserves to be kept alive no less than any other process:
it may be running precisely because the sysadmin wanted to monitor the
system.  Heck, it might be in the middle of debugging excessive memory
consumption problems.

- FChE

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: offline elfutils processing committed
  2006-11-06 22:41 Stone, Joshua I
  2006-11-07  4:37 ` Roland McGrath
@ 2006-11-07  6:36 ` Martin Hunt
  2006-11-07 15:13   ` Frank Ch. Eigler
  1 sibling, 1 reply; 20+ messages in thread
From: Martin Hunt @ 2006-11-07  6:36 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Frank Ch. Eigler, systemtap

On Mon, 2006-11-06 at 14:15 -0800, Stone, Joshua I wrote:
> On Monday, November 06, 2006 1:18 PM, Martin Hunt wrote:
> > The point is damage control.  Systemtap allocates too much memory and
> > oom killer gets active, the first thing it will kill is staprun and
> > that should unload the module (but this seems broken at the moment). 
> > So we haven't really hurt the system.
> 
> The goal is fine, but I don't think this accomplishes it.  My
> understanding is that __alloc_pages will keep calling OOM until it is
> able to satisfy the request -- thus the module is blocked waiting for
> memory.  The process might end up something like:
> 
> stap module: allocate lots of memory
> __alloc_pages: Not enough memory -> OOM kill something (staprun)
> __alloc_pages: Still not enough memory -> OOM kill other stuff
> __alloc_pages: Yay, now we have enough memory!
> stap module: got some memory
> stap module: Oops, staprun is gone, better exit...

There are 2 different, but related problems. The one you describe is
easily fixed by using the GFP_NORETRY flag on our allocs.  The second
problem is the one I was trying to describe. What happens when
systemtap's allocations succeed, but leave the system in a low memory
state such that other applications trigger the oom killer when they try
to allocate memory.  In this case, we want staprun and the systemtap
module to be first to be killed. I haven't looked at the sources, but it
seems unlikely to me that the oom killer would be so fast that it would
kill staprun and then kill other processes before the module is also
killed and frees it's memory.

Martin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: offline elfutils processing committed
  2006-11-06 22:41 Stone, Joshua I
@ 2006-11-07  4:37 ` Roland McGrath
  2006-11-07  6:36 ` Martin Hunt
  1 sibling, 0 replies; 20+ messages in thread
From: Roland McGrath @ 2006-11-07  4:37 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: Martin Hunt, Frank Ch. Eigler, systemtap

> The goal is fine, but I don't think this accomplishes it.  My
> understanding is that __alloc_pages will keep calling OOM until it is
> able to satisfy the request -- thus the module is blocked waiting for
> memory.  

It looks like that can be avoided with the __GFP_NORETRY flag.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: offline elfutils processing committed
@ 2006-11-06 22:41 Stone, Joshua I
  2006-11-07  4:37 ` Roland McGrath
  2006-11-07  6:36 ` Martin Hunt
  0 siblings, 2 replies; 20+ messages in thread
From: Stone, Joshua I @ 2006-11-06 22:41 UTC (permalink / raw)
  To: Martin Hunt, Frank Ch. Eigler; +Cc: systemtap

On Monday, November 06, 2006 1:18 PM, Martin Hunt wrote:
> The point is damage control.  Systemtap allocates too much memory and
> oom killer gets active, the first thing it will kill is staprun and
> that should unload the module (but this seems broken at the moment). 
> So we haven't really hurt the system.

The goal is fine, but I don't think this accomplishes it.  My
understanding is that __alloc_pages will keep calling OOM until it is
able to satisfy the request -- thus the module is blocked waiting for
memory.  The process might end up something like:

stap module: allocate lots of memory
__alloc_pages: Not enough memory -> OOM kill something (staprun)
__alloc_pages: Still not enough memory -> OOM kill other stuff
__alloc_pages: Yay, now we have enough memory!
stap module: got some memory
stap module: Oops, staprun is gone, better exit...

Ouch...


Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-06 21:52         ` Martin Hunt
@ 2006-11-06 22:15           ` Frank Ch. Eigler
  0 siblings, 0 replies; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-06 22:15 UTC (permalink / raw)
  To: Martin Hunt; +Cc: systemtap


hunt wrote:

> [...]
> > What would be the harm in that?  IOW, what kind of concurrency problem
> > is the lock intended to prevent?
> 
> Module loads (and eventually unloads) will update the list of modules
> and their symbols. This cannot happen while a kprobe is looking through
> the list resolving some symbol.

OTOH if a module load is in progress, kprobes would be locked out.
Have you characterized how long such a duration could be?  (At last
it's mostly moot once the relocation call is taken out of its current
homestead in the $variable fetcher functions.)

> > > Regarding OOM killer, shouldn't we just set oom_adj to something
> > > like +15?  That would put us at the top of the victims list [...]
> > 
> > Who is that "us"?  The staprun process?  It's too late by then [...]
> Yeah, "us" means staprun and the systemtap module.
> 
> The point is damage control.  Systemtap allocates too much memory and
> oom killer gets active, the first thing it will kill is staprun and that
> should unload the module [...]

But it's too late by then.  Most or all the memory allocation attempts
take place in one big lump during one short interval, between checks
for the continued existence of the staprun process.  So if the
attempts are too aggressive, sure staprun will be zapped, but so will
many other programs, by the time the probe module would notice.

- FChE

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-06 21:18       ` Frank Ch. Eigler
@ 2006-11-06 21:52         ` Martin Hunt
  2006-11-06 22:15           ` Frank Ch. Eigler
  0 siblings, 1 reply; 20+ messages in thread
From: Martin Hunt @ 2006-11-06 21:52 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Mon, 2006-11-06 at 15:55 -0500, Frank Ch. Eigler wrote:
> Martin Hunt <hunt@redhat.com> writes:

> > I put the spinlock in at the last minute because I was worried that
> > someone might be tempted to use it to try to resolve addresses
> > during a kprobe. [...]
> 
> What would be the harm in that?  IOW, what kind of concurrency problem
> is the lock intended to prevent?

Module loads (and eventually unloads) will update the list of modules
and their symbols. This cannot happen while a kprobe is looking through
the list resolving some symbol.


> 
> > Regarding OOM killer, shouldn't we just set oom_adj to something
> > like +15?  That would put us at the top of the victims list [...]
> 
> Who is that "us"?  The staprun process?  It's too late by then - the
> module will not notice that staprun is gone until after the module's
> allocation attempts have hurt the system.

Yeah, "us" means staprun and the systemtap module.

The point is damage control.  Systemtap allocates too much memory and
oom killer gets active, the first thing it will kill is staprun and that
should unload the module (but this seems broken at the moment).  So we
haven't really hurt the system.

Martin


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-06 20:55     ` Martin Hunt
@ 2006-11-06 21:18       ` Frank Ch. Eigler
  2006-11-06 21:52         ` Martin Hunt
  0 siblings, 1 reply; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-06 21:18 UTC (permalink / raw)
  To: Martin Hunt; +Cc: systemtap

Martin Hunt <hunt@redhat.com> writes:

> [...]
> > It however exposes two problems with the newest symbol table code.
> > First, the new _stp_module_lookup function holds a spinlock on its
> > data structure, which could conceivably block.  
> 
> Is there some reason why the $target variable must be looked up each
> time it is accessed?  Sounds slow.

It was the simplest functional expedient.

> I put the spinlock in at the last minute because I was worried that
> someone might be tempted to use it to try to resolve addresses
> during a kprobe. [...]

What would be the harm in that?  IOW, what kind of concurrency problem
is the lock intended to prevent?

> [...]  Moving the allocations into .data or .bss sections could
> still trigger OOM situations.

Yes, however, the difference is that even if the allocation size were
configurable by a user, it would be expressed in obvious units
(megabytes, rather than array rows).

> [...] I made this argument a year ago and was told to rewrite to
> allocate memory in small chunks with kmalloc. [...]

You will recall that, at the time, we were experiencing problems with
probing the proximity of the vmalloc fault handler path.

> Regarding OOM killer, shouldn't we just set oom_adj to something
> like +15?  That would put us at the top of the victims list [...]

Who is that "us"?  The staprun process?  It's too late by then - the
module will not notice that staprun is gone until after the module's
allocation attempts have hurt the system.

- FChE

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-04  3:29   ` Frank Ch. Eigler
  2006-11-04  4:05     ` Roland McGrath
@ 2006-11-06 20:55     ` Martin Hunt
  2006-11-06 21:18       ` Frank Ch. Eigler
  1 sibling, 1 reply; 20+ messages in thread
From: Martin Hunt @ 2006-11-06 20:55 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Roland McGrath, systemtap

On Fri, 2006-11-03 at 18:43 -0500, Frank Ch. Eigler wrote:

> 
> Yup, now done, at least an easy way (emitting an inline query every
> time the relocatable $target variable is accessed).
> 
> It however exposes two problems with the newest symbol table code.
> First, the new _stp_module_lookup function holds a spinlock on its
> data structure, which could conceivably block.  

Is there some reason why the $target variable must be looked up each
time it is accessed?  Sounds slow.

I put the spinlock in at the last minute because I was worried that
someone might be tempted to use it to try to resolve addresses during a
kprobe. I don't think it can deadlock.


> The second problem is memory allocation.  Like ordinary associative
> arrays, the new symbol table also appears to be allocated using what
> comes down to kmalloc.  Unfortunately, with large number of entries of
> either kind, this allocation can fail, or trigger OOM handling.
> 
> The runtime needs to learn to either rely more (perhaps exclusively)
> on data segment allocation (.data or .bss arrays), or needs to make
> sure that it fails in a clean way if kernel free memory happens to be
> short.  This is essential.  (To see why, try running a script with
> -DMAXMAPENTRIES=999999.)

Moving the allocations into .data or .bss sections could still trigger
OOM situations.  You just simplify things some be having one large
allocation. I made this argument a year ago and was told to rewrite to
allocate memory in small chunks with kmalloc.  Now we are back to large
chunks with vmalloc?  I'm fine with that, but let's be clear we agree on
why we are doing this.

Regarding OOM killer, shouldn't we just set oom_adj to something like
+15?  That would put us at the top of the victims list so avoid the
situation where loading a systemtap module puts the system in a low
memory state which causes the oom killer to start ramdomly killing off
processes.

Martin



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-04  4:05     ` Roland McGrath
@ 2006-11-04 14:27       ` Frank Ch. Eigler
  0 siblings, 0 replies; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-04 14:27 UTC (permalink / raw)
  To: Roland McGrath; +Cc: systemtap


Roland McGrath <roland@redhat.com> writes:

> [...]
> [...]
> > idea anyway for performance reasons) we may need to relocate $target
> > addresses once during initialization, and keep a table.
> 
> I am a little shocked that you were ever doing anything different.  

Now now, we have been doing this for about ... six hours.  For about
two days before that, we were using unrelocated offline addresses
(which is wrong).  Before then, we had been in "online" mode, where
those addresses were literal constants.

- FChE

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-04  3:29   ` Frank Ch. Eigler
@ 2006-11-04  4:05     ` Roland McGrath
  2006-11-04 14:27       ` Frank Ch. Eigler
  2006-11-06 20:55     ` Martin Hunt
  1 sibling, 1 reply; 20+ messages in thread
From: Roland McGrath @ 2006-11-04  4:05 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Martin Hunt, systemtap

> It would be nice if we didn't have to know ahead of time which modules
> we'd need, that is, if elfutils were to pass us placeholder structs
> with nothing but names, to get it to fetch module contents only when
> we query further.

That is already how it works.  The offline module reporting is surely
somewhat slower than the live kernel module enumeration, because it does
the equivalent of find /lib/modules/foo -name '*.ko' rather than the
equivalent of cat /proc/modules.  But the big slowdown I was referring to
is the actual loading of the DWARF info.  That doesn't happen until you
request it.  If you only ever use dwfl_module_* calls to get DWARF info out
of an individual module, no other module's info will be loaded.  I thought
systemtap was using some of the global iterators like dwfl_nextcu, which
asks for DWARF info from all modules.  To do all-modules matches on source
file names and the like you unavoidably have to load each and every
module's DWARF info.

> idea anyway for performance reasons) we may need to relocate $target
> addresses once during initialization, and keep a table.

I am a little shocked that you were ever doing anything different.  I
always figured that you would collect at translation time the set of
relocation bases you need and have emit_address and the like emit things
like "base27 + 0x123".  At module initialization time each static variable
like base27 gets initialized with its module+section runtime location.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-01 20:37 ` Roland McGrath
@ 2006-11-04  3:29   ` Frank Ch. Eigler
  2006-11-04  4:05     ` Roland McGrath
  2006-11-06 20:55     ` Martin Hunt
  0 siblings, 2 replies; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-04  3:29 UTC (permalink / raw)
  To: Roland McGrath, Martin Hunt; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 1986 bytes --]

Hi -

On Wed, Nov 01, 2006 at 12:03:36PM -0800, Roland McGrath wrote:
> Go team!  When I added the elfutils support for relocatability in August of
> last year, I was really hoping someone would start testing it one day. ;-)

Sorry for the wait.

> Is "somewhat longer" not an understatement?  [...]
> Note that
> dwfl_linux_kernel_report_offline lets you supply a predicate [...]

Right.

> which is the only way I've thought of to reduce the DWARF-loading
> overhead (without changing the way we install it).

It would be nice if we didn't have to know ahead of time which modules
we'd need, that is, if elfutils were to pass us placeholder structs
with nothing but names, to get it to fetch module contents only when
we query further.

But failing that, we can probably make use of the existing predicate
function to cut down the work.

> You didn't change dwflpp::emit_address [...]

Yup, now done, at least an easy way (emitting an inline query every
time the relocatable $target variable is accessed).

It however exposes two problems with the newest symbol table code.
First, the new _stp_module_lookup function holds a spinlock on its
data structure, which could conceivably block.  This either needs
analysis to prove that this cannot happen, or else (what may be a good
idea anyway for performance reasons) we may need to relocate $target
addresses once during initialization, and keep a table.

The second problem is memory allocation.  Like ordinary associative
arrays, the new symbol table also appears to be allocated using what
comes down to kmalloc.  Unfortunately, with large number of entries of
either kind, this allocation can fail, or trigger OOM handling.

The runtime needs to learn to either rely more (perhaps exclusively)
on data segment allocation (.data or .bss arrays), or needs to make
sure that it fails in a clean way if kernel free memory happens to be
short.  This is essential.  (To see why, try running a script with
-DMAXMAPENTRIES=999999.)

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-01 22:22 Stone, Joshua I
@ 2006-11-01 22:57 ` Frank Ch. Eigler
  0 siblings, 0 replies; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-01 22:57 UTC (permalink / raw)
  To: Stone, Joshua I; +Cc: systemtap

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

Hi -

On Wed, Nov 01, 2006 at 02:02:23PM -0800, Stone, Joshua I wrote:

> [...]  The easy-out answer is that stp_random_pm deals with ints,
> and I needed the randomization to be 64-bit.  [...]

Yeah, might as well upgrade the runtime's random_pm function.

> I also thought that using the kernel's get_random_bytes would yield
> better randomization [...]

The problem there is that this kernel facility uses locks of its own,
which would constitute another probing hazard area.


> On a related note, you broke the randomization on jiffies timers when
> used as ms.  [...]

Sorry about that.  I probably broke one or two other things by
mistake -- or out of spite! :-)


> [...]  But, this brings to mind a potential race in the startup,
> with both hrtimers and jiffies-timers.  In both cases, the timer
> only gets restarted if we run the probe, which only happens if the
> state is STAP_SESSION_RUNNING.  If a newly registered timer fires
> off while we're still initializing other code
> (STAP_SESSION_STARTING), the timer won't be restarted, and it will
> be useless for the whole session. [...]

Good catch.  How about this: rework enter_hrtimer_probe, so that the
timer expiration computation is done before the
common_probe_entryfn_prologue.  "But you can't have statements so
early!" I hear you cry.  Ahem!  Yes, you can - if it's nested within a
new block.  So:

  enter_hrtimer_probe (...) {
    timer->expires = [...];
    { /* <<<< */
       ... prologue etc.

       ... epilogue
    }
    return HRTIMER_RESTART;
 }

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: offline elfutils processing committed
@ 2006-11-01 22:22 Stone, Joshua I
  2006-11-01 22:57 ` Frank Ch. Eigler
  0 siblings, 1 reply; 20+ messages in thread
From: Stone, Joshua I @ 2006-11-01 22:22 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

On Tuesday, October 31, 2006 6:13 PM, Frank Ch. Eigler wrote:
> Hi -
> 
> A big commit went in just now [...]

cvsps | diffstat says:
 11 files changed, 1058 insertions(+), 1849 deletions(-)

Big indeed!  I hope the testsuite is up to the task... :)

I'll address the XXX notes you added to the hrtimer stuff:


In hrtimer_derived_probe_group::emit_interval:
// XXX: why not use stp_random_pm instead of this?
(re: code that generates randomization of timer interval)

The easy-out answer is that stp_random_pm deals with ints, and I needed
the randomization to be 64-bit.  I'll concede that it's probably better
to add an stp_random_pm_64.  I also thought that using the kernel's
get_random_bytes would yield better randomization than the pseudo-random
generator in stp_random_pm.  I guess the quality of randomization could
be addressed in the runtime function.

On a related note, you broke the randomization on jiffies timers when
used as ms.  The random value needs to be added *before* the conversion
to jiffies.  It's also an optimization to gate the random-call on
whether a randomization value is present.  I'll make these corrections
shortly.


In hrtimer_derived_probe_group::emit_module_decls:
// presume problem with updating ->expires or something else XXX
(re: the restart_or_not variable)

This was a fix for bz #3278.  The problem was that at the end of a
session, the probe prologue skipped skipped over the probe, and the
timer got restarted with the same expiration.  With this change it only
restarts if the expiration was updated.

But, this brings to mind a potential race in the startup, with both
hrtimers and jiffies-timers.  In both cases, the timer only gets
restarted if we run the probe, which only happens if the state is
STAP_SESSION_RUNNING.  If a newly registered timer fires off while we're
still initializing other code (STAP_SESSION_STARTING), the timer won't
be restarted, and it will be useless for the whole session.

I'm actually surprised we haven't had a problem with this yet, as it's a
pretty simple race condition.  I guess there's not many people using
very short timer intervals.  In any case, I'm not sure of an elegant way
to fix it -- suggestions?


Josh

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: offline elfutils processing committed
  2006-11-01  4:48 Frank Ch. Eigler
@ 2006-11-01 20:37 ` Roland McGrath
  2006-11-04  3:29   ` Frank Ch. Eigler
  0 siblings, 1 reply; 20+ messages in thread
From: Roland McGrath @ 2006-11-01 20:37 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Go team!  When I added the elfutils support for relocatability in August of
last year, I was really hoping someone would start testing it one day. ;-)

Is "somewhat longer" not an understatement?  When I use elfutils'
tests/dwflmodtest with -K (offline kernel, 1389 modules) vs -k (running
kernel, 64 modules) it takes about 17 times longer (mostly i/o).  Note that
dwfl_linux_kernel_report_offline lets you supply a predicate to exclude
uninteresting modules, which is the only way I've thought of to reduce the
DWARF-loading overhead (without changing the way we install it).

You didn't change dwflpp::emit_address, which is used for things like
$global_var addresses.  That needs to emit references to your run-time
section addresses too.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 20+ messages in thread

* offline elfutils processing committed
@ 2006-11-01  4:48 Frank Ch. Eigler
  2006-11-01 20:37 ` Roland McGrath
  0 siblings, 1 reply; 20+ messages in thread
From: Frank Ch. Eigler @ 2006-11-01  4:48 UTC (permalink / raw)
  To: systemtap

[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]

Hi -

A big commit went in just now, as a holloween prank no doubt.  Some
dude named "fche" finally checked in code to switch to elfutils'
"offline" processing mode, which allows generation of instrumentation
for kernels where the specific list of modules and their addresses
need not be known until run-time.  This involves a somewhat longer
elaboration pass, since all possibly-needed modules' dwarf data are
brought in, not just those that are loaded right now on the
translator's host.

This part needs one more bit of code from staprun (hunt) to pass a
module/section/address table to the module.  A limited mockup is
present in the generated code and should be replaced shortly.
(dwarf_derived_probe_group::emit_module_decls).

The new code also significantly improves the code generated for
scripts involving many probes.  (It also drastically simplifies the
translator's own code for this.  Partly unraveling this mess took lots
of time.)  All this builds on but replaces the earlier "probe groups"
effort.  You may find first-time compile time for complex scripts to
be noticeably shorter.  Since it is compatible with the recent caching
code, second-time compile time is near-zero.

Just to finally commit this piece of work, I just disabled some
previously working but little-used code.  I'll plop them back in
shortly: the benchmarking option, and perfmon/mark based probes.

- FChE

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2006-11-07 16:39 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-02 16:42 offline elfutils processing committed Stone, Joshua I
  -- strict thread matches above, loose matches on Subject: below --
2006-11-06 22:41 Stone, Joshua I
2006-11-07  4:37 ` Roland McGrath
2006-11-07  6:36 ` Martin Hunt
2006-11-07 15:13   ` Frank Ch. Eigler
2006-11-07 15:43     ` Martin Hunt
2006-11-07 16:17       ` Frank Ch. Eigler
2006-11-07 16:39         ` Martin Hunt
2006-11-08  1:26           ` Frank Ch. Eigler
2006-11-01 22:22 Stone, Joshua I
2006-11-01 22:57 ` Frank Ch. Eigler
2006-11-01  4:48 Frank Ch. Eigler
2006-11-01 20:37 ` Roland McGrath
2006-11-04  3:29   ` Frank Ch. Eigler
2006-11-04  4:05     ` Roland McGrath
2006-11-04 14:27       ` Frank Ch. Eigler
2006-11-06 20:55     ` Martin Hunt
2006-11-06 21:18       ` Frank Ch. Eigler
2006-11-06 21:52         ` Martin Hunt
2006-11-06 22:15           ` Frank Ch. Eigler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).