* Re: Instrumenting context switches
[not found] <OFB78608CD.65116F14-ON85257236.0073959D-85257236.00738AAF@mck.us.ray.com>
@ 2006-12-01 0:25 ` Dave Sperry
2006-12-01 3:12 ` Perry Cheng
0 siblings, 1 reply; 9+ messages in thread
From: Dave Sperry @ 2006-12-01 0:25 UTC (permalink / raw)
To: perryche; +Cc: systemtap
Perry,
I had no problem running your systemtap scripts on my AMD-686 SMP box
running the same kernel as you list below.
I ran "stap perry.stp -vvvv -g &>perryFoo.txt" and it worked just fine.
You can see the log at:
http://toomanyprojects.org:2000/outbound/perry/perryFoo.txt
The version of systemtap I used is:
SystemTap translator/driver (version 0.5.11 built 2006-11-20)
(Using Red Hat elfutils 0.124 libraries.)
Copyright (C) 2005-2006 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
Dave
* Re: Instrumenting context switches
2006-12-01 0:25 ` Instrumenting context switches Dave Sperry
@ 2006-12-01 3:12 ` Perry Cheng
0 siblings, 0 replies; 9+ messages in thread
From: Perry Cheng @ 2006-12-01 3:12 UTC (permalink / raw)
To: dave_sperry, systemtap
Hi Dave,
I was using version 0.4 and have upgraded to version 0.5.11 like you
(mine's from today though and not 11/20).
However, the new stap can't find the probe points. I suspect that it is
not locating the kernel symbols file.
Do you know how to get the info out of the old /usr/bin/stap and give it
to the new /usr/local/bin/stap?
Perry
* Re: Instrumenting context switches
[not found] <OFBBDFEA36.CA571100-ON85257237.00006111-85257237.0000676B@mck.us.ray.com>
@ 2006-12-01 3:14 ` Dave Sperry
0 siblings, 0 replies; 9+ messages in thread
From: Dave Sperry @ 2006-12-01 3:14 UTC (permalink / raw)
To: perryche; +Cc: systemtap
Hi Perry,
I did not have to do anything to tell stap where to find the symbols.
One thing you can check is that the debug symbols are located where the
src/README file suggests.
I also had to modify the tapsets.cxx file to fix some rt enums:
--- src_orig/tapsets.cxx 2006-11-17 15:35:47.000000000 -0500
+++ src/tapsets.cxx 2006-11-19 19:09:02.000000000 -0500
@@ -4332,13 +4332,13 @@ hrtimer_derived_probe_group::emit_module
 s.op->newline() << "for (i=0; i<" << probes.size() << "; i++) {";
 s.op->newline(1) << "struct stap_hrtimer_probe* stp = & stap_hrtimer_probes [i];";
- s.op->newline() << "hrtimer_init (& stp->hrtimer, CLOCK_MONOTONIC, HRTIMER_REL);";
- s.op->newline() << "stp->hrtimer.function = & enter_hrtimer_probe;";
+ s.op->newline() << "hrtimer_init (& stp->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);";
+ s.op->newline() << "stp->hrtimer.function = (void *)(& enter_hrtimer_probe);";
 // There is no hrtimer field to identify *this* (i-th) probe handler
 // callback. So instead we'll deduce it at entry time.
 s.op->newline() << "(void) hrtimer_start (& stp->hrtimer, ";
 emit_interval (s.op);
- s.op->line() << ", HRTIMER_REL);";
+ s.op->line() << ", HRTIMER_MODE_REL);";
 // Note: no partial failure rollback is needed: hrtimer_start only
 // "fails" if the timer was already active, which cannot be.
 s.op->newline(-1) << "}"; // for loop
The other thing I do when things behave strangely is flush the systemtap
cache:
"rm -rf /root/.systemtap/cache/*"
Dave
* Re: Instrumenting context switches
2006-11-30 18:22 ` Martin Hunt
@ 2006-11-30 22:01 ` Perry Cheng
0 siblings, 0 replies; 9+ messages in thread
From: Perry Cheng @ 2006-11-30 22:01 UTC (permalink / raw)
To: Martin Hunt, systemtap
The following even simpler program also dies, which rules out gettimeofday
and access to a global variable as possible causes. The crash happens on
both an Intel-686 and an AMD-686 machine, each running a modified version
of 2.6.16. I don't know the details of the modifications, but they are
generally used to support real-time features and include the hrt and
rt-prio patches. The src is at
ftp://linuxpatch.ncsa.uiuc.edu/rt-linux/rhel4u2/R1-iFix1.
If I replace __switch_to with context_switch or finish_task_switch, the
failure still occurs. However, if I switch to set_task_comm, then things
seem ok.
probe kernel.function("__switch_to")
{
    foobar()
}
function foobar()
%{
    _stp_printf("foobar\n");
%}
Perry
* Re: Instrumenting context switches
2006-11-30 1:35 Perry Cheng
2006-11-30 1:41 ` Frank Ch. Eigler
@ 2006-11-30 18:22 ` Martin Hunt
2006-11-30 22:01 ` Perry Cheng
1 sibling, 1 reply; 9+ messages in thread
From: Martin Hunt @ 2006-11-30 18:22 UTC (permalink / raw)
To: Perry Cheng; +Cc: systemtap
On Wed, 2006-11-29 at 18:25 -0500, Perry Cheng wrote:
> probe kernel.function("__switch_to")
> {
> doSwitchTo(gettimeofday_us(), $prev_p, $next_p);
> }
>
> function doSwitchTo(timeus:long, prev:long, next:long)
> %{
> _stp_printf("SWITCHCOUNT = %ld\n", switchCount);
> <------------------------------ BAD LINE
> switchCount++;
> }%
Obviously the code fragment above is not exactly what you are using to
reproduce the bug. (You can't use keywords as parameter names,
uninitialized switchCount, "}%", etc). I tried the following and did
not see any problems. Can you give more details (arch, OS, etc) on how
to reproduce?
%{
long switchCount = 1000000;
%}
function doSwitchTo (t:long, p:long, n:long) %{
    _stp_printf("SWITCHCOUNT = %ld\n", switchCount);
    switchCount++;
%}
probe kernel.function("__switch_to")
{
    doSwitchTo(gettimeofday_us(), $prev_p, $next_p);
}
* RE: Instrumenting context switches
@ 2006-11-30 18:10 Stone, Joshua I
0 siblings, 0 replies; 9+ messages in thread
From: Stone, Joshua I @ 2006-11-30 18:10 UTC (permalink / raw)
To: Frank Ch. Eigler, Perry Cheng; +Cc: systemtap
On Wednesday, November 29, 2006 4:47 PM, Frank Ch. Eigler wrote:
> It may be that this problem is due to the recent rewrite of
> gettimeofday_us. That code contains bits like preempt_disable() and
> _enable(), even though the equivalent (interrupt disabling) should
> already be done within probe context. In particular, I wonder if
> changing the latter to preempt_enable_no_resched() might improve the
> situation.
I'll take a look at whether some of the locking in the time subsystem
can go away, under the assumption that probes are always
interrupt-disabled. I took a very conservative approach, so I'm sure
some of that is overkill. However, the preempt_enable vs. _no_resched
shouldn't really cause a problem, because preempt_schedule checks for
irqs_disabled() anyway.
I think this is a red herring for Perry though. He mentions that simply
taking out his _stp_printf statement makes things work, so the
gettimeofday_us is still being called in the working case.
A wild guess -- the stack is transitioning in __switch_to, so might it
be that _stp_printf is running out-of-bounds somehow?
Josh
* RE: Instrumenting context switches
@ 2006-11-30 15:31 Stone, Joshua I
0 siblings, 0 replies; 9+ messages in thread
From: Stone, Joshua I @ 2006-11-30 15:31 UTC (permalink / raw)
To: Perry Cheng, systemtap
On Wednesday, November 29, 2006 3:26 PM, Perry Cheng wrote:
> Using some example code, I tried to instrument context switching by
> adding a probe to the method __switch_to. Some documentation had
> suggested that certain versions of the systemtap could not handle
> instrumenting context_switch.
The problems with resolving the context_switch function have been fixed
for a while, but there are still sometimes issues accessing the parameters
of inline functions. See bugzilla #1155.
http://sources.redhat.com/bugzilla/show_bug.cgi?id=1155
Using __switch_to is not an option on all platforms -- on IA64 it is a
macro, and on x86_64 it is blacklisted (bz2086).
Josh
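Where __switch_to cannot be probed, one option worth trying is the
scheduler tapset's alias rather than a raw kernel function. A minimal
sketch, assuming the installed tapset provides scheduler.cpu_off and
timer.s (not confirmed for the 0.5.x builds discussed here); the variable
name switches and the 10-second run time are arbitrary choices:

```systemtap
# Count context switches through the tapset alias, then report once.
# Assumption: scheduler.cpu_off and timer.s exist in the installed tapset.
global switches

probe scheduler.cpu_off
{
    switches++
}

probe timer.s(10)
{
    exit()
}

probe end
{
    printf("context switches in 10s: %d\n", switches)
}
```

Printing only in the end probe keeps output calls out of the
context-switch path. Note that the alias may itself resolve to
context_switch, which is reported elsewhere in this thread to freeze on
the patched kernel as well.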
* Re: Instrumenting context switches
2006-11-30 1:35 Perry Cheng
@ 2006-11-30 1:41 ` Frank Ch. Eigler
2006-11-30 18:22 ` Martin Hunt
1 sibling, 0 replies; 9+ messages in thread
From: Frank Ch. Eigler @ 2006-11-30 1:41 UTC (permalink / raw)
To: Perry Cheng; +Cc: systemtap
> [...]
> doSwitchTo(gettimeofday_us(), $prev_p, $next_p);
> [...]
It may be that this problem is due to the recent rewrite of
gettimeofday_us. That code contains bits like preempt_disable() and
_enable(), even though the equivalent (interrupt disabling) should
already be done within probe context. In particular, I wonder if
changing the latter to preempt_enable_no_resched() might improve the
situation.
- FChE
* Instrumenting context switches
@ 2006-11-30 1:35 Perry Cheng
2006-11-30 1:41 ` Frank Ch. Eigler
2006-11-30 18:22 ` Martin Hunt
0 siblings, 2 replies; 9+ messages in thread
From: Perry Cheng @ 2006-11-30 1:35 UTC (permalink / raw)
To: systemtap
Using some example code, I tried to instrument context switching by adding
a probe to the function __switch_to. Some documentation had suggested that
certain versions of systemtap could not handle instrumenting
context_switch. In the past I have gotten this to work, but now the use of
_stp_printf seems to cause the machine to freeze hard. If I leave out the
printing of switchCount but leave in the increment, things work fine. Is
this a known problem, or a known historical problem, and if so, what are
the workarounds?
probe kernel.function("__switch_to")
{
doSwitchTo(gettimeofday_us(), $prev_p, $next_p);
}
function doSwitchTo(timeus:long, prev:long, next:long)
%{
_stp_printf("SWITCHCOUNT = %ld\n", switchCount);
<------------------------------ BAD LINE
switchCount++;
}%
Perry
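For comparison, the same counter can be sketched in the script language
alone, without an embedded-C function or the -g flag. This is only an
illustration under assumptions: the name switch_count is made up here, and
nothing in this thread shows whether this form avoids the freeze on the
patched 2.6.16 kernel.

```systemtap
# Hypothetical pure-script variant of the probe above (untested on the
# rt-patched kernel discussed in this thread).
global switch_count

probe kernel.function("__switch_to")
{
    switch_count++
    printf("SWITCHCOUNT = %d\n", switch_count)
}
```

Because the counter is a script global, the translator takes care of
locking it, whereas a C variable declared inside %{ %} is accessed with
no locking at all.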
end of thread, other threads:[~2006-12-01 0:25 UTC | newest]
Thread overview: 9+ messages
[not found] <OFB78608CD.65116F14-ON85257236.0073959D-85257236.00738AAF@mck.us.ray.com>
2006-12-01 0:25 ` Instrumenting context switches Dave Sperry
2006-12-01 3:12 ` Perry Cheng
[not found] <OFBBDFEA36.CA571100-ON85257237.00006111-85257237.0000676B@mck.us.ray.com>
2006-12-01 3:14 ` Dave Sperry
2006-11-30 18:10 Stone, Joshua I
-- strict thread matches above, loose matches on Subject: below --
2006-11-30 15:31 Stone, Joshua I
2006-11-30 1:35 Perry Cheng
2006-11-30 1:41 ` Frank Ch. Eigler
2006-11-30 18:22 ` Martin Hunt
2006-11-30 22:01 ` Perry Cheng