public inbox for systemtap@sourceware.org
* RE: 3 bugs found.
@ 2006-09-21  1:08 Mao, Bibo
  0 siblings, 0 replies; 6+ messages in thread
From: Mao, Bibo @ 2006-09-21  1:08 UTC
  To: Vara Prasad, Keshavamurthy, Anil S, Prasanna S Panchamukhi,
	Ananth N Mavinakayanahalli, Masami Hiramatsu
  Cc: David Smith, James Dickens, SystemTAP

I am not a kprobes expert, but I am going to write a kprobes test case to narrow down the fundamental problem. If you have any suggestions, please let me know :)


Thanks
Bibo,mao
>-----Original Message-----
>From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org]
>On Behalf Of Vara Prasad
>Sent: September 21, 2006 1:04
>To: Keshavamurthy, Anil S; Prasanna S Panchamukhi; Ananth N Mavinakayanahalli;
>Masami Hiramatsu
>Cc: David Smith; James Dickens; SystemTAP
>Subject: Re: 3 bugs found.
>
>David Smith wrote:
>
>> [...]
>> I've found it.  If I add '_raw_spin_unlock' to the blacklist (along
>> with 'atomic_notifier_call_chain' and '_spin_unlock_irqrestore'), then
>> probing kernel.function("*") works fine for me on x86.
>
>
>David, thanks a bunch for narrowing it down to this small list.
>
>>
>> Note that once again I'm not sure that is the correct fix (adding it
>> to the blacklist), I just wanted to get past it.
>
>I think our kprobes experts can now write a simple kprobes module to
>reproduce the problem and narrow it down further to find the
>fundamental problem. If it turns out that we can't change these
>functions, or some call these functions make, to be safe to probe, we
>may be able to apply the magic __kprobes macro to prevent anyone from
>stumbling into these functions via probes.
>
>Do I hear any volunteers from the kprobe folks in the To list?

* Re: 3 bugs found.
  2006-09-20 16:05   ` David Smith
@ 2006-09-20 17:04     ` Vara Prasad
  0 siblings, 0 replies; 6+ messages in thread
From: Vara Prasad @ 2006-09-20 17:04 UTC
  To: Keshavamurthy, Anil S, Prasanna S Panchamukhi,
	Ananth N Mavinakayanahalli, Masami Hiramatsu
  Cc: David Smith, James Dickens, SystemTAP

David Smith wrote:

> [...]
> I've found it.  If I add '_raw_spin_unlock' to the blacklist (along 
> with 'atomic_notifier_call_chain' and '_spin_unlock_irqrestore'), then 
> probing kernel.function("*") works fine for me on x86.


David, thanks a bunch for narrowing it down to this small list.

>
> Note that once again I'm not sure that is the correct fix (adding it 
> to the blacklist), I just wanted to get past it.

I think our kprobes experts can now write a simple kprobes module to
reproduce the problem and narrow it down further to find the
fundamental problem. If it turns out that we can't change these
functions, or some call these functions make, to be safe to probe, we
may be able to apply the magic __kprobes macro to prevent anyone from
stumbling into these functions via probes.

Do I hear any volunteers from the kprobe folks in the To list?


* Re: 3 bugs found.
  2006-09-19 13:40 ` David Smith
  2006-09-20  8:29   ` bibo,mao
@ 2006-09-20 16:05   ` David Smith
  2006-09-20 17:04     ` Vara Prasad
  1 sibling, 1 reply; 6+ messages in thread
From: David Smith @ 2006-09-20 16:05 UTC
  To: James Dickens; +Cc: SystemTAP

David Smith wrote:
> James Dickens wrote:
>> Here are 3 scripts, one line each; each of them breaks SystemTap. One
>> is old; the current bug report just needs to be expanded.
>>
>> Linux localhost.localdomain 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06
>> EDT 2006 i686 athlon i386 GNU/Linux
> 
> Since I've got a similar system (except mine isn't an athlon), I decided 
> to take a look at these.  See stuff below.
> 
>> Latest CVS updates as of 2 hours ago.
>>
>> Distro: Fedora FC5
>>
>> Stack fault on the CPU, halting the machine. This is on x86; it was
>> previously filed only for x86_64.
>>
>> probe kernel.function("*") { print(".") }
>>
>> To narrow it down a bit, this causes the same stack fault
>>
>> probe kernel.function("*@kernel/*") { printf("here\n"); }
> 
> I've narrowed this one down.  This only happens when probing 
> 'atomic_notifier_call_chain'.
> 
>> This one produces an oops. I can post the end of it if nobody else
>> who has a serial console can reproduce it.
>>
>> probe kernel.function("*@kernel/spinlock.c") { printf("."); }
> 
> I've reproduced this one with '_spin_unlock_irqrestore'.  On the console 
> I get:
> 
> BUG: spinlock lockup on CPU#0
> 
> With those two functions added to the blacklist (I'm not sure that is 
> the right fix, I just want to get past those two functions), the 
> following works correctly:
> 
> probe kernel.function("*@kernel/*") { printf("here\n"); }
> 
> Now your first probe (probing the entire kernel) still fails.  I'm 
> trying to narrow it down.

I've found it.  If I add '_raw_spin_unlock' to the blacklist (along with 
'atomic_notifier_call_chain' and '_spin_unlock_irqrestore'), then 
probing kernel.function("*") works fine for me on x86.

Note that once again I'm not sure that is the correct fix (adding it to 
the blacklist), I just wanted to get past it.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

* Re: 3 bugs found.
  2006-09-19 13:40 ` David Smith
@ 2006-09-20  8:29   ` bibo,mao
  2006-09-20 16:05   ` David Smith
  1 sibling, 0 replies; 6+ messages in thread
From: bibo,mao @ 2006-09-20  8:29 UTC
  To: David Smith; +Cc: James Dickens, SystemTAP

When running these three test cases, the system also hung on my ia32 machine;
my kernel is the latest git kernel.

David Smith wrote:
> James Dickens wrote:
>  > Here are 3 scripts, one line each; each of them breaks SystemTap. One
>  > is old; the current bug report just needs to be expanded.
>  >
>  > Linux localhost.localdomain 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06
>  > EDT 2006 i686 athlon i386 GNU/Linux
> 
> Since I've got a similar system (except mine isn't an athlon), I decided
> to take a look at these.  See stuff below.
> 
>  > Latest CVS updates as of 2 hours ago.
>  >
>  > Distro: Fedora FC5
>  >
>  > Stack fault on the CPU, halting the machine. This is on x86; it was
>  > previously filed only for x86_64.
>  >
>  > probe kernel.function("*") { print(".") }
>  >
>  > To narrow it down a bit, this causes the same stack fault
>  >
>  > probe kernel.function("*@kernel/*") { printf("here\n"); }
> 
> I've narrowed this one down.  This only happens when probing
> 'atomic_notifier_call_chain'.
If a kprobe is hit recursively, the re-entry path of the kprobe handler includes
the atomic_notifier_call_chain function, so probing that function causes
numerous recursive kprobe hits. On the kernel side, this function should be
marked with the __kprobes annotation.

>  > This one produces an oops. I can post the end of it if nobody else
>  > who has a serial console can reproduce it.
>  >
>  > probe kernel.function("*@kernel/spinlock.c") { printf("."); }
> 
> I've reproduced this one with '_spin_unlock_irqrestore'.  On the console
> I get:
> 
> BUG: spinlock lockup on CPU#0
In the file runtime/transport/procfs.c of the systemtap package, two functions
call spin_lock_irqsave: _stp_write and _stp_proc_read_cmd. If any function or
code between spin_lock_irqsave and spin_unlock_irqrestore in
_stp_proc_read_cmd is probed, a deadlock is possible, because the probe handler
will call _stp_write.

On the kernel side, if a kretprobe address falls between the
spin_lock_irqsave/spin_unlock_irqrestore calls in the kprobe_flush_task
function, there will also be a deadlock.

> 
> With those two functions added to the blacklist (I'm not sure that is
> the right fix, I just want to get past those two functions), the
> following works correctly:
> 
> probe kernel.function("*@kernel/*") { printf("here\n"); }
> 
> Now your first probe (probing the entire kernel) still fails.  I'm
> trying to narrow it down.
> 
> --
> David Smith
> dsmith@redhat.com
> Red Hat, Inc.
> http://www.redhat.com
> 256.217.0141 (direct)
> 256.837.0057 (fax)
> 

* Re: 3 bugs found.
  2006-09-15 19:35 James Dickens
@ 2006-09-19 13:40 ` David Smith
  2006-09-20  8:29   ` bibo,mao
  2006-09-20 16:05   ` David Smith
  0 siblings, 2 replies; 6+ messages in thread
From: David Smith @ 2006-09-19 13:40 UTC
  To: James Dickens; +Cc: SystemTAP

James Dickens wrote:
> Here are 3 scripts, one line each; each of them breaks SystemTap. One
> is old; the current bug report just needs to be expanded.
> 
> Linux localhost.localdomain 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06
> EDT 2006 i686 athlon i386 GNU/Linux

Since I've got a similar system (except mine isn't an athlon), I decided 
to take a look at these.  See stuff below.

> Latest CVS updates as of 2 hours ago.
> 
> Distro: Fedora FC5
> 
> Stack fault on the CPU, halting the machine. This is on x86; it was
> previously filed only for x86_64.
> 
> probe kernel.function("*") { print(".") }
> 
> To narrow it down a bit, this causes the same stack fault
> 
> probe kernel.function("*@kernel/*") { printf("here\n"); }

I've narrowed this one down.  This only happens when probing 
'atomic_notifier_call_chain'.

> This one produces an oops. I can post the end of it if nobody else
> who has a serial console can reproduce it.
> 
> probe kernel.function("*@kernel/spinlock.c") { printf("."); }

I've reproduced this one with '_spin_unlock_irqrestore'.  On the console 
I get:

BUG: spinlock lockup on CPU#0

With those two functions added to the blacklist (I'm not sure that is 
the right fix, I just want to get past those two functions), the 
following works correctly:

probe kernel.function("*@kernel/*") { printf("here\n"); }

Now your first probe (probing the entire kernel) still fails.  I'm 
trying to narrow it down.

-- 
David Smith
dsmith@redhat.com
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

* 3 bugs found.
@ 2006-09-15 19:35 James Dickens
  2006-09-19 13:40 ` David Smith
  0 siblings, 1 reply; 6+ messages in thread
From: James Dickens @ 2006-09-15 19:35 UTC
  To: SystemTAP

Here are 3 scripts, one line each; each of them breaks SystemTap. One
is old; the current bug report just needs to be expanded.

Linux localhost.localdomain 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06
EDT 2006 i686 athlon i386 GNU/Linux

Latest CVS updates as of 2 hours ago.

Distro: Fedora FC5

Stack fault on the CPU, halting the machine. This is on x86; it was
previously filed only for x86_64.

probe kernel.function("*") { print(".") }

To narrow it down a bit, this causes the same stack fault

probe kernel.function("*@kernel/*") { printf("here\n"); }

This one produces an oops. I can post the end of it if nobody else
who has a serial console can reproduce it.

probe kernel.function("*@kernel/spinlock.c") { printf("."); }
