public inbox for systemtap@sourceware.org
* RE: Hitachi djprobe mechanism
@ 2005-08-01 20:46 Keshavamurthy, Anil S
  2005-08-01 21:08 ` Karim Yaghmour
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-08-01 20:46 UTC
  To: karim, Satoshi Oshima
  Cc: Richard J Moore, systemtap, Andi Kleen, Mathieu Desnoyers,
	Masami Hiramatsu, Masami Hiramatsu, michel.dagenais,
	Roland McGrath, sugita


>In as far as I can see, it remains that the only safe way to use djprobe
>is to not touch any instruction that is less than 5 bytes, that's if
>there aren't other limitations as I mentioned earlier.

Though this is the safe way to insert a djprobe, it might not always
serve the desired purpose.

Say, for example, a user wants to count how many times a function gets
called and so needs to insert a probe at the beginning of that function.
Due to the nature of djprobe:
1) we might not find an instruction of 5 bytes or more within this
   function, or
2) even if we find one such instruction, it might not be executed on
   every call (depending on the code flow). If the user then inserts the
   probe at the first instruction of 5 bytes or more after the function
   entry, the result (i.e. the number of times the function was entered)
   will end up being wrong.

So in effect, we can't just look for an instruction of 5 bytes or more
and insert the probe there.
djprobe will therefore push the need for a stronger static
analyzer/translator for selecting the probe point.
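
To make the two cases above concrete, here is a minimal sketch in C of
the selection problem. It assumes the instruction lengths of the function
have already been decoded into an array; insn_len[] and both function
names are hypothetical and not part of djprobe. It only shows why "first
instruction of 5 bytes or more" and "function entry" can be different
places, and it does not analyse code flow.

```c
#include <assert.h>
#include <stddef.h>

#define JMP_REL32_LEN 5  /* size of an x86 near jmp with a 32-bit displacement */

/*
 * Hypothetical helper: insn_len[] holds the decoded length of each
 * instruction of one function, in address order.  Returns the byte
 * offset of the first instruction that is at least JMP_REL32_LEN bytes
 * long, or -1 if the function has no such instruction (case 1 above).
 */
long first_jmp_sized_insn(const unsigned char *insn_len, size_t count)
{
    long offset = 0;

    assert(insn_len != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (insn_len[i] >= JMP_REL32_LEN)
            return offset;
        offset += insn_len[i];
    }
    return -1;
}

/*
 * Conservative check for case 2 above: without code-flow analysis, only
 * a probe at offset 0 is known to be hit on every call.
 */
int counts_every_entry(long probe_offset)
{
    return probe_offset == 0;
}
```

For a function whose instructions are 1, 2, 5 and 3 bytes long, the first
jmp-sized instruction sits at offset 3, so a probe placed there is not at
the entry point and may miss calls that branch away earlier.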

-thanks,
Anil



* RE: Hitachi djprobe mechanism
@ 2005-08-01 22:49 Keshavamurthy, Anil S
  2005-08-01 23:05 ` Karim Yaghmour
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-08-01 22:49 UTC
  To: karim
  Cc: Satoshi Oshima, Richard J Moore, systemtap, Andi Kleen,
	Mathieu Desnoyers, Masami Hiramatsu, Masami Hiramatsu,
	michel.dagenais, Roland McGrath, sugita

>Does this mean that you think we could use djprobe on anything less
>than 5 bytes?
It can be done, provided you take care of all the issues that have been
discussed on this mailing list.
Let's all wait for Hitachi's djprobe patch to show up before we
comment further on this topic.

Cheers,
-Anil

* RE: Hitachi djprobe mechanism
@ 2005-08-01 22:41 Keshavamurthy, Anil S
  2005-08-02  3:21 ` Roland McGrath
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-08-01 22:41 UTC
  To: Roland McGrath
  Cc: Mathieu Desnoyers, Andi Kleen, Karim Yaghmour, Masami Hiramatsu,
	Masami Hiramatsu, Richard J Moore, systemtap, sugita,
	Satoshi Oshima, michel.dagenais

 
>It's OK for probe insertion to be slow. 

Yes, slow is okay as long as you are *not* taking the full
system down during probe insertion/removal. Halting the full
system during probe insertion/removal might not be acceptable,
at least on IA64.

>So why not use RCU to synchronize
>other processors?

The thing is, while we are modifying a section of code, say a couple of
instructions, to accommodate the 5-byte jmp instruction, we don't want
any CPU to be in the middle of this area. To be *really* sure that the
other CPUs don't fetch a half-baked instruction, we must place all of
them in a known location. So in this scenario, I doubt that RCU
synchronization can help.

If I am wrong please educate me.


Cheers,
-Anil

* RE: Hitachi djprobe mechanism
@ 2005-08-01 16:14 Keshavamurthy, Anil S
  2005-08-01 20:31 ` Roland McGrath
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-08-01 16:14 UTC
  To: Mathieu Desnoyers
  Cc: Andi Kleen, Karim Yaghmour, Masami Hiramatsu, Masami Hiramatsu,
	Roland McGrath, Richard J Moore, systemtap, sugita,
	Satoshi Oshima, michel.dagenais

>
>* Keshavamurthy, Anil S (anil.s.keshavamurthy@intel.com) wrote:
>> Andi and others,
>> 	Sending an IPI to all other CPUs (all but self) and making them
>> *spin on a lock* during the modification will *freeze* the system.
>> Please do not *spin* inside an IPI.
>> 
>> My observation:
>> Here is what I discovered: CPU2 had taken
>> read_lock(&tasklist_lock), had then entered the IPI, and is now busy
>> *spinning on a lock*.
>> CPU3 had called write_lock_irq(&tasklist_lock), which first disables
>> the local irq and preemption and then tries to acquire the lock
>> already held by CPU2. Since CPU2 never releases this lock while it
>> busy-waits in the IPI, CPU3 never enters the IPI :-(
>> 
>
>Yep, I see the problem: you cannot control other locks that would have
>been taken by other CPUs with interrupts disabled.
>
>Is there any way to send a non-maskable IPI? This could solve this
>problem.


The only way I can think of is to use stop_machine_run(fn, data, cpu),
which freezes the machine on all CPUs and runs fn() on the given cpu,
which is what we want.
This is slower than the IPI approach but definitely much safer.

The only drawback is that this is a very heavyweight operation, and I am
not sure of its impact on a busy production system.

Thanks,
-Anil


* RE: Hitachi djprobe mechanism
@ 2005-08-01 15:50 Keshavamurthy, Anil S
  2005-08-01 16:03 ` Mathieu Desnoyers
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-08-01 15:50 UTC
  To: Andi Kleen, Mathieu Desnoyers
  Cc: Karim Yaghmour, Masami Hiramatsu, Masami Hiramatsu,
	Roland McGrath, Richard J Moore, systemtap, sugita,
	Satoshi Oshima, michel.dagenais

Andi and others,
	Sending an IPI to all other CPUs (all but self) and making them
*spin on a lock* during the modification will *freeze* the system.
Please do not *spin* inside an IPI.

My observation:
Here is what I discovered: CPU2 had taken
read_lock(&tasklist_lock), had then entered the IPI, and is now busy
*spinning on a lock*.
CPU3 had called write_lock_irq(&tasklist_lock), which first disables
the local irq and preemption and then tries to acquire the lock
already held by CPU2. Since CPU2 never releases this lock while it
busy-waits in the IPI, CPU3 never enters the IPI :-(
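
The freeze can be read as a cycle of waits. The sketch below is only a
model in plain C, not kernel code. It assumes that the CPU sending the
IPI (called CPU1 here, a name not used above) waits for every other CPU
to enter the handler before it modifies the code and releases the spin
lock. The waits_for[] table and blocked_forever() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { CPU1, CPU2, CPU3, NCPUS };  /* CPU1 = sender of the IPI (assumed) */
#define NOBODY (-1)

/*
 * waits_for[c] is the CPU that c is blocked on, or NOBODY if c can make
 * progress.  Follow the chain from `start`: if n hops never reach a CPU
 * that can make progress, the chain must contain a cycle, so `start`
 * can never run again.
 */
int blocked_forever(const int *waits_for, size_t n, int start)
{
    int cur = start;

    assert(start >= 0 && (size_t)start < n);
    for (size_t hops = 0; hops < n; hops++) {
        if (waits_for[cur] == NOBODY)
            return 0;
        cur = waits_for[cur];
    }
    return 1;
}
```

In the scenario above, CPU1 waits for CPU3 to enter the IPI, CPU3 waits
for CPU2 to drop tasklist_lock, and CPU2 waits for CPU1 to release the
spin lock, i.e. waits_for = { CPU3, CPU1, CPU2 }, and blocked_forever()
reports a freeze for every CPU.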

Cheers,
-Anil





>-----Original Message-----
>From: systemtap-owner@sources.redhat.com 
>[mailto:systemtap-owner@sources.redhat.com] On Behalf Of Andi Kleen
>Sent: Monday, August 01, 2005 8:38 AM
>To: Mathieu Desnoyers
>Cc: Andi Kleen; Karim Yaghmour; Masami Hiramatsu; Masami 
>Hiramatsu; Roland McGrath; Richard J Moore; 
>systemtap@sources.redhat.com; sugita@sdl.hitachi.co.jp; 
>Satoshi Oshima; michel.dagenais@polymtl.ca
>Subject: Re: Hitachi djprobe mechanism
>
>On Sun, Jul 31, 2005 at 06:59:41PM -0400, Mathieu Desnoyers wrote:
>> * Andi Kleen (ak@suse.de) wrote:
>> > 
>> > One way would be to just search the task list for any tasks blocked
>> > with an IP inside the patched region. If yes rewait for another
>> > quiescent period.
>> > 
>> > 
>> 
>> If you stop other cpus' scheduler when you do that, then it's ok.
>
>You don't need to stop them, a snapshot of the task list is enough
>since you only care about preempted sleeping processes at a single 
>point of time.
>
>Anyways, this discussion is theoretical because the IPI approach
>is probably better.
>
>> 
>> I just thought about an interesting way to implement the IPI, which
>> would work very well (and safely) for any case where the instruction
>> to overwrite is >= 5 bytes. The idea :
>> 
>> - Send IPI to each other cpu
>>   IP args : * address we plan to write to
>>             * the new instruction we plan to write
>>   (The IPI handler could then make an infinite loop, reading the
>>   address, waiting for it to contain the new instruction.)
>
>Seems far too complicated, just make it spin on a lock during the
>modification.
>
>
>-Andi
>

* RE: Hitachi djprobe mechanism
@ 2005-07-29  0:18 Keshavamurthy, Anil S
  2005-07-29  1:48 ` Karim Yaghmour
  2005-07-29  1:53 ` Frank Ch. Eigler
  0 siblings, 2 replies; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-07-29  0:18 UTC
  To: Richard J Moore
  Cc: Mathieu Desnoyers, Masami Hiramatsu, Karim Yaghmour,
	Masami Hiramatsu, michel.dagenais, Roland McGrath,
	Satoshi Oshima, sugita, systemtap

 
>
>There are more efficient ways of implementing a jmp type hook - see the
>kernel hooks package, where we evolved past this string of 5 no-ops
>implementation. Here we moved an immediate value - 1 byte - into a reg
>and jumped on the reg being non-zero. To spring the hook we stored the
>one immediate byte in the mov instruction. This technique works quite
>well on IA64 where one can use a predicate register for the same purpose.

Yup, I agree with you, and this seems to be the correct way to support
djprobe without having to worry about all the other issues we discussed
earlier.
The only _limitation_ here is that a djprobe can only be placed where
there is a static hook as mentioned above.
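
As a rough userspace analogue of such a static hook, the sketch below
tests a one-byte flag at the hook point and calls a handler when it is
non-zero. It is only a sketch: the kernel hooks package patches the
immediate byte of a mov instruction in the code itself, whereas this
version keeps the byte in data and uses C11 atomics. All names
(struct static_hook, hook_arm and so on) are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*hook_fn)(void *ctx);

struct static_hook {
    atomic_uchar armed;  /* stands in for the 1-byte immediate */
    hook_fn handler;
    void *ctx;
};

void hook_init(struct static_hook *h)
{
    atomic_init(&h->armed, 0);
    h->handler = NULL;
    h->ctx = NULL;
}

/* Spring the hook: publish the handler, then store the single byte. */
void hook_arm(struct static_hook *h, hook_fn fn, void *ctx)
{
    assert(fn != NULL);
    h->handler = fn;
    h->ctx = ctx;
    atomic_store_explicit(&h->armed, 1, memory_order_release);
}

void hook_disarm(struct static_hook *h)
{
    atomic_store_explicit(&h->armed, 0, memory_order_release);
}

/* The hook point: one load and one branch when the hook is not armed. */
void hook_point(struct static_hook *h)
{
    if (atomic_load_explicit(&h->armed, memory_order_acquire))
        h->handler(h->ctx);
}

/* Example handler: counts how often the hook point was reached. */
void count_hit(void *ctx)
{
    ++*(unsigned long *)ctx;
}
```

Arming and disarming are each a single one-byte store, so there is no
multi-byte instruction to replace. The sketch does not handle changing
the handler of a hook that is already armed while other CPUs may be
running through hook_point().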

-thanks,
Anil


* RE: Hitachi djprobe mechanism
@ 2005-07-27 21:05 Keshavamurthy, Anil S
  2005-07-28  1:51 ` Karim Yaghmour
  0 siblings, 1 reply; 83+ messages in thread
From: Keshavamurthy, Anil S @ 2005-07-27 21:05 UTC
  To: Masami Hiramatsu, Roland McGrath
  Cc: Richard J Moore, SystemTAP, sugita, Satoshi Oshima

Hi Masami,
	The same paper you mention below talks
about overwriting a single instruction at the instrumentation
point, as opposed to what djprobe does, which is
replacing multiple instructions (in order to fit the
5-byte jmp instruction).

Having to replace multiple instructions in order to
insert a long jump instruction is very dangerous:
some process on some CPU might have been preempted
in the middle of those instructions and would expect
to resume at an instruction boundary that is now
data inside the overwritten jump instruction.
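
The hazard can be stated as a simple address check. The following is a
minimal sketch in C, assuming the saved instruction pointers of the
preempted tasks are available as an array; both function names are
hypothetical. A task resuming exactly at the start of the patched region
is fine (it executes the new jmp), while one resuming 1 to 4 bytes into
it would land inside the jmp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5  /* bytes overwritten by the jmp */

/* 1 if a task resuming at saved_ip would land in the middle of the jmp. */
int ip_inside_patch(uintptr_t saved_ip, uintptr_t patch_addr)
{
    /* Unsigned subtraction wraps to a huge value when saved_ip is below. */
    uintptr_t off = saved_ip - patch_addr;

    return off > 0 && off < JMP_REL32_LEN;
}

/* 1 if any of the saved instruction pointers is inside the patched region. */
int any_task_inside_patch(const uintptr_t *saved_ips, size_t ntasks,
                          uintptr_t patch_addr)
{
    assert(saved_ips != NULL || ntasks == 0);
    for (size_t i = 0; i < ntasks; i++)
        if (ip_inside_patch(saved_ips[i], patch_addr))
            return 1;
    return 0;
}
```

A check of this kind only says whether patching is unsafe for one
snapshot of tasks; it does not by itself stop a task from being
preempted inside the region afterwards.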

I think that overwriting just a single instruction
is always hazard-free and should be followed in djprobe.
The paper clearly explains how to achieve this using what
is known as the springboard technique.

Please let me know your thoughts on this.

-thanks,
Anil
 

>-----Original Message-----
>From: systemtap-owner@sources.redhat.com 
>[mailto:systemtap-owner@sources.redhat.com] On Behalf Of 
>Masami Hiramatsu
>Sent: Wednesday, July 27, 2005 6:02 AM
>To: Roland McGrath
>Cc: Richard J Moore; SystemTAP; sugita@sdl.hitachi.co.jp; 
>Satoshi Oshima
>Subject: Re: Hitachi djprobe mechanism
>
>Hi, Roland
>
>Roland McGrath wrote:
>>>  I think Kerninst is similar in effect to djprobe. both of them copy
>>>original code to a buffer and jump to the buffer.
>>>  However I think that the most unique feature of djprobe is use of
>>>"bypass" route to safely insert code on SMP.
>>>  I cannot find SMP safety mechanism like "bypass" in kerninst papers
>>>yet.
>> 
>> 
>> If by this you mean inserting an int3 while writing the rest of the jmp
>> instruction and then overwriting the first byte when the rest is in place,
>> I recall reading about that in some kerninst paper to be sure.
>
>Thanks a lot.
>Finally, I found it on page 9 of the OSDI paper:
>"Fine-Grained Dynamic Instrumentation of Commodity Operating System
>Kernels", Ariel Tamches and Barton P. Miller, OSDI, Feb 1999.
>
>Actually, it seems to describe a similar thing.
>
>-- 
>Masami HIRAMATSU
>2nd Research Dept.
>Hitachi, Ltd., Systems Development Laboratory
>E-mail: hiramatu@sdl.hitachi.co.jp
>
>
>
>

* Re: Hitachi djprobe mechanism
@ 2005-07-22 18:09 Frank Ch. Eigler
  0 siblings, 0 replies; 83+ messages in thread
From: Frank Ch. Eigler @ 2005-07-22 18:09 UTC
  To: systemtap

Hi -

richardj_moore wrote:

> Oh they did. Did anyone have any thoughts on their mechanism?

I have nothing but encouragement toward the effort.  We should exploit
the facility as soon and as far as it is safely applicable.

- FChE

* Hitachi djprobe mechanism
@ 2005-07-21 22:32 Richard J Moore
  2005-07-21 22:52 ` Roland McGrath
  0 siblings, 1 reply; 83+ messages in thread
From: Richard J Moore @ 2005-07-21 22:32 UTC
  To: SystemTAP





The guys from Hitachi at OLS have just shown me an interesting performance
innovation based on kprobes. For high performance probing they use an
inserted jmp. This can't be installed atomically, so to get around that
problem they use a kprobe on the first instance then insert the jmp. Sounds
interesting. They are sending me the details.
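
Pending those details, the write order could look roughly like the
sketch below, which works on a plain byte buffer rather than live kernel
text. It assumes x86, where int3 is the one-byte opcode 0xCC and a near
jmp is 0xE9 followed by a 32-bit little-endian displacement. The function
names are hypothetical, and the sketch leaves out the breakpoint handler
and whatever synchronization is needed between the steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE      0xCC
#define JMP_REL32_OPCODE 0xE9
#define JMP_REL32_LEN    5

/* Step 1: a single-byte store, so the site is guarded by a breakpoint. */
void patch_step1_breakpoint(uint8_t *site)
{
    assert(site != NULL);
    site[0] = INT3_OPCODE;
}

/* Step 2: write the 4 displacement bytes behind the breakpoint. */
void patch_step2_displacement(uint8_t *site, int32_t rel32)
{
    uint32_t u = (uint32_t)rel32;

    assert(site != NULL);
    site[1] = (uint8_t)(u & 0xff);
    site[2] = (uint8_t)((u >> 8) & 0xff);
    site[3] = (uint8_t)((u >> 16) & 0xff);
    site[4] = (uint8_t)((u >> 24) & 0xff);
}

/* Step 3: another single-byte store turns the breakpoint into the jmp. */
void patch_step3_opcode(uint8_t *site)
{
    assert(site != NULL);
    site[0] = JMP_REL32_OPCODE;
}
```

The first byte of the site is only ever changed by a one-byte store, so a
CPU that starts executing at the first byte sees the old instruction, the
breakpoint, or the finished jmp. A task that resumes at an old
instruction boundary inside bytes 1 to 4 is not covered by this ordering.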
- -
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072


end of thread, other threads:[~2005-11-08  9:49 UTC | newest]

Thread overview: 83+ messages
2005-08-01 20:46 Hitachi djprobe mechanism Keshavamurthy, Anil S
2005-08-01 21:08 ` Karim Yaghmour
  -- strict thread matches above, loose matches on Subject: below --
2005-08-01 22:49 Keshavamurthy, Anil S
2005-08-01 23:05 ` Karim Yaghmour
2005-08-01 23:18   ` Karim Yaghmour
2005-08-01 22:41 Keshavamurthy, Anil S
2005-08-02  3:21 ` Roland McGrath
2005-08-02  3:35   ` Karim Yaghmour
2005-08-01 16:14 Keshavamurthy, Anil S
2005-08-01 20:31 ` Roland McGrath
2005-08-04  0:28   ` Mathieu Desnoyers
2005-08-04 10:01     ` Andi Kleen
2005-08-05 16:25       ` Mathieu Desnoyers
2005-08-05 16:39         ` Andi Kleen
2005-08-01 15:50 Keshavamurthy, Anil S
2005-08-01 16:03 ` Mathieu Desnoyers
2005-07-29  0:18 Keshavamurthy, Anil S
2005-07-29  1:48 ` Karim Yaghmour
2005-07-29  3:41   ` Mathieu Desnoyers
2005-07-29  3:47     ` Karim Yaghmour
2005-07-29  1:53 ` Frank Ch. Eigler
2005-08-01  9:02   ` Mathieu Lacage
2005-08-01 13:18     ` Mathieu Desnoyers
2005-08-02  7:07       ` Mathieu Lacage
2005-07-27 21:05 Keshavamurthy, Anil S
2005-07-28  1:51 ` Karim Yaghmour
2005-07-28  2:10   ` Karim Yaghmour
2005-07-28 16:23     ` Masami Hiramatsu
2005-07-28 16:28       ` Karim Yaghmour
2005-07-28 17:36         ` Mathieu Desnoyers
     [not found]           ` <20050728110717.A30199@unix-os.sc.intel.com>
2005-07-28 18:33             ` Mathieu Desnoyers
     [not found]               ` <20050728133456.A32210@unix-os.sc.intel.com>
2005-07-28 23:53                 ` Richard J Moore
2005-07-29  5:59                 ` Mathieu Desnoyers
2005-07-29  7:55                   ` Andi Kleen
2005-07-29  8:44                     ` Richard J Moore
2005-07-29  8:46                       ` Andi Kleen
2005-07-29 15:51                     ` Mathieu Desnoyers
2005-07-30 15:55                       ` Andi Kleen
2005-07-30 16:54                         ` Mathieu Desnoyers
2005-07-31 22:03                           ` Andi Kleen
2005-07-31 23:11                             ` Mathieu Desnoyers
2005-08-01 15:37                               ` Andi Kleen
2005-08-01  8:44                             ` Richard J Moore
2005-08-01 13:21                               ` Mathieu Desnoyers
2005-08-01 19:57                               ` Satoshi Oshima
2005-08-01 20:21                                 ` Karim Yaghmour
2005-08-01 22:12                                   ` Satoshi Oshima
2005-08-01 22:54                                     ` Karim Yaghmour
2005-08-02 18:42                                       ` Satoshi Oshima
2005-08-03 14:50                                         ` Karim Yaghmour
2005-08-04  1:19                                         ` Mathieu Desnoyers
2005-08-04  3:31                                           ` Mathieu Desnoyers
2005-08-02  9:42                                   ` Mathieu Lacage
2005-08-02 15:09                                     ` Karim Yaghmour
2005-10-07 15:35                                     ` Richard J Moore
2005-10-08 18:33                                       ` mathieu lacage
2005-10-08 21:59                                         ` Richard J Moore
2005-10-08 23:24                                           ` Roland McGrath
2005-10-22 11:49                                             ` mathieu lacage
2005-10-22 22:09                                               ` Roland McGrath
2005-10-24  6:33                                                 ` Mathieu Lacage
2005-10-24 19:48                                                   ` Roland McGrath
     [not found]                                             ` <43621B0D.70204@sophia.inria.fr>
2005-11-07 10:04                                               ` mathieu lacage
2005-11-07 10:06                                                 ` mathieu lacage
2005-11-08  9:49                                             ` Richard J Moore
2005-10-09 16:47                                           ` mathieu lacage
2005-08-02 15:33                                   ` Mathieu Lacage
2005-08-02 15:36                                     ` Mathieu Lacage
2005-08-02 16:12                                     ` Karim Yaghmour
2005-08-02 16:30                                       ` Mathieu Lacage
2005-08-02 16:46                                         ` Karim Yaghmour
2005-08-04 17:09                                         ` Mathieu Lacage
2005-08-03 14:46                                 ` Andi Kleen
2005-07-29 16:06                   ` Frank Ch. Eigler
2005-07-29 18:24                     ` sugita
2005-07-28 18:13       ` Richard J Moore
2005-07-22 18:09 Frank Ch. Eigler
2005-07-21 22:32 Richard J Moore
2005-07-21 22:52 ` Roland McGrath
2005-07-22  2:52   ` Richard J Moore
2005-07-26  7:14   ` Masami Hiramatsu
2005-07-26  7:53     ` Roland McGrath
2005-07-27 13:02       ` Masami Hiramatsu
