public inbox for systemtap@sourceware.org
* Unsafe mode for probes
@ 2018-08-05 10:14 Arkady
  2018-08-06 17:37 ` David Smith
  0 siblings, 1 reply; 7+ messages in thread
From: Arkady @ 2018-08-05 10:14 UTC (permalink / raw)
  To: systemtap

Hi,

I think I have mentioned this problem before. I allocate a
significant amount of virtual memory and set up shared memory
between my kernel drivers and code running in user space. The goal
of the shared memory is to improve the throughput of the data channel
between the SystemTap driver and the rest of the system.

I started by using 'probe begin' to call my C initialization code.
As expected, this approach failed.

My next step was a small kernel module that only initializes the
shared memory. Finally, I patched SystemTap by adding two hooks,
stp_user_init()/stp_user_close(), to
emit_module_init()/emit_module_exit() in translate.cxx. See, for
example, https://github.com/larytet/SystemTap/commit/6e4c99f96c9ecce9508f2c8612e8bace2ac91ae5
This worked just fine.

Still, performance-wise the driver is not where I would like it to be.
I collect the majority of the system calls and install 100+ probes. The
code has to meet tight performance constraints. I improve
performance further by replacing the most frequently called probes
generated by SystemTap with custom C code. Every time the SystemTap
version or the kernel version changes I have to rewrite the probes. I do
not want to drop SystemTap because the framework solves a lot of
compatibility-related problems.

I am thinking of extending the STAP language by adding an "unsafe
probe": a probe implemented in C. Unsafe probes do not enforce any
checks. The framework will "emit" access to the system call arguments
and compile the code. The framework will use the original stack for
the local variables.

I understand that this is a long way from SystemTap's stated
purpose of system debugging. I am breaking quite a few core
assumptions. Indeed, I use SystemTap to implement a
high-performance system audit for Linux.

I would appreciate any feedback.

Thank you


* Re: Unsafe mode for probes
  2018-08-05 10:14 Unsafe mode for probes Arkady
@ 2018-08-06 17:37 ` David Smith
  2018-08-07  9:20   ` Arkady
  0 siblings, 1 reply; 7+ messages in thread
From: David Smith @ 2018-08-06 17:37 UTC (permalink / raw)
  To: Arkady; +Cc: systemtap

For your "unsafe probes", why not just do something like the following?

====
probe syscall.read {
%{
// C code here ...
%}
}
====




--
David Smith
Associate Manager
Red Hat


* Re: Unsafe mode for probes
  2018-08-06 17:37 ` David Smith
@ 2018-08-07  9:20   ` Arkady
  2018-08-09 20:07     ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: Arkady @ 2018-08-07  9:20 UTC (permalink / raw)
  To: David Smith; +Cc: systemtap

David,

SystemTap introduces overhead.

My typical probe looks like this:

probe syscall.accept4.return ?
{
    %{HIT_MAP_INC(HIT_MAP_SYSCALL_ACCEPT4_RETURN)%}
    send_incident_result(%{INCDENT_TYPE_SYSCALL_ACCEPT4%}, $return);
}

where send_incident_result() is my C function. The function pushes ~16
bytes to a lockless FIFO in the shared memory.
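
For illustration, pushing a fixed-size record into a lockless FIFO can
be sketched in C11 as a single-producer/single-consumer ring buffer.
This is not the actual send_incident_result() code. The record layout,
the names struct incident, struct fifo, fifo_push() and fifo_pop(), the
slot count, and the one-producer assumption (for example, one FIFO per
CPU) are all assumptions made for the sketch, and it uses user-space
<stdatomic.h> rather than kernel primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-byte record; the real layout is not given in the thread. */
struct incident {
    uint32_t type;    /* e.g. an incident-type constant */
    uint32_t cpu;     /* producer id, assumed for the sketch */
    int64_t  result;  /* e.g. the syscall return value */
};
_Static_assert(sizeof(struct incident) == 16, "record is expected to be 16 bytes");

#define FIFO_SLOTS 1024u  /* power of two, so wrap-around is a mask */

struct fifo {
    _Atomic uint32_t head;  /* next slot to write; advanced only by the producer */
    _Atomic uint32_t tail;  /* next slot to read; advanced only by the consumer */
    struct incident slots[FIFO_SLOTS];
};

/* Producer side. Returns false (the event is dropped) when the FIFO is full. */
static bool fifo_push(struct fifo *f, struct incident ev)
{
    uint32_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);

    if (head - tail == FIFO_SLOTS)
        return false;
    f->slots[head & (FIFO_SLOTS - 1)] = ev;
    /* Release: the record must be visible before the new head is. */
    atomic_store_explicit(&f->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when the FIFO is empty. */
static bool fifo_pop(struct fifo *f, struct incident *out)
{
    uint32_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&f->head, memory_order_acquire);

    if (head == tail)
        return false;
    *out = f->slots[tail & (FIFO_SLOTS - 1)];
    /* Release: the slot may be reused only after the copy is done. */
    atomic_store_explicit(&f->tail, tail + 1, memory_order_release);
    return true;
}
```

With exactly one writer and one reader per FIFO, neither side takes a
lock or retries; a full FIFO simply makes fifo_push() return false.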

The generated assembly for the probe contains quite a bit of code
that handles local variables, calls to locks, etc.
For short probes I can shave off about 20% of the CPU cycles.

Thank you, Arkady



* Re: Unsafe mode for probes
  2018-08-07  9:20   ` Arkady
@ 2018-08-09 20:07     ` Frank Ch. Eigler
  2018-08-11  4:06       ` Arkady
  0 siblings, 1 reply; 7+ messages in thread
From: Frank Ch. Eigler @ 2018-08-09 20:07 UTC (permalink / raw)
  To: Arkady; +Cc: David Smith, systemtap


arkady.miasnikov wrote:

> My typical probe looks like this
>
> probe syscall.accept4.return ?
> {
>     %{HIT_MAP_INC(HIT_MAP_SYSCALL_ACCEPT4_RETURN)%}
>     send_incident_result(%{INCDENT_TYPE_SYSCALL_ACCEPT4%}, $return);
> }
> [...]

> The actual assembler of the probe will contain quite a bit of code
> handling local variables, call to lock, etc
> For short probes I can shave about 20% of CPU cycles.

A blanket "unsafe" probe cannot make do with no checks at all, e.g.
a proper context structure allocation for temporary values, etc. 
It's a matter of detail - which checks particularly should one skip?

It is not unlikely that some of that initialization / lock business in
the probe prologue could be elided entirely with some more cleverness
during translation.  Like a new pragma for embedded-C functions that
skip the context-struct based api, which in turn could make it
unnecessary to have a context struct at all for that probe.  Stuff
like that - but it takes analysis to figure out which is needed
and which is not.  stap -p3 ftw.

- FChE


* Re: Unsafe mode for probes
  2018-08-09 20:07     ` Frank Ch. Eigler
@ 2018-08-11  4:06       ` Arkady
  2018-08-11 15:37         ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: Arkady @ 2018-08-11  4:06 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: David Smith, systemtap

Frank,

"a proper context structure allocation for temporary values" is
one of the things I would like to wrap in a stable API and
not enforce.

"Unsafe" means that stack safety is up to the author.

For example, if I hook only the entry points of system calls, I
want the freedom to inject custom C code.

Arkady.



* Re: Unsafe mode for probes
  2018-08-11  4:06       ` Arkady
@ 2018-08-11 15:37         ` Frank Ch. Eigler
  2018-08-12 15:16           ` Arkady
  0 siblings, 1 reply; 7+ messages in thread
From: Frank Ch. Eigler @ 2018-08-11 15:37 UTC (permalink / raw)
  To: Arkady; +Cc: David Smith, systemtap

Hi -

> "a proper context structure allocation for temporary values" is
> one of the things I would like to wrap into a stable API, and
> do not enforce it.

But a context is not an optional safety check sort of thing.  It is
the place where all the runtime per-probe-execution data is, including
error flags, temporaries, self-monitoring counters, printing buffers,
all kinds of stuff.  We can probably make it smaller for probe
handlers that are trivially simple.  But code just won't compile/run
with no context.
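
To make the distinction concrete, here is a minimal C sketch of a
handler that keeps its per-execution state in a context next to a
self-contained one. The struct is purely hypothetical and is not
SystemTap's actual context layout; the names probe_context,
handler_with_context(), handler_raw() and raw_hits, and the field
sizes, are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-probe-execution context; NOT SystemTap's real struct. */
struct probe_context {
    int      error;          /* error flag checked after the handler runs */
    int64_t  tmp[4];         /* temporaries for intermediate values */
    uint64_t hits;           /* self-monitoring counter */
    char     print_buf[64];  /* staging buffer for printed output */
    int      print_len;      /* number of characters staged in print_buf */
};

/* Conventional handler: every intermediate value and side effect
 * goes through the context. */
static void handler_with_context(struct probe_context *c, int64_t retval)
{
    c->hits++;
    c->tmp[0] = retval;
    if (c->tmp[0] < 0)
        c->error = 1;
    c->print_len = snprintf(c->print_buf, sizeof c->print_buf,
                            "ret=%lld", (long long)c->tmp[0]);
}

/* Plain C counter owned by the raw handler (not a script-level global). */
static uint64_t raw_hits;

/* "Raw" handler: self-contained C with no context, so it has no error
 * flag, no print buffer, and no script-level temporaries. */
static void handler_raw(int64_t retval)
{
    (void)retval;  /* a real handler would forward this to its own channel */
    raw_hits++;
}
```

The second handler is cheaper only because it gives up everything the
first one gets from the context.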

> "Unsafe" means that the stack safety is up to the author.
> For example, if I hook only entry points of the system calls I
> want the freedom to inject custom C code.

Maybe "raw" would be a better name then: a probe handler that
is bound to use self-contained embedded-C only (thus can't print,
deal with stap globals, etc.).  Even providing $context vars
would be tricky.

- FChE


* Re: Unsafe mode for probes
  2018-08-11 15:37         ` Frank Ch. Eigler
@ 2018-08-12 15:16           ` Arkady
  0 siblings, 0 replies; 7+ messages in thread
From: Arkady @ 2018-08-12 15:16 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: David Smith, systemtap

Hi,

I cannot use printing, monitoring counters, and other goodies;
these are too slow for the task. My code handles between 0.2M
and 1M events/s. I must not introduce more than 8% overhead.
I hit the target (~10% overhead) when I write the probe
completely in C. I see how $context is handled. Given a stable
API I could do it in hand-crafted code.
The name "raw" is great.
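
As a rough worked example of what these numbers imply per event: if
the overhead is read as a fraction of one CPU's time and all events
are handled on that CPU (both are assumptions, and the helper name
budget_ns() is made up for the sketch), the time available per event
is overhead * 1e9 / events_per_second nanoseconds.

```c
/* Nanoseconds available per event when `overhead` (a fraction of one
 * CPU's time, e.g. 0.08 for 8%) is spread evenly over `events_per_sec`
 * events handled on that CPU. Single-CPU accounting is an assumption. */
static double budget_ns(double events_per_sec, double overhead)
{
    return overhead * 1e9 / events_per_sec;
}
```

Under those assumptions, budget_ns(1e6, 0.08) is about 80 ns per event
and budget_ns(2e5, 0.08) is about 400 ns per event.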

Arkady


