public inbox for systemtap@sourceware.org
* Re: Fwd: systemtap global var lead to high cpu
@ 2020-07-24  2:00 Kun
  2020-07-24  3:49 ` Arkady
  0 siblings, 1 reply; 6+ messages in thread
From: Kun @ 2020-07-24  2:00 UTC (permalink / raw)
  To: Serhei Makarov, systemtap

The value I save in the global variable is a string, not a number, so I cannot use a statistical aggregate.
FChE will fix this bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=26296
Hoping it is fixed soon.




------------------ Original ------------------
From: Serhei Makarov <smakarov@redhat.com>
Date: Thu, Jul 23, 2020 10:22 PM
To: systemtap <systemtap@sourceware.org>, mingkunone <mingkunone@qq.com>
Subject: Re: Fwd: systemtap global var lead to high cpu



Forwarding your question to systemtap@sourceware.org in case other people have suggestions.



In general, the locks protect concurrent modifications from interfering with each other when different processes trigger the same probe.


Depending on what you want to do with the iphdr value, you may be able to reduce contention by using statistical aggregates
(which do not require locking). For example, aaa[iphdr] <<< some_statistic; in a later probe iterate through aaa. A lot of the SystemTap example scripts use this type of structure, for example: https://sourceware.org/systemtap/examples/network/netfilter_summary.stp


Otherwise, every tcp_ack() for every packet on your system will try to grab the same lock. The resulting CPU load is unsurprising to me.
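
To illustrate why an aggregate can avoid a single contended lock, here is a minimal C sketch. It assumes statistics are accumulated per CPU and merged only when they are read. It is a simplified model, not the SystemTap runtime; NUM_CPUS, stat_cell, stat_add, stat_count and stat_sum are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 4  /* hypothetical CPU count for this sketch */

/* One cell per CPU, padded so that cells do not share a cache line. */
struct stat_cell {
    uint64_t count;
    uint64_t sum;
    char pad[64 - 2 * sizeof(uint64_t)];
};

static struct stat_cell cells[NUM_CPUS];

/* Hot path: touches only the caller's own cell, so no shared lock is
 * needed as long as each CPU writes only to its own index. */
void stat_add(unsigned cpu, uint64_t value)
{
    assert(cpu < NUM_CPUS);
    cells[cpu].count++;
    cells[cpu].sum += value;
}

/* Cold path: merge the per-CPU cells when the result is wanted. */
uint64_t stat_count(void)
{
    uint64_t total = 0;
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++)
        total += cells[cpu].count;
    return total;
}

uint64_t stat_sum(void)
{
    uint64_t total = 0;
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++)
        total += cells[cpu].sum;
    return total;
}
```

The trade-off is that a reader merging the cells while writers are still active sees an approximate snapshot, which is usually acceptable for statistics that are only printed at the end of a run.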


Hope this information is helpful; if not, someone else may have a better suggestion.


All the best,
      Serhei


---------- Forwarded message ---------
From: Kun <mingkunone@qq.com>
Date: Wed, Jul 22, 2020 at 11:24 PM
Subject: systemtap global var lead to high cpu
To: smakarov <smakarov@redhat.com>



Hi,
    I have a problem with SystemTap when using a global variable.
    A simple demo follows:

global aaa
probe kernel.function("tcp_ack") {
    iphdr = __get_skb_iphdr($skb)
    if (iphdr == 0) {
        aaa = iphdr
    }
}


With this script running in our environment, which carries a 10 Gbps flow, CPU usage is nearly 100%.


Analysing the generated C code, I found that this is caused by a lock, as follows:

static void probe_6330()
{
    if (!stp_lock_probe(locks, ARRAY_SIZE(locks)))
        return;
    if (l->l_iphdr == 0) {
        global(s_global_aaa) = l->l_iphdr;
    }
}


My question: aaa only needs to be protected inside the "if",
so why is the lock taken directly at the function entry?


* Re: Fwd: systemtap global var lead to high cpu
  2020-07-24  2:00 Fwd: systemtap global var lead to high cpu Kun
@ 2020-07-24  3:49 ` Arkady
  2020-07-25  2:45   ` Kun
  0 siblings, 1 reply; 6+ messages in thread
From: Arkady @ 2020-07-24  3:49 UTC (permalink / raw)
  To: Kun; +Cc: Serhei Makarov, systemtap

Hi Kun,

I avoid locks by using lockless data structures. For example, I have
my own hashtables instead of associative arrays.
You can check this code for inspiration
https://github.com/larytet/lockfree_hashtable

If performance is paramount for your task, you can switch to 100%
handwritten C code.
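
As a rough illustration of the lockless approach, here is a minimal C11 sketch of a fixed-size, open-addressing hash table that claims slots with compare-and-swap instead of taking a lock. It is not the code from the repository linked above; TABLE_SIZE, EMPTY_KEY, table_add and table_get are hypothetical names, keys are assumed to be non-zero 64-bit integers, and entries are never deleted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024u  /* fixed capacity, must be a power of two */
#define EMPTY_KEY  0u     /* key 0 is reserved to mark a free slot */

struct slot {
    _Atomic uint64_t key;
    _Atomic uint64_t value;
};

static struct slot table[TABLE_SIZE];

static size_t hash_key(uint64_t key)
{
    return (size_t)((key * 0x9E3779B97F4A7C15ull) >> 32) & (TABLE_SIZE - 1);
}

/* Add delta to the counter stored under key, creating the entry if needed.
 * Returns 1 on success, 0 if the key is invalid or the table is full. */
int table_add(uint64_t key, uint64_t delta)
{
    if (key == EMPTY_KEY)
        return 0;
    size_t idx = hash_key(key);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        uint64_t seen = atomic_load(&table[idx].key);
        if (seen == EMPTY_KEY) {
            uint64_t expected = EMPTY_KEY;
            /* Try to claim the free slot; on failure, expected holds
             * the key that another writer stored first. */
            if (atomic_compare_exchange_strong(&table[idx].key, &expected, key))
                seen = key;
            else
                seen = expected;
        }
        if (seen == key) {
            atomic_fetch_add(&table[idx].value, delta);
            return 1;
        }
        idx = (idx + 1) & (TABLE_SIZE - 1);  /* linear probing */
    }
    return 0;
}

/* Look up key. Returns 1 and stores the value in *out if found, else 0. */
int table_get(uint64_t key, uint64_t *out)
{
    if (key == EMPTY_KEY)
        return 0;
    size_t idx = hash_key(key);
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        uint64_t seen = atomic_load(&table[idx].key);
        if (seen == key) {
            *out = atomic_load(&table[idx].value);
            return 1;
        }
        if (seen == EMPTY_KEY)
            return 0;
        idx = (idx + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}
```

Writers never block each other here: two CPUs racing for the same free slot resolve the race with a single compare-and-swap, and the loser simply continues probing. String values, as in Kun's case, would need a different slot layout than this integer-only sketch.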

Arkady


* Re: Fwd: systemtap global var lead to high cpu
  2020-07-24  3:49 ` Arkady
@ 2020-07-25  2:45   ` Kun
  2020-07-25  5:25     ` Arkady
  2020-08-18 19:11     ` Frank Ch. Eigler
  0 siblings, 2 replies; 6+ messages in thread
From: Kun @ 2020-07-25  2:45 UTC (permalink / raw)
  To: Arkady; +Cc: Serhei Makarov, systemtap

Hi,
How do I embed your module into SystemTap?
I get an error when I run dup_probe.stp:
semantic error: while resolving probe point: identifier 'kernel' at kprocess.stp:29:25
    source: probe kprocess.create = kernel.function("copy_process").return
    Thank you





* Re: Fwd: systemtap global var lead to high cpu
  2020-07-25  2:45   ` Kun
@ 2020-07-25  5:25     ` Arkady
  2020-08-18 19:11     ` Frank Ch. Eigler
  1 sibling, 0 replies; 6+ messages in thread
From: Arkady @ 2020-07-25  5:25 UTC (permalink / raw)
  To: Kun; +Cc: Serhei Makarov, systemtap

On Sat, Jul 25, 2020 at 5:45 AM Kun <mingkunone@qq.com> wrote:
>
> Hi,
> How do I embed your module into SystemTap?

Try these:
https://sourceware.org/systemtap/tutorial/4_Tapsets.html#SECTION00053000000000000000
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/systemtap_language_reference/ch03s06

> I get an error when I run dup_probe.stp:
> semantic error: while resolving probe point: identifier 'kernel' at kprocess.stp:29:25
>     source: probe kprocess.create = kernel.function("copy_process").return

Please send the whole probe.

>     Thank you

* Re: Fwd: systemtap global var lead to high cpu
  2020-07-25  2:45   ` Kun
  2020-07-25  5:25     ` Arkady
@ 2020-08-18 19:11     ` Frank Ch. Eigler
  1 sibling, 0 replies; 6+ messages in thread
From: Frank Ch. Eigler @ 2020-08-18 19:11 UTC (permalink / raw)
  To: Kun, Arkady, craig; +Cc: systemtap


Hi -

Improvements made for PR26296 should dramatically improve
your global variable locking issues with systemtap.  Please
let us know if it's enough!

https://sourceware.org/git/?p=systemtap.git;a=commit;h=25012d82e181afe7de5cb8bcc2cefcef0b123e32

- FChE

------------------------------------------------------------------------

PR26296: lock pushdown optimization

Implements an algorithm to push lock/unlock operations downward in the
syntax tree, to just enclose the smallest possible region that deals
with global variables.  This means two common patterns run with much
more concurrency than before:

global a
probe foo {
  if (condition)
     { a++ }
  else
     { something_else() }
}

will only lock globals -if- the condition is true, so something_else()
would run unlocked.  Also:

global a
probe foo {
  if (a)
    { long_twisty_operation(); }
}

will unlock globals right after the condition is evaluated, so
long_twisty runs unlocked.  Previous behaviour is available with
--compatible=4.3.  New test case lock-pushdown.stp asserts locking
conditions throughout various relevant constructs.
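
The effect of the first pattern can be modelled with a small C sketch. It is only an illustration of where the lock sits before and after pushdown, not the code the translator actually emits: the spinlock, global_a, lock_acquisitions and both handler functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>

/* A single spinlock models the lock that guards the script global. */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;
static long global_a;           /* models the script global "a" */
static long lock_acquisitions;  /* instrumentation for this sketch only */

static void lock_globals(void)
{
    while (atomic_flag_test_and_set_explicit(&global_lock, memory_order_acquire)) {
        /* spin until the lock is free */
    }
    lock_acquisitions++;  /* updated while holding the lock */
}

static void unlock_globals(void)
{
    atomic_flag_clear_explicit(&global_lock, memory_order_release);
}

/* Lock at handler entry: every invocation contends for the lock,
 * whether or not it touches the global. */
void handler_lock_at_entry(int condition)
{
    lock_globals();
    if (condition)
        global_a++;
    unlock_globals();
}

/* Lock pushed down: only the branch that touches the global is
 * enclosed, so the common "condition false" case runs unlocked. */
void handler_lock_pushed_down(int condition)
{
    if (condition) {
        lock_globals();
        global_a++;
        unlock_globals();
    }
}
```

If the condition holds for one invocation in ten, the first handler takes the lock ten times and the second only once, which is the difference that matters on a hot path such as tcp_ack().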



* Fwd: systemtap global var lead to high cpu
       [not found] <tencent_FB0905D58DA3FA7F0E728C535B447E0D1905@qq.com>
@ 2020-07-23 14:22 ` Serhei Makarov
  0 siblings, 0 replies; 6+ messages in thread
From: Serhei Makarov @ 2020-07-23 14:22 UTC (permalink / raw)
  To: systemtap, mingkunone

Forwarding your question to systemtap@sourceware.org in case other people
have suggestions.

In general, the locks protect concurrent modifications from interfering
with each other when different processes trigger the same probe.

Depending on what you want to do with the iphdr value, you may be able to
reduce contention by using statistical aggregates
(which do not require locking). For example, aaa[iphdr] <<< some_statistic;
in a later probe iterate through aaa. A lot of the SystemTap example
scripts use this type of structure, for example:
https://sourceware.org/systemtap/examples/network/netfilter_summary.stp

Otherwise, every tcp_ack() for every packet on your system will try to grab
the same lock. The resulting CPU load is unsurprising to me.

Hope this information is helpful; if not, someone else may have a better
suggestion.

All the best,
      Serhei

---------- Forwarded message ---------
From: Kun <mingkunone@qq.com>
Date: Wed, Jul 22, 2020 at 11:24 PM
Subject: systemtap global var lead to high cpu
To: smakarov <smakarov@redhat.com>


Hi,
    I have a problem with SystemTap when using a global variable.
    A simple demo follows:

global aaa
probe kernel.function("tcp_ack") {
    iphdr = __get_skb_iphdr($skb)
    if (iphdr == 0) {
        aaa = iphdr
    }
}

With this script running in our environment, which carries a 10 Gbps flow, CPU usage is nearly 100%.

Analysing the generated C code, I found that this is caused by a lock, as follows:

static void probe_6330()
{
    if (!stp_lock_probe(locks, ARRAY_SIZE(locks)))
        return;
    if (l->l_iphdr == 0) {
        global(s_global_aaa) = l->l_iphdr;
    }
}

My question: aaa only needs to be protected inside the "if",
so why is the lock taken directly at the function entry?
