control channel usermode polling vs runtime "IO check" kernel thread

public inbox for systemtap@sourceware.org

* control channel usermode polling vs runtime "IO check" kernel thread
@ 2011-01-18 14:42 Turgis, Frederic
  2011-01-18 15:21 ` Frank Ch. Eigler
  0 siblings, 1 reply; 3+ messages in thread
From: Turgis, Frederic @ 2011-01-18 14:42 UTC (permalink / raw)
  To: systemtap

Hi all,

While using systemtap on a SoC with power management enabled and a "tickless" kernel, I have observed that the systemtap/0 kernel thread wakes up regularly, every scheduler tick. I have traced this to the observations and conclusions below:

The "staprun" main loop states that "the runtime does not implement select() on the command filehandle so we poll periodically", and there is a 250ms sleep between non-blocking reads in mainloop.c.

If I understand correctly, a "control channel" read triggers _stp_ctl_read_cmd(xxx) on the runtime side. With a blocking read, it would block and be unblocked either by a _stp_ctl_send(xxx) call or by the periodic IO check in a kernel thread (_stp_ctl_work_callback() + STP_WORK_TIMER, renamed STP_CTL_TIMER_INTERVAL in v1.4).

Of course, since usermode uses polling, the kernel thread seems useless, and I removed it successfully (I could also have increased the timer).
- Is there anything that may be impacted by disabling this kernel thread?
- Are there plans to have a poll() or select() call on the usermode side?
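
To make the second question concrete, here is a minimal sketch of what a poll()-based read with a timeout could look like on the usermode side. It is only an illustration: read_cmd_with_timeout() is a hypothetical name, not an existing staprun function, and it assumes the control file would support poll(), which the staprun comment quoted above says is not the case today.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of staprun): wait up to timeout_ms for
 * data on fd, then read it.
 * Returns the number of bytes read, 0 on timeout (or end of file),
 * and -1 on error with errno set.
 */
ssize_t read_cmd_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    /* Sleep in the kernel until fd is readable or the timeout expires.
     * If a signal interrupts the wait, restart it (with the full timeout,
     * which is good enough for a sketch). */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;          /* poll() failed */
    if (ready == 0)
        return 0;           /* timeout: nothing to read */

    return read(fd, buf, len);
}
```

With such a helper the process would only be woken when a message arrives or when the timeout expires, instead of every 250ms. It can be tried on any pollable descriptor, for example the read end of a pipe.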

Regards
Fred

PS: the data channel also wakes up the platform too often, but the timers used there are easier to identify and understand.

Frederic Turgis
OMAP Platform Business Unit - OMAP System Engineering - Product System Integration

Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920


* Re: control channel usermode polling vs runtime "IO check" kernel thread
  2011-01-18 14:42 control channel usermode polling vs runtime "IO check" kernel thread Turgis, Frederic
@ 2011-01-18 15:21 ` Frank Ch. Eigler
  2011-01-28 17:22   ` Turgis, Frederic
  0 siblings, 1 reply; 3+ messages in thread
From: Frank Ch. Eigler @ 2011-01-18 15:21 UTC (permalink / raw)
  To: Turgis, Frederic; +Cc: systemtap

f-turgis wrote:

> [...]  While using systemtap on a SoC with power management enabled
> and "tickless" kernel, I have observed regular wake-ups of
> systemtap/0 kernel thread every scheduler tick. I have related
> that to below observations/conclusions:

I believe all of your observations are correct.  We haven't
sufficiently tuned this aspect of the system, and would appreciate
further advice or help.

- FChE


* RE: control channel usermode polling vs runtime "IO check" kernel thread
  2011-01-18 15:21 ` Frank Ch. Eigler
@ 2011-01-28 17:22   ` Turgis, Frederic
  0 siblings, 0 replies; 3+ messages in thread
From: Turgis, Frederic @ 2011-01-28 17:22 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap

Hi,

For now I have an analysis of the situation rather than solutions. I am looking at it from a SoC perspective (OMAP4, dual-core Cortex-A9), not a desktop or server one. Our goal is for systemtap to wake up the platform only once per second in low trace bitrate cases, i.e. systemtap should do a lot of work infrequently rather than small amounts of work (or none) at regular intervals. We use systemtap V1.3 with transport version V2 (which is not yet the ring buffer).

The issue is that kernelmode relies on polling (timer, delayed work queue) tuned to wake up every few scheduler ticks (20ms), and in the end we do nothing on most wake-ups because there is no control message and no filled data buffer ready.

"Polling" is a nice way to control your wake-ups and avoid being flooded by wake-up events, but tuning the timers can be tedious. For now we have tuned the wake-up timers to higher values, but we think an "event-based" implementation (woken when 1 buffer is ready) could be efficient:
- it is adaptive
- an acceptable limit, compared to polling, could be, say, at most 1 event per 10ms. Buffers are currently 64KB (what about the ring buffer?), which gives a max trace bitrate of 6.4MB/s.
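
The 6.4MB/s figure is simply the buffer size multiplied by the maximum number of events per second. As a small sketch of that arithmetic (max_trace_bitrate_kb() is a name made up for this illustration, not a systemtap function):

```c
#include <stdint.h>

/*
 * Hypothetical helper: maximum sustained trace bitrate, in KB per second,
 * when at most one buffer-ready event is allowed every min_interval_ms
 * milliseconds and each event hands over one buffer of buf_kb KB.
 * min_interval_ms must be non-zero.
 */
uint64_t max_trace_bitrate_kb(uint64_t buf_kb, uint64_t min_interval_ms)
{
    /* buffers per second = 1000 / min_interval_ms; multiply first so that
     * integer division does not lose precision. */
    return buf_kb * 1000 / min_interval_ms;
}
```

With the numbers above, max_trace_bitrate_kb(64, 10) is 64 * 100 = 6400 KB/s, i.e. 6.4MB/s. A larger buffer or a shorter minimum interval raises the limit proportionally.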

As an example, our least demanding use cases are MP3 and 1080P/30FPS playback. We trace every scheduler thread switch plus V4L2-related events, which gives 30KB/s. We of course have use cases with higher trace bitrates, but there we are OK with paying the wake-up price since this is debug/investigation.

So the ideas would be:
- keep "polling" and expose more tuning of timers/buffer sizes as parameters of the tools.
- introduce an "event-based" implementation. Unfortunately there seems to be a reason why it is not used for the data channel in the kernel module code (see below).
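
To compare the two ideas on wake-up count alone, here is a rough model using the figures from this mail (20ms polling, 64KB buffers, 30KB/s trace). The function names are invented for the illustration, and the model ignores everything except the number of wake-ups per second:

```c
/* Wake-ups per second when a timer fires every period_ms milliseconds,
 * whether or not there is any data to deliver. */
double polling_wakeups_per_sec(double period_ms)
{
    return 1000.0 / period_ms;
}

/* Wake-ups per second when one event is raised each time a buffer of
 * buf_kb KB has been filled by a trace producing bitrate_kb_per_sec. */
double event_wakeups_per_sec(double bitrate_kb_per_sec, double buf_kb)
{
    return bitrate_kb_per_sec / buf_kb;
}
```

With a 20ms period, polling gives 50 wake-ups per second. With 30KB/s into 64KB buffers, the event-based model gives 30/64, a little under 0.5 wake-ups per second, i.e. one buffer roughly every 2 seconds. The price is that trace data can sit in the buffer for that long before usermode sees it.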

I imagine that we would need to hear about trace bitrate requirements from other domains to conclude.

Here is what we have analyzed from the code. It is split between usermode and kernelmode.

Usermode:
- control channel: handling is based on 250ms polling, so it reads in a non-blocking way. We have increased the interval to 3s, but a real blocking read with a timeout could be good. My understanding is that control messages are only exchanged at the beginning and end of execution. With 3s, the reaction time of the control channel is longer, but we don't care.
- data channel: it is based on a blocking implementation (ppoll()) with a 200ms timeout. We simply increased the timeout to 5s. The kernel part is key here.

What is interesting about the kernelmode code is that it is rebuilt and embedded in each script. It was therefore easier to experiment with, and we created different sets of compiled scripts depending on the use case. But that is only a temporary solution.

Kernelmode:
- control channel: this was the reason for my first mail. This part seems to have some "blocking read" implementation, but internally it is based on polling through a delayed work queue that wakes up around every 20ms. As the usermode control channel is currently polling, we removed this delayed work and things work OK. What is the reason why "blocking" is not used or not possible?
- data channel: waking up usermode relies on a regular check. We increased STP_RELAY_TIMER_INTERVAL to 1s. I noticed that the xxx_switch_subbuf() callback has a comment indicating that we can't call wake_up_interruptible() from there. So we need to poll :-(

This last point is probably the key one. We have found a thread that seems to be related: http://kerneltrap.org/mailarchive/linux-kernel/2007/7/26/122021/thread. I imagine we could live with tuning this parameter for low bitrate cases and for the rest.

I admit that our team's kernel knowledge is not strong enough for us to propose an innovative solution, so please be tolerant!

Regards
Fred



