public inbox for systemtap@sourceware.org
* Making the transport layer more robust - Power Management - follow-up
@ 2011-12-09  4:38 Turgis, Frederic
  2011-12-15 15:31 ` Mark Wielaard
From: Turgis, Frederic @ 2011-12-09  4:38 UTC (permalink / raw)
  To: systemtap

Hi,

To summarize the past discussion in http://sourceware.org/ml/systemtap/2011-q3/msg00272.html: we want to tune systemtap so that it has as few regular wake-ups as possible. This matters to us because we monitor low-power use cases on OMAP SoCs, and the stock systemtap tool wakes up every 1 or 2 scheduler ticks.

- we looked at the code and worked around 4 causes of regular wake-ups: polling and timeouts in the userspace and kernel-space control/data channels

- in the "Making the transport layer more robust" thread, Mark presented a rework that replaces userspace polling of the control channel with a "select()" model if the target allows it. One fewer regular wake-up!

- in the same thread, Mark remarked that STP_RELAY_TIMER_INTERVAL and STP_CTL_TIMER_INTERVAL (the kernel-side polling intervals) are in fact tunables, so there is no need to modify the code:
   * to match our work-arounds, we then used -D STP_RELAY_TIMER_INTERVAL=128 -D STP_CTL_TIMER_INTERVAL=256
   * regular wake-ups clearly occur less often, with no tracing issue. But our trace bandwidth is generally hundreds of KB/s at most, so we don't really need much robustness
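
For a rough feel of what those two -D values buy, here is a small sketch in C. It assumes that both intervals are expressed in jiffies and that the kernel runs with HZ=128; neither is stated in this message, so treat both as assumptions. The helper name is made up for illustration.

```c
#include <assert.h>

/* Wake-ups per second caused by a periodic kernel timer that fires every
 * `interval_jiffies` jiffies on a kernel ticking `hz` times per second.
 * ASSUMPTION: the interval unit (jiffies) and the hz value are not taken
 * from the message; they are inputs to this back-of-envelope estimate. */
static double wakeups_per_second(unsigned hz, unsigned interval_jiffies)
{
    assert(hz > 0 && interval_jiffies > 0);
    return (double)hz / (double)interval_jiffies;
}
```

Under these assumptions, wakeups_per_second(128, 128) gives 1 wake-up per second for the relay timer and wakeups_per_second(128, 256) gives 0.5 for the control timer, compared with 128 per second for a timer that fires on every tick.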


Some more recent findings:
- while testing fixes for an ARM backtrace issue with Mark, I got the message "ctl_write_msg type=2 len=61 ENOMEM" several times at the beginning of the test (not root-caused yet). It means a lack of buffer space for msg type=2, which is OOB_DATA (error and warning messages). The test and trace data looked fine. The messages do not appear if I compile without -D STP_CTL_TIMER_INTERVAL=256.
So this looks like a consequence of our tuning, not a killer but still worth noting ;-) I guess we could see the same on the data channel

- the last non-tunable wake-up is the timeout of the userspace data channel ppoll() call in reader_thread(). Without any change, we wake up every 200ms:
   * we currently set it to 5s. No issue so far
   * Mark (or someone else) suggested using bulkmode. Here are some findings:
      + bulkmode sets the timeout to NULL (or 10s if NEED_PPOLL is set). That solves the wake-up issue. I just wonder why we have NULL in bulkmode and 200ms otherwise
      + OMAP hotplugs cores, so core 1 is generally off at the beginning of a test. Therefore I don't get any trace from core 1, even if core 1 is used later. This makes bulkmode less usable than I thought (at the least, I still need to test with core 1 "on" at the beginning of the test to see how it behaves)
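
The ENOMEM observation in the first finding above is not root-caused, but one way to picture it is a toy model of a fixed buffer pool that is emptied less often than it is filled. The sketch below is not SystemTap code: the function name, the pool size and the "one message per tick" burst are all invented for illustration.

```c
#include <assert.h>

/* Toy model (NOT SystemTap code): a pool of `pool_size` message buffers
 * receives one message per tick for `burst_ticks` ticks and is emptied
 * completely every `drain_interval` ticks. Returns how many messages
 * found the pool full, the analogue of an ENOMEM on the control channel. */
static unsigned count_dropped(unsigned pool_size, unsigned burst_ticks,
                              unsigned drain_interval)
{
    unsigned in_use = 0, dropped = 0;

    assert(drain_interval > 0);
    for (unsigned tick = 1; tick <= burst_ticks; tick++) {
        if (in_use < pool_size)
            in_use++;           /* a free buffer was available */
        else
            dropped++;          /* pool exhausted: message is lost */
        if (tick % drain_interval == 0)
            in_use = 0;         /* periodic timer drains the pool */
    }
    return dropped;
}
```

In this model a burst of 100 messages into an 8-buffer pool loses nothing when the pool is drained every 2 ticks, but loses 92 messages when it is drained every 256 ticks. That would be consistent with the messages disappearing once the control timer goes back to its default interval, though it does not prove that this is the actual cause.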


So the possibility of tuning the ppoll timeout value in non-bulkmode is still interesting. I don't really know what the consequences of directly setting it to 1s or more could be, but a tunable would be a good trade-off that does not change the current behaviour.
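
To make the idea concrete, here is a minimal sketch in C of a ppoll() wait whose timeout comes from a compile-time tunable. This is not the actual reader_thread() code: READER_PPOLL_TIMEOUT_MS, wait_for_data() and read_once() are hypothetical names used only for illustration. The 200ms default mirrors the current non-bulkmode period mentioned above, and a negative value stands for the NULL timeout that bulkmode uses.

```c
#define _GNU_SOURCE             /* ppoll() is a GNU/Linux extension */
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* HYPOTHETICAL tunable, not an existing SystemTap macro. Override with
 * e.g. -DREADER_PPOLL_TIMEOUT_MS=5000 to wake up every 5s instead. */
#ifndef READER_PPOLL_TIMEOUT_MS
#define READER_PPOLL_TIMEOUT_MS 200
#endif

/* Wait until `fd` is readable. A negative timeout_ms passes a NULL
 * timeout, so ppoll() blocks until there is data (no periodic wake-up).
 * Returns ppoll()'s result: 1 if readable, 0 on timeout, -1 on error. */
static int wait_for_data(int fd, long timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    struct timespec ts;
    struct timespec *tsp = NULL;

    assert(fd >= 0);
    if (timeout_ms >= 0) {
        ts.tv_sec = timeout_ms / 1000;
        ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
        tsp = &ts;
    }
    return ppoll(&pfd, 1, tsp, NULL);
}

/* One iteration of a reader loop: wait, then read what is available.
 * Returns bytes read (0 at end of file), 0 if the wait timed out,
 * or -1 on error. */
static ssize_t read_once(int fd, char *buf, size_t len, long timeout_ms)
{
    int ready = wait_for_data(fd, timeout_ms);

    if (ready <= 0)
        return ready;           /* 0: timed out, -1: ppoll() failed */
    return read(fd, buf, len);
}
```

A reader loop would call read_once(fd, buf, sizeof buf, READER_PPOLL_TIMEOUT_MS) repeatedly; keeping the default at 200ms leaves the current behaviour untouched unless someone explicitly overrides the value at build time.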

Well, I think I have given myself a few actions to perform!


Regards
Fred

Frederic Turgis
OMAP Platform Business Unit - OMAP System Engineering - Platform Enablement - System Multimedia




Thread overview: 3+ messages
2011-12-09  4:38 Making the transport layer more robust - Power Management - follow-up Turgis, Frederic
2011-12-15 15:31 ` Mark Wielaard
2011-12-16 15:37   ` Turgis, Frederic