public inbox for frysk@sourceware.org
* my notes from the tracing workshop
@ 2008-02-01 16:37 Andrew Cagney
  2008-02-01 22:15 ` Elena Zannoni
  2008-02-05 20:37 ` William Cohen
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Cagney @ 2008-02-01 16:37 UTC (permalink / raw)
  To: systemtap; +Cc: frysk

[The slides get published next week]


Overview

The underlying goal of the workshop was to gather information on the 
current state of tracing and monitoring technology, and identify areas 
of potential research and development.  The Canadian Government is 
looking to significantly further research in this area; and is preparing 
a report.

Broadly the talks had an embedded bent, which isn't surprising given its 
organizational origins in the telco industry.  There was a wide level of 
representation though with both large system, and deeply embedded 
viewpoints being presented.


The Technology

For most talks, the assumed approach was

    <probe> -> <filtering> -> <recorder> -> $LOG

then on the host; or in user land:

    $LOG -> <converter> -> "DB" -> <visualization>

so I'll talk to that.
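
As a rough illustration of the target-side half of that pipeline, here is a
minimal Python sketch.  Everything in it (the function names, the event
tuples, the one-line-per-event text log) is an assumption made for the
example, not something any of the presented tools is known to use.

```python
import io

def probe(events):
    """Stand-in for a probe: yields (timestamp, pid, name) tuples."""
    for event in events:
        yield event

def filtering(events, wanted_pids):
    """Drop uninteresting events early, before they reach the recorder."""
    for ts, pid, name in events:
        if pid in wanted_pids:
            yield ts, pid, name

def recorder(events, log):
    """Append each surviving event to the log as one text line."""
    count = 0
    for ts, pid, name in events:
        log.write(f"{ts} {pid} {name}\n")
        count += 1
    return count

# Hypothetical sample events; a real probe would produce these at run time.
raw_events = [(100, 1, "sys_open"), (105, 2, "sys_read"), (110, 1, "sys_close")]
log = io.StringIO()  # stands in for $LOG
recorded = recorder(filtering(probe(raw_events), wanted_pids={1}), log)
```

Filtering ahead of the recorder means that dropped events never cost any
log space.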


Probes

There were two technology camps (modified kernel, and dynamic 
probes), with the majority in the former group.  Interestingly, the 
embedded players strongly indicated that deploying the modified kernel 
was acceptable (even advantageous) - the systems were permanently 
running in flight-recorder mode so they were in a better position to do 
postmortem analysis.

The exceptions were SystemTAP and SensorPoint (Wind River) (and on the 
edge, frysk).  Both SystemTAP and SensorPoint had the same basic 
approach.  SensorPoint did have a djprobe-like mechanism working, and 
nested(?) probes (where you could specify the call chain required to 
trigger the probe - it worked by watching the functions and not by 
looking at backtraces); finally the ability to replace code on live systems.
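
To make the "watching the functions" idea concrete, here is a small Python
sketch of one way a call-chain-triggered probe could work: it tracks
function entry and return events instead of walking a backtrace.  This is
a guess at the general technique, not SensorPoint's actual implementation;
the class name, the callbacks, and the function names are all hypothetical.

```python
class ChainProbe:
    """Fire when the functions in `chain` are entered in order, outermost first.

    Other functions may be called in between; only the relative nesting
    of the chain members matters in this sketch.
    """

    def __init__(self, chain):
        self.chain = list(chain)
        self.depth = 0    # number of leading chain entries currently active
        self.stack = []   # per active call: did it advance the chain?
        self.hits = 0     # how many times the full chain was seen

    def on_entry(self, func):
        advanced = self.depth < len(self.chain) and func == self.chain[self.depth]
        if advanced:
            self.depth += 1
            if self.depth == len(self.chain):
                self.hits += 1  # the innermost function was reached: trigger
        self.stack.append(advanced)

    def on_return(self, func):
        # `func` is accepted for symmetry; the stack already knows what returns.
        if self.stack.pop():
            self.depth -= 1
```

With the chain ["outer", "inner"], an entry into "inner" on its own does
nothing, while "outer" -> "helper" -> "inner" triggers the probe once.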


Finally, the big and positive thing on probes was the kernel markers 
being accepted.  Oracle (Elena) identified that a lacking feature was 
being able to query the list of possible probe points -> embedding 
markers in the code (and hopefully having them documented in situ ????) 
will address this.  On the other hand, I picked up a few concerns 
(outside of presentations): who gets to back port this (if at all); it's 
an ABI, so who gets to maintain it long term; and what happens when someone 
refuses to accept markers in their code :-)


Filters

This is where SystemTAP and SensorPoint stood out (I think :-).  Both 
have the ability to filter events before pushing them to the recorder.  
Using SystemTAP on the kernel markers should be a wicked combination.

[Can I assume that, when there's a marked up kernel, SystemTAP inserts 
jumps instead of traps?  If fche had been giving the talk, it would have 
been my question :-)]


Recorders and logs

Zzzzz.


Converters

The consistent approach was to implement some sort of converter that 
could read arbitrary external file formats and load them into an internal form.

While there seemed to be a push to standardize on a log-file format, I got 
the impression that it was solving the wrong problem (and others too).  
Size really did matter.


"DB"

There was a strong consensus that the "internal" format of the log data 
needed to be a fast lightweight database; two vendors were using sqlite 
for instance (TPTP, the eclipse tool, didn't, but I suspect it will shortly).  
Wind River presented a discussion illustrating its advantages.

There were suggestions, and it appears a strong degree of consensus, of 
standardizing a database format, so that it could be shared amongst 
visualization tools.  I think this, and the conversion tools, will gather 
traction.  Something SystemTAP should monitor.
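
A minimal sketch of the converter-plus-"DB" step, using Python's built-in
sqlite3 module.  The text log format ("timestamp pid event"), the table
layout, and the load_log helper are assumptions for illustration only; no
standard schema had been agreed on.

```python
import sqlite3

def load_log(lines, db):
    """Convert 'timestamp pid event' text lines into rows of an events table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS events (ts INTEGER, pid INTEGER, name TEXT)")
    rows = []
    for line in lines:
        ts, pid, name = line.split()
        rows.append((int(ts), int(pid), name))
    db.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    # An index on the timestamp keeps time-range queries cheap.
    db.execute("CREATE INDEX IF NOT EXISTS events_ts ON events (ts)")
    db.commit()
    return len(rows)

# ":memory:" keeps the example self-contained; a real trace would go to a file.
db = sqlite3.connect(":memory:")
loaded = load_log(["100 1 sys_open", "105 2 sys_read", "110 1 sys_close"], db)
per_pid = db.execute(
    "SELECT pid, COUNT(*) FROM events GROUP BY pid ORDER BY pid").fetchall()
```

Once the data is in a table like this, any visualization tool that
understands the schema can run its own queries against the same file.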


Visualization.

Many visualization tools were presented (if I see another useless 
full-screen snap-shot in a slide I'll scream), most built on eclipse, 
but a few were not.  While this is a very crowded market, there seems, 
in mnsho, to still be a need for clear simple visualization tools backed 
by a database.

The quote of the day, in describing eclipse, has to be "icon diarrhea".


A few of the Talks

Me / Red Hat: SystemTAP / Frysk
(I got to do both talks).
What's the status of SystemTAP on the ARM?  Ditto for Frysk.

Robert Winsiewski / IBM: Performance analysis and debugging at IBM
It was as much about IBM as a few other companies Robert had worked for; 
it gave a general history of logging challenges in a number of 
companies.  Strongly in favor of the marker approach; and set that as a 
theme.  Two notable ideas were non-locked logging (the in-memory log 
file format handled synchronization using atomic instructions); and 
sharing memory logs between user and system.

Elena Zannoni / Oracle: Tracing at Oracle
Presented the challenges with using SystemTAP in a "binary only / clean 
room" environment.

Beth Tibbits / IBM: Eclipse Parallel Tools Platform
Underneath they are using a consolidating process that then, in turn, 
talks to a distributed collection of gdb processes (makes you cry :-); 
this basic approach is described in Bevin Brett's paper on making 
ladebug HPC.  There's work to generalize this, see http://scalabletools.org/

Andrew McDermott / Wind River: Developing OS-agnostic visualization tools.
Discussed the "DB" approach for managing all that data.

Felix Burton / Wind River: Sensorpoint Technology
Wind River's rough equivalent to SystemTAP.  Uses "C" for the probes.


--

I was asked if SystemTAP is supported on ARM (fche, I have an e-mail 
address if you want to contact them).


* Re: my notes from the tracing workshop
  2008-02-01 16:37 my notes from the tracing workshop Andrew Cagney
@ 2008-02-01 22:15 ` Elena Zannoni
  2008-02-05 20:37 ` William Cohen
  1 sibling, 0 replies; 6+ messages in thread
From: Elena Zannoni @ 2008-02-01 22:15 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: systemtap, frysk

Thanks for the notes, Andrew. Good summary.
I posted mine here, including my slides:
http://blogs.oracle.com/ezannoni/

elena




* Re: my notes from the tracing workshop
  2008-02-01 16:37 my notes from the tracing workshop Andrew Cagney
  2008-02-01 22:15 ` Elena Zannoni
@ 2008-02-05 20:37 ` William Cohen
  2008-03-03 16:57   ` Andrew Cagney
  1 sibling, 1 reply; 6+ messages in thread
From: William Cohen @ 2008-02-05 20:37 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: systemtap, frysk

Andrew Cagney wrote:

> Visualization.
> 
> Many visualization tools were presented (if I see another useless 
> full-screen snap-shot in a slide I'll scream), most built on eclipse, 
> but a few were not.  While this is a very crowded market, there seems, 
> in mnsho, to still be a need for clear simple visualization tools backed 
> by a database.
> 
> The quote of the day, in describing eclipse, has to be "icon diarrhea".

Were the tools showing more than just a simple time line of logs? Having just a 
time-line plot of when things happen is not that useful. For example, LTT had 
some time-line graphing of events. There is either too much data or too little 
data on the screen. When showing a significant portion of the time line there is a 
massive clutter of items in the time line in addition to the few events you are 
interested in. When zooming in one only sees the single event, not the other 
interesting event(s). We really want visualization tools to declutter and filter 
out as much as possible from the graphics.

-Will


* Re: my notes from the tracing workshop
  2008-02-05 20:37 ` William Cohen
@ 2008-03-03 16:57   ` Andrew Cagney
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Cagney @ 2008-03-03 16:57 UTC (permalink / raw)
  To: William Cohen; +Cc: systemtap, frysk

[sorry missed this e-mail]

William Cohen wrote:
> Andrew Cagney wrote:
>
>> Visualization.
>>
>> Many visualization tools were presented (if I see another useless 
>> full-screen snap-shot in a slide I'll scream), most built on eclipse, 
>> but a few were not.  While this is a very crowded market, there 
>> seems, in mnsho, to still be a need for clear simple visualization 
>> tools backed by a database.
>>
>> The quote of the day, in describing eclipse, has to be "icon diarrhea".
>
> Were the tools showing more than just a simple time line of logs? 
> Having just a time-line plot of when things happen is not that useful. 
> For example, LTT had some time-line graphing of events. There is 
> either too much data or too little data on the screen. When showing a 
> significant portion of the time line there is a massive clutter of items 
> in the time line in addition to the few events you are interested in. 
> When zooming in one only sees the single event, not the other 
> interesting event(s). We really want visualization tools to declutter 
> and filter out as much as possible from the graphics.
>

Yes,

- two tools demonstrated some form of 3d visualization (I know there 
were at least two because each project doing its own SWT bindings 
to OpenGL came up as a topic :-) taking a high level view.  For instance 
a 3d graph of events vs process over time.

- tptp demonstrated (the 10 minute unintended demo was far more useful 
than the slides) zooming in/out, using drag select on the high-level 
view to then zoom in

Typically the UI was using SQL queries to extract/filter the data before 
visualizing it.
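
A sketch of what such queries might look like, using Python's sqlite3
module.  The table layout, the sample events, and the helper names
overview and zoom are hypothetical; none of the tools shown is known to
use this schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (ts INTEGER, pid INTEGER, name TEXT)")
# Made-up sample data standing in for a converted trace log.
db.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (3, 1, "sched"), (7, 2, "sched"), (12, 1, "irq"),
    (15, 1, "sched"), (18, 2, "irq"), (27, 1, "sched"),
])

def overview(db, bucket):
    """High-level view: one count per time bucket, not one mark per event."""
    return db.execute(
        "SELECT (ts / ?) * ? AS bucket_start, COUNT(*) FROM events "
        "GROUP BY bucket_start ORDER BY bucket_start",
        (bucket, bucket)).fetchall()

def zoom(db, lo, hi, name):
    """Zoomed view: only events of one kind inside the selected time range."""
    return db.execute(
        "SELECT ts, pid FROM events "
        "WHERE ts BETWEEN ? AND ? AND name = ? ORDER BY ts",
        (lo, hi, name)).fetchall()
```

The overview query addresses the clutter problem by aggregating, and the
zoom query addresses it by filtering on event type as well as on time.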

Andrew


* Re: my notes from the tracing workshop
  2008-02-01 19:44 ` Frank Ch. Eigler
@ 2008-02-05 19:02   ` Andrew Cagney
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Cagney @ 2008-02-05 19:02 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: systemtap, frysk

Frank Ch. Eigler wrote:
>> [...]
>> "DB"
>> There was a strong consensus that the "internal" format of the log
>> data needed to be a fast lightweight database; two vendors were using
>> sqlite for instance (TPTP, the eclipse tool, didn't, but I suspect it will
>> shortly).  [...]
>>     
>
> This seems rather wacky.  If they're talking about gigabytes of trace
> traffic, a little wee in-memory database is a reach.  If you need to
> do declarative querying, then you need a real database with indexes
> and whatnot.  If you just need a big ass array, use BerkeleyDB.  If
> you just want strongly typed flat data on disk, go XML.  I wish I'd
> been there - perhaps my perceptions could have been falsified.
>
>   

Right, SQLite is a little wee in-process database;  it provides a 
mechanism for powerful queries without the overhead of a server 
implementation.  Wind River presented performance numbers supporting the 
approach's usability; in particular timings such as populating the 
database from trace logs.  While that cost is real, it is outweighed by 
the benefit of being able to select arbitrary data-sets for visualization.

For SystemTAP, provided the on-disk raw trace data format is well 
defined, it will be possible to load it into SQL.  BTW, while the 
on-disk format could be XML,  there was a general feeling that XML is 
just too verbose and more compact intermediate formats were needed.
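
To illustrate the size argument, here is a small Python sketch comparing a
fixed-size binary record with an XML rendering of the same event.  The
record layout (64-bit timestamp, 32-bit pid, 16-bit event id) and the XML
element are invented for the example; they are not SystemTAP's format.

```python
import struct

# Hypothetical little-endian record: u64 timestamp, u32 pid, u16 event id.
RECORD = struct.Struct("<QIH")

def pack_event(ts, pid, event_id):
    """Encode one event as a fixed-size binary record."""
    return RECORD.pack(ts, pid, event_id)

def unpack_log(blob):
    """Yield (ts, pid, event_id) tuples, ready for an SQL INSERT."""
    for offset in range(0, len(blob), RECORD.size):
        yield RECORD.unpack_from(blob, offset)

binary = pack_event(100, 1, 7)
xml = b'<event ts="100" pid="1" id="7"/>'
smaller = len(binary) < len(xml)
```

Because the layout is well defined, unpack_log can feed the records
straight into a database loader without any text parsing.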

Can I recommend looking through the relevant slides?

>   
>> [...] What's the status of SystemTAP on the ARM?  [...]
>>     
>
> I haven't run it personally, but others have (Eugene Teo for the Nokia
> N800).  One difficulty appears to be finding a big enough ARM box to
> self-host the kernel module build process, or else cross-compiling and
> cross-running.
>   
Cool, ARM support for libunwind is currently being integrated.

Andrew


* Re: my notes from the tracing workshop
       [not found] <47A34AA2.5070404__28393.9727153212$1201883893$gmane$org@redhat.com>
@ 2008-02-01 19:44 ` Frank Ch. Eigler
  2008-02-05 19:02   ` Andrew Cagney
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2008-02-01 19:44 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: systemtap, frysk

Andrew Cagney <cagney@redhat.com> writes:

> [...]
> Overview

Thank you!

> [...]
> The exceptions were SystemTAP and SensorPoint (Wind River) (and on the
> edge, frysk).  Both SystemTAP and SensorPoint had the same basic
> approach.  SensorPoint did have a djprobe-like mechanism working,
> and nested(?) probes (where you could specify the call chain required
> to trigger the probe - it worked by watching the functions and not by
> looking at backtraces); 

We will approximate this with the incoming optimizations for
"conditional probes".

> finally the ability to replace code on live systems.

I dunno when we'll try to address this aspect.


> Finally, the big and positive thing on probes was the kernel
> markers being accepted.  [...]
>
> This is where SystemTAP and SensorPoint stood out (I think :-).  Both
> have the ability to filter events before pushing them to the recorder.
> Using SystemTAP on the kernel markers should be a wicked combination.

Yeah, we hope so!

> [Can I assume that, when there's a marked up kernel, SystemTAP
> inserts jumps instead of traps?]

Indeed - or rather, the kernel marker API does this for us.  We become
just a client.  This was what my "integration platform for probing"
title line was all about - we can attach natively to multiple
instrumentation systems and present them in a cohesive manner.


> [...]
> "DB"
> There was a strong consensus that the "internal" format of the log
> data needed to be a fast lightweight database; two vendors were using
> sqlite for instance (TPTP, the eclipse tool, didn't, but I suspect it will
> shortly).  [...]

This seems rather wacky.  If they're talking about gigabytes of trace
traffic, a little wee in-memory database is a reach.  If you need to
do declarative querying, then you need a real database with indexes
and whatnot.  If you just need a big ass array, use BerkeleyDB.  If
you just want strongly typed flat data on disk, go XML.  I wish I'd
been there - perhaps my perceptions could have been falsified.


> [...] What's the status of SystemTAP on the ARM?  [...]

I haven't run it personally, but others have (Eugene Teo for the Nokia
N800).  One difficulty appears to be finding a big enough ARM box to
self-host the kernel module build process, or else cross-compiling and
cross-running.

- FChE

