* systemtap/pcp integration @ 2014-07-17 21:14 David Smith 2014-07-18 15:49 ` Frank Ch. Eigler ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: David Smith @ 2014-07-17 21:14 UTC (permalink / raw) To: Systemtap List, pcp [-- Attachment #1: Type: text/plain, Size: 1035 bytes --] Here's a small update on the prototype systemtap/pcp integration work I'm doing. I've create a systemtap branch, called 'dsmith/mmv' that contains all my work. Basically this work allows systemtap to create 'mmv' memory mapped files. If you checkout and build that, then you should be able to run the attached systemtap script. This script is a translation of pcp's example python mmv script found in src/python/mmv.py. Note that systemtap will create a file called 'mmv' in /proc/systemtap/{MODULE_NAME}. I've just been using pcp's 'mmvdump' utility to dump the contents of the /proc/systemtap/{MODULE_NAME}/mmv file. Currently the pcp mmv pmda only looks in one place for mmv files, but it might be possible to create a symbolic link to systemtap's mmv file to make it happy. The code works for the attached script, but I'm sure it is quite fragile. Things like locking, error checking, documentation, etc. need to be done. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) [-- Attachment #2: mmv.stp --] [-- Type: text/plain, Size: 1286 bytes --] global inst0, inst1, counter_metric, instant_metric, indom_metric global instant_value probe begin { inst0 = mmv_add_instance(0, "zero") inst1 = mmv_add_instance(1, "hero") printf("instances: %d, %d\n", inst0, inst1) indom0 = mmv_add_indom(1, "We can be heroes", "Set of instances from zero to hero") mmv_add_indom_instance(indom0, inst0) mmv_add_indom_instance(indom0, inst1) printf("indom: %d\n", indom0) counter_metric = mmv_add_metric("counter", 1, MMV_TYPE_NUMBER, MMV_SEM_COUNTER, mmv_units(0, 0, 1, 0, 0, 0), 0, "Example counter metric", "Yep, a test counter metric") instant_metric = mmv_add_metric("instant", 2, MMV_TYPE_NUMBER, MMV_SEM_INSTANT, mmv_units(0, 0, 0, 0, 0, 0), 0, "Example instant metric", "Yep, a test instantaneous metric") indom_metric = mmv_add_metric("indom", 3, MMV_TYPE_NUMBER, MMV_SEM_DISCRETE, mmv_units(0, 0, 0, 0, 0, 0), 1, "", "") printf("metrics: %d, %d, %d\n", counter_metric, instant_metric, indom_metric) mmv_stats_start(42, 0) instant_value = mmv_lookup_value(instant_metric, 0) mmv_set_value(instant_value, 41) mmv_inc_value(instant_value, 2) } probe timer.s(1) { mmv_inc_value(instant_value, 3) } probe end { mmv_stats_stop() } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-17 21:14 systemtap/pcp integration David Smith @ 2014-07-18 15:49 ` Frank Ch. Eigler 2014-07-18 17:02 ` David Smith 2014-07-18 21:10 ` [pcp] " William Cohen 2014-07-22 1:32 ` Nathan Scott 2 siblings, 1 reply; 16+ messages in thread From: Frank Ch. Eigler @ 2014-07-18 15:49 UTC (permalink / raw) To: David Smith; +Cc: Systemtap List, pcp Hi, David - > Here's a small update on the prototype systemtap/pcp integration work > I'm doing. Thanks, what a great start. > I've create a systemtap branch, called 'dsmith/mmv' that contains > all my work. Basically this work allows systemtap to create 'mmv' > memory mapped files. (That's git://sourceware.org/git/systemtap.git branch dsmith/mmv.) > [...] The code works for the attached script, but I'm sure it is > quite fragile. Things like locking, error checking, documentation, > etc. need to be done. [...] Overall, are you happy with the general approach of reusing the exact MMV format (and thus the PMDA)? At one point I suggested reworking the earlier prototype so that the bulk of the MMV format's emulation would be based on tapset script code (and possibly more declarative / dynamic / safe) rather than C. Have you come to any conclusions about the propriety of that? How much post-initialization change can the MMV format tolerate, as regarding indom contents or metric availability? I assume such metadata changes are synchronized with the PMDA via the generation numbers. Moving around contents of the mmap region as in __stp_mmv_alloc_data_item sounds like it leaves the data inconsistent during the process; does it need similar protection? - FChE ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-18 15:49 ` Frank Ch. Eigler @ 2014-07-18 17:02 ` David Smith 2014-07-18 18:27 ` Frank Ch. Eigler 0 siblings, 1 reply; 16+ messages in thread From: David Smith @ 2014-07-18 17:02 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: Systemtap List, pcp On 07/18/2014 10:49 AM, Frank Ch. Eigler wrote: > > Hi, David - > >> Here's a small update on the prototype systemtap/pcp integration work >> I'm doing. > > Thanks, what a great start. > >> I've create a systemtap branch, called 'dsmith/mmv' that contains >> all my work. Basically this work allows systemtap to create 'mmv' >> memory mapped files. > > (That's git://sourceware.org/git/systemtap.git branch dsmith/mmv.) > >> [...] The code works for the attached script, but I'm sure it is >> quite fragile. Things like locking, error checking, documentation, >> etc. need to be done. [...] > > Overall, are you happy with the general approach of reusing the exact > MMV format (and thus the PMDA)? That's a good question. I certainly started with reusing the exact MMV format because that's a working format with an existing data consumer on the pcp side of things. Note that the current systemtap implementation is actually a subset of the full pcp MMV format - currently it only supports 64-bit integers (with string values to come). However, as I've worked with the MMV format I've come to realize its limitations. As Nathan has pointed out in another email, the MMV format is designed to only support exporting values, and isn't suited for more event-like tracing. As far as the more technical side of things goes, some of the internal offset logic might be done better/differently. > At one point I suggested reworking the earlier prototype so that the > bulk of the MMV format's emulation would be based on tapset script > code (and possibly more declarative / dynamic / safe) rather than C. > Have you come to any conclusions about the propriety of that? I've been focused on other things, like reworking the allocation logic. As you describe it above, I'm not sure I see where you are headed. Certainly when I add systemtap's dyninst support into the picture, I was planning on sharing that code between systemtap's linux and dyninst runtimes. > How much post-initialization change can the MMV format tolerate, as > regarding indom contents or metric availability? I assume such > metadata changes are synchronized with the PMDA via the generation > numbers. Moving around contents of the mmap region as in > __stp_mmv_alloc_data_item sounds like it leaves the data inconsistent > during the process; does it need similar protection? The MMV format doesn't support any post-initialization changes - once you call "start" the file format can't change without removing and recreating the file. (The reader knows "start" has been called based on when the generation numbers match.) As far as systemtap is concerned, all the memory moving that __stp_mmv_alloc_data_item() does is done before calling "start", so the data being inconsistent as far as the reader is concerned doesn't matter. For the linux runtime side of things, we could not allow reads of the memory-mapped data until "start" has been called (and the data is consistent). -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-18 17:02 ` David Smith @ 2014-07-18 18:27 ` Frank Ch. Eigler 2014-07-22 14:25 ` David Smith 0 siblings, 1 reply; 16+ messages in thread From: Frank Ch. Eigler @ 2014-07-18 18:27 UTC (permalink / raw) To: David Smith; +Cc: Systemtap List, pcp Hi - dsmith wrote: > [...] > > Overall, are you happy with the general approach of reusing the exact > > MMV format (and thus the PMDA)? > > [...] > However, as I've worked with the MMV format I've come to realize its > limitations. As Nathan has pointed out in another email, the MMV format > is designed to only support exporting values, and isn't suited for more > event-like tracing. As far as the more technical side of things goes, > some of the internal offset logic might be done better/differently. An application of pmda/mmv & pmda/logger to the same stap module could perhaps accomplish both goals (assuming we consider the pcp events overengineered to the extent that supplying timestamped strings is sufficient). Have you considered an alternative unified design? This reminds me of another PCP PMDA we've mentioned in the past: a JSON fetcher/parser. We'll need something like this for a variety of non-systemtap purposes too (interop with JSON-producing tools). What if stap were to produce pcp metrics in the form of /proc/systemtap/* JSON files that the PMDA would read on demand? (The cost of the parsing overhead may be low enough not to worry about it.) A separate generated JSON file could provide metadata. That format could be rich enough to contain events too (mapped from arrays of string). > > At one point I suggested reworking the earlier prototype so that the > > bulk of the MMV format's emulation would be based on tapset script > > code (and possibly more declarative / dynamic / safe) rather than C. > > Have you come to any conclusions about the propriety of that? > > I've been focused on other things, like reworking the allocation logic. > As you describe it above, I'm not sure I see where you are headed. To spell it out, the idea was to encode the mmv format logic (including metadata management) within a stap tapset script instead of as C in the runtime. Then the C runtime would need to do nothing but provide a memory-mapped-byte-array kind of abstraction, and a way for the script code to read/write it (maybe a variant of sprintf("%b...")?). > [...] > > How much post-initialization change can the MMV format tolerate, as > > regarding indom contents or metric availability? I assume such > > metadata changes are synchronized with the PMDA via the generation > > numbers. Moving around contents of the mmap region as in > > __stp_mmv_alloc_data_item sounds like it leaves the data inconsistent > > during the process; does it need similar protection? > > The MMV format doesn't support any post-initialization changes - once > you call "start" the file format can't change without removing and > recreating the file. (The reader knows "start" has been called based on > when the generation numbers match.) [...] (I think generation numbers can be changed during the run, to trigger a pmda/mmv reload, but don't know how thoroughly that works.) - FChE ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-18 18:27 ` Frank Ch. Eigler @ 2014-07-22 14:25 ` David Smith 2014-07-22 15:21 ` Frank Ch. Eigler 0 siblings, 1 reply; 16+ messages in thread From: David Smith @ 2014-07-22 14:25 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: Systemtap List, pcp On 07/18/2014 01:27 PM, Frank Ch. Eigler wrote: > Hi - > > dsmith wrote: > >> [...] >>> Overall, are you happy with the general approach of reusing the exact >>> MMV format (and thus the PMDA)? >> >> [...] >> However, as I've worked with the MMV format I've come to realize its >> limitations. As Nathan has pointed out in another email, the MMV format >> is designed to only support exporting values, and isn't suited for more >> event-like tracing. As far as the more technical side of things goes, >> some of the internal offset logic might be done better/differently. > > An application of pmda/mmv & pmda/logger to the same stap module could > perhaps accomplish both goals (assuming we consider the pcp events > overengineered to the extent that supplying timestamped strings is > sufficient). Have you considered an alternative unified design? > > This reminds me of another PCP PMDA we've mentioned in the past: a > JSON fetcher/parser. We'll need something like this for a variety of > non-systemtap purposes too (interop with JSON-producing tools). What > if stap were to produce pcp metrics in the form of /proc/systemtap/* > JSON files that the PMDA would read on demand? (The cost of the > parsing overhead may be low enough not to worry about it.) A separate > generated JSON file could provide metadata. That format could be rich > enough to contain events too (mapped from arrays of string). I think a JSON fetcher/parser is a good idea. >>> At one point I suggested reworking the earlier prototype so that the >>> bulk of the MMV format's emulation would be based on tapset script >>> code (and possibly more declarative / dynamic / safe) rather than C. >>> Have you come to any conclusions about the propriety of that? >> >> I've been focused on other things, like reworking the allocation logic. >> As you describe it above, I'm not sure I see where you are headed. > > To spell it out, the idea was to encode the mmv format logic > (including metadata management) within a stap tapset script instead of > as C in the runtime. Then the C runtime would need to do nothing but > provide a memory-mapped-byte-array kind of abstraction, and a way for > the script code to read/write it (maybe a variant of > sprintf("%b...")?). Hmm. I think I see where you are going with this, but I don't know if it will work well for a couple of reasons. One is that you don't just write to this array, you have to be able to read it back (in order to shuffle things around, hook up instances to indoms, etc.). In addition, the best way to ensure that a client can read the data produced by the mmv stuff is to use the same header file so we know the data is laid out the same way. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-22 14:25 ` David Smith @ 2014-07-22 15:21 ` Frank Ch. Eigler 2014-07-22 21:02 ` David Smith 0 siblings, 1 reply; 16+ messages in thread From: Frank Ch. Eigler @ 2014-07-22 15:21 UTC (permalink / raw) To: David Smith; +Cc: Systemtap List, pcp Hi, David - > [...] > I think a JSON fetcher/parser is a good idea. Good enough to redirect from MMV? > [...] > > To spell it out, the idea was to encode the mmv format logic > > (including metadata management) within a stap tapset script instead of > > as C in the runtime. Then the C runtime would need to do nothing but > > provide a memory-mapped-byte-array kind of abstraction, and a way for > > the script code to read/write it (maybe a variant of > > sprintf("%b...")?). > > Hmm. I think I see where you are going with this, but I don't know if it > will work well for a couple of reasons. One is that you don't just write > to this array, you have to be able to read it back (in order to shuffle > things around, hook up instances to indoms, etc.). (Not if the stap tapset just rewrites the whole metadata if necessary.) > In addition, the best way to ensure that a client can read the data > produced by the mmv stuff is to use the same header file so we know > the data is laid out the same way. Right ... though since MMV is a fixed ABI, a completely separate implementation would help test the robustness of its specification. - FChE ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-22 15:21 ` Frank Ch. Eigler @ 2014-07-22 21:02 ` David Smith 0 siblings, 0 replies; 16+ messages in thread From: David Smith @ 2014-07-22 21:02 UTC (permalink / raw) To: Frank Ch. Eigler; +Cc: Systemtap List, pcp On 07/22/2014 10:21 AM, Frank Ch. Eigler wrote: > Hi, David - > > >> [...] >> I think a JSON fetcher/parser is a good idea. > > Good enough to redirect from MMV? I think MMV and a JSON fetcher/parser aren't mutually exclusive. I'd say they are complementary. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-17 21:14 systemtap/pcp integration David Smith 2014-07-18 15:49 ` Frank Ch. Eigler @ 2014-07-18 21:10 ` William Cohen 2014-07-21 15:43 ` David Smith 2014-07-22 1:32 ` Nathan Scott 2 siblings, 1 reply; 16+ messages in thread From: William Cohen @ 2014-07-18 21:10 UTC (permalink / raw) To: David Smith, Systemtap List, pcp On 07/17/2014 05:14 PM, David Smith wrote: > Here's a small update on the prototype systemtap/pcp integration work > I'm doing. I've create a systemtap branch, called 'dsmith/mmv' that > contains all my work. Basically this work allows systemtap to create > 'mmv' memory mapped files. > > If you checkout and build that, then you should be able to run the > attached systemtap script. This script is a translation of pcp's example > python mmv script found in src/python/mmv.py. > > Note that systemtap will create a file called 'mmv' in > /proc/systemtap/{MODULE_NAME}. I've just been using pcp's 'mmvdump' > utility to dump the contents of the /proc/systemtap/{MODULE_NAME}/mmv > file. Currently the pcp mmv pmda only looks in one place for mmv files, > but it might be possible to create a symbolic link to systemtap's mmv > file to make it happy. > > The code works for the attached script, but I'm sure it is quite > fragile. Things like locking, error checking, documentation, etc. need > to be done. > > > > _______________________________________________ > pcp mailing list > pcp@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/pcp > Hi David, I was able to get this to work on RHEL 7. However, on fedora20 the mmvdump gets a sigsegv in the dump() function. -Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-18 21:10 ` [pcp] " William Cohen @ 2014-07-21 15:43 ` David Smith 2014-07-21 15:54 ` William Cohen 0 siblings, 1 reply; 16+ messages in thread From: David Smith @ 2014-07-21 15:43 UTC (permalink / raw) To: William Cohen, Systemtap List, pcp On 07/18/2014 04:10 PM, William Cohen wrote: > Hi David, > > I was able to get this to work on RHEL 7. However, on fedora20 the mmvdump > gets a sigsegv in the dump() function. Thanks for the bug report. I believe I just fixed this one, so after updating let me know if it doesn't work for you. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-21 15:43 ` David Smith @ 2014-07-21 15:54 ` William Cohen 0 siblings, 0 replies; 16+ messages in thread From: William Cohen @ 2014-07-21 15:54 UTC (permalink / raw) To: David Smith, Systemtap List, pcp On 07/21/2014 11:43 AM, David Smith wrote: > On 07/18/2014 04:10 PM, William Cohen wrote: >> Hi David, >> >> I was able to get this to work on RHEL 7. However, on fedora20 the mmvdump >> gets a sigsegv in the dump() function. > > Thanks for the bug report. I believe I just fixed this one, so after > updating let me know if it doesn't work for you. > Hi David, That fix made fedora 20 happy. In one window did: $ ./systemtap_write/install/bin/stap mmv.stp instances: 0, 1 indom: 0 metrics: 0, 1, 2 In another ran the mmvdump from pcp: $ ./mmvdump /proc/systemtap/stap_cb2f18a783870e1474713324d01a25f_25939/mmv MMV file = /proc/systemtap/stap_cb2f18a783870e1474713324d01a25f_25939/mmv Version = 1 Generated = 769479 TOC count = 5 Cluster = 42 Process = 0 Flags = 0x0 TOC[0]: offset 40, indoms offset 1848 (1 entries) [1/1848] 2 instances, starting at offset 152 shorttext=We can be heroes helptext=Set of instances from zero to hero TOC[1]: offset 56, instances offset 152 (2 entries) [1/152] instance = [0 or "zero"] [1/232] instance = [1 or "hero"] TOC[2]: toc offset 72, metrics offset 1880 (3 entries) [1/1880] counter type=64-bit int (0x2), sem=counter (0x1), pad=0x0 units=count (no indom) shorttext=Example counter metric helptext=Yep, a test counter metric [2/1984] instant type=64-bit int (0x2), sem=instant (0x3), pad=0x0 units= (no indom) shorttext=Example instant metric helptext=Yep, a test instantaneous metric [3/2088] indom type=64-bit int (0x2), sem=discrete (0x4), pad=0x0 units= indom=1 (no shorttext) (no helptext) TOC[3]: offset 88, values offset 2192 (4 entries) [1/2192] counter = 0 [2/2224] instant = 325 [3/2256] indom[0 or "zero"] = 0 [3/2288] indom[0 or "zero"] = 0 TOC[4]: offset 104, string offset 312 (6 entries) [1/312] We can be heroes [2/568] Set of instances from zero to hero [3/824] Example counter metric [4/1080] Yep, a test counter metric [5/1336] Example instant metric [6/1592] Yep, a test instantaneous metric -Will ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-17 21:14 systemtap/pcp integration David Smith 2014-07-18 15:49 ` Frank Ch. Eigler 2014-07-18 21:10 ` [pcp] " William Cohen @ 2014-07-22 1:32 ` Nathan Scott 2014-07-22 2:57 ` Frank Ch. Eigler 2014-07-22 14:50 ` [pcp] " David Smith 2 siblings, 2 replies; 16+ messages in thread From: Nathan Scott @ 2014-07-22 1:32 UTC (permalink / raw) To: David Smith; +Cc: Systemtap List, pcp [-- Attachment #1: Type: text/plain, Size: 2458 bytes --] Hi David, ----- Original Message ----- > [...] > Note that systemtap will create a file called 'mmv' in > /proc/systemtap/{MODULE_NAME}. I've just been using pcp's 'mmvdump' > utility to dump the contents of the /proc/systemtap/{MODULE_NAME}/mmv > file. Currently the pcp mmv pmda only looks in one place for mmv files, > but it might be possible to create a symbolic link to systemtap's mmv > file to make it happy. (OOC, what's {MODULE_NAME} in this context?) A symlink would sorta work but it feels like a bit of a workaround - the PMDA is written to be able to detect arrival/departure of new MMV files based on changes in a directory (and the location of that directory is parameterised via /etc/pcp.conf variables). I'd not recommend trying to find it within a stap script ... I imagine it will be cleaner if we go for separate user/kernel locations for MMV files. Attached patch (lightly tested) implements the PCP side of things with that in mind - with this patch (and making the kernel code manage the lifecycle of separate /proc/mmv/* entries), things should begin to work out-of-the-box (the MMV PMDA is already default-enabled in the default pmcd config file, so everything else is in place for you). It also occurs to me that there's not really anything systemtap-specific about the kernel MMV instrumentation you've done, or is there? A device driver author might want to instrument her code, for example, without needing a stap install/script...? It would be worth thinking about if the MMV bits in your script could become a more general kernel module for others to use too (in addition to stap, of course!). The extremely lightweight nature of the interface is such that it can be permanently enabled (that's certainly how the userspace MMV is used). Nice work anyway ... I'm looking forward to seeing the code in wider use. Oh, to use the attached patch with PCP tools - apply to a "dev" PCP tree (git.performancecopilot.org/pcp), build, cd src/pmdas/mmv/src, and ... $ pminfo -L -Kclear -Kadd,70,./pmda_mmv.so,mmv_init mmv ... will serve as a quick sanity test that the metrics are functioning. You can also use gdb on /usr/bin/pminfo to inspect behaviour of the PMDA. Once thats going, installing it properly ('make install') and restarting the pmcd service should get you a fully-functional PCP setup with your kernel code firing on all cylinders (pmchart, pmie, everything will work at that point). cheers. -- Nathan [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: proc-mmv.patch --] [-- Type: text/x-patch; name=proc-mmv.patch, Size: 5538 bytes --] diff --git a/src/pmdas/mmv/src/mmv.c b/src/pmdas/mmv/src/mmv.c index f3ef26f..92cd4e1 100644 --- a/src/pmdas/mmv/src/mmv.c +++ b/src/pmdas/mmv/src/mmv.c @@ -42,15 +42,21 @@ static int incnt; static int reload; static __pmnsTree * pmns; -static int statsdir_code; /* last statsdir stat code */ -static time_t statsdir_ts; /* last statsdir timestamp */ static char * prefix = "mmv"; static char * pcptmpdir; /* probably /var/tmp */ static char * pcpvardir; /* probably /var/pcp */ static char * pcppmdasdir; /* probably /var/pcp/pmdas */ static char pmnsdir[MAXPATHLEN]; /* pcpvardir/pmns */ -static char statsdir[MAXPATHLEN]; /* pcptmpdir/<prefix> */ + +typedef struct { + char path[MAXPATHLEN]; + int status; /* last statsdir stat return code */ + time_t timestamp; /* last statsdir modify timestamp */ +} statsdir_t; + +static statsdir_t sdirlist[2]; /* userspace and kernel MMV files */ +#define SDIRCOUNT (sizeof(sdirlist)/sizeof(sdirlist[0])) typedef struct { char * name; /* strdup client name */ @@ -364,13 +370,44 @@ create_indom(pmdaExt *pmda, stats_t *s, mmv_disk_indom_t *id, pmInDom indom) return 0; } +static int +create_client_statdir(statsdir_t *statsdir) +{ + struct dirent **files; + int need_reload = 0; + int i, num; + + num = scandir(statsdir->path, &files, NULL, NULL); + for (i = 0; i < num; i++) { + struct stat statbuf; + char path[MAXPATHLEN]; + char *client; + + if (files[i]->d_name[0] == '.') + continue; + + client = files[i]->d_name; + sprintf(path, "%s%c%s", statsdir->path, __pmPathSeparator(), client); + + if (stat(path, &statbuf) >= 0 && S_ISREG(statbuf.st_mode)) + if (create_client_stat(client, path, statbuf.st_size) == -EAGAIN) + need_reload = 1; + } + + for (i = 0; i < num; i++) + free(files[i]); + if (num > 0) + free(files); + + return need_reload; +} + static void map_stats(pmdaExt *pmda) { - struct dirent **files; char name[64]; int need_reload = 0; - int i, j, k, sts, num; + int i, j, k, sts; if (pmns) __pmFreePMNS(pmns); @@ -407,27 +444,8 @@ map_stats(pmdaExt *pmda) scnt = 0; } - num = scandir(statsdir, &files, NULL, NULL); - for (i = 0; i < num; i++) { - struct stat statbuf; - char path[MAXPATHLEN]; - char *client; - - if (files[i]->d_name[0] == '.') - continue; - - client = files[i]->d_name; - sprintf(path, "%s%c%s", statsdir, __pmPathSeparator(), client); - - if (stat(path, &statbuf) >= 0 && S_ISREG(statbuf.st_mode)) - if (create_client_stat(client, path, statbuf.st_size) == -EAGAIN) - need_reload = 1; - } - - for (i = 0; i < num; i++) - free(files[i]); - if (num > 0) - free(files); + for (i = 0; i < SDIRCOUNT; i++) + need_reload += create_client_statdir(&sdirlist[i]); for (i = 0; slist && i < scnt; i++) { stats_t * s = slist + i; @@ -609,11 +627,38 @@ mmv_fetchCallBack(pmdaMetric *mdesc, unsigned int inst, pmAtomValue *atom) return 0; } +static int +check_time_statsdir(statsdir_t *statsdir) +{ + struct stat s; + int need_reload = 0; + + /* + * check if the directory has been modified, reload if so; + * note modification may involve removal or newly appeared, + * a change in permissions from accessible to not (or vice- + * versa), and so on. + */ + if (stat(statsdir->path, &s) >= 0) { + if (s.st_mtime != statsdir->timestamp) { + need_reload++; + statsdir->status = 0; + statsdir->timestamp = s.st_mtime; + } + } else { + if (statsdir->status != oserror()) { + statsdir->status = oserror(); + statsdir->timestamp = 0; + need_reload++; + } + } + return need_reload; +} + static void mmv_reload_maybe(pmdaExt *pmda) { int i; - struct stat s; int need_reload = reload; /* check if generation numbers changed or monitored process exited */ @@ -629,26 +674,8 @@ mmv_reload_maybe(pmdaExt *pmda) } } - /* - * check if the directory has been modified, reload if so; - * note modification may involve removal or newly appeared, - * a change in permissions from accessible to not (or vice- - * versa), and so on. - */ - if (stat(statsdir, &s) >= 0) { - if (s.st_mtime != statsdir_ts) { - need_reload++; - statsdir_code = 0; - statsdir_ts = s.st_mtime; - } - } else { - i = oserror(); - if (statsdir_code != i) { - statsdir_code = i; - statsdir_ts = 0; - need_reload++; - } - } + for (i = 0; i < SDIRCOUNT; i++) + need_reload += check_time_statsdir(&sdirlist[i]); if (need_reload) { if (pmDebug & DBG_TRACE_APPL0) @@ -808,6 +835,7 @@ mmv_init(pmdaInterface *dp) { int m; int sep = __pmPathSeparator(); + char *procdir; if (isDSO) { pmdaDSO(dp, PMDA_INTERFACE_4, "mmv", NULL); @@ -818,12 +846,15 @@ mmv_init(pmdaInterface *dp) pcptmpdir = pmGetConfig("PCP_TMP_DIR"); pcpvardir = pmGetConfig("PCP_VAR_DIR"); pcppmdasdir = pmGetConfig("PCP_PMDAS_DIR"); + if ((procdir = getenv("PROC_STATSPATH")) == NULL) + procdir = "/proc"; - snprintf(statsdir, sizeof(statsdir), "%s%c%s", pcptmpdir, sep, prefix); snprintf(pmnsdir, sizeof(pmnsdir), "%s%c" "pmns", pcpvardir, sep); - statsdir[sizeof(statsdir)-1] = '\0'; pmnsdir[sizeof(pmnsdir)-1] = '\0'; + snprintf(sdirlist[0].path, MAXPATHLEN-1, "%s%c%s", pcptmpdir, sep, prefix); + snprintf(sdirlist[1].path, MAXPATHLEN-1, "%s%c%s", procdir, sep, prefix); + /* Initialize internal dispatch table */ if (dp->status == 0) { /* ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-22 1:32 ` Nathan Scott @ 2014-07-22 2:57 ` Frank Ch. Eigler 2014-07-22 14:50 ` [pcp] " David Smith 1 sibling, 0 replies; 16+ messages in thread From: Frank Ch. Eigler @ 2014-07-22 2:57 UTC (permalink / raw) To: Nathan Scott; +Cc: David Smith, pcp, Systemtap List Hi, Nathan - nathans wrote: > [...] > (OOC, what's {MODULE_NAME} in this context?) It is usually a machine-generated unique identifier for the systemtap script in question, such as "stap_deadbeefdeadbeefdeadbeef_2222". It is not generally predictable in advance. (A user can override the name, but that costs possible collisions.) > [...] PMDA is written to be able to detect arrival/departure of new > MMV files based on changes in a directory (and the location of that > directory is parameterised via /etc/pcp.conf variables). [...] If we do end up sticking with this MMV approach, the PMDA would probably need to use glob(3) or similar to search for "/proc/systemtap/*/mmv" instead of "/proc/mmv/*". > It also occurs to me that there's not really anything > systemtap-specific about the kernel MMV instrumentation you've done, > or is there? [...] It would be worth thinking about if the MMV > bits in your script could become a more general kernel module for > others to use too [...] Yes, perhaps, though the path of code implementing a seemingly good idea into the kernel is rarely smooth. They generally dislike infrastructure unless it's well-yearned-for. So, if you wish to pursue this goal, some serious selling on LKML will be needed. - FChE ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-22 1:32 ` Nathan Scott 2014-07-22 2:57 ` Frank Ch. Eigler @ 2014-07-22 14:50 ` David Smith 2014-07-23 10:29 ` Nathan Scott 1 sibling, 1 reply; 16+ messages in thread From: David Smith @ 2014-07-22 14:50 UTC (permalink / raw) To: Nathan Scott; +Cc: Systemtap List, pcp On 07/21/2014 08:32 PM, Nathan Scott wrote: > Hi David, > > ----- Original Message ----- >> [...] >> Note that systemtap will create a file called 'mmv' in >> /proc/systemtap/{MODULE_NAME}. I've just been using pcp's 'mmvdump' >> utility to dump the contents of the /proc/systemtap/{MODULE_NAME}/mmv >> file. Currently the pcp mmv pmda only looks in one place for mmv files, >> but it might be possible to create a symbolic link to systemtap's mmv >> file to make it happy. > > (OOC, what's {MODULE_NAME} in this context?) As Frank mentioned, {MODULE_NAME} is the name of the module, usually autogenerated. > A symlink would sorta work but it feels like a bit of a workaround - the > PMDA is written to be able to detect arrival/departure of new MMV files > based on changes in a directory (and the location of that directory is > parameterised via /etc/pcp.conf variables). I'd not recommend trying to > find it within a stap script ... I imagine it will be cleaner if we go > for separate user/kernel locations for MMV files. The symlink feels like a bit of a workaround, because it *is* one. Long term we certainly want a better way to find the systemtap MMV files. > Attached patch (lightly tested) implements the PCP side of things with > that in mind - with this patch (and making the kernel code manage the > lifecycle of separate /proc/mmv/* entries), things should begin to work > out-of-the-box (the MMV PMDA is already default-enabled in the default > pmcd config file, so everything else is in place for you). This code sounds like a step in the right direction. Thanks. -- David Smith dsmith@redhat.com Red Hat http://www.redhat.com 256.217.0141 (direct) 256.837.0057 (fax) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pcp] systemtap/pcp integration 2014-07-22 14:50 ` [pcp] " David Smith @ 2014-07-23 10:29 ` Nathan Scott 2014-07-23 14:42 ` Frank Ch. Eigler 0 siblings, 1 reply; 16+ messages in thread From: Nathan Scott @ 2014-07-23 10:29 UTC (permalink / raw) To: David Smith; +Cc: Systemtap List, pcp Hey David, ----- Original Message ----- > On 07/21/2014 08:32 PM, Nathan Scott wrote: > > Hi David, > > > > ----- Original Message ----- > >> [...] > >> Note that systemtap will create a file called 'mmv' in > >> /proc/systemtap/{MODULE_NAME}. I've just been using pcp's 'mmvdump' > >> utility to dump the contents of the /proc/systemtap/{MODULE_NAME}/mmv > >> file. Currently the pcp mmv pmda only looks in one place for mmv files, > >> but it might be possible to create a symbolic link to systemtap's mmv > >> file to make it happy. > > > > (OOC, what's {MODULE_NAME} in this context?) > > As Frank mentioned, {MODULE_NAME} is the name of the module, usually > autogenerated. Got it. So, next I'm wondering... what is it doing here, in this interface between systemtap/pcp? (interfaces being "forever" and all that, the simpler the better). From a PCP POV it (appears to) add no value, so would seem to be something we can remove/simplify? OTOH maybe not -maybe you can use it as follows: In MMV (and the existing pmdammv, in particular), the basename of these files is used to form the first component of the metric namespace - that is, unless the caller (i.e. you, in your kernel code) sets MMV_FLAG_NOPREFIX. In that case, its ignored (see also mmv_stats_init(3), "name" parameter). So, perhaps for the stap case you could always set that flag, and use those module names as the <xxx> in a /proc/mmv/<xxx> -alike scheme. That'd be a fairly simple mechanism that'd work directly with the patch I sent earlier, I think (they'd need to be files, not directories though - else some slightly more complex code is going to be needed). > > A symlink would sorta work but it feels like a bit of a workaround - the > > PMDA is written to be able to detect arrival/departure of new MMV files > > based on changes in a directory (and the location of that directory is > > parameterised via /etc/pcp.conf variables). I'd not recommend trying to > > find it within a stap script ... I imagine it will be cleaner if we go > > for separate user/kernel locations for MMV files. > > The symlink feels like a bit of a workaround, because it *is* one. Long :) > term we certainly want a better way to find the systemtap MMV files. > > > Attached patch (lightly tested) implements the PCP side of things with > > that in mind - with this patch (and making the kernel code manage the > > lifecycle of separate /proc/mmv/* entries), things should begin to work > > out-of-the-box (the MMV PMDA is already default-enabled in the default > > pmcd config file, so everything else is in place for you). > > This code sounds like a step in the right direction. Well, hopefully something to experiment with anyway; it might get you close to a complete end-to-end working scenario. > [...] > I think MMV and a JSON fetcher/parser aren't mutually exclusive. I'd say > they are complementary. Using memory-mappings to share between user/kernel (or user/user as existing PCP MMV code does) requires fixed locations in the mapping - we're accessing it via pointers on both sides of the fence. Entirely-string representations (which is what I guess the intention with a JSON interface would be?) would lose that property, since the strings (and the values in particular) do not have fixed offsets within the mapping file anymore. I guess you'd have to then completely start over again (for ... reasons?) and implement it more like a traditional /proc interface (via seq_file), where userspace sampling is via open/read/close, and not the mapped memory model where the kernel is actively writing while userspace is actively reading. I haven't seen any other need for a generic JSON interface, so can't really comment as to its suitability for this kind of thing. There is one existing JSON PMDA - the elasticsearch PMDA uses JSON encodings fairly extensively - but it's very specific & not a generic representation like you're after here. Maybe you can use it to help gauge levels of complexity to expect though. cheers. -- Nathan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-23 10:29 ` Nathan Scott @ 2014-07-23 14:42 ` Frank Ch. Eigler 2014-07-24 7:40 ` Nathan Scott 0 siblings, 1 reply; 16+ messages in thread From: Frank Ch. Eigler @ 2014-07-23 14:42 UTC (permalink / raw) To: Nathan Scott; +Cc: David Smith, pcp, Systemtap List nathans wrote: > Got it. So, next I'm wondering... what is [the module name] doing > here, in this interface between systemtap/pcp? [...] systemtap is not a single bundle of instrumentation. It is a tool for building/running many instrumentation scripts. The unique module name lets the kernel tell them apart. > In MMV (and the existing pmdammv, in particular), the basename of > these files is used to form the first component of the metric > namespace [...] That's a most straightforward possibility, though not the most user-friendly one, as the module names change from run to run. Perhaps the systemtap script could propose a pmns prefix via an auxiliary file. > [...] I guess you'd have to then completely start over again [for > JSON] Yes, but the systemtap side of this would be pretty trivial. Printing JSON strings is easy. > (for ... reasons?) Other messages in the thread pointed out some reasons, namely: - ability to generalize to event traffic, not just sampled metrics - robustness, by avoiding fragile C code > [...] I haven't seen any other need for a generic JSON interface > [...] In previous notes it was pointed out that JSON is a good encoding for stats data, because it is exported by other tools like ceph, zabbix, and many others including > [...] elasticsearch [...] So the proposal is to think about a single general JSON PMDA that can be configured to bridge data from multiple JSON-emitting applications, including systemtap scripts. - FChE ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: systemtap/pcp integration 2014-07-23 14:42 ` Frank Ch. Eigler @ 2014-07-24 7:40 ` Nathan Scott 0 siblings, 0 replies; 16+ messages in thread From: Nathan Scott @ 2014-07-24 7:40 UTC (permalink / raw) To: Frank Ch. Eigler, David Smith; +Cc: pcp, Systemtap List ----- Original Message ----- > [...] > Yes, but the systemtap side of this would be pretty trivial. Printing > JSON strings is easy. Thats only a small part of the problem though (even just considering the systemtap side of things), and there are many advantages to the MMV format being so close to the representation the PMDA needs to export. > > (for ... reasons?) > > Other messages in the thread pointed out some reasons Those references were a little too vague for me to follow the intent. > - ability to generalize to event traffic, not just sampled metrics That's not really a reason to start over though - the MMV format is versioned, it could easily be extended and all its many advantages kept. Or, a separate PMDA might make a better option there - there is no reason one PMDA needs to service both these (very different) tracing/sampling data models. Anyway, do keep in mind the oft-mentioned cost of string conversions (Ken mentioned on the last call, IIRC, David mentioned his concerns originally, and I will say it again too) - every time we've profiled pmdalinux, the CPU cost is always dominated by the string extraction/ conversion code for the procfs/sysfs files. That will be orders of magnitude worse for event tracing, I imagine. > - robustness, by avoiding fragile C code Do you mean Davids systemtap code? The MMV code in PCP has been around for quite some time, and I know of a number of production environments using it 24x7 on hundreds of machines, for many years - its not fragile at all. Stability in Davids code will come over time, if there really is an issue there (?) but experience would suggest there is no inherent flaw in the approach he's using. > So the proposal is to think about a single general JSON PMDA that can > be configured to bridge data from multiple JSON-emitting applications, > including systemtap scripts. Bridges like this have been implemented several times in the past and it has never worked well. The fundamental, unsolvable-in-the-general-case problem is that we have well-defined metric semantics in PCP. So these bridges have to (attempt to) map from the usually-far-less-so,if-defined -at-all semantics offered by the other tool, and come up with the highly detailed PCP metric metadata, somehow. To date custom PMDAs have always done a better job - more simply-coded, and with domain-specific-code, as designed for PCP PMDAs (we can still use JSON there of course, as in the elasticsearch case). FWIW, I chatted to someone last week who was actually using Zabbix in a production environment - they were interested in exploring the opposite direction to what you've described above. IOW, they would like to have a PCP component that talks Zabbix protocol (JSON-alike, IIRC) such that they wouldn't have to install the Zabbix agent code everywhere. They'd like instead to rely on a PCP component (could pmwebd be taught to fill this role, perhaps? dunno) giving them everything they need for all the many machines they rollout, cookie-cutter style, from PCP. Then, their central Zabbix instance could continue doing their alerting, which they like. It sounded a little bit like the graphana/graphite exporting, and I believe its also JSON-based, so just thought I'd mention it here. cheers. -- Nathan ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2014-07-24 7:40 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-07-17 21:14 systemtap/pcp integration David Smith 2014-07-18 15:49 ` Frank Ch. Eigler 2014-07-18 17:02 ` David Smith 2014-07-18 18:27 ` Frank Ch. Eigler 2014-07-22 14:25 ` David Smith 2014-07-22 15:21 ` Frank Ch. Eigler 2014-07-22 21:02 ` David Smith 2014-07-18 21:10 ` [pcp] " William Cohen 2014-07-21 15:43 ` David Smith 2014-07-21 15:54 ` William Cohen 2014-07-22 1:32 ` Nathan Scott 2014-07-22 2:57 ` Frank Ch. Eigler 2014-07-22 14:50 ` [pcp] " David Smith 2014-07-23 10:29 ` Nathan Scott 2014-07-23 14:42 ` Frank Ch. Eigler 2014-07-24 7:40 ` Nathan Scott
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).