public inbox for systemtap@sourceware.org
* [Bug runtime/17101] New: [rfe] timeout for stap
@ 2014-06-30 13:51 mcermak at redhat dot com
  2014-06-30 14:22 ` [Bug runtime/17101] " dsmith at redhat dot com
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2014-06-30 13:51 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

            Bug ID: 17101
           Summary: [rfe] timeout for stap
           Product: systemtap
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: runtime
          Assignee: systemtap at sourceware dot org
          Reporter: mcermak at redhat dot com

I think it would be useful to implement some kind of run-time limit. An option
like "--timeout N" and/or an environment variable with a similar meaning
would be handy.

Such an environment variable might help when running testsuites. Exporting it
would be sufficient to adjust stap's behaviour for the whole testsuite run.

Also, a check for leftover inserted kernel modules plus some cleanup could be
performed at the end, so that subsequent testcases start from a clean slate.

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
@ 2014-06-30 14:22 ` dsmith at redhat dot com
  2014-06-30 16:34 ` fche at redhat dot com
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2014-06-30 14:22 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

David Smith <dsmith at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dsmith at redhat dot com

--- Comment #1 from David Smith <dsmith at redhat dot com> ---
For ordinary scripts, we already have a way to kill scripts after a time limit
- just call 'exit()' from a timer probe, something like the following:

====
# Kill the script after 60 seconds.
probe timer.s(60) { exit() }
====


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
  2014-06-30 14:22 ` [Bug runtime/17101] " dsmith at redhat dot com
@ 2014-06-30 16:34 ` fche at redhat dot com
  2014-07-01  7:00 ` mcermak at redhat dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2014-06-30 16:34 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #2 from Frank Ch. Eigler <fche at redhat dot com> ---
How about a macro in the tapset, and an option for stap to accept
multiple -e SCRIPT options, so that something like this would work?

   % stap FOO.script -e '@timeout(5)'
   % stap -e FOO -e '@timeout(5)'


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
  2014-06-30 14:22 ` [Bug runtime/17101] " dsmith at redhat dot com
  2014-06-30 16:34 ` fche at redhat dot com
@ 2014-07-01  7:00 ` mcermak at redhat dot com
  2014-07-03 18:45 ` fche at redhat dot com
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2014-07-01  7:00 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #3 from Martin Cermak <mcermak at redhat dot com> ---
Hmm -- how to use this with the testsuite without modifying testcases and
without side-effects? E.g. a bash wrapper for stap will not play well with stap
scripts containing a shebang.


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (2 preceding siblings ...)
  2014-07-01  7:00 ` mcermak at redhat dot com
@ 2014-07-03 18:45 ` fche at redhat dot com
  2014-07-04 11:30 ` mcermak at redhat dot com
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2014-07-03 18:45 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #4 from Frank Ch. Eigler <fche at redhat dot com> ---
(In reply to Martin Cermak from comment #3)
> Hmm -- how to use this with the testsuite without modifying testcases and
> without side-effects?

One way could be to add the -e .... bit to the systemtap-rc file, which
for test cases is populated in testsuite/lib/systemtap.exp.


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (3 preceding siblings ...)
  2014-07-03 18:45 ` fche at redhat dot com
@ 2014-07-04 11:30 ` mcermak at redhat dot com
  2014-07-04 14:26 ` fche at redhat dot com
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2014-07-04 11:30 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #5 from Martin Cermak <mcermak at redhat dot com> ---
This looks elastic :)


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (4 preceding siblings ...)
  2014-07-04 11:30 ` mcermak at redhat dot com
@ 2014-07-04 14:26 ` fche at redhat dot com
  2014-07-07 13:21 ` mcermak at redhat dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: fche at redhat dot com @ 2014-07-04 14:26 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #6 from Frank Ch. Eigler <fche at redhat dot com> ---
If putting such a timeout control into the /rc file is acceptable,
then we don't even have to compress it with a tapset macro.  We could
just teach stap to accept multiple -e SCRIPT bits, and let the /rc
file contain a fully spelled out

     -e 'probe timer.s(120) { error("timeout") }'


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (5 preceding siblings ...)
  2014-07-04 14:26 ` fche at redhat dot com
@ 2014-07-07 13:21 ` mcermak at redhat dot com
  2014-07-07 15:39 ` dsmith at redhat dot com
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2014-07-07 13:21 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #7 from Martin Cermak <mcermak at redhat dot com> ---
Sure! Plus the trivial macro can always be added too:

@define timeout(secs) %(
    probe timer.s(@secs) { error(sprintf("timeout %d", @secs)) }
%)


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (6 preceding siblings ...)
  2014-07-07 13:21 ` mcermak at redhat dot com
@ 2014-07-07 15:39 ` dsmith at redhat dot com
  2014-07-07 16:32 ` jistone at redhat dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: dsmith at redhat dot com @ 2014-07-07 15:39 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #8 from David Smith <dsmith at redhat dot com> ---
Hmm, I wonder if this rc file modification will work exactly the way we'd like.
I could see where the testcase would just ignore the new output and not fail
properly.

Martin, are there certain tests in the testsuite where you have the most
problems with missing timeout behavior?


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (7 preceding siblings ...)
  2014-07-07 15:39 ` dsmith at redhat dot com
@ 2014-07-07 16:32 ` jistone at redhat dot com
  2014-07-09 17:49 ` mcermak at redhat dot com
  2014-07-15 19:50 ` ajakop at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: jistone at redhat dot com @ 2014-07-07 16:32 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

Josh Stone <jistone at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jistone at redhat dot com

--- Comment #9 from Josh Stone <jistone at redhat dot com> ---
(In reply to Frank Ch. Eigler from comment #6)
> just teach stap to accept multiple -e SCRIPT bits

Using multiple -e is fine, but script files are chosen as the first non-option
argument.  So how do we decide between (-e SCRIPT_TEXT SCRIPT_FILE ARG...) and
(-e SCRIPT_TEXT ARG...) ?

I thought perhaps just testing the file-existence of the first argument, but I
often use things like 'process(@1).mark("*")'.  So then you'd have to have some
heuristic to decide that it's not just a file, but a real script too.

I suggest adding a new argument flag for these supplementary script pieces. 
Perhaps simply -E, and users will still be required to have a primary -e or
script file.  A -E can also be ignored by non-script modes, -l, --dump, etc.
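
A minimal shell sketch of the rule proposed above, not stap's actual option
parser: the function name classify_args and its output format are
hypothetical, chosen only for illustration. It assumes that -e carries the
primary script text, -E carries supplementary text, and that without -e the
first non-option argument is taken as the script file, so no file-existence
heuristic is needed.

```sh
#!/bin/sh
# Hypothetical sketch: classify stap-like arguments under the proposed -E rule.
#   -e TEXT   primary script text (the last one wins in this sketch)
#   -E TEXT   supplementary script text (any number allowed)
classify_args() {
    primary=""
    extras=0
    OPTIND=1                      # reset so the function can be called repeatedly
    while getopts "e:E:" opt; do
        case "$opt" in
            e) primary="inline" ;;
            E) extras=$((extras + 1)) ;;
            *) return 2 ;;        # unknown option or missing option argument
        esac
    done
    shift $((OPTIND - 1))

    if [ -z "$primary" ]; then
        # No -e was given, so the first non-option argument must be the script file.
        if [ $# -eq 0 ]; then
            echo "error: no primary script (-e or script file)" >&2
            return 1
        fi
        primary="file:$1"
        shift
    fi

    # Whatever remains is passed to the script as its arguments ($1, @1, ...).
    echo "primary=$primary extras=$extras args=$#"
}
```

With this rule, `classify_args -E 'probe timer.s(5) { exit() }' foo.stp arg1`
reports foo.stp as the script file and one script argument, while
`classify_args -e 'probe begin { exit() }' -E '...' arg1` treats arg1 as a
script argument, because the primary script was already supplied by -e.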

(In reply to David Smith from comment #8)
> Hmm, I wonder if this rc file modification will work exactly the way we'd
> like. I could see where the testcase would just ignore the new output and
> not fail properly.

We need to fix such tests whenever we find them.  I've found several cases of
bugs that were displayed in testcases, but were masked because one good line of
output made it report success.  In one test, I even saw that the *absence* of a
particular error message was counted as success, but a different error was
preventing it from getting to the point of interest at all.


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (8 preceding siblings ...)
  2014-07-07 16:32 ` jistone at redhat dot com
@ 2014-07-09 17:49 ` mcermak at redhat dot com
  2014-07-15 19:50 ` ajakop at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: mcermak at redhat dot com @ 2014-07-09 17:49 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

--- Comment #10 from Martin Cermak <mcermak at redhat dot com> ---
(In reply to David Smith from comment #8)
> Martin, are there certain tests in the testsuite where you have the most
> problems with missing timeout behavior?

So, for example, the FJ testsuite runs the following testcase, which "hangs" on
el6.x86_64:

# cat DWARF_probes_004.stp
probe kernel.function("sys_read").return.maxactive(5) {
        printf("%s\n", probefunc())
        exit()
}

Another such example from the same testsuite, which "hangs" on el7.x86_64, is:

# cat probe_definition_003_ext4.stp

probe module("ext4").function("ext4_*") {
        printf("%s\n", probefunc())
        exit()
}

Similarly, testcases relying on the open syscall will hang on aarch64,
since it has openat instead.

-------

When running any testsuite in an automated way, it is frustrating to deal with
such situations, although they're easily solvable by hand when one has easy
access to a testing box. Getting such a hang in an automated testing framework
means that no results are obtained from the testcases following the hanging one.
Provisioning a box takes time, as does running the testsuite. For this reason
I'd appreciate this functionality. If we know of testcases that wouldn't fail
appropriately due to this, then I'd propose to fix them so that they clearly
fail if the timeout is hit.
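
As an illustration of the failure mode described above, here is a minimal
harness-level sketch that bounds each testcase with the GNU coreutils
timeout(1) command, so that one hanging case cannot block the cases after it.
The function name run_with_limit is hypothetical. This is only a stopgap
outside stap, not the feature requested in this bug: it does not address the
shebang-script concern from comment #3, and whether the stap kernel module is
cleaned up after the signal is not covered here.

```sh
#!/bin/sh
# Hypothetical helper: run_with_limit SECONDS COMMAND [ARG...]
# Runs COMMAND under GNU timeout(1) and reports a clear failure on timeout.
run_with_limit() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    # GNU timeout exits with status 124 when the time limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "FAIL: timed out after ${limit}s: $*"
    fi
    return "$status"
}
```

A harness could then call, for example,
`run_with_limit 120 stap DWARF_probes_004.stp` and treat exit status 124 as
an explicit test failure rather than waiting indefinitely.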


* [Bug runtime/17101] [rfe] timeout for stap
  2014-06-30 13:51 [Bug runtime/17101] New: [rfe] timeout for stap mcermak at redhat dot com
                   ` (9 preceding siblings ...)
  2014-07-09 17:49 ` mcermak at redhat dot com
@ 2014-07-15 19:50 ` ajakop at redhat dot com
  10 siblings, 0 replies; 12+ messages in thread
From: ajakop at redhat dot com @ 2014-07-15 19:50 UTC (permalink / raw)
  To: systemtap

https://sourceware.org/bugzilla/show_bug.cgi?id=17101

Abe Jakop <ajakop at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ajakop at redhat dot com
           Assignee|systemtap at sourceware dot org    |ajakop at redhat dot com

