public inbox for systemtap@sourceware.org
* RFC: systemtap remote shell (stapsh)
From: Josh Stone @ 2011-03-09 21:02 UTC
  To: systemtap

Hi stappers,

I've written a wrapper for use in stap --remote execution, called
stapsh, and published it in working-but-somewhat-crude form in a branch
of the systemtap.git repository:
http://sourceware.org/git/gitweb.cgi?p=systemtap.git;a=shortlog;h=refs/heads/jistone/stapsh

This stapsh agent is meant to be automatically invoked through an ssh
connection to facilitate transferring systemtap modules to remote hosts
and invoking staprun on them.  There is already code in master that sort
of works to this end, using multiple ssh invocations (sharing an ssh
ControlMaster), but it gets ugly in the corners.  The primary motivation
of stapsh is to make sure all the exit cases can happen cleanly,
especially with sending signals to the remote and cleaning up the
processes and temporary files.

So I'd appreciate reviews of this code, and hopefully we can get it into
stap 1.5 in a form that we're happy with for a while. :)  My desired
review priorities are:

1. (High) Review the protocol between stap & stapsh.  It's intentionally
simplistic, but I think enough to meet current needs.  I've documented
it at the top of runtime/staprun/stapsh.c, and I'll paste it below as
well.  My own thought is that I probably need to add ok/error replies to
most of the commands, but please give your ideas too.

2. Review the high-level approach to implementing this protocol.  What's
there does seem to work, but I called it "somewhat-crude" above because
my own code feels a bit clunky to me.  If I'm doing something
brain-dead, let me know.

3. (Low) Review low-level implementation details.  I'm already working
on polishing the code as-is, so it's probably not ready for a
fine-toothed comb.  But if you have the time/inclination, I'll still
appreciate such feedback too.

Thanks,
Josh


stapsh protocol documentation from runtime/staprun/stapsh.c:
> // stapsh implements a minimal protocol for a remote stap client to transfer a
> // systemtap module to a temporary location and invoke staprun on it.  It is
> // not meant to be invoked directly by the user.  Commands are simply
> // whitespace-delimited strings, terminated by newlines.
> //
> //   command: stap VERSION
> //     reply: stapsh VERSION MACHINE RELEASE
> //      desc: This is the initial handshake.  The VERSION exchange is intended
> //            to facilitate compatibility checks, in case the protocol needs to
> //            change.  MACHINE and RELEASE are reported as given by uname.
> //
> //   command: file SIZE NAME
> //            DATA
> //     reply: (none)
> //      desc: Create a file of SIZE bytes, called NAME.  The NAME is a basename
> //            only, and limited to roughly "[a-z0-9][a-z0-9._]*".  The DATA is
> //            read as raw bytes following the command's newline.
> //
> //   command: arg SIZE
> //            DATA
> //     reply: (none)
> //      desc: Push an argument of SIZE bytes for staprun.  Note that while it
> //            is read as binary data, any embedded NUL will truncate the
> //            argument in the actual command invocation.
> //
> //   command: run
> //     reply: (none)
> //      desc: Start staprun with the previously pushed arguments.  When the
> //            child exits, stapsh will clean up and then exit with the same
> //            return code.
> //
> //   command: signal NUM
> //     reply: (none)
> //      desc: Send signal NUM to the child process.
> //
> // If stapsh reaches EOF on its standard input, it will send SIGHUP to the
> // child process, wait for completion, then clean up and exit normally.
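
The framing described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the stapsh.c implementation; the helper names (encode_file, encode_arg, read_command) and the regular expression are assumptions made for illustration, based only on the documented command shapes.

```python
import io
import re

# Rough approximation of the documented NAME rule "[a-z0-9][a-z0-9._]*".
NAME_RE = re.compile(rb"[a-z0-9][a-z0-9._]*")


def encode_file(name: bytes, data: bytes) -> bytes:
    """Frame a 'file SIZE NAME' command followed by SIZE raw bytes."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("NAME must be a simple basename")
    return b"file %d %s\n" % (len(data), name) + data


def encode_arg(arg: bytes) -> bytes:
    """Frame an 'arg SIZE' command followed by SIZE raw bytes."""
    return b"arg %d\n" % len(arg) + arg


def read_command(stream):
    """Read one command; return (words, payload), or None at EOF."""
    line = stream.readline()
    if not line:
        return None  # EOF on input
    words = line.split()  # whitespace-delimited, newline-terminated
    payload = b""
    if words and words[0] in (b"file", b"arg"):
        # SIZE says how many raw bytes follow the command's newline.
        size = int(words[1])
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("short read of DATA")
    return words, payload
```

Because DATA is length-prefixed, it may contain newlines or NUL bytes without confusing the reader; only the command line itself is newline-terminated. The sketch does no error recovery for malformed lines, which is one place where ok/error replies would help.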


* Re: RFC: systemtap remote shell (stapsh)
From: Josh Stone @ 2011-03-09 22:06 UTC
  To: systemtap

Reproducing some feedback given on freenode #systemtap:

>> //   command: arg SIZE
>> //            DATA
>> //     reply: (none)
>> //      desc: Push an argument of SIZE bytes for staprun.  Note that while it
>> //            is read as binary data, any embedded NIL will truncate the
>> //            argument in the actual command invocation.
>> //
>> //   command: run
>> //     reply: (none)
>> //      desc: Start staprun with the previously pushed arguments.  When the
>> //            child exits, stapsh will clean up and then exit with the same
>> //            return code.

fche suggested these should be combined to just "run ARG1 ARG2 ...",
where each ARG is encoded as perhaps base64 or quoted-printable.
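
A minimal sketch of what such a combined command could look like, assuming base64 is the chosen encoding. This is a Python illustration only; encode_run and decode_run are hypothetical names, not part of stap or stapsh.

```python
import base64


def encode_run(args):
    """Build one 'run ARG1 ARG2 ...' line with each ARG base64-encoded."""
    for a in args:
        if not a:
            # An empty argument encodes to an empty word, which would
            # vanish in a whitespace-delimited line.
            raise ValueError("empty arguments need a separate convention")
    words = [b"run"] + [base64.b64encode(a) for a in args]
    return b" ".join(words) + b"\n"


def decode_run(line):
    """Recover the argument list from a 'run ...' line."""
    words = line.split()
    if not words or words[0] != b"run":
        raise ValueError("not a run command")
    return [base64.b64decode(w, validate=True) for w in words[1:]]
```

The base64 alphabet contains no whitespace, so arguments with spaces or newlines survive the whitespace-delimited framing. Empty arguments are the one case this sketch rejects rather than solves.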

>> //   command: signal NUM
>> //     reply: (none)
>> //      desc: Send signal NUM to the child process.

przemoc pointed out that signal numbers are architecture-specific, so
this should perhaps use names instead.  At the moment, I'm only actually
using SIGINT, which appears standardized at signal #2, but changing to
names is probably still a good idea.
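
As an illustration of the name-based approach, the receiving side could translate a name into its own local number, so the sender never needs to know the remote numbering. (On Linux, for example, signal(7) lists SIGUSR1 as 10 on x86 but 30 on Alpha and SPARC.) The following is a Python sketch under that assumption; signal_number is a hypothetical helper, not existing stapsh code.

```python
import signal


def signal_number(name):
    """Translate a name such as 'SIGINT' or 'INT' to the local number."""
    if not name.startswith("SIG"):
        name = "SIG" + name
    try:
        # signal.Signals holds the numbering of the machine running this code.
        return int(signal.Signals[name])
    except KeyError:
        raise ValueError("unknown signal name: " + name) from None
```

The lookup happens on the remote host, so the result always matches that host's architecture.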


* Re: RFC: systemtap remote shell (stapsh)
From: Josh Stone @ 2011-03-10 22:55 UTC
  To: systemtap

On 03/09/2011 02:06 PM, Josh Stone wrote:
>>> //   command: signal NUM
>>> //     reply: (none)
>>> //      desc: Send signal NUM to the child process.
> 
> przemoc pointed out that signal numbers are architecture-specific, so
> this should perhaps use names instead.  At the moment, I'm only actually
> using SIGINT, which appears standardized at signal #2, but changing to
> names is probably still a good idea.

On further reflection, do we even need the ability to send arbitrary
signals?  My goal with this command is just to get the running process
to quit, so maybe a simple "stop" or "quit" command would make more
sense. It'd still be implemented with a kill(SIGINT) though.  Are there
other cases where we'd actually want a flexible signal command?
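
To make the idea concrete, here is a minimal Python sketch of a dedicated "quit" command that is still implemented with SIGINT. The command word, the handle_line name, and the injected kill callable are all assumptions for illustration; in practice kill would be os.kill or, in C, kill(2).

```python
import signal


def handle_line(line, child_pid, kill):
    """Dispatch one command line; return True if it was handled."""
    words = line.split()
    if words == [b"quit"]:
        # The client only says "quit"; which signal to use stays a
        # private decision of the remote side.
        kill(child_pid, signal.SIGINT)
        return True
    return False
```

With this shape the wire protocol carries no signal numbers or names at all, which sidesteps the portability question.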

