Message-ID: <4E70F10B.2080201@redhat.com>
Date: Wed, 14 Sep 2011 18:23:00 -0000
From: Dave Brolley
To: systemtap@sourceware.org
Subject: Re: [Bug translator/13187] Reconsider the semantics of process(number).thread.begin/end

No reason why this discussion should not be public. The discussion started on IRC within Red Hat when I tried to test the process.thread.begin/end variants and ran into what was, for me anyway, unexpected semantics.

The scenario is an already-running process, foo, with process id 1234.
    stap -e 'probe process("foo").thread.begin,process("foo").thread.end {}'

catches each child thread of foo as it begins and ends, as expected. I expected

    stap -e 'probe process(1234).thread.begin,process(1234).thread.end {}'

to do the same, but it does not. Instead, it never fires, because the current semantics are to probe the child thread with task id == 1234, which doesn't exist.

My proposal for new semantics and David Smith's response are in PR 13187. My subsequent reply to David is below. Feel free to add your thoughts and opinions!

--------------------------------------------------------

On 09/14/2011 10:28 AM, dsmith at redhat dot com wrote:
>
> There is a situation where process(PID).thread.begin probes will fire.
>
> - Start a process that creates several threads that will run for an extended
>   period of time. Determine the pids of those threads.
> - Run systemtap with a script that probes those pids.
> - As systemtap attaches to those threads, the process(PID).thread.begin probe
>   will fire.

I think that you're getting caught up in the notion that the PID here refers to a task because that happens to be the current implementation. Sure, we can jump through some hoops and get this probe type to fire for a particular type of application (long-running threads) in a particular situation (you want to probe one thread in particular, and the probe fires at some random point in the already-running thread), but is this really the most useful and most intuitive interpretation of this probe type from the user's point of view?

I think that there is a lack of orthogonality in the current implementation that is confusing. At least it is for me.
1) stap -e 'probe process("PATH").thread.begin {}'
   catches *all* child threads of *processes* (not tasks) identified by PATH, *as they start*

2) stap -e 'probe process.thread.begin {}' -c CMD
   catches *all* child threads of the specific *process* (not task) created by running CMD, *as they start*

whereas

3) stap -e 'probe process(NUMBER).thread.begin {}'
   catches only the thread with task id equal to NUMBER.

The behavior of variant 3 is not intuitive at all, to me, given the behavior of the other two variants, combined with the name of the probe itself being process(NUMBER), and not task(NUMBER).

Furthermore, the number in

    process(number).statement(stmtnumber).absolute
    process(number).statement(stmtnumber).absolute.return
    process(number).syscall
    process(number).syscall.return

refers to the process id, and these probes all fire in the main thread of the process with that id and also in its children; i.e. it simply identifies the target process. So why should the number in

    process(number).thread.begin
    process(number).thread.end
    process(number).begin
    process(number).end

not also refer to the process id? The only difference should be that the latter two only refer to the main thread and the first two only refer to the child threads, as they do for the other begin/end and thread.begin/end variants.

It seems to me that the thread.begin family of probes was intended to provide for the probing of threading activity within a process (as opposed to probing of a particular thread/task) and that providing a *process* (not task) NUMBER instead of a *process* PATH or a -c option shouldn't change this; i.e. the purpose of providing process(NUMBER) probes is to allow the specification of an already-running *process* as the target and not a specific thread/task id.

The point of my proposal in the BZ is that simply having variant 3 use the same "follow all child threads" semantics as the other two would accomplish this.
As fche pointed out, this is exactly what variant 2 does for the *process* id generated by -c CMD, so why deny it for the *process* id provided by NUMBER?

> In my opinion, changing the semantics here is too big/messy of a change.
> Instead, a new probe type, something like 'process(PATH/PID).thread.create',
> could be created.

I see it the other way. I think that the current process(NUMBER).thread.begin/end should behave like its counterparts and that if you really want to probe a specific thread/task, you can:

1) filter on the task id, or
2) create something like task(TID).begin/end or process(PATH/PID).thread(TID).begin/end.

Note also that process(PATH).thread.create would duplicate functionality already provided by process(PATH).thread.begin.

At the end of the day, don't we want what is most flexible, useful and intuitive for the end user? With my proposed change:

useful: probes which are possible with the current implementation will still be possible after the change (via filtering)

useful and flexible: a whole range of probes on already-running processes, which are now impossible, would become possible

intuitive: the NUMBER in process(NUMBER).thread.begin would refer to a process id, as the name of the probe suggests

As for the concern about changing the semantics: the new semantics are a superset of the current semantics and, given the current semantics, I doubt that anyone is currently using process(NUMBER).thread.begin/end at all.

I welcome your further thoughts on this!

Thanks,
Dave
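
For illustration only, the two readings of NUMBER discussed above, and the "filter on the task id" option, can be modeled in a few lines of Python. This is a sketch of the matching rules as described in this message, not SystemTap code and not how the translator or runtime is implemented. The `Thread` class, the function names and all task ids are hypothetical, and the model relies on the Linux convention that a process's main thread has a task id equal to its process id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    pid: int  # id of the process the thread belongs to
    tid: int  # task id of the thread itself


def fires_current(number: int, t: Thread) -> bool:
    """Current reading: NUMBER is compared with the task id."""
    return t.tid == number


def fires_proposed(number: int, t: Thread) -> bool:
    """Proposed reading: NUMBER names the process; only child threads match."""
    return t.pid == number and t.tid != t.pid  # main thread has tid == pid


# Hypothetical threads that start while a script is running.
new_threads = [Thread(1234, 1301), Thread(1234, 1302), Thread(5678, 5700)]

# process(1234).thread.begin under each reading
current = [t.tid for t in new_threads if fires_current(1234, t)]
proposed = [t.tid for t in new_threads if fires_proposed(1234, t)]

# A single-thread probe remains expressible by filtering on the task id.
filtered = [t.tid for t in new_threads
            if fires_proposed(1234, t) and t.tid == 1302]

print(current)   # []
print(proposed)  # [1301, 1302]
print(filtered)  # [1302]
```

In this model the current reading matches nothing for 1234 because no new thread has that task id, while the proposed reading matches both child threads of process 1234 and can still be narrowed to one thread with a filter.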