Date: Mon, 26 Dec 2011 18:33:00 -0000
From: Chris Dunlop
To: Josh Stone
Cc: systemtap@sourceware.org
Subject: Re: Error removing module: Device or resource busy
Message-ID: <20111225235927.GA2907@onthe.net.au>
References: <20111223050020.GA11829@onthe.net.au> <4EF4E089.6060008@redhat.com>
In-Reply-To: <4EF4E089.6060008@redhat.com>

Hi,

Thanks for looking at this...

On Fri, Dec 23, 2011 at 12:11:53PM -0800, Josh Stone wrote:
> On 12/22/2011 09:00 PM, Chris Dunlop wrote:
>> Linux-3.0.13, systemtap 6.1 and now HEAD(b6a3da4)
>
> Your dump looks like x86_64 - is this on any particular distro? Are you
> building this kernel yourself?

Yes, x86_64, on debian wheezy/sid, self-built kernel and systemtap.

Note: just updated to linux-3.0.14.

>> Whenever I run stap[2], the output ends with:
>>
>>   Error removing module 'stap_fcac0085842418e34d8094455dc203e8_1_21605': Device or resource busy.
>>
>> (obviously the module name changes) and the module is still loaded:
>>
>>   # lsmod | grep stap
>>   stap_fcac0085842418e34d8094455dc203e8_1_21605 2896285 1027571582 [permanent]
>
> This is odd. Reading kernel/module.c, the "[permanent]" should only
> come about if the module has no ->exit callback. And 1027571582 is the
> field for the module refcount, which doesn't look plausible.

That refcount is consistently odd, e.g. after a few runs:

  # lsmod | grep stap
  stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050 26018 4294967295 [permanent]
  stap_b3ee5e5f7f7df4f11fdf95b215e43f6_6574 26018 4294967295 [permanent]
  stap_6bbd9bcdc91b5b9122793a314d03458_5786 26018 4294967295 [permanent]
  stap_8969cc5adcc470f954f5b37c4134b9a_5609 26034 4294967295 [permanent]

...actually that's equal to 0xFFFFFFFF, or -1. But the previously seen
1027571582 is 0x3D3F7F7E which doesn't mean anything obvious to me.

> You must have CONFIG_MODULE_UNLOAD=y, or else the kernel just prints a
> dummy " - -" in place of the refcount. Though I'm not certain how lsmod
> translates that, so you might check /proc/modules directly.

  # grep stap /proc/modules
  stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050 26018 4294967295 [permanent], Live 0xffffffffa037e000
  stap_b3ee5e5f7f7df4f11fdf95b215e43f6_6574 26018 4294967295 [permanent], Live 0xffffffffa0372000
  stap_6bbd9bcdc91b5b9122793a314d03458_5786 26018 4294967295 [permanent], Live 0xffffffffa0366000
  stap_8969cc5adcc470f954f5b37c4134b9a_5609 26034 4294967295 [permanent], Live 0xffffffffa035a000

> But it seems like your stap modules are being built without
> this machinery, so corruption ensues.

By "this machinery", do you mean the ->exit callback, and if so, is
this the ->exit callback?

  # grep module_exit /root/.systemtap/cache/6b/stap_6bbd9bcdc91b5b9122793a314d034588_795.c
  static void systemtap_module_exit (void) {

> Are you sure that /lib/modules/`uname -r`/build matches the running kernel?

Yup, and, as above, I've just updated to 3.0.14 to be sure:

  # uname -a
  Linux b5 3.0.14-otn-00018-g2c7c13d #1 SMP Mon Dec 26 07:11:13 EST 2011 x86_64 GNU/Linux
  # ls -l /lib/modules/`uname -r`/build
  lrwxrwxrwx 1 root root 53 2011-12-26 07:22 /lib/modules/3.0.14-otn-00018-g2c7c13d/build -> /home/chris/git/linux-build/3.0.14-otn-00018-g2c7c13d

> If you can boot without any of the zfs stuff, then I'd experiment with
> that first, just to make sure stap is lined up with all the correct
> kernel build infrastructure. Try something simple, like one of the
> syscall examples.

Without any of the zfs modules loaded...

  # stap -v -e 'probe begin {printf("foo\n"); exit()}'
  Pass 1: parsed user script and 78 library script(s) using 77348virt/21892res/2620shr kb, in 110usr/20sys/170real ms.
  Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) using 77876virt/22684res/2852shr kb, in 0usr/0sys/3real ms.
  Pass 3: translated to C into "/tmp/stapSBQNZp/stap_6bbd9bcdc91b5b9122793a314d034588_795_src.c" using 77876virt/22768res/2924shr kb, in 0usr/0sys/1real ms.
  Pass 4: compiled C into "stap_6bbd9bcdc91b5b9122793a314d034588_795.ko" in 1120usr/160sys/1889real ms.
  Pass 5: starting run.
  foo
  Error removing module 'stap_6bbd9bcdc91b5b9122793a314d03458_5786': Device or resource busy.
  WARNING: /usr/bin/staprun exited with status: 1
  Pass 5: run completed in 20usr/0sys/423real ms.
  Pass 5: run failed. Try again with another '--vp 00001' option.

Cheers,

Chris.
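
For reference, the refcount arithmetic above (4294967295 being 0xFFFFFFFF, or -1, and 1027571582 being 0x3D3F7F7E) can be checked with a short sketch. It assumes the refcount is the third whitespace-separated field of a /proc/modules line, as in the output shown earlier, and that it is meant to be read as a 32-bit value. The helper names as_signed32 and parse_modules_line are hypothetical, made up for this illustration, and are not part of systemtap or the kernel.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def parse_modules_line(line):
    """Split one /proc/modules line into (name, size, refcount).

    The refcount is returned as None when the kernel prints "-" in that
    field (i.e. when built without CONFIG_MODULE_UNLOAD).
    """
    fields = line.split()
    refcount = None if fields[2] == "-" else int(fields[2])
    return fields[0], int(fields[1]), refcount


# Sample line copied from the /proc/modules output in this message.
sample = ("stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050 26018 4294967295 "
          "[permanent], Live 0xffffffffa037e000")

name, size, refcount = parse_modules_line(sample)
print(name, hex(refcount), as_signed32(refcount))

# The earlier, differently odd refcount from the first report.
print(hex(1027571582), as_signed32(1027571582))
```

On a live system the same helpers could be applied to each line of /proc/modules whose name starts with "stap_"; a signed value of -1 would correspond to the 4294967295 shown by lsmod.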