Date: Wed, 22 Jun 2011 23:52:00 -0000
From: Josh Stone
To: "Richard W.M. Jones"
CC: systemtap@sourceware.org
Subject: Re: Rapidly running systemtap causing hangs or oops

On 06/22/2011 04:00 PM, Richard W.M. Jones wrote:
> Me again.  I can get something involving systemtap, ext2, the loop
> device, Linux 3.0 to oops very easily.
> I'm not quite sure exactly
> what factor causes it, but here's an easy reproducer:
>
>   $ mkdir /tmp/mnt
>
>   $ truncate -s 1G /tmp/fs
>   $ mkfs.ext2 -F /tmp/fs
>
>   $ cat > /tmp/test.sh
>   #!/bin/sh -
>   echo mount
>   mount -o loop /tmp/fs /tmp/mnt
>   echo unmount
>   umount /tmp/mnt
>
>   $ chmod +x /tmp/test.sh
>
>   $ while sudo stap -e 'probe module("ext2").statement ("*@*.c:*") { printf ("%s\n", pp()); }' -c /tmp/test.sh ; do : ; done
>
> The final command usually either hangs the machine, or produces a long
> oops like the one attached, after just a few iterations.  It takes
> just a few seconds on my VM to get a hang or oops.

Can you try running stap with "-D STP_ALIBI"?  This alibi mode compiles
out most of stap's code, so each probe handler is reduced to just an
atomic increment, then a final hit count is reported on exit.

Another test might be to move the loop inside test.sh, so stap is left
running the whole time, and we might tell if the issue is timed around
stap's probe registration or unregistration.

> [ 342.037017] [] show_registers+0xbd/0x206
> [ 342.037017] [] ? atomic_notifier_call_chain+0x14/0x16
> [ 342.037017] [] __die+0x97/0xd8
> [ 342.037017] [] die+0x47/0x63
> [ 342.037017] [] do_double_fault+0x65/0x67
> [ 342.037017] [] double_fault+0x2a/0x30
> [ 342.037017] [] ? ext2_get_inode+0x6d/0x130 [ext2]

Is the Oops always this minimal?  Does it always (questionably) point to
the same ext2_get_inode location?

I'll play with this tomorrow and see if I can reproduce it myself...

Josh
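
For reference, the second test suggested above (moving the loop inside test.sh) might look roughly like the sketch below. It is only an illustration: ITERATIONS, DRY_RUN, and the run and cycle helpers are hypothetical names that do not appear in the original reproducer, and the iteration count is an arbitrary choice. DRY_RUN defaults to 1 so that the script only prints the mount commands until it is deliberately switched to 0.

```shell
#!/bin/sh -
# Looping variant of /tmp/test.sh (sketch, not the original reproducer).
# ITERATIONS and DRY_RUN are hypothetical knobs added for illustration.
ITERATIONS="${ITERATIONS:-100}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount and unmount the loop image ITERATIONS times, so that a single
# stap session stays attached across every mount/unmount cycle.
cycle() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        echo mount
        run mount -o loop /tmp/fs /tmp/mnt
        echo unmount
        run umount /tmp/mnt
        i=$((i + 1))
    done
}

cycle
```

With the DRY_RUN default changed to 0 in the file, a single run such as `sudo stap -D STP_ALIBI -e 'probe module("ext2").statement ("*@*.c:*") { printf ("%s\n", pp()); }' -c /tmp/test.sh` would then replace the outer `while` loop and combine both suggested tests. If the hang or oops no longer appears with stap left running, that would point toward probe registration or unregistration.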