From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 7920 invoked by alias); 15 May 2009 13:52:30 -0000
Received: (qmail 7909 invoked by uid 22791); 15 May 2009 13:52:28 -0000
X-SWARE-Spam-Status: No, hits=-2.5 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx2.redhat.com (HELO mx2.redhat.com) (66.187.237.31)
    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 15 May 2009 13:52:21 +0000
Received: from int-mx2.corp.redhat.com (int-mx2.corp.redhat.com [172.16.27.26])
    by mx2.redhat.com (8.13.8/8.13.8) with ESMTP id n4FDqJBp021091;
    Fri, 15 May 2009 09:52:19 -0400
Received: from ns3.rdu.redhat.com (ns3.rdu.redhat.com [10.11.255.199])
    by int-mx2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id n4FDqIOt003840;
    Fri, 15 May 2009 09:52:19 -0400
Received: from localhost.localdomain (vpn-13-209.rdu.redhat.com [10.11.13.209])
    by ns3.rdu.redhat.com (8.13.8/8.13.8) with ESMTP id n4FDqHpC032570;
    Fri, 15 May 2009 09:52:18 -0400
Message-ID: <4A0D7391.4030106@redhat.com>
Date: Fri, 15 May 2009 13:52:00 -0000
From: David Smith
User-Agent: Thunderbird 2.0.0.21 (X11/20090320)
MIME-Version: 1.0
To: maynardj@us.ibm.com
CC: Roland McGrath, systemtap@sourceware.org, "Frank Ch. Eigler"
Subject: ia64 hang when using itrace (was Re: Backward compatibility for insn probe point)
References: <49D3E3DF.1000108@us.ibm.com> <49F61D09.8090503@redhat.com>
    <49F8C1C0.8070208@us.ibm.com> <49F9E71D.2070500@redhat.com>
    <20090430203302.5FFA0FC3BF@magilla.sf.frob.com> <4A099E6B.8090501@redhat.com>
    <20090512181950.19BCCFC35D@magilla.sf.frob.com> <4A0AE179.9050801@redhat.com>
    <20090513182359.0CF33FC35D@magilla.sf.frob.com> <4A0C347D.9060204@redhat.com>
    <4A0C6A90.6070306@us.ibm.com> <4A0C7482.3010103@redhat.com>
    <4A0C8ED3.4090301@redhat.com>
In-Reply-To: <4A0C8ED3.4090301@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2009-q2/txt/msg00596.txt.bz2

David Smith wrote:
> David Smith wrote:
>> Maynard Johnson wrote:
>>>> David Smith wrote:
>>>> One last thing. I thought I'd try block stepping, so I got access to an
>>>> ia64 machine. Unfortunately, using systemtap insn probes (either single
>>>> or block step) lock up the system with a spinlock lockup. Sigh.
>>> Does anyone know who maintains ia64/utrace? David, was the above error
>>> on "old" utrace or "new"?
>> The error is on "old" utrace. I'm trying to look into the ia64 utrace
>> problem now.
>
> Here's what I see on the console (running lockdep enabled
> 2.6.18-146.el5debug):
>
> ====
> BUG: spinlock lockup on CPU#0, ls/2576, e0000040fe1092d8 (Tainted: G)
>
> Call Trace:
> [] show_stack+0x40/0xa0
> sp=e0000003f640f870 bsp=e0000003f6409440
> [] dump_stack+0x30/0x60
> sp=e0000003f640fa40 bsp=e0000003f6409428
> [] _raw_spin_lock+0x200/0x260
> sp=e0000003f640fa40 bsp=e0000003f64093e8
> [] _spin_lock_irqsave+0x30/0x60
> sp=e0000003f640fa40 bsp=e0000003f64093c0
> [] force_sig_info+0x30/0x160
> sp=e0000003f640fa40 bsp=e0000003f6409380
> [] ia64_fault+0xff0/0x1280
> sp=e0000003f640fa40 bsp=e0000003f6409328
> [] __ia64_leave_kernel+0x0/0x280
> sp=e0000003f640fc60 bsp=e0000003f6409328
> [] _raw_spin_lock+0xd0/0x260
> sp=e0000003f640fe30 bsp=e0000003f64092c0
> [] _spin_lock_irqsave+0x30/0x60
> sp=e0000003f640fe30 bsp=e0000003f6409298
> [] force_sig_info+0x30/0x160
> sp=e0000003f640fe30 bsp=e0000003f6409258
> [] force_sig+0x30/0x60
> sp=e0000003f640fe30 bsp=e0000003f6409230
> [] syscall_trace_leave+0x100/0x140
> sp=e0000003f640fe30 bsp=e0000003f64091d0
> [] __ia64_trace_syscall+0x100/0x110
> sp=e0000003f640fe30 bsp=e0000003f64091d0
> [] __start_ivt_text+0xffffffff00010620/0x400
> sp=e0000003f6410000 bsp=e0000003f64091d0
> ====
>
> From what I can tell, the spinlock that is stuck is
> current->sighand->siglock. force_sig_info() (from kernel/signal.c:739)
> grabs the spinlock, but we get a fault somewhere? and end up in
> __ia64_leave_kernel() (from arch/ia64/kernel/entry.S:813). The fault
> handling in ia64_fault() calls force_sig_info() again, which tries to
> grab same spinlock again.
>
> If anyone has a better understanding of this, I'd love to know how we
> ended up in __ia64_leave_kernel().

I should have included other information I know. This always happens
after a call to set_tid_address(), which is the 79th syscall that 'ls' runs.
By this point the insn probe has been hit at least 555391 times (my test
script prints the number of instructions at every syscall entry and exit).

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)
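
As a rough illustration of the double acquisition suspected in the quoted analysis, here is a small user-space model in Python. It is only a sketch under stated assumptions: `siglock` is a plain non-reentrant `threading.Lock` standing in for current->sighand->siglock, the `force_sig_info()` below is a toy function that merely shares its name with the kernel routine, and the `fault_while_locked` flag is a hypothetical way to simulate a fault arriving while the lock is held. The non-blocking acquire lets the model report the lockup instead of spinning.

```python
import threading

# Assumption: a non-reentrant user-space lock stands in for
# current->sighand->siglock. This is a model, not kernel code.
siglock = threading.Lock()


def force_sig_info(events, fault_while_locked=False):
    """Toy model of the lock-taking path seen in the call trace.

    Returns False at the point where the real spinlock would spin forever.
    """
    # Non-blocking acquire so the model reports the lockup instead of hanging.
    if not siglock.acquire(blocking=False):
        events.append("lockup")
        return False
    try:
        events.append("locked")
        if fault_while_locked:
            # Hypothetical: a fault arrives while the lock is held, and the
            # fault path calls force_sig_info() again on the same CPU.
            return force_sig_info(events)
        return True
    finally:
        siglock.release()
        events.append("unlocked")


# Normal case: the lock is taken and released once.
normal = []
ok = force_sig_info(normal)

# Faulting case: the nested call finds the lock already held.
faulted = []
nested_ok = force_sig_info(faulted, fault_while_locked=True)

print(ok, normal)
print(nested_ok, faulted)
```

With the flag set, the nested call finds the lock already held, which is the point where the real spinlock would spin until the lockup detector fires. The model says nothing about why the fault occurs in the first place.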