To: systemtap@sources.redhat.com
Subject: function("*") probing
From: fche@redhat.com (Frank Ch. Eigler)
Date: Mon, 20 Nov 2006 21:12:00 -0000
X-SW-Source: 2006-q4/txt/msg00475.txt.bz2

Hi -

Thanks to a well-working kexec/kdump setup on fc5, fc6, and rhel5, I've
made some progress in working out the reasons for the crashes we
encounter when indiscriminately probing kernel functions. Often, the
stack traces include multiple nested faults, sometimes bottoming out on
some random error, sometimes on a hung lock.

One problem is an old bugaboo: reentrancy. It turns out that many of
the locking primitives we sparingly use, which ideally should be
inlined, in fact turn into function calls. The main bunch of problems
occurs when the locking-related kernel functions (_read_lock and many
pals) are themselves probed. Putting these into the translator
blacklist makes a big difference. I'm working on characterizing the
callees of increasingly complex probes, and am considering blacklisting
many of them. I was under the impression that kprobes tries to detect &
prevent such reentrancy, but perhaps that too needs some work.

Another problem is our lack of self-throttling. It is related to bug
#2685 ("skip probes on insufficient stack") but is more like the Linux
nmi-watchdog looking at /proc/interrupts and what dtrace has: a way of
detecting excessive probing load, and consequential probe skipping or
outright session shutdown. I just created bug #3545 for this part.

- FChE
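
For illustration, the blacklist idea described above can be sketched as a filter applied while a wildcard such as function("*") is expanded. This is a minimal sketch in Python, not the translator's actual code: the helper name `expand_wildcard`, the pattern list, and the symbol list are all hypothetical, and only `_read_lock` is taken from the message itself.

```python
from fnmatch import fnmatchcase

# Hypothetical blacklist patterns. Only _read_lock is named in the
# message; the other entries stand in for its "many pals".
BLACKLIST = ["_read_lock*", "_write_lock*", "_spin_lock*"]


def expand_wildcard(pattern, symbols, blacklist=BLACKLIST):
    """Return the symbols matching `pattern`, minus blacklisted ones.

    A symbol is dropped when it matches any blacklist pattern, so a
    probe handler that takes a lock never ends up probing the locking
    function it is about to call.
    """
    selected = []
    for name in symbols:
        if not fnmatchcase(name, pattern):
            continue  # not selected by the probe's wildcard
        if any(fnmatchcase(name, entry) for entry in blacklist):
            continue  # selected, but unsafe to probe
        selected.append(name)
    return selected


# Illustrative symbol list, not taken from any particular kernel build.
symbols = ["sys_open", "vfs_read", "_read_lock", "_read_lock_irq", "_spin_lock"]
probed = expand_wildcard("*", symbols)
print(probed)
```

Filtering at expansion time keeps the check out of the probe handlers themselves, which matters here because the handlers are exactly the code that must not re-enter a probed function.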