From: Pratyush Anand
To: William Cohen
Cc: systemtap@sourceware.org, Dave Long, Mark Brown, Jeremy Linton
Subject: Re: exercising current aarch64 kprobe support with systemtap
Date: Mon, 13 Jun 2016 04:28:00 -0000
Message-ID: <20160613042758.GB6344@dhcppc6>

Hi Will,

On 10/06/2016:05:28:36 PM, William Cohen wrote:
> On 06/09/2016 12:17 PM, William Cohen wrote:
> > I have been exercising the current kprobes and uprobe patches for
> > arm64 that are in the test_upstream_arm64_devel branch of
> > https://github.com/pratyushanand/linux with systemtap.  There are
> > two issues that I have seen on this kernel with systemtap.  There are
> > some cases where kprobes fail to register at places that appear to be
> > reasonable places for a kprobe.  The other issue is that the kernel starts
> > having soft lockups when the hw_watch_addr.stp test runs.  To get
> > systemtap working with the newer kernels, the attached hack is needed
> > because of changes in the aarch64 macro args.
> ...
> > Soft Lockup for the hw_watch_addr.stp
> >
> > When running the hw_watch_addr.stp tests the machine gets a number of
> > processes using a lot of sys time and eventually the kernel reports
> > a soft lockup:
> >
> > http://paste.stg.fedoraproject.org/5323/
> >
> > The systemtap.base/overload.exp tests all pass, but maybe there is too
> > much work being done to generate the backtraces for hw_watch_addr.stp
> > and that is triggering the problem.
>
> I can reliably reproduce the soft lockup running a single test with:
>
> /root/systemtap_write/install/bin/stap --all-modules \
> /root/systemtap_write/systemtap/testsuite/systemtap.examples/memory/hw_watch_addr.stp \
> 0x`grep "vm_dirty_ratio" /proc/kallsyms | awk '{print $1}'` -T 5 > /dev/null
>
> paste of output and soft lockup at: http://paste.stg.fedoraproject.org/5324/
>
> One of the things that Jeremy Linton pointed to was:
>
> https://lkml.org/lkml/2016/3/21/198

We now have the following in arch_within_kprobe_blacklist(), so the above
issue should not bite us.

+               !!search_exception_tables(addr))
+               return true;

>
> Could the aarch64 hardware watchpoint handler have an issue that is causing this problem with the soft lockup?
> Or spending too much time doing the stack backtrace?

Not sure. It could be that the locked-up CPU is waiting for a lock (spinlock)
that is not being released.

I just noticed that the backtrace of all active CPUs
(`echo l > /proc/sysrq-trigger`) is not working for arm64, probably because
we do not have arch_trigger_all_cpu_backtrace() defined for aarch64. Maybe we
can add one, like the one for arm. A backtrace of the CPUs in this state might
give us some input.

~Pratyush