Subject: Re: exercising current aarch64 kprobe support with systemtap
From: William Cohen
To: Pratyush Anand
Cc: David Long, systemtap@sourceware.org, Mark Brown, Jeremy Linton, David Smith, "Frank Ch. Eigler"
Date: Thu, 04 Aug 2016 20:51:00 -0000
In-Reply-To: <20160804143549.GF22191@localhost.localdomain>
References: <0a594132-796b-779d-b473-a06c0f3e8ae8@redhat.com> <20160627141840.GB8139@dhcppc9> <577EA7EE.2070607@linaro.org> <20160803131302.GC18785@localhost.localdomain> <2947a749-a518-d560-f768-60cc2f2c691e@redhat.com> <20160804044230.GB22191@localhost.localdomain> <20160804143549.GF22191@localhost.localdomain>
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
X-SW-Source: 2016-q3/txt/msg00132.txt.bz2

On 08/04/2016 10:35 AM, Pratyush Anand wrote:
> Hi Will,
>
> On 04/08/2016:09:56:45 AM, William Cohen wrote:
...
>> Hi,
>>
>> The OOM errors came before the otf_stress_hard_iter_5000 test that previous triggered the infinite unexpected EL1, so can't really say that the proposed patch has fixed the problem.
>
> Yes, yes, previously also we were getting OOM, and then that OOM was triggering
> infinite unexpected EL1, because OOM message uses WARN_ON() to print, and
> WARN_ON() uses "BRK BUG_BRK_IMM". Now when it is printing though BRK, we were
> hitting kprobe at print_worker_info() which was resulting in unexpected EL1.
>
> Proposed patch fixes kprobe tracing within none kprobe BRK context such as
> uprobe or WARN_ON() breakpoint handler etc. So, now a kprobe at
> print_worker_info() will work while printing message of WARN_ON().
>
>
>>
>> Any thoughts on how to track down the oom issue? Are you able to replicate it running the systemtap onthefly/kprobes_onthefly.exp tests?
>
> Sure, will look into. Have reserved a seattle.
>
> ~Pratyush
>

Hi Pratyush,

The stack backtrace of http://paste.stg.fedoraproject.org/5375/ is:

[ 668.676682] [] page_counter_cancel+0x54/0x60
[ 668.682508] [] page_counter_uncharge+0x2c/0x40
[ 668.688509] [] cancel_charge+0x40/0xe0
[ 668.693815] [] mem_cgroup_cancel_charge+0x2c/0x38
[ 668.700088] [] uprobe_write_opcode+0x4e8/0x688
[ 668.706089] [] set_swbp+0x30/0x40
[ 668.710962] [] install_breakpoint.isra.10+0x5c/0x2b8
[ 668.717484] [] uprobe_mmap+0x248/0x2a8
[ 668.722791] [] mmap_region+0x204/0x558
[ 668.728097] [] do_mmap+0x264/0x320
[ 668.733057] [] vm_mmap_pgoff+0xb0/0xd8
[ 668.738363] [] vm_mmap+0x70/0xa0
[ 668.743149] [] elf_map+0x80/0xf8
[ 668.747934] [] load_elf_binary+0x480/0xb90
[ 668.753588] [] search_binary_handler+0xbc/0x210
[ 668.759674] [] do_execveat_common+0x4b0/0x620
[ 668.765587] [] SyS_execve+0x44/0x58
[ 668.770633] [] __sys_trace_return+0x0/0x4

There is some uprobe code running in the traceback. It looks like things are going wrong when uprobes are being installed on a newly loaded executable.

-Will
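
P.S. A minimal sketch of how the uprobe frames in a trace like the one above could be picked out mechanically. It is an illustration only, not something that was run as part of this investigation. The names FRAME_RE, parse_frames and uprobe_frames are made up for the example, and the regular expression assumes the "[timestamp] [address] symbol+0xoffset/0xsize" layout shown in the paste. Matching on the substring "uprobe" is a simplification: it misses set_swbp and install_breakpoint, which also live in kernel/events/uprobes.c.

```python
import re

# Frames copied from the backtrace in this message; the address
# brackets are empty here, exactly as they appear above.
BACKTRACE = """\
[ 668.676682] [] page_counter_cancel+0x54/0x60
[ 668.682508] [] page_counter_uncharge+0x2c/0x40
[ 668.688509] [] cancel_charge+0x40/0xe0
[ 668.693815] [] mem_cgroup_cancel_charge+0x2c/0x38
[ 668.700088] [] uprobe_write_opcode+0x4e8/0x688
[ 668.706089] [] set_swbp+0x30/0x40
[ 668.710962] [] install_breakpoint.isra.10+0x5c/0x2b8
[ 668.717484] [] uprobe_mmap+0x248/0x2a8
[ 668.722791] [] mmap_region+0x204/0x558
[ 668.728097] [] do_mmap+0x264/0x320
[ 668.733057] [] vm_mmap_pgoff+0xb0/0xd8
[ 668.738363] [] vm_mmap+0x70/0xa0
[ 668.743149] [] elf_map+0x80/0xf8
[ 668.747934] [] load_elf_binary+0x480/0xb90
[ 668.753588] [] search_binary_handler+0xbc/0x210
[ 668.759674] [] do_execveat_common+0x4b0/0x620
[ 668.765587] [] SyS_execve+0x44/0x58
[ 668.770633] [] __sys_trace_return+0x0/0x4
"""

# Hypothetical pattern: "[ timestamp] [address] symbol+0xoff/0xsize".
# The symbol may contain dots (compiler suffixes such as ".isra.10").
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[[^\]]*\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)


def parse_frames(text):
    """Return the symbol name of every line that looks like a stack frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1))
    return frames


def uprobe_frames(frames):
    """Keep only symbols whose name contains 'uprobe' (a rough filter)."""
    return [name for name in frames if "uprobe" in name]


frames = parse_frames(BACKTRACE)
print(uprobe_frames(frames))
```

The frames are kept in trace order (innermost first), so reading the filtered list bottom-up gives the path from the mmap of the new executable into the opcode write.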