From: William Cohen
To: systemtap@sourceware.org
Subject: Re: The systemtap stress testsuite crashing on RHEL7.2 x86-64
Date: Thu, 03 Dec 2015 02:49:00 -0000
Message-ID: <565FADA0.6050005@redhat.com>
In-Reply-To: <565F644B.7020200@redhat.com>

On 12/02/2015 04:36 PM, William Cohen wrote:
> I have a Lenovo T530 running rhel7.2. I have noticed that the "make installcheck" of the system git checkout has been causing the machine to crash and reboot. The machine is running a stock 3.10.0-327.el7.x86_64 kernel.
> The tests are dying in the stress tests; below is the tail end of the systemtap.log:
>

For further analysis, the cached .stp, .c, .ko, and a disassembly of the .ko have been placed on http://people.redhat.com/wcohen/rhel7/ .

-Will

> Running ./systemtap.stress/tapset_functions.exp ...
> PASS: tapset_functions_stress (got output of stap --dump-functions)
> UNTESTED: tapset_functions_stress (commit() blacklisted)
> UNTESTED: tapset_functions_stress (delete_stopwatch() blacklisted)
> UNTESTED: tapset_functions_stress (discard() blacklisted)
> UNTESTED: tapset_functions_stress (mdelay() blacklisted)
> UNTESTED: tapset_functions_stress (panic() blacklisted)
> UNTESTED: tapset_functions_stress (qsq_start() blacklisted)
> UNTESTED: tapset_functions_stress (raise() blacklisted)
> UNTESTED: tapset_functions_stress (start_stopwatch() blacklisted)
> UNTESTED: tapset_functions_stress (udelay() blacklisted)
> as_root stap -g --compatible=2.9 testscript.stp 2>errors.txt
>
> The errors.txt has the following:
>
> WARNING: never-assigned global variable '__global_qs_stime' (similar: __global_qs_rtime, __global_qs_utime, __global_qs_wtime, __global_qs_rlentime, __global_qs_wlentime): identifier 'qs_stime' at /home/wcohen/systemtap_write/install/share/systemtap/tapset/queue_stats.stp:32:21
> source: @__private30 global qs_stime, qs_utime, qs_dcount
>
>
> I was lucky that the crash did record what caused it; below is the tail end of
> /var/crash/127.0.0.1-2015-12-01-13:50:05/vmcore-dmesg.txt:
>
> [11325.888727] stap_5efe1be6a70a93280312b684d83f5f29_8_6396: systemtap: 3.0/0.163, base: ffffffffa0c03000, memory: 1012data/868text/715ctx/2058net/39913alloc kb, probes: 2021
> [11326.524551] BUG: unable to handle kernel NULL pointer dereference at (null)
> [11326.524578] IP: [] __list_add+0x1b/0xc0
> [11326.524596] PGD 1add2c067 PUD 1ade62067 PMD 0
> [11326.524612] Oops: 0000 [#1] SMP
> [11326.524625] Modules linked in: stap_5efe1be6a70a93280312b684d83f5f29_8_6396(OE) binfmt_misc bnep bluetooth fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter arc4 iwldvm mac80211 uvcvideo iTCO_wdt iTCO_vendor_support sg videobuf2_vmalloc videobuf2_memops videobuf2_core intel_powerclamp lpc_ich sdhci_pci sdhci videodev iwlwifi mmc_core coretemp mfd_core cfg80211 thinkpad_acpi wmi intel_rapl mei_me mei i2c_i801 shpchp
> [11326.524855] snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq rfkill snd_seq_device snd_pcm kvm_intel snd_timer snd soundcore pcspkr kvm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c dm_crypt drbg sd_mod sr_mod crc_t10dif cdrom crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel i915 aesni_intel glue_helper lrw gf128mul i2c_algo_bit ablk_helper ahci drm_kms_helper cryptd libahci libata e1000e firewire_ohci serio_raw firewire_core drm crc_itu_t ptp pps_core i2c_core video dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_4bd16bf1b7c55eaba73679d5aedc66eb_5_6137]
> [11326.525060] CPU: 5 PID: 6396 Comm: stapio Tainted: G OE ------------ 3.10.0-327.el7.x86_64 #1
> [11326.525083] Hardware name: LENOVO 24293E8/24293E8, BIOS G4ETA5WW (2.65 ) 09/10/2015
> [11326.525102] task: ffff8801adf6c500 ti: ffff88019197c000 task.ti: ffff88019197c000
> [11326.525120] RIP: 0010:[] [] __list_add+0x1b/0xc0
> [11326.525141] RSP: 0018:ffff88019197fd08 EFLAGS: 00010246
> [11326.525155] RAX: 00000000ffffffff RBX: ffff88019197fd30 RCX: ffff88019197ffd8
> [11326.525172] RDX: ffffffffa0db4828 RSI: 0000000000000000 RDI: ffff88019197fd30
> [11326.525189] RBP: ffff88019197fd20 R08: 0000000000000000 R09: 0000000000000073
> [11326.525207] R10: 00000000ffffffff R11: 0000000000000000 R12: ffffffffa0db4828
> [11326.525224] R13: 0000000000000000 R14: 00000000ffffffff R15: ffffffffa0db4828
> [11326.525241] FS: 00007f9688dbe740(0000) GS:ffff88021e340000(0000) knlGS:0000000000000000
> [11326.525261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [11326.525275] CR2: 0000000000000000 CR3: 00000002123ab000 CR4: 00000000001407e0
> [11326.525292] DR0: ffffffff819bfa78 DR1: 0000000000000000 DR2: 0000000000000000
> [11326.525308] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [11326.525325] Stack:
> [11326.525332] ffffffffa0db4820 ffffffffa0db4824 ffff8801adf6c500 ffff88019197fd78
> [11326.525354] ffffffff816396c6 0000000039cba761 ffffc900055c9000 000000000000f608
> [11326.525375] 0000000039cba761 ffffffffa0db4820 ffffc900055c9000 ffffc900055cbf68
> [11326.525397] Call Trace:
> [11326.525407] [] __mutex_lock_slowpath+0xa6/0x1c0
> [11326.525423] [] mutex_lock+0x1f/0x2f
> [11326.525453] [] function___global_mmv_add_indom+0x7b/0x290 [stap_5efe1be6a70a93280312b684d83f5f29_8_6396]
> [11326.525485] [] probe_6055+0xba/0x120 [stap_5efe1be6a70a93280312b684d83f5f29_8_6396]
> [11326.525512] [] enter_be_probe+0x14a/0x230 [stap_5efe1be6a70a93280312b684d83f5f29_8_6396]
> [11326.525541] [] systemtap_module_init+0xe64/0x2120 [stap_5efe1be6a70a93280312b684d83f5f29_8_6396]
> [11326.525567] [] ? enqueue_task+0x2c/0x80
> [11326.525586] [] _stp_ctl_write_cmd+0x298/0x4cc [stap_5efe1be6a70a93280312b684d83f5f29_8_6396]
> [11326.525610] [] vfs_write+0xbd/0x1e0
> [11326.525624] [] SyS_write+0x7f/0xe0
> [11326.525638] [] system_call_fastpath+0x16/0x1b
> [11326.525652] Code: ff e9 4a ff ff ff b8 f4 ff ff ff e9 40 ff ff ff 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 4c 8b 42 08 48 89 fb 49 39 f0 75 2a <4d> 8b 45 00 4d 39 c4 75 68 4c 39 e3 74 3e 4c 39 eb 74 39 49 89
> [11326.525768] RIP [] __list_add+0x1b/0xc0
> [11326.525784] RSP
> [11326.525793] CR2: 0000000000000000
>
>
> -Will
>
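
When comparing crashes across runs, it can help to reduce the Call Trace section of a vmcore-dmesg.txt like the one quoted above to a list of symbol, offset, and module. The following is only a minimal sketch, not part of the testsuite: the names `Frame`, `FRAME_RE`, and `parse_call_trace` are hypothetical, and it assumes the `symbol+0xOFF/0xSIZE [module]` frame layout seen in this oops, with `?` marking a frame the unwinder was not sure about.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    symbol: str            # function name, e.g. "mutex_lock"
    offset: int            # offset into the function
    size: int              # size of the function
    module: Optional[str]  # module name, or None for core-kernel frames
    reliable: bool         # False when the frame was printed with a leading "?"


# Assumed frame layout: [? ]symbol+0xOFF/0xSIZE[ [module]]
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>[A-Za-z_]\w*)\])?"
)


def parse_call_trace(text: str) -> List[Frame]:
    """Return the frames found between "Call Trace:" and the "Code:" line."""
    start = text.find("Call Trace:")
    if start < 0:
        return []
    end = text.find("Code:", start)
    body = text[start:end if end >= 0 else len(text)]
    return [
        Frame(
            symbol=m.group("symbol"),
            offset=int(m.group("offset"), 16),
            size=int(m.group("size"), 16),
            module=m.group("module"),
            reliable=m.group("unreliable") is None,
        )
        for m in FRAME_RE.finditer(body)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python parse_trace.py vmcore-dmesg.txt
    with open(sys.argv[1]) as f:
        for frame in parse_call_trace(f.read()):
            mark = " " if frame.reliable else "?"
            print(f"{mark} {frame.symbol}+{frame.offset:#x} [{frame.module or 'kernel'}]")
```

Frames that carry a module name stand out this way; in the trace above those are the ones inside the generated stap module.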