From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 17659 invoked by alias); 20 Aug 2009 19:42:01 -0000
Received: (qmail 17649 invoked by uid 22791); 20 Aug 2009 19:42:00 -0000
X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1-old.redhat.com (HELO mx1.redhat.com) (66.187.233.31)
    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 20 Aug 2009 19:41:51 +0000
Received: from int-mx03.intmail.prod.int.phx2.redhat.com ([10.11.47.254])
    by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n7KJflcM018346
    for ; Thu, 20 Aug 2009 15:41:47 -0400
Received: from [10.16.2.46] (dhcp-100-2-46.bos.redhat.com [10.16.2.46])
    by int-mx03.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n7KJfhGT022266;
    Thu, 20 Aug 2009 15:41:44 -0400
Message-ID: <4A8DA7CE.2040702@redhat.com>
Date: Thu, 20 Aug 2009 19:42:00 -0000
From: Masami Hiramatsu
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.1) Gecko/20090814 Fedora/3.0-2.6.b3.fc11 Thunderbird/3.0b3
MIME-Version: 1.0
To: Frederic Weisbecker
CC: Ingo Molnar, Steven Rostedt, lkml, Ananth N Mavinakayanahalli,
    Avi Kivity, Andi Kleen, Christoph Hellwig, "Frank Ch. Eigler",
    "H. Peter Anvin", Jason Baron, Jim Keniston, "K.Prasad",
    Lai Jiangshan, Li Zefan, Przemysław Pawełczyk, Roland McGrath,
    Sam Ravnborg, Srikar Dronamraju, Tom Zanussi, Vegard Nossum,
    systemtap, kvm, DLE
Subject: Re: [TOOL] kprobestest : Kprobe stress test tool
References: <20090813203403.31965.20973.stgit@localhost.localdomain> <4A847E30.9050903@redhat.com> <20090820184331.GA6078@nowhere>
In-Reply-To: <20090820184331.GA6078@nowhere>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2009-q3/txt/msg00416.txt.bz2

Frederic Weisbecker wrote:
> On Thu, Aug 13, 2009 at 04:57:20PM -0400, Masami Hiramatsu wrote:
>> This script tests kprobes to probe on all symbols in the kernel and finds
>> symbols which must be blacklisted.
>>
>>
>> Usage
>> -----
>>   kprobestest [-s SYMLIST] [-b BLACKLIST] [-w WHITELIST]
>>     Run stress test. If SYMLIST file is specified, use it as
>>     an initial symbol list (This is useful for verifying white list
>>     after diagnosing all symbols).
>>
>>   kprobestest cleanup
>>     Cleanup all lists
>>
>>
>> How to Work
>> -----------
>> This tool list up all symbols in the kernel via /proc/kallsyms, and sorts
>> it into groups (each of them including 64 symbols in default). And then,
>> it tests each group by using kprobe-tracer. If a kernel crash occurred,
>> that group is moved into 'failed' dir. If the group passed the test, this
>> script moves it into 'passed' dir and saves kprobe_profile into
>> 'passed/profiles/'.
>> After testing all groups, all 'failed' groups are merged and sorted into
>> smaller groups (divided by 4, in default). And those are tested again.
>> This loop will be repeated until all group has just 1 symbol.
>>
>> Finally, the script sorts all 'passed' symbols into 'tested', 'untested',
>> and 'missed' based on profiles.
>>
>>
>> Note
>> ----
>> - This script just gives us some clues to the blacklisted functions.
>>   In some cases, a combination of probe points will cause a problem, but
>>   each of them doesn't cause the problem alone.
>>
>> Thank you,
>>
>
>
> This script makes my x86-64 dual core easily and hardly locking-up
> on the 1st batch of symbols to test.
> I have one sym list in the failed and unset directories:
>
> int_very_careful
> int_signal
> int_restore_rest
> stub_clone
> stub_fork
> stub_vfork
> stub_sigaltstack
> stub_iopl
> ptregscall_common
> stub_execve
> stub_rt_sigreturn
> irq_entries_start
> common_interrupt
> ret_from_intr
> exit_intr
> retint_with_reschedule
> retint_check
> retint_swapgs
> retint_restore_args
> restore_args
> irq_return
> retint_careful
> retint_signal
> retint_kernel
> irq_move_cleanup_interrupt
> reboot_interrupt
> apic_timer_interrupt
> generic_interrupt
> invalidate_interrupt0
> invalidate_interrupt1
> invalidate_interrupt2
> invalidate_interrupt3
> invalidate_interrupt4
> invalidate_interrupt5
> invalidate_interrupt6
> invalidate_interrupt7
> threshold_interrupt
> thermal_interrupt
> mce_self_interrupt
> call_function_single_interrupt
> call_function_interrupt
> reschedule_interrupt
> error_interrupt
> spurious_interrupt
> perf_pending_interrupt
> divide_error
> overflow
> bounds
> invalid_op
> device_not_available
> double_fault
> coprocessor_segment_overrun
> invalid_TSS
> segment_not_present
> spurious_interrupt_bug
> coprocessor_error
> alignment_check
> simd_coprocessor_error
> native_load_gs_index
> gs_change
> kernel_thread
> child_rip
> kernel_execve
> call_softirq
>
>
> I don't have a crash log because I was running with X.
> But it also happened with other batch of symbols.

Thank you for reporting. I also have a result here, tested on KVM@x86-64:

native_read_tscp
native_read_msr_safe
native_read_msr_amd_safe
native_write_msr_safe
vmalloc_fault
spurious_fault
search_exception_tables
notify_die
trace_hardirqs_off_caller
ident_complete
lock_acquire
lock_release
bad_address
secondary_startup_64
stack_start
bad_address
restore_args
irq_return
restore
trace_hardirqs_off_thunk
init_level4_pgt
level3_ident_pgt
level3_kernel_pgt
level2_fixmap_pgt
_text
startup_64
level1_fixmap_pgt
level2_ident_pgt
level2_kernel_pgt
level2_spare_pgt
native_get_debugreg
native_set_debugreg
native_set_iopl_mask
native_load_sp0
debug_show_all_locks
debug_check_no_locks_held
valid_state
mark_lock
mark_held_locks
lockdep_trace_alloc
trace_softirqs_on
trace_hardirqs_on_caller
__down_write
__down_read
trace_hardirqs_on_thunk
lockdep_sys_exit_thunk

Most of them can be fixed just by adding __kprobes. Some of them are
already in another section; for those, kprobes should check whether the
symbol is in that section.

> The problem is that I don't have any serial line in this
> box then I can't catch any crash log.
> My K7 testbox also died in my arms this afternoon.
>
> But I still have two other testboxes (one P2 and one P3),
> hopefully I could reproduce the problem in these boxes
> in which I can connect a serial line.

Thank you for helping me to find it!

> I've pushed your patches in the following git tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/fgrederic/random-tracing.git \
>    tracing/kprobes
>
> So you can send patches on top of this one.

Great! I've found some other trivial bugs, so I'll fix those on top of it.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com
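
For reference, the retest loop described under "How to Work" can be sketched
roughly as below. This is only an illustration in Python, not the kprobestest
script itself: the names bisect_symbols, chunk, and group_crashes are
hypothetical, and group_crashes stands in for "probe every symbol in the group
with kprobe-tracer and see whether the kernel survives", which the real tool
has to do across reboots. The defaults (64 symbols per group, divide by 4)
are the ones quoted above.

```python
from typing import Callable, Iterable, List, Set, Tuple


def chunk(symbols: List[str], size: int) -> List[List[str]]:
    """Split a symbol list into consecutive groups of at most `size` entries."""
    return [symbols[i:i + size] for i in range(0, len(symbols), size)]


def bisect_symbols(
    symbols: Iterable[str],
    group_crashes: Callable[[List[str]], bool],  # hypothetical test callback
    group_size: int = 64,
    divisor: int = 4,
) -> Tuple[Set[str], Set[str]]:
    """Return (passed, failed) symbol sets.

    Groups that fail are merged, sorted, and retested in smaller groups
    until every remaining group holds a single symbol.
    """
    passed: Set[str] = set()
    pending = sorted(set(symbols))
    size = group_size

    while pending:
        failed: List[str] = []
        for group in chunk(pending, size):
            if group_crashes(group):
                failed.extend(group)   # corresponds to the 'failed' dir
            else:
                passed.update(group)   # corresponds to the 'passed' dir
        if size == 1:
            # Groups cannot shrink further: these are the blacklist candidates.
            return passed, set(failed)
        pending = sorted(failed)
        size = max(1, size // divisor)

    return passed, set()


if __name__ == "__main__":
    # Made-up symbol names and a made-up pair of "bad" symbols.
    all_syms = [f"sym{i:03d}" for i in range(200)]
    bad = {"sym003", "sym150"}
    ok, suspects = bisect_symbols(all_syms, lambda g: any(s in bad for s in g))
    print(sorted(suspects))
```

As the Note in the original announcement says, a crash can also come from a
combination of probe points. In this sketch such symbols would each pass when
tested alone and end up in the passed set, so the result is a list of clues
rather than a complete blacklist.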