From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 1204 invoked by alias); 30 Jan 2007 07:17:53 -0000
Received: (qmail 1164 invoked by uid 48); 30 Jan 2007 07:17:38 -0000
Date: Tue, 30 Jan 2007 07:17:00 -0000
Message-ID: <20070130071738.1163.qmail@sourceware.org>
From: "srinivasa at in dot ibm dot com"
To: systemtap@sources.redhat.com
In-Reply-To: <20070124094941.3911.srinivasa@in.ibm.com>
References: <20070124094941.3911.srinivasa@in.ibm.com>
Reply-To: sourceware-bugzilla@sourceware.org
Subject: [Bug runtime/3911] Compilation of systemtap causes the system to crash on p570 system.
X-Bugzilla-Reason: AssignedTo
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2007-q1/txt/msg00266.txt.bz2

------- Additional Comments From srinivasa at in dot ibm dot com  2007-01-30 07:17 -------
Here is my analysis of this bug
======================================
1) Looking at the backtrace, _stp_map_init() calls kmalloc_node() with the
CPU number as its node argument. kmalloc_node() is the same as kmalloc() if
NUMA is not configured, and it calls kmem_cache_alloc_node() if NUMA is
configured. _stp_map_init() is called by _stp_pmap_new() inside a
for_each_cpu() loop.
=============================================
static int _stp_map_init(MAP m, unsigned max_entries, int type, int key_size, int data_size, int cpu)
{
        int size;
        .....................................
        .....................................
        for (i = 0; i < max_entries; i++) {
                if (cpu < 0)
                        tmp = kmalloc(size, STP_ALLOC_FLAGS);
                else
                        tmp = kmalloc_node(size, STP_ALLOC_FLAGS, cpu);
                if (!tmp)
                        return -1;
                dbug ("allocated %lx\n", (long)tmp);
=========================================================================
static PMAP _stp_pmap_new(unsigned max_entries, int type, int key_size, int data_size)
{
        int i;
        MAP map, m;
        PMAP pmap = (PMAP) kmalloc(sizeof(struct pmap), STP_ALLOC_FLAGS);
        if (pmap == NULL)
                return NULL;
        .........................
        .........................
        for_each_cpu(i) {
                m = per_cpu_ptr (map, i);
                if (_stp_map_init(m, max_entries, type, key_size, data_size, i)) {
                        goto err1;
                }
        }
============================================
2) Since NUMA is configured on my system, kmalloc_node() calls
kmem_cache_alloc_node() with "cpu" as the node id.
====================================================
#ifdef CONFIG_NUMA
extern void *__kmalloc_node(size_t size, gfp_t flags, int node);

static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
{
        if (__builtin_constant_p(size)) {
                int i = 0;
#define CACHE(x) \
                .......................
                ..............................
                return kmem_cache_alloc_node((flags & GFP_DMA) ?
                        malloc_sizes[i].cs_dmacachep :
                        malloc_sizes[i].cs_cachep, flags, node);
        }
========================================================
3) This means the systemtap code assumes that the number of NUMA nodes is
the same as the number of CPUs. kmem_cache_alloc_node() in turn calls
____cache_alloc_node(), where cachep->nodelists[nodeid] yields a wrong
address, because on my system there are fewer nodes than CPUs.
================================
Mount-cache hash table entries: 4096
Processor 1 found.
Processor 2 found.
Processor 3 found.
Processor 4 found.
Processor 5 found.
Processor 6 found.
Processor 7 found.
Brought up 8 CPUs
Node 0 CPUs: 0-7
Node 1 CPUs:
Node 2 CPUs:
Node 3 CPUs:
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=560 bytes
=================================
For example: I have 8 CPUs in my system and 4 NUMA nodes. nodelists[8]
gives a wrong address, and that is causing the oops.
============================================================
static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
{
        struct list_head *entry;
        struct slab *slabp;
        struct kmem_list3 *l3;
        void *obj;
        int x;

        l3 = cachep->nodelists[nodeid];      <<<========PC is here
        BUG_ON(!l3);
============================================
Hence the assumption in the systemtap modules that the number of nodes
equals the number of CPUs is what causes this bug.

Martin, any ideas?

Thanks
Srinivasa DS

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hunt at redhat dot com

http://sourceware.org/bugzilla/show_bug.cgi?id=3911
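
To make the mismatch in point 3 concrete, here is a small user-space model of it. This is only a sketch, not kernel code: NR_CPUS and NR_NODES are taken from the boot log above, while nodelists, lookup_node(), model_cpu_to_node() and count_bad_lookups() are hypothetical stand-ins written for this illustration. The mapping simply places every CPU on node 0, as the log shows. Whether translating the CPU number to a node id (for instance with the kernel's cpu_to_node()) is the right fix for the systemtap runtime is an assumption here, not something the analysis above establishes.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS  8  /* "Brought up 8 CPUs" in the boot log */
#define NR_NODES 4  /* Node 0 .. Node 3 in the boot log */

/* Hypothetical stand-in for the per-node lists behind cachep->nodelists[]. */
static int node_storage[NR_NODES];
static int *nodelists[NR_NODES] = {
        &node_storage[0], &node_storage[1],
        &node_storage[2], &node_storage[3],
};

/* Hypothetical CPU-to-node map: the log shows all eight CPUs on node 0. */
static int model_cpu_to_node(int cpu)
{
        assert(cpu >= 0 && cpu < NR_CPUS);
        return 0;
}

/*
 * Bounds-checked lookup. The kernel code quoted above does no such check;
 * it reads nodelists[nodeid] directly. NULL here marks the lookups that
 * fall outside the valid node ids.
 */
static int *lookup_node(int nodeid)
{
        if (nodeid < 0 || nodeid >= NR_NODES)
                return NULL;
        return nodelists[nodeid];
}

/* Count the CPUs whose node lookup fails, with or without the mapping. */
int count_bad_lookups(int use_mapping)
{
        int cpu, bad = 0;

        for (cpu = 0; cpu < NR_CPUS; cpu++) {
                /* Without the mapping, the CPU number is used as the node id. */
                int nodeid = use_mapping ? model_cpu_to_node(cpu) : cpu;

                if (lookup_node(nodeid) == NULL)
                        bad++;
        }
        return bad;
}
```

In this model, count_bad_lookups(0) reports 4 failed lookups (CPUs 4 to 7 have no node with the same number), and count_bad_lookups(1) reports none, because every CPU is first translated to node 0.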