From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 4965 invoked by alias); 3 Apr 2007 18:23:02 -0000
Received: (qmail 4949 invoked by uid 9478); 3 Apr 2007 18:23:01 -0000
Date: Tue, 03 Apr 2007 18:23:00 -0000
Message-ID: <20070403182301.4948.qmail@sourceware.org>
From: jbrassow@sourceware.org
To: cluster-cvs@sources.redhat.com
Subject: cluster/cmirror-kernel/src dm-cmirror-client.c ...
Mailing-List: contact cluster-cvs-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: cluster-cvs-owner@sourceware.org
X-SW-Source: 2007-q2/txt/msg00011.txt.bz2

CVSROOT:      /cvs/cluster
Module name:  cluster
Branch:       RHEL45
Changes by:   jbrassow@sourceware.org 2007-04-03 19:23:01

Modified files:
    cmirror-kernel/src: dm-cmirror-client.c dm-cmirror-common.h
                        dm-cmirror-server.c dm-cmirror-xfr.h

Log message:
    Bug 234539: multiple streams of I/O can cause system to lock up

    This bug provoked an audit of the communications exchange, locking,
    and memory allocations/stack usage.

    Communication fixes include:
    1) Added sequence numbers to ensure that replies from the server
       correctly correspond to client requests. It was found that if a
       client timed out waiting for a server to respond, it would send
       the request again. However, the server may have simply been too
       busy to respond in a timely fashion. It ends up responding to
       both the original request and the resent request - causing the
       client and server to become out-of-sync WRT log requests.

    Locking fixes include:
    1) A semaphore was being "up"ed twice in some cases, rendering the
       lock impotent.
    2) A spin lock controlling region status lists was being held across
       blocking operations - sometimes causing deadlocks. The spin lock
       was changed to a per-log lock, and some logging operations were
       restructured to better suit the way locking needed to be done.
       A side-effect of this fix is a 20% improvement in write
       operations.
    3) The log list protection lock needed to change from a spin lock to
       a semaphore to allow blocking operations.

    Memory allocation fixes include:
    1) Wrong flags to kmalloc could cause deadlock. Use GFP_NOFS instead
       of GFP_KERNEL.
    2) Mempools needed more reserves for low memory conditions.
    3) Server now allocates a communication structure instead of having
       it on the stack. This reduces the likelihood of stack corruption.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.41.2.1&r2=1.1.2.41.2.2
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-common.h.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.12&r2=1.1.2.12.2.1
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.26.2.1&r2=1.1.2.26.2.2
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-xfr.h.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.2.2.1&r2=1.1.2.2.2.2
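
Illustration of the sequence-number fix from the log message: the
user-space C sketch below shows one way a client could discard the
second of two replies produced by a timed-out-and-resent request. It is
not the cmirror code. The struct layout and every function name here are
hypothetical, and reusing the same sequence number on a resend is an
assumption (the log message does not say how resends are numbered).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message layout; the real cmirror structures may differ. */
struct log_msg {
    uint32_t seq;    /* sequence number, echoed back by the server */
    int      type;   /* request type (illustrative only) */
    int      result; /* reply payload (illustrative only) */
};

/* Hypothetical per-client bookkeeping. */
struct client {
    uint32_t next_seq;    /* number to assign to the next new request */
    uint32_t pending_seq; /* number of the request still awaiting a reply */
    bool     waiting;     /* true while a request is outstanding */
};

/* Start a new request and remember its sequence number. */
static struct log_msg client_new_request(struct client *c, int type)
{
    struct log_msg req = { .seq = c->next_seq++, .type = type, .result = 0 };

    c->pending_seq = req.seq;
    c->waiting = true;
    return req;
}

/* Resend after a timeout. Assumption: the resend keeps the same number. */
static struct log_msg client_resend(const struct client *c, int type)
{
    struct log_msg req = { .seq = c->pending_seq, .type = type, .result = 0 };

    assert(c->waiting); /* a resend only makes sense for a pending request */
    return req;
}

/* The server copies the request's sequence number into its reply. */
static struct log_msg server_reply(const struct log_msg *req, int result)
{
    struct log_msg rep = *req;

    rep.result = result;
    return rep;
}

/*
 * Accept a reply only if it answers the outstanding request.
 * Returns false for stale or duplicate replies, which the caller drops.
 */
static bool client_accept_reply(struct client *c, const struct log_msg *rep)
{
    if (!c->waiting || rep->seq != c->pending_seq)
        return false;
    c->waiting = false;
    return true;
}
```

In this sketch, if a busy server eventually answers both the original
request and the resend, the first reply is accepted and the second is
rejected because no request with that number is still outstanding, so a
late duplicate cannot be mistaken for the answer to a later request.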