From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 1987 invoked by alias); 10 Apr 2007 07:13:16 -0000
Received: (qmail 1970 invoked by uid 9478); 10 Apr 2007 07:13:15 -0000
Date: Tue, 10 Apr 2007 07:13:00 -0000
Message-ID: <20070410071315.1969.qmail@sourceware.org>
From: jbrassow@sourceware.org
To: cluster-cvs@sources.redhat.com
Subject: cluster/cmirror-kernel/src dm-cmirror-client.c ...
Mailing-List: contact cluster-cvs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: cluster-cvs-owner@sourceware.org
X-SW-Source: 2007-q2/txt/msg00025.txt.bz2

CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL45
Changes by: jbrassow@sourceware.org 2007-04-10 08:13:15

Modified files:
    cmirror-kernel/src: dm-cmirror-client.c dm-cmirror-common.h
                        dm-cmirror-server.c dm-cmirror-server.h

Log message:
    Bug 235686: Kernel BUG at dm_cmirror_server while recovering region

    Several fixes have gone in to address this bug:

    1) During server relocation (which can happen due to machine failure
       or normal mirror suspension), the server value could get set before
       the client had a chance to clean up. This caused the server to
       become confused and issue a BUG().

    2) Perform a flush of the log before suspending. This ensures that
       regions which are in-sync get correctly flushed to the disk log.
       Without this, there will always be recovery work to be done when a
       mirror starts up, even if it was properly in-sync during shutdown.

    3) Clean up memory used to record region users when a mirror is shut
       down. It was possible for some regions to be left over (causing a
       memory leak) during certain fault scenarios.

    4) Properly initialize the state field (ru_rw) in the region user
       structure when a mark occurs. Without the initialization, it was
       sometimes possible for the region to be misinterpreted as
       recovering instead of marked.

    5) Resolve an unhandled case in server_complete_resync_work.

    6) Reset a variable in cluster_complete_resync_work. Failure to do so
       was causing a retry to include the wrong value for the completion
       of the resync work, confusing the server.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.41.2.3&r2=1.1.2.41.2.4
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-common.h.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.12.2.1&r2=1.1.2.12.2.2
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.26.2.4&r2=1.1.2.26.2.5
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.h.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.2&r2=1.1.2.2.8.1
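
Illustration for the ru_rw initialization fix in the log message: a minimal user-space sketch of why a state field must be set explicitly when a record is reused for a mark. Only the field name ru_rw comes from the log message. The struct layout, the constants RU_WRITE and RU_RECOVER, and both function names are assumptions made for this example, not the actual cmirror code; the linked diffs are the authority on the real change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state values; the real definitions may differ. */
enum ru_state {
    RU_WRITE   = 1,  /* region is marked by a writer */
    RU_RECOVER = 2   /* region is being recovered */
};

/* Hypothetical, simplified region user record. Only ru_rw is named in
 * the log message; the other fields are illustrative. */
struct region_user {
    int           ru_rw;      /* state of this user's claim on the region */
    unsigned int  ru_nodeid;  /* node that holds the claim */
    unsigned long ru_region;  /* region number */
};

/* Fill in a record for a mark. The record may come from recycled memory,
 * so every field is assigned, including the state. Leaving ru_rw alone
 * would keep whatever value the previous owner stored there. */
void region_user_init_mark(struct region_user *ru,
                           unsigned int nodeid, unsigned long region)
{
    assert(ru != NULL);
    ru->ru_rw     = RU_WRITE;
    ru->ru_nodeid = nodeid;
    ru->ru_region = region;
}

/* Query used by the hypothetical server side: is this user recovering? */
int region_user_is_recovering(const struct region_user *ru)
{
    assert(ru != NULL);
    return ru->ru_rw == RU_RECOVER;
}
```

If a record that previously held RU_RECOVER is passed to region_user_init_mark(), region_user_is_recovering() returns 0 afterwards. Without the explicit assignment to ru_rw, the stale state would survive and the mark would be read as a recovery.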