From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <cluster-cvs-return-1681-listarch-cluster-cvs=sources.redhat.com@sources.redhat.com>
Received: (qmail 20298 invoked by alias); 20 Apr 2005 05:51:15 -0000
Mailing-List: contact cluster-cvs-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:cluster-cvs-subscribe@sources.redhat.com>
List-Post: <mailto:cluster-cvs@sources.redhat.com>
List-Help: <mailto:cluster-cvs-help@sources.redhat.com>, <http://sources.redhat.com/lists.html#faqs>
Sender: cluster-cvs-owner@sources.redhat.com
Received: (qmail 20281 invoked by uid 9453); 20 Apr 2005 05:51:15 -0000
Date: Wed, 20 Apr 2005 05:51:00 -0000
Message-ID: <20050420055115.20278.qmail@sourceware.org>
From: teigland@sourceware.org
To: cluster-cvs@sources.redhat.com
Subject: cluster/fence/fenced recover.c
X-SW-Source: 2005-q2/txt/msg00098.txt.bz2
List-Id: <cluster-cvs.sourceware.org>

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2005-04-20 05:51:15

Modified files:
	fence/fenced   : recover.c 

Log message:
	Improve logic that delays and reduces fencing.  When fenced is recovering
	for a failed node, the 'post_fail_delay' is used to give victims some
	time to rejoin the cluster and avoid being fenced.  If a victim does
	rejoin during that delay, others are likely to rejoin as well and the
	'post_join_delay' is more appropriate, so fenced switches to the
	'post_join_delay' value (if it's larger, which is usually the case).
	
	The common situation where this helps is when multiple nodes fail, causing
	the cluster to lose quorum, and then the failed nodes all rejoin the
	cluster at about the same time.  The rejoining nodes are more likely
	to all avoid being fenced if fenced uses the larger post_join_delay.
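
	The delay selection described above can be sketched roughly as follows.
	This is only an illustration, not the code in recover.c: the function
	name choose_fence_delay and the victim_rejoined flag are hypothetical,
	and the delays are assumed to be plain integers in seconds.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from recover.c.
 * Returns the number of seconds to wait before fencing remaining victims.
 *
 * post_fail_delay: delay normally used when recovering for a failed node
 * post_join_delay: delay normally used after a node joins
 * victim_rejoined: true once a victim has rejoined during the delay
 */
int choose_fence_delay(int post_fail_delay, int post_join_delay,
                       bool victim_rejoined)
{
    /* Switch to post_join_delay only if a victim has already rejoined
     * and post_join_delay is the larger of the two values. */
    if (victim_rejoined && post_join_delay > post_fail_delay)
        return post_join_delay;

    /* Otherwise keep the ordinary post-failure delay. */
    return post_fail_delay;
}
```

	With post_fail_delay=0 and post_join_delay=6 (illustrative values), the
	sketch waits 0 seconds until a victim rejoins and 6 seconds afterwards.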

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/fence/fenced/recover.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.10.2.6&r2=1.10.2.7