public inbox for cluster-cvs@sourceware.org
* cluster/dlm-kernel/src lockqueue.c
@ 2008-01-14 15:57 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2008-01-14 15:57 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2008-01-14 15:57:46

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	bz 351321
	
	add_to_requestqueue() can add a new message to the requestqueue
	just after process_requestqueue() checks it and determines it's
	empty.  This means dlm_recvd will spin forever in wait_requestqueue()
	waiting for the message to be removed.
	
	The same problem was found and fixed in the RHEL5 code (and then
	subsequently changed again).  This patch is the RHEL4 equivalent of the
	original RHEL5 fix.
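
The window described above can be closed by making the "should this message be queued?" decision in the same critical section that protects the queue. The following is a minimal userspace sketch of that idea, not the actual patch: struct requestqueue, its fields, and the use of a pthread mutex are all assumptions made for illustration, and the real kernel code uses its own types and locking.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a lockspace's request queue. */
struct requestqueue {
    pthread_mutex_t lock;
    int count;          /* number of queued messages */
    bool in_recovery;   /* true while new messages must be queued */
};

/*
 * Receive path. Returns true if the message was queued, false if the
 * caller should process it directly. The in_recovery test happens under
 * the queue lock, so it cannot interleave with the drain below.
 */
static bool add_to_requestqueue(struct requestqueue *q)
{
    bool queued = false;

    pthread_mutex_lock(&q->lock);
    if (q->in_recovery) {
        q->count++;
        queued = true;
    }
    pthread_mutex_unlock(&q->lock);
    return queued;
}

/*
 * Recovery path. Drains the queue and leaves recovery mode inside one
 * critical section, so no message can be added after the emptiness check.
 * Returns the number of messages processed.
 */
static int process_requestqueue(struct requestqueue *q)
{
    int processed = 0;

    pthread_mutex_lock(&q->lock);
    while (q->count > 0) {
        q->count--;
        processed++;
    }
    q->in_recovery = false;
    pthread_mutex_unlock(&q->lock);
    return processed;
}
```

With this shape, a message arriving after the drain is handed back to the caller rather than left on a queue that nobody will empty again.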

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.11&r2=1.37.2.12



* cluster/dlm-kernel/src lockqueue.c
@ 2008-01-04 16:12 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2008-01-04 16:12 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2008-01-04 16:12:05

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	Some message gets out of place, but there's no need to panic
	the machine; just ignore it.  bz 427531

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.10&r2=1.37.2.11



* cluster/dlm-kernel/src lockqueue.c
@ 2007-11-07 15:57 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2007-11-07 15:57 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL46
Changes by:	teigland@sourceware.org	2007-11-07 15:57:09

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	bz 349001
	
	For the entire life of the dlm, there's been an annoying issue that we've
	worked around and not "fixed" directly.  It's the source of all these
	messages:
	
	process_lockqueue_reply id 2c0224 state 0
	
	The problem is that a lock master sends an async "granted" message for a
	convert request *before* actually sending the reply for the original
	convert.  The work-around is that the requesting node just takes the
	granted message as an implicit reply to the conversion and ignores the
	convert reply when it arrives later (the message above is printed when
	it gets the out-of-order reply for its convert).  Apart from the annoying
	messages, it's never been a problem.
	
	Now we've found a case where it's a real problem:
	
	1. nodeA: send convert PR->CW to nodeB
	   nodeB: send granted message to nodeA
	   nodeB: send convert reply to nodeA
	2. nodeA: receive granted message for conversion
	          complete request, sending ast to gfs
	3. nodeA: send convert CW->EX to nodeB
	4. nodeA: receive reply for convert in step 1, which we ordinarily
	          ignore, but since another convert has been sent, we mistake this
	          message for the reply to the convert in step 3, and complete
	          the convert request which is *not* really completed yet
	5. nodeA: send unlock to nodeB
	   nodeB: complains about an unlock during a conversion
	
	The fix is to have nodeB not send a convert reply if it has already sent a
	granted message.  (We already do this for cases where the conversion is
	granted when first processing it, but we don't in cases where the grant
	is done after processing the convert.)
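
The master-side rule in the fix can be sketched as a small decision function: once a granted message has gone out for a conversion, no separate convert reply follows. This is only an illustration. The enum, the struct, and both function names are hypothetical and do not come from lockqueue.c.

```c
#include <stdbool.h>

/* Hypothetical message kinds the master (nodeB) can send to the requester. */
enum msg_type { MSG_NONE, MSG_GRANTED, MSG_CONVERT_REPLY };

/* Hypothetical per-conversion bookkeeping kept on the master. */
struct convert_op {
    bool granted_sent;  /* set once an async granted message has gone out */
};

/* The master grants the conversion and records that it said so. */
static enum msg_type master_send_grant(struct convert_op *op)
{
    op->granted_sent = true;
    return MSG_GRANTED;
}

/*
 * The master finishes processing the convert request. If a granted
 * message was already sent, it stands in for the reply, so nothing
 * more is sent; otherwise the ordinary convert reply goes out.
 */
static enum msg_type master_finish_convert(const struct convert_op *op)
{
    return op->granted_sent ? MSG_NONE : MSG_CONVERT_REPLY;
}
```

Because the requester then sees exactly one completion message per convert, a stale reply can no longer be mistaken for the answer to a later conversion.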

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL46&r1=1.37.2.9&r2=1.37.2.9.6.1



* cluster/dlm-kernel/src lockqueue.c
@ 2007-11-07 15:22 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2007-11-07 15:22 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2007-11-07 15:22:31

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	bz 349001
	
	For the entire life of the dlm, there's been an annoying issue that we've
	worked around and not "fixed" directly.  It's the source of all these
	messages:
	
	process_lockqueue_reply id 2c0224 state 0
	
	The problem is that a lock master sends an async "granted" message for a
	convert request *before* actually sending the reply for the original
	convert.  The work-around is that the requesting node just takes the
	granted message as an implicit reply to the conversion and ignores the
	convert reply when it arrives later (the message above is printed when
	it gets the out-of-order reply for its convert).  Apart from the annoying
	messages, it's never been a problem.
	
	Now we've found a case where it's a real problem:
	
	1. nodeA: send convert PR->CW to nodeB
	   nodeB: send granted message to nodeA
	   nodeB: send convert reply to nodeA
	2. nodeA: receive granted message for conversion
	          complete request, sending ast to gfs
	3. nodeA: send convert CW->EX to nodeB
	4. nodeA: receive reply for convert in step 1, which we ordinarily
	          ignore, but since another convert has been sent, we mistake this
	          message for the reply to the convert in step 3, and complete
	          the convert request which is *not* really completed yet
	5. nodeA: send unlock to nodeB
	   nodeB: complains about an unlock during a conversion
	
	The fix is to have nodeB not send a convert reply if it has already sent a
	granted message.  (We already do this for cases where the conversion is
	granted when first processing it, but we don't in cases where the grant
	is done after processing the convert.)

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.9&r2=1.37.2.10



* cluster/dlm-kernel/src lockqueue.c
@ 2006-01-24 17:46 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2006-01-24 17:46 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U3
Changes by:	teigland@sourceware.org	2006-01-24 17:46:39

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	When GFS uses direct-io, PR and CW locks are mixed together
	on a single resource.  To optimize the interaction between
	these two lock modes, GFS uses LM_FLAG_ANY to request that
	either of the modes be granted.  When the dlm carries out
	this optimization and grants a PR lock instead of a CW, or
	a CW instead of a PR, the mode is not switched on the non-
	master node.  So, for example, the lock will be requested
	in PR mode with the ALTCW flag, it will be granted on the
	master node in CW mode, but the non-master (requesting)
	node will record the granted mode as PR.
	
	Fix by changing the grmode on the non-master node when we
	get ALTMODE back from the master.
	
	Fixes bz 178738
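
The grmode correction on the requesting node can be sketched as follows. The flag and status names here are hypothetical stand-ins chosen for illustration, and struct lkb is reduced to the three fields the example needs. It is a sketch of the idea under those assumptions, not the code from the patch.

```c
/* Hypothetical lock modes and flags, reduced to what the example needs. */
enum lock_mode { MODE_PR, MODE_CW };

#define FLAG_ALTPR 0x1u  /* requested CW, PR is an acceptable alternate */
#define FLAG_ALTCW 0x2u  /* requested PR, CW is an acceptable alternate */
#define SB_ALTMODE 0x1u  /* master reports that the alternate was granted */

struct lkb {
    enum lock_mode rqmode;  /* mode that was requested */
    enum lock_mode grmode;  /* mode recorded as granted */
    unsigned flags;         /* FLAG_ALT* bits from the request */
};

/*
 * Requesting (non-master) node handling a grant from the master.
 * Normally the granted mode is the requested mode. If the master
 * reports ALTMODE, record the alternate mode instead so that both
 * nodes agree on what is actually held.
 */
static void apply_grant(struct lkb *lkb, unsigned status_flags)
{
    lkb->grmode = lkb->rqmode;

    if (status_flags & SB_ALTMODE) {
        if (lkb->flags & FLAG_ALTCW)
            lkb->grmode = MODE_CW;
        else if (lkb->flags & FLAG_ALTPR)
            lkb->grmode = MODE_PR;
    }
}
```

In the example from the log message, a PR request carrying the ALTCW flag that comes back with ALTMODE set ends up recorded as CW on the requesting node, matching the master.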

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4U3&r1=1.37.2.6.10.1&r2=1.37.2.6.10.2



* cluster/dlm-kernel/src lockqueue.c
@ 2006-01-24 17:16 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2006-01-24 17:16 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U3
Changes by:	teigland@sourceware.org	2006-01-24 17:16:56

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	- In process_cluster_request() replace most of the assertions with an
	error message followed by ignoring the request.  There are some corner
	cases that would trigger assertions/panics when the request should
	just be ignored instead.
	
	- There's a statement to catch a corner case where a grant message
	arrives for a lock being unlocked.  We want to ignore the grant
	message, but the code was just returning instead of breaking which
	meant the in_recovery rw-semaphore wasn't being released.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4U3&r1=1.37.2.6&r2=1.37.2.6.10.1



* cluster/dlm-kernel/src lockqueue.c
@ 2006-01-24 14:38 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2006-01-24 14:38 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2006-01-24 14:38:19

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	When GFS uses direct-io, PR and CW locks are mixed together
	on a single resource.  To optimize the interaction between
	these two lock modes, GFS uses LM_FLAG_ANY to request that
	either of the modes be granted.  When the dlm carries out
	this optimization and grants a PR lock instead of a CW, or
	a CW instead of a PR, the mode is not switched on the non-
	master node.  So, for example, the lock will be requested
	in PR mode with the ALTCW flag, it will be granted on the
	master node in CW mode, but the non-master (requesting)
	node will record the granted mode as PR.
	
	Fix by changing the grmode on the non-master node when we
	get ALTMODE back from the master.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.8&r2=1.37.2.9



* cluster/dlm-kernel/src lockqueue.c
@ 2006-01-23 21:58 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2006-01-23 21:58 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	teigland@sourceware.org	2006-01-23 21:58:45

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	When GFS uses direct-io, PR and CW locks are mixed together
	on a single resource.  To optimize the interaction between
	these two lock modes, GFS uses LM_FLAG_ANY to request that
	either of the modes be granted.  When the dlm carries out
	this optimization and grants a PR lock instead of a CW, or
	a CW instead of a PR, the mode is not switched on the non-
	master node.  So, for example, the lock will be requested
	in PR mode with the ALTCW flag, it will be granted on the
	master node in CW mode, but the non-master (requesting)
	node will record the granted mode as PR.
	
	Fix by changing the grmode on the non-master node when we
	get ALTMODE back from the master.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.37.2.6.6.4&r2=1.37.2.6.6.5



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-19 23:01 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-19 23:01 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2005-12-19 23:01:02

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	Remove an assertion.  Before sending a lock request we were asserting
	that the destination (from the lkb's nodeid field) matched the rsb's
	nodeid that we got from a lookup and copied to the lkb.  Given the
	right combination, the nodeid in the rsb can be invalidated by another
	failed request (and set to -1) before the assertion check.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.7&r2=1.37.2.8



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-19 22:55 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-19 22:55 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	teigland@sourceware.org	2005-12-19 22:55:58

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	Remove an assertion.  Before sending a lock request we were asserting
	that the destination (from the lkb's nodeid field) matched the rsb's
	nodeid that we got from a lookup and copied to the lkb.  Given the
	right combination, the nodeid in the rsb can be invalidated by another
	failed request (and set to -1) before the assertion check.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.37.2.6.6.3&r2=1.37.2.6.6.4



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-16 20:18 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-16 20:18 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2005-12-16 20:18:04

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	- In process_cluster_request() replace most of the assertions with an
	error message followed by ignoring the request.  There are some corner
	cases that would trigger assertions/panics when the request should
	just be ignored instead.
	
	- There's a statement to catch a corner case where a grant message
	arrives for a lock being unlocked.  We want to ignore the grant
	message, but the code was just returning instead of breaking which
	meant the in_recovery rw-semaphore wasn't being released.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.6&r2=1.37.2.7



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-16 16:39 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-16 16:39 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	teigland@sourceware.org	2005-12-16 16:39:57

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	When replacing ASSERT's with error messages one check was reversed;
	res_nodeid == 0 is the error we're checking for, res_nodeid > 0 is
	correct.
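
A check of this kind, written as "log and ignore" rather than as an assertion, might look like the sketch below. The function name and the message text are hypothetical. Only the condition (res_nodeid == 0 is the error, res_nodeid > 0 is correct) is taken from the log message, and the treatment of negative values is left unspecified there, so this sketch does not reject them.

```c
#include <stdio.h>

/*
 * Returns 0 if the request may proceed, -1 if it should be ignored.
 * The error case is res_nodeid == 0; a positive nodeid is the
 * expected value. Writing the test the other way round would reject
 * every valid request and accept the bad one.
 */
static int check_res_nodeid(int res_nodeid)
{
    if (res_nodeid == 0) {
        fprintf(stderr, "unexpected res_nodeid 0, ignoring request\n");
        return -1;
    }
    return 0;
}
```

The caller drops the request when -1 comes back instead of panicking the machine.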

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.37.2.6.6.2&r2=1.37.2.6.6.3



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-14 23:24 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-14 23:24 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	teigland@sourceware.org	2005-12-14 23:24:55

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	There's a statement to catch a corner case where a grant message
	arrives for a lock being unlocked.  We want to ignore the grant
	message, but the code was just returning instead of breaking which
	meant the in_recovery rw-semaphore wasn't being released.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.37.2.6.6.1&r2=1.37.2.6.6.2



* cluster/dlm-kernel/src lockqueue.c
@ 2005-12-14 23:20 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-12-14 23:20 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	teigland@sourceware.org	2005-12-14 23:20:41

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	In process_cluster_request() replace most of the assertions with an
	error message followed by ignoring the request.  There are some corner
	cases that would trigger assertions/panics when the request should just
	be ignored instead.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.37.2.6&r2=1.37.2.6.6.1



* cluster/dlm-kernel/src lockqueue.c
@ 2005-02-17  4:38 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-02-17  4:38 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2005-02-17 04:38:00

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	remove some non-critical printk's

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.3&r2=1.37.2.4



* cluster/dlm-kernel/src lockqueue.c
@ 2005-02-17  4:37 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-02-17  4:37 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Changes by:	teigland@sourceware.org	2005-02-17 04:37:17

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	remove some non-critical printk's

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&r1=1.40&r2=1.41



* cluster/dlm-kernel/src lockqueue.c
@ 2005-02-16  3:53 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-02-16  3:53 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland@sourceware.org	2005-02-16 03:53:16

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	Blocking asts were being ignored for all locks being converted which
	resulted in some necessary basts being skipped.  In particular,
	after a failed NOQUEUE conversion, gfs could be left holding a lock
	and getting no callback for it while others were left waiting.
	
	This changes things so that a bast message is ignored if the lock is
	being converted and NOQUEUE isn't set, or if the lock is being
	unlocked.  Fixes bz 147798.
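
The new rule for when a bast message is dropped can be written as a single predicate. The state enum and the function name below are hypothetical. The real code tracks lock state differently, so this is only a sketch of the condition as the log message states it.

```c
#include <stdbool.h>

/* Hypothetical summary of what the lock is currently doing. */
enum lkb_state { LKB_IDLE, LKB_CONVERTING, LKB_UNLOCKING };

/*
 * Returns true if an incoming blocking ast should be ignored:
 *   - the lock is being unlocked, or
 *   - the lock is being converted and NOQUEUE is not set.
 * A NOQUEUE conversion can fail and leave the lock held in its old
 * mode, so its basts must still be delivered.
 */
static bool ignore_bast(enum lkb_state state, bool noqueue)
{
    if (state == LKB_UNLOCKING)
        return true;
    if (state == LKB_CONVERTING && !noqueue)
        return true;
    return false;
}
```

Under the old behaviour the second test did not look at NOQUEUE at all, which is how gfs could be left holding a lock without ever receiving the callback.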

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.37.2.2&r2=1.37.2.3



* cluster/dlm-kernel/src lockqueue.c
@ 2005-02-16  3:52 teigland
  0 siblings, 0 replies; 18+ messages in thread
From: teigland @ 2005-02-16  3:52 UTC (permalink / raw)
  To: cluster-cvs

CVSROOT:	/cvs/cluster
Module name:	cluster
Changes by:	teigland@sourceware.org	2005-02-16 03:52:24

Modified files:
	dlm-kernel/src : lockqueue.c 

Log message:
	Blocking asts were being ignored for all locks being converted which
	resulted in some necessary basts being skipped.  In particular,
	after a failed NOQUEUE conversion, gfs could be left holding a lock
	and getting no callback for it while others were left waiting.
	
	This changes things so that a bast message is ignored if the lock is
	being converted and NOQUEUE isn't set, or if the lock is being
	unlocked.  Fixes bz 147798.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/dlm-kernel/src/lockqueue.c.diff?cvsroot=cluster&r1=1.39&r2=1.40



