[Pacemaker] stonith and avoiding split brain in two nodes cluster

Mon Mar 25 08:54:22 EDT 2013

Hello,

	I am a newbie with Pacemaker (and with HA clusters in general). I have
configured a two-node cluster. Both nodes are virtual machines (VMware
ESX) and use shared storage (provided by a SAN, although access to the
SAN goes through the ESX infrastructure and the VMs see it as a SCSI
disk). I have configured clvm so that logical volumes are only active on
one of the nodes.

	Now I need some help with the stonith configuration to avoid data
corruption. Since I'm using ESX virtual machines, I think I won't have
any problem using the external/vcenter stonith plugin to shut down the
virtual machines.
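
	A minimal, untested sketch of how such a fencing resource might be
declared in crm syntax, reusing the node names from the attached
configuration. The vCenter address, the credential store path, the
resource name st-vcenter and the assumption that each VM is named like
its cluster node are all placeholders; the parameter names are the ones
listed in the external/vcenter plugin's README.

```
# Assumed values (not from the real setup): vCenter host, credential
# store path, and VM names identical to the cluster node names.
primitive st-vcenter stonith:external/vcenter \
	params VI_SERVER="vcenter.example.org" \
		VI_CREDSTORE="/etc/vicredentials.xml" \
		HOSTLIST="myotis51=myotis51;myotis52=myotis52" \
		RESETPOWERON="0" \
	op monitor interval="60s"
# Fencing resources are only used once stonith is enabled cluster-wide.
property stonith-enabled="true"
```

	HOSTLIST maps each cluster node name to the VM name known to vCenter,
as hostname=vmname pairs separated by semicolons.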

	My problem is how to avoid a split-brain situation with this
configuration without adding a third node. I have read about quorum
disks, the external/sbd stonith plugin and other references, but I'm
still confused by all of this.

	For example, [1] mentions techniques to improve quorum with SCSI
reserve or a quorum daemon, but it doesn't explain how to do this with
Pacemaker. And [2] talks about external/sbd.
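
	For the external/sbd route mentioned in [2], a rough, untested sketch
of the Pacemaker side in crm syntax might look as follows. The device
path is a placeholder for a small shared LUN visible to both nodes, and
the resource name st-sbd is made up. It also assumes that the sbd daemon
is already running on both nodes with a watchdog and that the device has
been initialised beforehand; none of that is shown here. Depending on
the version, the device may instead be read from the sbd configuration
file rather than from the sbd_device parameter.

```
# Placeholder device path: replace with the shared LUN reserved for sbd.
primitive st-sbd stonith:external/sbd \
	params sbd_device="/dev/disk/by-id/EXAMPLE-SHARED-LUN"
# Fencing resources are only used once stonith is enabled cluster-wide.
property stonith-enabled="true"
```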

	Any help?

PS: I have attached my corosync.conf and the output of crm configure show.

[1] http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
[2] http://www.gossamer-threads.com/lists/linuxha/pacemaker/78887

-- 
Angel L. Mateo Martínez
Sección de Telemática
Área de Tecnologías de la Información
y las Comunicaciones Aplicadas (ATICA)
http://www.um.es/atica
Tfo: 868889150
Fax: 868888337
-------------- next part --------------
# Please read the openais.conf.5 manual page

totem {
	version: 2

	# How long before declaring a token lost (ms)
	token: 3000

	# How many token retransmits before forming a new configuration
	token_retransmits_before_loss_const: 10

	# How long to wait for join messages in the membership protocol (ms)
	join: 60

	# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
	consensus: 3600

	# Turn off the virtual synchrony filter
	vsftype: none

	# Number of messages that may be sent by one processor on receipt of the token
	max_messages: 20

	# Limit generated nodeids to 31-bits (positive signed integers)
	clear_node_high_bit: yes

	# Disable encryption
	secauth: off

	# How many threads to use for encryption/decryption
	threads: 0

	# Optionally assign a fixed node id (integer)
	# nodeid: 1234

	# This specifies the mode of redundant ring, which may be none, active, or passive.
	rrp_mode: none

	interface {
		# The following values need to be set based on your environment 
		ringnumber: 0
		bindnetaddr: 155.54.211.160
		mcastaddr: 226.94.1.1
		mcastport: 5405
	}
}

amf {
	mode: disabled
}

service {
	# Load the Pacemaker Cluster Resource Manager
	ver:       1
	name:      pacemaker
}

aisexec {
	user:   root
	group:  root
}

logging {
	fileline: off
	to_stderr: yes
	to_logfile: no
	to_syslog: yes
	syslog_facility: daemon
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
		tags: enter|leave|trace1|trace2|trace3|trace4|trace6
	}
}
-------------- next part --------------
node myotis51
node myotis52
primitive clvm ocf:lvm2:clvmd \
	params daemon_timeout="30" \
	meta target-role="Started"
primitive dlm ocf:pacemaker:controld \
	meta target-role="Started"
primitive vg_users1 ocf:heartbeat:LVM \
	params volgrpname="UsersDisk" exclusive="yes" \
	op monitor interval="60" timeout="60"
group dlm-clvm dlm clvm
clone dlm-clvm-clone dlm-clvm \
	meta interleave="true" ordered="true" target-role="Started"
location cli-prefer-vg_users1 vg_users1 \
	rule $id="cli-prefer-rule-vg_users1" inf: #uname eq myotis52
property $id="cib-bootstrap-options" \
	dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
	cluster-infrastructure="openais" \
	expected-quorum-votes="2" \
	stonith-enabled="false" \
	no-quorum-policy="ignore" \
	last-lrm-refresh="1364212376"
rsc_defaults $id="rsc-options" \
	resource-stickiness="100"