[Pacemaker] Trouble with Xen high availability. Can't get it.

Богомолов Дмитрий Викторович beatseed at mail.ru
Tue Dec 6 00:17:58 UTC 2011


Hello, thanks for your answer.

On 06 December 2011 at 02:08, Andreas Kurz <andreas.kurz at gmail.com> wrote:
> Hello,
> 
> On 12/05/2011 12:57 PM, Богомолов Дмитрий Викторович wrote:
> > Hello. I built a cluster with two nodes (Ubuntu 11.10 + corosync + drbd
> > + cman + Pacemaker) and configured a Xen resource to start a virtual
> > machine (VM1 for short, Ubuntu 10.10); the virtual machine's disks are
> > on the DRBD resource. Now I am testing availability.
> 
> And how did you configure it? Hard to comment without seeing any
> configuration.

$cat /etc/drbd.conf
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
resource clusterdata {
	meta-disk internal;
	device	/dev/drbd1;
	protocol C;
	syncer {
		verify-alg	sha1;
		rate 33M;
	}
	net {
		allow-two-primaries;
	}
	on blaster {
		disk /dev/mapper/turrel-cluster_storage;
		address 192.168.0.254:7789;
	}
	on turrel {
		disk /dev/mapper/turrel-cluster_storage;
		address 192.168.0.253:7789;
	}
}
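
(I have not added any fencing configuration to this DRBD resource yet. From
the DRBD documentation I understand that a dual-primary resource is normally
also told to fence at the DRBD level; a minimal sketch of that stanza,
assuming the crm-fence-peer.sh helper scripts shipped with DRBD are
installed under /usr/lib/drbd/, would be:

resource clusterdata {
	...
	disk {
		fencing resource-and-stonith;
	}
	handlers {
		fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
		after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
	}
}

This would only have an effect once a working stonith device exists on the
Pacemaker side.)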

$cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
	version: 2
	secauth: off
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 192.168.0.0
		mcastaddr: 239.0.0.1
		mcastport: 4000
	}
}

logging {
	fileline: off
	to_stderr: off
	to_logfile: yes
	to_syslog: off
	logfile: /var/log/corosync/corosync.log
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
	}
}

amf {
	mode: disabled
}

service {
	# Load the Pacemaker Cluster Resource Manager
	name: 	pacemaker
	clustername:	tumba
	ver:	1
}

$cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="1" name="tumba">
 <logging debug="off"/>
 <clusternodes>
  <clusternode name="blaster" nodeid="1">
   <fence>
    <method name="pcmk-redirect">
     <device name="pcmk" port="blaster"/>
    </method>
   </fence>
  </clusternode>
  <clusternode name="turrel" nodeid="2">
   <fence>
    <method name="pcmk-redirect">
     <device name="pcmk" port="turrel"/>
    </method>
   </fence>
  </clusternode>
 </clusternodes>
 <fencedevices>
  <fencedevice name="pcmk" agent="fence_pcmk"/>
 </fencedevices>
</cluster>
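
(As far as I understand, fence_pcmk here only redirects CMAN's fencing
requests to Pacemaker, so a real stonith device still has to be configured
on the Pacemaker side. To check CMAN membership and the fence domain I run
something like the following; output omitted:

$sudo cman_tool status
$sudo cman_tool nodes
$sudo fence_tool ls
)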

$sudo crm configure show
node blaster \
	attributes standby="off"
node turrel \
	attributes standby="off"
primitive ClusterData ocf:linbit:drbd \
	params drbd_resource="clusterdata" \
	op monitor interval="60s"
primitive ClusterFS ocf:heartbeat:Filesystem \
	params device="/dev/drbd/by-res/clusterdata" directory="/mnt/cluster" fstype="gfs2" \
	op start interval="0" timeout="60s" \
	op stop interval="0" timeout="60s" \
	op monitor interval="60s" timeout="60s"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
	params ip="192.168.122.252" cidr_netmask="32" clusterip_hash="sourceip" \
	op monitor interval="30s"
primitive XenDom ocf:heartbeat:Xen \
	params xmfile="/etc/xen/xen1.example.com.cfg" \
	meta is-managed="true" \
	utilization cores="1" mem="512" \
	op monitor interval="1min" timeout="30sec" start-delay="10sec" \
	op start interval="0" timeout="1min" \
	op stop interval="0" timeout="60sec" \
	op migrate_to interval="0" timeout="180sec"
ms ClusterDataClone ClusterData \
	meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone ClusterFSClone ClusterFS \
	meta target-role="Started" is-managed="true"
clone IP ClusterIP \
	meta globally-unique="true" clone-max="2" clone-node-max="2"
clone XenDomClone XenDom \
	meta target-role="Started"
location cli-prefer-ClusterFSClone ClusterFSClone \
	rule $id="cli-prefer-rule-ClusterFSClone" inf: #uname eq blaster and #uname eq blaster
location prefere-blaster XenDomClone 50: blaster
colocation XenDom-with-ClusterFS inf: XenDomClone ClusterFSClone
colocation fs_on_drbd inf: ClusterFSClone ClusterDataClone:Master
order ClusterFS-after-ClusterData inf: ClusterDataClone:promote ClusterFSClone:start
order XenDom-after-ClusterFS inf: ClusterFSClone XenDomClone
property $id="cib-bootstrap-options" \
	dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
	cluster-infrastructure="cman" \
	expected-quorum-votes="2" \
	stonith-enabled="false" \
	no-quorum-policy="ignore" \
	last-lrm-refresh="1323127127"
rsc_defaults $id="rsc-options" \
	resource-stickiness="100"

$sudo crm_mon -1
============
Last updated: Tue Dec  6 10:54:38 2011
Stack: cman
Current DC: blaster - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ blaster turrel ]

 Master/Slave Set: ClusterDataClone [ClusterData]
     Masters: [ blaster turrel ]
 Clone Set: IP [ClusterIP] (unique)
     ClusterIP:0	(ocf::heartbeat:IPaddr2):	Started blaster
     ClusterIP:1	(ocf::heartbeat:IPaddr2):	Started turrel
 Clone Set: ClusterFSClone [ClusterFS]
     Started: [ blaster turrel ]
 Clone Set: XenDomClone [XenDom]
     Started: [ blaster turrel ]

> 
> > I execute this command on node1:
> >
> > $sudo crm node standby
> >
> > And I receive this message:
> >
> > block drbd1: Sending state for detaching disk failed
> >
> > I notice that on node1 the drbd service stops:
> >
> > $cat /proc/drbd
> > 1: cs:Unconfigured
> >
> > Is this normal? Then the following happens:
> 
> Yes, a node in standby runs no resources.
> 
> >
> > The virtual machine doesn't stop; ICMP echo replies from VM1 confirm
> > this. I open an interactive console to VM1 on node2 with:
> >
> > $sudo xm console VM1
> >
> > I can see that it continues to work, and the remote ssh session to VM1
> > also continues to work.
> 
> That looks like a working live-migration.
> 
> >
> > Then I bring node1 back with:
> >
> > $sudo crm node online
> >
> > I receive messages:
> >
> > dlm: Using TCP for communications
> > dlm: connecting to 1
> > dlm: got connection from 1
> >
> > At that point ICMP echo replies from VM1 stopped for 15 seconds, and
> > both the interactive VM1 console on node2 and the remote ssh session to
> > VM1 showed a shutdown sequence. In other words, VM1 was restarted on
> > node2, which I believe shouldn't happen. Next I put node2 into standby:
> >
> > $sudo crm node standby
> >
> > Also, I receive this message:
> >
> > block drbd1: Sending state for detaching disk failed
> >
> > I notice that on node2 the drbd service stops. The interactive VM1
> > console on node2 and the remote ssh session showed a shutdown sequence,
> > but the interactive VM1 console on node1 keeps working normally. However,
> > ICMP echo replies from VM1 stopped for 275 seconds, and during this time
> > I couldn't get a remote ssh connection to VM1. Only after this long
> > interval did the Xen services start working again. Next I bring node2
> > back online:
> 
> config???
> 
> >
> > $sudo crm node online
> >
> > The situation is similar to the one described earlier: ICMP echo
> > replies from VM1 stopped for 15 seconds, and both the interactive VM1
> > console on node1 and the remote ssh session to VM1 showed a shutdown
> > sequence. In other words, VM1 was restarted on node1.
> >
> > I repeated this operation several times (4-5) with the same result, and
> > tried adding this to the parameters of the Xen resource:
> >
> > meta allow-migrate = "true"
> >
> > It didn't change the behavior.
> >
> > I wonder whether this allow-migrate parameter is necessary in an
> > Active/Active configuration. It is not included in the Clusters from
> > Scratch manual, but I saw it in other (active/passive) configurations,
> > so I assume it is not necessary, because the Xen resource is started
> > equally on both servers. And I expect that a failure of one node must
> > not stop services on the other node. Am I thinking correctly?
> >
> 
> What? You are starting the same VM on both nodes ... are you serious?

Yes, the Xen resource starts it in an Active/Active Pacemaker configuration. I don't understand what is wrong with that. Do I need another approach?
I want a highly available Xen cluster where the failure of one host does not affect users, with zero downtime. I would also like load balancing for VM1.
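
For comparison, here is a minimal sketch of what I understand to be the
more usual approach: a single, non-cloned Xen primitive with allow-migrate
enabled, colocated with and ordered after the filesystem clone, so that
only one instance of VM1 runs at a time and a standby of the hosting node
should result in a live migration instead of a restart. The names below are
only placeholders based on my configuration above:

primitive XenDomSingle ocf:heartbeat:Xen \
	params xmfile="/etc/xen/xen1.example.com.cfg" \
	meta allow-migrate="true" \
	op monitor interval="1min" timeout="30sec" \
	op start interval="0" timeout="1min" \
	op stop interval="0" timeout="60sec" \
	op migrate_to interval="0" timeout="180sec" \
	op migrate_from interval="0" timeout="180sec"
colocation XenDomSingle-with-ClusterFS inf: XenDomSingle ClusterFSClone
order XenDomSingle-after-ClusterFS inf: ClusterFSClone XenDomSingle

Is something like this what you mean instead of the clone?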

> 
> > So: how do I avoid such reboots of VM1? And what do I need to do to
> > keep VM1 running continuously?
> >
> > What is the reason for the different recovery delays, 15 seconds on
> > node1 and 275 seconds on node2? How can I reduce them, or better, avoid
> > them entirely?
> >
> > Do I need live migration? If yes, how do I set that up? I used the
> > parameter meta allow-migrate="true", but it had no effect.
> >
> > Is it because I haven't configured stonith yet? At least that is my
> > assumption.
> 
> Dual primary DRBD setup? Yes, you must use stonith.

I don't completely understand how to use it in my configuration.
There is an example of setting up IPMI stonith in Clusters_from_Scratch/s-stonith-example.html. If I have understood it properly, that example is for hosts which support IPMI. My hosts don't support it.

The nodes in my test cluster are: Node1, a VMware virtual machine, and Node2, an old computer with a Pentium 4 at 2.8 GHz and 1 GB RAM.
Which stonith RA should I use with my configuration for the Xen resource: fence_xenapi, fence_node, external/xen0, or external/xen0-ha?
A man page is available only for fence_xenapi, and that RA seems to be meant for XenCenter, which I don't use.
I don't know how to get help/manuals for the other RAs, because they have no man pages. Maybe you can help with that.
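
So far the only ways I have found to inspect the agents are these commands
(assuming crmsh and the cluster-glue stonith tool are installed):

$sudo crm ra list stonith
$sudo crm ra info stonith:external/xen0
$sudo stonith -t external/xen0 -h

For testing only, I could probably start with something like the
external/ssh plugin, which as I understand cannot fence a truly hung node
and is not meant for production:

primitive st-ssh stonith:external/ssh \
	params hostlist="blaster turrel" \
	op monitor interval="60m"
clone Fencing st-ssh
property stonith-enabled="true"

Would that be an acceptable starting point, or is a real fencing device
required even for this test setup?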

> 
> Regards,
> Andreas
> 
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
> 
> >
> > I will be grateful for any help.
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 
> 

