[Pacemaker] Help on setting order of resources

Adrian Gibanel adrian.gibanel at btactic.com
Sat Aug 18 16:16:07 EDT 2012

Short description 
Pacemaker ignores my resource ordering settings. 

Final goal
Making Zimbra highly available. 

Description of the system 
This is an Ubuntu 10.04 LTS system, because the current stable Zimbra release works on Ubuntu 10.04 and not yet on 12.04. 

I've dist-upgraded packages from https://launchpad.net/~ubuntu-ha-maintainers/+archive/ppa, as advised on several sites. 

My main configuration is based on this document: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 

I've written some OCF resource agents of my own (for Zimbra and some network tasks) and have already tested them with ocf-tester and ocf-tester-py (my own hack of ocf-tester that lets you test Python-based OCF scripts). 
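For reference, a typical ocf-tester run against one of these agents would look roughly like the following; the agent path and the `device` parameter are illustrative guesses based on the configuration below, not taken from the actual setup:

```shell
# Exercise the custom host-route agent through the standard OCF action
# set (meta-data, validate-all, start, monitor, stop). -n gives the test
# resource a name; -o passes an instance parameter. The install path is
# an assumption; adjust it to wherever the btactic agents live.
ocf-tester -n ClusterHostRoute \
    -o device=eth0 \
    /usr/lib/ocf/resource.d/btactic/OVHhostroute
```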

Finally some packages versions: 

libcrmcluster1 1.1.6-2ubuntu0~ppa2 
libcrmcommon2 1.1.6-2ubuntu0~ppa2 
corosync 1.4.2-1ubuntu0~ppa1 
libcorosync4 1.4.2-1ubuntu0~ppa1 
lvm2 2.02.54-1ubuntu4.1ppa5 
pacemaker 1.1.6-2ubuntu0~ppa2 
libglib2.0-0 2.24.1-0ubuntu1.1~ppa1 
cluster-glue 1.0.8-2ubuntu0~ppa4 
libcluster-glue 1.0.8-2ubuntu0~ppa4 
resource-agents 1:3.9.2-4ubuntu0~ppa2 

crm configure show output: 

adrian at zhatest-01:~$ sudo crm configure show 
node zhatest-01.domain.com 
node zhatest-02.domain.com 
primitive ClusterDefaultRoute ocf:btactic:OVHdefaultroute \ 
op monitor interval="30s" 
primitive ClusterHostRoute ocf:btactic:OVHhostroute \ 
params device="eth0" \ 
op monitor interval="30s" 
primitive ClusterIP ocf:heartbeat:IPaddr2 \ 
params nic="eth0" ip="" cidr_netmask="32" broadcast="" \ 
op monitor interval="30s" 
primitive ClusterOVHFailover ocf:btactic:OVHfailover \ 
op monitor interval="120s" timeout="60s" \ 
op start interval="0" timeout="660" \ 
op stop interval="0" timeout="660" \ 
params nichandle="MYLOGIN" password="MYSECRET" failover="" \ 
meta target-role="Started" 
primitive ZimbraData ocf:linbit:drbd \ 
params drbd_resource="zimbradata" \ 
op monitor interval="60s" role="Master" \ 
op monitor interval="50s" role="Slave" \ 
op start interval="0" role="Master" timeout="240" \ 
op start interval="0" role="Slave" timeout="240" \ 
op stop interval="0" role="Master" timeout="100" \ 
op stop interval="0" role="Slave" timeout="100" 
primitive ZimbraFS ocf:heartbeat:Filesystem \ 
params device="/dev/drbd/by-res/zimbradata" directory="/opt/zimbra" fstype="ext4" \ 
op start interval="0" timeout="60s" \ 
op stop interval="0" timeout="60s" 
primitive ZimbraServer ocf:btactic:zimbra \ 
op monitor interval="2min" \ 
op start interval="0" timeout="360s" \ 
op stop interval="0" timeout="360s" 
group MySystem ClusterOVHFailover ClusterIP ClusterHostRoute ClusterDefaultRoute 
group MyZimbra ZimbraFS ZimbraServer 
ms ZimbraDataClone ZimbraData \ 
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" 
location prefer-zhatest-01 MyZimbra 50: zhatest-01.domain.com 
colocation everything-together inf: MySystem ZimbraDataClone:Master MyZimbra 
order everything-ordered inf: MySystem ZimbraDataClone:promote MyZimbra 
property $id="cib-bootstrap-options" \ 
no-quorum-policy="ignore" \ 
stonith-enabled="false" \ 
dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \ 
cluster-infrastructure="openais" \ 
rsc_defaults $id="rsc-options" \ 

crm_mon -orVVVV1 output: 
crm_mon[4215]: 2012/08/18_19:46:39 info: main: Starting crm_mon 
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_config: Startup probes: enabled 
crm_mon[4215]: 2012/08/18_19:46:39 notice: unpack_config: On loss of CCM Quorum: Ignore 
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0 
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_domains: Unpacking domains 
crm_mon[4215]: 2012/08/18_19:46:39 info: determine_online_status: Node zhatest-01.domain.com is online 
crm_mon[4215]: 2012/08/18_19:46:39 notice: unpack_rsc_op: Hard error - ZimbraServer_last_failure_0 failed with rc=5: Preventing ZimbraServer from re-starting on zhatest-01.domain.com 
Last updated: Sat Aug 18 19:46:39 2012 
Last change: Sat Aug 18 18:09:51 2012 via crmd on zhatest-01.domain.com 
Stack: openais 
Current DC: zhatest-01.domain.com - partition WITHOUT quorum 
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c 
2 Nodes configured, 2 expected votes 
8 Resources configured. 

Online: [ zhatest-01.domain.com ] 
OFFLINE: [ zhatest-02.domain.com ] 

Full list of resources: 

Resource Group: MySystem 
ClusterOVHFailover (ocf::btactic:OVHfailover): Stopped 
ClusterIP (ocf::heartbeat:IPaddr2): Stopped 
ClusterHostRoute (ocf::btactic:OVHhostroute): Stopped 
ClusterDefaultRoute (ocf::btactic:OVHdefaultroute): Stopped 
Resource Group: MyZimbra 
ZimbraFS (ocf::heartbeat:Filesystem): Stopped 
ZimbraServer (ocf::btactic:zimbra): Stopped 
Master/Slave Set: ZimbraDataClone [ZimbraData] 
Slaves: [ zhatest-01.domain.com ] 
Stopped: [ ZimbraData:1 ] 

* Node zhatest-01.domain.com: 
ZimbraData:0: migration-threshold=1000000 
+ (9) start: rc=0 (ok) 
+ (11) monitor: interval=50000ms rc=0 (ok) 
ZimbraServer: migration-threshold=1000000 
+ (7) probe: rc=5 (not installed) 

Failed actions: 
ZimbraServer_monitor_0 (node=zhatest-01.domain.com, call=7, rc=5, status=complete): not installed 

Long description: 
I expect the system to try to start resources in the following order: 
MySystem ZimbraDataClone:Master MyZimbra 
which, after expanding the group members, is: 
ClusterOVHFailover ClusterIP ClusterHostRoute \ 
ClusterDefaultRoute ZimbraDataClone:Master \ 
ZimbraFS ZimbraServer 
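In case it helps to reason about, the single colocation/order pair above could also be expressed as separate two-party constraints, in the style of the DRBD chapter of Clusters from Scratch. The constraint names below are made up and this sketch is untested:

```
colocation fs_on_drbd inf: MyZimbra ZimbraDataClone:Master
order promote_after_system inf: MySystem ZimbraDataClone:promote
order fs_after_drbd inf: ZimbraDataClone:promote MyZimbra:start
```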

Given that crm_mon -o shows the operation history (see the log above), it seems that Pacemaker insists on starting ZimbraData first, and I don't want that. 
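One way to inspect how the policy engine actually expands these constraints (a suggestion on my part, not something from the original setup) is to dump the computed transition against the live CIB:

```shell
# Show the actions the policy engine would schedule, with allocation
# scores, based on the live cluster state. On Pacemaker 1.1.6 the tool
# is ptest; newer builds ship the equivalent crm_simulate instead.
ptest -sL -VVV
```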

So, that's it. Am I missing something? If you need more logs, don't hesitate to ask for them. 
Thank you! 

Other questions 
Where is the probe operation that appears in the crm_mon output documented? 

P.S.: This unanswered email is very similar to my issue: http://lists.linux-ha.org/pipermail/linux-ha/2011-May/043144.html 


Adrián Gibanel 
I.T. Manager 

+34 675 683 301 

