[Pacemaker] failover problem with pacemaker & drbd

diego.remolina at physics.gatech.edu
Sat Aug 15 07:51:46 EDT 2009

I noticed that you are using a non-cluster file system, ext3, so you should be using a master/slave resource for drbd, not a simple primitive (unless you are starting drbd with the system init scripts, which may not be the best thing to do).

Please look at my previous post to the list, "Master/Slave resource cannot start", which has a working configuration for drbd using two drbd resources, nfs and samba with pingd.


Please note that I am using drbd-8.3.2, which includes a new resource agent under linbit:drbd.
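
For reference, a master/slave setup with the linbit agent could look roughly like the sketch below. It is only an outline in crm shell syntax, not a tested configuration: the constraint and ms names (ms_drbd_credit, col_fs_on_drbd, ord_fs_after_drbd) and the monitor intervals are assumptions, while drbd_resource="credit" and res_filesystem_credit are taken from the quoted configuration.

```
# Assumed names and intervals; adapt before use.
primitive res_drbd_credit ocf:linbit:drbd \
        params drbd_resource="credit" \
        op monitor interval="15s" role="Master" \
        op monitor interval="30s" role="Slave"
ms ms_drbd_credit res_drbd_credit \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
# Mount the ext3 file system only where drbd is Master, and only after promotion.
colocation col_fs_on_drbd inf: res_filesystem_credit ms_drbd_credit:Master
order ord_fs_after_drbd inf: ms_drbd_credit:promote res_filesystem_credit:start
```

With a layout like this, drbd should not also be started from the system init scripts, and the drbd primitive stays out of the group because it is wrapped in the ms resource.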

The drbd documentation has a decent example, which is only missing the pingd part of the configuration.
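
The pingd part could be sketched along these lines. Again this is an assumption-laden outline rather than a verified configuration: the clone and constraint names (cl_pingd, loc_credit_connected), the multiplier and the intervals are made up for illustration, and the host_list value is a placeholder documentation address that must be replaced with a real ping target such as the default gateway.

```
# host_list is a placeholder; use your gateway or another reliable host.
primitive res_pingd ocf:pacemaker:pingd \
        params host_list="192.0.2.1" multiplier="100" \
        op monitor interval="15s" timeout="20s"
# Run pingd on every node so each node publishes its own connectivity score.
clone cl_pingd res_pingd \
        meta globally-unique="false"
# Keep the group off any node that cannot reach the ping targets.
location loc_credit_connected grp_credit \
        rule -inf: not_defined pingd or pingd lte 0
```

In this form pingd runs as a clone on both nodes rather than as a member of the resource group, so the node attribute it sets is available on the standby node as well.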




----- "Gerry kernan" <gerry.kernan at infinityit.ie> wrote: 


I have set up 2 servers so that I can replicate a filesystem between them using drbd. I configured drbd, filesystem, IPaddress and pingd resources; I also have an lsb resource to start icobol.

I can stop & start the resource group & migrate it between servers using the pacemaker GUI. But if I power down one of the servers or take it off the network, the resource group doesn't fail over to the other node.

Hopefully someone can point out to me where I have made a mistake or not configured something.

Pacemaker config, drbd.conf & openais.conf are below.

node host1.localdomain
node host2.localdomain \
        attributes standby="false"
primitive res_drbd_credit heartbeat:drbddisk \
        operations $id="res_drbd_credit-operations" \
        op monitor interval="15" timeout="15" start-delay="15" \
        params 1="credit" \
        meta $id="res_drbd_credit-meta_attributes"
primitive res_filesystem_credit ocf:heartbeat:Filesystem \
        meta $id="res_filesystem_credit-meta_attributes" \
        operations $id="res_filesystem_credit-operations" \
        op monitor interval="20" timeout="40" start-delay="10" \
        params device="/dev/drbd0" directory="/credit" fstype="ext3"
primitive res_icobol_credit lsb:icobol \
        meta is-managed="true" \
        operations $id="res_icobol_credit-operations" \
        op monitor interval="15" timeout="15" start-delay="15"
primitive res_ip_credit ocf:heartbeat:IPaddr2 \
        meta $id="res_ip_credit-meta_attributes" \
        operations $id="res_ip_credit-operations" \
        op monitor interval="10s" timeout="20s" start-delay="5s" \
        params ip="" cidr_netmask=""
primitive res_pingd ocf:pacemaker:pingd \
        operations $id="res_pingd-operations" \
        op monitor interval="10" timeout="20" start-delay="1m" \
        params host_list=""
group grp_credit res_drbd_credit res_filesystem_credit res_ip_credit res_icobol_credit res_pingd \
        meta target-role="started"
location cli-prefer-grp_credit grp_credit \
        rule $id="cli-prefer-rule-grp_credit" inf: #uname eq host2.localdomain
location cli-prefer-res_icobol_credit res_icobol_credit \
        rule $id="cli-prefer-rule-res_icobol_credit" inf: #uname eq host1.localdomain
location cli-standby-grp_credit grp_credit \
        rule $id="cli-standby-rule-grp_credit" -inf: #uname eq host1.localdomain
colocation loc_grp_credit inf: res_filesystem_credit res_drbd_credit
colocation loc_icobol inf: res_icobol_credit res_ip_credit
colocation loc_ip inf: res_ip_credit res_filesystem_credit
property $id="cib-bootstrap-options" \
        dc-version="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        last-lrm-refresh="1250158583" \
        node-health-red="0" \
        stonith-enabled="false" \
        default-resource-stickiness="200" \
        no-quorum-policy="ignore" \


[root@host1 ~]# cat /etc/drbd.conf


# please have a look at the example configuration file in
# /usr/share/doc/packages/drbd/drbd.conf

global {
  usage-count yes;
}

common {
  protocol C;
}

resource credit {
  device /dev/drbd0;
  meta-disk internal;
  disk /dev/cciss/c0d0p5;

  on host1.localdomain {

  }

  on host2.localdomain {

  }

  handlers {
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
  }
}



# Please read the openais.conf.5 manual page

aisexec {
        # Run as root - this is necessary to be able to manage resources with Pa
        user: root
        group: root
}

service {
        # Load the Pacemaker Cluster Resource Manager
        ver: 0
        name: pacemaker
        use_mgmtd: yes
        use_logd: yes
}

totem {
        version: 2
        # How long before declaring a token lost (ms)
        token: 5000
        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10
        # How long to wait for join messages in the membership protocol (ms)
        join: 1000
        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus: 2500
        # Turn off the virtual synchrony filter
        vsftype: none
        # Number of messages that may be sent by one processor on receipt of the token
        max_messages: 20
        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45
        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
        # Disable encryption
        secauth: on
        # How many threads to use for encryption/decryption
        threads: 0
        # Optionally assign a fixed node id (integer)
        # nodeid: 1234
        rrp_mode: active

        interface {
                ringnumber: 0

                mcastport: 5405
        }

        interface {
                ringnumber: 1

                mcastport: 5405
        }
}

logging {
        debug: on
        fileline: off
        to_syslog: yes
        to_stderr: off
        syslog_facility: daemon
        timestamp: on
}

amf {
        mode: disabled
}


Best regards, 

Gerry kernan 

Infinity Integration technology 

Suite 17 The mall Beacon Court 


Dublin 18 


P. +35312930090 

F. +35312930137 



Diego Julian Remolina 
System Administrator - Systems Support Specialist IV 
School of Physics 
Georgia Institute of Technology 
Phone: (404) 385-3499 