[Pacemaker] Could not connect to the CIB: Remote node did not respond

Liang.Ma at asc-csa.gc.ca
Wed Feb 9 10:09:11 EST 2011


I forgot to mention that this pair of nodes worked before the outage, and I can still run "crm configure show". Here is the configuration.

node $id="bc6bf61d-6b5f-4307-85f3-bf7bb11531bb" arsvr2 \
	attributes standby="off"
node $id="bf0e7394-9684-42b9-893b-5a9a6ecddd7e" arsvr1 \
	attributes standby="off"
primitive apache2 lsb:apache2 \
	op start interval="0" timeout="60" \
	op stop interval="0" timeout="120" start-delay="15" \
	meta target-role="Started"
primitive drbd_mysql ocf:linbit:drbd \
	params drbd_resource="r0" \
	op monitor interval="15s"
primitive drbd_webfs ocf:linbit:drbd \
	params drbd_resource="r1" \
	op monitor interval="15s" \
	op start interval="0" timeout="240" \
	op stop interval="0" timeout="100"
primitive fs_mysql ocf:heartbeat:Filesystem \
	params device="/dev/drbd/by-res/r0" directory="/var/lib/mysql" fstype="ext4" \
	op start interval="0" timeout="60" \
	op stop interval="0" timeout="120" \
	meta target-role="Started"
primitive fs_webfs ocf:heartbeat:Filesystem \
	params device="/dev/drbd/by-res/r1" directory="/srv" fstype="ext4" \
	op start interval="0" timeout="60" \
	op stop interval="0" timeout="120" \
	meta target-role="Started"
primitive ip1 ocf:heartbeat:IPaddr2 \
	params ip="138.214.240.193" nic="eth0" \
	op monitor interval="5s" \
	meta target-role="Started"
primitive ip1arp ocf:heartbeat:SendArp \
	params ip="138.214.240.193" nic="eth0" \
	meta target-role="Started"
primitive mysql ocf:heartbeat:mysql \
	params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" user="mysql" group="mysql" log="/var/log/mysql.log" pid="/var/lib/mysql/mysqld.pid" datadir="/var/lib/mysql" socket="/var/run/mysqld/mysqld.sock" \
	op monitor interval="30s" timeout="30s" \
	op start interval="0" timeout="120" \
	op stop interval="0" timeout="120" \
	meta target-role="Started"
group MySQLDB fs_mysql mysql \
	meta target-role="Started"
group WebServices ip1 ip1arp fs_webfs apache2 \
	meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_webfs drbd_webfs \
	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
colocation apache2_with_ip inf: apache2 ip1 
colocation apache2_with_mysql inf: apache2 ms_drbd_mysql:Master 
colocation apache2_with_webfs inf: apache2 ms_drbd_webfs:Master 
colocation fs_on_drbd inf: fs_mysql ms_drbd_mysql:Master 
colocation ip_with_ip_arp inf: ip1 ip1arp 
colocation mysql_on_drbd inf: MySQLDB ms_drbd_mysql:Master 
colocation mysql_with_ip inf: MySQLDB ip1 
colocation webfs_on_drbd inf: fs_webfs ms_drbd_webfs:Master 
order apache2-after-arp inf: ip1arp:start apache2:start 
order arp-after-ip inf: ip1:start ip1arp:start 
order fs-mysql-after-drbd inf: ms_drbd_mysql:promote fs_mysql:start 
order fs-webfs-after-drbd inf: ms_drbd_webfs:promote fs_webfs:start 
order ip-after-mysql inf: mysql:start ip1:start 
order mysql-after-fs-mysql inf: fs_mysql:start mysql:start 
property $id="cib-bootstrap-options" \
	dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
	cluster-infrastructure="Heartbeat" \
	expected-quorum-votes="1" \
	stonith-enabled="false" \
	no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
	resource-stickiness="100"

Liang Ma
Contractuel | Consultant | SED Systems Inc. 
Ground Systems Analyst
Agence spatiale canadienne | Canadian Space Agency
6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
Courriel/E-mail : [liang.ma at space.gc.ca]
Site web/Web site : [www.space.gc.ca ] 




-----Original Message-----
From: Ma, Liang 
Sent: February 9, 2011 9:59 AM
To: 'The Pacemaker cluster resource manager'
Subject: Could not connect to the CIB: Remote node did not respond

Hi There,

After a network and power shutdown, my LAMP cluster servers were totally screwed up.

Now crm status gives me

crm status
============
Last updated: Wed Feb  9 09:44:17 2011
Stack: Heartbeat
Current DC: arsvr2 (bc6bf61d-6b5f-4307-85f3-bf7bb11531bb) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 1 expected votes
4 Resources configured.
============

Online: [ arsvr1 arsvr2 ]

None of the resources come up.

First I found a split brain on the DRBD disks. I fixed that, and the DRBD disks are now healthy. I can mount them manually without problems.

However, if I try to bring up a resource, edit the CIB, or even run a query, I get errors like the following:

crm resource start fs_mysql
Call cib_replace failed (-41): Remote node did not respond <null>

crm configure edit
Could not connect to the CIB: Remote node did not respond
ERROR: creating tmp shadow __crmshell.2540 failed


cibadmin -Q
Call cib_query failed (-41): Remote node did not respond <null>
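
For anyone collecting these failures across both nodes, here is a minimal sketch that pulls the call name and return code out of error lines like the ones above. It assumes the line format is exactly as pasted in this message; the function name parse_cib_error is hypothetical and is not part of Pacemaker or the crm shell.

```python
import re

# Matches lines shaped like the ones quoted above, for example:
#   Call cib_query failed (-41): Remote node did not respond <null>
# The trailing " <null>" is treated as optional and is dropped.
CIB_ERROR = re.compile(
    r"^Call (?P<call>\w+) failed \((?P<code>-?\d+)\): (?P<message>.*?)(?: <null>)?$"
)


def parse_cib_error(line):
    """Return (call, code, message) for a failed CIB call line, else None.

    Hypothetical helper: it only understands the line format seen in this
    message and makes no claim about other Pacemaker versions.
    """
    match = CIB_ERROR.match(line.strip())
    if match is None:
        return None
    return match.group("call"), int(match.group("code")), match.group("message")


if __name__ == "__main__":
    samples = [
        "Call cib_replace failed (-41): Remote node did not respond <null>",
        "Call cib_query failed (-41): Remote node did not respond <null>",
        "Could not connect to the CIB: Remote node did not respond",
    ]
    for sample in samples:
        print(parse_cib_error(sample))
```

Lines that do not start with "Call ... failed", such as the "Could not connect to the CIB" message, return None, so they can be counted separately.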

Any idea what I can do to bring the cluster back?

Thank you,

Liang Ma



