[Pacemaker] Resource starting problem

Christian Roessner c at g33k5.de
Wed Jun 15 06:46:57 EDT 2011


Hi,

this is my first post to this list. I hope this is the right
mailing list for my question.

I have installed Pacemaker/Corosync on two Ubuntu Lucid servers to build
a two-node cluster. This cluster is to become a router for a datacenter.
I installed the packages provided by the distribution, version 1.0.8
according to the dc-version property shown below.

The cluster is set up and mostly works. I say mostly because sometimes
one of the resources does not start, and the logs report this only as an
unknown error. The failure also appears to be random, like rolling
dice. But first of all, here is my crm config:

node bgwnode1 \
	attributes standby="off"
node bgwnode2 \
	attributes standby="off"
primitive resIPdatacenter ocf:heartbeat:IPaddr2 \
	meta migration-threshold="3" \
	op monitor interval="10s" timeout="20s" \
	params ip="10.0.0.1" nic="eth3" cidr_netmask="8"
primitive resIPoffice ocf:heartbeat:IPaddr2 \
	meta migration-threshold="3" \
	op monitor interval="10s" timeout="20s" \
	params ip="192.168.20.1" nic="eth3" cidr_netmask="24"
primitive resIPsubnet1 ocf:heartbeat:IPaddr2 \
	meta migration-threshold="3" \
	op monitor interval="10s" timeout="20s" \
	params ip="213.252.188.1" nic="eth3" cidr_netmask="25"
primitive resIPtransfer ocf:heartbeat:IPaddr2 \
	meta migration-threshold="3" \
	op monitor interval="10s" timeout="20s" \
	params ip="212.68.95.210" nic="eth2" cidr_netmask="30"
primitive resPing ocf:heartbeat:pingd \
	params host_list="172.16.1.1 172.16.1.2" dampen="5s" multiplier="100"
primitive resRouteWANbcc ocf:heartbeat:Route \
	meta migration-threshold="3" \
	op monitor interval="10s" timeout="20s" \
	params destination="0.0.0.0/0" device="eth2" gateway="212.68.95.209"
primitive resSysInfo ocf:heartbeat:SysInfo \
	op monitor interval="10s"
clone clonePing resPing
clone cloneSysInfo resSysInfo
location locNetServices resIPtransfer \
	rule $id="locNetServices-rule" pingd: defined pingd
xml <rsc_colocation id="totalColoc" score="INFINITY"> \
	<resource_set id="orderSetup-30bacef5" sequential="true"> \
		<resource_ref id="resIPtransfer"/> \
		<resource_ref id="resIPsubnet1"/> \
		<resource_ref id="resIPoffice"/> \
		<resource_ref id="resIPdatacenter"/> \
		<resource_ref id="resRouteWANbcc"/> \
	</resource_set> \
</rsc_colocation>
property $id="cib-bootstrap-options" \
	dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
	cluster-infrastructure="openais" \
	expected-quorum-votes="2" \
	no-quorum-policy="ignore" \
	stonith-enabled="false"
rsc_defaults $id="rsc-options" \
	resource-stickiness="INFINITY"

The resource "resRouteWANbcc" sometimes does not start and I really
don't know why. I thought that the resource_set would start the
resources one by one, and would start later resources only after the
earlier ones had started successfully. The route depends on
"resIPtransfer", which should have come up as the first resource.
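
To make explicit what I expected from the resource_set, here is a sketch
of that start order written as a separate ordering constraint. It is not
part of my running configuration. The id "ordRouteAfterTransfer" is made
up for illustration, and I am assuming here that a colocation set only
places the resources together and does not by itself imply a start order:

# hypothetical constraint, not in the configuration above
order ordRouteAfterTransfer inf: resIPtransfer resRouteWANbcc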

I also tried adding an ocf:heartbeat:Delay resource, but this did
not work.

I also suspected that the interface might take too long to come up
because of auto-negotiation and media detection, so I configured the
interfaces accordingly. This did not fix the problem either.

Unfortunately, if the default route is not highly available, then the
whole setup isn't either.

A second problem is detecting an unplugged cable. I noticed that
crm reacts to the ifconfig up/down state of an interface, so I simply
installed ifplugd to monitor the ports and bring the interfaces up and
down automatically:

ARGS="-q -p -f -u0 -d0 -w -I -m ethtool"

But this also works only sometimes. So currently I am a little bit stuck :-)

If any of you have some beginner's tips for me, I would appreciate it very much.

Thanks in advance

Christian Roessner
-- 
Roessner-Network-Solutions
Bachelor of Science Informatik
50°34.725'N, 08°40.904'O, Nahrungsberg 81, 35390 Giessen
F: +49 641 5879091, M: +49 176 93118939
USt-IdNr.: DE225643613
http://www.roessner-network-solutions.com
