[Pacemaker] Corosync starts after system reboot but fails to load/start any resources

Parshvi parshvi.17 at gmail.com
Mon May 28 08:18:00 EDT 2012


Hi,

I have set up a two-node cluster (Node-1 and Node-2) with STONITH disabled and
OCFS2 as the file system (running in a separate cluster).

Use case:
1) One resource runs in Master/Slave mode with CIP1.
2) 5 resources run in Active/Passive mode with CIP2, preferred node being
   Node-1.
   These resources are non-sticky (resource stickiness = 0).
3) 5 resources run in Active/Passive mode with CIP3, preferred node being
   Node-2.
   These resources are non-sticky (resource stickiness = 0).
4) There are a few more resources running in Active/Passive mode with
   stickiness = 1, preferred node being Node-1.

Test case:
Node-2 (running as primary) is rebooted.
Expected result (while Node-2 is offline):
   All Active/Passive resources running on Node-2 fail over to Node-1
   The slave instance of the M/S resource is promoted to Master on Node-1
When Node-2 is up after reboot:
   The 5 non-sticky resources should fail back to Node-2
   A slave instance must start on Node-2
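
The expected fail-over and fail-back behaviour above can be sketched as a toy
model. This is only an illustration, not Pacemaker's policy engine: the
function name `place` and the preference score of 100 are assumptions, since
the actual location scores are not stated in this post.

```python
# Toy model only: NOT Pacemaker's policy engine. The preference score of 100
# is an assumed value; the actual location scores are not given in the post.
PREFERENCE_SCORE = 100


def place(preferred, stickiness, current, online):
    """Return the node a resource should run on, or None if no node is online."""

    def score(node):
        s = PREFERENCE_SCORE if node == preferred else 0
        if node == current:
            s += stickiness  # bonus for staying where the resource already runs
        return s

    candidates = sorted(online)
    if not candidates:
        return None
    return max(candidates, key=score)


# A resource from item 3: prefers Node-2, stickiness 0, initially on Node-2.
during_outage = place("Node-2", 0, "Node-2", {"Node-1"})
after_reboot = place("Node-2", 0, during_outage, {"Node-1", "Node-2"})
print(during_outage, after_reboot)
```

With stickiness 0 the location preference alone decides placement, which is
why a fail-back to Node-2 is expected once it rejoins the cluster.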

Observations:
-> When Node-2 is up after reboot, the following issues are observed:

1) NONE of the resources start on Node-2: the 5 non-sticky resources do not
fail back to Node-2.
2) The slave instance is not started on Node-2.

The system was rebooted at 8:38 a.m.
A restart of the corosync engine was initiated at 9:50 a.m., which failed to
fix the issue.
At 10:24 a.m. the system was rebooted again. This time the resources started
normally.

crm_mon on Node-1: shows Node-2 as offline.
crm_mon on Node-2: shows Node-1 as online.

An hb_report could not be captured, although I have logs (corosync logs +
syslogs) and pe-input files. Where can I publish them? pastebin seems to be
blocked.




