[Pacemaker] [Question]About the recovery procedure from the state that a node was divided.

renayama19661014 at ybb.ne.jp
Wed Nov 3 21:44:32 EDT 2010


Hi All,

We tested the procedure for recovering from a state in which the cluster was split (the nodes were divided).
(The cluster has four nodes: three nodes are active and one node is a standby.)

The case is recovery from a split into two nodes on each side, with no-quorum-policy="freeze" set.

With the freeze setting, the resources keep their current state after the split.
(We tested this with a special RA, to work around the fact that ccm in Heartbeat is slow to recognize
the split of the nodes.)
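
To illustrate why the freeze policy applies on both sides, here is a minimal sketch of the quorum arithmetic. It assumes a simple-majority quorum rule and a split that leaves two nodes on each side; the function name has_quorum is only for illustration and is not part of Pacemaker or Heartbeat.

```python
def has_quorum(partition_size, total_nodes):
    """Assumed simple-majority rule: strictly more than half of all nodes."""
    return 2 * partition_size > total_nodes


TOTAL_NODES = 4  # three active nodes plus one standby, as in our test

# A two/two split: each partition sees only two of the four nodes.
for name, size in (("side A", 2), ("side B", 2)):
    state = "has quorum" if has_quorum(size, TOTAL_NODES) else "no quorum"
    print(f"{name}: {size}/{TOTAL_NODES} nodes, {state}")
```

Under this assumption neither side holds a majority, so no-quorum-policy takes effect on both partitions.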


We confirmed several recovery patterns,
and we think that the following first method is desirable.

* The first method. (With this method, not all resources stop.)
 Step1) Stop all the nodes on one side of the split.
 Step2) Fix the problem that caused the split. (For example, replace a network card.)
 Step3) Clean "/var/lib/heartbeat/crm/".
        Clean it on the nodes that were stopped.
 Step4) Start the two nodes that were stopped.
 Step5) The cluster is rebuilt.
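
The first method can be sketched as a dry-run plan that only lists what would be done on each node and executes nothing. The init script path /etc/init.d/heartbeat, the use of "rm -f" for cleaning, and the node names node3 and node4 are assumptions for illustration; please adjust them to the real environment.

```python
HEARTBEAT_INIT = "/etc/init.d/heartbeat"  # assumed init script location
CRM_DIR = "/var/lib/heartbeat/crm"        # directory named in Step3


def first_method_plan(stopped_nodes):
    """Return (where, action) pairs for the first method; nothing is executed."""
    plan = []
    # Step1: stop every node on one side of the split.
    for node in stopped_nodes:
        plan.append((node, f"{HEARTBEAT_INIT} stop"))
    # Step2: manual repair, done by the operator.
    plan.append(("operator", "fix the cause of the split (e.g. replace a network card)"))
    # Step3: clean the CRM directory, only on the nodes that were stopped.
    for node in stopped_nodes:
        plan.append((node, f"rm -f {CRM_DIR}/*"))
    # Step4: start the stopped nodes again; Step5 (rebuild) then happens in the cluster.
    for node in stopped_nodes:
        plan.append((node, f"{HEARTBEAT_INIT} start"))
    return plan


if __name__ == "__main__":
    # node3 and node4 are hypothetical names for the side that is stopped.
    for where, action in first_method_plan(["node3", "node4"]):
        print(f"[{where}] {action}")
```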

* The second method. (However, all resources stop when we use this method.)
 Step1) Stop all four nodes.
 Step2) Fix the problem that caused the split. (For example, replace a network card.)
 Step3) Clean "/var/lib/heartbeat/crm/".
        Clean it on all nodes.
 Step4) Start all four nodes.
 Step5) Send the cib information to the cluster.
 Step6) The cluster is rebuilt.
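
For comparison, the second method as a similar dry-run plan. Using "cibadmin --replace --xml-file" for Step5 is one possible way to send the cib information, not necessarily the only one. The file name saved-cib.xml, the node names, the init script path, and the choice of the first node for Step5 are assumptions for illustration.

```python
def second_method_plan(all_nodes, saved_cib="saved-cib.xml"):
    """Return (where, action) pairs for the second method; nothing is executed.

    saved_cib is a hypothetical file that holds the cib information to load.
    """
    init = "/etc/init.d/heartbeat"  # assumed init script location
    steps = []
    # Step1: stop all nodes (all resources stop here).
    steps += [(node, f"{init} stop") for node in all_nodes]
    # Step2: manual repair, done by the operator.
    steps.append(("operator", "fix the cause of the split (e.g. replace a network card)"))
    # Step3: clean the CRM directory on every node.
    steps += [(node, "rm -f /var/lib/heartbeat/crm/*") for node in all_nodes]
    # Step4: start all nodes.
    steps += [(node, f"{init} start") for node in all_nodes]
    # Step5: send the cib information from one node (the first one, by assumption).
    steps.append((all_nodes[0], f"cibadmin --replace --xml-file {saved_cib}"))
    return steps


if __name__ == "__main__":
    for where, action in second_method_plan(["node1", "node2", "node3", "node4"]):
        print(f"[{where}] {action}")
```

In this sketch every node appears in the stop steps, which is the reason all resources stop with this method.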


We do not want to use the second method,
because all resources stop when we use it.

Is there any problem with the first method that we used?

Is there a method that the community recommends for recovering from a split with the freeze setting?

Best Regards,
Hideo Yamauchi.




