No subject


Sun Apr 3 02:52:37 EDT 2011


and have many things to do, like check_fs_done, check_fencing_done ...
The key point here is that dlm needs to wait until the fencing is
really done before it proceeds. If we use a cluster filesystem here,
like ocfs2, it also needs the fencing to be really done. I believe that
in the normal case, pacemaker will fence nodeA and then everything
should be OK.

However, it is possible that pacemaker won't fence nodeA. Say nodeA is
the original DC of the cluster. When nodeA goes down, the cluster
should elect a new DC. But if the time window before membership change
2 -> 3 is too small, nodeA is up again in time to take part in the
election too. If nodeA is then elected DC again, it won't fence
itself.
Andrew, correct me if my understanding of pacemaker is wrong ;)

So I think a membership change should be treated like a transaction in
the database or filesystem world. That is, for the membership change
1 -> 2, everything should be done (e.g. fencing nodeA), no matter
whether the following change 2 -> 3 happens or not. ocfs2 and dlm
should not be able to see any difference between a node that magically
disappears and reappears and a node that goes down normally and then
comes back up. In both cases, all they can do is wait for the fencing
to be done.
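
To make the idea concrete, here is a minimal sketch in Python. It is
only an illustration of the ordering rule, not how pacemaker or dlm
are implemented. The names Change, MembershipLog, node_left,
node_joined, fencing_done, commit_next and safe_to_proceed are all
made up for this example. Changes are committed strictly in order,
and a "leave" cannot commit until its fencing is confirmed, so a quick
rejoin cannot skip the fencing.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Change:
    """One membership change, handled like a transaction (hypothetical)."""
    kind: str             # "leave" or "join"
    node: str
    fenced: bool = False  # only meaningful for "leave"


class MembershipLog:
    """Hypothetical model: changes commit in order, a leave needs fencing."""

    def __init__(self, members):
        self.members = set(members)
        self.pending = deque()  # changes not yet committed, oldest first

    def node_left(self, node):
        # Change 1 -> 2: the departure carries a fencing obligation.
        self.pending.append(Change("leave", node))

    def node_joined(self, node):
        # Change 2 -> 3: queued behind the leave, it does not cancel it.
        self.pending.append(Change("join", node))

    def fencing_done(self, node):
        # Mark the oldest unfenced leave of this node as fenced.
        for change in self.pending:
            if change.kind == "leave" and change.node == node and not change.fenced:
                change.fenced = True
                return

    def commit_next(self):
        """Commit the oldest pending change; return False if blocked or empty."""
        if not self.pending:
            return False
        head = self.pending[0]
        if head.kind == "leave" and not head.fenced:
            return False  # blocked until fencing is really done
        self.pending.popleft()
        if head.kind == "leave":
            self.members.discard(head.node)
        else:
            self.members.add(head.node)
        return True

    def safe_to_proceed(self):
        # What dlm or ocfs2 would wait on in this model.
        return not self.pending


log = MembershipLog({"nodeA", "nodeB", "nodeC"})
log.node_left("nodeA")     # nodeA disappears ...
log.node_joined("nodeA")   # ... and reappears almost immediately
blocked = not log.commit_next()  # the leave is still waiting for fencing
```

In this model the consumers only ever ask safe_to_proceed(), so they
cannot tell a fast disappear/reappear from a normal down and up. In
both cases they wait until fencing_done("nodeA") has been reported and
the queued changes have been committed.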

Any comments? thoughts?

Thanks,
Jiaju


