[ClusterLabs] Pacemaker failover failure

Tue Jul 14 23:23:43 EDT 2015

On Wed, 15 Jul 2015 00:31:45 +0100
alex austin <alexixalex at gmail.com> wrote:

> Unfortunately I have nothing yet ...
> 
> There's something I don't quite understand though. What's the role of
> stonith if the other machine crashes unexpectedly and totally unclean? Is
> it to reboot the machine and recreate the cluster, thus making the drbd
> volume available again? or is it other?
> 

The role of stonith is to guarantee a known "offline" state for an
unclean node, so that the other nodes can safely start the resources
that were previously active on it. Without stonith, the remaining nodes
cannot tell whether the unclean node has really stopped working or
whether only the communication channel between the nodes has failed.
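
To illustrate the decision described above, here is a minimal Python
sketch. It is not Pacemaker code; the names NodeState,
may_recover_resources, handle_lost_peer and the fence_peer callable are
all hypothetical and exist only for this example. It shows that a silent
peer stays "unclean" until fencing is confirmed, and that resources are
taken over only once the peer is known to be offline.

```python
from enum import Enum


class NodeState(Enum):
    ONLINE = "online"    # peer answers on the cluster network
    UNCLEAN = "unclean"  # peer stopped answering; its real state is unknown
    OFFLINE = "offline"  # peer is confirmed down (fenced or cleanly stopped)


def may_recover_resources(peer_state: NodeState) -> bool:
    """Return True only when it is safe to start the peer's resources here."""
    # Silence alone proves nothing: the peer may be dead, or only the
    # link between the nodes may be broken.
    return peer_state is NodeState.OFFLINE


def handle_lost_peer(fence_peer) -> NodeState:
    """Try to fence a peer that stopped responding.

    fence_peer is a hypothetical callable that powers the peer off and
    returns True once the power-off has been confirmed.
    """
    state = NodeState.UNCLEAN
    if fence_peer():
        # Only a confirmed fencing action turns "unclean" into "offline".
        state = NodeState.OFFLINE
    return state
```

With a fencing action that succeeds, handle_lost_peer returns
NodeState.OFFLINE and recovery is allowed; if fencing fails, the peer
stays UNCLEAN and the survivor must keep waiting.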

> The way I see it, even if stonith is functional the other node's drbd
> filesystem will not be accessible until the crashed node is back up, is this
> correct?
> 

No, that is something in your configuration. Unfortunately I do not
have experience specifically with DRBD, so I cannot comment on the
details. But in general, once the other node knows that the crashed
node is definitely down, it should allow full read-write access to the
local DRBD copy.

I'm not sure what "other node's drbd filesystem" means here, though ...
A DRBD device is replicated by definition and does not belong to one
node or the other. It has two replicas, and each replica is physically
owned by one node; the nodes coordinate, and each one replicates updates
of its local copy to the copy on its partner.
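
As a rough picture of that model, here is a toy Python sketch. It is not
how DRBD is implemented; Replica and ReplicatedVolume are hypothetical
names made up for this example, and real replication involves network
protocols and failure handling that are left out here.

```python
class Replica:
    """One copy of the data, physically owned by a single node."""

    def __init__(self, owner: str):
        self.owner = owner
        self.blocks = {}  # block number -> data


class ReplicatedVolume:
    """A volume as seen from one node: a local replica plus the partner's."""

    def __init__(self, local: Replica, peer: Replica):
        self.local = local
        self.peer = peer

    def write(self, block_no: int, data: bytes) -> None:
        # Update the local copy, then replicate the same update to the
        # copy on the partner node.
        self.local.blocks[block_no] = data
        self.peer.blocks[block_no] = data


node_a = Replica("node-a")
node_b = Replica("node-b")
volume = ReplicatedVolume(local=node_a, peer=node_b)
volume.write(0, b"hello")
```

After the write, both replicas hold the same block, so neither node has
"its own" filesystem; each just holds one copy of the same data.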