[ClusterLabs] Recovering a failed (but running) server in rgmanager

Digimer lists at alteeve.ca
Sun Sep 18 15:37:26 EDT 2016

Hi all,

  If, for example, a server's definition file is corrupted while the
server is running, rgmanager will put the server into a 'failed' state.
That's fine and fair.

  The problem is that, once the file is fixed, there appears to be no
way to go failed -> started without disabling (and thus powering off)
the VM. This is troublesome because it forces an interruption when the
service could have been placed back under resource management without a
reboot.

  For example, if I run 'clusvcadm -e <server>' while the service is
'disabled' (say, because the server was booted manually), rgmanager
detects that the server is already running fine and simply marks it as
'started'. Is there no way to do something similar to go 'failed' ->
'started' without the 'disable' step?

  I tried freezing the service, with no luck. I also tried convalescing
via '-c', but that didn't help either.
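
  For reference, here is a rough sketch of the commands involved. The
service name 'vm:server1' is a placeholder, not the real service, and the
helper functions are made up for illustration. The script only prints the
commands unless DRY_RUN=0 is set, so treat it as an outline of the steps
rather than a tested procedure.

```shell
#!/bin/sh
# Placeholder service name (assumption); rgmanager names VM services "vm:<name>".
SVC="${SVC:-vm:server1}"

# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 on a real cluster node to actually run the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Works today: 'disabled' -> 'started' without touching the running VM.
enable_disabled() {
    run clusvcadm -e "$SVC"
}

# Tried against the 'failed' service; neither got it back to 'started'.
tried_on_failed() {
    run clusvcadm -Z "$SVC"   # freeze
    run clusvcadm -c "$SVC"   # the '-c' option (convalesce)
}

# The path I am trying to avoid: disabling powers off the VM.
disruptive_recovery() {
    run clusvcadm -d "$SVC"
    run clusvcadm -e "$SVC"
}

enable_disabled
tried_on_failed
disruptive_recovery
```

  Running 'clustat' between steps shows which state rgmanager thinks the
service is in.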


Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
