[ClusterLabs] dlm reason for leaving the cluster changes when stopping gfs2-utils service
wferi at niif.hu
Wed Mar 23 13:33:20 EDT 2016
(Please post only to the list, or at least keep it amongst the Cc-s.)
Momcilo Medic <fedorauser at fedoraproject.org> writes:
> On Wed, Mar 23, 2016 at 1:56 PM, Ferenc Wágner <wferi at niif.hu> wrote:
>> Momcilo Medic <fedorauser at fedoraproject.org> writes:
>>> I have three hosts set up in my test environment.
>>> They each have two connections to the SAN which has GFS2 on it.
>>> Everything works like a charm, except when I reboot a host.
>>> Once it tries to stop gfs2-utils service it will just hang.
>> Are you sure the OS reboot sequence does not stop the network or
>> corosync before GFS and DLM?
> I specifically configured services to start in this order:
> Corosync - DLM - GFS2-utils
> and to shut down in this order:
> GFS2-utils - DLM - Corosync.
> I've accomplished this with:
> update-rc.d -f corosync remove
> update-rc.d -f corosync-notifyd remove
> update-rc.d -f dlm remove
> update-rc.d -f gfs2-utils remove
> update-rc.d -f xendomains remove
> update-rc.d corosync start 25 2 3 4 5 . stop 35 0 1 6 .
> update-rc.d corosync-notifyd start 25 2 3 4 5 . stop 35 0 1 6 .
> update-rc.d dlm start 30 2 3 4 5 . stop 30 0 1 6 .
> update-rc.d gfs2-utils start 35 2 3 4 5 . stop 25 0 1 6 .
> update-rc.d xendomains start 40 2 3 4 5 . stop 20 0 1 6 .
I don't know your OS, the above may or may not work.
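One way to check is to look at the stop links that actually ended up in
the runlevel directory. A minimal sketch, assuming a SysV-style layout
where /etc/rc6.d holds the reboot links and stop links are named
K<number><service>; stop_order is a hypothetical helper name, not part
of any package:

```shell
# Print the stop (K) links for the services in question, in the order
# the runlevel directory would execute them (lexical order, so lower
# numbers stop first). Assumes a SysV-style rcN.d layout.
stop_order() {
    dir=${1:-/etc/rc6.d}
    LC_ALL=C ls "$dir" 2>/dev/null |
        grep -E '^K[0-9]+(corosync(-notifyd)?|dlm|gfs2-utils|xendomains)$'
}

# Runlevel 6 is reboot; repeat with /etc/rc0.d for halt.
stop_order /etc/rc6.d || echo "no matching K links found in /etc/rc6.d"
```

If gfs2-utils does not come before dlm, and dlm before corosync, in
that listing, the numbers given to update-rc.d did not take effect.
That may happen on systems with dependency-based boot ordering, where
the LSB headers in the init scripts can take precedence.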
> Also, the moment I was capturing logs, corosync and dlm were not
> running as services, but in foreground debugging mode.
> SSH connection did not break until I powered down the host so network
> is not stopped either.
At least you've got interactive debugging ability then. So try to find
out why the Corosync membership broke down. The output of
corosync-quorumtool and corosync-cpgtool might help. Also try pinging
the Corosync ring0 addresses between the nodes.
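The checks above could be run roughly like this on each node while the
shutdown hangs. This is only a sketch: check_ring0 is a hypothetical
helper, and the addresses in the last comment are placeholders, so use
the ring0 addresses from your own corosync.conf.

```shell
# Show quorum and membership state, if the tool is installed.
if command -v corosync-quorumtool >/dev/null 2>&1; then
    corosync-quorumtool -s || echo "corosync-quorumtool exited with status $?"
else
    echo "corosync-quorumtool not found in PATH"
fi

# List the CPG groups and their members, if the tool is installed.
if command -v corosync-cpgtool >/dev/null 2>&1; then
    corosync-cpgtool || echo "corosync-cpgtool exited with status $?"
else
    echo "corosync-cpgtool not found in PATH"
fi

# Ping every address given as an argument and report one line per
# address. Returns non-zero if any address did not answer.
check_ring0() {
    rc=0
    for addr in "$@"; do
        if ping -c 3 "$addr" >/dev/null 2>&1; then
            echo "ok $addr"
        else
            echo "FAIL $addr"
            rc=1
        fi
    done
    return $rc
}

# Placeholder addresses, replace with the real ring0 addresses:
# check_ring0 192.0.2.11 192.0.2.12 192.0.2.13
```

Comparing the output from all three nodes should show which node
dropped out of the membership and whether plain IP connectivity on
ring0 is still there at that point.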