[ClusterLabs] dlm reason for leaving the cluster changes when stopping gfs2-utils service

Ferenc Wágner wferi at niif.hu
Wed Mar 23 17:33:20 UTC 2016


(Please post only to the list, or at least keep it amongst the Cc-s.)

Momcilo Medic <fedorauser at fedoraproject.org> writes:

> On Wed, Mar 23, 2016 at 1:56 PM, Ferenc Wágner <wferi at niif.hu> wrote:
>> Momcilo Medic <fedorauser at fedoraproject.org> writes:
>>
>>> I have three hosts set up in my test environment.
>>> They each have two connections to the SAN which has GFS2 on it.
>>>
>>> Everything works like a charm, except when I reboot a host.
>>> Once it tries to stop gfs2-utils service it will just hang.
>>
>> Are you sure the OS reboot sequence does not stop the network or
>> corosync before GFS and DLM?
>
> I specifically configured services to start in this order:
> Corosync - DLM - GFS2-utils
> and to shutdown in this order:
> GFS2-utils - DLM - Corosync.
>
> I've accomplished this with:
>  update-rc.d -f corosync remove
>  update-rc.d -f corosync-notifyd remove
>  update-rc.d -f dlm remove
>  update-rc.d -f gfs2-utils remove
>  update-rc.d -f xendomains remove
>  update-rc.d corosync start 25 2 3 4 5 . stop 35 0 1 6 .
>  update-rc.d corosync-notifyd start 25 2 3 4 5 . stop 35 0 1 6 .
>  update-rc.d dlm start 30 2 3 4 5 . stop 30 0 1 6 .
>  update-rc.d gfs2-utils start 35 2 3 4 5 . stop 25 0 1 6 .
>  update-rc.d xendomains start 40 2 3 4 5 . stop 20 0 1 6 .

I don't know your OS, so the above may or may not work.
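
One way to check whether the requested stop order actually took effect is
to list the stop links for the reboot runlevel.  A minimal sketch, assuming
a SysV-style layout where /etc/rc6.d holds KNN<service> symlinks that are
run in ascending NN order; the helper name stop_order is made up for
illustration:

```shell
#!/bin/sh
# Print the stop (K) links of the cluster-related services in the order a
# SysV-style rc would run them.
# Assumption: links are named KNN<service>; the directory defaults to
# /etc/rc6.d (reboot).
# Usage: stop_order /etc/rc6.d
stop_order() {
    dir="${1:-/etc/rc6.d}"
    # Glob results come back sorted, so K25... precedes K30... and K35...
    for link in "$dir"/K[0-9][0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name="${link##*/}"
        # Strip the KNN prefix and keep only the services of interest.
        case "${name#K[0-9][0-9]}" in
            corosync|corosync-notifyd|dlm|gfs2-utils|xendomains)
                echo "$name" ;;
        esac
    done
}
```

If the numbers above were honoured, gfs2-utils should be listed before dlm,
and dlm before corosync.  If the listing shows something else, the init
system probably computed the order by other means.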

> Also, the moment I was capturing logs, corosync and dlm were not
> running as services, but in foreground debugging mode.
> SSH connection did not break until I powered down the host so network
> is not stopped either.

At least you've got interactive debugging ability then.  So try to find
out why the Corosync membership broke down.  The output of
corosync-quorumtool and corosync-cpgtool might help.  Also try pinging
the Corosync ring0 addresses between the nodes.
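
A rough sketch of those checks, to be run on each node while the
gfs2-utils stop is hanging.  The ping_ring0 helper and the addresses in the
comments are placeholders (I don't know your ring0 addresses), and the ping
options assume Linux iputils:

```shell
#!/bin/sh
# Ping each given ring0 address once and report which peers answer.
# PING can be overridden, so the loop can be tried without a network.
ping_ring0() {
    for addr in "$@"; do
        if ${PING:-ping -c 1 -W 1} "$addr" >/dev/null 2>&1; then
            echo "$addr reachable"
        else
            echo "$addr UNREACHABLE"
        fi
    done
}

# Membership as Corosync sees it:
#   corosync-quorumtool -s    # quorum status and member list
#   corosync-cpgtool          # CPG groups and their members
# Connectivity between the nodes (placeholder addresses):
#   ping_ring0 192.0.2.11 192.0.2.12 192.0.2.13
```

Comparing the output from all three nodes should show whether they still
agree on the membership at the moment of the hang.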
-- 
Feri



