[ClusterLabs] single node fails to start the ocfs2 resource

Mon Mar 12 08:25:51 EDT 2018

Hi Muhammad!

Could you elaborate a bit more on your fencing setup?
I read that you are using SBD, but I don't see any SBD fencing resource.
If you wanted to use watchdog fencing with SBD, that would require the
stonith-watchdog-timeout property to be set.
But watchdog fencing relies on quorum (without 2-node trickery)
and thus wouldn't work on a 2-node cluster anyway.
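
For illustration, the two SBD variants mentioned above could look roughly
like this in crm configure syntax. This is only a sketch: the resource name
stonith-sbd and the 10-second timeout are assumptions, not values taken from
this cluster, and the external/sbd agent presumes that a shared SBD disk is
already set up. Normally only one of the two variants would be used, and the
watchdog-only variant still depends on real quorum, so it does not help on
two nodes.

```
# Variant 1: disk-based SBD fencing resource
# (the name "stonith-sbd" is an assumption for this example)
primitive stonith-sbd stonith:external/sbd

# Variant 2: watchdog-only SBD, no fencing resource; needs real quorum
# (10 seconds is an illustrative value)
property cib-bootstrap-options: \
            stonith-watchdog-timeout=10
```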

I didn't read through the whole thread, so I might be missing something ...

Regards,
Klaus

On 03/12/2018 12:51 PM, Muhammad Sharfuddin wrote:
> Hello Gang,
>
> as reported previously, the cluster was fixed to start the ocfs2
> resources by:
>
> a) crm resource start dlm
>
> b) mounting/unmounting the ocfs2 file system manually (this step was
> the fix)
>
> and then starting the clone group (which includes dlm and the ocfs2
> file systems) worked fine:
>
> c) crm resource start base-clone
>
> Now I crashed the nodes intentionally and then kept only one node
> online; again the cluster stopped starting the ocfs2 resources. I again
> tried to follow your instructions, i.e.
>
> i) crm resource start dlm
>
> then tried to mount the ocfs2 file system manually, which hung this
> time (previously, mounting manually helped me):
>
> # cat /proc/3966/stack
> [<ffffffffa039f18e>] do_uevent+0x7e/0x200 [dlm]
> [<ffffffffa039fe0a>] new_lockspace+0x80a/0xa70 [dlm]
> [<ffffffffa03a02d9>] dlm_new_lockspace+0x69/0x160 [dlm]
> [<ffffffffa038e758>] user_cluster_connect+0xc8/0x350 [ocfs2_stack_user]
> [<ffffffffa03c2872>] ocfs2_cluster_connect+0x192/0x240 [ocfs2_stackglue]
> [<ffffffffa045eefc>] ocfs2_dlm_init+0x31c/0x570 [ocfs2]
> [<ffffffffa04a9983>] ocfs2_fill_super+0xb33/0x1200 [ocfs2]
> [<ffffffff8120e130>] mount_bdev+0x1a0/0x1e0
> [<ffffffff8120ea1a>] mount_fs+0x3a/0x170
> [<ffffffff81228bf2>] vfs_kern_mount+0x62/0x110
> [<ffffffff8122b123>] do_mount+0x213/0xcd0
> [<ffffffff8122bed5>] SyS_mount+0x85/0xd0
> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> I killed the mount.ocfs2 process, stopped the dlm resource (crm
> resource stop dlm), and then tried to start it again (crm resource
> start dlm). Previously dlm always started successfully, but this time
> it didn't start, so I checked the dlm_controld process:
>
> cat /proc/3754/stack
> [<ffffffff8121dc55>] poll_schedule_timeout+0x45/0x60
> [<ffffffff8121f0bc>] do_sys_poll+0x38c/0x4f0
> [<ffffffff8121f2dd>] SyS_poll+0x5d/0xe0
> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> In a nutshell:
>
> 1 - this cluster is configured to run when a single node is online
>
> 2 - this cluster does not start the ocfs2 resources after a crash when
> only one node is online.
>
> -- 
> Regards,
> Muhammad Sharfuddin | +923332144823 | nds.com.pk
>
> On 3/12/2018 12:41 PM, Gang He wrote:
>>
>>
>>> Hello Gang,
>>>
>>> to follow your instructions, I started the dlm resource via:
>>>
>>>       crm resource start dlm
>>>
>>> then mounted/unmounted the ocfs2 file system manually (which seems to
>>> be the fix for the situation).
>>>
>>> Now resources are getting started properly on a single node. I am
>>> happy that the issue is fixed, but at the same time I am lost because
>>> I have no idea how things got fixed here (merely by
>>> mounting/unmounting the ocfs2 file systems).
>> From your description, I just wonder whether the DLM resource does
>> not work normally in that situation.
>> Yan/Bin, do you have any comments about two-node clusters? Which
>> configuration settings will affect corosync quorum/DLM?
>>
>>
>> Thanks
>> Gang
>>
>>
>>>
>>> -- 
>>> Regards,
>>> Muhammad Sharfuddin
>>>
>>> On 3/12/2018 10:59 AM, Gang He wrote:
>>>> Hello Muhammad,
>>>>
>>>> Usually, an ocfs2 resource startup failure is caused by the mount
>>>> command timing out (or hanging).
>>>> A sample debugging method is:
>>>> remove the ocfs2 resource from crm first,
>>>> then mount this file system manually and see whether the mount
>>>> command times out or hangs.
>>>> If the command hangs, check where the mount.ocfs2 process is stuck
>>>> via the "cat /proc/xxx/stack" command.
>>>> If the back trace stops in the DLM kernel module, the root cause is
>>>> usually a cluster configuration problem.
>>>>
>>>> Thanks
>>>> Gang
>>>>
>>>>
>>>>> On 3/12/2018 7:32 AM, Gang He wrote:
>>>>>> Hello Muhammad,
>>>>>>
>>>>>> I think this problem is not in ocfs2; the cause looks like the
>>>>>> cluster quorum is lost.
>>>>>> For a two-node cluster (unlike a three-node cluster), if one node
>>>>>> is offline, quorum will be lost by default.
>>>>>> So, you should configure the two-node related quorum settings
>>>>>> according to the pacemaker manual.
>>>>>> Then DLM can work normally, and the ocfs2 resource can start up.
>>>>> Yes, it's configured accordingly; no-quorum-policy is set to "ignore".
>>>>>
>>>>> property cib-bootstrap-options: \
>>>>>             have-watchdog=true \
>>>>>             stonith-enabled=true \
>>>>>             stonith-timeout=80 \
>>>>>             startup-fencing=true \
>>>>>             no-quorum-policy=ignore
>>>>>
>>>>>> Thanks
>>>>>> Gang
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This two-node cluster starts resources when both nodes are
>>>>>>> online but does not start the ocfs2 resources when one node is
>>>>>>> offline. E.g., if I gracefully stop the cluster resources, then
>>>>>>> stop the pacemaker service on either node, and try to start the
>>>>>>> ocfs2 resource on the online node, it fails.
>>>>>>>
>>>>>>> logs:
>>>>>>>
>>>>>>> pipci001 pengine[17732]:   notice: Start   dlm:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Start   p-fssapmnt:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Start   p-fsusrsap:0#011(pipci001)
>>>>>>> pipci001 pengine[17732]:   notice: Calculated transition 2, saving
>>>>>>> inputs in /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>> pipci001 crmd[17733]:   notice: Processing graph 2
>>>>>>> (ref=pe_calc-dc-1520613202-31) derived from
>>>>>>> /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>> crmd[17733]:   notice: Initiating start operation dlm_start_0
>>>>>>> locally on
>>>>>>> pipci001
>>>>>>> lrmd[17730]:   notice: executing - rsc:dlm action:start call_id:69
>>>>>>> dlm_controld[19019]: 4575 dlm_controld 4.0.7 started
>>>>>>> lrmd[17730]:   notice: finished - rsc:dlm action:start call_id:69
>>>>>>> pid:18999 exit-code:0 exec-time:1082ms queue-time:1ms
>>>>>>> crmd[17733]:   notice: Result of start operation for dlm on
>>>>>>> pipci001: 0 (ok)
>>>>>>> crmd[17733]:   notice: Initiating monitor operation
>>>>>>> dlm_monitor_60000
>>>>>>> locally on pipci001
>>>>>>> crmd[17733]:   notice: Initiating start operation
>>>>>>> p-fssapmnt_start_0
>>>>>>> locally on pipci001
>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:start
>>>>>>> call_id:71
>>>>>>> Filesystem(p-fssapmnt)[19052]: INFO: Running start for
>>>>>>> /dev/mapper/sapmnt on /sapmnt
>>>>>>> kernel: [ 4576.529938] dlm: Using TCP for communications
>>>>>>> kernel: [ 4576.530233] dlm: BFA9FF042AA045F4822C2A6A06020EE9:
>>>>>>> joining
>>>>>>> the lockspace group.
>>>>>>> dlm_controld[19019]: 4629 fence work wait for quorum
>>>>>>> dlm_controld[19019]: 4634 BFA9FF042AA045F4822C2A6A06020EE9 wait
>>>>>>> for quorum
>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0 process (PID 19052)
>>>>>>> timed out
>>>>>>> kernel: [ 4636.418223] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group
>>>>>>> event done -512 0
>>>>>>> kernel: [ 4636.418227] dlm: BFA9FF042AA045F4822C2A6A06020EE9:
>>>>>>> group join
>>>>>>> failed -512 0
>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0:19052 - timed out
>>>>>>> after 60000ms
>>>>>>> lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:start
>>>>>>> call_id:71
>>>>>>> pid:19052 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>>>> kernel: [ 4636.420628] ocfs2: Unmounting device (254,1) on (node 0)
>>>>>>> crmd[17733]:    error: Result of start operation for p-fssapmnt on
>>>>>>> pipci001: Timed Out
>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on
>>>>>>> pipci001 failed
>>>>>>> (target: 0 vs. rc: 1): Error
>>>>>>> crmd[17733]:   notice: Transition aborted by operation
>>>>>>> p-fssapmnt_start_0 'modify' on pipci001: Event failed
>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on
>>>>>>> pipci001 failed
>>>>>>> (target: 0 vs. rc: 1): Error
>>>>>>> crmd[17733]:   notice: Transition 2 (Complete=5, Pending=0,
>>>>>>> Fired=0,
>>>>>>> Skipped=0, Incomplete=6,
>>>>>>> Source=/var/lib/pacemaker/pengine/pe-input-339.bz2): Complete
>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if
>>>>>>> fencing is
>>>>>>> required
>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>> p-fssapmnt:0 on
>>>>>>> pipci001: unknown error (1)
>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>> p-fssapmnt:0 on
>>>>>>> pipci001: unknown error (1)
>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>> after
>>>>>>> 1000000 failures (max=2)
>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>> after
>>>>>>> 1000000 failures (max=2)
>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Calculated transition 3, saving inputs in
>>>>>>> /var/lib/pacemaker/pengine/pe-input-340.bz2
>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if
>>>>>>> fencing is
>>>>>>> required
>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>> p-fssapmnt:0 on
>>>>>>> pipci001: unknown error (1)
>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>> p-fssapmnt:0 on
>>>>>>> pipci001: unknown error (1)
>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>> after
>>>>>>> 1000000 failures (max=2)
>>>>>>> pipci001 pengine[17732]:  warning: Forcing base-clone away from
>>>>>>> pipci001
>>>>>>> after 1000000 failures (max=2)
>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>>> pengine[17732]:   notice: Calculated transition 4, saving inputs in
>>>>>>> /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>> crmd[17733]:   notice: Processing graph 4
>>>>>>> (ref=pe_calc-dc-1520613263-36)
>>>>>>> derived from /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>> crmd[17733]:   notice: Initiating stop operation p-fssapmnt_stop_0
>>>>>>> locally on pipci001
>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:stop
>>>>>>> call_id:72
>>>>>>> Filesystem(p-fssapmnt)[19189]: INFO: Running stop for
>>>>>>> /dev/mapper/sapmnt
>>>>>>> on /sapmnt
>>>>>>> pipci001 lrmd[17730]:   notice: finished - rsc:p-fssapmnt
>>>>>>> action:stop
>>>>>>> call_id:72 pid:19189 exit-code:0 exec-time:83ms queue-time:0ms
>>>>>>> pipci001 crmd[17733]:   notice: Result of stop operation for
>>>>>>> p-fssapmnt
>>>>>>> on pipci001: 0 (ok)
>>>>>>> crmd[17733]:   notice: Initiating stop operation dlm_stop_0
>>>>>>> locally on
>>>>>>> pipci001
>>>>>>> pipci001 lrmd[17730]:   notice: executing - rsc:dlm action:stop
>>>>>>> call_id:74
>>>>>>> pipci001 dlm_controld[19019]: 4636 shutdown ignored, active
>>>>>>> lockspaces
>>>>>>>
>>>>>>>
>>>>>>> resource configuration:
>>>>>>>
>>>>>>> primitive p-fssapmnt Filesystem \
>>>>>>>             params device="/dev/mapper/sapmnt" directory="/sapmnt"
>>>>>>> fstype=ocfs2 \
>>>>>>>             op monitor interval=20 timeout=40 \
>>>>>>>             op start timeout=60 interval=0 \
>>>>>>>             op stop timeout=60 interval=0
>>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>>             op monitor interval=60 timeout=60 \
>>>>>>>             op start interval=0 timeout=90 \
>>>>>>>             op stop interval=0 timeout=100
>>>>>>> clone base-clone base-group \
>>>>>>>             meta interleave=true target-role=Started
>>>>>>>
>>>>>>> cluster properties:
>>>>>>> property cib-bootstrap-options: \
>>>>>>>             have-watchdog=true \
>>>>>>>             stonith-enabled=true \
>>>>>>>             stonith-timeout=80 \
>>>>>>>             startup-fencing=true \
>>>>>>>
>>>>>>>
>>>>>>> Software versions:
>>>>>>>
>>>>>>> kernel version: 4.4.114-94.11-default
>>>>>>> pacemaker-1.1.16-4.8.x86_64
>>>>>>> corosync-2.3.6-9.5.1.x86_64
>>>>>>> ocfs2-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>> ocfs2-tools-1.8.5-1.35.x86_64
>>>>>>> dlm-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>> libdlm3-4.0.7-1.28.x86_64
>>>>>>> libdlm-4.0.7-1.28.x86_64
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Regards,
>>>>>>> Muhammad Sharfuddin
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list: Users at clusterlabs.org
>>>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>>>>
>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>> Getting started:
>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>> Bugs: http://bugs.clusterlabs.org
>>>>> -- 
>>>>> Regards,
>>>>> Muhammad Sharfuddin