[ClusterLabs] single node fails to start the ocfs2 resource

Mon Mar 12 07:51:20 EDT 2018

Hello Gang,

As reported earlier, the cluster was previously made to start the ocfs2 
resources by:

a) crm resource start dlm

b) mounting/unmounting the ocfs2 file system manually (this step was the fix)

after which starting the clone group (which includes dlm and the ocfs2 file 
systems) worked fine:

c) crm resource start base-clone

Now I crashed the nodes intentionally and then kept only one node online; 
again the cluster failed to start the ocfs2 resources. I tried to follow 
your instructions again, i.e.

i) crm resource start dlm

then tried to mount the ocfs2 file system manually, which hung this 
time (previously, mounting manually had helped):

# cat /proc/3966/stack
[<ffffffffa039f18e>] do_uevent+0x7e/0x200 [dlm]
[<ffffffffa039fe0a>] new_lockspace+0x80a/0xa70 [dlm]
[<ffffffffa03a02d9>] dlm_new_lockspace+0x69/0x160 [dlm]
[<ffffffffa038e758>] user_cluster_connect+0xc8/0x350 [ocfs2_stack_user]
[<ffffffffa03c2872>] ocfs2_cluster_connect+0x192/0x240 [ocfs2_stackglue]
[<ffffffffa045eefc>] ocfs2_dlm_init+0x31c/0x570 [ocfs2]
[<ffffffffa04a9983>] ocfs2_fill_super+0xb33/0x1200 [ocfs2]
[<ffffffff8120e130>] mount_bdev+0x1a0/0x1e0
[<ffffffff8120ea1a>] mount_fs+0x3a/0x170
[<ffffffff81228bf2>] vfs_kern_mount+0x62/0x110
[<ffffffff8122b123>] do_mount+0x213/0xcd0
[<ffffffff8122bed5>] SyS_mount+0x85/0xd0
[<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
[<ffffffffffffffff>] 0xffffffffffffffff
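
A minimal sketch of how a stack like the one above can be checked for 
the module in which the process is blocked. The helper name stack_modules 
is hypothetical; it relies only on the trailing [module] tag that the 
kernel prints for frames belonging to a loadable module, as seen in the 
trace above.

```shell
#!/bin/sh
# stack_modules (hypothetical helper): read /proc/<pid>/stack-style text on
# stdin and print the kernel module of each frame, top frame first,
# collapsing adjacent repeats. Frames without a trailing [module] tag
# (core kernel code) are skipped.
stack_modules() {
    sed -n 's/.*\[\([A-Za-z0-9_]*\)\][[:space:]]*$/\1/p' | uniq
}
```

For example, "stack_modules < /proc/3966/stack | head -n 1" (run as root) 
prints the module of the topmost tagged frame, which for the trace above 
is dlm.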

I killed the mount.ocfs2 process, stopped the dlm resource (crm resource 
stop dlm), and then tried to start it again (crm resource start dlm). 
Previously dlm always started successfully, but this time it did not 
start, so I checked the stack of the dlm_controld process:

# cat /proc/3754/stack
[<ffffffff8121dc55>] poll_schedule_timeout+0x45/0x60
[<ffffffff8121f0bc>] do_sys_poll+0x38c/0x4f0
[<ffffffff8121f2dd>] SyS_poll+0x5d/0xe0
[<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
[<ffffffffffffffff>] 0xffffffffffffffff

In a nutshell:

1 - this cluster is configured to run when only a single node is online

2 - after a crash, this cluster does not start the ocfs2 resources when 
only one node is online.
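
The logs quoted further down show dlm_controld reporting "wait for 
quorum" while pengine reports "On loss of CCM Quorum: Ignore", which 
suggests that dlm_controld may be looking at corosync's quorum state 
rather than at the Pacemaker no-quorum-policy property. A minimal sketch 
for checking that state on the surviving node follows; the helper name 
is_quorate is hypothetical, and the "Quorate: Yes" line format of 
"corosync-quorumtool -s" is an assumption about the installed corosync 
version.

```shell
#!/bin/sh
# is_quorate (hypothetical helper): read the output of
# `corosync-quorumtool -s` on stdin and succeed only if it contains a
# "Quorate:" line whose value starts with "Yes".
# The exact line format is an assumption, not taken from this thread.
is_quorate() {
    awk -F': *' '$1 == "Quorate" { found = ($2 ~ /^Yes/) } END { exit !found }'
}
```

Usage: "corosync-quorumtool -s | is_quorate && echo quorate || echo 'no 
quorum'". If this reports no quorum with only one node online, the 
quorum section of corosync.conf would be the next place to look.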

--
Regards,
Muhammad Sharfuddin | +923332144823 | nds.com.pk

On 3/12/2018 12:41 PM, Gang He wrote:
>
>
>> Hello Gang,
>>
>> to follow your instructions, I started the dlm resource via:
>>
>>       crm resource start dlm
>>
>> then mounted/unmounted the ocfs2 file system manually (which seems to be
>> the fix for the situation).
>>
>> Now resources start properly on a single node. I am happy that the
>> issue is fixed, but at the same time I am lost, because I have no idea
>> how things got fixed here (merely by mounting/unmounting the ocfs2 file
>> systems).
> From your description, I wonder whether the DLM resource is not working
> normally in that situation.
> Yan/Bin, do you have any comments about two-node clusters? Which
> configuration settings affect corosync quorum/DLM?
>
>
> Thanks
> Gang
>
>
>>
>> --
>> Regards,
>> Muhammad Sharfuddin
>>
>> On 3/12/2018 10:59 AM, Gang He wrote:
>>> Hello Muhammad,
>>>
>>> Usually, an ocfs2 resource startup failure is caused by the mount command
>>> timing out (or hanging).
>>> A simple debugging method is:
>>> remove the ocfs2 resource from crm first,
>>> then mount the file system manually and see whether the mount command
>>> times out or hangs.
>>> If the command hangs, check where the mount.ocfs2 process is stuck via
>>> the "cat /proc/xxx/stack" command.
>>> If the back trace stops in the DLM kernel module, the root cause is
>>> usually a cluster configuration problem.
>>>
>>> Thanks
>>> Gang
>>>
>>>
>>>> On 3/12/2018 7:32 AM, Gang He wrote:
>>>>> Hello Muhammad,
>>>>>
>>>>> I think this problem is not in ocfs2; the cause looks like the cluster
>>>>> quorum being lost.
>>>>> For a two-node cluster (unlike a three-node cluster), if one node is
>>>>> offline, quorum is lost by default.
>>>>> So, you should configure the two-node related quorum setting according
>>>>> to the pacemaker manual.
>>>>> Then DLM can work normally, and the ocfs2 resource can start up.
>>>> Yes, it is configured accordingly; no-quorum-policy is set to "ignore".
>>>>
>>>> property cib-bootstrap-options: \
>>>>             have-watchdog=true \
>>>>             stonith-enabled=true \
>>>>             stonith-timeout=80 \
>>>>             startup-fencing=true \
>>>>             no-quorum-policy=ignore
>>>>
>>>>> Thanks
>>>>> Gang
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This two-node cluster starts resources when both nodes are online but
>>>>>> does not start the ocfs2 resources when one node is offline. E.g., if I
>>>>>> gracefully stop the cluster resources, then stop the pacemaker service on
>>>>>> either node and try to start the ocfs2 resource on the online node, it
>>>>>> fails.
>>>>>>
>>>>>> logs:
>>>>>>
>>>>>> pipci001 pengine[17732]:   notice: Start   dlm:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Start   p-fssapmnt:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Start   p-fsusrsap:0#011(pipci001)
>>>>>> pipci001 pengine[17732]:   notice: Calculated transition 2, saving
>>>>>> inputs in /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>> pipci001 crmd[17733]:   notice: Processing graph 2
>>>>>> (ref=pe_calc-dc-1520613202-31) derived from
>>>>>> /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>> crmd[17733]:   notice: Initiating start operation dlm_start_0 locally on
>>>>>> pipci001
>>>>>> lrmd[17730]:   notice: executing - rsc:dlm action:start call_id:69
>>>>>> dlm_controld[19019]: 4575 dlm_controld 4.0.7 started
>>>>>> lrmd[17730]:   notice: finished - rsc:dlm action:start call_id:69
>>>>>> pid:18999 exit-code:0 exec-time:1082ms queue-time:1ms
>>>>>> crmd[17733]:   notice: Result of start operation for dlm on pipci001: 0 (ok)
>>>>>> crmd[17733]:   notice: Initiating monitor operation dlm_monitor_60000
>>>>>> locally on pipci001
>>>>>> crmd[17733]:   notice: Initiating start operation p-fssapmnt_start_0
>>>>>> locally on pipci001
>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:start call_id:71
>>>>>> Filesystem(p-fssapmnt)[19052]: INFO: Running start for
>>>>>> /dev/mapper/sapmnt on /sapmnt
>>>>>> kernel: [ 4576.529938] dlm: Using TCP for communications
>>>>>> kernel: [ 4576.530233] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining
>>>>>> the lockspace group.
>>>>>> dlm_controld[19019]: 4629 fence work wait for quorum
>>>>>> dlm_controld[19019]: 4634 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0 process (PID 19052) timed out
>>>>>> kernel: [ 4636.418223] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group
>>>>>> event done -512 0
>>>>>> kernel: [ 4636.418227] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join
>>>>>> failed -512 0
>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0:19052 - timed out after 60000ms
>>>>>> lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:start call_id:71
>>>>>> pid:19052 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>>> kernel: [ 4636.420628] ocfs2: Unmounting device (254,1) on (node 0)
>>>>>> crmd[17733]:    error: Result of start operation for p-fssapmnt on
>>>>>> pipci001: Timed Out
>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed
>>>>>> (target: 0 vs. rc: 1): Error
>>>>>> crmd[17733]:   notice: Transition aborted by operation
>>>>>> p-fssapmnt_start_0 'modify' on pipci001: Event failed
>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed
>>>>>> (target: 0 vs. rc: 1): Error
>>>>>> crmd[17733]:   notice: Transition 2 (Complete=5, Pending=0, Fired=0,
>>>>>> Skipped=0, Incomplete=6,
>>>>>> Source=/var/lib/pacemaker/pengine/pe-input-339.bz2): Complete
>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if fencing is
>>>>>> required
>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on
>>>>>> pipci001: unknown error (1)
>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on
>>>>>> pipci001: unknown error (1)
>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after
>>>>>> 1000000 failures (max=2)
>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after
>>>>>> 1000000 failures (max=2)
>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Calculated transition 3, saving inputs in
>>>>>> /var/lib/pacemaker/pengine/pe-input-340.bz2
>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if fencing is
>>>>>> required
>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on
>>>>>> pipci001: unknown error (1)
>>>>>> pengine[17732]:  warning: Processing failed op start for p-fssapmnt:0 on
>>>>>> pipci001: unknown error (1)
>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001 after
>>>>>> 1000000 failures (max=2)
>>>>>> pipci001 pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>> after 1000000 failures (max=2)
>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>> pengine[17732]:   notice: Calculated transition 4, saving inputs in
>>>>>> /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>> crmd[17733]:   notice: Processing graph 4 (ref=pe_calc-dc-1520613263-36)
>>>>>> derived from /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>> crmd[17733]:   notice: Initiating stop operation p-fssapmnt_stop_0
>>>>>> locally on pipci001
>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:stop call_id:72
>>>>>> Filesystem(p-fssapmnt)[19189]: INFO: Running stop for /dev/mapper/sapmnt
>>>>>> on /sapmnt
>>>>>> pipci001 lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:stop
>>>>>> call_id:72 pid:19189 exit-code:0 exec-time:83ms queue-time:0ms
>>>>>> pipci001 crmd[17733]:   notice: Result of stop operation for p-fssapmnt
>>>>>> on pipci001: 0 (ok)
>>>>>> crmd[17733]:   notice: Initiating stop operation dlm_stop_0 locally on
>>>>>> pipci001
>>>>>> pipci001 lrmd[17730]:   notice: executing - rsc:dlm action:stop call_id:74
>>>>>> pipci001 dlm_controld[19019]: 4636 shutdown ignored, active lockspaces
>>>>>>
>>>>>>
>>>>>> resource configuration:
>>>>>>
>>>>>> primitive p-fssapmnt Filesystem \
>>>>>>             params device="/dev/mapper/sapmnt" directory="/sapmnt"
>>>>>> fstype=ocfs2 \
>>>>>>             op monitor interval=20 timeout=40 \
>>>>>>             op start timeout=60 interval=0 \
>>>>>>             op stop timeout=60 interval=0
>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>             op monitor interval=60 timeout=60 \
>>>>>>             op start interval=0 timeout=90 \
>>>>>>             op stop interval=0 timeout=100
>>>>>> clone base-clone base-group \
>>>>>>             meta interleave=true target-role=Started
>>>>>>
>>>>>> cluster properties:
>>>>>> property cib-bootstrap-options: \
>>>>>>             have-watchdog=true \
>>>>>>             stonith-enabled=true \
>>>>>>             stonith-timeout=80 \
>>>>>>             startup-fencing=true \
>>>>>>
>>>>>>
>>>>>> Software versions:
>>>>>>
>>>>>> kernel version: 4.4.114-94.11-default
>>>>>> pacemaker-1.1.16-4.8.x86_64
>>>>>> corosync-2.3.6-9.5.1.x86_64
>>>>>> ocfs2-kmp-default-4.4.114-94.11.3.x86_64
>>>>>> ocfs2-tools-1.8.5-1.35.x86_64
>>>>>> dlm-kmp-default-4.4.114-94.11.3.x86_64
>>>>>> libdlm3-4.0.7-1.28.x86_64
>>>>>> libdlm-4.0.7-1.28.x86_64
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Regards,
>>>>>> Muhammad Sharfuddin
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list: Users at clusterlabs.org
>>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>>>
>>>>>> Project Home: http://www.clusterlabs.org
>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>> Bugs: http://bugs.clusterlabs.org
>>>> --
>>>> Regards,
>>>> Muhammad Sharfuddin
>>>>