[ClusterLabs] single node fails to start the ocfs2 resource

Muhammad Sharfuddin M.Sharfuddin at nds.com.pk
Mon Mar 12 08:44:13 EDT 2018


Hi Klaus,

primitive sbd-stonith stonith:external/sbd \
         op monitor interval=3000 timeout=20 \
         op start interval=0 timeout=240 \
         op stop interval=0 timeout=100 \
         params sbd_device="/dev/mapper/sbd" \
         meta target-role=Started

property cib-bootstrap-options: \
         have-watchdog=true \
         stonith-enabled=true \
         no-quorum-policy=ignore \
         stonith-timeout=90 \
         startup-fencing=true
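
(For completeness: disk-based SBD fencing is driven by the sbd-stonith
resource above. If watchdog-only SBD fencing were intended instead, then,
as Klaus points out below, the stonith-watchdog-timeout property would
also have to be set explicitly - roughly like this, shown only as a
sketch:)

         crm configure property stonith-watchdog-timeout=10s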

# ps -eaf |grep sbd
root      6129     1  0 17:35 ?        00:00:00 sbd: inquisitor
root      6133  6129  0 17:35 ?        00:00:00 sbd: watcher: /dev/mapper/sbd - slot: 1 - uuid: 6e80a337-95db-4608-bd62-d59517f39103
root      6134  6129  0 17:35 ?        00:00:00 sbd: watcher: Pacemaker
root      6135  6129  0 17:35 ?        00:00:00 sbd: watcher: Cluster
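
To double-check that the shared SBD device itself is healthy from the
surviving node, something like the following can be used (assuming the
same device path as in the sbd-stonith primitive above):

         # dump the on-disk SBD header, including the configured timeouts
         sbd -d /dev/mapper/sbd dump
         # list the per-node message slots and any pending fence messages
         sbd -d /dev/mapper/sbd list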

This cluster does not start the ocfs2 resources when I first intentionally
crash (reboot) both nodes and then try to start the ocfs2 resources while
one node is still offline.

To fix the issue, I have only one reliable solution: bring the other
(offline) node online, and things then recover automatically, i.e. the
ocfs2 resources mount.
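
One thing worth noting: dlm_controld takes its quorum state from corosync
itself, not from pacemaker, so no-quorum-policy=ignore does not stop DLM
from waiting for quorum (see the "wait for quorum" messages from
dlm_controld in the logs quoted below). If corosync is using votequorum,
the relevant piece is the quorum section of corosync.conf - a sketch,
assuming corosync 2.x, not copied from this cluster's actual file:

         quorum {
                 provider: corosync_votequorum
                 # two_node lets either node keep quorum while its peer is down,
                 two_node: 1
                 # but it also enables wait_for_all by default, so after BOTH
                 # nodes go down, the first node to boot stays inquorate until
                 # it has seen its peer once - which matches the behaviour
                 # described above.
         }

The current flags can be checked with "corosync-quorumtool -s" (look for
2Node and WaitForAll in the Flags line).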

--
Regards,
Muhammad Sharfuddin

On 3/12/2018 5:25 PM, Klaus Wenninger wrote:
> Hi Muhammad!
>
> Could you be a bit more elaborate on your fencing setup?
> I read that you are using SBD, but I don't see any SBD fencing resource.
> For the case you wanted to use watchdog-fencing with SBD this
> would require stonith-watchdog-timeout property to be set.
> But watchdog-fencing relies on quorum (without 2-node trickery)
> and thus wouldn't work on a 2-node-cluster anyway.
>
> Didn't read through the whole thread - so I might be missing something ...
>
> Regards,
> Klaus
>
> On 03/12/2018 12:51 PM, Muhammad Sharfuddin wrote:
>> Hello Gang,
>>
>> As mentioned previously, the cluster was fixed so that it would start the ocfs2 resources by:
>>
>> a) crm resource start dlm
>>
>> b) mount/umount  the ocfs2 file system manually. (this step was the fix)
>>
>> and then starting the clone group (which includes dlm and the ocfs2 file
>> systems) worked fine:
>>
>> c) crm resource start base-clone.
>>
>> Now I crashed the nodes intentionally and then kept only one node
>> online; again, the cluster stopped starting the ocfs2 resources. I again
>> tried to follow your instructions, i.e.
>>
>> i) crm resource start dlm
>>
>> then tried to mount the ocfs2 file system manually, which hung this
>> time (previously, mounting manually had helped):
>>
>> # cat /proc/3966/stack
>> [<ffffffffa039f18e>] do_uevent+0x7e/0x200 [dlm]
>> [<ffffffffa039fe0a>] new_lockspace+0x80a/0xa70 [dlm]
>> [<ffffffffa03a02d9>] dlm_new_lockspace+0x69/0x160 [dlm]
>> [<ffffffffa038e758>] user_cluster_connect+0xc8/0x350 [ocfs2_stack_user]
>> [<ffffffffa03c2872>] ocfs2_cluster_connect+0x192/0x240 [ocfs2_stackglue]
>> [<ffffffffa045eefc>] ocfs2_dlm_init+0x31c/0x570 [ocfs2]
>> [<ffffffffa04a9983>] ocfs2_fill_super+0xb33/0x1200 [ocfs2]
>> [<ffffffff8120e130>] mount_bdev+0x1a0/0x1e0
>> [<ffffffff8120ea1a>] mount_fs+0x3a/0x170
>> [<ffffffff81228bf2>] vfs_kern_mount+0x62/0x110
>> [<ffffffff8122b123>] do_mount+0x213/0xcd0
>> [<ffffffff8122bed5>] SyS_mount+0x85/0xd0
>> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> I killed the mount.ocfs2 process, stopped the dlm resource (crm resource
>> stop dlm), and then tried to start it again (crm resource start dlm).
>> Previously dlm always started successfully, but this time it did not
>> start, so I checked the dlm_controld process:
>>
>> cat /proc/3754/stack
>> [<ffffffff8121dc55>] poll_schedule_timeout+0x45/0x60
>> [<ffffffff8121f0bc>] do_sys_poll+0x38c/0x4f0
>> [<ffffffff8121f2dd>] SyS_poll+0x5d/0xe0
>> [<ffffffff81614b0a>] entry_SYSCALL_64_fastpath+0x1e/0xb6
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> In a nutshell:
>>
>> 1 - this cluster is configured to run when a single node is online
>>
>> 2 - this cluster does not start the ocfs2 resources after a crash when
>> only one node is online.
>>
>> -- 
>> Regards,
>> Muhammad Sharfuddin | +923332144823 | nds.com.pk
>>
>> On 3/12/2018 12:41 PM, Gang He wrote:
>>>
>>>> Hello Gang,
>>>>
>>>> to follow your instructions, I started the dlm resource via:
>>>>
>>>>        crm resource start dlm
>>>>
>>>> then mounted/unmounted the ocfs2 file system manually (which seems to
>>>> have been the fix for the situation).
>>>>
>>>> Now the resources are starting properly on a single node. I am happy
>>>> that the issue is fixed, but at the same time I am lost, because I have
>>>> no idea how things got fixed here (merely by mounting/unmounting the
>>>> ocfs2 file systems).
>>> From your description, I wonder whether the DLM resource works normally
>>> under that situation.
>>> Yan/Bin, do you have any comments about two-node clusters? Which
>>> configuration settings will affect corosync quorum/DLM?
>>>
>>>
>>> Thanks
>>> Gang
>>>
>>>
>>>> -- 
>>>> Regards,
>>>> Muhammad Sharfuddin
>>>>
>>>> On 3/12/2018 10:59 AM, Gang He wrote:
>>>>> Hello Muhammad,
>>>>>
>>>>> Usually, an ocfs2 resource startup failure is caused by the mount
>>>>> command timing out (or hanging).
>>>>> A simple debugging method is:
>>>>> remove the ocfs2 resource from crm first,
>>>>> then mount the file system manually and see whether the mount command
>>>>> times out or hangs.
>>>>> If the command hangs, please watch where the mount.ocfs2 process is
>>>>> stuck via the "cat /proc/xxx/stack" command.
>>>>> If the back trace stops in the DLM kernel module, the root cause is
>>>>> usually a cluster configuration problem.
>>>>> Thanks
>>>>> Gang
>>>>>
>>>>>
>>>>>> On 3/12/2018 7:32 AM, Gang He wrote:
>>>>>>> Hello Muhammad,
>>>>>>>
>>>>>>> I think this problem is not in ocfs2; the cause looks like the
>>>>>>> cluster quorum being lost.
>>>>>>> For a two-node cluster (unlike a three-node cluster), if one node is
>>>>>>> offline, quorum is lost by default.
>>>>>>> So you should configure the two-node related quorum settings
>>>>>>> according to the pacemaker manual.
>>>>>>> Then DLM can work normally, and the ocfs2 resource can start up.
>>>>>> Yes, it is configured accordingly; no-quorum-policy is set to "ignore".
>>>>>>
>>>>>> property cib-bootstrap-options: \
>>>>>>              have-watchdog=true \
>>>>>>              stonith-enabled=true \
>>>>>>              stonith-timeout=80 \
>>>>>>              startup-fencing=true \
>>>>>>              no-quorum-policy=ignore
>>>>>>
>>>>>>> Thanks
>>>>>>> Gang
>>>>>>>
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> This two-node cluster starts resources when both nodes are online,
>>>>>>>> but it does not start the ocfs2 resources when one node is offline.
>>>>>>>>
>>>>>>>> E.g., if I gracefully stop the cluster resources, then stop the
>>>>>>>> pacemaker service on either node, and try to start the ocfs2
>>>>>>>> resource on the online node, it fails.
>>>>>>>>
>>>>>>>> logs:
>>>>>>>>
>>>>>>>> pipci001 pengine[17732]:   notice: Start   dlm:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Start   p-fssapmnt:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Start   p-fsusrsap:0#011(pipci001)
>>>>>>>> pipci001 pengine[17732]:   notice: Calculated transition 2, saving
>>>>>>>> inputs in /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>>> pipci001 crmd[17733]:   notice: Processing graph 2
>>>>>>>> (ref=pe_calc-dc-1520613202-31) derived from
>>>>>>>> /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>>>>>> crmd[17733]:   notice: Initiating start operation dlm_start_0
>>>>>>>> locally on
>>>>>>>> pipci001
>>>>>>>> lrmd[17730]:   notice: executing - rsc:dlm action:start call_id:69
>>>>>>>> dlm_controld[19019]: 4575 dlm_controld 4.0.7 started
>>>>>>>> lrmd[17730]:   notice: finished - rsc:dlm action:start call_id:69
>>>>>>>> pid:18999 exit-code:0 exec-time:1082ms queue-time:1ms
>>>>>>>> crmd[17733]:   notice: Result of start operation for dlm on
>>>>>>>> pipci001: 0 (ok)
>>>>>>>> crmd[17733]:   notice: Initiating monitor operation
>>>>>>>> dlm_monitor_60000
>>>>>>>> locally on pipci001
>>>>>>>> crmd[17733]:   notice: Initiating start operation
>>>>>>>> p-fssapmnt_start_0
>>>>>>>> locally on pipci001
>>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:start
>>>>>>>> call_id:71
>>>>>>>> Filesystem(p-fssapmnt)[19052]: INFO: Running start for
>>>>>>>> /dev/mapper/sapmnt on /sapmnt
>>>>>>>> kernel: [ 4576.529938] dlm: Using TCP for communications
>>>>>>>> kernel: [ 4576.530233] dlm: BFA9FF042AA045F4822C2A6A06020EE9:
>>>>>>>> joining
>>>>>>>> the lockspace group.
>>>>>>>> dlm_controld[19019]: 4629 fence work wait for quorum
>>>>>>>> dlm_controld[19019]: 4634 BFA9FF042AA045F4822C2A6A06020EE9 wait
>>>>>>>> for quorum
>>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0 process (PID 19052)
>>>>>>>> timed out
>>>>>>>> kernel: [ 4636.418223] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group
>>>>>>>> event done -512 0
>>>>>>>> kernel: [ 4636.418227] dlm: BFA9FF042AA045F4822C2A6A06020EE9:
>>>>>>>> group join
>>>>>>>> failed -512 0
>>>>>>>> lrmd[17730]:  warning: p-fssapmnt_start_0:19052 - timed out
>>>>>>>> after 60000ms
>>>>>>>> lrmd[17730]:   notice: finished - rsc:p-fssapmnt action:start
>>>>>>>> call_id:71
>>>>>>>> pid:19052 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>>>>> kernel: [ 4636.420628] ocfs2: Unmounting device (254,1) on (node 0)
>>>>>>>> crmd[17733]:    error: Result of start operation for p-fssapmnt on
>>>>>>>> pipci001: Timed Out
>>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on
>>>>>>>> pipci001 failed
>>>>>>>> (target: 0 vs. rc: 1): Error
>>>>>>>> crmd[17733]:   notice: Transition aborted by operation
>>>>>>>> p-fssapmnt_start_0 'modify' on pipci001: Event failed
>>>>>>>> crmd[17733]:  warning: Action 11 (p-fssapmnt_start_0) on
>>>>>>>> pipci001 failed
>>>>>>>> (target: 0 vs. rc: 1): Error
>>>>>>>> crmd[17733]:   notice: Transition 2 (Complete=5, Pending=0,
>>>>>>>> Fired=0,
>>>>>>>> Skipped=0, Incomplete=6,
>>>>>>>> Source=/var/lib/pacemaker/pengine/pe-input-339.bz2): Complete
>>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if
>>>>>>>> fencing is
>>>>>>>> required
>>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>>> p-fssapmnt:0 on
>>>>>>>> pipci001: unknown error (1)
>>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>>> p-fssapmnt:0 on
>>>>>>>> pipci001: unknown error (1)
>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>>> after
>>>>>>>> 1000000 failures (max=2)
>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>>> after
>>>>>>>> 1000000 failures (max=2)
>>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Calculated transition 3, saving inputs in
>>>>>>>> /var/lib/pacemaker/pengine/pe-input-340.bz2
>>>>>>>> pengine[17732]:   notice: Watchdog will be used via SBD if
>>>>>>>> fencing is
>>>>>>>> required
>>>>>>>> pengine[17732]:   notice: On loss of CCM Quorum: Ignore
>>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>>> p-fssapmnt:0 on
>>>>>>>> pipci001: unknown error (1)
>>>>>>>> pengine[17732]:  warning: Processing failed op start for
>>>>>>>> p-fssapmnt:0 on
>>>>>>>> pipci001: unknown error (1)
>>>>>>>> pengine[17732]:  warning: Forcing base-clone away from pipci001
>>>>>>>> after
>>>>>>>> 1000000 failures (max=2)
>>>>>>>> pipci001 pengine[17732]:  warning: Forcing base-clone away from
>>>>>>>> pipci001
>>>>>>>> after 1000000 failures (max=2)
>>>>>>>> pengine[17732]:   notice: Stop    dlm:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Stop    p-fssapmnt:0#011(pipci001)
>>>>>>>> pengine[17732]:   notice: Calculated transition 4, saving inputs in
>>>>>>>> /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>>> crmd[17733]:   notice: Processing graph 4
>>>>>>>> (ref=pe_calc-dc-1520613263-36)
>>>>>>>> derived from /var/lib/pacemaker/pengine/pe-input-341.bz2
>>>>>>>> crmd[17733]:   notice: Initiating stop operation p-fssapmnt_stop_0
>>>>>>>> locally on pipci001
>>>>>>>> lrmd[17730]:   notice: executing - rsc:p-fssapmnt action:stop
>>>>>>>> call_id:72
>>>>>>>> Filesystem(p-fssapmnt)[19189]: INFO: Running stop for
>>>>>>>> /dev/mapper/sapmnt
>>>>>>>> on /sapmnt
>>>>>>>> pipci001 lrmd[17730]:   notice: finished - rsc:p-fssapmnt
>>>>>>>> action:stop
>>>>>>>> call_id:72 pid:19189 exit-code:0 exec-time:83ms queue-time:0ms
>>>>>>>> pipci001 crmd[17733]:   notice: Result of stop operation for
>>>>>>>> p-fssapmnt
>>>>>>>> on pipci001: 0 (ok)
>>>>>>>> crmd[17733]:   notice: Initiating stop operation dlm_stop_0
>>>>>>>> locally on
>>>>>>>> pipci001
>>>>>>>> pipci001 lrmd[17730]:   notice: executing - rsc:dlm action:stop
>>>>>>>> call_id:74
>>>>>>>> pipci001 dlm_controld[19019]: 4636 shutdown ignored, active
>>>>>>>> lockspaces
>>>>>>>>
>>>>>>>>
>>>>>>>> resource configuration:
>>>>>>>>
>>>>>>>> primitive p-fssapmnt Filesystem \
>>>>>>>>              params device="/dev/mapper/sapmnt" directory="/sapmnt"
>>>>>>>> fstype=ocfs2 \
>>>>>>>>              op monitor interval=20 timeout=40 \
>>>>>>>>              op start timeout=60 interval=0 \
>>>>>>>>              op stop timeout=60 interval=0
>>>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>>>              op monitor interval=60 timeout=60 \
>>>>>>>>              op start interval=0 timeout=90 \
>>>>>>>>              op stop interval=0 timeout=100
>>>>>>>> clone base-clone base-group \
>>>>>>>>              meta interleave=true target-role=Started
>>>>>>>>
>>>>>>>> cluster properties:
>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>              have-watchdog=true \
>>>>>>>>              stonith-enabled=true \
>>>>>>>>              stonith-timeout=80 \
>>>>>>>>              startup-fencing=true \
>>>>>>>>
>>>>>>>>
>>>>>>>> Software versions:
>>>>>>>>
>>>>>>>> kernel version: 4.4.114-94.11-default
>>>>>>>> pacemaker-1.1.16-4.8.x86_64
>>>>>>>> corosync-2.3.6-9.5.1.x86_64
>>>>>>>> ocfs2-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>>> ocfs2-tools-1.8.5-1.35.x86_64
>>>>>>>> dlm-kmp-default-4.4.114-94.11.3.x86_64
>>>>>>>> libdlm3-4.0.7-1.28.x86_64
>>>>>>>> libdlm-4.0.7-1.28.x86_64
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Regards,
>>>>>>>> Muhammad Sharfuddin
>>>>>>>>
>>>>>>>>
>>>>>> -- 
>>>>>> Regards,
>>>>>> Muhammad Sharfuddin
>>>>>>
>





