[ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Re: Q: What is lvmlockd locking?

Fri Jan 22 04:12:11 EST 2021

>>> Gang He <ghe at suse.com> wrote on 22.01.2021 at 09:44 in message
<a3114cff-bc2a-fd89-8c36-026ffdd707a9 at suse.com>:

> 
> On 2021/1/22 16:17, Ulrich Windl wrote:
>>>>> Gang He <ghe at suse.com> wrote on 22.01.2021 at 09:13 in message
>> <1fd1c07d-d12c-fea9-4b17-90a977fe783b at suse.com>:
>>> Hi Ulrich,
>>>
>>> I reviewed the crm configuration file; some comments below:
>>> 1) The lvmlockd resource is used for shared VGs. If you do not plan to add
>>> any shared VG to your cluster, I suggest dropping this resource and its clone.
>>> 2) The lvmlockd service depends on the DLM service; it creates
>>> "lvm_xxx" lock spaces when any shared VG is created/activated.
>>> Some other resources also depend on DLM to create lock spaces for
>>> avoiding race conditions, e.g. clustered MD, ocfs2, etc. The file
>>> system resource should therefore start later than the lvm2 (lvmlockd)
>>> related resources. That means this order should be wrong:
>>> order ord_lockspace_fs__lvmlockd Mandatory: cln_lockspace_ocfs2 cln_lvmlock
>> 
>> But cln_lockspace_ocfs2 provides the shared filesystem that lvmlockd uses.
>> I thought for locking in a cluster it needs a cluster-wide filesystem.
> 
> An ocfs2 file system resource depends only on the DLM resource if you use a
> shared raw disk (e.g. /dev/vdb3), e.g.
> primitive dlm ocf:pacemaker:controld \
>          op start interval=0 timeout=90 \
>          op stop interval=0 timeout=100 \
>          op monitor interval=20 timeout=600
> primitive ocfs2-2 Filesystem \
>          params device="/dev/vdb3" directory="/mnt/shared" fstype=ocfs2 \
>          op monitor interval=20 timeout=40
> group base-group dlm ocfs2-2
> clone base-clone base-group
> 
> If you use an ocfs2 file system on top of a shared VG (e.g. /dev/vg1/lv1), you
> need to add lvmlockd/LVM-activate resources before the ocfs2 file system, e.g.
> primitive dlm ocf:pacemaker:controld \
>          op monitor interval=60 timeout=60
> primitive lvmlockd lvmlockd \
>          op start timeout=90 interval=0 \
>          op stop timeout=100 interval=0 \
>          op monitor interval=30 timeout=90
> primitive ocfs2-2 Filesystem \
>          params device="/dev/vg1/lv1" directory="/mnt/shared" fstype=ocfs2 \
>          op monitor interval=20 timeout=40
> primitive vg1 LVM-activate \
>          params vgname=vg1 vg_access_mode=lvmlockd activation_mode=shared \
>          op start timeout=90s interval=0 \
>          op stop timeout=90s interval=0 \
>          op monitor interval=30s timeout=90s
> group base-group dlm lvmlockd vg1 ocfs2-2
> clone base-clone base-group

Hi!

I don't see the problem:
As said before, the OCFS2 used for the lockspace does not use LVM itself, but a
clustered MD (prm_lockspace_ocfs2 Filesystem, cln_lockspace_ocfs2).
That is co-located with DLM and the RAID (cln_lockspace_raid_md10) (and
likewise for cln_lvmlockd).
The ordering is somewhat redundant, as the clustered RAID needs DLM, and OCFS2
needs DLM and the RAID.

lvmlockd (prm_lvmlockd, cln_lvmlockd) is co-located with DLM (hmm... does that
mean it uses DLM and maybe does NOT need a shared filesystem?) and
cln_lockspace_ocfs2.
Accordingly, the ordering is that lvmlockd starts after DLM (cln_DLM) and after
OCFS2 (cln_lockspace_ocfs2).
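
As a rough illustration only, the start order implied by the dependencies
described above can be derived with a short Python sketch. The dependency map
is an assumption read off the prose (it is not taken from the actual CIB), and
graphlib requires Python 3.9 or later:

```python
from graphlib import TopologicalSorter

# Assumed dependency map: resource -> resources that must be started first.
# Read off the description above, NOT extracted from the real configuration.
starts_after = {
    "cln_DLM": set(),
    "cln_lockspace_raid_md10": {"cln_DLM"},
    "cln_lockspace_ocfs2": {"cln_DLM", "cln_lockspace_raid_md10"},
    "cln_lvmlockd": {"cln_DLM", "cln_lockspace_ocfs2"},
}

# static_order() yields every resource after all of its prerequisites.
start_order = list(TopologicalSorter(starts_after).static_order())
print(" -> ".join(start_order))
```

With symmetrical ordering constraints (the Pacemaker default), the stop order
is the reverse of this start order, so under these assumptions lvmlockd would
be stopped before the OCFS2 lockspace filesystem.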

To summarize the related resources:
Node List:
  * Online: [ h16 h18 h19 ]

Full List of Resources:
  * Clone Set: cln_DLM [prm_DLM]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lvmlockd [prm_lvmlockd]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lockspace_raid_md10 [prm_lockspace_raid_md10]:
    * Started: [ h16 h18 h19 ]
  * Clone Set: cln_lockspace_ocfs2 [prm_lockspace_ocfs2]:
    * Started: [ h16 h18 h19 ]

Regards,
Ulrich

> 
> Thanks
> Gang
> 
> 
>> 
>>>
>>>
>>> Thanks
>>> Gang
>>>
>>> On 2021/1/21 20:08, Ulrich Windl wrote:
>>>>>>> Gang He <ghe at suse.com> wrote on 21.01.2021 at 11:30 in message
>>>> <59b543ee-0824-6b91-d0af-48f66922bc89 at suse.com>:
>>>>> Hi Ulrich,
>>>>>
>>>>> Is the problem stably reproducible? Could you share your
>>>>> pacemaker crm configuration and OS/lvm2/resource-agents version
>>>>> information?
>>>>
>>>> OK, the problem occurred on every node, so I guess it's reproducible.
>>>> OS is SLES15 SP2 with all current updates (lvm2-2.03.05-8.18.1.x86_64,
>>>> pacemaker-2.0.4+20200616.2deceaa3a-3.3.1.x86_64,
>>>> resource-agents-4.4.0+git57.70549516-3.12.1.x86_64).
>>>>
>>>> The configuration (somewhat trimmed) is attached.
>>>>
>>>> The only VG the cluster node sees is:
>>>> ph16:~ # vgs
>>>>     VG  #PV #LV #SN Attr   VSize   VFree
>>>>     sys   1   3   0 wz--n- 222.50g    0
>>>>
>>>> Regards,
>>>> Ulrich
>>>>
>>>>> I feel the problem was probably caused by lvmlock resource agent
>>>>> script, which did not handle this corner case correctly.
>>>>>
>>>>> Thanks
>>>>> Gang
>>>>>
>>>>>
>>>>> On 2021/1/21 17:53, Ulrich Windl wrote:
>>>>>> Hi!
>>>>>>
>>>>>> I have a problem: For tests I had configured lvmlockd. Now that the tests
>>>>>> have ended, no LVM is used for cluster resources any more, but lvmlockd is
>>>>>> still configured.
>>>>>> Unfortunately I ran into this problem:
>>>>>> One OCFS2 mount was unmounted successfully, another holding the lockspace
>>>>>> for lvmlockd is still active.
>>>>>> lvmlockd shuts down. At least it says so.
>>>>>>
>>>>>> Unfortunately that stop never succeeds (runs into a timeout).
>>>>>>
>>>>>> My suspicion is something like this:
>>>>>> Some non-LVM lock exists for the now unmounted OCFS2 filesystem.
>>>>>> lvmlockd wants to access that filesystem for unknown reasons.
>>>>>>
>>>>>> I don't understand what's going on.
>>>>>>
>>>>>> The events at node shutdown were:
>>>>>> Some Xen PVM was live-migrated successfully to another node, but during
>>>>>> that there was a message like this:
>>>>>> Jan 21 10:20:13 h19 virtlockd[41990]: libvirt version: 6.0.0
>>>>>> Jan 21 10:20:13 h19 virtlockd[41990]: hostname: h19
>>>>>> Jan 21 10:20:13 h19 virtlockd[41990]: resource busy: Lockspace resource '4c6bebd1f4bc581255b422a65d317f31deef91f777e51ba0daf04419dda7ade5' is not locked
>>>>>> Jan 21 10:20:13 h19 libvirtd[41991]: libvirt version: 6.0.0
>>>>>> Jan 21 10:20:13 h19 libvirtd[41991]: hostname: h19
>>>>>> Jan 21 10:20:13 h19 libvirtd[41991]: resource busy: Lockspace resource '4c6bebd1f4bc581255b422a65d317f31deef91f777e51ba0daf04419dda7ade5' is not locked
>>>>>> Jan 21 10:20:13 h19 libvirtd[41991]: Unable to release lease on test-jeos4
>>>>>> Jan 21 10:20:13 h19 VirtualDomain(prm_xen_test-jeos4)[32786]: INFO: test-jeos4: live migration to h18 succeeded.
>>>>>>
>>>>>> Unfortunately the log message makes it practically impossible to guess what
>>>>>> the locked object actually is (an indirect lock using SHA256 as hash, it
>>>>>> seems).
>>>>>>
>>>>>> Then the OCFS for the VM images unmounts successfully while the stop of
>>>>>> lvmlockd is still busy:
>>>>>> Jan 21 10:20:16 h19 lvmlockd(prm_lvmlockd)[32945]: INFO: stop the lockspaces of shared VG(s)...
>>>>>> ...
>>>>>> Jan 21 10:21:56 h19 pacemaker-controld[42493]:  error: Result of stop operation for prm_lvmlockd on h19: Timed Out
>>>>>>
>>>>>> As said before: I don't have shared VGs any more. I don't understand.
>>>>>>
>>>>>> On a node without VMs running I see:
>>>>>> h19:~ # lvmlockctl -d
>>>>>> 1611221190 lvmlockd started
>>>>>> 1611221190 No lockspaces found to adopt
>>>>>> 1611222560 new cl 1 pi 2 fd 8
>>>>>> 1611222560 recv client[10817] cl 1 dump_info . "" mode iv flags 0
>>>>>> 1611222560 send client[10817] cl 1 dump result 0 dump_len 149
>>>>>> 1611222560 send_dump_buf delay 0 total 149
>>>>>> 1611222560 close client[10817] cl 1 fd 8
>>>>>> 1611222563 new cl 2 pi 2 fd 8
>>>>>> 1611222563 recv client[10818] cl 2 dump_log . "" mode iv flags 0
>>>>>>
>>>>>> On a node with VMs running I see:
>>>>>> h16:~ # lvmlockctl -d
>>>>>> 1611216942 lvmlockd started
>>>>>> 1611216942 No lockspaces found to adopt
>>>>>> 1611221684 new cl 1 pi 2 fd 8
>>>>>> 1611221684 recv pvs[17159] cl 1 lock gl "" mode sh flags 0
>>>>>> 1611221684 lockspace "lvm_global" not found for dlm gl, adding...
>>>>>> 1611221684 add_lockspace_thread dlm lvm_global version 0
>>>>>> 1611221684 S lvm_global lm_add_lockspace dlm wait 0 adopt 0
>>>>>> 1611221685 S lvm_global lm_add_lockspace done 0
>>>>>> 1611221685 S lvm_global R GLLK action lock sh
>>>>>> 1611221685 S lvm_global R GLLK res_lock cl 1 mode sh
>>>>>> 1611221685 S lvm_global R GLLK lock_dlm
>>>>>> 1611221685 S lvm_global R GLLK res_lock rv 0 read vb 0 0 0
>>>>>> 1611221685 S lvm_global R GLLK res_lock all versions zero
>>>>>> 1611221685 S lvm_global R GLLK res_lock invalidate global state
>>>>>> 1611221685 send pvs[17159] cl 1 lock gl rv 0
>>>>>> 1611221685 recv pvs[17159] cl 1 lock vg "sys" mode sh flags 0
>>>>>> 1611221685 lockspace "lvm_sys" not found
>>>>>> 1611221685 send pvs[17159] cl 1 lock vg rv -210 ENOLS
>>>>>> 1611221685 close pvs[17159] cl 1 fd 8
>>>>>> 1611221685 S lvm_global R GLLK res_unlock cl 1 from close
>>>>>> 1611221685 S lvm_global R GLLK unlock_dlm
>>>>>> 1611221685 S lvm_global R GLLK res_unlock lm done
>>>>>> 1611222582 new cl 2 pi 2 fd 8
>>>>>> 1611222582 recv client[19210] cl 2 dump_log . "" mode iv flags 0
>>>>>>
>>>>>> Note: "lvm_sys" may refer to VG sys used for the hypervisor.
>>>>>>
>>>>>> Regards,
>>>>>> Ulrich
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Manage your subscription:
>>>>>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>>>>>
>>>>>> ClusterLabs home: https://www.clusterlabs.org/ 
>>>>>>