[ClusterLabs] Antw: [EXT] Re: Q: What is lvmlockd locking?

Gang He ghe at suse.com
Fri Jan 22 03:13:39 EST 2021


Hi Ulrich,

I reviewed the crm configuration file; a few comments below.
1) The lvmlockd resource is only needed for shared VGs. If you do not plan to
add any shared VG to your cluster, I suggest dropping this resource and its
clone (see the sketch below).
2) The lvmlockd service depends on the DLM service; it creates "lvm_xxx"
lock spaces when a shared VG is created/activated. Other resources also
depend on DLM lock spaces to avoid race conditions, e.g. clustered MD,
OCFS2, etc. The file system resource should therefore start later than the
lvm2 (lvmlockd) related resources, which means this order constraint is wrong:
order ord_lockspace_fs__lvmlockd Mandatory: cln_lockspace_ocfs2 cln_lvmlock
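
For example (the constraint ID below is only illustrative; the resource IDs
are taken from the constraint above and must match your actual configuration),
the corrected ordering would start the lvmlockd clone before the OCFS2 lock
space/file system clone:

# sketch only: reverse the operands so the file system starts after lvmlockd
order ord_lvmlockd_before_lockspace_fs Mandatory: cln_lvmlock cln_lockspace_ocfs2

If you will never use a shared VG, the clone can instead be removed entirely,
roughly like this (stop it first; the primitive name prm_lvmlockd is assumed
from the logs below):

# crm configure delete cln_lvmlock prm_lvmlockd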


Thanks
Gang

On 2021/1/21 20:08, Ulrich Windl wrote:
>>>> Gang He <ghe at suse.com> wrote on 21.01.2021 at 11:30 in message
> <59b543ee-0824-6b91-d0af-48f66922bc89 at suse.com>:
>> Hi Ulrich,
>>
>> Is the problem reproduced reliably? Could you share your pacemaker crm
>> configuration and OS/lvm2/resource-agents version information?
> 
> OK, the problem occurred on every node, so I guess it's reproducible.
> OS is SLES15 SP2 with all current updates (lvm2-2.03.05-8.18.1.x86_64,
> pacemaker-2.0.4+20200616.2deceaa3a-3.3.1.x86_64,
> resource-agents-4.4.0+git57.70549516-3.12.1.x86_64).
> 
> The configuration (somewhat trimmed) is attached.
> 
> The only VG the cluster node sees is:
> ph16:~ # vgs
>    VG  #PV #LV #SN Attr   VSize   VFree
>    sys   1   3   0 wz--n- 222.50g    0
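>
> (A shared VG would show "s" as the sixth vg_attr character and a dlm lock
> type; assuming this lvm2 build supports the lock_type report field, the
> exact field name can be checked with "vgs -o help", e.g.:)
>
> ph16:~ # vgs -o+vg_lock_type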
> 
> Regards,
> Ulrich
> 
>> I feel the problem was probably caused by the lvmlockd resource agent
>> script, which did not handle this corner case correctly.
>>
>> Thanks
>> Gang
>>
>>
>> On 2021/1/21 17:53, Ulrich Windl wrote:
>>> Hi!
>>>
>>> I have a problem: For tests I had configured lvmlockd. Now that the tests
>>> have ended, no LVM is used for cluster resources any more, but lvmlockd is
>>> still configured.
>>> Unfortunately I ran into this problem:
>>> One OCFS2 mount was unmounted successfully; another one, holding the
>>> lockspace for lvmlockd, is still active.
>>> lvmlockd shuts down. At least it says so.
>>>
>>> Unfortunately that stop never succeeds (runs into a timeout).
>>>
>>> My suspicion is something like this:
>>> Some non-LVM lock exists for the now unmounted OCFS2 filesystem.
>>> lvmlockd wants to access that filesystem for unknown reasons.
>>>
>>> I don't understand what's going on.
>>>
>>> The events at node shutdown were:
>>> Some Xen PVM was live-migrated successfully to another node, but during
>>> that there was a message like this:
>>> Jan 21 10:20:13 h19 virtlockd[41990]: libvirt version: 6.0.0
>>> Jan 21 10:20:13 h19 virtlockd[41990]: hostname: h19
>>> Jan 21 10:20:13 h19 virtlockd[41990]: resource busy: Lockspace resource '4c6bebd1f4bc581255b422a65d317f31deef91f777e51ba0daf04419dda7ade5' is not locked
>>> Jan 21 10:20:13 h19 libvirtd[41991]: libvirt version: 6.0.0
>>> Jan 21 10:20:13 h19 libvirtd[41991]: hostname: h19
>>> Jan 21 10:20:13 h19 libvirtd[41991]: resource busy: Lockspace resource '4c6bebd1f4bc581255b422a65d317f31deef91f777e51ba0daf04419dda7ade5' is not locked
>>> Jan 21 10:20:13 h19 libvirtd[41991]: Unable to release lease on test-jeos4
>>> Jan 21 10:20:13 h19 VirtualDomain(prm_xen_test-jeos4)[32786]: INFO: test-jeos4: live migration to h18 succeeded.
>>>
>>> Unfortunately the log message makes it practically impossible to guess what
>>> the locked object actually is (an indirect lock using a SHA-256 hash, it
>>> seems).
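>>>
>>> Assuming virtlockd's lockd driver is configured with file_lockspace_dir
>>> (where the resource name should be the SHA-256 of the fully qualified
>>> disk image path), the hash could be mapped back roughly like this (the
>>> image directory here is only an example):
>>>
>>> # print the SHA-256 of each image path for comparison with the
>>> # lockspace resource name from the log
>>> for img in /var/lib/libvirt/images/*; do
>>>     printf '%s  %s\n' "$(printf '%s' "$img" | sha256sum | cut -d' ' -f1)" "$img"
>>> done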
>>>
>>> Then the OCFS2 filesystem for the VM images unmounts successfully while the
>>> stop of lvmlockd is still busy:
>>> Jan 21 10:20:16 h19 lvmlockd(prm_lvmlockd)[32945]: INFO: stop the lockspaces of shared VG(s)...
>>> ...
>>> Jan 21 10:21:56 h19 pacemaker-controld[42493]:  error: Result of stop operation for prm_lvmlockd on h19: Timed Out
>>>
>>> As said before: I don't have shared VGs any more. I don't understand.
>>>
>>> On a node without VMs running I see:
>>> h19:~ # lvmlockctl -d
>>> 1611221190 lvmlockd started
>>> 1611221190 No lockspaces found to adopt
>>> 1611222560 new cl 1 pi 2 fd 8
>>> 1611222560 recv client[10817] cl 1 dump_info . "" mode iv flags 0
>>> 1611222560 send client[10817] cl 1 dump result 0 dump_len 149
>>> 1611222560 send_dump_buf delay 0 total 149
>>> 1611222560 close client[10817] cl 1 fd 8
>>> 1611222563 new cl 2 pi 2 fd 8
>>> 1611222563 recv client[10818] cl 2 dump_log . "" mode iv flags 0
>>>
>>> On a node with VMs running I see:
>>> h16:~ # lvmlockctl -d
>>> 1611216942 lvmlockd started
>>> 1611216942 No lockspaces found to adopt
>>> 1611221684 new cl 1 pi 2 fd 8
>>> 1611221684 recv pvs[17159] cl 1 lock gl "" mode sh flags 0
>>> 1611221684 lockspace "lvm_global" not found for dlm gl, adding...
>>> 1611221684 add_lockspace_thread dlm lvm_global version 0
>>> 1611221684 S lvm_global lm_add_lockspace dlm wait 0 adopt 0
>>> 1611221685 S lvm_global lm_add_lockspace done 0
>>> 1611221685 S lvm_global R GLLK action lock sh
>>> 1611221685 S lvm_global R GLLK res_lock cl 1 mode sh
>>> 1611221685 S lvm_global R GLLK lock_dlm
>>> 1611221685 S lvm_global R GLLK res_lock rv 0 read vb 0 0 0
>>> 1611221685 S lvm_global R GLLK res_lock all versions zero
>>> 1611221685 S lvm_global R GLLK res_lock invalidate global state
>>> 1611221685 send pvs[17159] cl 1 lock gl rv 0
>>> 1611221685 recv pvs[17159] cl 1 lock vg "sys" mode sh flags 0
>>> 1611221685 lockspace "lvm_sys" not found
>>> 1611221685 send pvs[17159] cl 1 lock vg rv -210 ENOLS
>>> 1611221685 close pvs[17159] cl 1 fd 8
>>> 1611221685 S lvm_global R GLLK res_unlock cl 1 from close
>>> 1611221685 S lvm_global R GLLK unlock_dlm
>>> 1611221685 S lvm_global R GLLK res_unlock lm done
>>> 1611222582 new cl 2 pi 2 fd 8
>>> 1611222582 recv client[19210] cl 2 dump_log . "" mode iv flags 0
>>>
>>> Note: "lvm_sys" may refer to VG sys used for the hypervisor.
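>>>
>>> The DLM side could presumably be checked the same way (assuming the dlm
>>> userspace tools are installed), to see which lockspaces are actually
>>> still present besides lvm_global:
>>>
>>> h16:~ # dlm_tool ls                      # list active DLM lockspaces
>>> h16:~ # dlm_tool lockdebug lvm_global    # dump locks in one lockspace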
>>>
>>> Regards,
>>> Ulrich
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
> 


