[ClusterLabs Developers] bug? in heartbeat/LVM OCF script
Chris Friesen
chris.friesen at windriver.com
Tue Jun 21 19:20:59 UTC 2016
On 06/21/2016 11:58 AM, Vladislav Bogdanov wrote:
> 21.06.2016 20:03, Chris Friesen wrote:
>> On 06/20/2016 05:50 PM, Chris Friesen wrote:
>>>
>>> Hi,
>>>
>>> The heartbeat/LVM OCF script uses the following logic for the LVM_status()
>>> routine:
>>>
>>> if [ -d /dev/$1 ]; then
>>>     test "`cd /dev/$1 && ls`" != ""
>>>     rc=$?
>>>     if [ $rc -ne 0 ]; then
>>>         ocf_exit_reason "VG $1 with no logical volumes is not supported by this RA!"
>>>     fi
>>> fi
>>
>> <snip>
>>
>>> I think it would be better to query the activity directly, using something like
>>> "lvs -o name,selected -S lv_active=active,vg_name=<volume_group>"
>>
>> I'm testing with the following code instead of the above snippet and it seems
>> to work okay:
>>
>> # Ask lvm whether the volume group is active. This maps to
>> # the question "Are there any logical volumes that are active in
>> # the specified volume group?".
>> lvs --noheadings -o selected -S lv_active=active,vg_name=${1}|grep -q 1
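For illustration (the VG name "vg0" is hypothetical), when at least one LV in
the named VG is active the command reports a row whose 'selected' column is 1,
which is exactly what the grep -q 1 keys on:

    $ lvs --noheadings -o selected -S lv_active=active,vg_name=vg0
             1

If no LV in the VG is active, nothing matches the selection, grep finds no "1",
and the pipeline exits non-zero.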
>
> This ^^^ has a big chance of timing out in both the monitor and subsequent stop
> operations if a clustered VG is used and clvmd is stuck because dlm is waiting for
> fencing (of another node) to finish.
> Or if a (clustered) VG is created on an iSCSI/iSRP/FC/FCoE/etc block device which
> is not available for some period of time due to target/network problems.
>
> Both cases lead to fencing of all cluster nodes.
Got any suggestions on a better way to handle it?
The current code is flawed due to arguably-buggy LVM behaviour: the existence
of a non-empty /dev/<volgroup> directory does not actually guarantee that the
volume group is activated.
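One way to at least bound the damage (just a sketch; the 10-second limit, the
use of OCF_RESKEY_volgrpname, and the return codes shown are assumptions rather
than anything the RA currently does) would be to wrap the lvs query with
coreutils' timeout, so a stuck clvmd/dlm shows up as a monitor failure instead
of hanging the operation indefinitely:

    # Sketch only: bound the query so a wedged clvmd/dlm cannot hang the
    # monitor op.  Limit and variable names are illustrative assumptions.
    out=$(timeout 10 lvs --noheadings -o selected \
              -S lv_active=active,vg_name="${OCF_RESKEY_volgrpname}" 2>/dev/null)
    rc=$?
    if [ $rc -eq 124 ]; then
        # coreutils timeout exits 124 when the command was killed on timeout
        ocf_exit_reason "lvs query for VG ${OCF_RESKEY_volgrpname} timed out"
        return $OCF_ERR_GENERIC
    fi
    echo "$out" | grep -q 1 || return $OCF_NOT_RUNNING

Whether turning a hang into a failure is the right trade-off probably depends on
how monitor failures are configured to escalate, so suggestions welcome.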
Chris