[ClusterLabs] LVMLockD Recipe for Fedora 37 PCS GFS2 Shared Storage Cluster
Reid Wahl
nwahl at redhat.com
Fri Dec 30 20:19:49 EST 2022
On Fri, Dec 30, 2022 at 4:32 PM Gregory Carter <gjcarter2 at gmail.com> wrote:
>
> I recently had some time to read the documentation for LVM shared storage using PCS and found that there wasn't any documentation on how to set it up using lvmlockd/GFS2/fencing in a shared configuration. There are a couple of replication scenarios out there that use DRBD/cloning, which I might try if this isn't possible.
>
> Everything I have found is either CLVMD-based or incredibly old, written well before lvmlockd deprecated CLVMD. So I decided to take the guessing route:
>
> So this is how far I got:
>
> 1) I created a shared LVM volume group and a logical volume to hold the GFS2 file system:
> vgcreate "vmgfs2" /dev/sdb1 /dev/sdc1 /dev/sde1 /dev/sdg1 --shared
> lvcreate -n "smvms" -L 50G "vmgfs2"
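(Side note: vgcreate --shared generally needs lvmlockd and dlm already running, and each node has to join the VG's lockspace before it can activate the LV there. For checking things by hand, that is roughly:

vgchange --lock-start vmgfs2     # join the dlm lockspace for this VG on this node
lvchange -asy vmgfs2/smvms       # activate the LV in shared mode

The LVM-activate resource created later does the shared activation for you; the above is just for manual testing.)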
>
> 2) I formatted a GFS2 file system on the smvms logical volume:
> mkfs.gfs2 -p lock_dlm -t vmgfs2:smvms -j 8 /dev/vmgfs2/smvms
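One thing worth double-checking here: the cluster name part of -t (the "vmgfs2" before the colon) has to match the corosync cluster name exactly, and -j needs at least one journal per node that will mount the filesystem (8 is plenty for 2 nodes). Something like:

grep cluster_name /etc/corosync/corosync.conf
tunegfs2 -l /dev/vmgfs2/smvms

(tunegfs2 -l is from gfs2-utils and just prints the superblock, including the lock table the filesystem was formatted with.)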
>
> 3) I then set up SCSI fencing for the shared block devices with stonith like so:
>
> pcs stonith create scsi-shooter fence_scsi pcmk_host_list="hyperv0001.cluster hyperv0002.cluster" devices=/dev/disk/by-id/wwn-0x6001405f0efeeeb51af42c09971344de,/dev/disk/by-id/wwn-0x6001405fbe61a5c9e234a81968805ac3,/dev/disk/by-id/wwn-0x60014055c9a285a61214064a7ad60ccd,/dev/disk/by-id/wwn-0x600140565c8b9cb3f874d7f89ef4afa1 meta provides=unfencing
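If you want to confirm the fencing side is working, once the stonith resource has started and the nodes have been unfenced, each node's key should show up as a registration on every LUN. With sg3_utils installed, something like:

sg_persist --in --read-keys --device=/dev/disk/by-id/wwn-0x6001405f0efeeeb51af42c09971344de

should list one key per cluster node.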
>
>
> 4) Now for the lvm.conf configuration; that part wasn't obvious. Since I do not have any LVM volumes on the hypervisors other than the block devices assigned specifically for shared storage, I didn't have any filters set up. (i.e. the hypervisors' main block devices are NVMe, and the other block devices are not part of a shared volume.)
>
> That being said, I set the following up in lvm.conf:
> use_lvmlockd = 1
> system_id_source = "none"
> volume_list = []
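A quick way to confirm those settings are actually being picked up (and that you are editing the lvm.conf the tools read) is lvmconfig, e.g.:

lvmconfig global/use_lvmlockd
lvmconfig global/system_id_source
lvmconfig activation/volume_list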
>
> Some more info:
>
> # vgs
> VG #PV #LV #SN Attr VSize VFree
> vmgfs2 4 1 0 wz--ns <1023.91g <973.91g
> and
> # vgs
> Reading VG vmgfs2 without a lock.
> VG #PV #LV #SN Attr VSize VFree
> vmgfs2 4 1 0 wz--ns <1023.91g <973.91g
>
> I assume the VG has to have the shared attribute set on both nodes.
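For what it's worth, the trailing "s" in the wz--ns attribute string already indicates the VG is shared; the lock manager details can be shown explicitly with something like:

vgs -o+lock_type,lock_args vmgfs2

(the exact field names vary a bit between LVM versions; vgs -o help lists them). For a dlm-backed shared VG the lock type should show up as dlm. The "Reading VG vmgfs2 without a lock." warning on the second node is usually a hint that lvmlockd, or its lockspace for that VG, is not actually up there.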
>
> Both machines have lvmlockd enabled at boot time.
>
> It was also suggested that I rebuild the initrd on the hypervisors so that the lvm.conf configuration options take effect earlier at startup.
>
> So I then updated my initrd like so:
> dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
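To confirm the rebuilt initramfs really carries the updated config, you can peek inside it with lsinitrd, e.g.:

lsinitrd -f etc/lvm/lvm.conf /boot/initramfs-$(uname -r).img | grep use_lvmlockd

(assuming dracut's lvm module included lvm.conf in the image; if that prints nothing, either the file isn't in the image or the setting didn't make it in).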
>
> 5) Next I defined an activation resource, which I took from some relatively recent documentation I found on the web:
> pcs resource create vmgfs2_pcs ocf:heartbeat:LVM-activate vgname=vmgfs2 vg_access_mode=lvmlockd --group vmgfs2
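For reference, the rest of the stack usually looks like a Filesystem resource for the GFS2 mount in the same group, the group cloned so it runs on both hypervisors, and ordering/colocation against the dlm clone. Roughly (the resource name and mount point here are just placeholders):

pcs resource create vmgfs2_fs ocf:heartbeat:Filesystem \
    device=/dev/vmgfs2/smvms directory=/var/lib/vmimages fstype=gfs2 \
    options=noatime op monitor interval=10s on-fail=fence --group vmgfs2
pcs resource clone vmgfs2 interleave=true
pcs constraint order start dlm-clone then vmgfs2-clone
pcs constraint colocation add vmgfs2-clone with dlm-clone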
>
> This, however, leads to some issues:
>
> Cluster name: vmgfs2
> Cluster Summary:
> * Stack: corosync
> * Current DC: hyperv0001.aesgi.com (version 2.1.4-4.fc36-dc6eb4362e) - partition with quorum
> * Last updated: Fri Dec 30 17:26:45 2022
> * Last change: Fri Dec 30 14:48:38 2022 by root via cibadmin on hyperv0001.aesgi.com
> * 2 nodes configured
> * 4 resource instances configured
>
> Node List:
> * Online: [ hyperv0001.aesgi.com hyperv0002.aesgi.com ]
>
> Full List of Resources:
> * Clone Set: dlm-clone [dlm]:
> * Started: [ hyperv0001.aesgi.com hyperv0002.aesgi.com ]
> * scsi-shooter (stonith:fence_scsi): Started hyperv0001.aesgi.com
> * Resource Group: vmgfs2:
> * vmgfs2_pcs (ocf::heartbeat:LVM-activate): Stopped
>
> Failed Resource Actions:
> * vmgfs2_pcs_start_0 on hyperv0002.aesgi.com 'not configured' (6): call=16, status='complete', exitreason='lvmlockd daemon is not running!', last-rc-change='Fri Dec 30 14:53:17 2022', queued=0ms, exec=1429ms
>
> Daemon Status:
> corosync: active/enabled
> pacemaker: active/enabled
> pcsd: active/enabled
>
> I am not sure why I get an 'lvmlockd not running' response, as it is configured to run at bootup.
Is it in fact running? The LVM-activate resource agent runs `pgrep
lvmlockd` to check that.
    # Good: lvmlockd is running, and clvmd is not running
    if ! pgrep lvmlockd >/dev/null 2>&1 ; then
            if ocf_is_probe; then
                    ocf_log info "initial probe: lvmlockd is not running yet."
                    exit $OCF_NOT_RUNNING
            fi
            ocf_exit_reason "lvmlockd daemon is not running!"
            exit $OCF_ERR_CONFIGURED
    fi
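So the first thing to check on hyperv0002 is whether the daemon is actually up when the resource starts, e.g. with pgrep -a lvmlockd or systemctl status lvmlockd.service.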
You might find this helpful (written for RHEL 9 but should be
applicable to Fedora):
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_high_availability_clusters/assembly_configuring-gfs2-in-a-cluster-configuring-and-managing-high-availability-clusters
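In that procedure, lvmlockd isn't started by systemd at boot at all; it runs as a cluster resource ordered after dlm, so the daemon is guaranteed to be up before anything tries to activate the shared VG. A rough sketch of that arrangement (resource and group names are just examples, and since you already have a standalone dlm clone you would either fold it into this group or keep it and only add the lvmlockd clone):

pcs resource create dlm --group locking ocf:pacemaker:controld \
    op monitor interval=30s on-fail=fence
pcs resource create lvmlockd --group locking ocf:heartbeat:lvmlockd \
    op monitor interval=30s on-fail=fence
pcs resource clone locking interleave=true
pcs constraint order start locking-clone then vmgfs2
pcs constraint colocation add vmgfs2 with locking-clone

(use vmgfs2-clone in the last two commands if you clone the group). If you go that route, it is probably cleanest to disable the systemd lvmlockd service so only the cluster manages the daemon.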
>
> But since I used a variety of locations to put together this information, I was wondering whether there is a single location that already has this specified?
>
--
Regards,
Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker