[ClusterLabs] Lustre MDT/OST Mount Failures During Virtual Machine Reboot with Pacemaker
Oyvind Albrigtsen
oalbrigt at redhat.com
Mon Mar 3 09:03:06 UTC 2025
You need the systemd drop-in functionality introduced in RHEL 9.3 to
avoid this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=2184779
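For context: during shutdown, systemd treats the Pacemaker-controlled mount as an
ordinary mount unit and unmounts it itself (the "systemd[1]: Unmounting
/lustre/ost-36..." line in your log), racing with Pacemaker's own stop and the
failover to the other node. The drop-in functionality orders such mount units
Before=pacemaker.service, so at shutdown (where stop order is the reverse of start
order) systemd waits for Pacemaker to stop the resource before it would touch the
mount itself. Conceptually the drop-in looks something like the sketch below; the
exact path and file name the updated agent uses may differ:

  # e.g. /run/systemd/system/lustre-ost\x2d36.mount.d/99-pacemaker.conf (illustrative)
  [Unit]
  Description=Cluster-controlled mount for /lustre/ost-36
  Before=pacemaker.service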
Oyvind
On 03/03/25 11:52 +0800, chenzufei at gmail.com wrote:
>1. Background:
>There are three physical servers, each running a KVM virtual machine. The virtual machines host the Lustre services (MGS/MDS/OSS), and Pacemaker is used to keep these services highly available (an illustrative resource definition is shown below).
>lustre(2.15.5) + corosync(3.1.5) + pacemaker(2.1.0-8.el8) + pcs(0.10.8)
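>(For reference, each target is managed with the ocf:heartbeat:Filesystem agent; the definition below is only a simplified illustration of one OST resource, not our exact configuration:)
>pcs resource create ost-36 ocf:heartbeat:Filesystem \
>    device=/dev/disk/by-id/virtio-ost-node28-3-36 \
>    directory=/lustre/ost-36 fstype=lustre \
>    op monitor interval=30s timeout=60s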
>2. Problem:
>When a reboot command is issued on one of the virtual machines, the MDT/OST resources are taken over by the virtual machines on the other nodes. However, mounting these resources on the new node fails during the failover (Pacemaker retries the mount several times and eventually succeeds).
>Workaround: before executing the reboot command, run pcs node standby <node-name> to move the resources away (see the example commands below).
>Question: is this an inherent issue with Pacemaker?
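>In practice the workaround looks like this (node name as in our setup; resources are drained before the reboot and the node is released afterwards):
>pcs node standby lustre-oss-node28
>reboot
># ...after the node has rebooted and rejoined the cluster:
>pcs node unstandby lustre-oss-node28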
>3. Analysis:
>From the log analysis, it appears that the MDT/OST resources are being mounted on the target node before the unmount has completed on the source node. The Multiple Mount Protection (MMP) check during the mount on the target node sees that the source node is still updating the MMP sequence number, which causes the mount on the target node to fail.
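>(The MMP parameters involved can be read from the target's superblock with the Lustre-patched e2fsprogs, e.g. on the node that currently holds the device; device name as seen on node 29:)
>dumpe2fs -h /dev/vdt 2>/dev/null | grep -i mmp
># typically shows "MMP block number" and "MMP update interval"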
>4. Logs:
>Node 28 (Source Node):
>Tue Feb 18 23:46:31 CST 2025 reboot
>
>ll /dev/disk/by-id/virtio-ost-node28-3-36
>lrwxrwxrwx 1 root root 9 Feb 18 23:47 /dev/disk/by-id/virtio-ost-node28-3-36 -> ../../vdy
>
>Tue Feb 18 23:46:31 CST 2025
>* ost-36_start_0 on lustre-oss-node29 'error' (1): call=769, status='complete', exitreason='Couldn't mount device [/dev/disk/by-id/virtio-ost-node28-3-36] as /lustre/ost-36', last-rc-change='Tue Feb 18 23:46:32 2025', queued=0ms, exec=21472ms
>
>Feb 18 23:46:31 lustre-oss-node28 systemd[1]: Unmounting /lustre/ost-36...
>Feb 18 23:46:31 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:186: czf MMP failure info: epoch:6609375025013, seq: 37, last update time: 1739893591, last update node: lustre-oss-node28, last update device: vdy
>Feb 18 23:46:32 lustre-oss-node28 Filesystem(ost-36)[19748]: INFO: Running stop for /dev/disk/by-id/virtio-ost-node28-3-36 on /lustre/ost-36
>Feb 18 23:46:32 lustre-oss-node28 pacemaker-controld[1700]: notice: Result of stop operation for ost-36 on lustre-oss-node28: ok
>Feb 18 23:46:34 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:258: czf set mmp seq clean
>Feb 18 23:46:34 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:258: czf MMP failure info: epoch:6612033802827, seq: 4283256144, last update time: 1739893594, last update node: lustre-oss-node28, last update device: vdy
>Feb 18 23:46:34 lustre-oss-node28 systemd[1]: Unmounted /lustre/ost-36.
>
>Node 29 (Target Node):
>/dev/disk/by-id/virtio-ost-node28-3-36 -> ../../vdt
>
>Feb 18 23:46:32 lustre-oss-node29 Filesystem(ost-36)[451114]: INFO: Running start for /dev/disk/by-id/virtio-ost-node28-3-36 on /lustre/ost-36
>Feb 18 23:46:32 lustre-oss-node29 kernel: LDISKFS-fs warning (device vdt): ldiskfs_multi_mount_protect:350: MMP interval 42 higher than expected, please wait.
>Feb 18 23:46:53 lustre-oss-node29 kernel: czf, not equel, Current time: 23974372799987 ns, 37,4283256144
>Feb 18 23:46:53 lustre-oss-node29 kernel: LDISKFS-fs warning (device vdt): ldiskfs_multi_mount_protect:364: czf MMP failure info: epoch:23974372801877, seq: 4283256144, last update time: 1739893594, last update node: lustre-oss-node28, last update device: vdy
>
>
>
>chenzufei at gmail.com