[ClusterLabs] Lustre MDT/OST Mount Failures During Virtual Machine Reboot with Pacemaker

chenzufei at gmail.com
Mon Mar 3 03:52:24 UTC 2025


1. Background:
There are three physical servers, each running a KVM virtual machine. The virtual machines host Lustre services (MGS/MDS/OSS). Pacemaker is used to ensure high availability of the Lustre services.
Versions: Lustre 2.15.5, Corosync 3.1.5, Pacemaker 2.1.0-8.el8, pcs 0.10.8
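
For context, each MDT/OST is managed by Pacemaker through the ocf:heartbeat:Filesystem resource agent (as seen in the logs below). A minimal sketch of such a resource definition follows; the timeouts and operation options are illustrative assumptions, not the exact configuration of this cluster:

# Illustrative only -- timeouts and options are assumptions
pcs resource create ost-36 ocf:heartbeat:Filesystem \
    device="/dev/disk/by-id/virtio-ost-node28-3-36" \
    directory="/lustre/ost-36" fstype="lustre" \
    op monitor interval=30s timeout=120s \
    op start timeout=300s op stop timeout=300s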
2. Problem:
When a reboot command is issued on one of the virtual machines, the MDT/OST resources it hosts are taken over by the virtual machines on the other nodes. However, mounting these resources on the new node fails during the failover (Pacemaker retries the mount several times and eventually succeeds).
Workaround: before executing the reboot command, run pcs node standby <node-name> to move the resources away first (see the example below).
Question: is this an inherent issue with Pacemaker?
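
The workaround in practice looks like this (node name is an example):

# Drain the node before rebooting it, so Pacemaker stops/unmounts the
# MDT/OST resources cleanly and only then starts them on other nodes
pcs node standby lustre-oss-node28
# wait until 'pcs status' shows the resources running elsewhere, then
reboot
# after the node is back up, allow it to host resources again
pcs node unstandby lustre-oss-node28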
3. Analysis:
From the log analysis, it appears that the MDT/OST resources are being mounted on the target node before the unmount has completed on the source node. Multiple Mount Protection (MMP) on the target then detects that the source node has updated the MMP sequence number in the meantime, which causes the mount operation on the target node to fail.
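
As a quick check of the MMP state on the backing device, the MMP fields in the ldiskfs/ext4 superblock can be inspected; a sketch, assuming e2fsprogs is available in the VM and using the device path from the logs (field names may vary by e2fsprogs version):

# Run on the node that currently holds (or last held) the device
dumpe2fs -h /dev/disk/by-id/virtio-ost-node28-3-36 2>/dev/null | grep -i mmp
# Typical output includes the MMP block number and the MMP update
# interval, which determines how long a new mount must wait before
# it is allowed to proceed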
4. Logs:
Node 28 (Source Node):
Tue Feb 18 23:46:31 CST 2025    reboot

ll /dev/disk/by-id/virtio-ost-node28-3-36
lrwxrwxrwx 1 root root 9 Feb 18 23:47 /dev/disk/by-id/virtio-ost-node28-3-36 -> ../../vdy

Tue Feb 18 23:46:31 CST 2025
* ost-36_start_0 on lustre-oss-node29 'error' (1): call=769, status='complete', exitreason='Couldn't mount device [/dev/disk/by-id/virtio-ost-node28-3-36] as /lustre/ost-36', last-rc-change='Tue Feb 18 23:46:32 2025', queued=0ms, exec=21472ms

Feb 18 23:46:31 lustre-oss-node28 systemd[1]: Unmounting /lustre/ost-36...
Feb 18 23:46:31 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:186: czf MMP failure info: epoch:6609375025013, seq: 37, last update time: 1739893591, last update node: lustre-oss-node28, last update device: vdy
Feb 18 23:46:32 lustre-oss-node28 Filesystem(ost-36)[19748]: INFO: Running stop for /dev/disk/by-id/virtio-ost-node28-3-36 on /lustre/ost-36
Feb 18 23:46:32 lustre-oss-node28 pacemaker-controld[1700]: notice: Result of stop operation for ost-36 on lustre-oss-node28: ok
Feb 18 23:46:34 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:258: czf set mmp seq clean
Feb 18 23:46:34 lustre-oss-node28 kernel: LDISKFS-fs warning (device vdy): kmmpd:258: czf MMP failure info: epoch:6612033802827, seq: 4283256144, last update time: 1739893594, last update node: lustre-oss-node28, last update device: vdy
Feb 18 23:46:34 lustre-oss-node28 systemd[1]: Unmounted /lustre/ost-36.

Node 29 (Target Node):
/dev/disk/by-id/virtio-ost-node28-3-36 -> ../../vdt

Feb 18 23:46:32 lustre-oss-node29 Filesystem(ost-36)[451114]: INFO: Running start for /dev/disk/by-id/virtio-ost-node28-3-36 on /lustre/ost-36
Feb 18 23:46:32 lustre-oss-node29 kernel: LDISKFS-fs warning (device vdt): ldiskfs_multi_mount_protect:350: MMP interval 42 higher than expected, please wait.
Feb 18 23:46:53 lustre-oss-node29 kernel: czf, not equel, Current time: 23974372799987 ns, 37,4283256144
Feb 18 23:46:53 lustre-oss-node29 kernel: LDISKFS-fs warning (device vdt): ldiskfs_multi_mount_protect:364: czf MMP failure info: epoch:23974372801877, seq: 4283256144, last update time: 1739893594, last update node: lustre-oss-node28, last update device: vdy



chenzufei at gmail.com