[ClusterLabs] VirtualDomain restart caused fencing.
Matthew Schumacher
matt.s at aptalaska.net
Wed Jun 30 11:40:46 EDT 2021
Hello,
I'm not sure how to fix this, but calling 'crm resource restart vm-name' this morning caused an entire node to get fenced, kicking the stool out from under a number of VMs.
Looking at VirtualDomain, it looks like the system defaults to a 90s timeout, and if it can't gracefully shut down the VM with 'virsh shutdown' in 85s, then it calls 'virsh destroy'. For whatever reason, that's not what happened.
I created a mockup where I moved a test VM to its own node (in case it gets fenced), then loaded something that would ignore ACPI shutdown, then called restart. This time it worked. The logs show:
Jun 30 15:32:11 VirtualDomain(vm-testvm)[13047]: INFO: Issuing graceful shutdown request for domain testvm.
Jun 30 15:32:26 VirtualDomain(vm-testvm)[13047]: INFO: Issuing forced shutdown (destroy) request for domain testvm.
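The two log lines above are 15 seconds apart, not 85. A minimal sketch of the arithmetic, assuming the agent derives its graceful-shutdown window as the stop operation's timeout minus 5 seconds (which is what the 90s/85s figures suggest) and assuming the stop ran with a 20s timeout, which is Pacemaker's default operation timeout when none is set on the operation. The function name graceful_window_s is hypothetical and only illustrates the calculation:

```shell
#!/bin/sh
# Hypothetical helper: graceful-shutdown window in seconds, given the
# stop operation timeout in milliseconds. Assumes window = timeout - 5s.
graceful_window_s() {
    echo $(( $1 / 1000 - 5 ))
}

# Advisory 90s stop timeout: 85s before 'virsh destroy'
graceful_window_s 90000

# Assumed 20s default operation timeout: 15s, matching the log gap
graceful_window_s 20000
```

If that assumption holds, the test VM was stopped with a 20s timeout rather than 90s, because the resource defines no stop operation of its own.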
I don't have the logs from the original failure due to my node not being persistent, but I wonder if anyone else has run into this.
Here is my resource configuration if that reveals the issue:
crm configure primitive vm-testvm2 VirtualDomain params config="/datastore/vm/testvm/testvm.xml" migration_transport=ssh meta allow-migrate=true target-role=Started op monitor timeout=30 interval=30
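The configuration above sets a timeout only on the monitor operation. As a sketch, assuming the goal is for the stop operation to get the full 90s, an explicit stop operation could be added to the same primitive. The 90s value is taken from the figure quoted earlier and is not a verified recommendation for this cluster:

```shell
# Same primitive as above, with an explicit stop timeout (assumed value: 90s)
crm configure primitive vm-testvm2 VirtualDomain \
    params config="/datastore/vm/testvm/testvm.xml" migration_transport=ssh \
    meta allow-migrate=true target-role=Started \
    op monitor timeout=30 interval=30 \
    op stop timeout=90 interval=0
```

A guest that needs longer than the graceful window to shut down would still be destroyed, so the value should be sized to the slowest VM.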
Oh, one last question: Can I disable fencing for a specific resource for testing reasons? I'd love to watch this break without fear of fencing.
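One possible approach, offered as a sketch rather than a confirmed answer: Pacemaker operations accept an on-fail setting, and on-fail=block on the stop operation tells the cluster to leave the resource blocked instead of fencing the node when the stop fails. The resource name and the 90s timeout below are carried over from the configuration above as assumptions:

```shell
# Test-only sketch: a failed stop blocks the resource instead of fencing the node
crm configure primitive vm-testvm2 VirtualDomain \
    params config="/datastore/vm/testvm/testvm.xml" migration_transport=ssh \
    meta allow-migrate=true target-role=Started \
    op monitor timeout=30 interval=30 \
    op stop timeout=90 interval=0 on-fail=block
```

With this setting a failed stop leaves the resource unmanaged until it is cleaned up by hand, so it is probably only suitable for a test VM.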
Matt