<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Times New Roman; font-size: 12pt; color: #000000'>Hi Andrew,<div><br></div><div>Thanks, that sounds good. I am using the Ubuntu HA PPA, so I will wait for a 1.1.7 package to become available.</div><div><br></div><div>Andrew<br><br><hr id="zwchr"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><b>From: </b>"Andrew Beekhof" &lt;andrew@beekhof.net&gt;<br><b>To: </b>"The Pacemaker cluster resource manager" &lt;pacemaker@oss.clusterlabs.org&gt;<br><b>Sent: </b>Thursday, March 29, 2012 1:08:21 AM<br><b>Subject: </b>Re: [Pacemaker] VirtualDomain Shutdown Timeout<br><br>On Sun, Mar 25, 2012 at 6:27 AM, Andrew Martin &lt;amartin@xes-inc.com&gt; wrote:<br>> Hello,<br>><br>> I have configured a KVM virtual machine primitive using Pacemaker 1.1.6 and<br>> Heartbeat 3.0.5 on Ubuntu 10.04 Server, using DRBD as the storage device (so<br>> there is no shared storage and no live migration):<br>> primitive p_vm ocf:heartbeat:VirtualDomain \<br>> params config="/vmstore/config/vm.xml" \<br>> meta allow-migrate="false" \<br>> op start interval="0" timeout="180s" \<br>> op stop interval="0" timeout="120s" \<br>> op monitor interval="10" timeout="30"<br>><br>> I would expect the following events to happen on failover on the "from" node<br>> (the migration source) if the VM hangs while shutting down:<br>> 1. VirtualDomain issues "virsh shutdown vm" to gracefully shut down the VM<br>> 2. pacemaker waits 120 seconds, the timeout specified in the "op stop"<br>> line<br>> 3. VirtualDomain waits a bit less than 120 seconds to see if the VM will<br>> shut down gracefully. Once it gets to almost 120 seconds, it issues "virsh<br>> destroy vm" to hard-stop the VM.<br>> 4. 
pacemaker wakes up from the 120-second timeout, sees that the VM has<br>> stopped, and proceeds with the failover<br>><br>> However, I observed that VirtualDomain seems to be using the timeout from<br>> the "op start" line, 180 seconds, while pacemaker uses the 120-second timeout.<br>> Thus, the VM is still running when the pacemaker timeout is reached, and so<br>> the node is STONITHed. Here is the relevant section of code from<br>> /usr/lib/ocf/resource.d/heartbeat/VirtualDomain:<br>> VirtualDomain_Stop() {<br>> local i<br>> local status<br>> local shutdown_timeout<br>> local out ex<br>><br>> VirtualDomain_Status<br>> status=$?<br>><br>> case $status in<br>> $OCF_SUCCESS)<br>> if ! ocf_is_true $OCF_RESKEY_force_stop; then<br>> # Issue a graceful shutdown request<br>> ocf_log info "Issuing graceful shutdown request for domain ${DOMAIN_NAME}."<br>> virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME}<br>> # The "shutdown_timeout" we use here is the operation<br>> # timeout specified in the CIB, minus 5 seconds<br>> shutdown_timeout=$(( $NOW + ($OCF_RESKEY_CRM_meta_timeout/1000) -5 ))<br>> # Loop on status until we reach $shutdown_timeout<br>> while [ $NOW -lt $shutdown_timeout ]; do<br>><br>> Doesn't $OCF_RESKEY_CRM_meta_timeout correspond to the timeout value in the<br>> "op stop ..." line?<br><br>It should; however, there was a bug in 1.1.6 where this wasn't the case.<br>The relevant patch is:<br> https://github.com/beekhof/pacemaker/commit/fcfe6fe<br><br>Or you could try 1.1.7.<br><br>><br>> How can I optimize my pacemaker configuration so that the VM will attempt to<br>> shut down gracefully and, in the worst case, be destroyed before the<br>> pacemaker timeout is reached? 
Moreover, is there anything I can do inside of<br>> the VM (another Ubuntu 10.04 install) to optimize/speed up the shutdown<br>> process?<br>><br>> Thanks,<br>><br>> Andrew<br>><br>><br>> _______________________________________________<br>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org<br>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker<br>><br>> Project Home: http://www.clusterlabs.org<br>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf<br>> Bugs: http://bugs.clusterlabs.org<br>><br></div><br></div></div></body></html>