[ClusterLabs] Antw: [EXT] VirtualDomain does not stop via "crm resource stop" - modify RA ?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Oct 23 02:06:03 EDT 2020


>>> "Lentes, Bernd" <bernd.lentes at helmholtz-muenchen.de> wrote on 22.10.2020
at 22:29 in message
<684655755.160569.1603398597074.JavaMail.zimbra at helmholtz-muenchen.de>:
> Hi guys,
> 
> Occasionally, stopping a VirtualDomain resource via "crm resource stop" does
> not work, and in the end the node is fenced, which is ugly.
> I had a look at the RA to see what it does. After trying to stop the domain
> via "virsh shutdown ..." for a configurable time, it switches to "virsh
> destroy".
> I assume "virsh destroy" sends a SIGKILL to the respective process. But when


In "good old Xen xm", a "destroy" tears down the VM by simply throwing it out
of memory; from the VM's perspective it is very much like cutting the power
(no processes finish and no buffered writes complete). So you typically want
to set a sufficiently long shutdown timeout to avoid that.
The thing to worry about is the shutdown inside the VM, not outside.

> the host is doing heavy I/O, it's possible that the process is in "D" state
> (uninterruptible sleep),
> in which it can't be terminated by a SIGKILL. The node the domain is
> running on is then fenced because of that.
> I dug deeper and found out that the signal is often delivered a bit later
> (just a few seconds) and the process is killed, but Pacemaker has already
> decided to fence the node.
> It's all about this excerpt from the RA:
> 
> force_stop()
> {
>         local out ex translate
>         local status=0
> 
>         ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>         out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>         ex=$?
>         translate=$(echo $out|tr 'A-Z' 'a-z')
>         echo >&2 "$translate"
>         case $ex$translate in
>                 *"error:"*"domain is not running"*|*"error:"*"domain not found"*|\
>                 *"error:"*"failed to get domain"*)
>                         : ;; # unexpected path to the intended outcome, all is well
>                 [!0]*)
>                         ocf_exit_reason "forced stop failed"
>                         return $OCF_ERR_GENERIC ;;
>                 0*)
>                         while [ $status != $OCF_NOT_RUNNING ]; do
>                                 VirtualDomain_status
>                                 status=$?
>                         done ;;
>         esac
>         return $OCF_SUCCESS
> }
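
The "D" state mentioned above can be checked on Linux by reading the process status from /proc. The following is only a minimal sketch: `proc_state` and `is_uninterruptible` are hypothetical helper names that are not part of the RA, and how you find the PID of the domain's process is left open here.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the VirtualDomain RA.
# Linux-specific: relies on the "State:" line in /proc/<pid>/status.

# Print the one-letter state of a process (e.g. R, S, D, Z).
# Prints nothing if the PID does not exist.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# Succeed if the process is in uninterruptible sleep ("D").
# In that state a pending SIGKILL only takes effect once the
# process leaves the uninterruptible sleep.
is_uninterruptible() {
    [ "$(proc_state "$1")" = "D" ]
}
```

Calling `is_uninterruptible <pid>` right after a failed destroy would show whether the process is merely stuck in I/O for the moment.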
> 
> I'm thinking about the following:
> How about letting the script wait a bit after "virsh destroy"? I saw that
> it usually takes just a few seconds until "virsh destroy" is successful.
> The change would look like this:
> 
>         ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>         out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>         ex=$?
>         sleep 10    # <== proposed addition (or maybe configurable)
>         translate=$(echo $out|tr 'A-Z' 'a-z')
> 
> 
> What do you think ?

Destroy will need a little time, so it may be worth trying that.
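
A fixed sleep always costs the full delay, even when the domain is gone after one second. One possible variation, sketched here under assumptions, is to poll the domain status with a deadline instead. `wait_for_destroy` and `domain_status` are hypothetical names: `domain_status` is a stand-in for the RA's `VirtualDomain_status`, and the return codes are the standard OCF values. Whatever the wait is, it has to fit within the stop operation's timeout, otherwise the stop still counts as failed.

```shell
#!/bin/sh
# Sketch only: a bounded wait instead of a fixed "sleep 10".
# Function names are hypothetical; return codes are standard OCF values.

OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Stand-in for VirtualDomain_status so the sketch is self-contained:
# reports "running" for $polls_left calls, then "not running".
polls_left=2
domain_status() {
    if [ "$polls_left" -gt 0 ]; then
        polls_left=$((polls_left - 1))
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}

# Poll once per second for at most $1 seconds.
# Returns OCF_SUCCESS as soon as the domain is gone,
# OCF_ERR_GENERIC if it is still there after the deadline.
wait_for_destroy() {
    max_wait=$1
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        domain_status
        [ $? -eq "$OCF_NOT_RUNNING" ] && return "$OCF_SUCCESS"
        sleep 1
        waited=$((waited + 1))
    done
    # One last check after the final sleep.
    domain_status
    [ $? -eq "$OCF_NOT_RUNNING" ] && return "$OCF_SUCCESS"
    return "$OCF_ERR_GENERIC"
}
```

In force_stop() such a call could go where the fixed sleep was proposed, for example `wait_for_destroy 10`, with the limit taken from a parameter if it should be configurable.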

Regards,
Ulrich

> 
> Bernd
> 