[ClusterLabs] Antw: [EXT] Re: VirtualDomain does not stop via "crm resource stop" - modify RA ?

Strahil Nikolov hunter86_bg at yahoo.com
Mon Oct 26 12:54:18 EDT 2020


I think it's useful. For example, a HANA instance takes 10-15 minutes to power up (even more, depending on the storage tier), so the default timeout will be exceeded, and that's where the fun starts.

Maybe the cluster is just showing them without using them, but it looked quite the opposite.

Best Regards,
Strahil Nikolov

On Monday, 26 October 2020, 09:34:31 GMT+2, Ulrich Windl <ulrich.windl at rz.uni-regensburg.de> wrote:

>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on 23.10.2020 at 17:06 in
message <428616368.2019191.1603465603970 at mail.yahoo.com>:
> Why don't you work with something like this: 'op stop interval=300
> timeout=600'?

I always thought "interval=" does not make any sense for "start" and "stop",
but only for "monitor"...

> The stop operation will then time out according to your requirements,
> without modifying the script.
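
For illustration, a longer stop timeout could be set like this in crmsh. This is only a sketch: the resource name vm_example and the config path are made up, the monitor values are placeholders, and interval=0 on the stop operation reflects the usual non-recurring setting rather than the interval=300 quoted above.

```sh
# Hypothetical resource definition; adjust names, paths and values to your cluster.
crm configure primitive vm_example ocf:heartbeat:VirtualDomain \
    params config="/path/to/vm_example.xml" \
    op stop interval=0 timeout=600 \
    op monitor interval=30 timeout=30
```

The command needs a running cluster, so it is meant to be read rather than pasted as-is.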
> 
> Best Regards,
> Strahil Nikolov
> 
> On Thursday, 22 October 2020, 23:30:08 GMT+3, Lentes, Bernd
> <bernd.lentes at helmholtz-muenchen.de> wrote:
> 
> Hi guys,
> 
> occasionally, stopping a VirtualDomain resource via "crm resource stop"
> does not work, and in the end the node is fenced, which is ugly.
> I had a look at the RA to see what it does. After trying to stop the domain
> via "virsh shutdown ..." for a configurable time, it switches to "virsh
> destroy".
> I assume "virsh destroy" sends a SIGKILL to the respective process. But when
> the host is doing heavy I/O, the process may be in "D" state
> (uninterruptible sleep), in which it can't be terminated with a SIGKILL.
> The node the domain is running on is then fenced because of that.
> I dug deeper and found out that the signal is often delivered a bit later
> (just a few seconds) and the process is killed, but by then pacemaker has
> already decided to fence the node.
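
To check whether a process really is in uninterruptible sleep at the moment the stop hangs, its state can be read from /proc. A minimal sketch, assuming Linux and that the PID of the domain's qemu process is already known; proc_state and is_uninterruptible are made-up helper names, not part of the RA:

```sh
#!/bin/sh
# Sketch (Linux only): inspect the scheduler state of a process by PID.
# Both function names are hypothetical helpers for illustration.

proc_state() {
    # Print the one-letter state (R, S, D, Z, ...) from /proc/<pid>/status.
    # Prints nothing if the process does not exist.
    awk '/^State:/ { print $2; exit }' "/proc/$1/status" 2>/dev/null
}

is_uninterruptible() {
    # Succeed only if the process is currently in "D" state.
    [ "$(proc_state "$1")" = "D" ]
}
```

For example, `is_uninterruptible "$pid" && echo "still in D state"` could be run in a loop while the stop is in progress.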
> It's all about this excerpt from the RA:
> 
> force_stop()
> {
>        local out ex translate
>        local status=0
> 
>        ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>        out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>        ex=$?
>        translate=$(echo $out|tr 'A-Z' 'a-z')
>        echo >&2 "$translate"
>        case $ex$translate in
>                *"error:"*"domain is not running"*|*"error:"*"domain not found"*|\
>                *"error:"*"failed to get domain"*)
>                        : ;; # unexpected path to the intended outcome, all is well
>                [!0]*)
>                        ocf_exit_reason "forced stop failed"
>                        return $OCF_ERR_GENERIC ;;
>                0*)
>                        while [ $status != $OCF_NOT_RUNNING ]; do
>                                VirtualDomain_status
>                                status=$?
>                        done ;;
>        esac
>        return $OCF_SUCCESS
> }
> 
> I'm thinking about the following:
> How about letting the script wait a bit after "virsh destroy"? I saw that
> it usually takes just a few seconds until "virsh destroy" is successful.
> I'm thinking about this change:
> 
>        ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>        out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>        ex=$?
>        sleep 10    # <== proposed addition (or maybe configurable)
>        translate=$(echo $out|tr 'A-Z' 'a-z')
> 
> 
> What do you think ?
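
Instead of a fixed sleep, the wait could be bounded and end as soon as the domain is gone. A rough sketch of that idea, not the RA's actual code: wait_for_domain_stop is a hypothetical helper, the status function is passed in by name (in the RA it would presumably be VirtualDomain_status), and the OCF return codes are given defaults here only so the snippet runs standalone.

```sh
#!/bin/sh
# Standard OCF return codes; inside a real RA they are already set.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_NOT_RUNNING:=7}"

# Hypothetical helper: poll a status function once per second until it
# reports "not running" or the maximum wait has elapsed.
#   $1  maximum number of seconds to wait
#   $2  name of the status function or command to call
wait_for_domain_stop() {
    max_wait=$1
    status_cmd=$2
    waited=0
    while :; do
        "$status_cmd"
        [ $? -eq "$OCF_NOT_RUNNING" ] && return "$OCF_SUCCESS"
        [ "$waited" -ge "$max_wait" ] && return "$OCF_ERR_GENERIC"
        sleep 1
        waited=$((waited + 1))
    done
}
```

Whatever the maximum wait is, the whole stop action still has to finish within the stop operation timeout configured in the cluster; otherwise the stop counts as failed and the node is fenced anyway.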
> 
> Bernd
> 
> 
> -- 
> 
> Bernd Lentes 
> Systemadministration 
> Institute for Metabolism and Cell Death (MCD) 
> Building 25 - office 122 
> HelmholtzZentrum München 
> bernd.lentes at helmholtz-muenchen.de 
> phone: +49 89 3187 1241 
> phone: +49 89 3187 3827 
> fax: +49 89 3187 2294 
> http://www.helmholtz-muenchen.de/mcd 
> 
> stay healthy
> Helmholtz Zentrum München
> 
> 
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users 
> 
> ClusterLabs home: https://www.clusterlabs.org/ 



