[ClusterLabs] Antw: Re: Antw: [EXT] Re: VirtualDomain does not stop via "crm resource stop" - modify RA ?
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Tue Oct 27 03:27:33 EDT 2020
>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on 26.10.2020 at 17:54 in
message <293971081.2615607.1603731258371 at mail.yahoo.com>:
> I think it's useful - for example, a HANA instance takes 10-15 min to power
> up (even more, depending on the storage tier) - so the default will time out
> and the fun starts there.
Hi!
VMs are a classical case where "one size fits all" doesn't work: For migration
we need customized timeouts that depend on the size of VM RAM and the ratio of
dirty pages (write-heavy databases with big buffers are bad candidates for
live-migration, for example). OTOH you don't want your timeouts to be longer
than necessary in case something goes wrong. Well, you can never cover 100%,
but 95-99% is rather good.
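
As a rough illustration of how such a per-VM timeout could be derived, here is
a minimal sketch. The model is an assumption of this example, not something any
resource agent does: pre-copy migration has to send the guest RAM plus whatever
gets re-dirtied in the meantime, so the time is approximated as
RAM / (bandwidth - dirty rate), multiplied by a safety factor. The function
name and all parameters are hypothetical.

```shell
#!/bin/sh
# Rough migration-timeout estimate (hypothetical helper, not part of any RA).
# Assumed model: seconds ~ RAM / (bandwidth - dirty_rate) * safety factor.
estimate_migrate_timeout() {
    ram_mib=$1        # guest RAM in MiB
    bw_mibps=$2       # usable migration bandwidth in MiB/s
    dirty_mibps=$3    # rate at which the guest dirties memory, in MiB/s
    safety=${4:-2}    # integer safety factor (defaults to 2)

    # If the guest dirties memory as fast as we can send it, this simple
    # model never converges; report that instead of dividing by zero.
    if [ "$dirty_mibps" -ge "$bw_mibps" ]; then
        echo "migration would not converge in this model" >&2
        return 1
    fi

    denom=$(( bw_mibps - dirty_mibps ))
    # Integer division, rounded up to whole seconds.
    echo $(( (ram_mib * safety + denom - 1) / denom ))
}

# 16 GiB guest, 1000 MiB/s link, 200 MiB/s dirty rate, default safety factor.
estimate_migrate_timeout 16384 1000 200
```

With these example numbers the sketch prints 41 (seconds). A write-heavy
database whose dirty rate approaches the link bandwidth makes the denominator
shrink and the estimate explode, which matches the observation that such VMs
are poor live-migration candidates.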
Regards,
Ulrich
>
> Maybe the cluster is just showing them without using them, but it looked
> quite the opposite.
>
> Best Regards,
> Strahil Nikolov
>
> On Monday, 26 October 2020, 09:34:31 GMT+2, Ulrich Windl
> <ulrich.windl at rz.uni-regensburg.de> wrote:
>
>>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on 23.10.2020 at 17:06 in
> message <428616368.2019191.1603465603970 at mail.yahoo.com>:
>> Why don't you work with something like this: 'op stop interval=300
>> timeout=600'?
>
> I always thought "interval=" does not make any sense for "start" and "stop",
> but only for "monitor"...
>
>> The stop operation will timeout at your requirements without modifying the
>> script.
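
For reference, a stop timeout set on the resource itself might look like the
following crm shell configuration fragment. The resource name and the path to
the libvirt XML are made up for illustration. interval=0 is used for stop
because, as far as I know, a non-zero interval is only meaningful for recurring
operations such as monitor.

```
primitive vm_example ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/vm_example.xml" \
    op stop interval=0 timeout=600 \
    op monitor interval=30 timeout=30
```

The timeout values here are placeholders and would need to be tuned per VM.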
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Thursday, 22 October 2020, 23:30:08 GMT+3, Lentes, Bernd
>> <bernd.lentes at helmholtz-muenchen.de> wrote:
>>
>> Hi guys,
>>
>> Occasionally stopping a VirtualDomain resource via "crm resource stop" does
>> not work, and in the end the node is fenced, which is ugly.
>> I had a look at the RA to see what it does. After trying to stop the domain
>> via "virsh shutdown ..." for a configurable time, it switches to "virsh
>> destroy".
>> I assume "virsh destroy" sends a SIGKILL to the respective process. But when
>> the host is doing heavy I/O, it's possible that the process is in "D" state
>> (uninterruptible sleep), in which it can't be terminated with a SIGKILL. The
>> node the domain is running on is then fenced because of that.
>> I dug deeper and found out that the signal is often delivered a bit later
>> (just some seconds) and the process is killed, but pacemaker has already
>> decided to fence the node.
>> It's all about this excerpt from the RA:
>>
>> force_stop()
>> {
>>     local out ex translate
>>     local status=0
>>
>>     ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>>     out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>>     ex=$?
>>     translate=$(echo $out|tr 'A-Z' 'a-z')
>>     echo >&2 "$translate"
>>     case $ex$translate in
>>         *"error:"*"domain is not running"*|*"error:"*"domain not found"*|\
>>         *"error:"*"failed to get domain"*)
>>             : ;; # unexpected path to the intended outcome, all is well
>>         [!0]*)
>>             ocf_exit_reason "forced stop failed"
>>             return $OCF_ERR_GENERIC ;;
>>         0*)
>>             while [ $status != $OCF_NOT_RUNNING ]; do
>>                 VirtualDomain_status
>>                 status=$?
>>             done ;;
>>     esac
>>     return $OCF_SUCCESS
>> }
>>
>> I'm thinking about the following:
>> How about letting the script wait a bit after "virsh destroy"? I saw that
>> it usually takes just a few seconds until "virsh destroy" succeeds.
>> I'm thinking about this change:
>>
>>     ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>>     out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>>     ex=$?
>>     sleep 10   # <============================ (or maybe configurable)
>>     translate=$(echo $out|tr 'A-Z' 'a-z')
>>
>>
>> What do you think ?
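
One possible variation on the fixed sleep, sketched here under assumptions:
instead of always waiting 10 seconds, poll until the domain is gone or a
deadline passes, and only then report failure. This is not taken from the RA.
domain_is_running is a hypothetical stand-in for the RA's own status check and
is stubbed below so the example runs on its own; wait_for_domain_gone and its
parameters are likewise invented for illustration.

```shell
#!/bin/sh
# Sketch: bounded polling instead of a fixed "sleep 10".
# domain_is_running is a hypothetical stand-in for the RA's status check.
POLLS_LEFT=3
domain_is_running() {
    # Stub: report "running" for the first POLLS_LEFT polls, then "gone".
    [ "$POLLS_LEFT" -gt 0 ] || return 1
    POLLS_LEFT=$((POLLS_LEFT - 1))
    return 0
}

# wait_for_domain_gone MAX_SECONDS [POLL_INTERVAL]
# Returns 0 once the domain is gone, 1 if it is still there at the deadline.
# Variables are deliberately global so the caller can read $waited afterwards.
wait_for_domain_gone() {
    max_wait=$1
    interval=${2:-1}
    waited=0
    while domain_is_running; do
        # Give up once the deadline is reached.
        [ "$waited" -ge "$max_wait" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

if wait_for_domain_gone 10 1; then
    echo "domain gone after ${waited}s"
else
    echo "domain still present after ${waited}s"
fi
```

With the stub above this prints "domain gone after 3s". Compared with a fixed
sleep, the stop returns as soon as the delayed SIGKILL takes effect, and the
upper bound keeps the wait from running into the operation timeout, provided
the bound is chosen well below that timeout.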
>>
>> Bernd
>>
>>
>> --
>>
>> Bernd Lentes
>> Systemadministration
>> Institute for Metabolism and Cell Death (MCD)
>> Building 25 - office 122
>> HelmholtzZentrum München
>> bernd.lentes at helmholtz-muenchen.de
>> phone: +49 89 3187 1241
>> phone: +49 89 3187 3827
>> fax: +49 89 3187 2294
>> http://www.helmholtz-muenchen.de/mcd
>>
>> stay healthy
>> Helmholtz Zentrum München
>>
>> Helmholtz Zentrum Muenchen
>> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
>> Ingolstaedter Landstr. 1
>> 85764 Neuherberg
>> www.helmholtz-muenchen.de
>> Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling
>> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin
>> Guenther
>> Registergericht: Amtsgericht Muenchen HRB 6466
>> USt-IdNr: DE 129521671
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/