[ClusterLabs] cleanup of a resource leads to restart of Virtual Domains

Lentes, Bernd bernd.lentes at helmholtz-muenchen.de
Mon Sep 30 12:45:52 EDT 2019


>> 
>> Hi Yan,
>> I had a look at the logs. When I issued a "resource cleanup" of the GFS2
>> resource, the cluster deleted an entry in the status section:
>> 
>> Sep 26 14:52:52 [9317] ha-idg-2        cib:     info: cib_process_request:
>> Completed cib_delete operation for section  <=================================================
>> //node_state[@uname='ha-idg-1']//lrm_resource[@id='dlm']: OK (rc=0,  

>> Shortly afterwards it regarded dlm on ha-idg-1 as stopped (or stopped it):

>> Sep 26 14:52:54 [9321] ha-idg-2    pengine:     info: common_print:
>> dlm    (ocf::pacemaker:controld):      Stopped   <========================================

>> Sep 26 14:52:54 [9321] ha-idg-2    pengine:     info: common_print:
>> clvmd  (ocf::heartbeat:clvm):  Started ha-idg-1
>> Sep 26 14:52:54 [9321] ha-idg-2    pengine:     info: common_print:

>> 
>> According to the logs, dlm was running before. Does the deletion of that entry
>> lead to the stop of the dlm resource?
>> Is that expected behaviour?
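For reading excerpts like the ones quoted above, a small Python sketch can pull the resource name, agent, state and node out of the pengine "common_print" lines. The regular expression and the name parse_common_print are assumptions made for illustration. They are based only on the two lines quoted above, each joined onto one line and without the arrow marker. Other Pacemaker versions may format these lines differently.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern, derived only from the two log lines quoted in this mail.
COMMON_PRINT = re.compile(
    r"common_print:\s+(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+"
    r"(?P<state>Started|Stopped)(?:\s+(?P<node>[\w.-]+))?"
)


def parse_common_print(line: str) -> Optional[Tuple[str, str, str, Optional[str]]]:
    """Return (name, agent, state, node) or None if the line does not match."""
    m = COMMON_PRINT.search(line)
    if m is None:
        return None
    return m.group("name"), m.group("agent"), m.group("state"), m.group("node")


# The two complete log lines from the excerpt, re-joined onto single lines.
LOG_LINES = [
    "Sep 26 14:52:54 [9321] ha-idg-2    pengine:     info: common_print: "
    "dlm    (ocf::pacemaker:controld):      Stopped",
    "Sep 26 14:52:54 [9321] ha-idg-2    pengine:     info: common_print: "
    "clvmd  (ocf::heartbeat:clvm):  Started ha-idg-1",
]

for line in LOG_LINES:
    print(parse_common_print(line))
```

A resource reported as "Stopped" has no node after the state, so the node field comes back as None.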
> First, unless "force" is specified, a cleanup issued for a child resource
> will do the work for the whole resource group.

Ah. Then I will use "force" in the future when I just want to do
a "resource cleanup" for one resource in a group.
But is the initial deletion of the dlm entry in the status section
the expected behaviour when I do a "resource cleanup"?
Is it because dlm is the first resource in that group?
Sorry for insisting, but I'm really interested in understanding what was going on.

> Cleanup deletes the resources' history, which triggers a (re-)probe of the
> resources. But until the probe of a resource has finished, the
> resource will be shown as "Stopped", which doesn't necessarily mean it is
> actually stopped. A running resource will be detected as "Started"
> by the probe.
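To make sure I read this correctly, here is a minimal Python sketch of the sequence as described. It is a toy model and not Pacemaker's code or data model. The class Resource and its methods cleanup, probe and displayed_state are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Toy stand-in for one resource on one node (not Pacemaker's data model)."""
    name: str
    actually_running: bool                             # real state on the node
    history: List[str] = field(default_factory=list)   # recorded probe results

    def cleanup(self) -> None:
        # Cleanup only deletes the recorded history; the service is untouched.
        self.history.clear()

    def probe(self) -> None:
        # A probe checks the real state and records it as new history.
        self.history.append("running" if self.actually_running else "not running")

    def displayed_state(self) -> str:
        # Without history the resource is shown as "Stopped", even if it runs.
        if self.history and self.history[-1] == "running":
            return "Started"
        return "Stopped"


dlm = Resource("dlm", actually_running=True)
dlm.probe()
before = dlm.displayed_state()
dlm.cleanup()
during = dlm.displayed_state()   # history is gone, probe not finished yet
dlm.probe()
after = dlm.displayed_state()
print(before, during, after)
```

In this model the displayed state goes from "Started" to "Stopped" and back to "Started", while actually_running never changes.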

Does deleting the history mean resetting fail-count and last-failure?

> The VM was restarted because pengine/crmd thought the resources it depended
> on were really "Stopped" and wasn't patient enough to wait for their probes
> to finish. That's what the pull request resolved.
> 
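As I understand the explanation, the difference can be sketched as a small decision function. This is only an illustration of the described behaviour under my own assumptions, not the logic of the pull request. The function plan_for_dependent and its parameter wait_for_probe are hypothetical.

```python
from typing import Optional


def plan_for_dependent(dependency_history: Optional[str], wait_for_probe: bool) -> str:
    """Decide what happens to a resource (e.g. a VM) that depends on another one.

    dependency_history: None if the history was just cleaned up and the probe
    has not finished, otherwise "running" or "not running".
    wait_for_probe: hypothetical switch standing in for the patched behaviour.
    """
    if dependency_history == "running":
        return "keep running"
    if dependency_history is None and wait_for_probe:
        # The state is unknown, so no decision is made until the probe is done.
        return "wait for probe"
    # Either the dependency is really stopped, or an unknown state is
    # treated as "Stopped" (the impatient behaviour described above).
    return "restart"


impatient = plan_for_dependent(None, wait_for_probe=False)
patient = plan_for_dependent(None, wait_for_probe=True)
print(impatient, "/", patient)
```

With wait_for_probe set to False, a pending probe is treated like a stopped dependency, which corresponds to the VM restart I observed.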

I installed it. Is there a way to test it?

Thanks.

Bernd
 



