[ClusterLabs] two node cluster: vm starting - shutting down 15min later - starting again 15min later ... and so on

Lentes, Bernd bernd.lentes at helmholtz-muenchen.de
Fri Feb 10 12:49:27 UTC 2017

----- On Feb 10, 2017, at 1:10 AM, Ken Gaillot kgaillot at redhat.com wrote:

> On 02/09/2017 10:48 AM, Lentes, Bernd wrote:
>> Hi,
>> I have a two-node cluster with a VM as a resource. Currently I'm just testing
>> and playing around. My VM boots and then shuts down again in 15-minute intervals.
>> Surely this is related to "PEngine Recheck Timer (I_PE_CALC) just popped
>> (900000ms)" found in the logs. I googled, and it is said that this
>> is due to a time-based rule
>> (http://oss.clusterlabs.org/pipermail/pacemaker/2009-May/001647.html). OK.
>> But I don't have any time-based rules.
>> This is the config for my vm:
>> primitive prim_vm_mausdb VirtualDomain \
>>         params config="/var/lib/libvirt/images/xml/mausdb_vm.xml" \
>>         params hypervisor="qemu:///system" \
>>         params migration_transport=ssh \
>>         op start interval=0 timeout=90 \
>>         op stop interval=0 timeout=95 \
>>         op monitor interval=30 timeout=30 \
>>         op migrate_from interval=0 timeout=100 \
>>         op migrate_to interval=0 timeout=120 \
>>         meta allow-migrate=true \
>>         meta target-role=Started \
>>         utilization cpu=2 hv_memory=4099
>> The only constraint concerning the VM I had was a location constraint (which I
>> didn't create).
> What is the constraint? If its ID starts with "cli-", it was created by
> a command-line tool (such as crm_resource, the crm shell, or pcs), generally
> by a "move" or "ban" command.
I deleted the one I mentioned, but now I have two again, and I didn't create them.
Does crm create constraints by itself?

location cli-ban-prim_vm_mausdb-on-ha-idg-2 prim_vm_mausdb role=Started -inf: ha-idg-2
location cli-prefer-prim_vm_mausdb prim_vm_mausdb role=Started inf: ha-idg-2

One location constraint with inf and one with -inf, for the same resource on the same node.
Isn't that contradictory?
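For what it's worth, here is a sketch of how such "cli-" constraints usually come about and how they can be cleared again (assuming the crmsh shell and pacemaker's crm_resource; exact option names may differ between versions):

```shell
# "crm resource migrate <rsc> <node>" leaves behind a cli-prefer-* location
# constraint, and "crm resource ban" leaves a cli-ban-* constraint. They stay
# in the CIB until explicitly removed. To clear them for prim_vm_mausdb:
crm resource unmigrate prim_vm_mausdb

# Low-level equivalent with pacemaker's own tool:
crm_resource --resource prim_vm_mausdb --clear
```

After clearing, the resource is again placed purely according to stickiness and any remaining (non-cli) constraints.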

"crm resource scores" shows -inf for that resource on that node:
native_color: prim_vm_mausdb allocation score on ha-idg-1: 100
native_color: prim_vm_mausdb allocation score on ha-idg-2: -INFINITY

Is -inf stronger ?
Is it true that only the "native_color" values are the ones that matter ?
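As a side note on inspecting scores: the per-node allocation scores can also be dumped from the live cluster with crm_simulate (this assumes a pacemaker version where the -s/-L flags are available). Since scores are summed and -INFINITY plus anything is still -INFINITY, a -inf constraint always wins over any finite positive score:

```shell
# Dump the current allocation scores from the live CIB
# (-L = use the live cluster, -s = show scores):
crm_simulate -sL | grep prim_vm_mausdb
```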

A basic question: when I have trouble starting/stopping/migrating resources,
does it make sense to do a "crm resource cleanup" before trying again
(besides finding the cause of the trouble)?
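For reference, a cleanup would look like this (again assuming crmsh and crm_resource; it clears the resource's failcounts and operation history so the cluster re-probes its actual state):

```shell
# Clear failcounts and operation history for the resource on all nodes:
crm resource cleanup prim_vm_mausdb

# Low-level equivalent:
crm_resource --resource prim_vm_mausdb --cleanup
```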

Sorry for asking basic stuff. I read a lot beforehand, but in practice it's quite different.
Although I have just one VM as a resource, and I'm only testing, I'm sometimes astonished by the
complexity of a simple two-node cluster: scores, failcounts, constraints, default values for a lot of variables ...
you have to keep an eye on a lot of things.


Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
Aufsichtsratsvorsitzende: MinDir'in Baerbel Brumme-Bothe
Geschaeftsfuehrer: Prof. Dr. Guenther Wess, Heinrich Bassler, Dr. Alfons Enhsen
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671
