[ClusterLabs] Unfencing cause resource restarts
Pavel Levshin
lpk at 581.spb.su
Tue Oct 11 15:33:12 UTC 2016
11.10.2016 17:40, Ken Gaillot:
> On 10/11/2016 07:06 AM, Pavel Levshin wrote:
>> Hi!
>>
>>
>> In continuation of previous mails, I now have a more complex setup. Our
>> hardware is capable of two STONITH methods: ILO and SCSI persistent
>> reservations on shared storage. The first method works fine; nevertheless,
>> in the past we sometimes faced problems with inaccessible ILO devices or
>> something... So, we would like to have SCSI fencing as an additional method.
>>
>> The problem: when node 2 recovers, some resources are just stopped and
>> restarted on node 1. As far as I understand, primitive resources are
>> affected, but clone instances are not.
>>
>> In the example below, when bvnode2 recovers, vm_smartbv1 is restarted on
>> bvnode1, and vm_smartbv2 is live-migrated without interruption to
>> bvnode2. All other resources are clones working on bvnode1 and they are
>> unaffected.
>>
>> If I set "meta requires=fencing" for vm resources, they are not
>> restarted anymore. But why does unfencing of bvnode2 affect resources
>> running on bvnode1?
> That does seem odd.
>
> Something I notice in the config below is that only the ILO devices are
> listed in the fence topology, and the only fence level is "10". Valid
> indexes are 1 to 9, so this should have produced a log error about "Bad
> topology".
>
> If you want the storage fencing as a fallback in case ILO fails, you
> want the devices in two levels, e.g. level 1 = ILO, level 2 = storage.
There were levels 10 and 20 earlier, and this worked (aside from the
problem with unwanted restarts). The docs say that fencing levels are numeric
and tried in ascending order; there is no visible restriction on those
numbers. There were no errors about bad topology. Levels come into play when
it is time to fence someone, which does not happen here.

So I assume that levels have nothing to do with the problem. Now the
topology is:
ilo.bvnode2 (stonith:fence_ilo4): Started bvnode1
ilo.bvnode1 (stonith:fence_ilo4): Started bvnode2
storage.bvnode1 (stonith:fence_mpath): Started bvnode1
storage.bvnode2 (stonith:fence_mpath): Started bvnode2

Node: bvnode1
  Level 10 - ilo.bvnode1
  Level 20 - storage.bvnode1
Node: bvnode2
  Level 10 - ilo.bvnode2
  Level 20 - storage.bvnode2
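
As a rough illustration of the "tried in ascending order" behaviour, here is
a toy Python model. It is only a sketch and not Pacemaker code: the fence()
function and the ilo_down() simulation are hypothetical, and only the device
names and level numbers come from the listing above. It assumes that every
device in a level must succeed for that level to count.

```python
from typing import Callable, Dict, List, Optional

# Toy model of a per-node fencing topology: level number -> device names.
# Device names and levels mirror the listing above; the rest is hypothetical.
TOPOLOGY: Dict[str, Dict[int, List[str]]] = {
    "bvnode1": {10: ["ilo.bvnode1"], 20: ["storage.bvnode1"]},
    "bvnode2": {10: ["ilo.bvnode2"], 20: ["storage.bvnode2"]},
}


def fence(node: str, run_device: Callable[[str], bool]) -> Optional[int]:
    """Try levels in ascending order.

    Returns the first level whose devices all succeed, or None if every
    level fails.
    """
    for level in sorted(TOPOLOGY[node]):
        if all(run_device(dev) for dev in TOPOLOGY[node][level]):
            return level
    return None


def ilo_down(device: str) -> bool:
    """Simulate unreachable ILO devices: only storage fencing succeeds."""
    return device.startswith("storage.")


if __name__ == "__main__":
    # With ILO unreachable, the model falls back to the storage level.
    print(fence("bvnode2", ilo_down))
```

In this model the absolute level numbers do not matter, only their relative
order; whether the cluster itself accepts indexes above 9 is a separate
question that the sketch does not answer.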
--
Pavel Levshin