[Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs
Andrew Beekhof
andrew at beekhof.net
Mon Jul 1 07:14:35 EDT 2013
On 01/07/2013, at 5:32 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 29.06.2013 02:22, Andrew Beekhof wrote:
>>
>> On 29/06/2013, at 12:22 AM, Digimer <lists at alteeve.ca> wrote:
>>
>>> On 06/28/2013 06:21 AM, Andrew Beekhof wrote:
>>>>
>>>> On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree <lmb at suse.com> wrote:
>>>>
>>>>> On 2013-06-27T12:53:01, Digimer <lists at alteeve.ca> wrote:
>>>>>
>>>>>> primitive fence_n01_psu1_off stonith:fence_apc_snmp \
>>>>>>     params ipaddr="an-p01" pcmk_reboot_action="off" port="1" \
>>>>>>     pcmk_host_list="an-c03n01.alteeve.ca"
>>>>>> primitive fence_n01_psu1_on stonith:fence_apc_snmp \
>>>>>>     params ipaddr="an-p01" pcmk_reboot_action="on" port="1" \
>>>>>>     pcmk_host_list="an-c03n01.alteeve.ca"
>>>>>
>>>>> So every device twice, including location constraints? I see potential
>>>>> for optimization by improving how the fence code handles this ... That's
>>>>> abhorrently complex. (And I'm not sure the 'action' parameter ought to
>>>>> be overwritten.)
>>>>
>>>> I'm not crazy about it either because it means the device is tied to a specific command.
>>>> But it seems to be something all the RHCS people try to do...
>>>
>>> Maybe something in the rhcs water cooler made us all mad... ;)
>>>
>>>>> Glad you got it working, though.
>>>>>
>>>>>> location loc_fence_n01_ipmi fence_n01_ipmi -inf: an-c03n01.alteeve.ca
>>>>> [...]
>>>>>
>>>>> I'm not sure you need any of these location constraints, by the way. Did
>>>>> you test if it works without them?
>>>>>
>>>>>> Again, this is after just one test. I will want to test it several more
>>>>>> times before I consider it reliable. Ideally, I would love to hear
>>>>>> Andrew or others confirm this looks sane/correct.
>>>>>
>>>>> It looks correct, but not quite sane. ;-) That seems not to be
>>>>> something you can address, though. I'm thinking that fencing topology
>>>>> should be smart enough, if multiple fencing devices are specified, to
>>>>> know how to expand them to "first all off (if off fails anywhere, it's a
>>>>> failure), then all on (if on fails, it is not a failure)". That'd
>>>>> greatly simplify the syntax.
>>>>
>>>> The RH agents have apparently already been updated to support multiple ports.
>>>> I'm really not keen on having stonith-ng do this.
>>>
>>> This doesn't help people who have dual power rails/PDUs for power
>>> redundancy.
>>
>> I'm yet to be convinced that having two PDUs is helping those people in the first place.
>> If it were actually useful, I suspect more than two/three people would have asked for it in the last decade.
>
> I'm just silently waiting for this to happen.

Rarely a good plan.
Better to make my life so miserable that implementing it seems like a vacation in comparison :)

> Although I use a different fencing scheme (and plan to use an even more
> different one), that is a very nice fall-back path for me. And I strongly
> prefer all complexities like reboot -> off-off-on-on to be hidden from
> the configuration. Naturally, that is a task for the entity which has
> the whole picture of what to do - stonithd. Just my 'IMHO'.

If the tides of public opinion change, then yes, stonithd is the place.
But I can't justify the effort for only a handful of deployments.

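For reference, the reboot -> off-off-on-on expansion discussed in this thread (all outlets off first, failing if any "off" fails; then all outlets on, where a failed "on" is not a failure) can be sketched roughly as below. This is only an illustration in Python, not stonithd code: `reboot_via_outlets` and the `run` callback are hypothetical names, and nothing here talks to a real PDU.

```python
from typing import Callable, Iterable

# Hypothetical callback: run(device, command) returns True on success.
FenceAction = Callable[[str, str], bool]


def reboot_via_outlets(devices: Iterable[str], run: FenceAction) -> bool:
    """Expand one 'reboot' into off-...-off, on-...-on across all devices."""
    devices = list(devices)

    # Phase 1: every outlet must be switched off. If "off" fails anywhere,
    # the node may still have power, so the whole fencing operation fails.
    for dev in devices:
        if not run(dev, "off"):
            return False

    # Phase 2: try to restore power. A failed "on" does not fail the fence,
    # because the node was already confirmed off in phase 1.
    for dev in devices:
        run(dev, "on")

    return True


if __name__ == "__main__":
    # Stand-in for a real fence agent call: just print what would be done.
    def fake_agent(dev: str, cmd: str) -> bool:
        print(f"{dev}: {cmd}")
        return True

    print(reboot_via_outlets(["an-p01", "an-p02"], fake_agent))
```

The second device name (`an-p02`) in the usage example is made up for illustration; only `an-p01` appears in the configuration quoted above.
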
>
> And, as for PSU/PDU: I, like Digimer, always separate power circuits as
> much as possible. Of course I always use redundant PSUs.
>
> Vladislav