[Pacemaker] Re: Problems when DC node is STONITH'ed.

Satomi TANIGUCHI taniguchis at intellilink.co.jp
Thu Oct 16 02:43:36 EDT 2008

Hi Dejan,

Dejan Muhamedagic wrote:
> Hi Satomi-san,
> On Tue, Oct 14, 2008 at 07:07:00PM +0900, Satomi TANIGUCHI wrote:
>> Hi,
>> I found that there are 2 problems when DC node is STONITH'ed.
>> (1) STONITH operation is executed two times.
> This has been discussed at length in bugzilla, see
> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1904
> which was resolved with WONTFIX. In short, it was deemed too risky
> to implement a remedy for this problem.  Of course, if you think
> you can add more to the discussion, please go ahead.
Sorry, I missed it.
Thank you for pointing it out!
I understand how it came about.

Ideally, when the DC node is about to be STONITH'ed,
a new DC would be elected first and it would STONITH the ex-DC;
then these problems would not occur.
But that is probably not a good approach in an emergency,
because the ex-DC should be STONITH'ed as soon as possible.

Anyway, I understand this is the expected behavior, thanks!
But then, it seems that tengine still has to keep a timeout while waiting
for stonithd's result, and a long cluster-delay is still required,
because the second STONITH is requested when that transition times out.
I'm afraid I misunderstood the true meaning of what Andrew said.
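Just to make the timing relationship concrete, here is a minimal sketch of why a
short cluster-delay triggers the second STONITH request. The numbers are made
up for illustration; they are not taken from the attached logs.

```python
# Hedged sketch, with hypothetical values (seconds):
cluster_delay = 60       # transition timeout tengine uses while waiting on stonithd
fencing_duration = 90    # actual time the STONITH op takes on the executing node

# tengine waits at most cluster_delay for stonithd's result. If the fencing
# op is still in flight when the transition times out, a second STONITH
# request is issued for the same target.
second_stonith_requested = fencing_duration > cluster_delay
print(second_stonith_requested)  # True with these made-up values
```

So as long as the second request is driven by the transition timeout,
cluster-delay has to stay longer than the worst-case fencing time.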

>> (2) The timeout for which stonithd on the DC node waits
>>     for the result of a STONITH op from another node is
>>     always set to "stonith-timeout" in <cluster_property_set>.
>> [...]
>> The case (2):
>> When this timeout occurs in stonithd on the DC
>> while a non-DC node's stonithd is trying to reset the DC,
>> the DC's stonithd will send a request to another node,
>> and two or more STONITH plugins are executed in parallel.
>> This is a troublesome problem.
>> The most suitable value for this timeout might be
>> the sum total of the "stonith-timeout" values of the STONITH plugins
>> on the node which is going to receive the STONITH request from the DC, I think.
> This would probably be very difficult for the CRM to get.
Right, I agree with you.
By the sentence "But DC node can't know that..." I meant that it is
difficult because stonithd on the DC cannot know the values of
"stonith-timeout" on the other nodes.

>> But DC node can't know that...
>> I would like to hear your opinions.
> Sorry, but I couldn't exactly follow. Could you please describe
> it in terms of actions.
Sorry, let me restate what I meant.
The timeout for which stonithd on the DC waits for the reply from another
node's stonithd should, by all rights, be longer than the sum total of the
"stonith-timeout" values of the STONITH plugins on that node.
But it is very difficult for the DC's stonithd to obtain those values.
So I would like to hear your opinion about what a suitable and practical
value would be for this timeout, which is set in insert_into_executing_queue().
I hope I have conveyed what I wanted to say.
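As a rough illustration of the proposal above: the plugin names and timeout
values below are entirely hypothetical, and (as discussed) the DC's stonithd
has no way to actually query them from the other node.

```python
# Hypothetical per-plugin stonith-timeout values (seconds) configured on the
# node that will execute the STONITH request.  In reality the DC's stonithd
# cannot learn these, which is exactly the difficulty discussed above.
plugin_timeouts = {"external/ipmi": 60, "external/ssh": 30, "meatware": 120}

# The DC-side wait in insert_into_executing_queue() should, by rights,
# exceed the sum of the executing node's plugin timeouts, since the plugins
# may be tried one after another before the op finally succeeds or fails.
safety_margin = 10
dc_side_timeout = sum(plugin_timeouts.values()) + safety_margin
print(dc_side_timeout)  # 220 with these made-up values
```

With any shorter DC-side value, the timeout can fire while a plugin on the
other node is still legitimately working, which is what produces the
duplicate, parallel STONITH operations shown in the attached logs.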

For reference, I have attached logs from when the aforesaid timeout occurred.
The cluster has 3 nodes.
When the DC was going to be STONITH'ed, the DC sent a request to all of the
non-DC nodes, and all of them tried to shut down the DC.
Then the timeout on the DC's stonithd occurred, the DC's stonithd sent the
same request again, and two or more STONITH plugins ran in parallel on every
non-DC node.
(Please see sysstats.txt.)
I want to make clear whether the current behavior is expected or a bug.

But I think the root of all these problems is that the node which sends the
STONITH request and waits for completion of the op is itself killed.


> Thanks,
> Dejan
>> Best Regards,
>> _______________________________________________________
>> Linux-HA-Dev: Linux-HA-Dev at lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hb_report.tar.gz
Type: application/x-gzip
Size: 49479 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20081016/8c330cde/attachment-0001.bin>
