<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Oct 10, 2024 at 9:52 PM Angelo Ruggiero <<a href="mailto:angeloruggiero@yahoo.com">angeloruggiero@yahoo.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-93499451217259133">
<div dir="ltr">
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div id="m_-93499451217259133appendonsend" style="color:inherit"></div>
<div style="font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<hr style="display:inline-block;width:98%">
<div id="m_-93499451217259133divRplyFwdMsg" dir="ltr" style="color:inherit"><span style="font-family:Calibri,sans-serif;font-size:11pt;color:rgb(0,0,0)"><b>From:</b> Klaus Wenninger <<a href="mailto:kwenning@redhat.com" target="_blank">kwenning@redhat.com</a>><br>
<b>Sent:</b> 10 October 2024 4:52 PM<br>
<b>To:</b> Cluster Labs - All topics related to open-source clustering welcomed <<a href="mailto:users@clusterlabs.org" target="_blank">users@clusterlabs.org</a>><br>
<b>Cc:</b> Angelo Ruggiero <<a href="mailto:angeloruggiero@yahoo.com" target="_blank">angeloruggiero@yahoo.com</a>><br>
<b>Subject:</b> Re: [ClusterLabs] Users Digest, Vol 117, Issue 5</span>
<div> </div>
</div>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">On Thu, Oct 10, 2024 at 3:58 PM Angelo Ruggiero via Users <<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA33df6e66-e5e3-60e2-e76a-90128766b018" target="_blank">users@clusterlabs.org</a>> wrote:</div>
<blockquote style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left:1px solid rgb(204,204,204)">
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Thanks for answering. It helps.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>Main scenario where poison pill shines is 2-node-clusters where you don't<br>
>have usable quorum for watchdog-fencing.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Not sure I understand. If there are just 2 nodes and one node fails, it cannot respond to the poison pill. Maybe I missed your point.</div>
</blockquote>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">If in a 2 node setup one node loses contact to the other or sees some other reason why it would like</div>
<div style="direction:ltr">the partner-node to be fenced it will try to write the poison-pill message to the shared disk and if that</div>
<div style="direction:ltr">goes Ok and after a configured wait time for the other node to read the message, respond or the</div>
<div style="direction:ltr">watchdog to kick in it will assume the other node to be fenced. </div>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: Yes, understood. </div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: I guess I am looking for the killer requirement for my setup, that is, for a 2-node cluster with a usable quorum device (usable to be defined later). Does poison pill via SBD, or even fence_vmware, give me anything? I am struggling to find a scenario. See
my final comment on monitoring later in this reply.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left:1px solid rgb(204,204,204)">
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
This also raises the follow-up question: what defines a "usable quorum"? Do you mean, for example, on separate, independent network hardware and power supply?</div>
</blockquote>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">Quorum in 2 node clusters is a bit different as they will stay quorate when losing connection. To prevent split brain there if they</div>
<div style="direction:ltr">reboot on top they will just regain quorum once they've seen each other (search for 'wait-for-all' to read more).</div>
<div style="direction:ltr">This behavior is of course not usable for watchdog-fencing and thus SBD automatically switches to not relying on quorum in</div>
<div style="direction:ltr">those 2-node setups.</div>
<div style="direction:ltr"> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left:1px solid rgb(204,204,204)">
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>Configured with pacemaker-awareness - default - availability of the shared-disk doesn't become an issue as, due to fallback to availability of the 2nd node, the disk is no spof (single point of failure) in these clusters.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
I did not get the gist of what you are trying to say here. 🙂<br>
<br>
<br>
</div>
</blockquote>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">I was suggesting a scenario that has 2 cluster nodes + a single shared disk. With kind of 'pure' SBD this would mean that a node</div>
<div style="direction:ltr">that is losing connection to the disk would have to self fence which would mean that this disk would become a so called</div>
<div style="direction:ltr">single-point-of-failure - meaning that available of resources in the cluster would be reduced to availability of this single disk.</div>
<div style="direction:ltr">So I tried to explain why you don't have to fear this reduction of availability using pacemaker-awareness.</div>
<div style="direction:ltr"> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left:1px solid rgb(204,204,204)">
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>Other nodes btw. can still kill a node with watchdog-fencing.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
How does that work? When would the killing node tell the other node not to keep triggering its watchdog? </div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Having written the above sentence, maybe I should go and read up on when the poison pill gets sent by the killing node!</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
</blockquote>
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">It would either use cluster-communication to tell the node to self-fence and if that isn't available the case</div>
<div style="direction:ltr">below kicks in.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: ok</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>Quorum in 2 node clusters is a bit different as they will stay quorate when losing connection. </div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: here you refer to a 2-node cluster without a quorum device, right?</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: furthermore, are you saying that poison pill, and maybe even node fencing from the cluster, is not needed when you do not have a quorum device for 2-node clusters?</div></div></div></blockquote><div><br></div><div>No, that is a misunderstanding. For everything I described some sort of SBD setup is needed.</div><div>And yes - when I was talking about 2-node clusters I meant those without a quorum device - those which have</div><div>the two_node config set in the corosync config file.</div><div>I was just saying that without a quorum device (or of course 3 and more full cluster nodes) you can't use watchdog-fencing.</div><div>What you can still use is poison-pill fencing if you want to go for SBD. If it is viable for you, considering other aspects</div><div>like credentials or accessibility over the network, I guess it is always worthwhile looking into fencing via the hypervisor.</div><div>There are definite benefits in getting a response from the hypervisor that a node is down, instead of having to wait</div><div>some time - including some safety add-on - for it to self-fence. There are benefits as well if pacemaker can explicitly</div><div>turn a node off and on again afterwards instead of triggering a reboot (for obvious reasons the only way it works with SBD).</div><div>If you work with hypervisors and use their maintenance features (pausing, migration, ...) together with their virtual watchdog</div><div>implementation or softdog, you also have to consider situations where the watchdog might not fire</div><div>reliably within the specified timeout.</div><div><br></div><div>Regards,</div><div>Klaus</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-93499451217259133"><div dir="ltr">
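<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">For what it's worth, rough sketches of the two directions mentioned above - host names, credentials and VM names are placeholders, not anything from this setup:</div>
<div style="direction:ltr;font-family:monospace;font-size:10pt"># a) give the 2-node cluster real quorum with a qdevice, so diskless watchdog-fencing becomes usable<br>
pcs quorum device add model net host=qdevice.example.com algorithm=ffsplit<br>
pcs property set stonith-watchdog-timeout=10   # roughly 2x the SBD watchdog timeout<br>
<br>
# b) or fence through the hypervisor, so the survivor gets positive confirmation that the peer is off<br>
pcs stonith create vmfence fence_vmware_rest ip=vcenter.example.com username=fence-user password=example-password ssl_insecure=1 pcmk_host_map="node1:vm-node1;node2:vm-node2"</div>
<div style="direction:ltr"><br>
</div>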
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Hope that makes things a bit clearer.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: always 🙂 such discussions are hard to keep clear in both directions.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
AR: As mentioned in an earlier reply, I think I need to dwell on what failure cases I could have, and I should go and research the monitoring offered by the resource agents I intend to use,</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
i.e. IPaddr2, Filesystem and the SAP instance agents, as I guess they are the ones that would decide to fence another node. The general case where nodes cannot communicate via the network is built in.</div>
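<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">A sketch of the kind of monitoring those agents give you (addresses, exports and timings below are made up): each monitor failure is first handled per resource, and it is typically a failed stop that escalates into fencing the node.</div>
<div style="direction:ltr;font-family:monospace;font-size:10pt"># examples only -- IPs, exports and intervals are placeholders<br>
pcs resource create sap_vip ocf:heartbeat:IPaddr2 ip=192.0.2.10 cidr_netmask=24 op monitor interval=10s<br>
pcs resource create sapmnt_fs ocf:heartbeat:Filesystem device=nfs.example.com:/export/sapmnt directory=/sapmnt fstype=nfs op monitor interval=20s timeout=40s<br>
<br>
# a failed monitor is recovered per resource (restart/move); a failed stop is what normally<br>
# gets the whole node fenced, since the default on-fail for stop is "fence" when fencing is enabled</div>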
<div style="direction:ltr"><br>
</div>
<div style="direction:ltr">Regards,</div>
<div style="direction:ltr">Klaus</div>
<div style="direction:ltr"> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left:1px solid rgb(204,204,204)">
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>If the node isn't able to accept that wish of another</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
>node for it to die it will have lost quorum, have stopped triggering the watchdog anyway.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Yes, that is clear to me; the self-fencing is quite powerful.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
Thanks for the response.</div>
<div style="direction:ltr;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<br>
</div>
<div id="m_-93499451217259133x_m_5594183960408677331appendonsend" style="color:inherit"></div>
<hr style="direction:ltr;display:inline-block;width:98%">
<div style="direction:ltr;font-family:Calibri,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<b>From:</b> Users <<a href="mailto:users-bounces@clusterlabs.org" id="m_-93499451217259133OWA26b9aa6f-b143-660c-7a35-b76219f908c3" target="_blank">users-bounces@clusterlabs.org</a>> on behalf of
<a href="mailto:users-request@clusterlabs.org" id="m_-93499451217259133OWA24d3cd99-870c-084a-b4f4-be15d668231c" target="_blank">
users-request@clusterlabs.org</a> <<a href="mailto:users-request@clusterlabs.org" id="m_-93499451217259133OWAac8b5c0e-480a-e4e5-704f-791de55dc80b" target="_blank">users-request@clusterlabs.org</a>><br>
<b>Sent:</b> 10 October 2024 2:00 PM</div>
<div id="m_-93499451217259133x_m_5594183960408677331divRplyFwdMsg" dir="ltr" style="color:inherit">
<span style="font-family:Calibri,sans-serif;font-size:11pt;color:rgb(0,0,0)"><b>To:</b>
<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA1d9def1b-9cb7-d60a-fd37-659127986d85" target="_blank">
users@clusterlabs.org</a> <<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA7f2ad36f-2f38-38ca-a598-b2fedd0bf436" target="_blank">users@clusterlabs.org</a>><br>
</span></div>
<div style="direction:ltr;font-family:Calibri,sans-serif;font-size:11pt;color:rgb(0,0,0)">
<b>Subject:</b> Users Digest, Vol 117, Issue 5</div>
<div style="direction:ltr"> </div>
<div style="direction:ltr;font-size:11pt">Send Users mailing list submissions to<br>
<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA21082914-0d62-4737-8f2f-d7afd73e7dd7" target="_blank">
users@clusterlabs.org</a><br>
<br>
To subscribe or unsubscribe via the World Wide Web, visit<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" id="m_-93499451217259133OWA520a4791-4411-b039-1cea-0fe96eb23354" target="_blank">
https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
or, via email, send a message with subject or body 'help' to<br>
<a href="mailto:users-request@clusterlabs.org" id="m_-93499451217259133OWA1a71fc50-675a-69c4-6dc9-f6a9e79e1250" target="_blank">
users-request@clusterlabs.org</a><br>
<br>
You can reach the person managing the list at<br>
<a href="mailto:users-owner@clusterlabs.org" id="m_-93499451217259133OWA34b927aa-2f0d-e5ce-e1ad-dddcf03fa085" target="_blank">
users-owner@clusterlabs.org</a><br>
<br>
When replying, please edit your Subject line so it is more specific<br>
than "Re: Contents of Users digest..."<br>
<br>
<br>
Today's Topics:<br>
<br>
1. Re: Fencing Approach (Klaus Wenninger)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Wed, 9 Oct 2024 19:03:09 +0200<br>
From: Klaus Wenninger <<a href="mailto:kwenning@redhat.com" id="m_-93499451217259133OWA3b113955-c287-62e4-61f5-80bc97a2d453" target="_blank">kwenning@redhat.com</a>><br>
To: Cluster Labs - All topics related to open-source clustering<br>
welcomed <<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA65b68517-6c4b-17bb-0f9a-937050214c80" target="_blank">users@clusterlabs.org</a>><br>
Cc: Angelo Ruggiero <<a href="mailto:angeloruggiero@yahoo.com" id="m_-93499451217259133OWA570886a9-7fa6-6a98-477b-8b590a2817ac" target="_blank">angeloruggiero@yahoo.com</a>><br>
Subject: Re: [ClusterLabs] Fencing Approach<br>
<br>
On Wed, Oct 9, 2024 at 3:08 PM Angelo Ruggiero via Users <<br>
<a href="mailto:users@clusterlabs.org" id="m_-93499451217259133OWA5fbae908-6bf8-a35f-9175-9f1d86100417" target="_blank">users@clusterlabs.org</a>> wrote:<br>
<br>
> Hello,<br>
><br>
> My setup....<br>
><br>
><br>
> - We are setting up a pacemaker cluster to run SAP running on RHEL on<br>
> VMware virtual machines.<br>
> - We will have two nodes for the application server of SAP and 2 nodes<br>
> for the HANA database. SAP/RHEL provide good support on how to set up the<br>
> cluster.<br>
> - SAP will need a number of floating IPs to be moved around, as well as<br>
> NFS file systems coming from a NetApp device to be mounted/unmounted. SAP will<br>
> need processes switching on and off when something happens, planned or<br>
> unplanned. I am not clear if the NetApp device is active and the other site<br>
> is DR, but what I know is the IP addresses just get moved during a DR<br>
> incident. Just to be complete, the HANA data sync is done by HANA itself,<br>
> most probably async with an RPO of 15 mins or so.<br>
> - We will have a quorum node, also with hopefully a separate network;<br>
> not sure if it will be on separate VMware infra though.<br>
> - I am hoping to be allowed to use the VMware watchdog, although it<br>
> might take some persuading as it is declared "non standard" for us by our<br>
> infra people. I have it already in DEV to play with now.<br>
><br>
> I managed to get the above working just using a floating IP and an NFS<br>
> mount as my resources, and I can see the following. The self-fencing<br>
> approach works fine, i.e. the servers reboot when they lose network<br>
> connectivity and/or become inquorate, as long as they are offering<br>
> resources.<br>
><br>
> So my questions are in relation to further fencing... I did a lot of<br>
> reading and saw various references...<br>
><br>
><br>
> 1. Use of SBD shared storage<br>
><br>
> The question is what does using SBD with shared storage really give me.<br>
> I need to justify why I need this shared storage, again to the infra guys,<br>
> but to be honest also to myself. I have been given this infra and will<br>
> play with it in the next few days.<br>
><br>
><br>
> 2. Use of fence_vmware<br>
><br>
> In addition there is of course the ability to fence using the fence_vmware<br>
> agents, and again I need to justify why I need this. In this particular<br>
> case it will be a very hard sell, because the dev/test and prod<br>
> environments run on the same VMware infra, so to use fence_vmware would<br>
> effectively mean dev is connected to prod, i.e. the user id for a dev or test<br>
> box is being provided by a production environment. I do not have this<br>
> ability at all so cannot play with it.<br>
><br>
><br>
><br>
> My current thought train... i.e. the typical things I think about...<br>
><br>
> Perhaps someone can help me be clear on the benefits of 1 and 2 over and<br>
> above the setup I think is doable.<br>
><br>
><br>
> 1. gives me the ability to use poison pill<br>
><br>
> But in what scenarios does poison pill really help? Why would the other<br>
> parts of the cluster want to fence the node if the node itself has not<br>
> killed itself because it lost quorum, either because the quorum device is gone or<br>
> network connectivity failed and resources need to be switched off?<br>
><br>
> What I get is that it is very explicit, i.e. the other nodes<br>
> tell the other server to die. So it must be a case initiated by the other<br>
> nodes.<br>
> I am struggling to think of a scenario where the other<br>
> nodes would want to fence it.<br>
><br>
<br>
Main scenario where poison pill shines is 2-node-clusters where you don't<br>
have usable quorum for watchdog-fencing.<br>
Configured with pacemaker-awareness - default - availability of the<br>
shared-disk doesn't become an issue as, due to<br>
fallback to availability of the 2nd node, the disk is no spof (single<br>
point of failure) in these clusters.<br>
Other nodes btw. can still kill a node with watchdog-fencing. If the node<br>
isn't able to accept that wish of another<br>
node for it to die it will have lost quorum, have stopped triggering the<br>
watchdog anyway.<br>
<br>
Regards,<br>
Klaus<br>
<br>
><br>
> Possible scenarios, did I miss any?<br>
><br>
> - Loss of network connection to the node. But that is covered by the<br>
> node self-fencing.<br>
> - If some monitoring said the node was not healthy or responding...<br>
> Maybe this is the case it is good for, but then it must be a partial failure<br>
> where the node is still part of the cluster and can respond, i.e. not an OS<br>
> freeze and not only a lost connection, as then the watchdog or the self-<br>
> fencing will kick in.<br>
> - HW failures: cpu, memory, disk. For virtual hardware does that<br>
> actually ever fail? Sorry if a stupid question. I could ask our infra guys<br>
> but...<br>
> So is virtual hardware so reliable that hw failures can be ignored?<br>
> - Loss of shared storage: SAP uses a lot of shared storage via NFS. Not<br>
> sure what happens when that fails, need to research it a bit, but each node<br>
> will sort that out itself I am presuming.<br>
> - Human error: but no cluster will fix that, and the human who makes a<br>
> change will realise it and revert.<br>
><br>
> 2. Fence vmware<br>
><br>
> I see this as a better poison pill as it works at the hardware<br>
> level. But if I do not need poison pill then I do not need this.<br>
><br>
> In general OS freezes, or even panics if they take too long, are covered by the<br>
> watchdog.<br>
><br>
> regards<br>
> Angelo<br>
><br>
><br>
><br>
><br>
><br>
</div>
<div style="direction:ltr">_______________________________________________<br>
Manage your subscription:<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" id="m_-93499451217259133OWA522b79a0-0e40-a045-2438-f00eb4508625" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
<br>
ClusterLabs home: <a href="https://www.clusterlabs.org/" id="m_-93499451217259133OWA3b2314d7-c5d7-99eb-99c8-384d141dd969" target="_blank">
https://www.clusterlabs.org/</a></div>
</blockquote>
</div>
</div></blockquote></div></div>