<div dir="ltr"><div>Kindly read "fencing is done using fence_scsi" from the previous message as "fencing is configured". <br></div><div><br></div><div>As per the error messages we have analyzed node2 initiated fencing of node1 as many processes of node1 related to cluster have been killed by oom killer and node1 marked as down. <br></div><div>Now many resources of node2 have waited for fencing of node1, as seen from following messages of syslog of node2: <br></div><div><pre class="gmail-aLF-aPX-K0-aPE">dlm_controld[1616]: 91659 lvm_postgres_db_vg wait for fencing
dlm_controld[1616]: 91659 lvm_global wait for fencing<br><br></pre><pre class="gmail-aLF-aPX-K0-aPE"><font face="arial,sans-serif">These were messages when postgresql-12 service was being started on node2. <br></font></pre><pre class="gmail-aLF-aPX-K0-aPE"><font face="arial,sans-serif">As postgresql service is dependent on these services(dlm,lvmlockd and gfs2), it has not started in time on node2. <br></font></pre><pre class="gmail-aLF-aPX-K0-aPE"><font face="arial,sans-serif">And node2 fenced itself after declaring that services can not be started on it. <br></font></pre></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 15, 2021 at 9:00 AM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">>>> shivraj dongawe <<a href="mailto:shivraj198@gmail.com" target="_blank">shivraj198@gmail.com</a>> schrieb am 15.02.2021 um 08:27 in<br>
Nachricht<br>
<CALpaHO_6LsYM=t76CifsRkFeLYDKQc+hY3kz7PRKp7b4se=-<a href="mailto:Aw@mail.gmail.com" target="_blank">Aw@mail.gmail.com</a>>:<br>
> Fencing is done using fence_scsi.<br>
> Config details are as follows:<br>
> Resource: scsi (class=stonith type=fence_scsi)<br>
> Attributes: devices=/dev/mapper/mpatha pcmk_host_list="node1 node2"<br>
> pcmk_monitor_action=metadata pcmk_reboot_action=off<br>
> Meta Attrs: provides=unfencing<br>
> Operations: monitor interval=60s (scsi-monitor-interval-60s)<br>
> <br>
> On Mon, Feb 15, 2021 at 7:17 AM Ulrich Windl <<br>
> <a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br>
> <br>
>> >>> shivraj dongawe <<a href="mailto:shivraj198@gmail.com" target="_blank">shivraj198@gmail.com</a>> schrieb am 14.02.2021 um 12:03<br>
>> in<br>
>> Nachricht<br>
>> <<a href="mailto:CALpaHO--3ERfwST70mBL-Wm9g6yH3YtD-wDA1r_CKnbrsxu4Sg@mail.gmail.com" target="_blank">CALpaHO--3ERfwST70mBL-Wm9g6yH3YtD-wDA1r_CKnbrsxu4Sg@mail.gmail.com</a>>:<br>
>> > We are running a two node cluster on Ubuntu 20.04 LTS. Cluster related<br>
>> > package version details are as<br>
>> > follows: pacemaker/focal-updates,focal-security 2.0.3-3ubuntu4.1 amd64<br>
>> > pacemaker/focal 2.0.3-3ubuntu3 amd64<br>
>> > corosync/focal 3.0.3-2ubuntu2 amd64<br>
>> > pcs/focal 0.10.4-3 all<br>
>> > fence-agents/focal 4.5.2-1 amd64<br>
>> > gfs2-utils/focal 3.2.0-3 amd64<br>
>> > dlm-controld/focal 4.0.9-1build1 amd64<br>
>> > lvm2-lockd/focal 2.03.07-1ubuntu1 amd64<br>
>> ><br>
>> > Cluster configuration details:<br>
>> > 1. Cluster is having a shared storage mounted through gfs2 filesystem<br>
>> with<br>
>> > the help of dlm and lvmlockd.<br>
>> > 2. Corosync is configured to use knet for transport.<br>
>> > 3. Fencing is configured using fence_scsi on the shared storage which is<br>
>> > being used for gfs2 filesystem<br>
>> > 4. Two main resources configured are cluster/virtual ip and<br>
>> postgresql-12,<br>
>> > postgresql-12 is configured as a systemd resource.<br>
>> > We had done failover testing(rebooting/shutting down of a node, link<br>
>> > failure) of the cluster and had observed that resources were getting<br>
>> > migrated properly on the active node.<br>
>> ><br>
>> > Recently we came across an issue which has occurred repeatedly in span of<br>
>> > two days.<br>
>> > Details are below:<br>
>> > 1. Out of memory killer is getting invoked on active node and it starts<br>
>> > killing processes.<br>
>> > Sample is as follows:<br>
>> > postgres invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE),<br>
>> > order=0, oom_score_adj=0<br>
>> > 2. At one instance it started with killing of pacemaker and on another<br>
>> with<br>
>> > postgresql. It does not stop with the killing of a single process it goes<br>
>> > on killing others(more concerning is killing of cluster related<br>
>> processes)<br>
>> > as well. We have observed that swap space on that node is 2 GB against<br>
>> RAM<br>
>> > of 96 GB and are in the process of increasing swap space to see if this<br>
>> > resolves this issue. Postgres is configured with shared_buffers value of<br>
>> 32<br>
>> > GB(which is way less than 96 GB).<br>
>> > We are not yet sure which process is eating up that much memory suddenly.<br>
>> > 3. As a result of killing processes on node1, node2 is trying to fence<br>
>> > node1 and thereby initiating stopping of cluster resources on node1.<br>
>><br>
>> How is fencing being done?<br>
>><br>
>> > 4. At this point we go in a stage where it is assumed that node1 is down<br>
>> > and application resources, cluster IP and postgresql are being started on<br>
>> > node2.<br>
<br>
This is why I was asking: Is your fencing successful ("assumed that node1 is down<br>
"), or isn't it?<br>
<br>
>> > 5. Postgresql on node 2 fails to start in 60 sec(start operation timeout)<br>
>> > and is declared as failed. During the start operation of postgres, we<br>
>> have<br>
>> > found many messages related to failure of fencing and other resources<br>
>> such<br>
>> > as dlm and vg waiting for fencing to complete.<br>
>> > Details of syslog messages of node2 during this event are attached in<br>
>> file.<br>
>> > 6. After this point we are in a state where node1 and node2 both go in<br>
>> > fenced state and resources are unrecoverable(all resources on both<br>
>> nodes).<br>
>> ><br>
>> > Now my question is out of memory issue of node1 can be taken care by<br>
>> > increasing swap and finding out the process responsible for such huge<br>
>> > memory usage and taking necessary actions to minimize that memory usage,<br>
>> > but the other issue that remains unclear is why cluster is not shifted to<br>
>> > node2 cleanly and become unrecoverable.<br>
>><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> Manage your subscription:<br>
>> <a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a> <br>
>><br>
>> ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a> <br>
>><br>
<br>
<br>
<br>
_______________________________________________<br>
Manage your subscription:<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
<br>
ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a><br>
</blockquote></div>