<div dir="ltr">Hi Ulrich, <div>I put the cluster into maintenance mode because the error message below keeps being logged in the system log. </div><div><br></div><div>Pacemaker has attempted to execute the monitor operation of the resource agent here. Is there a way to find out why Pacemaker reports 'No such device or address'? </div><div>hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195 [ Error performing operation: No such device or address]<br></div><div><br></div><div>Regards,</div><div>Aj</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 8, 2022 at 8:23 PM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">"maintenance-mode=true"? Why?<br>

<br>

<br>

>>> Aj Revelino <<a href="mailto:aj.revelino@gmail.com" target="_blank">aj.revelino@gmail.com</a>> wrote on 08.04.2022 at 11:17 in message<br>

<CAJY7vkA=SfaJngsfJnREkFMnMJ0hn=<a href="mailto:ppkec7CyUci32CR3Ro%2Bg@mail.gmail.com" target="_blank">ppkec7CyUci32CR3Ro+g@mail.gmail.com</a>>:<br>

> Hello All,<br>

> I have a 2-node SAP HANA cluster (hanapopdb1 and hanapopdb2). Pacemaker<br>

> monitors the data replication between the primary and the secondary node.<br>

> The issue is that crm status shows that everything is okay, but the system<br>

> log shows the following error message.<br>

> <br>

> <br>

> *pacemaker-controld[3582]:  notice:<br>

> hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195 [ Error performing<br>

> operation: No such device or address]*<br>

> I am unable to identify the cause of the error message or resolve it.<br>

> <br>

> Because of this, the data replication between the two nodes is recorded<br>

> as failed (SFAIL). Please see the excerpt from the CIB below:<br>

> <br>

>  <node_state id="2" in_ccm="true" crmd="online"<br>

> crm-debug-origin="do_update_resource" uname="zhanapopdb2" join="member"<br>

> expected="member"><br>

>       <transient_attributes id="2"><br>

>         <instance_attributes id="status-2"><br>

>          * <nvpair id="status-2-hana_hpn_clone_state"<br>

> name="hana_hpn_clone_state" value="WAITING4PRIM"/>*<br>

>           <nvpair id="status-2-hana_hpn_version" name="hana_hpn_version"<br>

> value="2.00.056.00.1624618329"/><br>

>           <nvpair id="status-2-master-rsc_SAPHana_HPN_HDB00"<br>

> name="master-rsc_SAPHana_HPN_HDB00" value="-INFINITY"/><br>

>           *<nvpair id="status-2-hana_hpn_sync_state"<br>

> name="hana_hpn_sync_state" value="SFAIL"/>*<br>

>           <nvpair id="status-2-hana_hpn_roles" name="hana_hpn_roles"<br>

> value="4:S:master1:master:worker:master"/><br>

>         </instance_attributes><br>

>       </transient_attributes><br>

> <br>

> Pacemaker is able to fail over the resources from the primary to the<br>

> secondary, but they all fail back to the primary the moment I clean up the<br>

> failure on the primary node.<br>

> I deleted and recreated the entire configuration and reconfigured the HANA<br>

> data replication, but it hasn't helped.<br>

> <br>

> <br>

> *Cluster configuration:*<br>

> hanapopdb1:~ # crm configure show<br>

> node 1: hanapopdb1 \<br>

>         attributes hana_hpn_vhost=hanapopdb1 hana_hpn_site=SITE1PO<br>

> hana_hpn_op_mode=logreplay_readaccess hana_hpn_srmode=sync<br>

> lpa_hpn_lpt=1649393239 hana_hpn_remoteHost=hanapopdb2<br>

> node 2: hanapopdb2 \<br>

>         attributes lpa_hpn_lpt=10 hana_hpn_op_mode=logreplay_readaccess<br>

> hana_hpn_vhost=hanapopdb2 hana_hpn_remoteHost=hanapopdb1<br>

> hana_hpn_site=SITE2PO hana_hpn_srmode=sync<br>

> primitive rsc_SAPHanaTopology_HPN_HDB00 ocf:suse:SAPHanaTopology \<br>

>         operations $id=rsc_sap2_HPN_HDB00-operations \<br>

>         op monitor interval=10 timeout=600 \<br>

>         op start interval=0 timeout=600 \<br>

>         op stop interval=0 timeout=300 \<br>

>         params SID=HPN InstanceNumber=00<br>

> primitive rsc_SAPHana_HPN_HDB00 ocf:suse:SAPHana \<br>

>         operations $id=rsc_sap_HPN_HDB00-operations \<br>

>         op start interval=0 timeout=3600 \<br>

>         op stop interval=0 timeout=3600 \<br>

>         op promote interval=0 timeout=3600 \<br>

>         op monitor interval=60 role=Master timeout=700 \<br>

>         op monitor interval=61 role=Slave timeout=700 \<br>

>         params SID=HPN InstanceNumber=00 PREFER_SITE_TAKEOVER=true<br>

> DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false<br>

> primitive rsc_ip_HPN_HDB00 IPaddr2 \<br>

>         meta target-role=Started \<br>

>         operations $id=rsc_ip_HPN_HDB00-operations \<br>

>         op monitor interval=10s timeout=20s \<br>

>         params ip=10.10.1.60<br>

> primitive rsc_nc_HPN_HDB00 azure-lb \<br>

>         params port=62506<br>

> primitive stonith-sbd stonith:external/sbd \<br>

>         params pcmk_delay_max=30 \<br>

>         op monitor interval=30 timeout=30<br>

> group g_ip_HPN_HDB00 rsc_ip_HPN_HDB00 rsc_nc_HPN_HDB00<br>

> ms msl_SAPHana_HPN_HDB00 rsc_SAPHana_HPN_HDB00 \<br>

>         meta is-managed=true notify=true clone-max=2 clone-node-max=1<br>

> target-role=Started interleave=true<br>

> clone cln_SAPHanaTopology_HPN_HDB00 rsc_SAPHanaTopology_HPN_HDB00 \<br>

>         meta clone-node-max=1 target-role=Started interleave=true<br>

> colocation col_saphana_ip_HPN_HDB00 4000: g_ip_HPN_HDB00:Started<br>

> msl_SAPHana_HPN_HDB00:Master<br>

> order ord_SAPHana_HPN_HDB00 Optional: cln_SAPHanaTopology_HPN_HDB00<br>

> msl_SAPHana_HPN_HDB00<br>

> property cib-bootstrap-options: \<br>

>         last-lrm-refresh=1649387935 \<br>

>         maintenance-mode=true<br>

> <br>

> Regards,<br>

> <br>

> Aj<br>

<br>

<br>

<br>


</blockquote></div>
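<div>As a starting point for narrowing this down, here is a minimal Python sketch that reads the two pieces of evidence quoted in this thread: the CIB status excerpt and the key in front of the logged error. The function names <code>transient_attributes</code> and <code>split_operation_key</code> are hypothetical helpers, not part of Pacemaker. The key layout (node, resource, operation, interval in ms, call ID) is an assumption based on how the log line reads, and the pattern further assumes the node name contains no hyphen. The excerpt is copied from the quoted message with the mail's <code>*</code> emphasis markers removed and a closing <code>&lt;/node_state&gt;</code> tag added so that it parses.</div>

```python
import re
import xml.etree.ElementTree as ET

# CIB status excerpt from the quoted message. The "*" emphasis markers were
# removed and </node_state> was added so that the fragment is well-formed XML.
CIB_EXCERPT = """
<node_state id="2" in_ccm="true" crmd="online"
    crm-debug-origin="do_update_resource" uname="zhanapopdb2" join="member"
    expected="member">
  <transient_attributes id="2">
    <instance_attributes id="status-2">
      <nvpair id="status-2-hana_hpn_clone_state"
          name="hana_hpn_clone_state" value="WAITING4PRIM"/>
      <nvpair id="status-2-hana_hpn_version" name="hana_hpn_version"
          value="2.00.056.00.1624618329"/>
      <nvpair id="status-2-master-rsc_SAPHana_HPN_HDB00"
          name="master-rsc_SAPHana_HPN_HDB00" value="-INFINITY"/>
      <nvpair id="status-2-hana_hpn_sync_state"
          name="hana_hpn_sync_state" value="SFAIL"/>
      <nvpair id="status-2-hana_hpn_roles" name="hana_hpn_roles"
          value="4:S:master1:master:worker:master"/>
    </instance_attributes>
  </transient_attributes>
</node_state>
"""

# Key that precedes the error text in the logged message.
LOG_KEY = "hanapopdb1-rsc_SAPHana_HPN_HDB00_monitor_60000:195"

# Assumed layout: <node>-<resource>_<operation>_<interval ms>:<call id>.
# Assumes the node name itself contains no hyphen.
KEY_PATTERN = re.compile(
    r"(?P<node>[^-]+)-(?P<resource>.+)_(?P<operation>[a-z]+)"
    r"_(?P<interval_ms>\d+):(?P<call_id>\d+)"
)


def transient_attributes(node_state_xml):
    """Return (uname, {attribute name: value}) for one <node_state> element."""
    root = ET.fromstring(node_state_xml.strip())
    attrs = {nv.get("name"): nv.get("value") for nv in root.iter("nvpair")}
    return root.get("uname"), attrs


def split_operation_key(key):
    """Split the logged key into its parts, or return None if it does not match."""
    match = KEY_PATTERN.fullmatch(key)
    if match is None:
        return None
    parts = match.groupdict()
    parts["interval_ms"] = int(parts["interval_ms"])
    parts["call_id"] = int(parts["call_id"])
    return parts


if __name__ == "__main__":
    uname, attrs = transient_attributes(CIB_EXCERPT)
    print(uname, attrs["hana_hpn_sync_state"], attrs["hana_hpn_clone_state"])
    print(split_operation_key(LOG_KEY))
```

<div>If the assumed key layout holds, the interval of 60000 ms would correspond to <code>op monitor interval=60 role=Master</code> in the configuration above, which suggests the message is produced during the Master-role monitor on hanapopdb1 rather than on the secondary. The sketch only organizes what was already posted; it does not by itself explain the error.</div>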