[ClusterLabs] SAPHanaController & SAPHanaTopology question
Reid Wahl
nwahl at redhat.com
Fri Apr 2 17:07:02 EDT 2021
On Fri, Apr 2, 2021 at 2:04 PM Strahil Nikolov <hunter86_bg at yahoo.com>
wrote:
> Hi Reid,
>
> I will check it out on Monday, but I'm pretty sure I created an order set
> that first stops the topology and only then stops the nfs-active.
>
> Yet, I made the stupid decision to prevent ocf:heartbeat:Filesystem from
> killing those 2 SAP processes (and to set a huge timeout for the stop
> operation), which led to an 'I can't umount, giving up'-like notification
> and of course fenced the entire cluster :D .
>
> Note taken: stonith now has different delays, and Filesystem can kill the
> processes.
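>
> For the record, the changes were roughly along these lines (the fence
> device and Filesystem resource names below are just placeholders, not the
> real ones from my cluster):
>
> # pcs stonith update fence_node1 pcmk_delay_base=10s
> # pcs resource update hana_nfs1_fs force_unmount=yes op stop timeout=120s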
>
> As per the SAP note from Andrei, these could really be 'fast restart'
> mechanisms in HANA 2.0, and it looks safe to kill them (will check with SAP
> about that).
>
>
> P.S.: Is there a way to remove a whole set in pcs, because it's really
> irritating when the command wipes the resource from multiple order
> constraints?
>
If you mean a whole constraint set, then yes -- run `pcs constraint --full`
to get a list of all constraints with their constraint IDs. Then run `pcs
constraint remove <constraint_id>` to remove a particular constraint. This
can include set constraints.
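For example (the constraint ID below is a placeholder -- use whichever ID
the first command reports for the set constraint you want to drop):

# pcs constraint --full
# pcs constraint remove <constraint_id>

Removing the constraint by its ID drops the whole constraint, sets and all,
rather than editing individual resources out of it.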
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Fri, Apr 2, 2021 at 23:44, Reid Wahl
> <nwahl at redhat.com> wrote:
> Hi, Strahil.
>
> Based on the constraints documented in the article you're following (RH KB
> solution 5423971), I think I see what's happening.
>
> The SAPHanaTopology resource requires the appropriate nfs-active attribute
> in order to run. That means that if the nfs-active attribute is set to
> false, the SAPHanaTopology resource must stop.
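>
> (For anyone following along, that requirement is expressed as an
> attribute-based location rule of roughly this shape -- the resource and
> attribute names here are illustrative, not copied from the article:)
>
> # pcs constraint location SAPHanaTopology_<SID>_<instance_num>-clone rule score=-INFINITY hana_nfs1_active ne true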
>
> However, there's no rule saying SAPHanaTopology must finish stopping
> before the nfs-active attribute resource stops. In fact, it's quite the
> opposite: the SAPHanaTopology resource stops only after the nfs-active
> resource stops.
>
> At the same time, the NFS resources are allowed to stop after the
> nfs-active attribute resource has stopped. So the NFS resources are
> stopping while the SAPHana* resources are likely still active.
>
> Try something like this:
> # pcs constraint order hana_nfs1_active-clone then SAPHanaTopology_<SID>_<instance_num>-clone kind=Optional
> # pcs constraint order hana_nfs2_active-clone then SAPHanaTopology_<SID>_<instance_num>-clone kind=Optional
>
> This says "if both hana_nfs1_active and SAPHanaTopology are scheduled to
> start, then make hana_nfs1_active start first. If both are scheduled to
> stop, then make SAPHanaTopology stop first."
>
> "kind=Optional" means there's no order dependency unless both resources
> are already going to be scheduled for the action. I'm using kind=Optional
> here even though kind=Mandatory (the default) would make sense, because
> IIRC there were some unexpected interactions with ordering constraints for
> clones, where events on one node had unwanted effects on other nodes.
>
> I'm not able to test right now since setting up an environment for this
> even with dummy resources is non-trivial -- but you're welcome to try this
> both with and without kind=Optional if you'd like.
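>
> (If you want to preview the effect without touching the live cluster,
> something like this should work -- the file name is arbitrary: dump the
> CIB to a file, add the constraints there, and let crm_simulate show which
> actions would be scheduled.)
>
> # pcs cluster cib > /tmp/test.cib
> # pcs -f /tmp/test.cib constraint order hana_nfs1_active-clone then SAPHanaTopology_<SID>_<instance_num>-clone kind=Optional
> # crm_simulate --simulate --xml-file /tmp/test.cib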
>
> Please let us know how this goes.
>
> On Fri, Apr 2, 2021 at 2:20 AM Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
> Hello All,
>
> I am testing the newly built HANA (Scale-out) cluster and it seems that:
> neither SAPHanaController nor SAPHanaTopology is stopping HANA when I put
> the nodes (same DC = same HANA) in standby. This of course leads to a
> situation where the NFS cannot be unmounted and, despite the stop timeout,
> leads to fencing (on-fail=fence).
>
> I thought that the Controller resource agent stops HANA, and that the
> slave role should not be 'stopped' before that.
>
> Maybe my expectations are wrong?
>
> Best Regards,
> Strahil Nikolov
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
>
>
> --
> Regards,
>
>
> Reid Wahl, RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
>
>
--
Regards,
Reid Wahl, RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA