[ClusterLabs] Problem with high load (IO)

Strahil Nikolov hunter86_bg at yahoo.com
Mon Sep 27 13:12:54 EDT 2021


Hey Ken,
how should someone set the maintenance via pcs?
Best Regards,
Strahil Nikolov
 
On Mon, Sep 27, 2021 at 19:56, Ken Gaillot <kgaillot at redhat.com> wrote:
On Mon, 2021-09-27 at 12:37 +0200, Lentes, Bernd wrote:
> Hi,
> 
> I have a two-node cluster running on SLES 12SP5 with two HP servers
> and a common FC SAN.
> Most of my resources are virtual domains offering databases and web
> pages.
> The disks of the domains reside on an OCFS2 volume on the FC SAN.
> Each night at 9pm all domains are snapshotted with the OCFS2 tool
> reflink.
> After the snapshot is created, the disks of the domains are copied to
> a NAS while the domains are still running.
> The copy procedure uses the CPU and IO intensively: the copy takes
> about 90% of the IO, and the CPU sometimes shows about 50% wait.
> Because of that the domains aren't responsive, so the monitor
> operation from the RA sometimes fails.
> In the worst case one domain is fenced.
> What would you do in such a situation?
> I'm thinking of making the cp procedure nicer, with nice. Maybe about
> 10.
> 
> More ideas ?
> 
> 
> Bernd

This is a classic use case for rules:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#using-rules-to-control-cluster-options

You can put the cluster into maintenance mode for the window, or
disable the monitor for the window. Of course that also disables any
cluster response. You could instead lengthen operation timeouts during
the window.
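
For the maintenance-mode option, one possible way to toggle it by hand (or from the backup script) with pcs is sketched below. This is only a sketch under assumptions: the function names (run, maintenance_on, maintenance_off, backup_window) and the DRY_RUN switch are made up for illustration and are not part of pcs; the only cluster command used is "pcs property set maintenance-mode=...". On a crmsh-based system such as SLES the equivalent would presumably be "crm configure property maintenance-mode=true". Keep in mind that while maintenance mode is on, the cluster neither monitors nor recovers resources.

```shell
#!/bin/sh
# Sketch: wrap a backup job in a cluster maintenance window.
# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Cluster-wide maintenance mode: resources keep running, but the
# cluster stops monitoring and managing them.
maintenance_on()  { run pcs property set maintenance-mode=true; }
maintenance_off() { run pcs property set maintenance-mode=false; }

# Run the given command inside a maintenance window and always
# leave maintenance mode afterwards, even if the command fails.
backup_window() {
    maintenance_on || return 1
    run "$@"
    status=$?
    maintenance_off
    return "$status"
}
```

Usage would be something like "backup_window nice -n 10 /path/to/backup-script" (the path is a placeholder for the existing copy job), which also covers the nice idea from the original question.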
-- 
Ken Gaillot <kgaillot at redhat.com>
