[ClusterLabs] Problem with high load (IO)

Ken Gaillot kgaillot at redhat.com
Mon Sep 27 12:48:32 EDT 2021

On Mon, 2021-09-27 at 12:37 +0200, Lentes, Bernd wrote:
> Hi,
> I have a two-node cluster running on SLES 12 SP5 with two HP servers
> and a shared FC SAN.
> Most of my resources are virtual domains offering databases and web
> pages.
> The disks of the domains reside on an OCFS2 volume on the FC SAN.
> Each night at 9 pm all domains are snapshotted with the OCFS2 tool
> reflink.
> After the snapshot is created, the disks of the domains are copied to
> a NAS while the domains are still running.
> The copy procedure uses CPU and IO intensively. The copy takes up
> about 90% of IO, and the CPU sometimes shows a wait of about 50%.
> Because of that the domains aren't responsive, so that the monitor
> operation from the RA fails sometimes.
> In the worst case a domain is fenced.
> What would you do in such a situation?
> I'm thinking of lowering the priority of the cp procedure with nice,
> maybe to about 10.
> Any more ideas?
> Bernd

This is a classic use case for rules.

You can put the cluster into maintenance mode for the window, or
disable the monitor for the window. Of course that also disables any
cluster response. You could instead lengthen operation timeouts during
the window.
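
To make the "longer timeouts during the window" idea concrete, here is a
minimal Python sketch of the decision a time-based rule expresses. It is
not Pacemaker configuration and uses no cluster API. Only the 21:00 start
comes from the description above; the window length, both timeout values,
and the function names are assumptions made up for illustration.

```python
from datetime import datetime, time, timedelta

# Only the 21:00 start is taken from the thread; everything else is assumed.
WINDOW_START = time(21, 0)
WINDOW_LENGTH = timedelta(hours=3)   # hypothetical backup duration
NORMAL_TIMEOUT_S = 30                # hypothetical monitor timeout
BACKUP_TIMEOUT_S = 120               # hypothetical timeout during the backup


def in_backup_window(now: datetime) -> bool:
    """Return True if `now` lies in the nightly window [start, start + length)."""
    start_today = datetime.combine(now.date(), WINDOW_START)
    # Also check the window that began yesterday, in case it runs past midnight.
    for start in (start_today, start_today - timedelta(days=1)):
        if start <= now < start + WINDOW_LENGTH:
            return True
    return False


def monitor_timeout(now: datetime) -> int:
    """Pick the longer timeout while the backup window is active."""
    return BACKUP_TIMEOUT_S if in_backup_window(now) else NORMAL_TIMEOUT_S
```

In the cluster itself the same condition would be written as a rule with
a date expression, so that the longer timeout applies only while the copy
is running and the normal one applies the rest of the day.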
Ken Gaillot <kgaillot at redhat.com>
