[ClusterLabs] Antw: [EXT] Re: VirtualDomain & "deeper" monitors - what/how?

Thu May 27 05:33:36 EDT 2021

guest-get-fsinfo doesn't seem to work on older agents (CentOS 6); I've found guest-get-time to be more universal.

Also, I found this helpful thread on using monitor_scripts, which is a parameter of the VirtualDomain RA:

https://linux-ha-dev.linux-ha.narkive.com/yxvySDA2/monitor-scripts-parameter-for-the-virtualdomain-ra-was-re-linux-ha-ocf-resource-agent-for-kvm
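
For reference, a minimal sketch of the kind of check a monitor script could run, assuming virsh is on the PATH and the guest runs qemu-guest-agent. The function name guest_agent_alive, the 5 second default timeout, and the domain name myvm used below are illustrative assumptions, not part of the RA:

```shell
#!/bin/sh
# Sketch of a guest-agent liveness probe. Assumptions: libvirt's virsh is
# available and the guest runs qemu-guest-agent. Names are hypothetical.

# guest_agent_alive DOMAIN [TIMEOUT_SECONDS]
# Returns 0 if the agent answers guest-get-time with a "return" value,
# non-zero if virsh fails or the reply carries no "return" key.
guest_agent_alive() {
    domain=$1
    timeout=${2:-5}
    # virsh exits non-zero when the agent is unreachable or times out.
    reply=$(virsh qemu-agent-command "$domain" --timeout "$timeout" \
        '{"execute":"guest-get-time"}' 2>/dev/null) || return 1
    case $reply in
        *'"return"'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A script listed in monitor_scripts could define this function, call `guest_agent_alive myvm`, and exit with its status. As far as I understand the RA, a non-zero exit from a monitor script is treated as a failed monitor, but that is worth confirming against the VirtualDomain version in use.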

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Sunday, May 16th, 2021 at 22:49, Kyle O'Donnell <kyleo at 0b10.mx> wrote:

> I am thinking about using the qemu-guest-agent to run one of the available commands to determine the health of the OS inside the VM:
>
> virsh qemu-agent-command myvm --pretty '{"execute":"guest-get-fsinfo"}'
>
> https://qemu-project.gitlab.io/qemu/interop/qemu-ga-ref.html
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On Thursday, May 13th, 2021 at 01:28, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>
> > On 03.05.2021 09:48, Ulrich Windl wrote:
> >
> > > Ken Gaillot <kgaillot at redhat.com> wrote on 30.04.2021 at 16:57 in message
> > > <3acef4bc31923fb019619c713300444c2dcd354a.camel at redhat.com>:
> > >
> > > > On Fri, 2021-04-30 at 11:00 +0100, lejeczek wrote:
> > > >
> > > > > Hi guys
> > > > >
> > > > > I'd like to ask around for thoughts & suggestions on any
> > > > > semi/official ways to monitor VirtualDomain.
> > > > > Something beyond what included RA does - such as actual
> > > > > health testing of and communication with VM's OS.
> > > > >
> > > > > many thanks, L.
> > > >
> > > > This use case led to a Pacemaker feature many moons ago ...
> > > >
> > > > Pacemaker supports nagios plug-ins as a resource type (e.g.
> > > > nagios:check_apache_status). These are service checks usually used with
> > > > monitoring software such as nagios, icinga, etc.
> > > >
> > > > If the service being monitored is inside a VirtualDomain, named vm1 for
> > > > example, you can configure the nagios resource with the resource
> > > > meta-attribute container="vm1". If the nagios check fails, Pacemaker will
> > > > restart vm1.
> > >
> > > "check fails" means WARNING, CRITICAL, or UNKNOWN? ;-)
> >
> > switch (rc) {
> >     case NAGIOS_STATE_OK:
> >         return PCMK_OCF_OK;
> >     case NAGIOS_INSUFFICIENT_PRIV:
> >         return PCMK_OCF_INSUFFICIENT_PRIV;
> >     case NAGIOS_NOT_INSTALLED:
> >         return PCMK_OCF_NOT_INSTALLED;
> >     case NAGIOS_STATE_WARNING:
> >     case NAGIOS_STATE_CRITICAL:
> >     case NAGIOS_STATE_UNKNOWN:
> >     case NAGIOS_STATE_DEPENDENT:
> >     default:
> >         return PCMK_OCF_UNKNOWN_ERROR;
> > }
> >
> > return PCMK_OCF_UNKNOWN_ERROR;