[ClusterLabs] Antw: Re: Antw: Re: Cluster failover failure with Unresolved dependency

Lorand Kelemen lorand.kelemen at gmail.com
Mon Mar 21 06:12:40 EDT 2016


Of course, to catch you up:

>> Still experiencing the same behaviour, killing amavisd returns an rc=7
>> for the monitoring operation on the "victim" node; this sounds logical,
>> but the logs contain the same: amavisd and virtualip cannot run anywhere.
>>
>> I made sure systemd is clean (amavisd = inactive, not running instead of
>> failed) and also reset the fail-count on all resources before killing
>> amavisd.
>>

> What you did is fine. I'm not sure why amavisd and virtualip can't run.

So even though the cluster is in a clean state, I still get the "resource
cannot run anywhere" log entry; that is the state you see reproduced in the
crm_mon output below. However, after the cluster-recheck interval expires,
all resources are started fine on the surviving node.
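For the record, the recovery window can probably be shortened instead of
waiting out the default 15-minute recheck. A sketch, assuming the pcs CLI
that ships with the pacemaker 1.1.13 / el7 stack shown below (the values
are illustrative, not recommendations):

```shell
# Re-evaluate cluster state more often than the 15-minute default:
pcs property set cluster-recheck-interval=2min

# Let the recorded amavisd failure expire on its own, so the resource
# becomes placeable again without a manual cleanup:
pcs resource update amavisd meta failure-timeout=60s

# Or clear the recorded failure immediately by hand:
crm_resource --cleanup --resource amavisd

# Check the result:
crm_mon -1rf
```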

Best regards,
Lorand

On Mon, Mar 21, 2016 at 10:48 AM, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:

>
> >>> Lorand Kelemen <lorand.kelemen at gmail.com> wrote on 21.03.2016 at
> 10:08 in message
> <CAO2rmm1P3+LSm0ZmPYb5WYNqonaCT5XS4GQzK=ezU+WkknRaDA at mail.gmail.com>:
> > Reproduced it again:
> >
> > Last updated: Mon Mar 21 10:01:18 2016          Last change: Mon Mar 21
> > 09:59:27 2016 by root via crm_attribute on mail1
> > Stack: corosync
> > Current DC: mail2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
> > 2 nodes and 10 resources configured
> >
> > Online: [ mail1 mail2 ]
> >
> > Full list of resources:
> >
> >  Resource Group: network-services
> >      virtualip-1        (ocf::heartbeat:IPaddr2):       Stopped
> >  Master/Slave Set: spool-clone [spool]
> >      Masters: [ mail2 ]
> >      Slaves: [ mail1 ]
> >  Master/Slave Set: mail-clone [mail]
> >      Masters: [ mail2 ]
> >      Slaves: [ mail1 ]
> >  Resource Group: fs-services
> >      fs-spool   (ocf::heartbeat:Filesystem):    Stopped
> >      fs-mail    (ocf::heartbeat:Filesystem):    Stopped
> >  Resource Group: mail-services
> >      amavisd    (systemd:amavisd):      Stopped
> >      spamassassin       (systemd:spamassassin): Stopped
> >      clamd      (systemd:clamd@amavisd):        Stopped
> >
> > Node Attributes:
> > * Node mail1:
> >     + master-mail                       : 10000
> >     + master-spool                      : 10000
> > * Node mail2:
> >     + master-mail                       : 10000
> >     + master-spool                      : 10000
> >
> > Migration Summary:
> > * Node mail1:
> > * Node mail2:
> >    amavisd: migration-threshold=1 fail-count=1 last-failure='Mon Mar 21 10:00:53 2016'
> >
> > Failed Actions:
> > * amavisd_monitor_60000 on mail2 'not running' (7): call=2604,
> > status=complete, exitreason='none',
> >     last-rc-change='Mon Mar 21 10:00:53 2016', queued=0ms, exec=0ms
>
> Did you try a resource cleanup for "amavisd"? Like crm_resource -C -r
> amavisd...
>
> >
> > Best regards,
> > Lorand
> >
> > On Mon, Mar 21, 2016 at 9:57 AM, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >
> >> >>> Lorand Kelemen <lorand.kelemen at gmail.com> wrote on 18.03.2016 at
> >> 16:42 in message
> >> <CAO2rmm2OvdKvcSeQSb9+j=krqcdpyDk6GB4bUkD6CcQZLXw6Wg at mail.gmail.com>:
> >> > I reviewed all the logs, but found nothing out of the ordinary,
> >> > besides the "resource cannot run anywhere" line; however, after the
> >> > cluster-recheck interval expired, the services started fine without
> >> > any suspicious log entries.
> >> >
> >> > If anybody wants to check further I can provide logs. This behaviour
> >> > is odd, but good enough for me, with a maximum downtime of the
> >> > cluster-recheck interval...
> >>
> >> Output of "crm_mon -1Arfj"?
> >>
> >> Regards,
> >> Ulrich
> >>
> >>
> >>
> >> _______________________________________________
> >> Users mailing list: Users at clusterlabs.org
> >> http://clusterlabs.org/mailman/listinfo/users
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >>

