[ClusterLabs] Antw: Re: Antw: Re: Cluster failover failure with Unresolved dependency

Mon Mar 21 05:48:53 EDT 2016

>>> Lorand Kelemen <lorand.kelemen at gmail.com> wrote on 21.03.2016 at 10:08 in
message
<CAO2rmm1P3+LSm0ZmPYb5WYNqonaCT5XS4GQzK=ezU+WkknRaDA at mail.gmail.com>:
> Reproduced it again:
> 
> Last updated: Mon Mar 21 10:01:18 2016          Last change: Mon Mar 21 09:59:27 2016 by root via crm_attribute on mail1
> Stack: corosync
> Current DC: mail2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
> 2 nodes and 10 resources configured
> 
> Online: [ mail1 mail2 ]
> 
> Full list of resources:
> 
>  Resource Group: network-services
>      virtualip-1        (ocf::heartbeat:IPaddr2):       Stopped
>  Master/Slave Set: spool-clone [spool]
>      Masters: [ mail2 ]
>      Slaves: [ mail1 ]
>  Master/Slave Set: mail-clone [mail]
>      Masters: [ mail2 ]
>      Slaves: [ mail1 ]
>  Resource Group: fs-services
>      fs-spool   (ocf::heartbeat:Filesystem):    Stopped
>      fs-mail    (ocf::heartbeat:Filesystem):    Stopped
>  Resource Group: mail-services
>      amavisd    (systemd:amavisd):      Stopped
>      spamassassin       (systemd:spamassassin): Stopped
>      clamd      (systemd:clamd@amavisd):        Stopped
> 
> Node Attributes:
> * Node mail1:
>     + master-mail                       : 10000
>     + master-spool                      : 10000
> * Node mail2:
>     + master-mail                       : 10000
>     + master-spool                      : 10000
> 
> Migration Summary:
> * Node mail1:
> * Node mail2:
>    amavisd: migration-threshold=1 fail-count=1 last-failure='Mon Mar 21 10:00:53 2016'
> 
> Failed Actions:
> * amavisd_monitor_60000 on mail2 'not running' (7): call=2604, status=complete, exitreason='none',
>     last-rc-change='Mon Mar 21 10:00:53 2016', queued=0ms, exec=0ms

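Did you try a resource cleanup for "amavisd"? The migration summary shows
fail-count=1 against migration-threshold=1 on mail2, so amavisd stays barred
from that node until the fail count is cleared. For example:
crm_resource -C -r amavisd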
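
To find such entries without reading the whole status by eye, here is a minimal
sketch in Python that scans the "Migration Summary" part of crm_mon output for
resources whose fail-count has reached migration-threshold. The function name
banned_resources and the SAMPLE text are made up for illustration; the sample
lines are copied from the output quoted above with the mail line wraps undone.
It assumes the line layout shown there and, as far as I know, that a
migration-threshold of 0 means the limit is disabled.

```python
import re

# Node header inside the Migration Summary, e.g. "* Node mail2:"
NODE_RE = re.compile(r"^\*\s+Node\s+(\S+?):\s*$")
# Per-resource line, e.g. "   amavisd: migration-threshold=1 fail-count=1 ..."
RES_RE = re.compile(
    r"^\s+(\S+):\s+migration-threshold=(\d+)\s+fail-count=(\d+|INFINITY)"
)

# Sample copied from the status output in this thread (line wraps removed).
SAMPLE = """\
Migration Summary:
* Node mail1:
* Node mail2:
   amavisd: migration-threshold=1 fail-count=1 last-failure='Mon Mar 21 10:00:53 2016'
"""


def banned_resources(summary):
    """Return (resource, node) pairs whose fail-count reached the threshold."""
    banned = []
    node = None
    for line in summary.splitlines():
        match = NODE_RE.match(line)
        if match:
            node = match.group(1)
            continue
        match = RES_RE.match(line)
        if match is None or node is None:
            continue
        name = match.group(1)
        threshold = int(match.group(2))
        raw_fails = match.group(3)
        fails = float("inf") if raw_fails == "INFINITY" else int(raw_fails)
        # Assumption: a threshold of 0 disables the limit, so skip those.
        if threshold > 0 and fails >= threshold:
            banned.append((name, node))
    return banned


if __name__ == "__main__":
    for name, node in banned_resources(SAMPLE):
        print(f"{name} reached its migration-threshold on {node}")
        # Only suggests the cleanup command from this mail; it runs nothing.
        print(f"  suggested: crm_resource -C -r {name}")
```

The script only prints a suggestion and does not touch the cluster, so it can
be run against saved output from "crm_mon -1Arfj" on any machine.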

> 
> Best regards,
> Lorand
> 
> On Mon, Mar 21, 2016 at 9:57 AM, Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
>> >>> Lorand Kelemen <lorand.kelemen at gmail.com> wrote on 18.03.2016 at 16:42 in message
>> <CAO2rmm2OvdKvcSeQSb9+j=krqcdpyDk6GB4bUkD6CcQZLXw6Wg at mail.gmail.com>:
>> > I reviewed all the logs but found nothing out of the ordinary besides the
>> > "resource cannot run anywhere" line. However, after the cluster recheck
>> > interval expired, the services started fine without any suspicious log
>> > entries.
>> >
>> > If anybody wants to check further, I can provide logs. This behaviour is
>> > odd but good enough for me, with a maximum downtime of one cluster recheck
>> > interval...
>>
>> Output of "crm_mon -1Arfj"?
>>
>> Regards,
>> Ulrich