<div dir="ltr">Of course, to catch you up:<div><br></div><div><span style="font-size:12.8px">>> Still experiencing the same behaviour, killing amavisd returns an rc=7 for</span><br style="font-size:12.8px"><span style="font-size:12.8px">>> the monitoring operation on the "victim" node, this sounds logical, but the</span><br style="font-size:12.8px"><span style="font-size:12.8px">>> logs contain the same: amavisd and virtualip cannot run anywhere.</span><br style="font-size:12.8px"><span style="font-size:12.8px">>></span><br style="font-size:12.8px"><span style="font-size:12.8px">>> I made sure systemd is clean (amavisd = inactive, not running instead of</span><br style="font-size:12.8px"><span style="font-size:12.8px">>> failed) and also reset the failcount on all resources before killing</span><br style="font-size:12.8px"><span style="font-size:12.8px">>> amavisd.</span><br style="font-size:12.8px"><span style="font-size:12.8px">>></span></div><div><br style="font-size:12.8px"><span style="font-size:12.8px">> What you did is fine. I'm not sure why amavisd and virtualip can't run.</span><br></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">So the cluster is in a clean state, but I still get the "resource cannot run anywhere" log entry; that is the state you see reproduced in the crm_mon output. However, after the cluster recheck interval expires, all resources start fine on the surviving node.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Best regards,</span></div><div><span style="font-size:12.8px">Lorand</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Mar 21, 2016 at 10:48 AM, Ulrich Windl <span dir="ltr"><<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de" target="_blank">Ulrich.Windl@rz.uni-regensburg.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

>>> Lorand Kelemen <<a href="mailto:lorand.kelemen@gmail.com">lorand.kelemen@gmail.com</a>> wrote on 21.03.2016 at 10:08 in<br>

message<br>

<CAO2rmm1P3+LSm0ZmPYb5WYNqonaCT5XS4GQzK=<a href="mailto:ezU%2BWkknRaDA@mail.gmail.com">ezU+WkknRaDA@mail.gmail.com</a>>:<br>

> Reproduced it again:<br>

><br>

> Last updated: Mon Mar 21 10:01:18 2016          Last change: Mon Mar 21<br>

> 09:59:27 2016 by root via crm_attribute on mail1<br>

> Stack: corosync<br>

> Current DC: mail2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with<br>

> quorum<br>

> 2 nodes and 10 resources configured<br>

><br>

> Online: [ mail1 mail2 ]<br>

><br>

> Full list of resources:<br>

><br>

>  Resource Group: network-services<br>

>      virtualip-1        (ocf::heartbeat:IPaddr2):       Stopped<br>

>  Master/Slave Set: spool-clone [spool]<br>

>      Masters: [ mail2 ]<br>

>      Slaves: [ mail1 ]<br>

>  Master/Slave Set: mail-clone [mail]<br>

>      Masters: [ mail2 ]<br>

>      Slaves: [ mail1 ]<br>

>  Resource Group: fs-services<br>

>      fs-spool   (ocf::heartbeat:Filesystem):    Stopped<br>

>      fs-mail    (ocf::heartbeat:Filesystem):    Stopped<br>

>  Resource Group: mail-services<br>

>      amavisd    (systemd:amavisd):      Stopped<br>

>      spamassassin       (systemd:spamassassin): Stopped<br>

>      clamd      (systemd:clamd@amavisd):        Stopped<br>

><br>

> Node Attributes:<br>

> * Node mail1:<br>

>     + master-mail                       : 10000<br>

>     + master-spool                      : 10000<br>

> * Node mail2:<br>

>     + master-mail                       : 10000<br>

>     + master-spool                      : 10000<br>

><br>

> Migration Summary:<br>

> * Node mail1:<br>

> * Node mail2:<br>

>    amavisd: migration-threshold=1 fail-count=1 last-failure='Mon Mar 21<br>

> 10:00:53 2016'<br>

><br>

> Failed Actions:<br>

> * amavisd_monitor_60000 on mail2 'not running' (7): call=2604,<br>

> status=complete, exitreason='none',<br>

>     last-rc-change='Mon Mar 21 10:00:53 2016', queued=0ms, exec=0ms<br>

<br>

Did you try a resource cleanup for "amavisd"? Like crm_resource -C -r amavisd...<br>

<br>

><br>

> Best regards,<br>

> Lorand<br>

><br>

> On Mon, Mar 21, 2016 at 9:57 AM, Ulrich Windl <<br>

> <a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br>

><br>

>> >>> Lorand Kelemen <<a href="mailto:lorand.kelemen@gmail.com">lorand.kelemen@gmail.com</a>> wrote on 18.03.2016 at<br>

>> 16:42 in<br>

>> message<br>

>> <CAO2rmm2OvdKvcSeQSb9+j=<a href="mailto:krqcdpyDk6GB4bUkD6CcQZLXw6Wg@mail.gmail.com">krqcdpyDk6GB4bUkD6CcQZLXw6Wg@mail.gmail.com</a>>:<br>

>> > I reviewed all the logs, but found nothing out of the ordinary, besides<br>

>> the<br>

>> > "resource cannot run anywhere" line, however after the cluster recheck<br>

>> > interval expired the services started fine without any suspicious log<br>

>> > entries.<br>

>> ><br>

>> > If anybody wants to check further I can provide logs, this behaviour is<br>

>> > odd, but good enough for me, with a maximum downtime of cluster recheck<br>

>> > interval...<br>

>><br>

>> Output of "crm_mon -1Arfj"?<br>

>><br>

>> Regards,<br>

>> Ulrich<br>

>><br>

>><br>

>><br>

>> _______________________________________________<br>

>> Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>

>> <a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>

>><br>

>> Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>

>> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>

>> Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>

>><br>


</blockquote></div><br></div>