[Pacemaker] DRBD Split-brain (recovered), but still showing "Failed Actions"

Reid, Mike MBReid at thepei.com
Wed Apr 11 13:48:25 EDT 2012

Thanks Andreas -- I'm not familiar with "maintenance-mode", very good to
know.

Before I go research your suggestion, is the basic idea that you can
enable maintenance mode from ANY node (or just the node with the failed
action?), restart pacemaker/corosync services on ALL nodes (or, again,
just the one with the failed action?) -- all without any cluster service
interruption -- and then disable maintenance mode once the "Failed
Actions" have been cleaned up?

>Message: 3
>Date: Wed, 11 Apr 2012 00:12:10 +0200
>From: Andreas Kurz <andreas at hastexo.com>
>To: pacemaker at oss.clusterlabs.org
>Subject: Re: [Pacemaker] DRBD Split-brain (recovered), but still
>	showing "Failed Actions"
>Message-ID: <4F84B03A.4030004 at hastexo.com>
>Content-Type: text/plain; charset="iso-8859-1"
>On 04/10/2012 05:43 PM, Reid, Mike wrote:
>> Thank you for the suggestion, Andreas. Unfortunately, that does not
>> seem to have cleaned up the Failed Actions either:
>>> crm resource cleanup msDRBD
>> Cleaning up resDRBD:0 on hostname2
>> Cleaning up resDRBD:1 on hostname2
>> Cleaning up resDRBD:0 on hostname1
>> Cleaning up resDRBD:1 on hostname1
>>> crm_mon -1
>> [...]
>> Failed actions:
>>     resDRBD:1_promote_0 (node=hostname2, call=530, rc=-2, status=Timed Out): unknown exec error
>> Are there any other options that do not involve a failover + restart?
>If you switch your cluster into maintenance mode ...
>crm configure property maintenance-mode=true
>... you can stop pacemaker and even corosync without interrupting your
>services ... don't forget to disable it again after restart.
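
For reference, the sequence Andreas describes might look roughly like the
sketch below. Only the "crm configure property maintenance-mode=true"
command comes from the thread; the "service pacemaker" / "service corosync"
init-script names, the ordering of the stop/start calls, and the run() and
maintenance_restart() helpers are assumptions for illustration and will
vary by distribution and cluster stack. It defaults to a dry run that only
prints the commands.

```shell
#!/bin/sh
# Sketch only: restart the cluster stack under maintenance mode.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

# run: hypothetical helper; prints the command in dry-run mode, else runs it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# maintenance_restart: hypothetical wrapper around the steps in the thread.
maintenance_restart() {
    # From the thread: tell Pacemaker to stop managing resources.
    # This is a cluster-wide property, so it is set once, not per node.
    run crm configure property maintenance-mode=true

    # Assumption: separate init scripts named "pacemaker" and "corosync".
    run service pacemaker stop
    run service corosync stop
    run service corosync start
    run service pacemaker start

    # Check status (as used earlier in the thread) before handing
    # control back to the cluster.
    run crm_mon -1

    # "don't forget to disable it again after restart"
    run crm configure property maintenance-mode=false
}

maintenance_restart
```

Run it as-is to review the printed commands; setting DRY_RUN=0 would
execute them. Whether the stack has to be restarted on every node or only
on the one reporting the failed action is not answered in this thread.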
