[ClusterLabs] Sudden stop of pacemaker functions

Jan Pokorný jpokorny at redhat.com
Wed Feb 17 07:59:37 EST 2016


On 17/02/16 14:10 +0200, Klechomir wrote:
> Having a strange issue lately.
> I have a two-node cluster with some cloned resources on it.
> One of my nodes suddenly starts reporting all its resources down (some of
> them are actually running), stops logging and remains in this state
> forever, while still responding to crm commands.
> 
> The curious thing is that restarting corosync/pacemaker doesn't change
> anything.
> 
> Here are the last lines in the log after restart:
> 
> [...]
> Feb 17 12:55:19 [609409] CLUSTER-1        cib:     info:
> cib_process_replace:   Replaced 0.238.40 with 0.238.40 from CLUSTER-2
> Feb 17 12:55:21 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update shutdown=(null) failed: No such device or address
> Feb 17 12:55:22 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update terminate=(null) failed: No such device or address
> Feb 17 12:55:25 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update pingd=(null) failed: No such device or address
> Feb 17 12:55:26 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update fail-count-p_Samba_Server=(null) failed: No such device or address
> Feb 17 12:55:26 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update master-p_Device_drbddrv1=(null) failed: No such device or address
> Feb 17 12:55:27 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update last-failure-p_Samba_Server=(null) failed: No such device or address
> Feb 17 12:55:27 [609413] CLUSTER-1      attrd:  warning: attrd_cib_callback:
> Update probe_complete=(null) failed: No such device or address
> 
> After that the logging on the problematic node stops.

Not sure I follow. What does the following command produce:

    for i in attrd cib corosync crmd lrmd pengine pacemakerd stonithd; do \
    echo "${i}: $(pgrep "${i}")"; done

?
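
If a missing daemon is hard to spot in that output, a slightly more
explicit variant might look like the sketch below. It is only an
illustration: `report_daemons` is a made-up helper name, and it assumes
`pgrep` from procps with its `-x` (exact process name match) option.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): for each daemon name given,
# print its PIDs, or flag it explicitly when no such process exists.
report_daemons() {
    for name in "$@"; do
        # -x matches the process name exactly, so "cib" will not also
        # match a process such as "cibadmin".
        pids=$(pgrep -x "$name" | tr '\n' ' ')
        if [ -n "$pids" ]; then
            # Strip the trailing space left by tr.
            echo "${name}: ${pids% }"
        else
            echo "${name}: NOT RUNNING"
        fi
    done
}

report_daemons attrd cib corosync crmd lrmd pengine pacemakerd stonithd
```

Any line ending in NOT RUNNING would point at a daemon that did not
come back after the restart.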

> Corosync is v2.1.0.26; Pacemaker v1.1.8

Definitely try a more recent version of Pacemaker; what you are using
is 3.5 years old and plenty of fixes have landed since then.

-- 
Jan (Poki)