[ClusterLabs] Pacemaker fails to start after few starts

Andrew Beekhof andrew at beekhof.net
Sun Mar 29 23:36:05 UTC 2015


> On 28 Mar 2015, at 1:10 am, Kostiantyn Ponomarenko <konstantin.ponomarenko at gmail.com> wrote:
> 
> Hi,
> 
> If I start/stop Corosync and Pacemaker a few times, I end up in a state where Corosync is running but Pacemaker cannot start.
> Here is a snippet from /var/log/messages:
> 
> Mar 27 14:00:49 daemon.notice<29> corosync[111057]:   [MAIN  ] Corosync Cluster Engine ('2.3.4'): started and ready to provide service.
> Mar 27 14:00:49 daemon.info<30> corosync[111057]:   [MAIN  ] Corosync built-in features: pie relro bindnow
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] Initializing transport (UDP/IP Unicast).
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: sha256
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] Initializing transport (UDP/IP Unicast).
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: sha256
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] The network interface [169.254.0.2] is now up.
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync configuration map access [0]
> Mar 27 14:00:49 daemon.info<30> corosync[111058]:   [QB    ] server name: cmap
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync configuration service [1]
> Mar 27 14:00:49 daemon.info<30> corosync[111058]:   [QB    ] server name: cfg
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> Mar 27 14:00:49 daemon.info<30> corosync[111058]:   [QB    ] server name: cpg
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync profile loading service [4]
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [QUORUM] Using quorum provider corosync_votequorum
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync vote quorum service v1.0 [5]
> Mar 27 14:00:49 daemon.info<30> corosync[111058]:   [QB    ] server name: votequorum
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1 [3]
> Mar 27 14:00:49 daemon.info<30> corosync[111058]:   [QB    ] server name: quorum
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] adding new UDPU member {169.254.0.2}
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] adding new UDPU member {169.254.0.3}
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] The network interface [169.254.1.2] is now up.
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] adding new UDPU member {169.254.1.2}
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] adding new UDPU member {169.254.1.3}
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [TOTEM ] A new membership (169.254.0.2:1296) was formed. Members joined: 1
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [QUORUM] Members[1]: 1
> Mar 27 14:00:49 daemon.notice<29> corosync[111058]:   [MAIN  ] Completed service synchronization, ready to provide service.
> Mar 27 14:00:49 daemon.notice<29> pacemaker: Starting Pacemaker Cluster Manager
> Mar 27 14:00:49 daemon.notice<29> pacemakerd[111069]:   notice: crm_add_logfile: Additional logging available in /var/log/pacemaker.log
> Mar 27 14:00:49 daemon.err<27> pacemakerd[111069]:    error: mcp_read_config: Couldn't create logfile: /var/log/pacemaker.log

Disk full?
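
A quick way to check, as a minimal sketch: the two helpers below are hypothetical (they are not part of Pacemaker) and use only POSIX open() and statvfs(). The first reports the errno you get when opening the logfile for append, the second reports the free space on the filesystem. Opening an existing file can succeed even when later writes would fail, so look at both results.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Hypothetical helper: try to open (or create) the logfile for appending.
 * Returns 0 on success, otherwise the errno value explaining the failure
 * (e.g. ENOSPC for a full disk, EACCES for permissions, ENOENT for a
 * missing directory).
 * Note: on success this may leave an empty file behind if none existed. */
int check_logfile(const char *path)
{
    int fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0640);
    if (fd < 0) {
        return errno;
    }
    close(fd);
    return 0;
}

/* Hypothetical helper: bytes available to unprivileged users on the
 * filesystem holding `dir`, or -1 if statvfs() fails. */
long long free_bytes(const char *dir)
{
    struct statvfs vfs;
    if (statvfs(dir, &vfs) != 0) {
        return -1;
    }
    return (long long)vfs.f_bavail * (long long)vfs.f_frsize;
}
```

For this report you would call check_logfile("/var/log/pacemaker.log") and free_bytes("/var/log") as the user pacemakerd runs as, and print strerror() of any non-zero result.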

> Mar 27 14:00:49 daemon.notice<29> pacemakerd[111069]:   notice: mcp_read_config: Configured corosync to accept connections from group 107: Library error (2)

Everything else flows from this.
Perhaps one of the corosync people can comment on the conditions under which cmap_set_uint8() would fail with "Library error (2)".

The relevant code from Pacemaker is:

            char key[PATH_MAX];
            snprintf(key, PATH_MAX, "uidgid.gid.%u", gid);
            rc = cmap_set_uint8(local_handle, key, 1);
            crm_notice("Configured corosync to accept connections from group %u: %s (%d)",
                       gid, ais_error2text(rc), rc);
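
Note that the snippet logs at notice level whatever rc is, which is why the failure above shows up as a notice. As a rough sketch of the same steps with the return code checked: the helper names and the SKETCH_* constants below are my own and only assumed to mirror corosync's cs_error_t (where success is 1, not 0, and 2 is the library error seen in the log). Nothing here calls corosync, so it only illustrates the key format and the check.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: these mirror corosync's cs_error_t values (CS_OK and
 * CS_ERR_LIBRARY). They are redefined locally so the sketch stands alone. */
enum { SKETCH_CS_OK = 1, SKETCH_CS_ERR_LIBRARY = 2 };

/* Build the cmap key the same way the Pacemaker snippet does.
 * Returns 0 on success, -1 if the key did not fit in buf. */
int build_gid_key(char *buf, size_t len, unsigned int gid)
{
    int n = snprintf(buf, len, "uidgid.gid.%u", gid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: choose a log severity from the return code instead
 * of always logging at notice level. */
const char *severity_for_rc(int rc)
{
    return (rc == SKETCH_CS_OK) ? "notice" : "error";
}
```

For gid 107 the key comes out as "uidgid.gid.107", and a return code of 2 would be reported as an error rather than a notice.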




