[Pacemaker] problem starting new instance of pacemaker (via corosync)

Vladislav Bogdanov bubble at hoster-ok.com
Fri Sep 7 14:58:17 EDT 2012


07.09.2012 18:28, John White wrote:
> An odd update to this. We run in a stateless environment (nodes are
> pxe booted and have NFS roots, etc). Trying the same install on a VM
> works just fine. I wonder if anyone has experience with pacemaker and
> stateless nodes.

I run it with an ISO image loaded from a PXE server into RAM.
State data and cluster-wide configuration are on CIFS. Volatile RW data
is on tmpfs.

You probably have some trouble with the communication paths used for
interconnection. Try mounting a tmpfs on /var/run. Or where does that
socket live on Linux? The address is set up like this:

	memset (&address, 0, sizeof (struct sockaddr_un));
	address.sun_family = AF_UNIX;
#if defined(COROSYNC_LINUX)
	sprintf (address.sun_path + 1, "%s", socket_name);
#else
	sprintf (address.sun_path, "%s/%s", SOCKETDIR, socket_name);
#endif

>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Connection to our AIS plugin (10) failed: Library error (2)
Error 2 is ENOENT (No such file or directory).
Could you post the contents of /proc/mounts?

Vladislav


> ----------------
> John White
> HPC Systems Engineer
> (510) 486-7307
> One Cyclotron Rd, MS: 50C-3209C
> Lawrence Berkeley National Lab
> Berkeley, CA 94720
> 
> On Sep 6, 2012, at 2:49 PM, John White <jwhite at lbl.gov> wrote:
> 
>> Hello Folks,
>> 	I'm having a very hard time getting a basic pacemaker setup going.  I've gotten corosync up and running just fine from what I can tell, but once I start with pacemaker commands, I get CIB errors everywhere:
>>
>> -bash-4.1# crm configure
>> Signon to CIB failed: connection failed
>> Init failed, could not perform requested operations
>> ERROR: cannot parse xml: no element found: line 1, column 0
>> crm(live)configure#
>>
>> Digging deeper, I see both attrd and cib failing to connect to the AIS plugin:
>>
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: HA Signon failed
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: Aborting startup
>> -snip-
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: get_cluster_type: Cluster type is: 'openais'
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin)
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Creating connection to our Corosync plugin
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Connection to our AIS plugin (10) failed: Library error (2)
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: CRIT: cib_init: Cannot sign in to the cluster… terminating
>>
>>
>> I'm really at a loss here after 3 days, any ideas or hints as to where I might find a solution?  More logging available upon request.
>>
>>
>>
>>
> 
> 




