[Pacemaker] can't get pacemaker started

Dave Jiang dave.jiang at intel.com
Thu Jul 26 19:31:34 EDT 2012


Hi. I'm following the Clusters from Scratch guide to create a simple
active/passive two-node cluster. I'm using the standard packages that
come with Fedora 17. I have corosync running and linked up. However, I
cannot seem to get Pacemaker to run correctly. I don't see all the
processes loaded:

17286 ?        Ss     0:00 /usr/sbin/pacemakerd -f
17288 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd

Looking at the log, these stand out:

Jul 26 16:26:02 leftnode cib[17378]:  warning: retrieveCib: Cluster
configuration not found: /var/lib/heartbeat/crm/cib.xml
Jul 26 16:26:02 leftnode attrd[17381]:   notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Jul 26 16:26:02 leftnode cib[17378]:  warning: readCibXmlFile: Primary
configuration corrupt or unusable, trying backup...
Jul 26 16:26:02 leftnode crmd[17383]:     info: crm_log_init_worker:
Changed active directory to /var/lib/heartbeat/cores/hacluster
Jul 26 16:26:02 leftnode cib[17378]:  warning: readCibXmlFile:
Continuing with an empty configuration.

I'm not running heartbeat. Should I be? It wasn't mentioned in the guide.

Then I noticed that qb_rb_chmod failed, followed by a bunch of other
failures. Any idea what I am not setting up correctly?

Jul 26 16:26:02 leftnode crmd[17383]:   notice: main: CRM Git Version:
ee0730e13d124c3d58f00016c3376a1de5323cff
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ]
qb_rb_chmod:cpg-request-16373-17381-254: Operation not permitted (1)
Jul 26 16:26:02 leftnode cib[17378]:     info: validate_with_relaxng:
Creating RNG parser context
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ] shm connection
FAILED: Operation not permitted (1)
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ] Error in connection
setup (16373-17381-254): Operation not permitted (1)
Jul 26 16:26:02 leftnode attrd[17381]:    error: init_cpg_connection:
Could not connect to the Cluster Process Group API: 2
Jul 26 16:26:02 leftnode stonith-ng[17379]:     info:
init_ais_connection_once: Connection to 'corosync': established
Jul 26 16:26:02 leftnode attrd[17381]:    error: main: HA Signon failed
Jul 26 16:26:02 leftnode stonith-ng[17379]:     info: crm_new_peer: Node
leftnode now has id: 16820416
Jul 26 16:26:02 leftnode attrd[17381]:    error: main: Aborting startup
Jul 26 16:26:02 leftnode stonith-ng[17379]:     info: crm_new_peer: Node
16820416 is now known as leftnode
Jul 26 16:26:02 leftnode pacemakerd[17377]:    error: pcmk_child_exit:
Child process attrd exited (pid=17381, rc=100)
Jul 26 16:26:02 leftnode pacemakerd[17377]:  warning: pcmk_child_exit:
Pacemaker child process attrd no longer wishes to be respawned. Shutting
ourselves down.
Jul 26 16:26:02 leftnode pacemakerd[17377]:   notice:
pcmk_shutdown_worker: Shuting down Pacemaker
Jul 26 16:26:02 leftnode pacemakerd[17377]:   notice: stop_child:
Stopping crmd: Sent -15 to process 17383
Jul 26 16:26:02 leftnode crmd[17383]:     info: do_cib_control: Could
not connect to the CIB service: connection failed
Jul 26 16:26:02 leftnode cib[17378]:     info: startCib: CIB
Initialization completed successfully
Jul 26 16:26:02 leftnode crmd[17383]:  warning: do_cib_control: Couldn't
complete CIB registration 1 times... pause and retry
Jul 26 16:26:02 leftnode cib[17378]:     info: get_cluster_type: Cluster
type is: 'corosync'
Jul 26 16:26:02 leftnode crmd[17383]:     info: crm_signal_dispatch:
Invoking handler for signal 15: Terminated
Jul 26 16:26:02 leftnode cib[17378]:   notice: crm_cluster_connect:
Connecting to cluster infrastructure: corosync
Jul 26 16:26:02 leftnode crmd[17383]:   notice: crm_shutdown: Requesting
shutdown, upper limit is 1200000ms
Jul 26 16:26:02 leftnode crmd[17383]:  warning: do_log: FSA: Input
I_SHUTDOWN from crm_shutdown() received in state S_STARTING
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ]
qb_rb_chmod:cpg-request-16373-17378-255: Operation not permitted (1)
Jul 26 16:26:02 leftnode crmd[17383]:   notice: do_state_transition:
State transition S_STARTING -> S_STOPPING [ input=I_SHUTDOWN
cause=C_SHUTDOWN origin=crm_shutdown ]
Jul 26 16:26:02 leftnode crmd[17383]:     info: get_cluster_type:
Cluster type is: 'corosync'
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ] shm connection
FAILED: Operation not permitted (1)
Jul 26 16:26:02 leftnode crmd[17383]:   notice:
terminate_ais_connection: Disconnecting from Corosync
Jul 26 16:26:02 leftnode corosync[16373]:   [QB    ] Error in connection
setup (16373-17378-255): Operation not permitted (1)
Jul 26 16:26:02 leftnode cib[17378]:    error: init_cpg_connection:
Could not connect to the Cluster Process Group API: 2
Jul 26 16:26:02 leftnode crmd[17383]:     info:
terminate_ais_connection: No CPG connection
Jul 26 16:26:02 leftnode cib[17378]:     crit: cib_init: Cannot sign in
to the cluster... terminating
Jul 26 16:26:02 leftnode crmd[17383]:     info:
terminate_ais_connection: No Quorum connection
Jul 26 16:26:02 leftnode pacemakerd[17377]:    error: pcmk_child_exit:
Child process cib exited (pid=17378, rc=100)
Jul 26 16:26:02 leftnode crmd[17383]:     info: do_ha_control:
Disconnected from OpenAIS
Jul 26 16:26:02 leftnode pacemakerd[17377]:  warning: pcmk_child_exit:
Pacemaker child process cib no longer wishes to be respawned. Shutting
ourselves down.
Jul 26 16:26:02 leftnode crmd[17383]:     info: do_cib_control:
Disconnecting CIB
Jul 26 16:26:02 leftnode crmd[17383]:     info: do_exit: Performing
A_EXIT_0 - gracefully exiting the CRMd
Jul 26 16:26:02 leftnode crmd[17383]:     info: free_mem: Dropping
I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Jul 26 16:26:02 leftnode crmd[17383]:     info: crm_xml_cleanup:
Cleaning up memory from libxml2
Jul 26 16:26:02 leftnode crmd[17383]:     info: do_exit: [crmd] stopped (0)
Jul 26 16:26:02 leftnode pacemakerd[17377]:     info: pcmk_child_exit:
Child process crmd exited (pid=17383, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]:   notice: stop_child:
Stopping pengine: Sent -15 to process 17382
Jul 26 16:26:02 leftnode pacemakerd[17377]:     info: pcmk_child_exit:
Child process pengine exited (pid=17382, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]:   notice: stop_child:
Stopping lrmd: Sent -15 to process 17380
Jul 26 16:26:02 leftnode lrmd: [17380]: info: lrmd is shutting down
Jul 26 16:26:02 leftnode pacemakerd[17377]:     info: pcmk_child_exit:
Child process lrmd exited (pid=17380, rc=0)
Jul 26 16:26:02 leftnode pacemakerd[17377]:   notice: stop_child:
Stopping stonith-ng: Sent -15 to process 17379
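
For anyone triaging a log like the one above, a small filter that keeps
only the warning, error and crit entries can make the failure chain
easier to follow. This is a minimal sketch, not part of Pacemaker: the
names LINE_RE, SEVERE and severe_entries are hypothetical, and it
assumes each entry sits on one unwrapped line in the
"daemon[pid]: level: message" layout seen above. Lines in other layouts
(the corosync "[QB    ]" entries, the lrmd entry) are simply skipped.

```python
import re

# Matches entries such as:
#   "Jul 26 16:26:02 leftnode attrd[17381]:    error: main: HA Signon failed"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+"            # timestamp and host name
    r"(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+"    # daemon[pid]:
    r"(?P<level>\w+):\s+(?P<msg>.*)$"            # level: message
)

# Severity labels worth keeping (assumed from the log excerpt above).
SEVERE = {"warning", "error", "crit"}


def severe_entries(lines):
    """Yield (daemon, level, message) for warning/error/crit entries."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") in SEVERE:
            yield m.group("daemon"), m.group("level"), m.group("msg")


# Three entries from the log above, joined back onto single lines.
sample = [
    "Jul 26 16:26:02 leftnode attrd[17381]:    error: main: HA Signon failed",
    "Jul 26 16:26:02 leftnode crmd[17383]:     info: do_cib_control: Disconnecting CIB",
    "Jul 26 16:26:02 leftnode cib[17378]:     crit: cib_init: Cannot sign in to the cluster... terminating",
]

for daemon, level, msg in severe_entries(sample):
    print(f"{daemon:10s} {level:8s} {msg}")
```

To run it against a real file, pass an open file object (for example
the system log, wherever it lives on your distribution) instead of the
sample list.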




