[Pacemaker] Regarding Stonith RAs
neha chatrath
nehachatrath at gmail.com
Thu Nov 24 12:08:53 UTC 2011
Hello,
I could get the list of Stonith RAs by installing the cman, clvm, ricci,
pacemaker and rgmanager RPMs provided by the CentOS 6 distribution.
Unfortunately, after installing these packages, the Pacemaker-related
processes do not come up when the Heartbeat daemon is started.
When I start the Heartbeat daemon, only the following processes are started:
[root at p init.d]# ps -eaf |grep heartbeat
root 3522 1 0 17:26 ? 00:00:00 heartbeat: master control process
root 3525 3522 0 17:26 ? 00:00:00 heartbeat: FIFO reader
root 3526 3522 0 17:26 ? 00:00:00 heartbeat: write: bcast eth1
root 3527 3522 0 17:26 ? 00:00:00 heartbeat: read: bcast eth1
root 3538 3381 0 17:26 pts/3 00:00:00 grep heartbeat
The following errors are observed in the log messages:
"Nov 24 17:26:19 p heartbeat: [3522]: debug: Signing on API client 3539
(ccm)
Nov 24 17:26:19 p ccm: [3539]: info: Hostname: p
Nov 24 17:26:19 p attrd: [3543]: info: Invoked: /usr/lib/heartbeat/attrd
Nov 24 17:26:19 p stonith-ng: [3542]: info: Invoked:
/usr/lib/heartbeat/stonithd
Nov 24 17:26:19 p cib: [3540]: info: Invoked: /usr/lib/heartbeat/cib
*Nov 24 17:26:19 p lrmd: [3541]: ERROR: socket_wait_conn_new: trying to
create in /var/run/heartbeat/lrm_cmd_sock bind:: No such file or directory*
Nov 24 17:26:19 p lrmd: [3541]: ERROR: main: can not create wait connection
for command.
Nov 24 17:26:19 p lrmd: [3541]: ERROR: Startup aborted (can't create comm
channel). Shutting down.
Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed /usr/lib/heartbeat/lrmd
-r process 3541 exited with return code 100.
Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client /usr/lib/heartbeat/lrmd
-r exited with return code 100.
Nov 24 17:26:19 p attrd: [3543]: info: crm_log_init_worker: Changed active
directory to /var/lib/heartbeat/cores/hacluster
Nov 24 17:26:19 p attrd: [3543]: info: main: Starting up
Nov 24 17:26:19 p stonith-ng: [3542]: info: crm_log_init_worker: Changed
active directory to /var/lib/heartbeat/cores/root
Nov 24 17:26:19 p cib: [3540]: info: crm_log_init_worker: Changed active
directory to /var/lib/heartbeat/cores/hacluster
Nov 24 17:26:19 p attrd: [3543]: CRIT: get_cluster_type: This installation
of Pacemaker does not support the '(null)' cluster infrastructure.
Terminating.
Nov 24 17:26:19 p stonith-ng: [3542]: CRIT: get_cluster_type: This
installation of Pacemaker does not support the '(null)' cluster
infrastructure. Terminating.
Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed /usr/lib/heartbeat/attrd
process 3543 exited with return code 100.
Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client /usr/lib/heartbeat/attrd
exited with return code 100.
Nov 24 17:26:19 p heartbeat: [3522]: info: the send queue length from
heartbeat to client ccm is set to 1024
Nov 24 17:26:19 p heartbeat: [3522]: WARN: Managed
/usr/lib/heartbeat/stonithd process 3542 exited with return code 100.
Nov 24 17:26:19 p heartbeat: [3522]: ERROR: Client
/usr/lib/heartbeat/stonithd exited with return code 100.
*Nov 24 17:26:19 p cib: [3540]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)*
Nov 24 17:26:19 p cib: [3540]: debug: log_data_element: readCibXmlFile:
[on-disk] <cib epoch="0" num_updates="0" admin_epoch="0"
validate-with="pacemaker-1.2" cib-last-written="Mon Nov 21 11:09:22 2011" >
...
....
Nov 24 17:26:19 p crmd: [3544]: info: crmd_init: Starting crmd
Nov 24 17:26:19 p crmd: [3544]: debug: s_crmd_fsa: Processing I_STARTUP: [
state=S_STARTING cause=C_STARTUP origin=crmd_init ]
Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_LOG
Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_STARTUP
Nov 24 17:26:19 p crmd: [3544]: debug: do_startup: Registering Signal
Handlers
Nov 24 17:26:19 p crmd: [3544]: debug: do_startup: Creating CIB and LRM
objects
Nov 24 17:26:19 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_CIB_START
Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
Attempting to talk on: /var/run/crm/cib_rw
Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
Could not init comms on: /var/run/crm/cib_rw
Nov 24 17:26:19 p crmd: [3544]: debug: cib_native_signon_raw: Connection to
command channel failed
Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
Attempting to talk on: /var/run/crm/cib_callback
Nov 24 17:26:19 p crmd: [3544]: debug: init_client_ipc_comms_nodispatch:
Could not init comms on: /var/run/crm/cib_callback
Nov 24 17:26:19 p crmd: [3544]: debug: cib_native_signon_raw: Connection to
callback channel failed
Nov 24 17:26:19 p crmd: [3544]: debug: cib_native_signon_raw: Connection to
CIB failed: connection failed
Nov 24 17:26:19 p crmd: [3544]: debug: cib_native_signoff: Signing out of
the CIB Service
Nov 24 17:26:19 p cib: [3540]: info: write_cib_contents: Archived previous
version as /var/lib/heartbeat/crm/cib-1.raw
Nov 24 17:26:19 p cib: [3540]: debug: write_cib_contents: Writing CIB to
disk
...
...
..
Nov 24 17:27:47 p crmd: [3544]: WARN: do_cib_control: Couldn't complete CIB
registration 30 times... pause and retry
Nov 24 17:27:47 p crmd: [3544]: ERROR: do_cib_control: Could not complete
CIB registration 30 times... hard error
Nov 24 17:27:47 p crmd: [3544]: debug: s_crmd_fsa: Processing I_ERROR: [
state=S_STARTING cause=C_FSA_INTERNAL origin=do_cib_control ]
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_ERROR
Nov 24 17:27:47 p crmd: [3544]: ERROR: do_log: FSA: Input I_ERROR from
do_cib_control() received in state S_STARTING
Nov 24 17:27:47 p crmd: [3544]: info: do_state_transition: State transition
S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=do_cib_control ]
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_DC_TIMER_STOP
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_INTEGRATE_TIMER_STOP
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_FINALIZE_TIMER_STOP
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_RECOVER
Nov 24 17:27:47 p crmd: [3544]: ERROR: do_recover: Action A_RECOVER
(0000000001000000) not supported
Nov 24 17:27:47 p crmd: [3544]: debug: do_fsa_action: actions:trace: //
A_HA_CONNECT
Nov 24 17:27:47 p crmd: [3544]: CRIT: get_cluster_type: This installation
of Pacemaker does not support the '(null)' cluster infrastructure.
Terminating.
Nov 24 17:27:47 p heartbeat: [3522]: WARN: Managed /usr/lib/heartbeat/crmd
process 3544 exited with return code 100.
Nov 24 17:27:47 p heartbeat: [3522]: ERROR: Client /usr/lib/heartbeat/crmd
exited with return code 100."
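To see at a glance which daemons failed and why, the ERROR and CRIT entries can be filtered out of a log like the one above. The following is a minimal Python 3 sketch, not part of the original message. It assumes the syslog-style layout seen in this log ("date host daemon: [pid]: LEVEL: message") and that wrapped continuation lines belong to the preceding entry. The names join_wrapped and failures_by_daemon are made up for illustration.

```python
import re
import sys
from collections import defaultdict

# Start of an entry, e.g. "Nov 24 17:26:19 p lrmd: [3541]: ERROR: main: ..."
# A leading quote or asterisk (mail formatting) is tolerated.
ENTRY = re.compile(
    r'^["*]?(?P<stamp>[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d) (?P<host>\S+) '
    r'(?P<daemon>[\w-]+): \[(?P<pid>\d+)\]: (?P<level>[A-Za-z]+): (?P<msg>.*)$'
)


def join_wrapped(lines):
    """Re-join entries that the mail client wrapped across several lines."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if ENTRY.match(line) or not entries:
            entries.append(line)
        else:
            # No timestamp at the start: treat as continuation of the last entry.
            entries[-1] += " " + line.strip()
    return entries


def failures_by_daemon(lines, levels=("ERROR", "CRIT")):
    """Map each daemon name to its messages logged at the given levels."""
    found = defaultdict(list)
    for entry in join_wrapped(lines):
        match = ENTRY.match(entry)
        if match and match.group("level") in levels:
            found[match.group("daemon")].append(match.group("msg").rstrip('*"'))
    return dict(found)


if __name__ == "__main__":
    # Usage: python3 this_script.py < /var/log/messages
    for daemon, messages in failures_by_daemon(sys.stdin).items():
        for message in messages:
            print(daemon + ": " + message)
```

For the log above, this would group the lrmd socket error separately from the get_cluster_type messages reported by attrd, stonith-ng and crmd, which look like two distinct problems.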
It seems to be looking in the /var/run/heartbeat directory, but the files
are actually present in /usr/var/run/heartbeat.
Can somebody suggest how I should correct that path?
The PATH environment variable has the following value:
[root at p init.d]# echo $PATH
/usr/lib/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
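Note that $PATH only controls where the shell looks for executables, so it is unlikely to decide where lrmd creates its socket. The lrmd error above ("bind:: No such file or directory" for /var/run/heartbeat/lrm_cmd_sock) suggests that the directory itself is missing. A minimal Python 3 sketch to confirm which of the two directories mentioned in this message exist is given below. The function name describe_dirs is made up for illustration, and the sketch only reports on the directories without changing anything.

```python
import os


def describe_dirs(paths):
    """Report, for each candidate path, whether it exists as a directory."""
    return {path: os.path.isdir(path) for path in paths}


if __name__ == "__main__":
    # Both paths are taken from the message above.
    candidates = ["/var/run/heartbeat", "/usr/var/run/heartbeat"]
    for path, present in describe_dirs(candidates).items():
        print(path + ": " + ("present" if present else "missing"))
```

If only the /usr/var/run variant is present, one possible reading (not confirmed in this thread) is that the running binaries and the installed files come from builds configured with different prefixes.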
Thanks and regards
Neha Chatrath
Date: Fri, 18 Nov 2011 10:22:22 +1100
From: Andrew Beekhof <andrew at beekhof.net>
To: The Pacemaker cluster resource manager
<pacemaker at oss.clusterlabs.org>
Subject: Re: [Pacemaker] Regarding Stonith RAs
Message-ID:
<CAEDLWG2QO+-puhr2qOuvXSCRUcg2gXHE=i=1d3LosfN_PcSm9A at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
On Thu, Nov 17, 2011 at 1:28 AM, Dejan Muhamedagic <dejanmm at fastmail.fm>
wrote:
> Hi,
>
> On Wed, Nov 16, 2011 at 05:49:30PM +0530, neha chatrath wrote:
> [...]
>> Nov 14 13:16:57 ggns2mexsatsdp17.hsc.com lrmd: [3976]: notice:
>> on_msg_get_rsc_types: can not find this RA class stonith"
>
> The PILS plugin handling stonith resources was not found.
> Strange, cannot recall seeing this before.
Could be a RHEL6 based distro.
> It should be in
> /usr/lib/heartbeat/plugins/RAExec/stonith.so (or /usr/lib64
> depending on your installation). Please check permissions and if
> this file is really a valid so object file. If everything's in
> order no idea what else could be the reason. You could strace
> lrmd on startup and see what happens between lines 1137 and 1158.
>
> Thanks,
>
> Dejan
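The file checks suggested above (the plugin exists, is readable, and looks like a shared object) can be sketched in Python 3 as follows. This is an illustration only: check_plugin is a made-up name, and comparing the first four bytes against the ELF magic number is a quick sanity check, not proof that the file is a loadable shared object.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def check_plugin(path):
    """Return a list of problems found with a shared-object plugin file."""
    if not os.path.isfile(path):
        return ["missing or not a regular file"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append("not readable by the current user")
    else:
        with open(path, "rb") as handle:
            if handle.read(4) != ELF_MAGIC:
                problems.append("does not start with the ELF magic bytes")
    return problems


if __name__ == "__main__":
    # Path quoted in the reply above; use the /usr/lib64 variant if applicable.
    plugin = "/usr/lib/heartbeat/plugins/RAExec/stonith.so"
    issues = check_plugin(plugin)
    print(plugin + ": " + ("; ".join(issues) if issues else "looks OK"))
```

An empty result only means these basic checks passed. The strace suggestion above is still the way to see what lrmd actually does at startup.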
On Mon, Nov 14, 2011 at 2:05 PM, neha chatrath <nehachatrath at gmail.com> wrote:
> Hello,
> I am facing an issue configuring a Stonith resource in my 2-node
> cluster.
> Whenever I run the following command:
> "crm configure primitive app_fence stonith::external/ipmi params hostname=
> ggns2mexsatsdp17.hsc.com ipaddr=192.168.113.17 userid=root
> passwd=pass at abc123" ,
> I get the following errors:
>
> "ERROR: stonith:external/ipmi: could not parse meta-data:
> Traceback (most recent call last):
> File "/usr/sbin/crm", line 41, in <module>
> crm.main.run()
> File "/usr/lib/python2.6/site-packages/crm/main.py", line 249, in run
> if parse_line(levels,shlex.split(' '.join(args))):
> File "/usr/lib/python2.6/site-packages/crm/main.py", line 145, in
> parse_line
> lvl.release()
> File "/usr/lib/python2.6/site-packages/crm/levels.py", line 68, in
> release
> self.droplevel()
> File "/usr/lib/python2.6/site-packages/crm/levels.py", line 87, in
> droplevel
> self.current_level.end_game(self._in_transit)
> File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1524, in end_game
> self.commit("commit")
> File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1425, in commit
> self._verify(mkset_obj("xml","changed"),mkset_obj("xml"))
> File "/usr/lib/python2.6/site-packages/crm/ui.py", line 1324, in _verify
> rc2 = set_obj_semantic.semantic_check(set_obj_all)
> File "/usr/lib/python2.6/site-packages/crm/cibconfig.py", line 280, in
> semantic_check
> rc = self.__check_unique_clash(set_obj_all)
> File "/usr/lib/python2.6/site-packages/crm/cibconfig.py", line 260, in
> __check_unique_clash
> process_primitive(node, clash_dict)
> File "/usr/lib/python2.6/site-packages/crm/cibconfig.py", line 245, in
> process_primitive
> if ra_params[ name ].get("unique") == "1":
> TypeError: 'NoneType' object is unsubscriptable
> "
> In /var/log/messages, the following error is reported by lrmd: "notice:
> on_msg_get_metadata: can not find the class stonith"
>
> It seems that it is not able to find any RAs related to Stonith.
>
> The following is the output of some crm commands:
> *crm(live)ra# classes*
> heartbeat
> lsb
> ocf / heartbeat linbit mcg pacemaker
> *stonith*
>
> *crm(live)ra# list ocf heartbeat*
> AoEtarget AudibleAlarm CTDB
> ClusterMon Delay Dummy
> EvmsSCC Evmsd Filesystem
> ICP IPaddr IPaddr2
> IPsrcaddr IPv6addr LVM
> LinuxSCSI MailTo ManageRAID
> ManageVE Pure-FTPd Raid1
> Route SAPDatabase SAPInstance
> SendArp ServeRAID SphinxSearchDaemon
> Squid Stateful SysInfo
> VIPArip VirtualDomain WAS
> WAS6 WinPopup Xen
> Xinetd anything apache
> conntrackd db2 drbd
> eDir88 exportfs fio
> iSCSILogicalUnit iSCSITarget ids
> iscsi jboss ldirectord
> mysql mysql-proxy nfsserver
> nginx oracle oralsnr
> pgsql pingd portblock
> postfix proftpd rsyncd
> scsi2reservation sfex syslog-ng
> tomcat vmware
>
> *crm(live)ra# list stonith*
>
> crm(live)ra#
>
> All the stonith-related RAs are present in
> /usr/lib/stonith/plugins/external.
>
> The following is the output of the "ls" command:
> [root at ggns2mexsatsdp17 ~]# ls /usr/lib/stonith/plugins/external/
>
> drac5 hetzner ibmrsa ipmi kdumpcheck nut
> output riloe ssh vmware xen0-ha
> dracmc-telnet hmchttp ibmrsa-telnet ippower9258 libvirt ouput
> rackpdu sbd vcenter xen0
>
> Can somebody please help me with this?
>
> Thanks and regards
> Neha Chatrath
>
>