[Pacemaker] FW: Some resources are restarted after a node joins back cluster after failover.

Andrew Beekhof andrew at beekhof.net
Thu Oct 11 02:04:54 UTC 2012


On Tue, Oct 2, 2012 at 6:42 PM, Poonam Agarwal
<Poonam.Agarwal at ipaccess.com> wrote:
> Hi,
>
>
>
> I had sent this message before; I don't know how or why it got dropped.

Perhaps you weren't subscribed yet?

>
> I am facing the issue below. Can somebody please help?

There were some problems in this area in the past.
Would you consider upgrading to 1.1.8?  It has many bugfixes.

   http://www.clusterlabs.org/rpm-next/


>
> -Poonam.
>
>
>
> From: Poonam Agarwal
> Sent: Thursday, September 20, 2012 11:24 AM
> To: 'pacemaker at oss.clusterlabs.org'
> Subject: Some resources are restarted after a node joins back cluster after
> failover.
>
>
>
> Hi,
>
>
>
> I have a two-node HA cluster consisting of oamdev-vm2 and oamdev-vm3. oamdev-vm2 was
> master for the resources ms_drbd_resource_r0 and NOSServiceManager0.
>
> Then oamdev-vm2 was taken down with 'service corosync stop'. Failover
> occurred and oamdev-vm3 became master for all resources.
>
> Now, when oamdev-vm2 was brought back with 'service corosync start' and
> rejoined the cluster, some resources on the master node oamdev-vm3 were
> restarted.
>
> This is not expected, as it takes my main resource NOSServiceManager0 down
> and increases the application downtime.
>
> I am not able to figure out what is causing these restarts. Could any of
> the ordering or colocation rules be causing this?
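
The list of restarted resources in the logs below lines up with the ordering
constraints that involve clones (portmapCluster before NOSFileSystemCluster
and before fs1_group). By default clones are not interleaved, so an ordering
constraint against a clone effectively waits on the whole clone set: when the
rejoining node starts its own instance, dependents on the surviving node get
stopped and started again. This is not a confirmed diagnosis from your logs,
but a sketch of the usual mitigation, assuming the standard clone interleave
semantics apply on 1.1.5, would be to interleave the clones that appear in
ordering constraints, e.g.:

```
clone portmapCluster portmap \
        meta interleave="true" target-role="Started"
clone NOSFileSystemCluster NOSFileSystem \
        meta interleave="true" target-role="Started"
```

The same consideration applies to the ms resources used in the
order_filesystem_after_drbd_* constraints.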
>
> Below are the Pacemaker and Corosync versions and the cluster configuration,
> followed at the bottom of this email by the Corosync logs that show the
> restarted services.
>
>
>
> I am using Red Hat 5 with the following versions of pacemaker/corosync.
>
> [root at oamdev-vm2 ~]# rpm -qa | grep pacemaker
>
> pacemaker-libs-1.1.5-1.1.el5
>
> pacemaker-1.1.5-1.1.el5
>
> pacemaker-debuginfo-1.0.11-1.2.el5
>
> drbd-pacemaker-8.3.12-1
>
> [root at oamdev-vm2 ~]# rpm -qa | grep corosync
>
> corosync-debuginfo-1.2.7-1.1.el5
>
> corosync-1.2.7-1.1.el5
>
> corosynclib-devel-1.2.7-1.1.el5
>
> corosynclib-1.2.7-1.1.el5
>
>
>
> My cluster configuration (crm configure show) looks like this:
>
>
>
> node oamdev-vm2
>
> node oamdev-vm3 \
>
>         attributes standby="off"
>
> primitive Apache ocf:heartbeat:apache \
>
>         params configfile="/etc/httpd/conf/httpd.conf"
> statusurl="http://localhost/server-status" \
>
>         op monitor interval="30s" OCF_CHECK_LEVEL="0" \
>
>         op start interval="0" timeout="40s" \
>
>         op stop interval="0" timeout="60s"
>
> primitive Imq ocf:ipaccess:imq \
>
>         op monitor interval="15s"
>
> primitive NOSFileSystem ocf:heartbeat:Filesystem \
>
>         params device="10.255.239.26:/var/lib/ipaccess/export/data"
> directory="/var/lib/ipaccess/data" fstype="nfs" \
>
>         op start interval="0" timeout="60s" \
>
>         op stop interval="0" timeout="360s" \
>
>         op monitor interval="60s"
>
> primitive NOSIpAddress10_255_239_23 ocf:ipaccess:ipaddress \
>
>         params ip="10.255.239.23" cidr_netmask="24" networkType="Internal" \
>
>         op monitor interval="15s" \
>
>         meta target-role="Started"
>
> primitive NOSIpAddress10_255_239_25 ocf:ipaccess:ipaddress \
>
>         params ip="10.255.239.25" cidr_netmask="24" networkType="Internal" \
>
>         op monitor interval="30s" \
>
>         meta target-role="Started"
>
> primitive NOSServiceManager0 ocf:ipaccess:glassfish \
>
>         params objectInstanceId="0"
> databaseUrl="jdbc:mysql://10.255.239.24:3306/nos" \
>
>         op start interval="0" timeout="300" \
>
>         op stop interval="0" timeout="300" \
>
>         op monitor interval="10s" OCF_CHECK_LEVEL="0" \
>
>         meta target-role="Started" resource-stickiness="1000"
>
> primitive p_drbd_resource_fs1 ocf:linbit:drbd \
>
>         params drbd_resource="fs1" \
>
>         op monitor interval="29s" role="Master" timeout="120s" \
>
>         op monitor interval="31s" role="Slave" timeout="120s" \
>
>         op start interval="0" timeout="240s" \
>
>         op stop interval="0" timeout="100s"
>
> primitive p_drbd_resource_r0 ocf:linbit:drbd \
>
>         params drbd_resource="r0" \
>
>         op monitor interval="29s" role="Master" timeout="120s" \
>
>         op monitor interval="31s" role="Slave" timeout="120s" \
>
>         op start interval="0" timeout="240s" \
>
>         op stop interval="0" timeout="100s"
>
> primitive p_export_fs1 ocf:heartbeat:exportfs \
>
>         params clientspec="10.255.239.26/24"
> directory="/var/lib/ipaccess/export/data" fsid="3211"
> options="sync,rw,no_root_squash" \
>
>         op monitor interval="60s"
>
> primitive p_filesystem_drbd_fs1 ocf:heartbeat:Filesystem \
>
>         params device="/dev/drbd/by-res/fs1" options="user_xattr,rw,acl"
> directory="/var/lib/ipaccess/export" fstype="ext3"
>
> primitive p_filesystem_drbd_r0 ocf:heartbeat:Filesystem \
>
>         params device="/dev/drbd/by-res/r0" options="user_xattr,rw,acl"
> directory="/var/lib/mysql" fstype="ext3"
>
> primitive p_ip_fs1 ocf:ipaccess:ipaddress \
>
>         params ip="10.255.239.26" cidr_netmask="24" networkType="Internal" \
>
>         op monitor interval="30s"
>
> primitive p_ip_mysql ocf:ipaccess:ipaddress \
>
>         params ip="10.255.239.24" cidr_netmask="24" networkType="Internal" \
>
>         op monitor interval="30s"
>
> primitive p_mysql ocf:heartbeat:mysql \
>
>         params binary="/usr/bin/mysqld_safe" pid="/var/lib/mysql/mysqld.pid"
> datadir="/var/lib/mysql" \
>
>         op monitor interval="10" timeout="30" \
>
>         op start interval="0" timeout="120" \
>
>         op stop interval="0" timeout="120"
>
> primitive p_nfsserver_fs1 ocf:ipaccess:nfsserver \
>
>         params nfs_init_script="/usr/lib/ipaccess/tools/nfs-ha"
> nfs_shared_infodir="/var/lib/ipaccess/export/nfsinfo"
> nfs_notify_cmd="/usr/lib/ipaccess/tools/nfs-notify" nfs_ip="10.255.239.26" \
>
>         op start interval="0" timeout="60s" \
>
>         op stop interval="0" timeout="60s" \
>
>         op monitor interval="30s"
>
> primitive portmap lsb:portmap \
>
>         op monitor interval="120s"
>
> group fs1_group p_filesystem_drbd_fs1 p_ip_fs1 p_nfsserver_fs1 p_export_fs1
> \
>
>         meta target-role="Started"
>
> group mysql_group p_filesystem_drbd_r0 p_ip_mysql p_mysql \
>
>         meta target-role="Started"
>
> ms ms_drbd_resource_fs1 p_drbd_resource_fs1 \
>
>         meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" notify="true" target-role="Started"
>
> ms ms_drbd_resource_r0 p_drbd_resource_r0 \
>
>         meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" notify="true" target-role="Started"
>
> clone ApacheCluster Apache \
>
>         meta globally-unique="false" ordered="false" target-role="Started"
>
> clone ImqCluster Imq \
>
>         meta globally-unique="false" ordered="true" notify="true"
> target-role="Started"
>
> clone NOSFileSystemCluster NOSFileSystem \
>
>         meta target-role="Started"
>
> clone portmapCluster portmap \
>
>         meta target-role="Started"
>
> location location_NOSServiceManager0_oamdev-vm2  NOSServiceManager0 100:
> oamdev-vm2
>
> location location_ms_drbd_resource_fs1_master ms_drbd_resource_fs1 100:
> oamdev-vm3
>
> location location_ms_drbd_resource_fs1_nodes ms_drbd_resource_fs1 \
>
>         rule $id="location_ms_drbd_resource_fs1_nodes-rule" -inf: #uname ne
> oamdev-vm3  and #uname ne oamdev-vm2
>
> location location_ms_drbd_resource_r0_master ms_drbd_resource_r0 100:
> oamdev-vm2
>
> location location_ms_drbd_resource_r0_nodes ms_drbd_resource_r0 \
>
>         rule $id="location_ms_drbd_resource_r0_nodes-rule" -inf: #uname ne
> oamdev-vm2  and #uname ne oamdev-vm3
>
> colocation colocation_NOSIpAddress10_255_239_23_NOSServiceManager0 inf:
> NOSIpAddress10_255_239_23 NOSServiceManager0
>
> colocation colocation_NOSIpAddress10_255_239_25_NOSServiceManager0 inf:
> NOSIpAddress10_255_239_25 NOSServiceManager0
>
> colocation colocation_filesystem_drbd_fs1 inf: fs1_group
> ms_drbd_resource_fs1:Master
>
> colocation colocation_filesystem_drbd_r0 inf: mysql_group
> ms_drbd_resource_r0:Master
>
> order order_NOSFileSystemCluster_after_portmapCluster inf: portmapCluster
> NOSFileSystemCluster
>
> order order_NOSServiceManager0_after_NOSFileSystemCluster inf:
> NOSFileSystemCluster NOSServiceManager0
>
> order order_NOSServiceManager0_after_p_mysql inf: p_mysql NOSServiceManager0
>
> order order_filesystem_after_drbd_fs1 inf: ms_drbd_resource_fs1:promote
> fs1_group:start
>
> order order_filesystem_after_drbd_r0 inf: ms_drbd_resource_r0:promote
> mysql_group:start
>
> order order_fs1_group_after_portmapCluster inf: portmapCluster fs1_group
>
> property $id="cib-bootstrap-options" \
>
>         dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f"
> \
>
>         cluster-infrastructure="openais" \
>
>         expected-quorum-votes="2" \
>
>         stonith-enabled="false" \
>
>         no-quorum-policy="ignore" \
>
>         default-action-timeout="240" \
>
>         start-failure-is-fatal="false"
>
> rsc_defaults $id="rsc-options" \
>
>         failure-timeout="30s" \
>
>         resource-stickiness="100"
>
> op_defaults $id="op-options" \
>
>         on-fail="restart"
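
With a configuration like the above, the actions the cluster intends to take
when a node returns can be previewed without touching running resources. A
sketch, assuming crm_simulate is available on this 1.1.x build (verify the
exact flags with crm_simulate --help):

```
# Show current placement scores against the live CIB:
crm_simulate -sL

# Simulate oamdev-vm2 coming back up and show the resulting actions:
crm_simulate -SL --node-up oamdev-vm2
```

If the simulated transition already lists Restart actions for resources on
oamdev-vm3, the constraints (rather than resource failures) are the cause.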
>
>
>
>
>
> Corosync logs from the node oamdev-vm3, where resources were restarted
> (around Sep 19 12:52:35). The sections where the restarts are seen are
> highlighted.
>
>
>
> Sep 19 12:52:30 oamdev-vm3.lab.ipaccess.com lrmd: [16766]: info: RA output:
> (NOSServiceManager0:monitor:stderr) 2012/09/19_12:52:30 INFO:   monitor:
> running
>
> Sep 19 12:52:31 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 32: memb=1, new=0, lost=0
>
> Sep 19 12:52:31 corosync [pcmk  ] info: pcmk_peer_update: memb:
> oamdev-vm3.lab.ipaccess.com 519044874
>
> Sep 19 12:52:31 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 32: memb=2, new=1, lost=0
>
> Sep 19 12:52:31 corosync [pcmk  ] info: update_member: Node
> 351272714/oamdev-vm2.lab.ipaccess.com is now: member
>
> Sep 19 12:52:31 corosync [pcmk  ] info: pcmk_peer_update: NEW:
> oamdev-vm2.lab.ipaccess.com 351272714
>
> Sep 19 12:52:31 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> oamdev-vm2.lab.ipaccess.com 351272714
>
> Sep 19 12:52:31 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> oamdev-vm3.lab.ipaccess.com 519044874
>
> Sep 19 12:52:31 corosync [pcmk  ] info: send_member_notification: Sending
> membership update 32 to 2 children
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: notice:
> ais_dispatch_message: Membership 32: quorum acquired
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> (new) addr=r(0) ip(10.255.239.20)  votes=1 born=24 seen=32
> proc=00000000000000000000000000000002
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: notice:
> ais_dispatch_message: Membership 32: quorum acquired
>
> Sep 19 12:52:31 corosync [TOTEM ] A processor joined or left the membership
> and a new membership was formed.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> ais_status_callback: status: oamdev-vm2.lab.ipaccess.com is now member (was
> lost)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> (new) addr=r(0) ip(10.255.239.20)  votes=1 born=24 seen=32
> proc=00000000000000000000000000000002
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_quorum: Updating quorum status to true (call=214)
>
> Sep 19 12:52:31 corosync [pcmk  ] info: update_member: 0x4d29c40 Node
> 351272714 (oamdev-vm2.lab.ipaccess.com) born on: 32
>
> Sep 19 12:52:31 corosync [pcmk  ] info: update_member: Node
> oamdev-vm2.lab.ipaccess.com now has process list:
> 00000000000000000000000000111312 (1118994)
>
> Sep 19 12:52:31 corosync [pcmk  ] info: send_member_notification: Sending
> membership update 32 to 2 children
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm
> (origin=local/crmd/210, version=0.35.62): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes
> (origin=local/crmd/211, version=0.35.63): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/212, version=0.35.64): ok (rc=0)
>
> Sep 19 12:52:31 corosync [MAIN  ] Completed service synchronization, ready
> to provide service.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section cib
> (origin=local/crmd/214, version=0.35.66): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> ais_dispatch_message: Membership 32: quorum retained
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> addr=r(0) ip(10.255.239.20)  votes=1 born=32 seen=32
> proc=00000000000000000000000000111312 (new)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_sync_one for section 'all'
> (origin=oamdev-vm2.lab.ipaccess.com/oamdev-vm2.lab.ipaccess.com/(null),
> version=0.35.66): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crmd_ais_dispatch: Setting expected votes to 2
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:276 - Triggered transition abort
> (complete=1, tag=lrm_rsc_op, id=p_mysql_monitor_0,
> magic=0:7;21:1:7:7a06b9e0-1d98-4a00-a287-cbc4178d65e4, cib=0.35.62) :
> Resource op removal
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:163 - Triggered transition abort
> (complete=1, tag=transient_attributes, id=oamdev-vm2.lab.ipaccess.com,
> magic=NA, cib=0.35.63) : Transient attribute: removal
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes":
> ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [
> input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 1 cluster nodes are eligible to run resources.
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 217: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 218: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> ais_dispatch_message: Membership 32: quorum retained
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: notice:
> crmd_peer_update: Status update: Client oamdev-vm2.lab.ipaccess.com/crmd now
> has status [online] (DC=true)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_peer: Node oamdev-vm2.lab.ipaccess.com: id=351272714 state=member
> addr=r(0) ip(10.255.239.20)  votes=1 born=32 seen=32
> proc=00000000000000000000000000111312 (new)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section
> crm_config (origin=local/crmd/216, version=0.35.67): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/220, version=0.35.69): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crmd_ais_dispatch: Setting expected votes to 2
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_POLICY_ENGINE -> S_INTEGRATION [
> input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=crmd_peer_update ]
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Unset DC oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> join_make_offer: Making join offers based on membership 32
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: join-5: Waiting on 2 outstanding join acks
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section
> crm_config (origin=local/crmd/223, version=0.35.71): ok (rc=0)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Set DC to oamdev-vm3.lab.ipaccess.com (3.0.5)
>
> Sep 19 12:52:31 oamdev-vm3.lab.ipaccess.com lrmd: [16766]: info: RA output:
> (p_mysql:monitor:stderr) 2012/09/19_12:52:31 INFO: MySQL monitor succeeded
>
>
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Unset DC oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: A new node joined the cluster
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_offer_all: join-6: Waiting on 2 outstanding join acks
>
> Sep 19 12:52:34 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info: update_dc:
> Set DC to oamdev-vm3.lab.ipaccess.com (3.0.5)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [
> input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 2 cluster nodes responded to the join offer.
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_finalize: join-6: Syncing the CIB from
> oamdev-vm3.lab.ipaccess.com to the rest of the cluster
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_sync for section 'all'
> (origin=local/crmd/226, version=0.35.71): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/227, version=0.35.72): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/228, version=0.35.73): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/transient_attributes
> (origin=oamdev-vm2.lab.ipaccess.com/crmd/6, version=0.35.74): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_ack: join-6: Updating node state to member for
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm
> (origin=local/crmd/229, version=0.35.75): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm2.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_ack: join-6: Updating node state to member for
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [
> input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: All 2 cluster nodes are eligible to run resources.
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_dc_join_final: Ensuring DC, quorum and node attributes are up-to-date
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> crm_update_quorum: Updating quorum status to true (call=235)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: do_te_invoke:173 - Triggered transition abort
> (complete=1) : Peer Cancelled
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 236: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_local_callback: Sending full refresh (origin=crmd)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_trigger_update: Sending flush op to all hosts for: probe_complete
> (true)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_delete for section
> //node_state[@uname='oamdev-vm3.lab.ipaccess.com']/lrm
> (origin=local/crmd/231, version=0.35.77): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:276 - Triggered transition abort
> (complete=1, tag=lrm_rsc_op, id=p_drbd_resource_r0:1_monitor_0,
> magic=0:7;19:42:7:ddd16f01-0ba8-4299-8998-1e292a1b5b4b, cib=0.35.77) :
> Resource op removal
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='oamdev-vm3.lab.ipaccess.com']/lrm": ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 237: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_update_diff: Detected LRM refresh - 17 resources updated: Skipping all
> resource events
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> abort_transition_graph: te_update_diff:236 - Triggered transition abort
> (complete=1, tag=diff, id=(null), magic=NA, cib=0.35.78) : LRM Refresh
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke: Query 238: Requesting the current CIB: S_POLICY_ENGINE
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section nodes
> (origin=local/crmd/233, version=0.35.79): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com cib: [16765]: info:
> cib_process_request: Operation complete: op cib_modify for section cib
> (origin=local/crmd/235, version=0.35.81): ok (rc=0)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_pe_invoke_callback: Invoking the PE: query=238,
> ref=pe_calc-dc-1348055555-190, seq=32, quorate=1
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> unpack_config: On loss of CCM Quorum: Ignore
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Master/Slave Set: ms_drbd_resource_r0 [p_drbd_resource_r0]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Masters: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ p_drbd_resource_r0:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> group_print:  Resource Group: mysql_group
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_filesystem_drbd_r0   (ocf::heartbeat:Filesystem):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_ip_mysql     (ocf::ipaccess:ipaddress):      Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_mysql        (ocf::heartbeat:mysql): Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Master/Slave Set: ms_drbd_resource_fs1 [p_drbd_resource_fs1]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Masters: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ p_drbd_resource_fs1:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Clone Set: portmapCluster [portmap]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ portmap:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> group_print:  Resource Group: fs1_group
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_filesystem_drbd_fs1  (ocf::heartbeat:Filesystem):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_ip_fs1       (ocf::ipaccess:ipaddress):      Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_nfsserver_fs1        (ocf::ipaccess:nfsserver):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print:      p_export_fs1   (ocf::heartbeat:exportfs):      Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Clone Set: NOSFileSystemCluster [NOSFileSystem]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ NOSFileSystem:1 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSIpAddress10_255_239_23   (ocf::ipaccess:ipaddress):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSServiceManager0  (ocf::ipaccess:glassfish):      Started
> oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Clone Set: ImqCluster [Imq]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ Imq:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> clone_print:  Clone Set: ApacheCluster [Apache]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Started: [ oamdev-vm3.lab.ipaccess.com ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> short_print:      Stopped: [ Apache:0 ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> native_print: NOSIpAddress10_255_239_25   (ocf::ipaccess:ipaddress):
> Started oamdev-vm3.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (31s) for p_drbd_resource_r0:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (31s) for p_drbd_resource_r0:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (31s) for p_drbd_resource_fs1:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (31s) for p_drbd_resource_fs1:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (120s) for portmap:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (60s) for NOSFileSystem:1 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (15s) for Imq:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> RecurringOp:  Start recurring monitor (30s) for Apache:0 on
> oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   p_drbd_resource_r0:0  (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   p_drbd_resource_r0:1  (Master
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   p_filesystem_drbd_r0  (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   p_ip_mysql    (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   p_mysql       (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   p_drbd_resource_fs1:0 (Master
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   p_drbd_resource_fs1:1 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   portmap:0     (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   portmap:1     (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_filesystem_drbd_fs1 (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_ip_fs1      (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_nfsserver_fs1       (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart p_export_fs1  (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart NOSFileSystem:0       (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   NOSFileSystem:1       (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   NOSIpAddress10_255_239_23     (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Restart NOSServiceManager0    (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   Imq:0 (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   Imq:1 (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Start   Apache:0      (oamdev-vm2.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   Apache:1      (Started oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com pengine: [16768]: notice:
> LogActions: Leave   NOSIpAddress10_255_239_25     (Started
> oamdev-vm3.lab.ipaccess.com)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com attrd: [16767]: info:
> attrd_trigger_update: Sending flush op to all hosts for:
> master-p_drbd_resource_r0:0 (<null>)
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE
> [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> unpack_graph: Unpacked transition 17: 87 actions in 87 synapses
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> do_te_invoke: Processing graph 17 (ref=pe_calc-dc-1348055555-190) derived
> from /var/lib/pengine/pe-input-5945.bz2
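
The pe-input file named in that log line is the saved policy-engine input for
the transition that scheduled the restarts, so the decision can be replayed
offline. A sketch, assuming crm_simulate from the same Pacemaker build (the
path is taken from the log above; check crm_simulate --help for flags):

```
# Replay the saved transition input, showing scores and planned actions:
crm_simulate -Ss -x /var/lib/pengine/pe-input-5945.bz2
```

The score output should show which constraint forces the Restart actions.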
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_rsc_command: Initiating action 18: monitor p_drbd_resource_r0:0_monitor_0
> on oamdev-vm2.lab.ipaccess.com
>
> Sep 19 12:52:35 oamdev-vm3.lab.ipaccess.com crmd: [16769]: info:
> te_pseudo_action: Pseudo action 43 fired and confirmed
>
>
>
>
>
> Any help, hints, or suggestions are appreciated.
>
>
>
> -Poonam.
>
>
>
>
>
>
> ip.access ltd, registration number 3400157, Building 2020,
> Cambourne Business Park, Cambourne, Cambridge CB23 6DW, United Kingdom
>
>



