[Pacemaker] FS mount error

Proskurin Kirill proskurin-kv at fxclub.org
Thu Jul 22 03:29:47 EDT 2010


Hello all.

I am really new to Pacemaker and am running some tests to learn how it 
all works. I am using the Clusters From Scratch PDF from clusterlabs.org as a how-to.

What we have:
Debian Lenny 5.0.5 (with kernel 2.6.32-bpo.4-amd64 from backports)
pacemaker 1.0.8+hg15494-4~bpo50+1
openais 1.1.2-2~bpo50+1


Problem:
I tried to add a filesystem mount resource, but it fails with an unknown 
error. If I mount the filesystem by hand, everything works.

crm_mon:

============
Last updated: Thu Jul 22 08:22:20 2010
Stack: openais
Current DC: node01.domain.org - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ node02.domain.org node01.domain.org ]

ClusterIP       (ocf::heartbeat:IPaddr2):       Started node02.domain.org
  Master/Slave Set: WebData
      Masters: [ node02.domain.org ]
      Slaves: [ node01.domain.org ]
WebFS   (ocf::heartbeat:Filesystem):    Started node02.domain.org FAILED

Failed actions:
     WebFS_start_0 (node=node01.domain.org, call=18, rc=1, status=complete): unknown error
     WebFS_start_0 (node=node02.domain.org, call=301, rc=1, status=complete): unknown error
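
For reference, the failed actions above can be pulled out of saved crm_mon output with a short script. This is only a sketch: the function name parse_failed_actions and the regular expression are my own assumptions based on the line format shown above, not anything shipped with Pacemaker. In the OCF return code convention, rc=1 is the generic error code, which is why it is reported only as "unknown error".

```python
import re

# Assumed entry shape, copied from the crm_mon output above:
#   WebFS_start_0 (node=node01.domain.org, call=18, rc=1, status=complete): unknown error
FAILED_OP = re.compile(
    r"(?P<op>\S+)\s+\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*"
    r"rc=(?P<rc>\d+),\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.+)"
)


def parse_failed_actions(text):
    """Return one dict per failed action found in crm_mon-style text."""
    # Mail clients tend to wrap each entry after a comma; re-join those
    # continuation lines so every entry sits on a single line again.
    joined = re.sub(r",\s*\n\s*", ", ", text)
    actions = []
    for match in FAILED_OP.finditer(joined):
        entry = match.groupdict()
        entry["call"] = int(entry["call"])
        entry["rc"] = int(entry["rc"])
        entry["reason"] = entry["reason"].strip()
        actions.append(entry)
    return actions
```

Feeding it the "Failed actions" block should give two entries, one per node, both for WebFS_start_0 with rc 1.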

node01:~# crm_verify -VL
crm_verify[1482]: 2010/07/22_08:28:13 WARN: unpack_rsc_op: Processing 
failed op WebFS_start_0 on node01.domain.org: unknown error (1)
crm_verify[1482]: 2010/07/22_08:28:13 WARN: unpack_rsc_op: Processing 
failed op WebFS_start_0 on node02.domain.org: unknown error (1)
crm_verify[1482]: 2010/07/22_08:28:13 WARN: common_apply_stickiness: 
Forcing WebFS away from node01.domain.org after 1000000 failures 
(max=1000000)


node01:~# crm configure show
node node01.domain.org
node node02.domain.org
primitive ClusterIP ocf:heartbeat:IPaddr2 \
	params ip="192.168.1.100" cidr_netmask="32" \
	op monitor interval="30s"
primitive WebFS ocf:heartbeat:Filesystem \
	params device="/dev/drbd0" directory="/var/spool/dovecot" fstype="ext4" \
	op start interval="0" timeout="60s" \
	op stop interval="0" timeout="60s" \
	meta target-role="Started"
primitive WebSite ocf:heartbeat:apache \
	params configfile="/etc/apache2/apache2.conf" \
	op monitor interval="1min" \
	op start interval="0" timeout="40s" \
	op stop interval="0" timeout="60s" \
	meta target-role="Started"
primitive wwwdrbd ocf:linbit:drbd \
	params drbd_resource="drbd0" \
	op monitor interval="60s" \
	op start interval="0" timeout="240s" \
	op stop interval="0" timeout="100s"
ms WebData wwwdrbd \
	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebData:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebData:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
	dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
	cluster-infrastructure="openais" \
	expected-quorum-votes="2" \
	stonith-enabled="false" \
	last-lrm-refresh="1279717510"
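
One way to get more detail than "unknown error" might be to run the Filesystem resource agent by hand with the same parameters as the WebFS primitive above. The sketch below only shows how that could be set up. OCF resource agents read their parameters from environment variables named OCF_RESKEY_<parameter>. The OCF_ROOT value and the agent path are assumptions about the usual install location and may differ on a given system, and the helper names ocf_env and run_agent are hypothetical.

```python
import os
import subprocess

# Assumption: the usual OCF root directory; adjust if the agents live elsewhere.
OCF_ROOT = "/usr/lib/ocf"
# Assumption: location of the ocf:heartbeat:Filesystem agent under OCF_ROOT.
AGENT = os.path.join(OCF_ROOT, "resource.d", "heartbeat", "Filesystem")


def ocf_env(params, base=None):
    """Build the environment a resource agent expects.

    Each resource parameter is exported as OCF_RESKEY_<name>.
    """
    env = dict(os.environ if base is None else base)
    env["OCF_ROOT"] = OCF_ROOT
    for name, value in params.items():
        env["OCF_RESKEY_" + name] = value
    return env


def run_agent(action, params):
    """Run the agent with the given action; return (exit code, combined output)."""
    proc = subprocess.run(
        [AGENT, action],
        env=ocf_env(params),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr
```

For example, run_agent("start", {"device": "/dev/drbd0", "directory": "/var/spool/dovecot", "fstype": "ext4"}), called as root on the node where DRBD is currently master and with WebFS stopped in the cluster, should return the agent's exit code together with whatever it printed.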


In logs:
Jul 22 08:18:39 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:39 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:39 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:39 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:40 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:40 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:40 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:40 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:41 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:41 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:41 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:41 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:42 node01 cibadmin: [1199]: info: Invoked: cibadmin -Ql -o 
resources
Jul 22 08:18:42 node01 cibadmin: [1200]: info: Invoked: cibadmin -p -R 
-o resources
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
<cib admin_epoch="0" epoch="143" num_updates="2" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
   <configuration >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
     <resources >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
       <primitive id="WebFS" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
         <meta_attributes id="WebFS-meta_attributes" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
           <nvpair value="Stopped" id="WebFS-meta_attributes-target-role" />
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
         </meta_attributes>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
       </primitive>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
     </resources>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
   </configuration>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: - 
</cib>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
<cib admin_epoch="0" epoch="144" num_updates="1" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
   <configuration >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
     <resources >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
       <primitive id="WebFS" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
         <meta_attributes id="WebFS-meta_attributes" >
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
           <nvpair value="Started" id="WebFS-meta_attributes-target-role" />
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
         </meta_attributes>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
       </primitive>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
     </resources>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
   </configuration>
Jul 22 08:18:42 node01 cib: [1810]: info: log_data_element: cib:diff: + 
</cib>
Jul 22 08:18:42 node01 cib: [1810]: info: cib_process_request: Operation 
complete: op cib_replace for section resources (origin=local/cibadmin/2, 
version=0.144.1): ok (rc=0)
Jul 22 08:18:42 node01 cib: [1201]: info: write_cib_contents: Archived 
previous version as /var/lib/heartbeat/crm/cib-89.raw
Jul 22 08:18:42 node01 cib: [1201]: info: write_cib_contents: Wrote 
version 0.144.0 of the CIB to disk (digest: 
5f51a15c21330c7ff76862ad9a5193b1)
Jul 22 08:18:42 node01 cib: [1201]: info: retrieveCib: Reading cluster 
configuration from: /var/lib/heartbeat/crm/cib.woPqNQ (digest: 
/var/lib/heartbeat/crm/cib.bF43Zi)
Jul 22 08:18:42 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:42 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:42 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:42 node01 crmd: [1814]: info: abort_transition_graph: 
need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Jul 22 08:18:42 node01 crmd: [1814]: info: need_abort: Aborting on 
change to admin_epoch
Jul 22 08:18:42 node01 crmd: [1814]: info: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Jul 22 08:18:42 node01 crmd: [1814]: info: do_state_transition: All 2 
cluster nodes are eligible to run resources.
Jul 22 08:18:42 node01 crmd: [1814]: info: do_pe_invoke: Query 350: 
Requesting the current CIB: S_POLICY_ENGINE
Jul 22 08:18:42 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:43 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:43 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:43 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:43 node01 crmd: [1814]: info: do_pe_invoke_callback: 
Invoking the PE: query=350, ref=pe_calc-dc-1279783123-729, seq=152, 
quorate=1
Jul 22 08:18:43 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:43 node01 pengine: [1813]: info: unpack_config: Node 
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Jul 22 08:18:43 node01 pengine: [1813]: info: determine_online_status: 
Node node01.domain.org is online
Jul 22 08:18:43 node01 pengine: [1813]: notice: unpack_rsc_op: Operation 
WebSite_monitor_0 found resource WebSite active on node01.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: unpack_rsc_op: Processing 
failed op WebFS_start_0 on node01.domain.org: unknown error (1)
Jul 22 08:18:43 node01 pengine: [1813]: info: determine_online_status: 
Node node02.domain.org is online
Jul 22 08:18:43 node01 pengine: [1813]: notice: unpack_rsc_op: Operation 
WebSite_monitor_0 found resource WebSite active on node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: unpack_rsc_op: Processing 
failed op WebFS_start_0 on node02.domain.org: unknown error (1)
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print: 
ClusterIP#011(ocf::heartbeat:IPaddr2):#011Started node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print: 
WebSite#011(ocf::heartbeat:apache):#011Stopped
Jul 22 08:18:43 node01 pengine: [1813]: notice: clone_print: 
Master/Slave Set: WebData
Jul 22 08:18:43 node01 pengine: [1813]: notice: short_print: 
Masters: [ node02.domain.org ]
Jul 22 08:18:43 node01 pengine: [1813]: notice: short_print: 
Slaves: [ node01.domain.org ]
Jul 22 08:18:43 node01 pengine: [1813]: notice: native_print: 
WebFS#011(ocf::heartbeat:Filesystem):#011Stopped
Jul 22 08:18:43 node01 pengine: [1813]: info: get_failcount: WebFS has 
failed 1000000 times on node01.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: WARN: common_apply_stickiness: 
Forcing WebFS away from node01.domain.org after 1000000 failures 
(max=1000000)
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights: 
WebData: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights: 
wwwdrbd:0: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: native_merge_weights: 
WebData: Rolling back scores from WebFS
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: Promoting 
wwwdrbd:0 (Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: WebData: 
Promoted 1 instances of a possible 1 to master
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: Promoting 
wwwdrbd:0 (Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: master_color: WebData: 
Promoted 1 instances of a possible 1 to master
Jul 22 08:18:43 node01 pengine: [1813]: notice: RecurringOp:  Start 
recurring monitor (60s) for WebSite on node02.domain.org
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave 
resource ClusterIP#011(Started node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Start 
WebSite#011(node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave 
resource wwwdrbd:0#011(Master node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Leave 
resource wwwdrbd:1#011(Slave node01.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: notice: LogActions: Start 
WebFS#011(node02.domain.org)
Jul 22 08:18:43 node01 pengine: [1813]: info: process_pe_message: 
Transition 199: PEngine Input stored in: /var/lib/pengine/pe-input-243.bz2
Jul 22 08:18:44 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:44 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:44 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:44 node01 crmd: [1814]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Jul 22 08:18:44 node01 crmd: [1814]: info: unpack_graph: Unpacked 
transition 199: 4 actions in 4 synapses
Jul 22 08:18:44 node01 crmd: [1814]: info: do_te_invoke: Processing 
graph 199 (ref=pe_calc-dc-1279783123-729) derived from 
/var/lib/pengine/pe-input-243.bz2
Jul 22 08:18:44 node01 crmd: [1814]: info: te_rsc_command: Initiating 
action 42: start WebFS_start_0 on node02.domain.org
Jul 22 08:18:44 node01 crmd: [1814]: info: te_rsc_command: Initiating 
action 5: probe_complete probe_complete on node02.domain.org - no waiting
Jul 22 08:18:44 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:45 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:45 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:45 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:45 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:46 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:46 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:46 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:46 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jul 22 08:18:47 node01 crmd: [1814]: ERROR: stonithd_signon: Can't 
initiate connection to stonithd
Jul 22 08:18:47 node01 crmd: [1814]: notice: Not currently connected.
Jul 22 08:18:47 node01 crmd: [1814]: ERROR: te_connect_stonith: Sign-in 
failed: triggered a retry
Jul 22 08:18:47 node01 crmd: [1814]: info: te_connect_stonith: 
Attempting connection to fencing daemon...

-- 
Best regards,
Proskurin Kirill



