<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Arial; font-size: 12pt; color: #000000'>I will see about that error and report back.<br>However, I believe that the scsi_hostadapter error has been there all along without causing a problem.<br><br>Thanks,<br><span><br>Jake Smith<br></span><br><hr id="zwchr"><b>From: </b>"Andrew Beekhof" &lt;andrew@beekhof.net&gt;<br><b>To: </b>"The Pacemaker cluster resource manager" &lt;pacemaker@oss.clusterlabs.org&gt;<br><b>Cc: </b>"Jake Smith" &lt;jsmith@argotec.com&gt;<br><b>Sent: </b>Wednesday, March 9, 2011 3:54:27 AM<br><b>Subject: </b>Re: [Pacemaker] OCFS2 fails to mount file system on node reboot sometimes<br><br>On Tue, Feb 22, 2011 at 7:56 PM, Jake Smith &lt;jsmith@argotec.com&gt; wrote:<br>> I get the following error after reboot sometimes when mounting the ocfs2<br>> file system. If I manually stop and restart corosync, it mounts fine, but if<br>> I just try to run cleanup or crm resource start, it fails. I don't<br>> understand how I am getting "no local IP address" set when both the bonded<br>> links for DRBD sync and the bonded links for the network are up.<br><br>I'd suggest starting with why scsi_hostadapter is no longer loaded -<br>since that appears to be the first error.<br><br>><br>><br>><br>> corosync.log:<br>><br>> Feb 22 13:12:12 Condor crmd: [1246]: info: do_lrm_rsc_op: Performing<br>> key=66:4:0:927e853c-e0ee-4f67-a9e7-7cbda27cd316 op=resFS:1_start_0 )<br>><br>> Feb 22 13:12:12 Condor lrmd: [1242]: info: rsc:resFS:1:26: start<br>><br>> Feb 22 13:12:12 Condor lrmd: [1242]: info: RA output: (resFS:1:start:stderr)<br>> FATAL: Module scsi_hostadapter not found.<br>><br>> Feb 22 13:12:12 Condor lrmd: [1242]: info: RA output: (resFS:1:start:stderr)<br>> mount.ocfs2: Transport endpoint is not connected<br>><br>> Feb 22 13:12:12 Condor lrmd: [1242]: info: RA output: (resFS:1:start:stderr)<br>> while mounting /dev/drbd0 on /srv. 
Check 'dmesg' for more information on<br>> this error.<br>><br>> Feb 22 13:12:12 Condor crmd: [1246]: info: process_lrm_event: LRM operation<br>> resFS:1_start_0 (call=26, rc=1, cib-update=33, confirmed=true) unknown error<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: find_hash_entry: Creating hash<br>> entry for fail-count-resFS:1<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: attrd_trigger_update: Sending<br>> flush op to all hosts for: fail-count-resFS:1 (INFINITY)<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: attrd_perform_update: Sent<br>> update 21: fail-count-resFS:1=INFINITY<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: find_hash_entry: Creating hash<br>> entry for last-failure-resFS:1<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: attrd_trigger_update: Sending<br>> flush op to all hosts for: last-failure-resFS:1 (1298398314)<br>><br>> Feb 22 13:12:12 Condor attrd: [1243]: info: attrd_perform_update: Sent<br>> update 24: last-failure-resFS:1=1298398314<br>><br>> Feb 22 13:12:12 Condor crmd: [1246]: info: do_lrm_rsc_op: Performing<br>> key=5:5:0:927e853c-e0ee-4f67-a9e7-7cbda27cd316 op=resFS:1_stop_0 )<br>><br>> Feb 22 13:12:12 Condor lrmd: [1242]: info: rsc:resFS:1:27: stop<br>><br>><br>><br>> dmesg:<br>><br>> [ 23.896124] DLM (built Jan 11 2011 00:00:14) installed<br>><br>> [ 23.917418] block drbd0: role( Secondary -> Primary )<br>><br>> [ 24.118912] bond1: no IPv6 routers present<br>><br>> [ 25.117097] ocfs2: Registered cluster interface user<br>><br>> [ 25.144884] OCFS2 Node Manager 1.5.0<br>><br>> [ 25.166762] OCFS2 1.5.0<br>><br>> [ 27.085394] bond0: no IPv6 routers present<br>><br>> [ 27.305886] dlm: no local IP address has been set<br>><br>> [ 27.306168] dlm: cannot start dlm lowcomms -107<br>><br>> [ 27.306589] (2370,0):ocfs2_dlm_init:2963 ERROR: status = -107<br>><br>> [ 27.306959] (2370,0):ocfs2_mount_volume:1792 ERROR: status = -107<br>><br>> [ 27.307289] ocfs2: Unmounting device (147,0) on (node 
0)<br>><br>><br>><br>> crm_config:<br>><br>> node Condor \<br>><br>> attributes standby="off"<br>><br>> node Vulture \<br>><br>> attributes standby="off"<br>><br>> primitive resDLM ocf:pacemaker:controld \<br>><br>> op monitor interval="120s"<br>><br>> primitive resDRBD ocf:linbit:drbd \<br>><br>> params drbd_resource="srv" \<br>><br>> operations $id="resDRBD-operations" \<br>><br>> op monitor interval="20" role="Master" timeout="20" \<br>><br>> op monitor interval="30" role="Slave" timeout="20"<br>><br>> primitive resFS ocf:heartbeat:Filesystem \<br>><br>> params device="/dev/drbd/by-res/srv" directory="/srv" fstype="ocfs2"<br>> \<br>><br>> op monitor interval="120s"<br>><br>> primitive resIDRAC-CONDOR stonith:ipmilan \<br>><br>> params hostname="Condor" ipaddr="192.168.2.61" port="623" auth="md5"<br>> priv="admin" login="xxxx" password="xxxx" \<br>><br>> meta target-role="Started"<br>><br>> primitive resIDRAC-VULTURE stonith:ipmilan \<br>><br>> params hostname="Vulture" ipaddr="192.168.2.62" port="623"<br>> auth="md5" priv="admin" login="xxxx" password="xxxx" \<br>><br>> meta target-role="Started"<br>><br>> primitive resO2CB ocf:pacemaker:o2cb \<br>><br>> op monitor interval="120s"<br>><br>> primitive resSAMBAVIP ocf:heartbeat:IPaddr2 \<br>><br>> params ip="192.168.2.200" cidr_netmask="32" nic="bond0"<br>> clusterip_hash="sourceip" \<br>><br>> op monitor interval="30s" \<br>><br>> meta resource-stickiness="0"<br>><br>> ms msDRBD resDRBD \<br>><br>> meta resource-stickiness="100" notify="true" master-max="2"<br>> clone-max="2" clone-node-max="1" interleave="true" target-role="Started"<br>><br>> clone cloneDLM resDLM \<br>><br>> meta globally-unique="false" interleave="true" target-role="Started"<br>><br>> clone cloneFS resFS \<br>><br>> meta interleave="true" ordered="true" target-role="Started"<br>><br>> clone cloneO2CB resO2CB \<br>><br>> meta globally-unique="false" interleave="true" target-role="Started"<br>><br>> clone cloneSAMBAVIP resSAMBAVIP \<br>><br>> 
meta globally-unique="true" clone-max="2" clone-node-max="2"<br>> target-role="Started"<br>><br>> location locIDRAC-CONDOR resIDRAC-CONDOR -inf: Condor<br>><br>> location locIDRAC-VULTURE resIDRAC-VULTURE -inf: Vulture<br>><br>> colocation colDLMDRBD inf: cloneDLM msDRBD:Master<br>><br>> colocation colFSO2CB inf: cloneFS cloneO2CB<br>><br>> colocation colFSSAMBAVIP inf: cloneFS cloneSAMBAVIP<br>><br>> colocation colO2CBDLM inf: cloneO2CB cloneDLM<br>><br>> order ordDLMO2CB 0: cloneDLM cloneO2CB<br>><br>> order ordDRBDDLM 0: msDRBD:promote cloneDLM<br>><br>> order ordFSSAMBAVIP 0: cloneFS cloneSAMBAVIP<br>><br>> order ordO2CBFS 0: cloneO2CB cloneFS<br>><br>> property $id="cib-bootstrap-options" \<br>><br>> dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \<br>><br>> cluster-infrastructure="openais" \<br>><br>> expected-quorum-votes="2" \<br>><br>> stonith-enabled="true" \<br>><br>> no-quorum-policy="ignore" \<br>><br>> last-lrm-refresh="1298398491"<br>><br>> rsc_defaults $id="rsc-options" \<br>><br>> resource-stickiness="100"<br>><br>><br>><br>> Thanks!<br>><br>><br>><br>> Jake Smith<br>><br>> _______________________________________________<br>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org<br>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker<br>><br>> Project Home: http://www.clusterlabs.org<br>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf<br>> Bugs:<br>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker<br>><br>><br><br></div></body></html>