[ClusterLabs] epic fail
Ken Gaillot
kgaillot at redhat.com
Mon Jul 24 18:14:50 EDT 2017
On Mon, 2017-07-24 at 12:24 -0500, Dimitri Maziuk wrote:
> OK, how about this:
>
> > Jul 22 14:03:41 zebrafish nfsserver(server_nfs)[6398]: INFO: Status: rpcbind
> > Jul 22 14:03:41 zebrafish nfsserver(server_nfs)[6398]: INFO: Status: nfs-mountd
> > Jul 22 14:03:41 zebrafish nfsserver(server_nfs)[6398]: INFO: Status: nfs-idmapd
> > Jul 22 14:03:41 zebrafish nfsserver(server_nfs)[6398]: INFO: Status: rpc-statd
This reminds me of a situation where systemd was keeping some of the NFS
daemons alive. I forget the details, but I think it was something like
this: Pacemaker manages the main NFS daemon, while systemd dependencies
handle the other daemons, and at shutdown systemd can try to stop
Pacemaker before those daemons are stopped. There was a bugfix for it in
1.1.15. If you were upgrading from a CentOS 7 release that had a version
older than 1.1.15, you could run into it. Subsequent upgrades would be
fine due to the fix ...
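
If you want to check whether something like that is happening on your
node, a rough Python sketch along these lines could be run right after
server_nfs reports that it has stopped. The unit names are an assumption
taken from the nfsserver agent's status/stop lines in your log and may
not match the systemd unit names on your system; is_active() and
still_active() are just illustrative helpers, not part of Pacemaker or
the resource agents.

```python
import subprocess

# Assumption: these names come from the nfsserver agent's log lines,
# and are treated here as systemd unit names. Adjust for your system.
NFS_UNITS = ["rpcbind", "nfs-mountd", "nfs-idmapd", "rpc-statd", "rpc-gssd"]


def is_active(unit):
    """Return True if `systemctl is-active` reports the unit as active."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit],
        check=False,  # a non-zero exit just means "not active"
    )
    return result.returncode == 0


def still_active(units, probe=is_active):
    """Return the units that the probe reports as still running."""
    return [unit for unit in units if probe(unit)]
```

Calling still_active(NFS_UNITS) on the node that is being stopped would
list any of those units that systemd still considers active. An empty
list would suggest this particular issue is not what you are hitting.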
> ...
>
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Transition aborted by status-1-master-drbd_storage doing create master-drbd_storage=1000: Transient attribute change
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage_post_notify_start_0 locally on zebrafish
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage:1_post_notify_start_0 on lionfish
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Result of notify operation for drbd_storage on zebrafish: 0 (ok)
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Transition 1 (Complete=21, Pending=0, Fired=0, Skipped=1, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-255.bz2): Stopped
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: On loss of CCM Quorum: Ignore
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Demote drbd_storage:0#011(Master -> Slave zebrafish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Promote drbd_storage:1#011(Slave -> Master lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move drbd_filesystem#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move symlink_home#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move floating_ip#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move server_nfs#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move nfsshare_home#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move nfs_notify#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move symlink_etc_pki#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move symlink_etc_dovecot#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move symlink_var_dovecot#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Move server_dovecot#011(Started zebrafish -> lionfish)
> > Jul 22 14:03:45 zebrafish pengine[1077]: notice: Calculated transition 2, saving inputs in /var/lib/pacemaker/pengine/pe-input-256.bz2
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating cancel operation drbd_storage_monitor_29000 locally on zebrafish
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating stop operation nfs_notify_stop_0 locally on zebrafish
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating stop operation server_dovecot_stop_0 locally on zebrafish
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage_pre_notify_demote_0 locally on zebrafish
> > Jul 22 14:03:45 zebrafish systemd: Reloading.
> > Jul 22 14:03:45 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage_pre_notify_demote_0 on lionfish
> > Jul 22 14:03:46 zebrafish nfsnotify(nfs_notify)[6511]: INFO: previous sm-notify processes terminated before stop action.
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Result of stop operation for nfs_notify on zebrafish: 0 (ok)
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Initiating stop operation nfsshare_home_stop_0 locally on zebrafish
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Result of notify operation for drbd_storage on zebrafish: 0 (ok)
> > Jul 22 14:03:46 zebrafish systemd: Stopping Dovecot IMAP/POP3 email server...
> > Jul 22 14:03:46 zebrafish systemd: Stopped Dovecot IMAP/POP3 email server.
> > Jul 22 14:03:46 zebrafish exportfs(nfsshare_home)[6558]: INFO: Un-exporting file system ...
> > Jul 22 14:03:46 zebrafish kernel: drbd raid: Handshake successful: Agreed network protocol version 101
> > Jul 22 14:03:46 zebrafish kernel: drbd raid: Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
> > Jul 22 14:03:46 zebrafish kernel: drbd raid: conn( WFConnection -> WFReportParams )
> > Jul 22 14:03:46 zebrafish kernel: drbd raid: Starting ack_recv thread (from drbd_r_raid [1626])
> > Jul 22 14:03:46 zebrafish exportfs(nfsshare_home)[6558]: INFO: unexporting 144.92.167.128/25:/raid/home
> > Jul 22 14:03:46 zebrafish exportfs(nfsshare_home)[6558]: INFO: Un-exported file system
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Result of stop operation for nfsshare_home on zebrafish: 0 (ok)
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: drbd_sync_handshake:
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: self F5C467DDAA8CAD23:BEA8E935E70AC8E8:30542C9D8D243C3C:30532C9D8D243C3D bits:9327 flags:0
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: peer BEA8E935E70AC8E8:0000000000000000:30542C9D8D243C3D:30532C9D8D243C3D bits:1266688 flags:2
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: uuid_compare()=1 by rule 70
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Initiating stop operation symlink_home_stop_0 locally on zebrafish
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Initiating stop operation server_nfs_stop_0 locally on zebrafish
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
> > Jul 22 14:03:46 zebrafish symlink(symlink_home)[6613]: INFO: removed '/home'
> > Jul 22 14:03:46 zebrafish crmd[1078]: notice: Result of stop operation for symlink_home on zebrafish: 0 (ok)
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 1458(1), total 1458; compression: 100.0%
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 4700(2), total 4700; compression: 100.0%
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: Began resync as SyncSource (will sync 5084788 KB [1271197 bits set]).
> > Jul 22 14:03:46 zebrafish kernel: block drbd0: updated sync UUID F5C467DDAA8CAD23:BEA9E935E70AC8E8:BEA8E935E70AC8E8:30542C9D8D243C3C
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stopping NFS server ...
> > Jul 22 14:03:46 zebrafish systemd: Stopping NFS server and services...
> > Jul 22 14:03:46 zebrafish systemd: Stopped NFS server and services.
> > Jul 22 14:03:46 zebrafish systemd: Stopping NFS Mount Daemon...
> > Jul 22 14:03:46 zebrafish systemd: Stopping NFSv4 ID-name mapping service...
> > Jul 22 14:03:46 zebrafish rpc.mountd[2655]: Caught signal 15, un-registering and exiting.
> > Jul 22 14:03:46 zebrafish systemd: Stopped NFSv4 ID-name mapping service.
> > Jul 22 14:03:46 zebrafish systemd: Stopped NFS Mount Daemon.
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: threads
> > Jul 22 14:03:46 zebrafish kernel: nfsd: last server has exited, flushing export cache
> > Jul 22 14:03:46 zebrafish systemd: Stopping NFS status monitor for NFSv2/3 locking....
> > Jul 22 14:03:46 zebrafish systemd: Stopped NFS status monitor for NFSv2/3 locking..
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: rpc-statd
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: nfs-idmapd
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: nfs-mountd
> > Jul 22 14:03:46 zebrafish systemd: Stopping RPC bind service...
> > Jul 22 14:03:46 zebrafish systemd: Stopped RPC bind service.
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: rpcbind
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: rpc-gssd
> > Jul 22 14:03:46 zebrafish nfsserver(server_nfs)[6614]: INFO: Stop: umount (1/10 attempts)
> > Jul 22 14:03:47 zebrafish nfsserver(server_nfs)[6614]: INFO: NFS server stopped
> > Jul 22 14:03:47 zebrafish crmd[1078]: notice: Result of stop operation for server_nfs on zebrafish: 0 (ok)
> > Jul 22 14:03:47 zebrafish crmd[1078]: notice: Initiating stop operation floating_ip_stop_0 locally on zebrafish
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Result of stop operation for server_dovecot on zebrafish: 0 (ok)
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Initiating stop operation symlink_etc_pki_stop_0 locally on zebrafish
> > Jul 22 14:03:48 zebrafish IPaddr2(floating_ip)[6769]: INFO: IP status = ok, IP_CIP=
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Initiating stop operation symlink_var_dovecot_stop_0 locally on zebrafish
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Result of stop operation for floating_ip on zebrafish: 0 (ok)
> > Jul 22 14:03:48 zebrafish symlink(symlink_etc_pki)[6821]: INFO: removed '/etc/pki'
> > Jul 22 14:03:48 zebrafish symlink(symlink_var_dovecot)[6822]: INFO: removed '/var/spool/dovecot'
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Result of stop operation for symlink_var_dovecot on zebrafish: 0 (ok)
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Initiating stop operation symlink_etc_dovecot_stop_0 locally on zebrafish
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Result of stop operation for symlink_etc_pki on zebrafish: 0 (ok)
> > Jul 22 14:03:48 zebrafish symlink(symlink_etc_dovecot)[6863]: INFO: removed '/etc/dovecot'
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Result of stop operation for symlink_etc_dovecot on zebrafish: 0 (ok)
> > Jul 22 14:03:48 zebrafish crmd[1078]: notice: Initiating stop operation drbd_filesystem_stop_0 locally on zebrafish
> > Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: Running stop for /dev/drbd0 on /raid
> > Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: Trying to unmount /raid
> > Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> > Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:49 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> > Jul 22 14:03:49 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:50 zebrafish ntpd[596]: Deleting interface #8 enp2s0f0, 144.92.167.221#123, interface stats: received=0, sent=0, dropped=0, active_time=260 secs
> > Jul 22 14:03:50 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> > Jul 22 14:03:50 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:51 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> > Jul 22 14:03:51 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:52 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> > Jul 22 14:03:53 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:54 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> > Jul 22 14:03:54 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> > Jul 22 14:03:55 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid, giving up!
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ (In some cases useful info about processes that use ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ the device is found by lsof(8) or fuser(1)) ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> > Jul 22 14:03:55 zebrafish lrmd[1075]: notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid, giving up! ]
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Result of stop operation for drbd_filesystem on zebrafish: 1 (unknown error)
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: zebrafish-drbd_filesystem_stop_0:101 [ umount: /raid: target is busy.\n (In some cases useful info about processes that use\n the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n (In some cases useful info about processes that use\n the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n
> > Jul 22 14:03:55 zebrafish crmd[1078]: warning: Action 45 (drbd_filesystem_stop_0) on zebrafish failed (target: 0 vs. rc: 1): Error
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Transition aborted by operation drbd_filesystem_stop_0 'modify' on zebrafish: Event failed
> > Jul 22 14:03:55 zebrafish crmd[1078]: warning: Action 45 (drbd_filesystem_stop_0) on zebrafish failed (target: 0 vs. rc: 1): Error
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Transition 2 (Complete=21, Pending=0, Fired=0, Skipped=0, Incomplete=43, Source=/var/lib/pacemaker/pengine/pe-input-256.bz2): Complete
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: On loss of CCM Quorum: Ignore
> > Jul 22 14:03:55 zebrafish pengine[1077]: warning: Processing failed op stop for drbd_filesystem on zebrafish: unknown error (1)
> > Jul 22 14:03:55 zebrafish pengine[1077]: warning: Processing failed op stop for drbd_filesystem on zebrafish: unknown error (1)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Demote drbd_storage:0#011(Master -> Slave zebrafish)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Promote drbd_storage:1#011(Slave -> Master lionfish)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start floating_ip#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start server_nfs#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start nfsshare_home#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start nfs_notify#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_etc_pki#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_etc_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_var_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start server_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Calculated transition 3, saving inputs in /var/lib/pacemaker/pengine/pe-input-257.bz2
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: On loss of CCM Quorum: Ignore
> > Jul 22 14:03:55 zebrafish pengine[1077]: warning: Processing failed op stop for drbd_filesystem on zebrafish: unknown error (1)
> > Jul 22 14:03:55 zebrafish pengine[1077]: warning: Processing failed op stop for drbd_filesystem on zebrafish: unknown error (1)
> > Jul 22 14:03:55 zebrafish pengine[1077]: warning: Forcing drbd_filesystem away from zebrafish after 1000000 failures (max=1000000)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Demote drbd_storage:0#011(Master -> Slave zebrafish)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Promote drbd_storage:1#011(Slave -> Master lionfish)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start floating_ip#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start server_nfs#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start nfsshare_home#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start nfs_notify#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_etc_pki#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_etc_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start symlink_var_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Start server_dovecot#011(lionfish - blocked)
> > Jul 22 14:03:55 zebrafish pengine[1077]: notice: Calculated transition 4, saving inputs in /var/lib/pacemaker/pengine/pe-input-258.bz2
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Initiating demote operation drbd_storage_demote_0 locally on zebrafish
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage_pre_notify_demote_0 locally on zebrafish
> > Jul 22 14:03:55 zebrafish crmd[1078]: notice: Initiating notify operation drbd_storage_pre_notify_demote_0 on lionfish
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: State change failed: Device is held open by someone
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: state = { cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent r----- }
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: wanted = { cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent r----- }
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Called drbdadm -c /etc/drbd.conf secondary raid
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Exit code 11
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Command output:
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: State change failed: Device is held open by someone
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: state = { cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent r----- }
> > Jul 22 14:03:56 zebrafish kernel: block drbd0: wanted = { cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent r----- }
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Called drbdadm -c /etc/drbd.conf secondary raid
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Exit code 11
> > Jul 22 14:03:56 zebrafish drbd(drbd_storage)[7093]: ERROR: raid: Command output:
> ...
>
> this last bit: "kernel: block drbd0" x3 and "drbd(drbd_storage)" x3 goes
> on until the power-cycle.
--
Ken Gaillot <kgaillot at redhat.com>