[ClusterLabs] epic fail

Dimitri Maziuk dmaziuk at bmrb.wisc.edu
Mon Jul 24 16:01:26 UTC 2017


On 07/24/2017 10:38 AM, Ken Gaillot wrote:

> A restart shouldn't lead to fencing in any case where something's not
> going seriously wrong. I'm not familiar with the "kernel is using it"
> message, I haven't run into that before.

I posted it at least once before.

> 
> Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: Running stop for /dev/drbd0 on /raid
> Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: Trying to unmount /raid
> Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Jul 22 14:03:48 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:49 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Jul 22 14:03:49 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:50 zebrafish ntpd[596]: Deleting interface #8 enp2s0f0, 144.92.167.221#123, interface stats: received=0, sent=0, dropped=0, active_time=260 secs
> Jul 22 14:03:50 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with TERM
> Jul 22 14:03:50 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:51 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Jul 22 14:03:51 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:52 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Jul 22 14:03:53 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:54 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid; trying cleanup with KILL
> Jul 22 14:03:54 zebrafish Filesystem(drbd_filesystem)[6886]: INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
> Jul 22 14:03:55 zebrafish Filesystem(drbd_filesystem)[6886]: ERROR: Couldn't unmount /raid, giving up!
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ umount: /raid: target is busy. ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [         (In some cases useful info about processes that use ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [          the device is found by lsof(8) or fuser(1)) ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid; trying cleanup with KILL ]
> Jul 22 14:03:55 zebrafish lrmd[1075]:  notice: drbd_filesystem_stop_0:6886:stderr [ ocf-exit-reason:Couldn't unmount /raid, giving up! ]
> Jul 22 14:03:55 zebrafish crmd[1078]:  notice: Result of stop operation for drbd_filesystem on zebrafish: 1 (unknown error)
> Jul 22 14:03:55 zebrafish crmd[1078]:  notice: zebrafish-drbd_filesystem_stop_0:101 [ umount: /raid: target is busy.\n        (In some cases useful info about processes that use\n         the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n        (In some cases useful info about processes that use\n         the device is found by lsof(8) or fuser(1))\nocf-exit-reason:Couldn't unmount /raid; trying cleanup with TERM\numount: /raid: target is busy.\n
> Jul 22 14:03:55 zebrafish crmd[1078]: warning: Action 45 (drbd_filesystem_stop_0) on zebrafish failed (target: 0 vs. rc: 1): Error
> Jul 22 14:03:55 zebrafish crmd[1078]:  notice: Transition aborted by operation drbd_filesystem_stop_0 'modify' on zebrafish: Event failed
> Jul 22 14:03:55 zebrafish crmd[1078]: warning: Action 45 (drbd_filesystem_stop_0) on zebrafish failed (target: 0 vs. rc: 1): Error
> Jul 22 14:03:55 zebrafish crmd[1078]:  notice: Transition 2 (Complete=21, Pending=0, Fired=0, Skipped=0, Incomplete=43, Source=/var/lib/pacemaker/pengine/pe-input-256.bz2): Complete

Both lsof and fuser report the PID of the process holding the filesystem open as "kernel".
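
For anyone who wants to tally the escalation in a log like the one quoted above, here is a minimal sketch. It assumes syslog lines in exactly the format shown (the Filesystem agent's "trying cleanup with TERM/KILL" and "giving up!" messages). The function name summarize_stop is made up for illustration and is not part of Pacemaker or the resource agents.

```python
import re
from collections import Counter

# Patterns copied from the Filesystem agent messages in the quoted log.
RETRY = re.compile(r"ERROR: Couldn't unmount (\S+); trying cleanup with (TERM|KILL)")
GAVE_UP = re.compile(r"ERROR: Couldn't unmount (\S+), giving up!")


def summarize_stop(lines):
    """Count unmount retries per signal and report whether the agent gave up.

    Hypothetical helper: it only looks at lines logged by the Filesystem
    agent itself, so the lrmd/crmd lines that replay the agent's stderr
    are not counted a second time.
    """
    retries = Counter()
    gave_up = False
    for line in lines:
        if "Filesystem(" not in line:
            continue  # skip lrmd/crmd replays and unrelated daemons
        match = RETRY.search(line)
        if match:
            retries[match.group(2)] += 1
        elif GAVE_UP.search(line):
            gave_up = True
    return dict(retries), gave_up
```

Fed the quoted log, this should count three TERM attempts and three KILL attempts before the agent gives up, all without a single process being signalled. That fits the holder being the kernel rather than a userspace process.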

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
