[ClusterLabs] centos 7 drbd fubar
Ken Gaillot
kgaillot at redhat.com
Fri Jan 6 14:41:34 EST 2017
On 12/27/2016 03:08 PM, Dimitri Maziuk wrote:
> I ran centos 7.3.1611 update over the holidays and my drbd + nfs + imap
> active-passive pair locked up again. This has now been consistent for at
> least 3 kernel updates. This time I had enough consoles open to run
> fuser & lsof though.
>
> The procedure:
>
> 1. pcs cluster standby <secondary>
> 2. yum up && reboot <secondary>
> 3. pcs cluster unstandby <secondary>
>
> Fine so far.
>
> 4. pcs cluster standby <primary>
> results in
>
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:41 INFO: Running stop for /dev/drbd0 on /raid
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:41 INFO: Trying to unmount /raid
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:41 ERROR: Couldn't unmount /raid; trying cleanup with TERM
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:41 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:42 ERROR: Couldn't unmount /raid; trying cleanup with TERM
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:42 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:43 ERROR: Couldn't unmount /raid; trying cleanup with TERM
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:43 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:44 ERROR: Couldn't unmount /raid; trying cleanup with KILL
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:44 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:45 ERROR: Couldn't unmount /raid; trying cleanup with KILL
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:46 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:47 ERROR: Couldn't unmount /raid; trying cleanup with KILL
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:47 INFO: No processes on /raid were signalled. force_unmount is set to 'yes'
>> Filesystem(drbd_filesystem)[18277]: 2016/12/23_17:36:48 ERROR: Couldn't unmount /raid, giving up!
>> Dec 23 17:36:48 [1138] zebrafish.bmrb.wisc.edu lrmd: notice: operation_finished: drbd_filesystem_stop_0:18277:stderr [ umount: /raid: target is busy. ]
>
> ... until the system is powered down. Before powering down I ran lsof,
> which hung, and fuser:
>
>> # fuser -vum /raid
>> USER PID ACCESS COMMAND
>> /raid: root kernel mount (root)/raid
>
> After running yum up on the primary and rebooting it again,
>
> 5. pcs cluster unstandby <primary>
> causes the same failed-unmount loop on the secondary, which then has to
> be powered down until the primary recovers.
>
> Hopefully I'm doing something wrong, please someone tell me what it is.
> Anyone? Bueller?
That is disconcerting. Since no one here seems to know, have you tried
asking on the drbd list? It sounds like an issue with the drbd kernel
module.
http://lists.linbit.com/listinfo/drbd-user
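
If it helps when posting there, here is a rough sketch of how the relevant
state could be gathered before the node is powered down. The function names
(only_kernel_holder, collect_report) are made up for illustration, /raid is
taken from the logs above, and reading /proc/drbd assumes a DRBD module that
provides that file. Treat it as a starting point, not a tested procedure.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; nothing here comes from the thread
# except the mount point (/raid) and the use of fuser.

# Read "fuser -vm" style output on stdin and print "yes" if every listed
# holder is the kernel, otherwise "no".
only_kernel_holder() {
    awk '
        $NF == "COMMAND" { next }   # skip the header row
        NF == 0          { next }   # skip blank lines
        {
            # The first holder row starts with "<mountpoint>:", so its PID
            # column is field 3; continuation rows have the PID in field 2.
            if ($1 ~ /:$/) pid = $3; else pid = $2
            total++
            if (pid == "kernel") kernel++
        }
        END {
            if (total > 0 && total == kernel) print "yes"
            else print "no"
        }
    '
}

# Print kernel version, DRBD module version, DRBD state, the mount entry,
# and the holders of the mount point (default: /raid).
collect_report() {
    mp=${1:-/raid}
    echo "== kernel ==";      uname -r
    echo "== drbd module =="; modinfo drbd 2>/dev/null | grep -i '^version'
    echo "== /proc/drbd ==";  cat /proc/drbd 2>/dev/null
    echo "== mounts ==";      grep " $mp " /proc/mounts
    echo "== holders ==";     fuser -vm "$mp" 2>&1
}
```

For example, after sourcing the file, "collect_report /raid > report.txt"
saves the output for the list post, and
"fuser -vm /raid 2>&1 | only_kernel_holder" prints "yes" when, as in the
fuser output quoted above, nothing but the kernel is holding the mount.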