[ClusterLabs] Antw: [EXT] Re: Antw: Hanging OCFS2 Filesystem any one else?

Gang He ghe at suse.com
Wed Jun 16 01:48:29 EDT 2021


Hi Ulrich,

On 2021/6/15 17:01, Ulrich Windl wrote:
> Hi Guys!
> 
> Just to keep you informed on the issue:
> I was informed that I'm not the only one seeing this problem, and there seems
> to be some "negative interference" between BtrFS reorganizing its extents
> periodically and OCFS2 making reflink snapshots (a local cron job here) in
> current SUSE SLES kernels. It seems this happens almost exactly at midnight
> (0:00).
We encountered the same hang in our local environment; the problem looks
like it is caused by a btrfs-balance job run, but I need to crash the
kernel for further analysis.
Hi Ulrich, do you know how to reproduce this hang reliably? E.g. by running
the reflink snapshot script and triggering the btrfs-balance job?
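
Something like the following sketch is what I have in mind (all paths, file
names and the balance filter below are only placeholders, not a verified
reproducer):

  # NOTE: /srv and the file names are placeholders for your actual setup
  # on the BtrFS volume that hosts the OCFS2 mount point, start a balance
  btrfs balance start -dusage=50 /srv &

  # at the same time, create OCFS2 reflink snapshots in a loop,
  # roughly what your cron job does (reflink(1) is from ocfs2-tools)
  while true; do
      reflink /srv/ocfs2/data/vm.img /srv/ocfs2/snap/vm.$(date +%s)
      sleep 1
  done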


Thanks
Gang

> 
> The only thing that BtrFS and OCFS2 have in common here is that BtrFS provides
> the mount point for OCFS2.
> 
> Regards,
> Ulrich
> 
>>>> Ulrich Windl wrote on 02.06.2021 at 11:00 in message <60B748A4.E0C : 161 : 60728>:
>>>>> Gang He <GHe at suse.com> wrote on 02.06.2021 at 08:34 in message <AM6PR04MB6488DE7D2DA906BAD73FA3A1CF3D9 at AM6PR04MB6488.eurprd04.prod.outlook.com>:
>>
>>> Hi Ulrich,
>>>
>>> The hang problem looks like it may be addressed by an existing fix
>>> (90bd070aae6c4fb5d302f9c4b9c88be60c8197ec "ocfs2: fix deadlock between
>>> setattr and dio_end_io_write"), but it is not 100% certain.
>>> If possible, could you report the bug to SUSE, so that we can work on it
>>> further?
>>
>> Hi!
>>
>> Actually a service request for the issue is open at SUSE. However I don't
>> know which L3 engineer is working on it.
>> I have some "funny" effects, like these:
>> On one node "ls" hangs, but can be interrupted with ^C; on another node "ls"
>> also hangs, but cannot be stopped with ^C or ^Z.
>> (Most processes cannot even be killed with "kill -9".)
>> "ls" on the directory also hangs, as does an "rm" for a non-existent file.
>>
>> What I really wonder is what triggered the effect, and more importantly how
>> to recover from it.
>> Initially I had suspected the rather full (95%) filesystem, but even at 95%
>> there are still 24 GB available.
>> The other suspect was the concurrent creation of reflink snapshots while the
>> file being snapshotted changed (e.g. allocating a hole in a sparse file).
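>>
>> To illustrate that second suspect: roughly this kind of concurrent activity
>> (paths and sizes are only placeholders, and cp --reflink merely stands in
>> for whatever the snapshot job actually calls):
>>
>>   # terminal 1: keep allocating blocks inside a sparse file on the OCFS2
>>   # mount (placeholder paths)
>>   truncate -s 10G /srv/ocfs2/data/sparse.img
>>   while true; do
>>       # write 1 MiB at a random MiB offset, filling holes without truncating
>>       dd if=/dev/zero of=/srv/ocfs2/data/sparse.img bs=1M count=1 \
>>          seek=$((RANDOM % 10000)) conv=notrunc status=none
>>   done
>>
>>   # terminal 2: take reflink snapshots of the same file at the same time
>>   while true; do
>>       cp --reflink=always /srv/ocfs2/data/sparse.img \
>>          /srv/ocfs2/snap/sparse.$(date +%s)
>>   done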
>>
>> Regards,
>> Ulrich
>>
>>>
>>> Thanks
>>> Gang
>>>
>>> ________________________________________
>>> From: Users <users-bounces at clusterlabs.org> on behalf of Ulrich Windl
>>> <Ulrich.Windl at rz.uni-regensburg.de>
>>> Sent: Tuesday, June 1, 2021 15:14
>>> To: users at clusterlabs.org
>>> Subject: [ClusterLabs] Antw: Hanging OCFS2 Filesystem any one else?
>>>
>>>>>> Ulrich Windl wrote on 31.05.2021 at 12:11 in message <60B4B65A.A8F : 161 : 60728>:
>>>> Hi!
>>>>
>>>> We have an OCFS2 filesystem shared between three cluster nodes (SLES 15 SP2,
>>>> kernel 5.3.18-24.64-default). The filesystem is filled up to about 95%, and
>>>> we have an odd effect:
>>>> A stat() system call on some of the files hangs indefinitely (state "D").
>>>> ("ls -l" and "rm" also hang, but I suspect those call stat() internally, too.)
>>>> My only suspect is that the effect might be related to the 95% usage.
>>>> The other suspect is that concurrent reflink calls may trigger the effect.
>>>>
>>>> Did anyone else experience something similar?
>>>
>>> Hi!
>>>
>>> I have some details:
>>> It seems there is a reader/writer deadlock trying to allocate additional
>>> blocks for a file.
>>> The stacktrace looks like this:
>>> Jun 01 07:56:31 h16 kernel:  rwsem_down_write_slowpath+0x251/0x620
>>> Jun 01 07:56:31 h16 kernel:  ? __ocfs2_change_file_space+0xb3/0x620
> [ocfs2]
>>> Jun 01 07:56:31 h16 kernel:  __ocfs2_change_file_space+0xb3/0x620 [ocfs2]
>>> Jun 01 07:56:31 h16 kernel:  ocfs2_fallocate+0x82/0xa0 [ocfs2]
>>> Jun 01 07:56:31 h16 kernel:  vfs_fallocate+0x13f/0x2a0
>>> Jun 01 07:56:31 h16 kernel:  ksys_fallocate+0x3c/0x70
>>> Jun 01 07:56:31 h16 kernel:  __x64_sys_fallocate+0x1a/0x20
>>> Jun 01 07:56:31 h16 kernel:  do_syscall_64+0x5b/0x1e0
>>>
>>> That is the only writer (on that host), but there are multiple readers like
>>> this:
>>> Jun 01 07:56:31 h16 kernel:  rwsem_down_read_slowpath+0x172/0x300
>>> Jun 01 07:56:31 h16 kernel:  ? dput+0x2c/0x2f0
>>> Jun 01 07:56:31 h16 kernel:  ? lookup_slow+0x27/0x50
>>> Jun 01 07:56:31 h16 kernel:  lookup_slow+0x27/0x50
>>> Jun 01 07:56:31 h16 kernel:  walk_component+0x1c4/0x300
>>> Jun 01 07:56:31 h16 kernel:  ? path_init+0x192/0x320
>>> Jun 01 07:56:31 h16 kernel:  path_lookupat+0x6e/0x210
>>> Jun 01 07:56:31 h16 kernel:  ? __put_lkb+0x45/0xd0 [dlm]
>>> Jun 01 07:56:31 h16 kernel:  filename_lookup+0xb6/0x190
>>> Jun 01 07:56:31 h16 kernel:  ? kmem_cache_alloc+0x3d/0x250
>>> Jun 01 07:56:31 h16 kernel:  ? getname_flags+0x66/0x1d0
>>> Jun 01 07:56:31 h16 kernel:  ? vfs_statx+0x73/0xe0
>>> Jun 01 07:56:31 h16 kernel:  vfs_statx+0x73/0xe0
>>> Jun 01 07:56:31 h16 kernel:  ? fsnotify_grab_connector+0x46/0x80
>>> Jun 01 07:56:31 h16 kernel:  __do_sys_newstat+0x39/0x70
>>> Jun 01 07:56:31 h16 kernel:  ? do_unlinkat+0x92/0x320
>>> Jun 01 07:56:31 h16 kernel:  do_syscall_64+0x5b/0x1e0
>>>
>>> So that will match the hanging stat() quite nicely!
>>>
>>> However the PID displayed as holding the writer does not exist in the
>>> system (on that node).
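>>>
>>> For anyone who wants to look at a similar hang: one way to collect such
>>> stacks is to dump all blocked tasks via sysrq, or to read the kernel stack
>>> of a single hanging process directly (<pid> being a placeholder for the
>>> PID of a hanging "ls" or "stat"):
>>>
>>>   echo w > /proc/sysrq-trigger   # log the stacks of all D-state tasks
>>>   dmesg | tail -n 200            # the traces end up in the kernel log
>>>   cat /proc/<pid>/stack          # kernel stack of one hanging process (placeholder PID)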
>>>
>>> Regards,
>>> Ulrich
>>>
>>>
>>>>
>>>> Regards,
>>>> Ulrich
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
> 
> 
> 
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
> 


