[ClusterLabs] Some unexpected DLM messages; OCFS2 related? "send_repeat_remove dir" / "send_repeat_remove dir"

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Oct 8 04:38:07 EDT 2021


Hi!

I just noticed these two messages on two nodes of a 3-node cluster:
Oct 08 10:00:14 h18 kernel: dlm: 790F9C237C2A45758135FE4945B7A744: send_repeat_remove dir 119 O000000000000000009d83500000000
Oct 08 10:00:14 h19 kernel: dlm: 790F9C237C2A45758135FE4945B7A744: receive_remove from 118 not found O000000000000000009d83500000000

Due to "genuine configuration"(TM), node ID 118 corresponds to node h18, while 119 corresponds to (you guessed it!) node h19.
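
To make the node mapping explicit, here is a minimal sketch that splits the two messages into their fields and translates the node ID into a host name. The regular expression is an assumption derived only from the two lines quoted above, not from any documented DLM log format, and the names `NODE_NAMES`, `DLM_RE` and `parse_dlm_line` are made up for illustration.

```python
import re

# Node IDs as stated in this post: 118 is h18, 119 is h19.
NODE_NAMES = {118: "h18", 119: "h19"}

# Assumption: this pattern only covers the two message shapes quoted above.
DLM_RE = re.compile(
    r"kernel: dlm: (?P<lockspace>[0-9A-F]+): "
    r"(?P<event>send_repeat_remove dir|receive_remove from) "
    r"(?P<peer>\d+) (?:not found )?(?P<resource>\S+)$"
)

def parse_dlm_line(line):
    """Return the fields of a matching dlm message as a dict, or None."""
    m = DLM_RE.search(line.strip())
    if m is None:
        return None
    peer = int(m.group("peer"))
    return {
        "lockspace": m.group("lockspace"),
        "event": m.group("event"),
        "peer_id": peer,
        "peer_host": NODE_NAMES.get(peer, "unknown"),
        "resource": m.group("resource"),
    }

LINES = [
    "Oct 08 10:00:14 h18 kernel: dlm: 790F9C237C2A45758135FE4945B7A744: "
    "send_repeat_remove dir 119 O000000000000000009d83500000000",
    "Oct 08 10:00:14 h19 kernel: dlm: 790F9C237C2A45758135FE4945B7A744: "
    "receive_remove from 118 not found O000000000000000009d83500000000",
]

for line in LINES:
    print(parse_dlm_line(line))
```

Both messages name the same lockspace and the same resource, and each one names the other node as its peer.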

journalctl colors these messages in red, so I guess they are somewhat unexpected.

My guess is that the messages are related to the OCFS2 reflink snapshots that are created every hour (and may take 15 seconds).
The kernel is 5.3.18-24.83-default (SLES15 SP2) on all nodes, and we actually had a lockup on OCFS2 snapshots with some older kernels.
So I wonder whether those messages may still be an indication of some problem.

My snapshots do not create any directories ("dir"), BTW. But the nodes create/rename/remove different files in the same directory.

Actually I have snapshots with these date stamps:
Change: 2021-10-08 10:00:10.275375897 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.443455304 +0200
Change: 2021-10-08 10:00:15.938216584 +0200
Change: 2021-10-08 10:00:15.974216964 +0200
Change: 2021-10-08 10:00:16.183466675 +0200
Change: 2021-10-08 10:00:16.223467289 +0200
Change: 2021-10-08 10:00:16.251467719 +0200
Change: 2021-10-08 10:00:17.375484990 +0200
Change: 2021-10-08 10:00:18.187497465 +0200
Change: 2021-10-08 10:00:18.843647739 +0200 
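
To compare these time stamps with the kernel messages, here is a rough sketch that prints each "Change" time as an offset from 10:00:14. It assumes that the journal time stamp is local time (+0200) and only has one-second resolution, so the offsets are approximate; `parse_change` and `EVENT` are names made up for illustration.

```python
from datetime import datetime, timedelta, timezone

CHANGES = """\
Change: 2021-10-08 10:00:10.275375897 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.371632277 +0200
Change: 2021-10-08 10:00:15.443455304 +0200
Change: 2021-10-08 10:00:15.938216584 +0200
Change: 2021-10-08 10:00:15.974216964 +0200
Change: 2021-10-08 10:00:16.183466675 +0200
Change: 2021-10-08 10:00:16.223467289 +0200
Change: 2021-10-08 10:00:16.251467719 +0200
Change: 2021-10-08 10:00:17.375484990 +0200
Change: 2021-10-08 10:00:18.187497465 +0200
Change: 2021-10-08 10:00:18.843647739 +0200
"""

def parse_change(line):
    """Parse one 'Change: ...' line with nanosecond precision."""
    _, date, clock, zone = line.split()
    whole, nanos = clock.split(".")
    # strptime's %f accepts at most 6 digits, so truncate to microseconds.
    stamp = f"{date} {whole}.{nanos[:6]} {zone}"
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f %z")

# Assumption: "Oct 08 10:00:14" in the journal is local time (+0200).
EVENT = datetime(2021, 10, 8, 10, 0, 14, tzinfo=timezone(timedelta(hours=2)))

offsets = [(parse_change(line) - EVENT).total_seconds()
           for line in CHANGES.splitlines()]

for off in offsets:
    print(f"{off:+.3f} s relative to the dlm messages")
```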

Regards,
Ulrich





