[ClusterLabs] Antw: heads up: series of core dumps in SLES15 SP3 ("kernel: BUG: Bad rss-counter state mm:00000000d1a9d1f5 idx:1 val:4")

Tue Mar 8 08:43:04 EST 2022

Hi!

You may getting bored, but since the last message we had at least two more kernel crashes, one on a machine with latest updates (firmware and SLES), and the vendor diagnostics found no hardware problem.
The latest kdump (20 minutes ago) was:

---
[259832.062351] BTRFS info (device dm-0): qgroup scan completed (inconsistency flag cleared)
[263221.648397] BUG: Bad rss-counter state mm:00000000276fd188 idx:1 val:2
[266675.397157] BUG: Bad rss-counter state mm:000000003a0ed3fb idx:1 val:235
[272206.129167] BUG: kernel NULL pointer dereference, address: 0000000000000008
[272206.129183] #PF: supervisor write access in kernel mode
[272206.129187] #PF: error_code(0x0002) - not-present page
[272206.129190] PGD 0 P4D 0
[272206.129196] Oops: 0002 [#1] SMP NOPTI
[272206.129202] CPU: 2 PID: 417 Comm: kswapd0 Tainted: G               X    5.3.18-150300.59.49-default #1 SLE15-SP3
[272206.129209] Hardware name: Dell Inc. PowerEdge R7415/07YXFK, BIOS 1.17.0 07/30/2021
[272206.129219] RIP: e030:down_read_trylock+0x18/0x50
[272206.129224] Code: 20 48 c7 47 08 00 00 00 00 c7 47 10 00 00 00 00 c3 90 0f 1f 44 00 00 31 c0 48 b9 07 00 00 00 00 00 00 80 48 8d 90 00 01 00 00 <f0> 48 0f b1 17 75 21 48 8b 47 08 65 48 8b 14 25 c0 8b 01 00 83 e0
[272206.129233] RSP: e02b:ffffc90040cafc10 EFLAGS: 00010246
[272206.129237] RAX: 0000000000000000 RBX: ffff88812435e4d1 RCX: 8000000000000007
[272206.129241] RDX: 0000000000000100 RSI: ffffc90040cafca8 RDI: 0000000000000008
[272206.129245] RBP: ffffea0009608000 R08: 0000000000000067 R09: 0000000557878d4d
[272206.129250] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88812435e4d0
[272206.129254] R13: 0000000000000008 R14: ffffea0009608008 R15: ffffea0009608000
[272206.129263] FS:  0000000000000000(0000) GS:ffff888391480000(0000) knlGS:0000000000000000
[272206.129267] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[272206.129271] CR2: 0000000000000008 CR3: 000000033e414000 CR4: 0000000000050660
[272206.129276] Call Trace:
[272206.129286]  page_lock_anon_vma_read+0x48/0xe0
[272206.129292]  rmap_walk_anon+0x16c/0x250
[272206.129297]  page_referenced+0xd5/0x170
[272206.129301]  ? rmap_walk_anon+0x250/0x250
[272206.129305]  ? page_get_anon_vma+0x80/0x80
[272206.129311]  shrink_active_list+0x2dd/0x490
[272206.129316]  balance_pgdat+0x50b/0x630
[272206.129321]  kswapd+0x14b/0x3d0
[272206.129326]  ? wait_woken+0x80/0x80
[272206.129330]  ? balance_pgdat+0x630/0x630
[272206.129335]  kthread+0x10d/0x130
[272206.129339]  ? kthread_park+0xa0/0xa0
[272206.129345]  ret_from_fork+0x22/0x40
...
[272206.129468] Supported: Yes, External
[272206.129474] CR2: 0000000000000008
[272206.129508] ---[ end trace 92e4c491ab44e169 ]---
[272206.129512] RIP: e030:down_read_trylock+0x18/0x50
[272206.129516] Code: 20 48 c7 47 08 00 00 00 00 c7 47 10 00 00 00 00 c3 90 0f 1f 44 00 00 31 c0 48 b9 07 00 00 00 00 00 00 80 48 8d 90 00 01 00 00 <f0> 48 0f b1 17 75 21 48 8b 47 08 65 48 8b 14 25 c0 8b 01 00 83 e0
[272206.129525] RSP: e02b:ffffc90040cafc10 EFLAGS: 00010246
[272206.129528] RAX: 0000000000000000 RBX: ffff88812435e4d1 RCX: 8000000000000007
[272206.129533] RDX: 0000000000000100 RSI: ffffc90040cafca8 RDI: 0000000000000008
[272206.129537] RBP: ffffea0009608000 R08: 0000000000000067 R09: 0000000557878d4d
[272206.129541] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88812435e4d0
[272206.129545] R13: 0000000000000008 R14: ffffea0009608008 R15: ffffea0009608000
[272206.129552] FS:  0000000000000000(0000) GS:ffff888391480000(0000) knlGS:0000000000000000
[272206.129557] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[272206.129561] CR2: 0000000000000008 CR3: 000000033e414000 CR4: 0000000000050660
[272206.129566] Kernel panic - not syncing: Fatal exception
[272206.896249] Kernel Offset: disabled
---

Regards,
Ulrich

>>> Ulrich Windl schrieb am 01.03.2022 um 09:48 in Nachricht <621DDDDA.166 : 161 :
60728>:
> Hi!
> 
> I want to give an update on this issue (support is working on it):
> 
> First I recommend everyone using Xen and a Dell PowerEdge R7415 _not_ to 
> upgrade to SLES15 SP3, as we have about one crash per node and week.
> We had one last night a few minutes after BtrFS balance had finished.
> Meanwhile we also had a few crash dumps (kdump), and the call stacks are 
> like this:
> 
> (5.3.18-150300.59.43-default)
> [1175886.947081] ocfs2: Mounting device (9,10) on (node 116, slot 1) with 
> ordered data mode.
> [1175905.783132] general protection fault: 0000 [#1] SMP NOPTI
> [1175905.785704] RIP: e030:down_read_trylock+0x18/0x50
> [1175905.798982] Call Trace:
> [1175905.800305]  page_lock_anon_vma_read+0x48/0xe0
> [1175905.801659]  rmap_walk_anon+0x16c/0x250
> [1175905.803021]  page_referenced+0xd5/0x170
> [1175905.804254]  ? rmap_walk_anon+0x250/0x250
> [1175905.805377]  ? page_get_anon_vma+0x80/0x80
> [1175905.806593]  shrink_active_list+0x2dd/0x490
> [1175905.807920]  shrink_lruvec+0x4aa/0x6e0
> [1175905.809253]  ? free_unref_page_list+0x16f/0x180
> [1175905.810460]  ? free_unref_page_list+0x16f/0x180
> [1175905.811024]  ? shrink_node+0x143/0x600
> [1175905.811593]  shrink_node+0x143/0x600
> [1175905.812149]  balance_pgdat+0x28a/0x630
> [1175905.812716]  kswapd+0x14b/0x3d0
> [1175905.813263]  ? wait_woken+0x80/0x80
> [1175905.813793]  ? balance_pgdat+0x630/0x630
> [1175905.814319]  kthread+0x10d/0x130
> [1175905.814847]  ? kthread_park+0xa0/0xa0
> [1175905.815386]  ret_from_fork+0x22/0x40
> 
> (5.3.18-150300.59.49-default)
> [27926.595977] BUG: kernel NULL pointer dereference, address: 
> 0000000000000007
> [27926.597124] #PF: supervisor write access in kernel mode
> [27926.598197] #PF: error_code(0x0002) - not-present page
> [27926.599265] PGD 0 P4D 0
> [27926.600322] Oops: 0002 [#1] SMP NOPTI
> [27926.603924] RIP: e030:down_read_trylock+0x18/0x50
> [27926.618618] Call Trace:
> [27926.619138]  page_lock_anon_vma_read+0x48/0xe0
> [27926.619665]  rmap_walk_anon+0x16c/0x250
> [27926.620182]  page_referenced+0xd5/0x170
> [27926.620705]  ? rmap_walk_anon+0x250/0x250
> [27926.621214]  ? page_get_anon_vma+0x80/0x80
> [27926.621772]  shrink_active_list+0x2dd/0x490
> [27926.622280]  balance_pgdat+0x50b/0x630
> [27926.622793]  kswapd+0x14b/0x3d0
> [27926.623308]  ? wait_woken+0x80/0x80
> [27926.623822]  ? balance_pgdat+0x630/0x630
> [27926.624329]  kthread+0x10d/0x130
> [27926.624843]  ? kthread_park+0xa0/0xa0
> [27926.625577]  ret_from_fork+0x22/0x40
> 
> (5.3.18-150300.59.49-default)
> [566428.257264] BTRFS info (device dm-0): scrub: finished on devid 1 with 
> status: 0
> [571252.379396] ping[3707]: segfault at 0 ip 0000000000000000 sp 
> 00007fffbae8dc10 error 14 in bash[55a3a9d26000+f1000]
> [571252.379410] Code: Bad RIP value.
> [571252.948920] BUG: Bad rss-counter state mm:000000004d568db6 idx:1 val:4
> [571262.985375] general protection fault: 0000 [#1] SMP NOPTI
> [571262.985410] RIP: e030:down_read_trylock+0x18/0x50
> [571262.985470] Call Trace:
> [571262.985481]  page_lock_anon_vma_read+0x48/0xe0
> [571262.985487]  rmap_walk_anon+0x16c/0x250
> [571262.985492]  page_referenced+0xd5/0x170
> [571262.985496]  ? rmap_walk_anon+0x250/0x250
> [571262.985500]  ? page_get_anon_vma+0x80/0x80
> [571262.985506]  shrink_active_list+0x2dd/0x490
> [571262.985512]  shrink_lruvec+0x4aa/0x6e0
> [571262.985517]  ? free_unref_page_list+0x16f/0x180
> [571262.985522]  ? free_unref_page_list+0x16f/0x180
> [571262.985526]  ? shrink_node+0x143/0x600
> [571262.985529]  shrink_node+0x143/0x600
> [571262.985534]  balance_pgdat+0x28a/0x630
> [571262.985539]  kswapd+0x14b/0x3d0
> [571262.985544]  ? wait_woken+0x80/0x80
> [571262.985548]  ? balance_pgdat+0x630/0x630
> [571262.985553]  kthread+0x10d/0x130
> [571262.985557]  ? kthread_park+0xa0/0xa0
> [571262.985563]  ret_from_fork+0x22/0x40
> 
> (5.3.18-150300.59.49-default)
> [22707.270890] #PF: supervisor write access in kernel mode
> [22707.271539] #PF: error_code(0x0002) - not-present page
> [22707.272159] PGD 0 P4D 0
> [22707.272786] Oops: 0002 [#1] SMP NOPTI
> [22707.274680] RIP: e030:down_read_trylock+0x18/0x50
> [22707.282129] Call Trace:
> [22707.282682]  page_lock_anon_vma_read+0x48/0xe0
> [22707.283225]  rmap_walk_anon+0x16c/0x250
> [22707.283772]  ? uncharge_batch+0xe3/0x180
> [22707.284307]  try_to_unmap+0x93/0xf0
> [22707.284835]  ? page_remove_rmap+0x2c0/0x2c0
> [22707.285375]  ? page_not_mapped+0x20/0x20
> [22707.285906]  ? page_get_anon_vma+0x80/0x80
> [22707.286428]  ? invalid_mkclean_vma+0x20/0x20
> [22707.286974]  migrate_pages+0x857/0xb50
> [22707.287494]  ? isolate_freepages_block+0x370/0x370
> [22707.288013]  ? move_freelist_tail+0xd0/0xd0
> [22707.288538]  compact_zone+0x775/0xd90
> [22707.289059]  kcompactd_do_work+0xfe/0x2a0
> [22707.289576]  ? xen_load_sp0+0x7a/0x160
> [22707.290096]  ? __set_cpus_allowed_ptr+0xb5/0x1e0
> [22707.290623]  ? kcompactd_do_work+0x2a0/0x2a0
> [22707.291158]  ? kcompactd+0x84/0x1e0
> [22707.291703]  kcompactd+0x84/0x1e0
> [22707.292243]  ? wait_woken+0x80/0x80
> [22707.292788]  kthread+0x10d/0x130
> [22707.293329]  ? kthread_park+0xa0/0xa0
> [22707.293868]  ret_from_fork+0x22/0x40
> 
> All those dumps happened with BIOS 1.17.0; today I realized that there was a 
> critical BIOS update (1.18.0) that looks as if the CPU microcode was updated.
> However I don't know any details abot the change.
> 
> Regards,
> Ulrich
> 
> >>> Ulrich Windl schrieb am 10.02.2022 um 09:02 in Nachricht <6204C67B.FDC : 161 
> :
> 60728>:
> > Hi!
> > 
> > just a heads up: Shortly after midnight one of our SLES15 SP3 cluster nodes 
> 
> > started to send SIGSEGV to processes, eventually also to pacemaker, 
> resulting 
> > in a node fence.
> > I suspect some kernel problem. The configuration was running since a week 
> or 
> > so (since last reboot), however.
> > (For SP2 we once had a kernel lock-up, and the suspect was that it might be 
> 
> > related to BtrFS balancing or snapshotting, but that was just as suspect. 
> At 
> > midnight snapper was activated, too, so who knows...)
> > Kernel: 5.3.18-150300.59.43-default
> > 
> > Summary of events:
> > Feb 10 00:00:02 h16 dbus-daemon[5905]: [system] Successfully activated 
> > service 'org.opensuse.Snapper'
> > Feb 10 00:00:02 h16 systemd[1]: Started DBus interface for snapper.
> > Feb 10 00:00:02 h16 systemd[1]: snapper-timeline.service: Succeeded.
> > Feb 10 00:00:02 h16 kernel: traps: mandb[4484] general protection fault 
> > ip:7f4f21876160 sp:7ffe25a71ff8 error:0 in 
> libc-2.31.so[7f4f217fa000+1cb000]
> > Feb 10 00:00:03 h16 systemd-coredump[4488]: Process 4484 (mandb) of user 13 
> 
> > dumped core.
> > Feb 10 00:00:03 h16 kernel: BUG: Bad rss-counter state mm:00000000d1a9d1f5 
> > idx:1 val:4
> > Feb 10 00:00:03 h16 kernel: mandb[4547]: segfault at 8b86 ip 
> > 0000000000008b86 sp 00007ffe25a73058 error 14 in mandb[55dcc3a12000+20000]
> > Feb 10 00:00:03 h16 kernel: Code: Bad RIP value.
> > Feb 10 00:00:04 h16 systemd-coredump[4549]: Process 4547 (mandb) of user 13 
> 
> > dumped core.
> > Feb 10 00:00:04 h16 kernel: BUG: Bad rss-counter state mm:00000000c4f00529 
> > idx:1 val:5
> > Feb 10 00:00:05 h16 kernel: BUG: Bad rss-counter state mm:00000000aae27ee5 
> > idx:1 val:59
> > Feb 10 00:00:06 h16 systemd-coredump[4610]: Process 4606 (mandb) of user 13 
> 
> > dumped core.
> > Feb 10 00:00:06 h16 kernel: traps: mandb[4640] general protection fault 
> > ip:7f4f218c6caf sp:7ffe25a73110 error:0 in 
> libc-2.31.so[7f4f217fa000+1cb000]
> > Feb 10 00:00:06 h16 kernel: BUG: Bad rss-counter state mm:00000000babee882 
> > idx:1 val:2
> > Feb 10 00:00:08 h16 systemd-coredump[4645]: Process 4643 (systemd) of user 
> 0 
> > dumped core.
> > 
> > That doesn't sound good, does it?
> > 
> > Feb 10 00:00:08 h16 systemd[4642]: Caught <SEGV>, dumped core as pid 4643.
> > Feb 10 00:00:08 h16 systemd[4642]: Freezing execution.
> > Feb 10 00:00:29 h16 kernel: pacemaker-execd[4704]: segfault at 3a46 ip 
> > 0000000000003a46 sp 00007ffe2c700508 error 14 in 
> > pacemaker-execd[55e474755000+b000]
> > Feb 10 00:00:29 h16 kernel: Code: Bad RIP value.
> > Feb 10 00:00:30 h16 kernel: BUG: Bad rss-counter state mm:00000000b1203e21 
> > idx:1 val:2
> > Feb 10 00:00:34 h16 kernel: libvirtd[5685]: segfault at 0 ip 
> > 00007f745c487e73 sp 00007ffc70e95a58 error 6 in 
> > libc-2.31.so[7f745c3fe000+1cb000]
> > Feb 10 00:00:34 h16 kernel: Code: Bad RIP value.
> > Feb 10 00:00:34 h16 kernel: BUG: Bad rss-counter state mm:00000000d755caae 
> > idx:1 val:69691
> > Feb 10 00:00:34 h16 kernel: VirtualDomain[5781]: segfault at 0 ip 
> > 0000000000000000 sp 00007ffdc5c98660 error 14 in bash[55669b8cb000+f1000]
> > Feb 10 00:00:34 h16 kernel: Code: Bad RIP value.
> > Feb 10 00:00:35 h16 systemd-coredump[5742]: Process 5689 (Filesystem) of 
> > user 0 dumped core.
> > Feb 10 00:00:35 h16 kernel: BUG: Bad rss-counter state mm:0000000042171789 
> > idx:1 val:2
> > Feb 10 00:00:36 h16 systemd-coredump[5803]: Process 5781 (VirtualDomain) of 
> 
> > user 0 dumped core.
> > Feb 10 00:00:36 h16 kernel: BUG: Bad rss-counter state mm:00000000713058ae 
> > idx:1 val:6
> > ...many more...
> > Feb 10 00:03:33 h16 systemd-coredump[13479]: Process 13400 (systemd) of 
> user 
> > 0 dumped core.
> > -- Reboot --
> > Feb 10 00:06:59 h16 kernel: Linux version 5.3.18-150300.59.43-default 
> > (geeko at buildhost) (gcc version 7.5.0 (SUSE Linux)) #1 SMP Sun Jan 23 
> 19:27:23 
> > UTC 2022 (c76af22)
> > (eventually)
> > 
> > Another reboot:
> > Feb 10 00:08:18 h16 sbd[7067]:    emerg: do_exit: Rebooting system: reboot
> > -- Reboot --
> > Feb 10 00:11:43 h16 kernel: Linux version 5.3.18-150300.59.43-default 
> > (geeko at buildhost) (gcc version 7.5.0 (SUSE Linux)) #1 SMP Sun Jan 23 
> 19:27:23 
> > UTC 2022 (c76af22)
> > 
> > Since then the node (Dell PowerEdge R7415) is running normally again.
> > 
> > Regards,
> > Ulrich
> > 
> > 
> 
> 
> 
>