[ClusterLabs] Antw: [EXT] Re: Pacemaker crashed and produce a coredump file

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Thu Jul 30 04:39:38 EDT 2020


>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on 30.07.2020 at 10:23 in
message <B346C97F-F4CB-4CE1-85EF-EA1EB8CFD788 at yahoo.com>:
> Early systemd bugs caused D-Bus issues and session files not being cleaned
> up properly. At least EL 7.4 and older were affected.
> 
> What is your OS and version?
> 
> P.S.: I know your pain. I am still fighting to explain that without planned
> downtime, the end users will definitely get unplanned downtime.

Well, it seems the only way to ensure no unplanned downtime in the long run is
to test (and fix) it periodically, probably causing unplanned downtime
periodically, but (if things go well) with decreasing probability ;-)

> 
> Best Regards,
> Strahil Nikolov
> 
> On 29 July 2020 12:46:16 GMT+03:00, lkxjtu <lkxjtu at 163.com> wrote:
>>Hi Reid Wahl,
>>
>>
>>There is more log information below. The cause seems to be that
>>communication with D-Bus timed out. Any suggestions?
>>
>>
>>1672712 Jul 24 21:20:17 [3945305] B0610011       lrmd:     info:
>>pcmk_dbus_timeout_dispatch:    Timeout 0x147bbd0 expired
>>1672713 Jul 24 21:20:17 [3945305] B0610011       lrmd:     info:
>>pcmk_dbus_find_error:  LoadUnit error
>>'org.freedesktop.DBus.Error.NoReply': Did not receive a reply.        
>>Possible causes include: the remote application did not send a reply,
>>the message bus security policy blocked the reply, the reply timeout
>>expired, or the network connection was broken.
>>1672714 Jul 24 21:20:17 [3945305] B0610011       lrmd:    error:
>>systemd_loadunit_result:       Unexepcted DBus type, expected o in 's'
>>instead of s
>>1672715 Jul 24 21:20:17 [3945305] B0610011       lrmd:    error:
>>crm_abort:     systemd_unit_exec_with_unit: Triggered fatal assert at
>>systemd.c:514 : unit
>>1672716 2020-07-24T21:20:17.701484+08:00 B0610011 lrmd[3945305]:   
>>error: systemd_loadunit_result: Unexepcted DBus type, expected o in 's'
>>instead of s
>>1672717 2020-07-24T21:20:17.701517+08:00 B0610011 lrmd[3945305]:   
>>error: crm_abort: systemd_unit_exec_with_unit: Triggered fatal assert
>>at systemd.c:514 : unit
>>1672718 Jul 24 21:20:17 [3945306] B0610011       crmd:    error:
>>crm_ipc_read:  Connection to lrmd failed
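
The sequence in the excerpt above (a D-Bus `NoReply` error from `LoadUnit`, followed shortly by a fatal assert in lrmd) can be searched for in a larger log. The following is only a minimal sketch, assuming the log wording shown above and one log entry per line (not wrapped as in this mail); the function name `find_dbus_timeout_crashes` and the 5-line window are illustrative choices, not anything Pacemaker provides.

```python
import re

# Patterns copied from the log excerpt above. The exact wording may differ
# between Pacemaker versions, so treat both as assumptions.
NOREPLY = re.compile(r"pcmk_dbus_find_error:.*org\.freedesktop\.DBus\.Error\.NoReply")
ASSERT = re.compile(r"crm_abort:.*Triggered fatal assert")


def find_dbus_timeout_crashes(lines, window=5):
    """Return (noreply_index, assert_index) pairs where a fatal assert
    follows a D-Bus NoReply error within `window` lines."""
    hits = []
    for i, line in enumerate(lines):
        if not NOREPLY.search(line):
            continue
        # Look only a few lines ahead so unrelated asserts are not paired.
        for j in range(i + 1, min(i + 1 + window, len(lines))):
            if ASSERT.search(lines[j]):
                hits.append((i, j))
                break
    return hits
```

Called with the lines of a log file (for example `find_dbus_timeout_crashes(open(path).read().splitlines())`), it returns the index pairs of each timeout and the assert that followed it, which helps to check whether every crash was preceded by a D-Bus timeout.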
>>
>>
>>
>>> Hi,
>>>
>>> It looks like this is a bug that was fixed in later releases. The `path`
>>> variable was a null pointer when it was passed to
>>> `systemd_unit_exec_with_unit` as the `unit` argument. Commit 62a0d26a
>>> <https://github.com/ClusterLabs/pacemaker/commit/62a0d26a8f85fbcee9b56524ea3f1ae0171cbe52#diff-00b989f66499e2081134c17c06d2b359R201>
>>> adds a null check to the `path` variable before using it to call
>>> `systemd_unit_exec_with_unit`.
>>>
>>> I believe pacemaker-1.1.15-11.el7 is the first RHEL pacemaker release that
>>> contains the fix. Can you upgrade and see if the issue is resolved?
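
The failure mode and the fix described above can be modelled in a few lines. This is a Python illustration of the control flow only, not Pacemaker's C code: `parse_loadunit_reply`, `exec_with_unit_guarded`, and the dict-shaped reply are hypothetical stand-ins. Only the D-Bus type codes come from the log, where `o` is an object path and `s` is a string.

```python
def parse_loadunit_reply(reply):
    """Return the unit's object path, or None if the reply is not a path.

    `reply` is a hypothetical dict holding a D-Bus type signature and a value.
    An error reply (such as NoReply after a timeout) carries a string ('s'),
    not an object path ('o').
    """
    if reply.get("signature") != "o":
        return None
    return reply.get("value")


def exec_with_unit_guarded(reply, action):
    """Run `action(path)` only when a path was actually returned.

    Without this guard, a None path would reach code that requires one,
    which is the analogue of the fatal assert on `unit` in the log.
    """
    path = parse_loadunit_reply(reply)
    if path is None:
        return "error: no unit path in reply"
    return action(path)
```

With the guard in place, a timed-out `LoadUnit` call turns into an ordinary failed operation that the caller can report, instead of aborting the whole daemon.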