[ClusterLabs] Pacemaker crashed and produced a coredump file

Klaus Wenninger kwenning at redhat.com
Thu Jul 30 04:04:47 EDT 2020


On 7/29/20 10:39 AM, Reid Wahl wrote:
> Hi,
>
> It looks like this is a bug that was fixed in later releases. The
> `path` variable was a null pointer when it was passed to
> `systemd_unit_exec_with_unit` as the `unit` argument. Commit 62a0d26a
> <https://github.com/ClusterLabs/pacemaker/commit/62a0d26a8f85fbcee9b56524ea3f1ae0171cbe52#diff-00b989f66499e2081134c17c06d2b359R201>
> adds a null check to the `path` variable before using it to call
> `systemd_unit_exec_with_unit`.
>
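For readers following along, the guard Reid describes can be sketched as below. This is a minimal, self-contained illustration rather than the actual change in commit 62a0d26a: `demo_op`, `demo_exec_with_unit`, and `demo_loadunit_result` are hypothetical stand-ins for the Pacemaker functions in the backtrace, and the error handling (setting `rc` to -1) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the operation object; not Pacemaker's real type. */
struct demo_op {
    int rc;  /* 0 = success, nonzero = failure */
};

/* Stand-in for the callee that asserts on its argument, like the
 * CRM_ASSERT(unit) at systemd.c:514 in the backtrace. */
void demo_exec_with_unit(struct demo_op *op, const char *unit)
{
    assert(unit != NULL);  /* aborts the process if unit is NULL */
    printf("executing operation on unit %s\n", unit);
    op->rc = 0;
}

/* Stand-in for the reply handler: check the path before passing it on,
 * so a failed lookup becomes an error result instead of an abort. */
int demo_loadunit_result(struct demo_op *op, const char *path)
{
    if (path == NULL) {
        op->rc = -1;  /* assumed error code, for illustration only */
        return -1;
    }
    demo_exec_with_unit(op, path);
    return 0;
}
```

With the guard in place, a NULL `path` makes `demo_loadunit_result` return -1 and never reaches the assertion.
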
Hmm ... the call tree shows the D-Bus API being used inside
dbus_connection_dispatch, which IIRC isn't allowed.
This could be related to https://github.com/ClusterLabs/pacemaker/pull/1201.
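
As far as I recall, the libdbus documentation for
dbus_connection_set_dispatch_status_function() says that
dbus_connection_dispatch() must not be called from inside the
dispatch-status callback; the callback should only record that messages
remain and let the main loop dispatch them later. Frames #7 to #9 of the
backtrace show such a direct call. A generic sketch of the deferral
pattern follows. It uses hypothetical `demo_*` names and no real libdbus
calls, and it is not taken from the PR above:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a connection; not a libdbus type. */
struct demo_conn {
    bool dispatch_pending;  /* set by the status callback, cleared by the loop */
    bool in_callback;       /* true while the status callback is running */
    int  dispatched;        /* number of times messages were processed */
};

/* Stand-in for the library's dispatch call. It refuses to run while the
 * status callback is active, modelling the "no reentrancy" rule. */
int demo_dispatch(struct demo_conn *c)
{
    if (c->in_callback) {
        return -1;
    }
    c->dispatched++;
    return 0;
}

/* Status callback: only remember that work remains; do not dispatch here. */
void demo_status_changed(struct demo_conn *c, bool data_remains)
{
    c->in_callback = true;
    if (data_remains) {
        c->dispatch_pending = true;
    }
    c->in_callback = false;
}

/* Run from the main loop on a later iteration (for example an idle
 * handler), outside of any library callback. */
void demo_main_loop_iteration(struct demo_conn *c)
{
    if (c->dispatch_pending) {
        c->dispatch_pending = false;
        demo_dispatch(c);
    }
}
```

The point of the split is that the actual dispatch only ever runs from the main loop, never from within the library's own call stack.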

Klaus
> I believe pacemaker-1.1.15-11.el7 is the first RHEL pacemaker release
> that contains the fix. Can you upgrade and see if the issue is resolved?
>
> On Tue, Jul 28, 2020 at 11:49 PM lkxjtu <lkxjtu at 163.com> wrote:
>
>     RPM Version Information:
>     corosync-2.3.4-7.el7_2.1.x86_64
>     pacemaker-1.1.12-22.el7.x86_64
>
>
>     Coredump file backtrace:
>
>     ```
>     warning: .dynamic section for "/lib64/libk5crypto.so.3" is not at
>     the expected address (wrong library or version mismatch?)
>     Missing separate debuginfo for
>     Try: yum --enablerepo='*debug*' install
>     /usr/lib/debug/.build-id/91/375124d864f2692ced1c4a5f090826b7074dc0
>     [Thread debugging using libthread_db enabled]
>     Using host libthread_db library "/lib64/libthread_db.so.1".
>     Core was generated by `/usr/libexec/pacemaker/lrmd'.
>     Program terminated with signal 6, Aborted.
>     #0  0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
>     Missing separate debuginfos, use: debuginfo-install
>     audit-libs-2.4.1-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64
>     corosynclib-2.3.4-7.el7_2.1.x86_64 dbus-libs-1.6.12-13.el7.x86_64
>     glib2-2.42.2-5.el7.x86_64 glibc-2.17-106.el7_2.4.x86_64
>     gmp-6.0.0-12.el7_1.x86_64 gnutls-3.3.8-14.el7_2.x86_64
>     keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64
>     libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-16.el7.x86_64
>     libqb-0.17.1-2.el7.1.x86_64 libselinux-2.5-12.el7.x86_64
>     libtasn1-3.8-2.el7.x86_64 libtool-ltdl-2.4.2-21.el7_2.x86_64
>     libuuid-2.23.2-26.el7_2.2.x86_64 libxml2-2.9.1-6.el7_2.2.x86_64
>     libxslt-1.1.28-5.el7.x86_64 nettle-2.7.1-4.el7.x86_64
>     openssl-libs-1.0.1e-51.el7_2.4.x86_64 p11-kit-0.20.7-3.el7.x86_64
>     pam-1.1.8-12.el7_1.1.x86_64 pcre-8.32-15.el7.x86_64
>     trousers-0.3.13-1.el7.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64
>     zlib-1.2.7-15.el7.x86_64
>     (gdb) bt
>     #0  0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
>     #1  0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
>     #2  0x00007f6eeb7e1f67 in crm_abort
>     (file=file@entry=0x7f6eeb5c863e "systemd.c",
>     function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
>     "systemd_unit_exec_with_unit", line=line@entry=514,
>         assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
>     do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0)
>     at utils.c:1197
>     #3  0x00007f6eeb5c5cef in systemd_unit_exec_with_unit
>     (op=op@entry=0x13adce0, unit=0x0) at systemd.c:514
>     #4  0x00007f6eeb5c5e81 in systemd_loadunit_result
>     (reply=reply@entry=0x139f2a0, op=op@entry=0x13adce0) at systemd.c:175
>     #5  0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
>     user_data=0x13adce0) at systemd.c:197
>     #6  0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
>     /lib64/libdbus-1.so.3
>     #7  0x00007f6eeb172b51 in dbus_connection_dispatch () from
>     /lib64/libdbus-1.so.3
>     #8  0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
>     (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS,
>     data=0x0) at dbus.c:388
>     #9  0x00007f6eeb171260 in
>     _dbus_connection_update_dispatch_status_and_unlock () from
>     /lib64/libdbus-1.so.3
>     #10 0x00007f6eeb172a93 in reply_handler_timeout () from
>     /lib64/libdbus-1.so.3
>     #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch
>     (data=0x13aa660) at dbus.c:491
>     #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
>     /lib64/libglib-2.0.so.0
>     #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
>     /lib64/libglib-2.0.so.0
>     #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
>     /lib64/libglib-2.0.so.0
>     #15 0x00007f6ee97a1dca in g_main_loop_run () from
>     /lib64/libglib-2.0.so.0
>     #16 0x0000000000402824 in main (argc=<optimized out>,
>     argv=0x7ffce752b258) at main.c:344
>     (gdb) up
>     #1  0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
>     (gdb) up
>     #2  0x00007f6eeb7e1f67 in crm_abort
>     (file=file@entry=0x7f6eeb5c863e "systemd.c",
>     function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
>     "systemd_unit_exec_with_unit", line=line@entry=514,
>         assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
>     do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0)
>     at utils.c:1197
>     1197            abort();
>     (gdb) up
>     #3  0x00007f6eeb5c5cef in systemd_unit_exec_with_unit
>     (op=op@entry=0x13adce0, unit=0x0) at systemd.c:514
>     514         CRM_ASSERT(unit);
>     (gdb) up
>     #4  0x00007f6eeb5c5e81 in systemd_loadunit_result
>     (reply=reply@entry=0x139f2a0, op=op@entry=0x13adce0) at systemd.c:175
>     175             systemd_unit_exec_with_unit(op, path);
>     (gdb) up
>     #5  0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
>     user_data=0x13adce0) at systemd.c:197
>     197         systemd_loadunit_result(reply, user_data);
>     (gdb) up
>     #6  0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
>     /lib64/libdbus-1.so.3
>     (gdb) up
>     #7  0x00007f6eeb172b51 in dbus_connection_dispatch () from
>     /lib64/libdbus-1.so.3
>     (gdb) up
>     #8  0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
>     (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS,
>     data=0x0) at dbus.c:388
>     388             dbus_connection_dispatch(connection);
>     (gdb) up
>     #9  0x00007f6eeb171260 in
>     _dbus_connection_update_dispatch_status_and_unlock () from
>     /lib64/libdbus-1.so.3
>     (gdb) up
>     #10 0x00007f6eeb172a93 in reply_handler_timeout () from
>     /lib64/libdbus-1.so.3
>     (gdb) up
>     #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch
>     (data=0x13aa660) at dbus.c:491
>     491         dbus_timeout_handle(data);
>     (gdb) up
>     #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
>     /lib64/libglib-2.0.so.0
>     (gdb) up
>     #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
>     /lib64/libglib-2.0.so.0
>     (gdb) up
>     #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
>     /lib64/libglib-2.0.so.0
>     (gdb) up
>     #15 0x00007f6ee97a1dca in g_main_loop_run () from
>     /lib64/libglib-2.0.so.0
>     (gdb) up
>     #16 0x0000000000402824 in main (argc=<optimized out>,
>     argv=0x7ffce752b258) at main.c:344
>     344         g_main_run(mainloop);
>     (gdb) r
>     Starting program: /usr/libexec/pacemaker/lrmd
>     [Thread debugging using libthread_db enabled]
>     Using host libthread_db library "/lib64/libthread_db.so.1".
>     [Inferior 1 (process 3889819) exited with code 0144]
>     ```
>
>     From the backtrace, I found that the program failed an assertion in
>     "systemd_unit_exec_with_unit" because its `unit` argument (the
>     `path` passed by the caller) is 0x0.
>     I don't quite understand what could cause this assertion to fail.
>     Is it a bug or a configuration problem?
>
>     _______________________________________________
>     Manage your subscription:
>     https://lists.clusterlabs.org/mailman/listinfo/users
>
>     ClusterLabs home: https://www.clusterlabs.org/
>
>
>
> -- 
> Regards,
>
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA