[ClusterLabs] Pacemaker crashed and produced a core dump file

Reid Wahl nwahl at redhat.com
Wed Jul 29 04:39:10 EDT 2020


Hi,

It looks like this is a bug that was fixed in later releases. The `path`
variable was a null pointer when `systemd_loadunit_result` passed it to
`systemd_unit_exec_with_unit` as the `unit` argument, which tripped the
`CRM_ASSERT(unit)` at systemd.c:514 and aborted lrmd. Commit 62a0d26a
<https://github.com/ClusterLabs/pacemaker/commit/62a0d26a8f85fbcee9b56524ea3f1ae0171cbe52#diff-00b989f66499e2081134c17c06d2b359R201>
adds a null check on the `path` variable before using it to call
`systemd_unit_exec_with_unit`.
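
To illustrate the pattern (this is only a rough sketch, not the actual
Pacemaker code or the contents of that commit), here is a minimal C example
of a callee that asserts on its argument and a caller that checks for NULL
first. All `demo_*` names, the struct, and the `DEMO_RC_NOT_INSTALLED` value
are hypothetical; what the real fix does with the operation when `path` is
NULL is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error code, used only for this illustration. */
#define DEMO_RC_NOT_INSTALLED 5

/* Hypothetical stand-in for the operation being executed. */
struct demo_op {
    const char *name;
    int rc;        /* 0 on success, nonzero on error */
    int executed;  /* set to 1 once the unit action was attempted */
};

/* Callee: like the CRM_ASSERT(unit) in the backtrace, this aborts the
 * process if it is handed a NULL unit. */
static void demo_exec_with_unit(struct demo_op *op, const char *unit)
{
    assert(unit != NULL);
    op->executed = 1;
    op->rc = 0;
}

/* Caller: checks the looked-up path before passing it on, so a failed
 * lookup becomes an error result instead of an abort. */
static int demo_loadunit_result(struct demo_op *op, const char *path)
{
    if (path == NULL) {
        op->rc = DEMO_RC_NOT_INSTALLED;
        return op->rc;
    }
    demo_exec_with_unit(op, path);
    return op->rc;
}
```

With the guard in place, calling `demo_loadunit_result(&op, NULL)` returns
the error code and never reaches the assertion; without it, the same call
would abort the process, which is what the core dump shows.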

I believe pacemaker-1.1.15-11.el7 is the first RHEL pacemaker release that
contains the fix. Can you upgrade and see if the issue is resolved?

On Tue, Jul 28, 2020 at 11:49 PM lkxjtu <lkxjtu at 163.com> wrote:

> RPM Version Information:
> corosync-2.3.4-7.el7_2.1.x86_64
> pacemaker-1.1.12-22.el7.x86_64
>
>
> Coredump file backtrace:
>
> ```
> warning: .dynamic section for "/lib64/libk5crypto.so.3" is not at the
> expected address (wrong library or version mismatch?)
> Missing separate debuginfo for
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/91/375124d864f2692ced1c4a5f090826b7074dc0
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/libexec/pacemaker/lrmd'.
> Program terminated with signal 6, Aborted.
> #0  0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install
> audit-libs-2.4.1-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64
> corosynclib-2.3.4-7.el7_2.1.x86_64 dbus-libs-1.6.12-13.el7.x86_64
> glib2-2.42.2-5.el7.x86_64 glibc-2.17-106.el7_2.4.x86_64
> gmp-6.0.0-12.el7_1.x86_64 gnutls-3.3.8-14.el7_2.x86_64
> keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64
> libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-16.el7.x86_64
> libqb-0.17.1-2.el7.1.x86_64 libselinux-2.5-12.el7.x86_64
> libtasn1-3.8-2.el7.x86_64 libtool-ltdl-2.4.2-21.el7_2.x86_64
> libuuid-2.23.2-26.el7_2.2.x86_64 libxml2-2.9.1-6.el7_2.2.x86_64
> libxslt-1.1.28-5.el7.x86_64 nettle-2.7.1-4.el7.x86_64
> openssl-libs-1.0.1e-51.el7_2.4.x86_64 p11-kit-0.20.7-3.el7.x86_64
> pam-1.1.8-12.el7_1.1.x86_64 pcre-8.32-15.el7.x86_64
> trousers-0.3.13-1.el7.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64
> zlib-1.2.7-15.el7.x86_64
> (gdb) bt
> #0  0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
> #1  0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
> #2  0x00007f6eeb7e1f67 in crm_abort (file=file@entry=0x7f6eeb5c863e
> "systemd.c", function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
> "systemd_unit_exec_with_unit", line=line@entry=514,
>     assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
> do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0) at
> utils.c:1197
> #3  0x00007f6eeb5c5cef in systemd_unit_exec_with_unit (op=op@entry=0x13adce0,
> unit=0x0) at systemd.c:514
> #4  0x00007f6eeb5c5e81 in systemd_loadunit_result (reply=reply@entry=0x139f2a0,
> op=op@entry=0x13adce0) at systemd.c:175
> #5  0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
> user_data=0x13adce0) at systemd.c:197
> #6  0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
> /lib64/libdbus-1.so.3
> #7  0x00007f6eeb172b51 in dbus_connection_dispatch () from
> /lib64/libdbus-1.so.3
> #8  0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
> (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS, data=0x0) at
> dbus.c:388
> #9  0x00007f6eeb171260 in
> _dbus_connection_update_dispatch_status_and_unlock () from
> /lib64/libdbus-1.so.3
> #10 0x00007f6eeb172a93 in reply_handler_timeout () from
> /lib64/libdbus-1.so.3
> #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch (data=0x13aa660) at
> dbus.c:491
> #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
> /lib64/libglib-2.0.so.0
> #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
> /lib64/libglib-2.0.so.0
> #15 0x00007f6ee97a1dca in g_main_loop_run () from /lib64/libglib-2.0.so.0
> #16 0x0000000000402824 in main (argc=<optimized out>, argv=0x7ffce752b258)
> at main.c:344
> (gdb) up
> #1  0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
> (gdb) up
> #2  0x00007f6eeb7e1f67 in crm_abort (file=file@entry=0x7f6eeb5c863e
> "systemd.c", function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
> "systemd_unit_exec_with_unit", line=line@entry=514,
>     assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
> do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0) at
> utils.c:1197
> 1197            abort();
> (gdb) up
> #3  0x00007f6eeb5c5cef in systemd_unit_exec_with_unit (op=op@entry=0x13adce0,
> unit=0x0) at systemd.c:514
> 514         CRM_ASSERT(unit);
> (gdb) up
> #4  0x00007f6eeb5c5e81 in systemd_loadunit_result (reply=reply@entry=0x139f2a0,
> op=op@entry=0x13adce0) at systemd.c:175
> 175             systemd_unit_exec_with_unit(op, path);
> (gdb) up
> #5  0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
> user_data=0x13adce0) at systemd.c:197
> 197         systemd_loadunit_result(reply, user_data);
> (gdb) up
> #6  0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #7  0x00007f6eeb172b51 in dbus_connection_dispatch () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #8  0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
> (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS, data=0x0) at
> dbus.c:388
> 388             dbus_connection_dispatch(connection);
> (gdb) up
> #9  0x00007f6eeb171260 in
> _dbus_connection_update_dispatch_status_and_unlock () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #10 0x00007f6eeb172a93 in reply_handler_timeout () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch (data=0x13aa660) at
> dbus.c:491
> 491         dbus_timeout_handle(data);
> (gdb) up
> #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #15 0x00007f6ee97a1dca in g_main_loop_run () from /lib64/libglib-2.0.so.0
> (gdb) up
> #16 0x0000000000402824 in main (argc=<optimized out>, argv=0x7ffce752b258)
> at main.c:344
> 344         g_main_run(mainloop);
> (gdb) r
> Starting program: /usr/libexec/pacemaker/lrmd
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [Inferior 1 (process 3889819) exited with code 0144]
> ```
>
> From the backtrace, I found that an assertion failed in the function
> "systemd_unit_exec_with_unit" because the "unit" parameter (passed in as
> "path") is "0x0". I don't understand what could cause this assertion to
> fail. Is it a bug or a configuration problem?


-- 
Regards,

Reid Wahl, RHCA
Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA