[ClusterLabs] Pacemaker crashed and produce a coredump file
Klaus Wenninger
kwenning at redhat.com
Thu Jul 30 04:04:47 EDT 2020
On 7/29/20 10:39 AM, Reid Wahl wrote:
> Hi,
>
> It looks like this is a bug that was fixed in later releases. The
> `path` variable was a null pointer when it was passed to
> `systemd_unit_exec_with_unit` as the `unit` argument. Commit 62a0d26a
> <https://github.com/ClusterLabs/pacemaker/commit/62a0d26a8f85fbcee9b56524ea3f1ae0171cbe52#diff-00b989f66499e2081134c17c06d2b359R201>
> adds a null check to the `path` variable before using it to call
> `systemd_unit_exec_with_unit`.
>
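For illustration, here is a minimal sketch of the kind of guard described above. It is not the code from commit 62a0d26a: `fake_op_t`, `fake_loadunit_result`, `fake_exec_with_unit` and the return codes are hypothetical stand-ins, and reporting "not installed" when the path is null is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the operation object; not Pacemaker's type. */
typedef struct {
    const char *name;
    int rc;
} fake_op_t;

/* Hypothetical result codes, chosen only for this sketch. */
enum { FAKE_OK = 0, FAKE_NOT_INSTALLED = 5 };

/* Mimics a callee that asserts on its argument, like CRM_ASSERT(unit)
 * in the backtrace: calling it with NULL aborts the process. */
void fake_exec_with_unit(fake_op_t *op, const char *unit)
{
    assert(unit != NULL);
    op->rc = FAKE_OK;
}

/* The guard: check the looked-up path before handing it on.
 * Assumption: a missing path is reported as an error on the op
 * instead of reaching the asserting callee. */
int fake_loadunit_result(fake_op_t *op, const char *path)
{
    if (path == NULL) {
        op->rc = FAKE_NOT_INSTALLED;
        return op->rc;
    }
    fake_exec_with_unit(op, path);
    return op->rc;
}
```

With this shape, a failed lookup ends in an error code on the operation rather than in abort(), which is the difference between the crash in the backtrace and the behaviour after the fix as described.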
Hmm ... the call tree shows the D-Bus API being used inside
dbus_connection_dispatch, which IIRC isn't allowed.
Could be related to https://github.com/ClusterLabs/pacemaker/pull/1201.
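To make the re-entrancy concern concrete, here is a minimal sketch of one common way to avoid it: the status callback only records that work is pending, and the actual dispatch runs later from the main loop. This does not use libdbus and is not taken from the pull request above; `demo_conn_t`, `demo_dispatch`, `demo_status_changed` and `demo_idle` are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection model; not a libdbus type. */
typedef struct {
    int queued;           /* messages waiting to be dispatched */
    int dispatched;       /* messages handled so far */
    bool in_dispatch;     /* true while demo_dispatch() is running */
    bool needs_dispatch;  /* set by the status callback, cleared by the idle step */
    int reentrant_calls;  /* counts attempts to dispatch from inside dispatch */
} demo_conn_t;

void demo_status_changed(demo_conn_t *c);

/* Handle one queued message. A nested call is refused and counted. */
void demo_dispatch(demo_conn_t *c)
{
    assert(c != NULL);
    if (c->in_dispatch) {
        c->reentrant_calls++;
        return;
    }
    c->in_dispatch = true;
    if (c->queued > 0) {
        c->queued--;
        c->dispatched++;
        /* The status notification fires while we are still dispatching. */
        demo_status_changed(c);
    }
    c->in_dispatch = false;
}

/* Status callback: never dispatches, only notes that more work remains. */
void demo_status_changed(demo_conn_t *c)
{
    if (c->queued > 0) {
        c->needs_dispatch = true;
    }
}

/* Main-loop idle step: drains pending work outside of any callback. */
void demo_idle(demo_conn_t *c)
{
    while (c->needs_dispatch) {
        c->needs_dispatch = false;
        demo_dispatch(c);
    }
}
```

Starting with three queued messages, calling `demo_status_changed` once and then `demo_idle` handles all three without ever entering `demo_dispatch` recursively.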
Klaus
> I believe pacemaker-1.1.15-11.el7 is the first RHEL pacemaker release
> that contains the fix. Can you upgrade and see if the issue is resolved?
>
> On Tue, Jul 28, 2020 at 11:49 PM lkxjtu <lkxjtu at 163.com> wrote:
>
> RPM Version Information:
> corosync-2.3.4-7.el7_2.1.x86_64
> pacemaker-1.1.12-22.el7.x86_64
>
>
> Coredump file backtrace:
>
> ```
> warning: .dynamic section for "/lib64/libk5crypto.so.3" is not at
> the expected address (wrong library or version mismatch?)
> Missing separate debuginfo for
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/91/375124d864f2692ced1c4a5f090826b7074dc0
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/libexec/pacemaker/lrmd'.
> Program terminated with signal 6, Aborted.
> #0 0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install
> audit-libs-2.4.1-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64
> corosynclib-2.3.4-7.el7_2.1.x86_64 dbus-libs-1.6.12-13.el7.x86_64
> glib2-2.42.2-5.el7.x86_64 glibc-2.17-106.el7_2.4.x86_64
> gmp-6.0.0-12.el7_1.x86_64 gnutls-3.3.8-14.el7_2.x86_64
> keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64
> libcom_err-1.42.9-7.el7.x86_64 libffi-3.0.13-16.el7.x86_64
> libqb-0.17.1-2.el7.1.x86_64 libselinux-2.5-12.el7.x86_64
> libtasn1-3.8-2.el7.x86_64 libtool-ltdl-2.4.2-21.el7_2.x86_64
> libuuid-2.23.2-26.el7_2.2.x86_64 libxml2-2.9.1-6.el7_2.2.x86_64
> libxslt-1.1.28-5.el7.x86_64 nettle-2.7.1-4.el7.x86_64
> openssl-libs-1.0.1e-51.el7_2.4.x86_64 p11-kit-0.20.7-3.el7.x86_64
> pam-1.1.8-12.el7_1.1.x86_64 pcre-8.32-15.el7.x86_64
> trousers-0.3.13-1.el7.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64
> zlib-1.2.7-15.el7.x86_64
> (gdb) bt
> #0 0x00007f6ee9ed85f7 in raise () from /lib64/libc.so.6
> #1 0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
> #2 0x00007f6eeb7e1f67 in crm_abort
> (file=file@entry=0x7f6eeb5c863e "systemd.c",
> function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
> "systemd_unit_exec_with_unit", line=line@entry=514,
> assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
> do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0)
> at utils.c:1197
> #3 0x00007f6eeb5c5cef in systemd_unit_exec_with_unit
> (op=op@entry=0x13adce0, unit=0x0) at systemd.c:514
> #4 0x00007f6eeb5c5e81 in systemd_loadunit_result
> (reply=reply@entry=0x139f2a0, op=op@entry=0x13adce0) at systemd.c:175
> #5 0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
> user_data=0x13adce0) at systemd.c:197
> #6 0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
> /lib64/libdbus-1.so.3
> #7 0x00007f6eeb172b51 in dbus_connection_dispatch () from
> /lib64/libdbus-1.so.3
> #8 0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
> (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS,
> data=0x0) at dbus.c:388
> #9 0x00007f6eeb171260 in
> _dbus_connection_update_dispatch_status_and_unlock () from
> /lib64/libdbus-1.so.3
> #10 0x00007f6eeb172a93 in reply_handler_timeout () from
> /lib64/libdbus-1.so.3
> #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch
> (data=0x13aa660) at dbus.c:491
> #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
> /lib64/libglib-2.0.so.0
> #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
> /lib64/libglib-2.0.so.0
> #15 0x00007f6ee97a1dca in g_main_loop_run () from
> /lib64/libglib-2.0.so.0
> #16 0x0000000000402824 in main (argc=<optimized out>,
> argv=0x7ffce752b258) at main.c:344
> (gdb) up
> #1 0x00007f6ee9ed9ce8 in abort () from /lib64/libc.so.6
> (gdb) up
> #2 0x00007f6eeb7e1f67 in crm_abort
> (file=file@entry=0x7f6eeb5c863e "systemd.c",
> function=function@entry=0x7f6eeb5c8ec0 <__FUNCTION__.33949>
> "systemd_unit_exec_with_unit", line=line@entry=514,
> assert_condition=assert_condition@entry=0x7f6eeb5c8713 "unit",
> do_core=do_core@entry=1, do_fork=<optimized out>, do_fork@entry=0)
> at utils.c:1197
> 1197 abort();
> (gdb) up
> #3 0x00007f6eeb5c5cef in systemd_unit_exec_with_unit
> (op=op@entry=0x13adce0, unit=0x0) at systemd.c:514
> 514 CRM_ASSERT(unit);
> (gdb) up
> #4 0x00007f6eeb5c5e81 in systemd_loadunit_result
> (reply=reply@entry=0x139f2a0, op=op@entry=0x13adce0) at systemd.c:175
> 175 systemd_unit_exec_with_unit(op, path);
> (gdb) up
> #5 0x00007f6eeb5c6181 in systemd_loadunit_cb (pending=0x13aa380,
> user_data=0x13adce0) at systemd.c:197
> 197 systemd_loadunit_result(reply, user_data);
> (gdb) up
> #6 0x00007f6eeb16f862 in complete_pending_call_and_unlock () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #7 0x00007f6eeb172b51 in dbus_connection_dispatch () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #8 0x00007f6eeb5c1e40 in pcmk_dbus_connection_dispatch
> (connection=0x13a4cb0, new_status=DBUS_DISPATCH_DATA_REMAINS,
> data=0x0) at dbus.c:388
> 388 dbus_connection_dispatch(connection);
> (gdb) up
> #9 0x00007f6eeb171260 in
> _dbus_connection_update_dispatch_status_and_unlock () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #10 0x00007f6eeb172a93 in reply_handler_timeout () from
> /lib64/libdbus-1.so.3
> (gdb) up
> #11 0x00007f6eeb5c1daf in pcmk_dbus_timeout_dispatch
> (data=0x13aa660) at dbus.c:491
> 491 dbus_timeout_handle(data);
> (gdb) up
> #12 0x00007f6ee97a21c3 in g_timeout_dispatch () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #13 0x00007f6ee97a17aa in g_main_context_dispatch () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #14 0x00007f6ee97a1af8 in g_main_context_iterate.isra.24 () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #15 0x00007f6ee97a1dca in g_main_loop_run () from
> /lib64/libglib-2.0.so.0
> (gdb) up
> #16 0x0000000000402824 in main (argc=<optimized out>,
> argv=0x7ffce752b258) at main.c:344
> 344 g_main_run(mainloop);
> (gdb) r
> Starting program: /usr/libexec/pacemaker/lrmd
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [Inferior 1 (process 3889819) exited with code 0144]
> ```
>
> From the backtrace, I found that the assertion in
> "systemd_unit_exec_with_unit" failed because the "unit" argument
> (passed in as "path") is "0x0".
> I don't quite understand what could cause this assertion to fail.
> Is it a bug or a configuration problem?
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
>
>
> --
> Regards,
>
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA