[Pacemaker] cibadmin -Q: Call cib_query failed (-62): Timer expired

Andrew Beekhof andrew at beekhof.net
Thu Sep 26 22:25:28 EDT 2013


On 27/09/2013, at 8:45 AM, Radoslaw Garbacz <radoslaw.garbacz at xtremedatainc.com> wrote:

> Hi,
> 
> I have a problem starting up a cluster after upgrading corosync from
> 1.4 to 2.3.2 and pacemaker from 1.8 to 1.9.
> 
> All "crm_node" calls report well, but any CIB manipulation fails, i.e.:
> * crm_node -q: 1
> * crm_node -l: OK
> * crm_node -p: OK
> * cibadmin -Q: Call cib_query failed (-62): Timer expired

Does cibadmin -Ql work? (-l makes the query take effect locally instead of being forwarded to the master.)
If so, there might be a DC election going on (look for "crmd" entries in the logs).
Is the error transient or persistent?
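
The check above can be sketched in Python roughly as follows. This is an illustration only: the names run_query and interpret are made up here, the reading of each outcome follows the questions above rather than any Pacemaker documentation, and it assumes cibadmin is on the PATH and exits non-zero on failure.

```python
import subprocess


def run_query(local: bool, timeout_s: int = 60) -> int:
    """Run `cibadmin -Q` (or `cibadmin -Q -l`) and return its exit status.

    timeout_s is a safety net chosen for this sketch; in the report,
    cibadmin gives up by itself after about 30 seconds.
    """
    cmd = ["cibadmin", "-Q"] + (["-l"] if local else [])
    done = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout_s,
    )
    return done.returncode


def interpret(local_rc: int, cluster_rc: int) -> str:
    """Map the two exit statuses to a next step (labels are hypothetical)."""
    if cluster_rc == 0:
        # The forwarded query works now, so the earlier failure was transient.
        return "transient"
    if local_rc == 0:
        # Local CIB answers but the forwarded query does not:
        # look for crmd entries in the logs (possible DC election).
        return "check-crmd-logs"
    # Neither works; this case is not covered by the questions above.
    return "local-cib-not-answering"
```

On a cluster node it would be called as interpret(run_query(local=True), run_query(local=False)).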

> 
> No iptables, no SELinux, 3-node cluster; corosync.conf:
> ...
>        ringnumber: 0
>        bindnetaddr: ...
>        mcastport: 7800
>    }
> 
>    transport: udpu
> 
> 
> 
> Any help greatly appreciated.
> 
> 
> Below is some more information:
> 
> * pacemaker logs:
> 
> Sep 26 22:24:00 [2836] ip-10-114-210-162        cib:     info: crm_client_new:  Connecting 0x111b780 for uid=0 gid=0 pid=2883 id=977d6f23-963b-41a4-8fe0-a63024080d41
> Sep 26 22:24:00 [2836] ip-10-114-210-162        cib:     info: cib_process_request:     Forwarding cib_query operation for section 'all' to master (origin=local/cibadmin/2)
> Sep 26 22:24:30 [2836] ip-10-114-210-162        cib:     info: crm_client_destroy:      Destroying 0 events
> 
> 
> * ps axf | grep -E 'pacemaker|corosync':
> 
> 2806 ?        Ssl    0:10 corosync
> 2834 pts/1    S      0:00 pacemakerd
> 2836 ?        Ss     0:01  \_ /usr/libexec/pacemaker/cib
> 2837 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
> 2838 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
> 2839 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
> 2840 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
> 2841 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
> 
> 
> * strace cibadmin -Q:
> 
> open("/dev/shm/qb-cib_rw-event-2836-2897-12-data", O_RDWR) = 6
> ftruncate(6, 20480000)                  = 0
> mmap(NULL, 40960000, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa221692000
> mmap(0x7fa221692000, 20480000, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 6, 0) = 0x7fa221692000
> mmap(0x7fa222a1a000, 20480000, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 6, 0) = 0x7fa222a1a000
> close(6)                                = 0
> close(5)                                = 0
> close(6)                                = -1 EBADF (Bad file descriptor)
> fstat(4, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
> fcntl(4, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> sendto(4, "~", 1, MSG_NOSIGNAL, NULL, 0) = 1
> futex(0x7fa22df4cb60, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> gettimeofday({1380234692, 68879}, NULL) = 0
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> gettimeofday({1380234692, 69522}, NULL) = 0
> sendto(4, "\274", 1, MSG_NOSIGNAL, NULL, 0) = 1
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> gettimeofday({1380234692, 70085}, NULL) = 0
> gettimeofday({1380234692, 70197}, NULL) = 0
> poll([{fd=4, events=POLLIN}], 1, 30000) = 0 (Timeout)
> gettimeofday({1380234722, 91625}, NULL) = 0
> write(2, "Call cib_query failed (-62): Tim"..., 43Call cib_query failed (-62): Timer expired
> ) = 43
> poll([{fd=4, events=POLLIN}], 1, 0)     = 0 (Timeout)
> 
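
The numbers in the strace output above can be checked with a short Python sketch. All constants are copied from the trace; the helper names are made up for illustration. It shows that the client waited the full poll timeout before printing the error, and that the two MAP_FIXED mappings of the same file sit back to back inside the reserved region, which would be consistent with a ring buffer mapped twice (an interpretation, not something the trace states).

```python
# Values copied from the strace output quoted above.
MMAP_RESERVED = 40960000      # anonymous PROT_NONE reservation
RING_SIZE = 20480000          # ftruncate() size of the shared-memory file
FIRST_MAP = 0x7FA221692000    # first MAP_FIXED mapping of fd 6
SECOND_MAP = 0x7FA222A1A000   # second MAP_FIXED mapping of fd 6

# gettimeofday() results on either side of the final poll(..., 30000).
BEFORE_POLL = (1380234692, 70197)
AFTER_POLL = (1380234722, 91625)
POLL_TIMEOUT_MS = 30000


def elapsed_seconds(start, end):
    """Difference between two (seconds, microseconds) timevals."""
    return (end[0] - start[0]) + (end[1] - start[1]) / 1_000_000


def mapped_back_to_back(first, second, size, reserved):
    """True if the second mapping starts where the first ends and
    both together fill the reserved region exactly."""
    return second - first == size and reserved == 2 * size


waited = elapsed_seconds(BEFORE_POLL, AFTER_POLL)
print(f"waited {waited:.3f} s, poll timeout {POLL_TIMEOUT_MS / 1000:.0f} s")
print("back-to-back mappings:",
      mapped_back_to_back(FIRST_MAP, SECOND_MAP, RING_SIZE, MMAP_RESERVED))
```

So the IPC setup itself completed, and cibadmin then waited about 30 seconds for a reply that never came. The pacemaker log excerpt shows the same interval between the forwarded cib_query (22:24:00) and the client being destroyed (22:24:30).
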
> 
> * netstat -lxp:
> 
> Active UNIX domain sockets (only servers)
> Proto RefCnt Flags       Type       State         I-Node PID/Program name    Path
> unix  2      [ ACC ]     STREAM     LISTENING     20021  2836/cib            @cib_rw
> unix  2      [ ACC ]     STREAM     LISTENING     19958  2838/lrmd           @lrmd
> unix  2      [ ACC ]     STREAM     LISTENING     19789  2806/corosync       @quorum
> unix  2      [ ACC ]     STREAM     LISTENING     19786  2806/corosync       @cmap
> unix  2      [ ACC ]     STREAM     LISTENING     20020  2836/cib            @cib_ro
> unix  2      [ ACC ]     STREAM     LISTENING     20057  2837/stonithd       @stonith-ng
> unix  2      [ ACC ]     STREAM     LISTENING     19787  2806/corosync       @cfg
> unix  2      [ ACC ]     STREAM     LISTENING     19906  2834/pacemakerd     @pacemakerd
> unix  2      [ ACC ]     STREAM     LISTENING     19788  2806/corosync       @cpg
> unix  2      [ ACC ]     STREAM     LISTENING     20022  2836/cib            @cib_shm
> unix  2      [ ACC ]     STREAM     LISTENING     19985  2840/pengine        @pengine
> 
> 
> 
> Thanks in advance,
> 
> -- 
> Best Regards,
> 
> Radoslaw Garbacz
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
