[Pacemaker] The active trap of the SNMP is delayed.

renayama19661014 at ybb.ne.jp
Tue Jun 14 21:29:35 EDT 2011


Hi All,

I found a problem with the SNMP traps sent by hbagent.

The trap reporting that a node is active can apparently be delayed.

The problem is intermittent; it does not occur every time.


I confirmed it with the following procedure.

Step1) Start the nodes.

============
Last updated: Wed Jun 15 19:23:39 2011
Stack: Heartbeat
Current DC: srv02 (afe72fff-b7b4-4663-b845-872df29c635d) - partition WITHOUT quorum
Version: 1.0.11-6e010d6b0d49a6b929d17c0114e9d2d934dc8e04
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ srv01 srv02 ]

 Resource Group: group-1
     prmDummy1  (ocf::heartbeat:Dummy): Started srv01

Migration summary:
* Node srv02: 
* Node srv01: 


Step2) Block one of the interfaces used for Heartbeat communication.

# iptables -A INPUT -i eth1 -s ! 192.168.10.110 -j DROP
# iptables -A INPUT -i eth1 -s ! 192.168.10.120 -j DROP


Step3) The SNMP manager receives the following traps.

(snip)
Jun 15 19:24:30 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:30 <UNKNOWN> [UDP: [192.168.40.120]:59010]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23014) 0:03:50.14       SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAIFStatusUpdate        LINUX-HA-MIB::LHANodeName = STRING: srv01       LINUX-HA-MIB::LHAIFName = STRING: eth1       LINUX-HA-MIB::LHAIFStatus = INTEGER: down(2) 
   ----> No problem.
Jun 15 19:24:32 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:32 <UNKNOWN> [UDP: [192.168.40.110]:44001]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23597) 0:03:55.97       SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHANodeStatusUpdate      LINUX-HA-MIB::LHANodeName = STRING: srv02       LINUX-HA-MIB::LHANodeStatus = INTEGER: active(3)
   ----> The active trap should not arrive at this point.
Jun 15 19:24:34 snmp-manager snmptrapd[4771]: 2011-06-15 19:24:34 <UNKNOWN> [UDP: [192.168.40.110]:44001]: DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (23803) 0:03:58.03       SNMPv2-MIB::snmpTrapOID.0 = OID: LINUX-HA-MIB::LHAIFStatusUpdate        LINUX-HA-MIB::LHANodeName = STRING: srv02       LINUX-HA-MIB::LHAIFName = STRING: eth1       LINUX-HA-MIB::LHAIFStatus = INTEGER: down(2) 
   ----> No problem.
(snip)

It is strange that the active trap for the node arrives between the two interface-down traps.

I think the active trap should have been sent earlier.
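
To make the order of the traps in Step3 easier to see, a small filter like the one below may help. It is only a sketch: the function name trap_order is made up for this mail, the log path in the comment is an assumption, and the field positions assume the snmptrapd syslog format shown in Step3.

```shell
# Print one short line per trap: time, node name, trap name, status.
# Reads snmptrapd syslog lines on stdin (format assumed to match Step3).
# Example (log path is an assumption):
#   grep snmptrapd /var/log/messages | trap_order
trap_order() {
    awk '/snmpTrapOID/ {
        trap = node = ""
        for (i = 1; i < NF; i++) {
            # The trap name is the field right after "OID:".
            if ($i == "OID:") trap = $(i + 1)
            # "LHANodeName = STRING: srv01": the value is three fields later.
            if ($i == "LINUX-HA-MIB::LHANodeName") node = $(i + 3)
        }
        sub(/^LINUX-HA-MIB::/, "", trap)
        # $3 is the syslog time; the last field is the status value.
        print $3, node, trap, $NF
    }'
}
```

For the three traps in Step3 this prints:

19:24:30 srv01 LHAIFStatusUpdate down(2)
19:24:32 srv02 LHANodeStatusUpdate active(3)
19:24:34 srv02 LHAIFStatusUpdate down(2)

The LHANodeStatusUpdate line between the two LHAIFStatusUpdate lines is the trap in question.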


This problem seems to happen in Heartbeat 2.1.4.

I looked through some of the source code, and I suspect that Heartbeat's client_lib is somehow at fault.
The transmitted F_STATUS message seems to be handled late.


Best Regards,
Hideo Yamauchi.




