[ClusterLabs] [Problem] The SNMP trap which has been already started is transmitted.
renayama19661014 at ybb.ne.jp
Mon Aug 17 03:05:15 CEST 2015
Hi Andrew,
Thank you for comments.
I will confirm it tomorrow.
I am on vacation today.
Best Regards,
Hideo Yamauchi.
----- Original Message -----
> From: Andrew Beekhof <andrew at beekhof.net>
> To: renayama19661014 at ybb.ne.jp; Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
> Cc:
> Date: 2015/8/17, Mon 09:30
> Subject: Re: [ClusterLabs] [Problem] The SNMP trap which has been already started is transmitted.
>
>
>> On 4 Aug 2015, at 7:36 pm, renayama19661014 at ybb.ne.jp wrote:
>>
>> Hi Andrew,
>>
>> Thank you for comments.
>>
>>>> However, a trap of crm_mon is sent to an SNMP manager.
>>>
>>> Are you using the built-in SNMP logic or using -E to give crm_mon a script which is then producing the trap?
>>> (I’m trying to figure out who could be turning the monitor action into a start)
>>
>>
>> I used the built-in SNMP support.
>> I started crm_mon as a daemon with the -d option.
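For context, crm_mon's built-in SNMP support in the 1.1.x series is normally enabled with options along these lines (a sketch only; the exact invocation and trap destination are not shown in this thread):

```shell
# Hypothetical invocation; the addresses and community string are made up.
# -d / --daemonize       run crm_mon in the background
# -S / --snmp-traps      address of the SNMP manager to send traps to
# -C / --snmp-community  community string for the traps, if required
crm_mon -d -S 192.168.40.2 -C public
```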
>
> Is it running on both nodes or just snmp1?
> Because there is no logic in crm_mon that would have remapped the monitor to start, so my working theory is that it's a duplicate of an old event.
> Can you tell which node the trap is being sent from?
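One way to check which node the traps come from is to watch the trap port on the SNMP manager (a sketch; it assumes the default trap port 162, which is not stated in the thread):

```shell
# Run on the SNMP manager; shows the source IP of each incoming trap.
# Assumes traps arrive on the default UDP port 162.
tcpdump -n -i any udp port 162
# snmptrapd's own log also records the sender, e.g.
# "[UDP: [192.168.40.100]:35265->[192.168.40.2]]" in the excerpt below,
# where 192.168.40.100 is the snmp1 node.
```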
>
>>
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> ----- Original Message -----
>>> From: Andrew Beekhof <andrew at beekhof.net>
>>> To: renayama19661014 at ybb.ne.jp; Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
>>> Cc:
>>> Date: 2015/8/4, Tue 14:15
>>> Subject: Re: [ClusterLabs] [Problem] The SNMP trap which has been already started is transmitted.
>>>
>>>
>>>> On 27 Jul 2015, at 4:18 pm, renayama19661014 at ybb.ne.jp wrote:
>>>>
>>>> Hi All,
>>>>
>>>> There seems to be a problem with crm_mon's SNMP trap transmission.
>>>> I confirmed the problem on the latest Pacemaker and on Pacemaker 1.1.13.
>>>>
>>>>
>>>> Step 1) I construct a cluster and load a simple CLI file.
>>>>
>>>> [root at snmp1 ~]# crm_mon -1
>>>> Last updated: Mon Jul 27 14:40:37 2015          Last change: Mon Jul 27 14:40:29 2015 by root via cibadmin on snmp1
>>>> Stack: corosync
>>>> Current DC: snmp1 (version 1.1.13-3d781d3) - partition with quorum
>>>> 2 nodes and 1 resource configured
>>>>
>>>> Online: [ snmp1 snmp2 ]
>>>>
>>>> prmDummy (ocf::heartbeat:Dummy): Started snmp1
>>>>
>>>> Step 2) I stop the standby node once.
>>>>
>>>> [root at snmp2 ~]# stop pacemaker
>>>> pacemaker stop/waiting
>>>>
>>>>
>>>> Step 3) I start the standby node again.
>>>> [root at snmp2 ~]# start pacemaker
>>>> pacemaker start/running, process 2284
>>>>
>>>> Step 4) The crm_mon display shows no particular change.
>>>> [root at snmp1 ~]# crm_mon -1
>>>> Last updated: Mon Jul 27 14:45:12 2015          Last change: Mon Jul 27 14:40:29 2015 by root via cibadmin on snmp1
>>>> Stack: corosync
>>>> Current DC: snmp1 (version 1.1.13-3d781d3) - partition with quorum
>>>> 2 nodes and 1 resource configured
>>>>
>>>> Online: [ snmp1 snmp2 ]
>>>>
>>>> prmDummy (ocf::heartbeat:Dummy): Started snmp1
>>>>
>>>>
>>>> In addition, nothing changes for the resource started on the snmp1 node.
>>>>
>>>> -------
>>>> Jul 27 14:41:39 snmp1 crmd[29116]: notice: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: Completed cib_modify operation for section status: OK (rc=0, origin=snmp1/attrd/11, version=0.4.20)
>>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete: OK (0)
>>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete[snmp1]=true: OK (0)
>>>> Jul 27 14:41:39 snmp1 attrd[29114]: info: Update 11 for probe_complete[snmp2]=true: OK (0)
>>>> Jul 27 14:41:39 snmp1 cib[29202]: info: Wrote version 0.4.0 of the CIB to disk (digest: a1f1920279fe0b1466a79cab09fa77d6)
>>>> Jul 27 14:41:39 snmp1 pengine[29115]: notice: On loss of CCM Quorum: Ignore
>>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Node snmp2 is online
>>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Node snmp1 is online
>>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: prmDummy#011(ocf::heartbeat:Dummy):#011Started snmp1
>>>> Jul 27 14:41:39 snmp1 pengine[29115]: info: Leave prmDummy#011(Started snmp1)
>>>> -------
>>>>
>>>> However, a trap of crm_mon is sent to an SNMP manager.
>>>
>>> Are you using the built-in SNMP logic or using -E to give crm_mon a script which is then producing the trap?
>>> (I’m trying to figure out who could be turning the monitor action into a start)
>>>
>>>> The resource does not restart, but an SNMP trap saying the resource started is sent.
>>>>
>>>> -------
>>>> Jul 27 14:41:39 SNMP-MANAGER snmptrapd[4521]: 2015-07-27 14:41:39 snmp1 [UDP: [192.168.40.100]:35265->[192.168.40.2]]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1437975699) 166 days, 10:22:36.99#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotification#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy"#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "snmp1"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "start"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "OK"#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0
>>>> Jul 27 14:41:39 SNMP-MANAGER snmptrapd[4521]: 2015-07-27 14:41:39 snmp1 [UDP: [192.168.40.100]:35265->[192.168.40.2]]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1437975699) 166 days, 10:22:36.99#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotification#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy"#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "snmp1"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "monitor"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "OK"#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0
>>>> -------
>>>>
>>>> The CIB diff produced by stopping and starting the node seems to cause the problem.
>>>> Because of this diff, crm_mon transmits an unnecessary SNMP trap.
>>>> -------
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: + /cib: @num_updates=19
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: + /cib/status/node_state[@id='3232238190']: @crm-debug-origin=do_update_resource
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ /cib/status/node_state[@id='3232238190']/lrm[@id='3232238190']/lrm_resources: <lrm_resource id="prmDummy" type="Dummy" class="ocf" provider="heartbeat"/>
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ <lrm_rsc_op id="prmDummy_last_0" operation_key="prmDummy_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.10" transition-key="6:6:7:34187f48-1f81-49c8-846e-ff3ed4c8f787" transition-magic="0:7;6:6:7:34187f48-1f81-49c8-846e-ff3ed4c8f787" on_node="snmp2" call-id="5" rc-code="7" op-status="0" interval="0" last-run="1437975699" last-rc-change="1437975699" exec-time="18" queue-ti
>>>> Jul 27 14:41:39 snmp1 cib[29111]: info: ++ </lrm_resource>
>>>> -------
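The diff above re-adds an lrm_rsc_op describing an old probe result (on_node=snmp2, call-id=5, rc-code=7, last-rc-change=1437975699). Purely to illustrate the deduplication idea, and not crm_mon's actual internals: a trap consumer could suppress such replayed entries by keying on those identifying attributes, since a replayed event carries the same key as the original.

```shell
# Illustrative sketch: deduplicate operation events by their identity
# (node, call-id, last-rc-change). awk prints each distinct key only once,
# so the replayed copy of the same lrm_rsc_op is dropped.
printf '%s\n' \
  'snmp2 call-id=5 last-rc-change=1437975699 rc-code=7' \
  'snmp2 call-id=5 last-rc-change=1437975699 rc-code=7' \
  | awk '!seen[$0]++'
# prints the key once, not twice
```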
>>>>
>>>> I registered this problem with Bugzilla.
>>>> * http://bugs.clusterlabs.org/show_bug.cgi?id=5245
>>>> * I attached the log to Bugzilla.
>>>>
>>>> Best Regards,
>>>> Hideo Yamauchi.
>>>>
>>>> _______________________________________________
>>>> Users mailing list: Users at clusterlabs.org
>>>> http://clusterlabs.org/mailman/listinfo/users
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org
>>>
>>
>