[ClusterLabs] Coming in 1.1.15: Event-driven alerts
renayama19661014 at ybb.ne.jp
Wed May 11 21:28:59 UTC 2016
Hi Klaus,
Thank you for your comments.
I will go over them.
I expect to come back to you with further questions.
Many thanks!
Hideo Yamauchi.
----- Original Message -----
> From: Klaus Wenninger <kwenning at redhat.com>
> To: users at clusterlabs.org
> Cc:
> Date: 2016/5/11, Wed 14:13
> Subject: Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts
>
> On 05/10/2016 11:19 PM, renayama19661014 at ybb.ne.jp wrote:
>> Hi All,
>>
>> After discussing it, our members still need control over the order in which
>> SNMP traps are sent.
>>
>> We will prepare a patch that controls the transmission order and intend to
>> send it.
>>
>> The patch will probably add the "ordered" attribute that we proposed in an
>> earlier email.
> Actually I still don't think that simple serialization of the calling of
> the snmptrap-tool is a good solution to tackle the problem of losing the
> order of traps arriving at some management station:
>
> - makes things worse in case of traps coming from multiple nodes
> - doesn't help when the order is lost on the network.
>
> Anyway I see 2 other scenarios where a certain degree of serialization might
> be helpful:
>
> - alert agent-scripts that can't handle being called concurrently
> - performance issues that might arise on some systems that lack the
> performance-headroom needed and/or the agent-scripts in place
> require significant effort and/or there are a lot of resources/events
> that trigger a vast amount of alerts being handled in parallel
>
> So I could imagine the introduction of a meta-attribute that specifies a
> queue to be used for serialization.
>
> - 'none' is default and leads to the behavior we have at the moment.
> - any other queue-name leads to the instantiation of an additional queue
>
> This approach should allow nearly any kind of serialization you can think of
> with as little impact as needed.
> e.g. if the agent doesn't cope with concurrent calls you use a queue per
> agent leading to all recipients being handled in a serialized way (and of
> course the different alerts as well). And all the other agents are running
> in parallel.
> e.g. you can have a separate queue for a single recipient leading to
> the alerts being sent there being serialized.
> e.g. if the performance impact should be kept at a minimal level you
> would use a single queue for all agents and all recipients
>
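The per-queue serialization described above could be approximated outside Pacemaker with a small wrapper around the agent call. This is only a sketch under stated assumptions: `run_in_queue` and `QUEUE_DIR` are hypothetical names, no such queue meta-attribute exists yet, and `flock(1)` from util-linux is assumed to be installed. The lock gives mutual exclusion per queue name; it does not by itself guarantee first-come-first-served ordering among waiting callers.

```shell
#!/bin/sh
# Hypothetical wrapper: serialize alert-agent runs per named queue.
# QUEUE_DIR and run_in_queue are illustrative names, not Pacemaker features.
QUEUE_DIR="${QUEUE_DIR:-/tmp/alert-queues}"

run_in_queue() {
    queue="$1"; shift
    if [ "$queue" = "none" ]; then
        # Default behavior: run immediately, possibly in parallel with others.
        "$@"
        return
    fi
    mkdir -p "$QUEUE_DIR" || return 1
    # flock(1) holds an exclusive lock on the queue file while the command
    # runs, so commands sharing a queue name never overlap.
    flock "$QUEUE_DIR/$queue.lock" "$@"
}
```

For example, `run_in_queue snmp /path/to/agent.sh` would serialize all calls that use the queue name `snmp`, while calls with `none` or a different queue name run independently.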
>>
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>> ----- Original Message -----
>>> From: "renayama19661014 at ybb.ne.jp" <renayama19661014 at ybb.ne.jp>
>>> To: "kwenning at redhat.com" <kwenning at redhat.com>; "users at clusterlabs.org" <users at clusterlabs.org>; Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
>>> Cc:
>>> Date: 2016/4/28, Thu 22:43
>>> Subject: Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts
>>>
>>> Hi Klaus,
>>>
>>> Because the script is executed asynchronously, I think it is difficult to
>>> set "uptime" with the method shown in the sample.
>>> We may still end up requesting control of the transmission order.
>>> # My earlier patch only controls the execution order of the asynchronous
>>> # calls and does not put load on crmd.
>>>
>>> Japan starts a one-week holiday tomorrow.
>>> I will discuss this with our members after the holiday.
>>>
>>> Best Regards,
>>> Hideo Yamauchi.
>>>
>>>
>>>
>>> ----- Original Message -----
>>>> From: Klaus Wenninger <kwenning at redhat.com>
>>>> To: users at clusterlabs.org
>>>> Cc:
>>>> Date: 2016/4/28, Thu 03:14
>>>> Subject: Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts
>>>>
>>>> On 04/27/2016 04:19 PM, renayama19661014 at ybb.ne.jp wrote:
>>>>> Hi All,
>>>>>
>>>>> We have a request for a new SNMP function.
>>>>>
>>>>>
>>>>> The order of the traps is not right.
>>>>>
>>>>> The traps sometimes do not arrive in the order of the events.
>>>>> This is because the alert handling executes each "path" asynchronously.
>>>>> I think it is necessary to wait for the execution to complete for each
>>>>> "path" of the "alerts".
>>>>>
>>>>> In the logs below, the order of the traps differs from the actual stop
>>>>> order of the resources.
>>>> Writing the alerts in a local list and having the alert-scripts called
>>>> in a serialized manner would lead to the snmptrap-tool creating
>>>> timestamps in the order of the occurrence of the alerts.
>>>> If the snmp-manager orders the traps by timestamp, this would indeed
>>>> lead to seeing them in the order in which they occurred.
>>>>
>>>> But this approach has a number of drawbacks:
>>>>
>>>> - it works only when the traps are coming from one node, as there is no
>>>>   way to serialize over nodes - at least none that would work under all
>>>>   circumstances in which we want alerts to be delivered
>>>>
>>>> - it distorts the timestamps even further from the points in time when
>>>>   the alerts were triggered - making the result in a multi-node scenario
>>>>   even worse and making it hard to correlate with other sources of
>>>>   information like logfiles
>>>>
>>>> - if you imagine a scenario with multiple mechanisms of delivering an
>>>>   alert + multiple recipients, we couldn't use a single list; we would
>>>>   need something more complicated to prevent unneeded delays, delays
>>>>   coming from one of the delivery methods not working properly due to
>>>>   e.g. a recipient that is not reachable, ...
>>>>   (all solvable of course, but if it doesn't solve your problem in the
>>>>   first place, why the effort?)
>>>>
>>>> The alternative approach taken doesn't create the timestamps in the
>>>> scripts but provides timestamps to the scripts already.
>>>> This way it doesn't matter if the execution of the script is delayed.
>>>>
>>>>
>>>> A short example how this approach could be used with snmp-traps:
>>>>
>>>> edit pcmk_snmp_helper.sh:
>>>>
>>>> ...
>>>> starttickfile="/var/run/starttick"
>>>>
>>>> # hack to have a reference
>>>> # can have it e.g. in an attribute to be visible throughout the cluster
>>>> if [ ! -f "${starttickfile}" ] ; then
>>>>     echo "${CRM_alert_timestamp}" > "${starttickfile}"
>>>> fi
>>>>
>>>> starttick=$(cat "${starttickfile}")
>>>> # ticks elapsed since the reference (arithmetic expansion, not eval)
>>>> ticks=$(( CRM_alert_timestamp - starttick ))
>>>>
>>>> # send a trap for every non-monitor operation and for failed monitors
>>>> if [[ ${CRM_alert_rc} != 0 && ${CRM_alert_task} == "monitor" ]] || [[ ${CRM_alert_task} != "monitor" ]] ; then
>>>>     # This trap is compliant with PACEMAKER MIB
>>>>     # https://github.com/ClusterLabs/pacemaker/blob/master/extra/PCMK-MIB.txt
>>>>     /usr/bin/snmptrap -v 2c -c public ${CRM_alert_recipient} ${ticks} PACEMAKER-MIB::pacemakerNotificationTrap \
>>>>         PACEMAKER-MIB::pacemakerNotificationNode s "${CRM_alert_node}" \
>>>>         PACEMAKER-MIB::pacemakerNotificationResource s "${CRM_alert_rsc}" \
>>>>         PACEMAKER-MIB::pacemakerNotificationOperation s "${CRM_alert_task}" \
>>>>         PACEMAKER-MIB::pacemakerNotificationDescription s "${CRM_alert_desc}" \
>>>>         PACEMAKER-MIB::pacemakerNotificationStatus i "${CRM_alert_status}" \
>>>>         PACEMAKER-MIB::pacemakerNotificationReturnCode i ${CRM_alert_rc} \
>>>>         PACEMAKER-MIB::pacemakerNotificationTargetReturnCode i ${CRM_alert_target_rc} && exit 0 || exit 1
>>>> fi
>>>>
>>>> exit 0
>>>> ...
>>>>
>>>> add a section to the cib:
>>>>
>>>> cibadmin --create --xml-text '<configuration>
>>>>   <alerts>
>>>>     <alert id="snmp_traps" path="/usr/share/pacemaker/tests/pcmk_snmp_helper.sh">
>>>>       <meta_attributes id="meta_snmp_traps">
>>>>         <nvpair id="snmp_timestamp" name="tstamp_format" value="%s%02N"/>
>>>>       </meta_attributes>
>>>>       <recipient id="trap_destination" value="192.168.123.3"/>
>>>>     </alert>
>>>>   </alerts>
>>>> </configuration>'
>>>>
>>>>
>>>> This should solve the issue of correct order after being sorted by
>>>> timestamps
>>>> without having the ugly side-effects as described above.
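The tick calculation in the helper script can be isolated into a small function, which also makes one detail explicit: SNMP TimeTicks is an unsigned 32-bit counter in hundredths of a second, so a long-running reference eventually wraps. The sketch below assumes, as the example does, that `tstamp_format` `%s%02N` yields hundredths of a second since the epoch, and that the current timestamp is not earlier than the reference. `ticks_since` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: derive an SNMP TimeTicks value from two timestamps given in
# hundredths of a second. ticks_since is a hypothetical helper name.
ticks_since() {
    start="$1"   # reference timestamp (e.g. contents of the starttick file)
    now="$2"     # timestamp handed to the agent, e.g. CRM_alert_timestamp
    # TimeTicks is an unsigned 32-bit counter, so wrap at 2^32.
    echo $(( (now - start) % 4294967296 ))
}
```

In the helper script this could be called as `ticks=$(ticks_since "${starttick}" "${CRM_alert_timestamp}")`.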
>>>>
>>>> I hope I understood your scenario correctly and this small example
>>>> points out how I roughly would suggest to cope with the issue.
>>>>
>>>> Regards,
>>>> Klaus
>>>>> ----
>>>>> [root at rh72-01 ~]# grep Operation /var/log/ha-log | grep stop
>>>>> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy1_stop_0: ok (node=rh72-01, call=33, rc=0, cib-update=56, confirmed=true)
>>>>> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy3_stop_0: ok (node=rh72-01, call=37, rc=0, cib-update=57, confirmed=true)
>>>>> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy4_stop_0: ok (node=rh72-01, call=39, rc=0, cib-update=58, confirmed=true)
>>>>> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy2_stop_0: ok (node=rh72-01, call=35, rc=0, cib-update=59, confirmed=true)
>>>>> Apr 25 18:48:48 rh72-01 crmd[28897]: notice: Operation prmDummy5_stop_0: ok (node=rh72-01, call=41, rc=0, cib-update=60, confirmed=true)
>>>>> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN> [UDP: [192.168.28.170]:40613->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (25512486) 2 days, 22:52:04.86#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy3"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>>>>> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN> [UDP: [192.168.28.170]:39581->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (25512489) 2 days, 22:52:04.89#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy4"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>>>>> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN> [UDP: [192.168.28.170]:37166->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (25512490) 2 days, 22:52:04.90#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy1"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>>>>> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN> [UDP: [192.168.28.170]:53502->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (25512494) 2 days, 22:52:04.94#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy2"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>>>>> Apr 25 18:48:50 snmp-manager snmptrapd[6865]: 2016-04-25 18:48:50 <UNKNOWN> [UDP: [192.168.28.170]:45956->[192.168.28.189]:162]:#012DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (25512497) 2 days, 22:52:04.97#011SNMPv2-MIB::snmpTrapOID.0 = OID: PACEMAKER-MIB::pacemakerNotificationTrap#011PACEMAKER-MIB::pacemakerNotificationNode = STRING: "rh72-01"#011PACEMAKER-MIB::pacemakerNotificationResource = STRING: "prmDummy5"#011PACEMAKER-MIB::pacemakerNotificationOperation = STRING: "stop"#011PACEMAKER-MIB::pacemakerNotificationDescription = STRING: "ok"#011PACEMAKER-MIB::pacemakerNotificationStatus = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationReturnCode = INTEGER: 0#011PACEMAKER-MIB::pacemakerNotificationTargetReturnCode = INTEGER: 0
>>>>> ----
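To compare the order seen by the manager with the stop order in ha-log, the Timeticks value and resource name can be pulled out of the snmptrapd log and sorted. This is a rough sketch, assuming each trap occupies a single line in the actual syslog file (unlike the mail-wrapped quote) and that the field labels match the log quoted above. `trap_order` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read snmptrapd log lines on stdin and print "<timeticks> <resource>"
# for each trap, sorted numerically by timeticks.
# trap_order is a hypothetical helper; the patterns follow the log above.
trap_order() {
    sed -n 's/.*Timeticks: (\([0-9]*\)).*pacemakerNotificationResource = STRING: "\([^"]*\)".*/\1 \2/p' |
        sort -n
}
```

A call such as `grep snmptrapd /var/log/messages | trap_order` would then list the resources in the order of their trap timestamps; the log file path here is an assumption and depends on the syslog setup.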
>>>>>
>>>>> I think the "timestamp" attribute exists because of the asynchronous
>>>>> execution introduced by this change.
>>>>> Still, the order of traps may be important to a user.
>>>>> I suggest adding an "ordered" attribute to the "alert" element.
>>>>> * ordered
>>>>> false : The present processing.
>>>>> true : Control the transmission order of the trap.
>>>>>
>>>>> ----
>>>>> <configuration>
>>>>>   <alerts>
>>>>>     <alert id="notify_9" path="/usr/share/pacemaker/tests/pcmk_alert_sample1.sh" ordered="true">
>>>>>       (snip)
>>>>>     </alert>
>>>>>     <alert id="notify_9" path="/usr/share/pacemaker/tests/pcmk_alert_sample2.sh" ordered="false">
>>>>>       (snip)
>>>>>     </alert>
>>>>>   </alerts>
>>>>> </configuration>
>>>>>
>>>>> ----
>>>>>
>>>>> I sent a patch to address this problem earlier.
>>>>> That patch may be a useful basis for the change.
>>>>> * https://github.com/ClusterLabs/pacemaker/pull/847
>>>>>
>>>>> I intend to write the patch if everybody agrees to the "ordered"
>>>>> attribute.
>>>>> Best Regards,
>>>>> Hideo Yamauchi.
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list: Users at clusterlabs.org
>>>>> http://clusterlabs.org/mailman/listinfo/users
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started:
>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org