[Pacemaker] RFC: What part of the XML configuration do you hate the most?

Andrew Beekhof beekhof at gmail.com
Mon Oct 27 09:36:54 EDT 2008


On Oct 23, 2008, at 11:49 AM, Satomi TANIGUCHI wrote:

> Hi Andrew,
>
>
> Andrew Beekhof wrote:
>> On Sep 25, 2008, at 6:58 AM, Satomi TANIGUCHI wrote:
>>> Hi Andrew!
>>>
>>> Thank you so much for taking care of this patch!
>>>
>>>
>>> Andrew Beekhof wrote:
>>>> On a technical level, the use of inhibit_notify means that the
>>>> cluster won't even act on the standby action until something else
>>>> happens to invoke the PE again.
>>> Right.
>>> To avoid creating a similar graph two or more times,
>>> I set the inhibit_notify option...
>>> But it doesn't matter now.
>>>
>>>> There is no need to even have a standby action... one can simply  
>>>> do:
>>>> +        } else if(on_fail == action_fail_standby) {
>>>> +            node->details->standby = TRUE;
>>>> +
>>>> in process_rsc_state() and it would take effect immediately -  
>>>> making most of the patch redundant.
>>> Without changing the CIB, the resources are certainly moved, but
>>> crm_mon can't show the node's status correctly.
>> I didn't notice that.  It should do.  I'll try and find some time  
>> to check today.
>
> I modified my patch for Pacemaker-dev(68d9e602fcb2).
> It does two things:
> (1) adds a standby action to the graph.
> (2) updates the CIB when the standby action runs.
> I hope this matches what you had in mind.

I'm confused... I implemented this last month:
    http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/79962235e1bb

And your patch still implements it with an extra TE action that I  
explained wasn't required.
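
To illustrate the point, here is a minimal standalone sketch, not the actual Pacemaker code. The type and function names (node_state, apply_on_fail, can_run_resources) are made up for the example. It assumes the policy engine only needs to flag the node while unpacking the failed operation; the same calculation then treats the node as unable to run resources, so no separate graph action is needed to move them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the policy engine types. */
enum on_fail_policy { FAIL_IGNORE, FAIL_RESTART, FAIL_STANDBY, FAIL_FENCE };

struct node_state {
    bool online;
    bool standby;
    bool unclean;
};

/* Record the consequence of a failed operation on the node itself. */
static void apply_on_fail(struct node_state *node, enum on_fail_policy policy)
{
    assert(node != NULL);
    switch (policy) {
    case FAIL_STANDBY:
        node->standby = true;   /* takes effect in this same calculation */
        break;
    case FAIL_FENCE:
        node->unclean = true;   /* node must be fenced */
        break;
    default:
        break;                  /* ignore/restart leave the node usable */
    }
}

/* A node in standby (or unclean, or offline) may not host resources. */
static bool can_run_resources(const struct node_state *node)
{
    return node->online && !node->standby && !node->unclean;
}
```

With this shape, a failed monitor with the standby policy makes can_run_resources() return false for that node straight away, which is what causes every resource to be placed elsewhere.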

>
>
>
> Best Regards,
> Satomi TANIGUCHI
>
>>>
>>> I think it should show the node is "standby".
>>> What do you think?
>>>
>>>> I still think it's strange that you'd want to migrate away all
>>>> resources because an unrelated one failed... but it's your cluster.
>>> The policy is that
>>> "A node on which even one resource has failed is no longer safe".
>> I still think it's strange :-)
>>>
>>>
>>>
>>>> I'll apply a modified version of this patch today.
>>> Thanks a lot!!
>>>
>>>
>>> Regards,
>>> Satomi TANIGUCHI
>>>
>>>
>>>
>>>
>>>> On Sep 24, 2008, at 10:34 AM, Satomi TANIGUCHI wrote:
>>>>> Hello,
>>>>>
>>>>> I'm posting a patch that implements on_fail="standby".
>>>>> This patch is for pacemaker-dev(5383f371494e).
>>>>>
>>>>> Its purpose is to move all resources away from a node
>>>>> when a resource fails on it.
>>>>> This setting is for the start or monitor operation, not for the stop op.
>>>>> And as far as I can confirm, the loop that Andrew described doesn't
>>>>> appear.
>>>>>
>>>>> Your comments and suggestions are really appreciated.
>>>>>
>>>>>
>>>>> Best Regards,
>>>>> Satomi TANIGUCHI
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Satomi Taniguchi wrote:
>>>>>> Hi Andrew,
>>>>>> Andrew Beekhof wrote:
>>>>>> >
>>>>>> (snip)
>>>>>> >
>>>>>> > no, i'm indicating that you've underestimated the scope of
>>>>>> > the problem
>>>>>> >
>>>>>> (snip)
>>>>>> Bugzilla #1601 is caused by moving healthy resources in STONITH
>>>>>> ordering, isn't it?
>>>>>> I changed nothing about the STONITH action when I implemented
>>>>>> on_fail="standby".
>>>>>> When a stop operation fails or Split-Brain occurs,
>>>>>> I completely agree that on_fail should be "fence".
>>>>>> But I am considering the failure of a start or monitor operation,
>>>>>> and on_fail="standby" assumes that it is used only
>>>>>> for these operations.
>>>>>> Its purpose is not to move healthy resources before doing
>>>>>> STONITH,
>>>>>> but to move all resources away from a node on which a resource
>>>>>> has failed.
>>>>>> And in any operation, Bugzilla #1601 doesn't occur because I
>>>>>> changed nothing about STONITH.
>>>>>> STONITH doesn't require stopping any resources.
>>>>>> The following is why I attach importance to start and monitor operations.
>>>>>> My main concerns are:
>>>>>> - 1)On a resource's failure, only the failed resource
>>>>>>    and resources which are in the same group move from
>>>>>>    the failed node.
>>>>>>    -> At present, to move all resources (even if they are not
>>>>>>       in the group or have no constraints) away from
>>>>>>       the failed node automatically, on_fail has to be set
>>>>>>       to "fence" not only for stop but also for start and monitor,
>>>>>>       and the failed node has to be killed by STONITH.
>>>>>> - 2)(In connection with 1) When resources are moved away by
>>>>>>    a failure of a start or monitor operation, they should be shut
>>>>>>    down normally.
>>>>>>    -> It sounds perfectly ordinary, but it is impossible
>>>>>>       if you follow 1).
>>>>>>    -> Of course, I know that I have to kill the failed node
>>>>>>       immediately if a stop operation fails or Split-Brain
>>>>>>       occurs.
>>>>>> - 3)Rebooting the failed node may lose the evidence of
>>>>>>    the real cause of a failure
>>>>>>    (which means administrators can't analyse the failure).
>>>>>>    -> This is as Keisuke-san wrote before.
>>>>>>       It is a really serious matter in Enterprise services.
>>>>>> To solve the matters above, I implemented on_fail="standby".
>>>>>> If you have any other ideas to solve them, please let me know.
>>>>>> Just for reference, there is an example in attached files:
>>>>>> a resource group named "grpPostgreSQLDB", consisting of
>>>>>> IPaddr("prmIpPostgreSQLDB") and pgsql("prmApPostgreSQLDB"), is
>>>>>> running on node2.
>>>>>> (See: crm_mon_before.log)
>>>>>> I modified pgsql's stop function to always return  
>>>>>> $OCF_ERR_GENERIC.
>>>>>> When the IPaddr resource failed, with its monitor's on_fail set to
>>>>>> "standby", pgsql tried to stop but failed.
>>>>>> (See: pe-warn-0.node2.gif)
>>>>>> Then STONITH was executed according to the setting of pgsql's  
>>>>>> stop operation, on_fail="fence".
>>>>>> (See: pe-warn-1.node2.gif and pe-warn-0.node1.gif)
>>>>>> STONITH killed node2 pitilessly, and both resources of the  
>>>>>> group moved to node1 peacefully.
>>>>>> (See: crm_mon_after.log)
>>>>>> Best Regards,
>>>>>> Satomi Taniguchi
>>>>>> Andrew Beekhof wrote:
>>>>>>>
>>>>>>> On Aug 4, 2008, at 8:11 AM, Satomi Taniguchi wrote:
>>>>>>>
>>>>>>>> Hi Andrew,
>>>>>>>>
>>>>>>>> Thank you for your opinions!
>>>>>>>> But I'm afraid that you've misunderstood my intentions...
>>>>>>>
>>>>>>> no, i'm indicating that you've underestimated the scope of the  
>>>>>>> problem
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Andrew Beekhof wrote:
>>>>>>>> (snip)
>>>>>>>>> Two problems...
>>>>>>>>> The first is that standby happens after the fencing event,  
>>>>>>>>> so it's not really doing anything to migrate the healthy  
>>>>>>>>> resources.
>>>>>>>>
>>>>>>>> In the graph, the object "stonith-1 stop 0 rh5node1" just means
>>>>>>>> "a plugin named stonith-1 on rh5node1 stops",
>>>>>>>> not "fencing event occurs".
>>>>>>>>
>>>>>>>> For example, Node1 has two resource groups.
>>>>>>>> When a resource in one group failed,
>>>>>>>> all resources in both groups stopped completely,
>>>>>>>> and the stonith plugin on Node1 stopped.
>>>>>>>> After this, both resource groups run on Node2.
>>>>>>>> I attached a graph, cib.xml
>>>>>>>> and crm_mon's logs (before and after a resource broke down).
>>>>>>>> Please see them.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Stop RscZ -(depends on)-> Stop RscY  -(depends on)-> Stonith  
>>>>>>>>> NodeX  -(depends on)-> Stop RscZ  -(depends on)-> ...
>>>>>>>> I just want to stop all resources without STONITH when a
>>>>>>>> monitor fails;
>>>>>>>> I don't want to change any actions when a stop fails.
>>>>>>>> The setting on_fail="standby" is for the start or monitor
>>>>>>>> operation, and
>>>>>>>> it assumes that the stop operation's
>>>>>>>> on_fail is set to "fence".
>>>>>>>> Then STONITH is not executed when start or monitor fails,
>>>>>>>> but it is executed when stop fails.
>>>>>>>>
>>>>>>>> So, if RscY's monitor operation fails,
>>>>>>>> its stop operation doesn't depend on "Stonith NodeX".
>>>>>>>> And if stopping RscY fails,
>>>>>>>> NodeX is turned off by STONITH, and the loop above does not
>>>>>>>> occur.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Satomi Taniguchi
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list
>>>>>>>> Pacemaker at clusterlabs.org
>>>>>>>> http://list.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> diff -urN pacemaker-dev.orig/crmd/te_actions.c pacemaker-dev/ 
>>>>> crmd/te_actions.c
>>>>> --- pacemaker-dev.orig/crmd/te_actions.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/crmd/te_actions.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -161,6 +161,54 @@
>>>>>   return TRUE;
>>>>> }
>>>>>
>>>>> +static gboolean
>>>>> +te_standby_node(crm_graph_t *graph, crm_action_t *action)
>>>>> +{
>>>>> +    const char *id = NULL;
>>>>> +    const char *uuid = NULL;
>>>>> +    const char *target = NULL;
>>>>> +
>>>>> +    char *attr_id = NULL;
>>>>> +    int str_length = 2;
>>>>> +    const char *attr_name = "standby";
>>>>> +
>>>>> +    id = ID(action->xml);
>>>>> +    target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
>>>>> +    uuid = crm_element_value(action->xml,  
>>>>> XML_LRM_ATTR_TARGET_UUID);
>>>>> +
>>>>> +    CRM_CHECK(id != NULL,
>>>>> +          crm_log_xml_warn(action->xml, "BadAction");
>>>>> +          return FALSE);
>>>>> +    CRM_CHECK(uuid != NULL,
>>>>> +          crm_log_xml_warn(action->xml, "BadAction");
>>>>> +          return FALSE);
>>>>> +    CRM_CHECK(target != NULL,
>>>>> +          crm_log_xml_warn(action->xml, "BadAction");
>>>>> +          return FALSE);
>>>>> +
>>>>> +    te_log_action(LOG_INFO,
>>>>> +              "Executing standby operation (%s) on %s", id,  
>>>>> target);
>>>>> +
>>>>> +    str_length += strlen(attr_name);
>>>>> +    str_length += strlen(uuid);
>>>>> +
>>>>> +    crm_malloc0(attr_id, str_length);
>>>>> +    sprintf(attr_id, "%s-%s", attr_name, uuid);
>>>>> +
>>>>> +    if (cib_ok > update_attr(fsa_cib_conn, cib_inhibit_notify,
>>>>> +        XML_CIB_TAG_NODES, uuid, NULL, attr_id, attr_name,  
>>>>> "on", FALSE)) {
>>>>> +        crm_err("Cannot standby %s: update_attr() call  
>>>>> failed.", target);
>>>>> +    }
>>>>> +    crm_free(attr_id);
>>>>> +
>>>>> +    crm_info("Skipping wait for %d", action->id);
>>>>> +    action->confirmed = TRUE;
>>>>> +    update_graph(graph, action);
>>>>> +    trigger_graph();
>>>>> +
>>>>> +    return TRUE;
>>>>> +}
>>>>> +
>>>>> static int get_target_rc(crm_action_t *action)
>>>>> {
>>>>>   const char *target_rc_s = g_hash_table_lookup(
>>>>> @@ -471,7 +519,8 @@
>>>>>   te_pseudo_action,
>>>>>   te_rsc_command,
>>>>>   te_crm_command,
>>>>> -    te_fence_node
>>>>> +    te_fence_node,
>>>>> +    te_standby_node
>>>>> };
>>>>>
>>>>> void
>>>>> diff -urN pacemaker-dev.orig/include/crm/crm.h pacemaker-dev/ 
>>>>> include/crm/crm.h
>>>>> --- pacemaker-dev.orig/include/crm/crm.h    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/include/crm/crm.h    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -143,6 +143,7 @@
>>>>> #define CRM_OP_SHUTDOWN_REQ    "req_shutdown"
>>>>> #define CRM_OP_SHUTDOWN     "do_shutdown"
>>>>> #define CRM_OP_FENCE         "stonith"
>>>>> +#define CRM_OP_STANDBY        "standby"
>>>>> #define CRM_OP_EVENTCC        "event_cc"
>>>>> #define CRM_OP_TEABORT        "te_abort"
>>>>> #define CRM_OP_TEABORTED    "te_abort_confirmed" /* we asked */
>>>>> diff -urN pacemaker-dev.orig/include/crm/pengine/common.h  
>>>>> pacemaker-dev/include/crm/pengine/common.h
>>>>> --- pacemaker-dev.orig/include/crm/pengine/common.h     
>>>>> 2008-09-24 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/include/crm/pengine/common.h    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -33,6 +33,7 @@
>>>>>   action_fail_migrate,    /* recover by moving it somewhere else  
>>>>> */
>>>>>   action_fail_block,
>>>>>   action_fail_stop,
>>>>> +    action_fail_standby,
>>>>>   action_fail_fence
>>>>> };
>>>>>
>>>>> @@ -51,6 +52,7 @@
>>>>>   action_demote,
>>>>>   action_demoted,
>>>>>   shutdown_crm,
>>>>> +    standby_node,
>>>>>   stonith_node
>>>>> };
>>>>>
>>>>> diff -urN pacemaker-dev.orig/include/crm/pengine/status.h  
>>>>> pacemaker-dev/include/crm/pengine/status.h
>>>>> --- pacemaker-dev.orig/include/crm/pengine/status.h     
>>>>> 2008-09-24 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/include/crm/pengine/status.h    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -107,6 +107,7 @@
>>>>>       gboolean standby;
>>>>>       gboolean pending;
>>>>>       gboolean unclean;
>>>>> +        gboolean action_standby;
>>>>>       gboolean shutdown;
>>>>>       gboolean expected_up;
>>>>>       gboolean is_dc;
>>>>> diff -urN pacemaker-dev.orig/include/crm/transition.h pacemaker- 
>>>>> dev/include/crm/transition.h
>>>>> --- pacemaker-dev.orig/include/crm/transition.h    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/include/crm/transition.h    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -113,6 +113,7 @@
>>>>>       gboolean (*rsc)(crm_graph_t *graph, crm_action_t *action);
>>>>>       gboolean (*crmd)(crm_graph_t *graph, crm_action_t *action);
>>>>>       gboolean (*stonith)(crm_graph_t *graph, crm_action_t  
>>>>> *action);
>>>>> +        gboolean (*standby)(crm_graph_t *graph, crm_action_t  
>>>>> *action);
>>>>> } crm_graph_functions_t;
>>>>>
>>>>> enum transition_status {
>>>>> diff -urN pacemaker-dev.orig/lib/pengine/common.c pacemaker-dev/ 
>>>>> lib/pengine/common.c
>>>>> --- pacemaker-dev.orig/lib/pengine/common.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/lib/pengine/common.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -154,6 +154,9 @@
>>>>>       case action_fail_fence:
>>>>>           result = "fence";
>>>>>           break;
>>>>> +        case action_fail_standby:
>>>>> +            result = "standby";
>>>>> +            break;
>>>>>   }
>>>>>   return result;
>>>>> }
>>>>> @@ -175,6 +178,8 @@
>>>>>       return shutdown_crm;
>>>>>   } else if(safe_str_eq(task, CRM_OP_FENCE)) {
>>>>>       return stonith_node;
>>>>> +    } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
>>>>> +        return standby_node;
>>>>>   } else if(safe_str_eq(task, CRMD_ACTION_STATUS)) {
>>>>>       return monitor_rsc;
>>>>>   } else if(safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
>>>>> @@ -242,6 +247,9 @@
>>>>>       case stonith_node:
>>>>>           result = CRM_OP_FENCE;
>>>>>           break;
>>>>> +        case standby_node:
>>>>> +            result = CRM_OP_STANDBY;
>>>>> +            break;
>>>>>       case monitor_rsc:
>>>>>           result = CRMD_ACTION_STATUS;
>>>>>           break;
>>>>> diff -urN pacemaker-dev.orig/lib/pengine/unpack.c pacemaker-dev/ 
>>>>> lib/pengine/unpack.c
>>>>> --- pacemaker-dev.orig/lib/pengine/unpack.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/lib/pengine/unpack.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -244,6 +244,7 @@
>>>>>            */
>>>>>           new_node->details->unclean = TRUE;
>>>>>       }
>>>>> +        new_node->details->action_standby = FALSE;
>>>>>
>>>>>       if(type == NULL
>>>>>          || safe_str_eq(type, "member")
>>>>> @@ -811,6 +812,10 @@
>>>>>           node->details->unclean = TRUE;
>>>>>           stop_action(rsc, node, FALSE);
>>>>>
>>>>> +        } else if(on_fail == action_fail_standby) {
>>>>> +            node->details->action_standby = TRUE;
>>>>> +            stop_action(rsc, node, FALSE);
>>>>> +
>>>>>       } else if(on_fail == action_fail_block) {
>>>>>           /* is_managed == FALSE will prevent any
>>>>>            * actions being sent for the resource
>>>>> diff -urN pacemaker-dev.orig/lib/pengine/utils.c pacemaker-dev/ 
>>>>> lib/pengine/utils.c
>>>>> --- pacemaker-dev.orig/lib/pengine/utils.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/lib/pengine/utils.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -707,6 +707,10 @@
>>>>>           value = "stop resource";
>>>>>       }
>>>>>
>>>>> +    } else if(safe_str_eq(value, "standby")) {
>>>>> +        action->on_fail = action_fail_standby;
>>>>> +        value = "node fencing (standby)";
>>>>> +
>>>>>   } else if(safe_str_eq(value, "ignore")
>>>>>       || safe_str_eq(value, "nothing")) {
>>>>>       action->on_fail = action_fail_ignore;
>>>>> diff -urN pacemaker-dev.orig/lib/transition/graph.c pacemaker- 
>>>>> dev/lib/transition/graph.c
>>>>> --- pacemaker-dev.orig/lib/transition/graph.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/lib/transition/graph.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -188,6 +188,11 @@
>>>>>           crm_debug_2("Executing STONITH-event: %d",
>>>>>                     action->id);
>>>>>           return graph_fns->stonith(graph, action);
>>>>> +
>>>>> +        } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
>>>>> +            crm_debug_2("Executing STANDBY-event: %d",
>>>>> +                      action->id);
>>>>> +            return graph_fns->standby(graph, action);
>>>>>       }
>>>>>
>>>>>       crm_debug_2("Executing crm-event: %d", action->id);
>>>>> diff -urN pacemaker-dev.orig/lib/transition/utils.c pacemaker- 
>>>>> dev/lib/transition/utils.c
>>>>> --- pacemaker-dev.orig/lib/transition/utils.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/lib/transition/utils.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -41,6 +41,7 @@
>>>>>   pseudo_action_dummy,
>>>>>   pseudo_action_dummy,
>>>>>   pseudo_action_dummy,
>>>>> +    pseudo_action_dummy,
>>>>>   pseudo_action_dummy
>>>>> };
>>>>>
>>>>> @@ -61,6 +62,7 @@
>>>>>   CRM_ASSERT(graph_fns->crmd != NULL);
>>>>>   CRM_ASSERT(graph_fns->pseudo != NULL);
>>>>>   CRM_ASSERT(graph_fns->stonith != NULL);
>>>>> +    CRM_ASSERT(graph_fns->standby != NULL);
>>>>> }
>>>>>
>>>>> const char *
>>>>> diff -urN pacemaker-dev.orig/pengine/allocate.c pacemaker-dev/ 
>>>>> pengine/allocate.c
>>>>> --- pacemaker-dev.orig/pengine/allocate.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/pengine/allocate.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -777,6 +777,14 @@
>>>>>               last_stonith = stonith_op;
>>>>>           }
>>>>>
>>>>> +        } else if(node->details->online && node->details->action_standby) {
>>>>> +            action_t *standby_op = NULL;
>>>>> +
>>>>> +            standby_op = custom_action(
>>>>> +                NULL, crm_strdup(CRM_OP_STANDBY),
>>>>> +                CRM_OP_STANDBY, node, FALSE, TRUE, data_set);
>>>>> +            standby_constraints(node, standby_op, data_set);
>>>>> +
>>>>>       } else if(node->details->online && node->details->shutdown) {
>>>>>           action_t *down_op = NULL;
>>>>>           crm_info("Scheduling Node %s for shutdown",
>>>>> diff -urN pacemaker-dev.orig/pengine/graph.c pacemaker-dev/ 
>>>>> pengine/graph.c
>>>>> --- pacemaker-dev.orig/pengine/graph.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/pengine/graph.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -347,6 +347,29 @@
>>>>>   return TRUE;
>>>>> }
>>>>>
>>>>> +gboolean
>>>>> +standby_constraints(
>>>>> +    node_t *node, action_t *standby_op, pe_working_set_t  
>>>>> *data_set)
>>>>> +{
>>>>> +    /* add the stop to the before lists so it counts as a pre-req
>>>>> +     * for the standby
>>>>> +     */
>>>>> +    slist_iter(
>>>>> +        rsc, resource_t, node->details->running_rsc, lpc,
>>>>> +
>>>>> +        if(is_not_set(rsc->flags, pe_rsc_managed)) {
>>>>> +            continue;
>>>>> +        }
>>>>> +
>>>>> +        custom_action_order(
>>>>> +            rsc, stop_key(rsc), NULL,
>>>>> +            NULL, crm_strdup(CRM_OP_STANDBY), standby_op,
>>>>> +            pe_order_implies_left, data_set);
>>>>> +    );
>>>>> +
>>>>> +    return TRUE;
>>>>> +}
>>>>> +
>>>>> static void dup_attr(gpointer key, gpointer value, gpointer  
>>>>> user_data)
>>>>> {
>>>>>   g_hash_table_replace(user_data, crm_strdup(key),  
>>>>> crm_strdup(value));
>>>>> @@ -369,6 +392,9 @@
>>>>>       action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
>>>>> /*         needs_node_info = FALSE; */
>>>>>
>>>>> +    } else if(safe_str_eq(action->task, CRM_OP_STANDBY)) {
>>>>> +        action_xml = create_xml_node(NULL,  
>>>>> XML_GRAPH_TAG_CRM_EVENT);
>>>>> +
>>>>>   } else if(safe_str_eq(action->task, CRM_OP_SHUTDOWN)) {
>>>>>       action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
>>>>>
>>>>> diff -urN pacemaker-dev.orig/pengine/group.c pacemaker-dev/ 
>>>>> pengine/group.c
>>>>> --- pacemaker-dev.orig/pengine/group.c    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/pengine/group.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -435,6 +435,7 @@
>>>>>       case action_notified:
>>>>>       case shutdown_crm:
>>>>>       case stonith_node:
>>>>> +        case standby_node:
>>>>>           break;
>>>>>       case stop_rsc:
>>>>>       case stopped_rsc:
>>>>> diff -urN pacemaker-dev.orig/pengine/pengine.h pacemaker-dev/ 
>>>>> pengine/pengine.h
>>>>> --- pacemaker-dev.orig/pengine/pengine.h    2008-09-24  
>>>>> 11:05:09.000000000 +0900
>>>>> +++ pacemaker-dev/pengine/pengine.h    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -150,6 +150,9 @@
>>>>> extern gboolean stonith_constraints(
>>>>>   node_t *node, action_t *stonith_op, pe_working_set_t *data_set);
>>>>>
>>>>> +extern gboolean standby_constraints(
>>>>> +    node_t *node, action_t *standby_op, pe_working_set_t  
>>>>> *data_set);
>>>>> +
>>>>> extern int custom_action_order(
>>>>>   resource_t *lh_rsc, char *lh_task, action_t *lh_action,
>>>>>   resource_t *rh_rsc, char *rh_task, action_t *rh_action,
>>>>> diff -urN pacemaker-dev.orig/pengine/utils.c pacemaker-dev/ 
>>>>> pengine/utils.c
>>>>> --- pacemaker-dev.orig/pengine/utils.c    2008-09-24  
>>>>> 11:05:12.000000000 +0900
>>>>> +++ pacemaker-dev/pengine/utils.c    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -180,10 +180,13 @@
>>>>>   if(node->details->online == FALSE
>>>>>      || node->details->shutdown
>>>>>      || node->details->unclean
>>>>> -       || node->details->standby) {
>>>>> -        crm_debug_2("%s: online=%d, unclean=%d, standby=%d",
>>>>> +       || node->details->standby
>>>>> +       || node->details->action_standby) {
>>>>> +        crm_debug_2("%s: online=%d, unclean=%d, standby=%d" \
>>>>> +                ", action_standby=%d",
>>>>>               node->details->uname, node->details->online,
>>>>> -                node->details->unclean, node->details->standby);
>>>>> +                node->details->unclean, node->details->standby,
>>>>> +                node->details->action_standby);
>>>>>       return FALSE;
>>>>>   }
>>>>>   return TRUE;
>>>>> @@ -337,6 +340,7 @@
>>>>>       case monitor_rsc:
>>>>>       case shutdown_crm:
>>>>>       case stonith_node:
>>>>> +        case standby_node:
>>>>>           task = no_action;
>>>>>           break;
>>>>>       default:
>>>>> @@ -429,6 +433,7 @@
>>>>>       switch(text2task(action->task)) {
>>>>>       case stonith_node:
>>>>> +        case standby_node:
>>>>>       case shutdown_crm:
>>>>>           do_crm_log(log_level,
>>>>>                     "%s%s%sAction %d: %s%s%s%s%s%s",
>>>>> diff -urN pacemaker-dev.orig/xml/crm-1.0.dtd pacemaker-dev/xml/ 
>>>>> crm-1.0.dtd
>>>>> --- pacemaker-dev.orig/xml/crm-1.0.dtd    2008-09-24  
>>>>> 11:05:12.000000000 +0900
>>>>> +++ pacemaker-dev/xml/crm-1.0.dtd    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -266,7 +266,7 @@
>>>>>         disabled      (true|yes|1|false|no|0)        'false'
>>>>>         role          (Master|Slave|Started|Stopped) 'Started'
>>>>>         prereq        (nothing|quorum|fencing)       #IMPLIED
>>>>> -          on_fail       (ignore|block|stop|restart|fence)      
>>>>> #IMPLIED>
>>>>> +          on_fail       (ignore|block|stop|restart|fence| 
>>>>> standby)     #IMPLIED>
>>>>> <!--
>>>>> Use this to emulate v1 type Heartbeat groups.
>>>>> Defining a resource group is a quick way to make sure that the  
>>>>> resources:
>>>>> diff -urN pacemaker-dev.orig/xml/crm-transitional.dtd pacemaker- 
>>>>> dev/xml/crm-transitional.dtd
>>>>> --- pacemaker-dev.orig/xml/crm-transitional.dtd    2008-09-24  
>>>>> 11:05:12.000000000 +0900
>>>>> +++ pacemaker-dev/xml/crm-transitional.dtd    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -272,7 +272,7 @@
>>>>>         disabled      (true|yes|1|false|no|0)        'false'
>>>>>         role          (Master|Slave|Started|Stopped) 'Started'
>>>>>         prereq        (nothing|quorum|fencing)       #IMPLIED
>>>>> -          on_fail       (ignore|block|stop|restart|fence)      
>>>>> #IMPLIED>
>>>>> +          on_fail       (ignore|block|stop|restart|fence| 
>>>>> standby)     #IMPLIED>
>>>>> <!--
>>>>> Use this to emulate v1 type Heartbeat groups.
>>>>> Defining a resource group is a quick way to make sure that the  
>>>>> resources:
>>>>> diff -urN pacemaker-dev.orig/xml/crm.dtd pacemaker-dev/xml/crm.dtd
>>>>> --- pacemaker-dev.orig/xml/crm.dtd    2008-09-24  
>>>>> 11:05:12.000000000 +0900
>>>>> +++ pacemaker-dev/xml/crm.dtd    2008-09-24 12:26:54.000000000  
>>>>> +0900
>>>>> @@ -266,7 +266,7 @@
>>>>>         disabled      (true|yes|1|false|no|0)        'false'
>>>>>         role          (Master|Slave|Started|Stopped) 'Started'
>>>>>         prereq        (nothing|quorum|fencing)       #IMPLIED
>>>>> -          on_fail       (ignore|block|stop|restart|fence)      
>>>>> #IMPLIED>
>>>>> +          on_fail       (ignore|block|stop|restart|fence| 
>>>>> standby)     #IMPLIED>
>>>>> <!--
>>>>> Use this to emulate v1 type Heartbeat groups.
>>>>> Defining a resource group is a quick way to make sure that the  
>>>>> resources:
>>>>> diff -urN pacemaker-dev.orig/xml/resources.rng.in pacemaker-dev/ 
>>>>> xml/resources.rng.in
>>>>> --- pacemaker-dev.orig/xml/resources.rng.in    2008-09-24  
>>>>> 11:05:12.000000000 +0900
>>>>> +++ pacemaker-dev/xml/resources.rng.in    2008-09-24  
>>>>> 12:26:54.000000000 +0900
>>>>> @@ -160,6 +160,7 @@
>>>>>         <value>block</value>
>>>>>         <value>stop</value>
>>>>>         <value>restart</value>
>>>>> +          <value>standby</value>
>>>>>         <value>fence</value>
>>>>>       </choice>
>>>>>         </attribute>
>>>>>
>
> diff -urN pacemaker-dev.org/crmd/te_actions.c pacemaker-dev.mod/crmd/ 
> te_actions.c
> --- pacemaker-dev.org/crmd/te_actions.c	2008-10-23  
> 10:50:03.000000000 +0900
> +++ pacemaker-dev.mod/crmd/te_actions.c	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -160,6 +160,42 @@
> 	return TRUE;
> }
>
> +static gboolean
> +te_standby_node(crm_graph_t *graph, crm_action_t *action)
> +{
> +	const char *id = NULL;
> +	const char *uuid = NULL;
> +	const char *target = NULL;
> +
> +	id = ID(action->xml);
> +	target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
> +	uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
> +
> +	CRM_CHECK(id != NULL,
> +		  crm_log_xml_warn(action->xml, "BadAction");
> +		  return FALSE);
> +	CRM_CHECK(uuid != NULL,
> +		  crm_log_xml_warn(action->xml, "BadAction");
> +		  return FALSE);
> +	CRM_CHECK(target != NULL,
> +		  crm_log_xml_warn(action->xml, "BadAction");
> +		  return FALSE);
> +
> +	te_log_action(LOG_INFO,
> +		      "Executing standby operation (%s) on %s", id, target);
> +
> +	if (cib_ok > set_standby(fsa_cib_conn, uuid, XML_CIB_TAG_NODES,  
> "on")) {
> +		crm_err("Cannot standby %s: set_standby() call failed.", target);
> +	}
> +
> +	crm_info("Skipping wait for %d", action->id);
> +	action->confirmed = TRUE;
> +	update_graph(graph, action);
> +	trigger_graph();
> +
> +	return TRUE;
> +}
> +
> static int get_target_rc(crm_action_t *action)
> {
> 	const char *target_rc_s = g_hash_table_lookup(
> @@ -470,7 +506,8 @@
> 	te_pseudo_action,
> 	te_rsc_command,
> 	te_crm_command,
> -	te_fence_node
> +	te_fence_node,
> +	te_standby_node
> };
>
> void
> diff -urN pacemaker-dev.org/include/crm/crm.h pacemaker-dev.mod/ 
> include/crm/crm.h
> --- pacemaker-dev.org/include/crm/crm.h	2008-10-23  
> 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/include/crm/crm.h	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -143,6 +143,7 @@
> #define CRM_OP_SHUTDOWN_REQ	"req_shutdown"
> #define CRM_OP_SHUTDOWN 	"do_shutdown"
> #define CRM_OP_FENCE	 	"stonith"
> +#define CRM_OP_STANDBY		"standby"
> #define CRM_OP_EVENTCC		"event_cc"
> #define CRM_OP_TEABORT		"te_abort"
> #define CRM_OP_TEABORTED	"te_abort_confirmed" /* we asked */
> diff -urN pacemaker-dev.org/include/crm/pengine/common.h pacemaker- 
> dev.mod/include/crm/pengine/common.h
> --- pacemaker-dev.org/include/crm/pengine/common.h	2008-10-23  
> 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/include/crm/pengine/common.h	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -52,6 +52,7 @@
> 	action_demote,
> 	action_demoted,
> 	shutdown_crm,
> +	standby_node,
> 	stonith_node
> };
>
> diff -urN pacemaker-dev.org/include/crm/pengine/status.h pacemaker- 
> dev.mod/include/crm/pengine/status.h
> --- pacemaker-dev.org/include/crm/pengine/status.h	2008-10-23  
> 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/include/crm/pengine/status.h	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -106,6 +106,7 @@
> 		gboolean standby;
> 		gboolean pending;
> 		gboolean unclean;
> +		gboolean action_standby;
> 		gboolean shutdown;
> 		gboolean expected_up;
> 		gboolean is_dc;
> diff -urN pacemaker-dev.org/include/crm/transition.h pacemaker- 
> dev.mod/include/crm/transition.h
> --- pacemaker-dev.org/include/crm/transition.h	2008-10-23  
> 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/include/crm/transition.h	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -115,6 +115,7 @@
> 		gboolean (*rsc)(crm_graph_t *graph, crm_action_t *action);
> 		gboolean (*crmd)(crm_graph_t *graph, crm_action_t *action);
> 		gboolean (*stonith)(crm_graph_t *graph, crm_action_t *action);
> +		gboolean (*standby)(crm_graph_t *graph, crm_action_t *action);
> } crm_graph_functions_t;
>
> enum transition_status {
> diff -urN pacemaker-dev.org/lib/pengine/common.c pacemaker-dev.mod/ 
> lib/pengine/common.c
> --- pacemaker-dev.org/lib/pengine/common.c	2008-10-23  
> 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/lib/pengine/common.c	2008-10-23  
> 10:54:29.000000000 +0900
> @@ -178,6 +178,8 @@
> 		return shutdown_crm;
> 	} else if(safe_str_eq(task, CRM_OP_FENCE)) {
> 		return stonith_node;
> +	} else if(safe_str_eq(task, CRM_OP_STANDBY)) {
> +		return standby_node;
> 	} else if(safe_str_eq(task, CRMD_ACTION_STATUS)) {
> 		return monitor_rsc;
> 	} else if(safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
> @@ -245,6 +247,9 @@
> 		case stonith_node:
> 			result = CRM_OP_FENCE;
> 			break;
> +		case standby_node:
> +			result = CRM_OP_STANDBY;
> +			break;
> 		case monitor_rsc:
> 			result = CRMD_ACTION_STATUS;
> 			break;
> diff -urN pacemaker-dev.org/lib/pengine/unpack.c pacemaker-dev.mod/lib/pengine/unpack.c
> --- pacemaker-dev.org/lib/pengine/unpack.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/lib/pengine/unpack.c	2008-10-23 10:54:29.000000000 +0900
> @@ -240,6 +240,7 @@
> 			 */
> 			new_node->details->unclean = TRUE;
> 		}
> +		new_node->details->action_standby = FALSE;
> 		
> 		if(type == NULL
> 		   || safe_str_eq(type, "member")
> @@ -809,6 +810,7 @@
> 				
> 		} else if(on_fail == action_fail_standby) {
> 			node->details->standby = TRUE;
> +			node->details->action_standby = TRUE;
>
> 		} else if(on_fail == action_fail_block) {
> 			/* is_managed == FALSE will prevent any
> diff -urN pacemaker-dev.org/lib/transition/graph.c pacemaker-dev.mod/lib/transition/graph.c
> --- pacemaker-dev.org/lib/transition/graph.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/lib/transition/graph.c	2008-10-23 10:54:29.000000000 +0900
> @@ -188,6 +188,11 @@
> 			crm_debug_2("Executing STONITH-event: %d",
> 				      action->id);
> 			return graph_fns->stonith(graph, action);
> +
> +		} else if(safe_str_eq(task, CRM_OP_STANDBY)) {
> +			crm_debug_2("Executing STANDBY-event: %d",
> +				      action->id);
> +			return graph_fns->standby(graph, action);
> 		}
> 		
> 		crm_debug_2("Executing crm-event: %d", action->id);
> diff -urN pacemaker-dev.org/lib/transition/utils.c pacemaker-dev.mod/lib/transition/utils.c
> --- pacemaker-dev.org/lib/transition/utils.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/lib/transition/utils.c	2008-10-23 10:54:30.000000000 +0900
> @@ -41,6 +41,7 @@
> 	pseudo_action_dummy,
> 	pseudo_action_dummy,
> 	pseudo_action_dummy,
> +	pseudo_action_dummy,
> 	pseudo_action_dummy
> };
>
> @@ -61,6 +62,7 @@
> 	CRM_ASSERT(graph_fns->crmd != NULL);
> 	CRM_ASSERT(graph_fns->pseudo != NULL);
> 	CRM_ASSERT(graph_fns->stonith != NULL);
> +	CRM_ASSERT(graph_fns->standby != NULL);
> }
>
> const char *
> diff -urN pacemaker-dev.org/pengine/allocate.c pacemaker-dev.mod/pengine/allocate.c
> --- pacemaker-dev.org/pengine/allocate.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/pengine/allocate.c	2008-10-23 10:54:30.000000000 +0900
> @@ -774,6 +774,14 @@
> 				last_stonith = stonith_op;			
> 			}
>
> +		} else if(node->details->online && node->details->action_standby) {
> +			action_t *standby_op = NULL;
> +
> +			standby_op = custom_action(
> +				NULL, crm_strdup(CRM_OP_STANDBY),
> +				CRM_OP_STANDBY, node, FALSE, TRUE, data_set);
> +			standby_constraints(node, standby_op, data_set);
> +
> 		} else if(node->details->online && node->details->shutdown) {			
> 			action_t *down_op = NULL;	
> 			crm_info("Scheduling Node %s for shutdown",
> diff -urN pacemaker-dev.org/pengine/graph.c pacemaker-dev.mod/pengine/graph.c
> --- pacemaker-dev.org/pengine/graph.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/pengine/graph.c	2008-10-23 10:54:30.000000000 +0900
> @@ -347,6 +347,29 @@
> 	return TRUE;
> }
>
> +gboolean
> +standby_constraints(
> +	node_t *node, action_t *standby_op, pe_working_set_t *data_set)
> +{
> +	/* add the stop to the before lists so it counts as a pre-req
> +	 * for the standby
> +	 */
> +	slist_iter(
> +		rsc, resource_t, node->details->running_rsc, lpc,
> +
> +		if(is_not_set(rsc->flags, pe_rsc_managed)) {
> +			continue;
> +		}
> +
> +		custom_action_order(
> +			rsc, stop_key(rsc), NULL,
> +			NULL, crm_strdup(CRM_OP_STANDBY), standby_op,
> +			pe_order_implies_left, data_set);
> +	);
> +
> +	return TRUE;
> +}
> +
> static void dup_attr(gpointer key, gpointer value, gpointer user_data)
> {
> 	g_hash_table_replace(user_data, crm_strdup(key), crm_strdup(value));
> @@ -369,6 +392,9 @@
> 		action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
> /* 		needs_node_info = FALSE; */
> 		
> +	} else if(safe_str_eq(action->task, CRM_OP_STANDBY)) {
> +		action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
> +
> 	} else if(safe_str_eq(action->task, CRM_OP_SHUTDOWN)) {
> 		action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
>
> diff -urN pacemaker-dev.org/pengine/group.c pacemaker-dev.mod/pengine/group.c
> --- pacemaker-dev.org/pengine/group.c	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/pengine/group.c	2008-10-23 10:54:30.000000000 +0900
> @@ -435,6 +435,7 @@
> 		case action_notified:
> 		case shutdown_crm:
> 		case stonith_node:
> +		case standby_node:
> 		    break;
> 		case stop_rsc:
> 		case stopped_rsc:
> diff -urN pacemaker-dev.org/pengine/pengine.h pacemaker-dev.mod/pengine/pengine.h
> --- pacemaker-dev.org/pengine/pengine.h	2008-10-23 10:50:04.000000000 +0900
> +++ pacemaker-dev.mod/pengine/pengine.h	2008-10-23 10:54:30.000000000 +0900
> @@ -150,6 +150,9 @@
> extern gboolean stonith_constraints(
> 	node_t *node, action_t *stonith_op, pe_working_set_t *data_set);
>
> +extern gboolean standby_constraints(
> +	node_t *node, action_t *standby_op, pe_working_set_t *data_set);
> +
> extern int custom_action_order(
> 	resource_t *lh_rsc, char *lh_task, action_t *lh_action,
> 	resource_t *rh_rsc, char *rh_task, action_t *rh_action,
> diff -urN pacemaker-dev.org/pengine/utils.c pacemaker-dev.mod/pengine/utils.c
> --- pacemaker-dev.org/pengine/utils.c	2008-10-23 10:50:07.000000000 +0900
> +++ pacemaker-dev.mod/pengine/utils.c	2008-10-23 10:54:30.000000000 +0900
> @@ -337,6 +337,7 @@
> 		case monitor_rsc:
> 		case shutdown_crm:
> 		case stonith_node:
> +		case standby_node:
> 			task = no_action;
> 			break;
> 		default:
> @@ -429,6 +430,7 @@
> 	
> 	switch(text2task(action->task)) {
> 		case stonith_node:
> +		case standby_node:
> 		case shutdown_crm:
> 			do_crm_log(log_level,
> 				      "%s%s%sAction %d: %s%s%s%s%s%s",
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at clusterlabs.org
> http://list.clusterlabs.org/mailman/listinfo/pacemaker