[Pacemaker] hangs pending
Andrey Groshev
greenx at yandex.ru
Thu Feb 20 11:04:28 UTC 2014
20.02.2014, 13:57, "Andrew Beekhof" <andrew at beekhof.net>:
> On 20 Feb 2014, at 5:33 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>
>> 20.02.2014, 01:22, "Andrew Beekhof" <andrew at beekhof.net>:
>>> On 20 Feb 2014, at 4:18 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>>> 19.02.2014, 06:47, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>> On 18 Feb 2014, at 9:29 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>> Hi, ALL and Andrew!
>>>>>>
>>>>>> Today was a good day: I killed a lot of processes, and plenty of them shot back at me.
>>>>>> Overall I am happy (almost like an elephant) :)
>>>>>> Apart from the resources, eight processes on the node matter to me: corosync, pacemakerd, cib, stonithd, lrmd, attrd, pengine, crmd.
>>>>>> I killed them with different signals (4, 6, 11 and even 9).
>>>>>> The behavior does not depend on the signal number, which is good.
>>>>>> If STONITH sends a reboot to the node, it reboots and rejoins the cluster, which is also good.
>>>>>> But the behavior differs depending on which daemon is killed.
>>>>>>
>>>>>> They fall into four groups:
>>>>>> 1. corosync, cib - STONITH works 100% of the time.
>>>>>> Killing with any signal triggers STONITH and a reboot.
>>>>> excellent
>>>>>> 3. stonithd, attrd, pengine - STONITH is not needed.
>>>>>> These daemons simply restart and the resources stay running.
>>>>> right
>>>>>> 2. lrmd, crmd - STONITH behaves strangely.
>>>>>> Sometimes STONITH is called, with the corresponding reaction.
>>>>>> Sometimes the daemon restarts
>>>>> The daemon will always try to restart, the only variable is how long it takes the peer to notice and initiate fencing.
>>>>> If the failure happens just before they're due to receive the totem token, the failure will be detected very quickly and the node fenced.
>>>>> If the failure happens just after, then detection will take longer - giving the node longer to recover and not be fenced.
>>>>>
>>>>> So fence/not fence is normal and to be expected.
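As a rough illustration of the race described above: whether the node ends up fenced comes down to which happens first, the peer noticing the failure or the daemon finishing its respawn. This is a toy model only, not how corosync or Pacemaker actually schedule anything; the function name and the millisecond values are made up for the example.

```shell
#!/bin/sh
# Toy model of the fence / no-fence race. Hypothetical names and numbers.
#   $1 = ms between the daemon failure and the moment the peer would notice it
#   $2 = ms the failed daemon needs to respawn and rejoin
fence_outcome() {
    detect_in_ms=$1
    respawn_ms=$2
    if [ "$respawn_ms" -lt "$detect_in_ms" ]; then
        # The daemon is back before the peer notices anything.
        echo "recovered"
    else
        # The peer notices first and initiates fencing.
        echo "fenced"
    fi
}

fence_outcome 50 400    # failure just before the token is due
fence_outcome 900 400   # failure just after the token was received
```

With the same respawn time, only the moment of the failure relative to the token changes the outcome, which matches the "sometimes fenced, sometimes restarted" observation for lrmd and crmd.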
>>>>>> and the resources restart with a large delay (MS:pgsql).
>>>>>> Once, after crmd restarted, pgsql did not restart.
>>>>> I would not expect pgsql to ever restart - if the RA does its job properly anyway.
>>>>> In the case the node is not fenced, the crmd will respawn and the PE will request that it re-detect the state of all resources.
>>>>>
>>>>> If the agent reports "all good", then there is nothing more to do.
>>>>> If the agent is not reporting "all good", you should really be asking why.
>>>>>> 4. pacemakerd - nothing happens.
>>>>> On non-systemd based machines, correct.
>>>>>
>>>>> On a systemd based machine pacemakerd is respawned and reattaches to the existing daemons.
>>>>> Any subsequent daemon failure will be detected and the daemon respawned.
>>>> And! I almost forgot about IT!
>>>> Are there any other (NORMAL) options, methods, ideas?
>>>> Without this ... @$%#$%&$%^&$%^&##@#$$^$%& !!!!!
>>>> Otherwise - it's a full epic fail ;)
>>> -ENOPARSE
>> OK, I will set aside my personal attitude toward "systemd".
>> Let me explain.
>>
>> Somewhere near the beginning of this thread, I wrote:
>> A.G.:Who knows who runs lrmd?
>> A.B.:Pacemakerd.
>> That's one!
>>
>> Let's see the list of processes:
>> #ps -axf
>> .....
>> 6067 ? Ssl 7:24 corosync
>> 6092 ? S 0:25 pacemakerd
>> 6094 ? Ss 116:13 \_ /usr/libexec/pacemaker/cib
>> 6095 ? Ss 0:25 \_ /usr/libexec/pacemaker/stonithd
>> 6096 ? Ss 1:27 \_ /usr/libexec/pacemaker/lrmd
>> 6097 ? Ss 0:49 \_ /usr/libexec/pacemaker/attrd
>> 6098 ? Ss 0:25 \_ /usr/libexec/pacemaker/pengine
>> 6099 ? Ss 0:29 \_ /usr/libexec/pacemaker/crmd
>> .....
>> That's two!
>
> What's two? I don't follow.
In the sense that it spawns the other processes. But it does not matter.
>> And more, more...
>> Now you must understand why I want this process to be running at all times.
>> I don't think anyone here needs that explained!
>>
>> And now you tell me "pacemakerd works nicely, but only on systemd distros"!!!
>
> No, I'm saying it works _better_ on systemd distros.
> On non-systemd distros you still need quite a few unlikely-to-happen failures to trigger a situation in which the node still gets fenced and recovered (assuming no-one saw any of the error messages and didn't run "service pacemaker restart" prior to the additional failures).
>
Can you point me to the place where this is done:
"On a systemd based machine pacemakerd is respawned and reattaches to the existing daemons."?
If I respawn the pacemakerd process via upstart, will it also "reattach to the existing daemons"?
>> What should I do now?
>> * Integrate systemd in CentOS?
>> * Migrate to Fedora?
>> * Buy RHEL7 !?
>
> Option 3 is particularly good :)
That's too easy. Real heroes always take the roundabout way :)
>> Each option is great, but none of them fits me.
>>
>> P.S. And I'm not even talking about distros that have not migrated to systemd (and never will).
>
> Are there any? Even debian and ubuntu have raised the white flag.
That is a digression, of course, but potentially it could be any Unix-like system.
>> Do not be offended! We do the same.
>> We build a secret military factory,
>> put a large concrete fence around it,
>> top the wall with barbed wire, but forget to install the gates. :)
>>>>>> And then I can kill any process of the third group. They do not restart.
>>>>> Until they become needed.
>>>>> Eg. if the DC goes to invoke the policy engine, that will fail causing the crmd to fail and the node to be fenced.
>>>>>> In general, don't touch corosync, cib and maybe lrmd, crmd.
>>>>>>
>>>>>> What do you think about this?
>>>>>> The main question of this thread is settled.
>>>>>> But this varied behavior is another big problem.
>>>>>>
>>>>>> 17.02.2014, 08:52, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>>> 17.02.2014, 02:27, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>> With no quick follow-up, dare one hope that means the patch worked? :-)
>>>>>>> Hi,
>>>>>>> No, unfortunately my boss changed my plans on Friday and I spent the whole day on a parallel project.
>>>>>>> I hope to have time today to carry out the necessary tests.
>>>>>>>> On 14 Feb 2014, at 3:37 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>> Yes, of course. Starting the full rebuild and the tests now )
>>>>>>>>>
>>>>>>>>> 14.02.2014, 04:41, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>> The previous patch wasn't quite right.
>>>>>>>>>> Could you try this new one?
>>>>>>>>>>
>>>>>>>>>> http://paste.fedoraproject.org/77123/13923376/
>>>>>>>>>>
>>>>>>>>>> [11:23 AM] beekhof at f19 ~/Development/sources/pacemaker/devel ☺ # git diff
>>>>>>>>>> diff --git a/crmd/callbacks.c b/crmd/callbacks.c
>>>>>>>>>> index ac4b905..d49525b 100644
>>>>>>>>>> --- a/crmd/callbacks.c
>>>>>>>>>> +++ b/crmd/callbacks.c
>>>>>>>>>> @@ -199,8 +199,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
>>>>>>>>>> stop_te_timer(down->timer);
>>>>>>>>>>
>>>>>>>>>> flags |= node_update_join | node_update_expected;
>>>>>>>>>> - crm_update_peer_join(__FUNCTION__, node, crm_join_none);
>>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
>>>>>>>>>> + crmd_peer_down(node, FALSE);
>>>>>>>>>> check_join_state(fsa_state, __FUNCTION__);
>>>>>>>>>>
>>>>>>>>>> update_graph(transition_graph, down);
>>>>>>>>>> diff --git a/crmd/crmd_utils.h b/crmd/crmd_utils.h
>>>>>>>>>> index bc472c2..1a2577a 100644
>>>>>>>>>> --- a/crmd/crmd_utils.h
>>>>>>>>>> +++ b/crmd/crmd_utils.h
>>>>>>>>>> @@ -100,6 +100,7 @@ void crmd_join_phase_log(int level);
>>>>>>>>>> const char *get_timer_desc(fsa_timer_t * timer);
>>>>>>>>>> gboolean too_many_st_failures(void);
>>>>>>>>>> void st_fail_count_reset(const char * target);
>>>>>>>>>> +void crmd_peer_down(crm_node_t *peer, bool full);
>>>>>>>>>>
>>>>>>>>>> # define fsa_register_cib_callback(id, flag, data, fn) do { \
>>>>>>>>>> fsa_cib_conn->cmds->register_callback( \
>>>>>>>>>> diff --git a/crmd/te_actions.c b/crmd/te_actions.c
>>>>>>>>>> index f31d4ec..3bfce59 100644
>>>>>>>>>> --- a/crmd/te_actions.c
>>>>>>>>>> +++ b/crmd/te_actions.c
>>>>>>>>>> @@ -80,11 +80,8 @@ send_stonith_update(crm_action_t * action, const char *target, const char *uuid)
>>>>>>>>>> crm_info("Recording uuid '%s' for node '%s'", uuid, target);
>>>>>>>>>> peer->uuid = strdup(uuid);
>>>>>>>>>> }
>>>>>>>>>> - crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>>> - crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>>> - crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>>>
>>>>>>>>>> + crmd_peer_down(peer, TRUE);
>>>>>>>>>> node_state =
>>>>>>>>>> do_update_node_cib(peer,
>>>>>>>>>> node_update_cluster | node_update_peer | node_update_join |
>>>>>>>>>> diff --git a/crmd/te_utils.c b/crmd/te_utils.c
>>>>>>>>>> index ad7e573..0c92e95 100644
>>>>>>>>>> --- a/crmd/te_utils.c
>>>>>>>>>> +++ b/crmd/te_utils.c
>>>>>>>>>> @@ -247,10 +247,7 @@ tengine_stonith_notify(stonith_t * st, stonith_event_t * st_event)
>>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> - crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>>> - crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>>> - crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>>> - crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>>> + crmd_peer_down(peer, TRUE);
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> diff --git a/crmd/utils.c b/crmd/utils.c
>>>>>>>>>> index 3988cfe..2df53ab 100644
>>>>>>>>>> --- a/crmd/utils.c
>>>>>>>>>> +++ b/crmd/utils.c
>>>>>>>>>> @@ -1077,3 +1077,13 @@ update_attrd_remote_node_removed(const char *host, const char *user_name)
>>>>>>>>>> crm_trace("telling attrd to clear attributes for remote host %s", host);
>>>>>>>>>> update_attrd_helper(host, NULL, NULL, user_name, TRUE, 'C');
>>>>>>>>>> }
>>>>>>>>>> +
>>>>>>>>>> +void crmd_peer_down(crm_node_t *peer, bool full)
>>>>>>>>>> +{
>>>>>>>>>> + if(full && peer->state == NULL) {
>>>>>>>>>> + crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
>>>>>>>>>> + crm_update_peer_proc(__FUNCTION__, peer, crm_proc_none, NULL);
>>>>>>>>>> + }
>>>>>>>>>> + crm_update_peer_join(__FUNCTION__, peer, crm_join_none);
>>>>>>>>>> + crm_update_peer_expected(__FUNCTION__, peer, CRMD_JOINSTATE_DOWN);
>>>>>>>>>> +}
>>>>>>>>>>
>>>>>>>>>> On 16 Jan 2014, at 7:24 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>> 16.01.2014, 01:30, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>> On 16 Jan 2014, at 12:41 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>> 15.01.2014, 02:53, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>> On 15 Jan 2014, at 12:15 am, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>> 14.01.2014, 10:00, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>>>>>>>>>>>> 14.01.2014, 07:47, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>> Ok, here's what happens:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. node2 is lost
>>>>>>>>>>>>>>>>> 2. fencing of node2 starts
>>>>>>>>>>>>>>>>> 3. node2 reboots (and cluster starts)
>>>>>>>>>>>>>>>>> 4. node2 returns to the membership
>>>>>>>>>>>>>>>>> 5. node2 is marked as a cluster member
>>>>>>>>>>>>>>>>> 6. DC tries to bring it into the cluster, but needs to cancel the active transition first.
>>>>>>>>>>>>>>>>> Which is a problem since the node2 fencing operation is part of that
>>>>>>>>>>>>>>>>> 7. node2 is in a transition (pending) state until fencing passes or fails
>>>>>>>>>>>>>>>>> 8a. fencing fails: transition completes and the node joins the cluster
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That's in theory, except we automatically try again. Which isn't appropriate.
>>>>>>>>>>>>>>>>> This should be relatively easy to fix.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 8b. fencing passes: the node is incorrectly marked as offline
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This I have no idea how to fix yet.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On another note, it doesn't look like this agent works at all.
>>>>>>>>>>>>>>>>> The node has been back online for a long time and the agent is still timing out after 10 minutes.
>>>>>>>>>>>>>>>>> So "Once the script makes sure that the victim will rebooted and again available via ssh - it exit with 0." does not seem true.
>>>>>>>>>>>>>>>> Damn. Looks like you're right. At some point I broke my agent and did not notice. Go figure.
>>>>>>>>>>>>>>> I repaired my agent: after sending the reboot it was waiting on STDIN.
>>>>>>>>>>>>>>> The "normal" behavior is back: the node hangs in "pending" until I manually send a reboot. :)
>>>>>>>>>>>>>> Right. Now you're in case 8b.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you try this patch: http://paste.fedoraproject.org/68450/38973966
>>>>>>>>>>>>> I spent the whole day on experiments.
>>>>>>>>>>>>> Here is what happens:
>>>>>>>>>>>>> 1. Built the cluster.
>>>>>>>>>>>>> 2. On node-2, sent signal (-4) and killed corosync.
>>>>>>>>>>>>> 3. From node-1 (the DC), stonith sent a reboot.
>>>>>>>>>>>>> 4. The node rebooted and the resources started.
>>>>>>>>>>>>> 5. Again: on node-2, sent signal (-4) and killed corosync.
>>>>>>>>>>>>> 6. Again: from node-1 (the DC), stonith sent a reboot.
>>>>>>>>>>>>> 7. Node-2 rebooted and hangs in "pending".
>>>>>>>>>>>>> 8. Waited and waited..... then rebooted it manually.
>>>>>>>>>>>>> 9. Node-2 rebooted and the resources started.
>>>>>>>>>>>>> 10. GOTO step 2
>>>>>>>>>>>> Logs?
>>>>>>>>>>> Yesterday I sent a separate message explaining why I had not attached the logs.
>>>>>>>>>>> Please read it; it contains a few more questions.
>>>>>>>>>>> Today it began to hang again and continues along the same cycle.
>>>>>>>>>>> Logs here http://send2me.ru/crmrep2.tar.bz2
>>>>>>>>>>>>>>> New logs: http://send2me.ru/crmrep1.tar.bz2
>>>>>>>>>>>>>>>>> On 14 Jan 2014, at 1:19 pm, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>>>>>>>>>>>>>>>> Apart from anything else, your timeout needs to be bigger:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jan 13 12:21:36 [17223] dev-cluster2-node1.unix.tensor.ru stonith-ng: ( commands.c:1321 ) error: log_operation: Operation 'reboot' [11331] (call 2 from crmd.17227) for host 'dev-cluster2-node2.unix.tensor.ru' with device 'st1' returned: -62 (Timer expired)
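For context on why the timeout matters: a fence agent that waits for the victim to come back has to finish inside stonith-timeout, otherwise the operation is reported as timed out, as in the log line above. A minimal sketch of such a bounded wait follows. It is not the sshbykey agent (its source is not shown in this thread); the function name, the probe command and the numbers are hypothetical.

```shell
#!/bin/sh
# Bounded wait: run a probe command until it succeeds or a deadline passes.
# Hypothetical helper for illustration only.
#   $1 = probe command (exit status 0 means "host is reachable again")
#   $2 = deadline in seconds
#   $3 = seconds to sleep between probes (default 1)
wait_for_host() {
    probe=$1
    deadline_s=$2
    interval_s=${3:-1}
    waited=0
    while [ "$waited" -le "$deadline_s" ]; do
        if $probe; then
            return 0            # victim answered: report success
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
    return 1                    # deadline passed: report failure
}

# "true" and "false" stand in for a real reachability check.
wait_for_host true 5 && echo "victim is back"
wait_for_host false 0 || echo "gave up waiting"
```

The agent's own deadline would need to stay below the cluster's stonith-timeout (600s in the configuration quoted in this thread) so that the agent, and not the cluster, decides the outcome.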
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 14 Jan 2014, at 7:18 am, Andrew Beekhof <andrew at beekhof.net> wrote:
>>>>>>>>>>>>>>>>>>> On 13 Jan 2014, at 8:31 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>> 13.01.2014, 02:51, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>> On 10 Jan 2014, at 9:55 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 14:31, "Andrey Groshev" <greenx at yandex.ru>:
>>>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 14:01, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>>> On 10 Jan 2014, at 5:03 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> 10.01.2014, 05:29, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>>>>> On 9 Jan 2014, at 11:11 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> 08.01.2014, 06:22, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 29 Nov 2013, at 7:17 pm, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, ALL.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm still trying to cope with the fact that after being fenced, the node hangs in "pending".
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please define "pending". Where did you see this?
>>>>>>>>>>>>>>>>>>>>>>>>>>> In crm_mon:
>>>>>>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>>>>>>> Node dev-cluster2-node2 (172793105): pending
>>>>>>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> The experiment was like this:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Four nodes in cluster.
>>>>>>>>>>>>>>>>>>>>>>>>>>> On one of them, kill corosync or pacemakerd (signal 4, 6 or 11).
>>>>>>>>>>>>>>>>>>>>>>>>>>> After that, the remaining nodes keep rebooting it under various pretexts: "softly whistling", "fly low", "not a cluster member!" ...
>>>>>>>>>>>>>>>>>>>>>>>>>>> Then "Too many failures ...." appeared in the log.
>>>>>>>>>>>>>>>>>>>>>>>>>>> All this time the status in crm_mon is "pending".
>>>>>>>>>>>>>>>>>>>>>>>>>>> Depending on the wind direction, it changes to "UNCLEAN".
>>>>>>>>>>>>>>>>>>>>>>>>>>> A lot of time has passed and I cannot describe the behavior accurately...
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Now I am in the following state:
>>>>>>>>>>>>>>>>>>>>>>>>>>> I tried to localize the problem, and this is where I ended up.
>>>>>>>>>>>>>>>>>>>>>>>>>>> I set a large value in the property stonith-timeout="600s".
>>>>>>>>>>>>>>>>>>>>>>>>>>> And got the following behavior:
>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. pkill -4 corosync
>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. The node with the DC calls my fence agent "sshbykey".
>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. It sends a reboot to the victim and waits until it comes back to life.
>>>>>>>>>>>>>>>>>>>>>>>>>> Hmmm.... what version of pacemaker?
>>>>>>>>>>>>>>>>>>>>>>>>>> This sounds like a timing issue that we fixed a while back
>>>>>>>>>>>>>>>>>>>>>>>>> It was version 1.1.11 from December 3.
>>>>>>>>>>>>>>>>>>>>>>>>> I will now do a full update and retest.
>>>>>>>>>>>>>>>>>>>>>>>> That should be recent enough. Can you create a crm_report the next time you reproduce?
>>>>>>>>>>>>>>>>>>>>>>> Yes, of course. There will be a small delay, though.... :)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ......
>>>>>>>>>>>>>>>>>>>>>>> cc1: warnings being treated as errors
>>>>>>>>>>>>>>>>>>>>>>> upstart.c: In function ‘upstart_job_property’:
>>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: implicit declaration of function ‘g_variant_lookup_value’
>>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: nested extern declaration of ‘g_variant_lookup_value’
>>>>>>>>>>>>>>>>>>>>>>> upstart.c:264: error: assignment makes pointer from integer without a cast
>>>>>>>>>>>>>>>>>>>>>>> gmake[2]: *** [libcrmservice_la-upstart.lo] Error 1
>>>>>>>>>>>>>>>>>>>>>>> gmake[2]: Leaving directory `/root/ha/pacemaker/lib/services'
>>>>>>>>>>>>>>>>>>>>>>> make[1]: *** [all-recursive] Error 1
>>>>>>>>>>>>>>>>>>>>>>> make[1]: Leaving directory `/root/ha/pacemaker/lib'
>>>>>>>>>>>>>>>>>>>>>>> make: *** [core] Error 1
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm trying to solve this problem.
>>>>>>>>>>>>>>>>>>>>>> It is not going to be solved quickly...
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> https://developer.gnome.org/glib/2.28/glib-GVariant.html#g-variant-lookup-value
>>>>>>>>>>>>>>>>>>>>>> g_variant_lookup_value () Since 2.28
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # yum list installed glib2
>>>>>>>>>>>>>>>>>>>>>> Loaded plugins: fastestmirror, rhnplugin, security
>>>>>>>>>>>>>>>>>>>>>> This system is receiving updates from RHN Classic or Red Hat Satellite.
>>>>>>>>>>>>>>>>>>>>>> Loading mirror speeds from cached hostfile
>>>>>>>>>>>>>>>>>>>>>> Installed Packages
>>>>>>>>>>>>>>>>>>>>>> glib2.x86_64 2.26.1-3.el6 installed
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> # cat /etc/issue
>>>>>>>>>>>>>>>>>>>>>> CentOS release 6.5 (Final)
>>>>>>>>>>>>>>>>>>>>>> Kernel \r on an \m
>>>>>>>>>>>>>>>>>>>>> Can you try this patch?
>>>>>>>>>>>>>>>>>>>>> Upstart jobs won't work, but the code will compile
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> diff --git a/lib/services/upstart.c b/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>>> index 831e7cf..195c3a4 100644
>>>>>>>>>>>>>>>>>>>>> --- a/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>>> +++ b/lib/services/upstart.c
>>>>>>>>>>>>>>>>>>>>> @@ -231,12 +231,21 @@ upstart_job_exists(const char *name)
>>>>>>>>>>>>>>>>>>>>> static char *
>>>>>>>>>>>>>>>>>>>>> upstart_job_property(const char *obj, const gchar * iface, const char *name)
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>> + char *output = NULL;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +#if !GLIB_CHECK_VERSION(2,28,0)
>>>>>>>>>>>>>>>>>>>>> + static bool err = TRUE;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> + if(err) {
>>>>>>>>>>>>>>>>>>>>> + crm_err("This version of glib is too old to support upstart jobs");
>>>>>>>>>>>>>>>>>>>>> + err = FALSE;
>>>>>>>>>>>>>>>>>>>>> + }
>>>>>>>>>>>>>>>>>>>>> +#else
>>>>>>>>>>>>>>>>>>>>> GError *error = NULL;
>>>>>>>>>>>>>>>>>>>>> GDBusProxy *proxy;
>>>>>>>>>>>>>>>>>>>>> GVariant *asv = NULL;
>>>>>>>>>>>>>>>>>>>>> GVariant *value = NULL;
>>>>>>>>>>>>>>>>>>>>> GVariant *_ret = NULL;
>>>>>>>>>>>>>>>>>>>>> - char *output = NULL;
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> crm_info("Calling GetAll on %s", obj);
>>>>>>>>>>>>>>>>>>>>> proxy = get_proxy(obj, BUS_PROPERTY_IFACE);
>>>>>>>>>>>>>>>>>>>>> @@ -272,6 +281,7 @@ upstart_job_property(const char *obj, const gchar * iface, const char *name)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> g_object_unref(proxy);
>>>>>>>>>>>>>>>>>>>>> g_variant_unref(_ret);
>>>>>>>>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>>>>>>>>> return output;
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>> OK :) I patched the source.
>>>>>>>>>>>>>>>>>>>> Ran "make rc": the same error.
>>>>>>>>>>>>>>>>>>> Because it's not building your local changes
>>>>>>>>>>>>>>>>>>>> Made a new copy via "fetch": the same error.
>>>>>>>>>>>>>>>>>>>> It seems that if ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz does not exist, it is created.
>>>>>>>>>>>>>>>>>>>> Otherwise the existing archive is used.
>>>>>>>>>>>>>>>>>>>> Trimmed log .......
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # make rc
>>>>>>>>>>>>>>>>>>>> make TAG=Pacemaker-1.1.11-rc3 rpm
>>>>>>>>>>>>>>>>>>>> make[1]: Entering directory `/root/ha/pacemaker'
>>>>>>>>>>>>>>>>>>>> rm -f pacemaker-dirty.tar.* pacemaker-tip.tar.* pacemaker-HEAD.tar.*
>>>>>>>>>>>>>>>>>>>> if [ ! -f ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz ]; then \
>>>>>>>>>>>>>>>>>>>> rm -f pacemaker.tar.*; \
>>>>>>>>>>>>>>>>>>>> if [ Pacemaker-1.1.11-rc3 = dirty ]; then \
>>>>>>>>>>>>>>>>>>>> git commit -m "DO-NOT-PUSH" -a; \
>>>>>>>>>>>>>>>>>>>> git archive --prefix=ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3/ HEAD | gzip > ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>>> git reset --mixed HEAD^; \
>>>>>>>>>>>>>>>>>>>> else \
>>>>>>>>>>>>>>>>>>>> git archive --prefix=ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3/ Pacemaker-1.1.11-rc3 | gzip > ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>>> fi; \
>>>>>>>>>>>>>>>>>>>> echo `date`: Rebuilt ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>>> else \
>>>>>>>>>>>>>>>>>>>> echo `date`: Using existing tarball: ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz; \
>>>>>>>>>>>>>>>>>>>> fi
>>>>>>>>>>>>>>>>>>>> Mon Jan 13 13:23:21 MSK 2014: Using existing tarball: ClusterLabs-pacemaker-Pacemaker-1.1.11-rc3.tar.gz
>>>>>>>>>>>>>>>>>>>> .......
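Reading only the quoted make output above, the rule packages one of three things, and uncommitted local edits are included in just one of them. The sketch below restates that decision; the function name is hypothetical and the rest of the Makefile is not shown here, so treat it as an interpretation of the log rather than a description of the build system.

```shell
#!/bin/sh
# Which source ends up in the tarball, as inferred from the quoted rule.
# Hypothetical helper for illustration only.
#   $1 = TAG value passed to make
#   $2 = path of the tarball the rule checks for
tarball_source() {
    tag=$1
    tarball=$2
    if [ -f "$tarball" ]; then
        echo "existing tarball"      # reused as-is, nothing is rebuilt
    elif [ "$tag" = "dirty" ]; then
        echo "working tree"          # local changes are committed, then archived
    else
        echo "tag $tag"              # git archive of the tag, local edits ignored
    fi
}

tmp=$(mktemp -d)
tarball_source Pacemaker-1.1.11-rc3 "$tmp/missing.tar.gz"
tarball_source dirty "$tmp/missing.tar.gz"
touch "$tmp/present.tar.gz"
tarball_source dirty "$tmp/present.tar.gz"
rm -rf "$tmp"
```

This is consistent with "make rc" showing the same error after patching: with TAG=Pacemaker-1.1.11-rc3, both the existing tarball and a freshly archived tag lack the local patch.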
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Well, "make rpm" built the rpms and I created the cluster.
>>>>>>>>>>>>>>>>>>>> I ran the same tests and confirmed the behavior.
>>>>>>>>>>>>>>>>>>>> crm_report log here - http://send2me.ru/crmrep.tar.bz2
>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>>>>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>>>>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>>>>>>>>>>> Bugs: http://bugs.clusterlabs.org