[ClusterLabs] Cluster resources migration from CMAN to Pacemaker
jaspal singla
jaspal.singla at gmail.com
Tue Feb 9 10:04:11 UTC 2016
Hi Jan/Digiman,
Thanks for your replies. Based on your inputs, I managed to configure these
values and the results were fine, but I still have a few doubts for which I
would like to seek your help. I also tried to dig into some of the issues on
the internet, but due to the lack of CMAN -> Pacemaker migration documentation
I couldn't find much.

I have configured the 8 scripts under one resource group, as you recommended,
but 2 of those scripts are not being executed by the cluster itself. When I
run the same scripts manually they work, but through Pacemaker they don't.
For example, this is the output of the crm_mon command:
###############################################################################################################
Last updated: Mon Feb 8 17:30:57 2016    Last change: Mon Feb 8 17:03:29 2016 by hacluster via crmd on ha1-103.cisco.com
Stack: corosync
Current DC: ha1-103.cisco.com (version 1.1.13-10.el7-44eb2dd) - partition with quorum
1 node and 10 resources configured

Online: [ ha1-103.cisco.com ]

 Resource Group: ctm_service
     FSCheck            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/FsCheckAgent.py):         Started ha1-103.cisco.com
     NTW_IF             (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/NtwIFAgent.py):           Started ha1-103.cisco.com
     CTM_RSYNC          (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/RsyncAgent.py):           Started ha1-103.cisco.com
     REPL_IF            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_IFAgent.py):          Started ha1-103.cisco.com
     ORACLE_REPLICATOR  (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_ReplicatorAgent.py):  Started ha1-103.cisco.com
     CTM_SID            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/OracleAgent.py):          Started ha1-103.cisco.com
     CTM_SRV            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py):             Stopped
     CTM_APACHE         (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ApacheAgent.py):          Stopped
 Resource Group: ctm_heartbeat
     CTM_HEARTBEAT      (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/HeartBeat.py):            Started ha1-103.cisco.com
 Resource Group: ctm_monitoring
     FLASHBACK          (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/FlashBackMonitor.py):     Started ha1-103.cisco.com

Failed Actions:
* CTM_SRV_start_0 on ha1-103.cisco.com 'unknown error' (1): call=577, status=complete, exitreason='none',
    last-rc-change='Mon Feb 8 17:12:33 2016', queued=0ms, exec=74ms
#################################################################################################################
CTM_SRV and CTM_APACHE are in the Stopped state. Either these services are not
being executed by the cluster at all, or the cluster is failing them somehow; I
am not sure which. When I execute the CTM_SRV script manually, it runs without
issues.
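As a side check, my understanding is that the start action can also be run
through Pacemaker's own tooling with verbose agent output, which should show
what the cluster sees when it executes the script (a sketch; the --full flag
may depend on the pcs version):

# pcs resource debug-start CTM_SRV --full
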
-> To execute this script manually, I ran the command below:
# /cisco/PrimeOpticalServer/HA/bin/OracleAgent.py status
Output:
_________________________________________________________________________________________________________________
2016-02-08 17:48:41,888 INFO MainThread CtmAgent
=========================================================
Executing preliminary checks...
Check Oracle and Listener availability
=> Oracle and listener are up.
Migration check
=> Migration check completed successfully.
Check the status of the DB archivelog
=> DB archivelog check completed successfully.
Check of Oracle scheduler...
=> Check of Oracle scheduler completed successfully
Initializing database tables
=> Database tables initialized successfully.
Install in cache the store procedure
=> Installing store procedures completed successfully
Gather the oracle system stats
=> Oracle stats completed successfully
Preliminary checks completed.
=========================================================
Starting base services...
Starting Zookeeper...
JMX enabled by default
Using config: /opt/CiscoTransportManagerServer/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Retrieving name service port...
Starting name service...
Base services started.
=========================================================
Starting Prime Optical services...
Prime Optical services started.
=========================================================
Cisco Prime Optical Server Version: 10.5.0.0.214 / Oracle Embedded
-------------------------------------------------------------------------------------
USER PID %CPU %MEM START TIME PROCESS
-------------------------------------------------------------------------------------
root 16282 0.0 0.0 17:48:11 0:00 CTM Server
root 16308 0.0 0.1 17:48:16 0:00 CTM Server
root 16172 0.1 0.1 17:48:10 0:00 NameService
root 16701 24.8 7.5 17:48:27 0:27 TOMCAT
root 16104 0.2 0.2 17:48:09 0:00 Zookeeper
-------------------------------------------------------------------------------------
For startup details, see:
/opt/CiscoTransportManagerServer/log/ctms-start.log
2016-02-08 17:48:41,888 WARNING MainThread CtmAgent CTM restartd at attempt
1
_________________________________________________________________________________________________________________
The script gets executed and I can see that the service has started, but the
crm_mon output still shows CTM_SRV in the Stopped state. Why?
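My assumption is that crm_mon only reflects what the resource's monitor/status
action reports back to the cluster, and for an LSB script that action has to
exit 0 while the service is running and 3 when it is stopped. A quick check
along those lines (path assumed to be the same as for the other agents):

# /cisco/PrimeOpticalServer/HA/bin/CtmAgent.py status; echo "exit code: $?"
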
-> When I try to start the resource through the pcs command, I get the errors
below in the logs. I tried to debug them but couldn't manage to rectify the
issue. I would really appreciate any help in getting this resolved.
# pcs resource enable CTM_SRV
Output:
_________________________________________________________________________________________________________________
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
native_print: CTM_SRV
(lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): FAILED
ha1-103.cisco.com
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
get_failcount_full: CTM_SRV has failed INFINITY times on
ha1-103.cisco.com
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: warning:
common_apply_stickiness: Forcing CTM_SRV away from ha1-103.cisco.com
after 1000000 failures (max=1000000)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: FSCheck: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: NTW_IF: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_RSYNC: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: REPL_IF: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: ORACLE_REPLICATOR: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SID: Rolling back scores from CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SRV: Rolling back scores from CTM_APACHE
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: All nodes for resource CTM_SRV are unavailable,
unclean or shutting down (ha1-103.cisco.com: 1, -1000000)
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Could not allocate a node for CTM_SRV
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Processing CTM_SRV_stop_0
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: info:
native_color: Resource CTM_SRV cannot run anywhere
Feb 08 17:12:42 [12877] ha1-103.cisco.com pengine: notice: LogActions:
Stop CTM_SRV (ha1-103.cisco.com)
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: notice:
te_rsc_command: Initiating action 7: stop CTM_SRV_stop_0 on
ha1-103.cisco.com (local)
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: debug:
do_lrm_rsc_op: Stopped 0 recurring operations in preparation for
CTM_SRV_stop_0
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: info:
do_lrm_rsc_op: Performing key=7:177:0:c1f19bee-9119-48fa-9ebd-6ffeaf24e112
op=CTM_SRV_stop_0
Feb 08 17:12:42 [12875] ha1-103.cisco.com lrmd: info:
log_execute: executing - rsc:CTM_SRV action:stop call_id:578
Feb 08 17:12:42 [12875] ha1-103.cisco.com lrmd: debug:
operation_finished: CTM_SRV_stop_0:498 - exited with rc=0
Feb 08 17:12:42 [12875] ha1-103.cisco.com lrmd: debug:
operation_finished: CTM_SRV_stop_0:498:stderr [ -- empty -- ]
Feb 08 17:12:42 [12875] ha1-103.cisco.com lrmd: debug:
operation_finished: CTM_SRV_stop_0:498:stdout [ 0 ]
Feb 08 17:12:42 [12875] ha1-103.cisco.com lrmd: info:
log_finished: finished - rsc:CTM_SRV action:stop call_id:578 pid:498
exit-code:0 exec-time:142ms queue-time:0ms
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: debug:
create_operation_update: do_update_resource: Updating resource
CTM_SRV after stop op complete (interval=0)
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: notice:
process_lrm_event: Operation CTM_SRV_stop_0: ok (node=ha1-103.cisco.com,
call=578, rc=0, cib-update=901, confirmed=true)
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: debug:
process_lrm_event: ha1-103.cisco.com-CTM_SRV_stop_0:578 [ 0\n ]
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: debug:
update_history_cache: Updating history for 'CTM_SRV' with stop op
Feb 08 17:12:42 [12873] ha1-103.cisco.com cib: info:
cib_perform_op: +
/cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='CTM_SRV']/lrm_rsc_op[@id='CTM_SRV_last_0']:
@operation_key=CTM_
SRV_stop_0, @operation=stop,
@transition-key=7:177:0:c1f19bee-9119-48fa-9ebd-6ffeaf24e112,
@transition-magic=0:0;7:177:0:c1f19bee-9119-48fa-9ebd-6ffeaf24e112,
@call-id=578, @rc-code=0, @last-run=1454969562, @last-rc-change=1454969562,
@exec-time=142
Feb 08 17:12:42 [12878] ha1-103.cisco.com crmd: info:
match_graph_event: Action CTM_SRV_stop_0 (7) confirmed on
ha1-103.cisco.com (rc=0)
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
native_print: CTM_SRV
(lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): Stopped
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
get_failcount_full: CTM_SRV has failed INFINITY times on
ha1-103.cisco.com
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: warning:
common_apply_stickiness: Forcing CTM_SRV away from ha1-103.cisco.com
after 1000000 failures (max=1000000)
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: FSCheck: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: NTW_IF: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_RSYNC: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: REPL_IF: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: ORACLE_REPLICATOR: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SID: Rolling back scores from CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SRV: Rolling back scores from CTM_APACHE
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: All nodes for resource CTM_SRV are unavailable,
unclean or shutting down (ha1-103.cisco.com: 1, -1000000)
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Could not allocate a node for CTM_SRV
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info:
native_color: Resource CTM_SRV cannot run anywhere
Feb 08 17:27:42 [12877] ha1-103.cisco.com pengine: info: LogActions:
Leave CTM_SRV (Stopped)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
native_print: CTM_SRV
(lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): Stopped
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
get_failcount_full: CTM_SRV has failed INFINITY times on
ha1-103.cisco.com
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: warning:
common_apply_stickiness: Forcing CTM_SRV away from ha1-103.cisco.com
after 1000000 failures (max=1000000)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: FSCheck: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: NTW_IF: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_RSYNC: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: REPL_IF: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: ORACLE_REPLICATOR: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SID: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SRV: Rolling back scores from CTM_APACHE
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: All nodes for resource CTM_SRV are unavailable,
unclean or shutting down (ha1-103.cisco.com: 1, -1000000)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Could not allocate a node for CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
native_color: Resource CTM_SRV cannot run anywhere
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info: LogActions:
Leave CTM_SRV (Stopped)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
native_print: CTM_SRV
(lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): Stopped
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
get_failcount_full: CTM_SRV has failed INFINITY times on
ha1-103.cisco.com
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: warning:
common_apply_stickiness: Forcing CTM_SRV away from ha1-103.cisco.com
after 1000000 failures (max=1000000)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: FSCheck: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: NTW_IF: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_RSYNC: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: REPL_IF: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: ORACLE_REPLICATOR: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SID: Rolling back scores from CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SRV: Rolling back scores from CTM_APACHE
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: All nodes for resource CTM_SRV are unavailable,
unclean or shutting down (ha1-103.cisco.com: 1, -1000000)
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Could not allocate a node for CTM_SRV
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info:
native_color: Resource CTM_SRV cannot run anywhere
Feb 08 17:38:00 [12877] ha1-103.cisco.com pengine: info: LogActions:
Leave CTM_SRV (Stopped)
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: debug:
determine_op_status: CTM_SRV_start_0 on ha1-103.cisco.com returned
'unknown error' (1) instead of the expected value: 'ok' (0)
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: warning:
unpack_rsc_op_failure: Processing failed op start for CTM_SRV on
ha1-103.cisco.com: unknown error (1)
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info:
native_print: CTM_SRV
(lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): Stopped
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info:
get_failcount_full: CTM_SRV has failed INFINITY times on
ha1-103.cisco.com
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: warning:
common_apply_stickiness: Forcing CTM_SRV away from ha1-103.cisco.com
after 1000000 failures (max=1000000)
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SID: Rolling back scores from CTM_SRV
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info:
rsc_merge_weights: CTM_SRV: Rolling back scores from CTM_APACHE
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: All nodes for resource CTM_SRV are unavailable,
unclean or shutting down (ha1-103.cisco.com: 1, -1000000)
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: debug:
native_assign_node: Could not allocate a node for CTM_SRV
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info:
native_color: Resource CTM_SRV cannot run anywhere
Feb 08 17:38:20 [12877] ha1-103.cisco.com pengine: info: LogActions:
Leave CTM_SRV (Stopped)
________________________________________________________________________________________________________________________
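One more thing I noticed in the log above: the fail-count for CTM_SRV is
already at INFINITY, which is what keeps forcing it away from
ha1-103.cisco.com. My understanding is that the failure history has to be
cleared before the cluster will attempt another start, roughly like this
(sketch only):

# pcs resource cleanup CTM_SRV
# pcs resource failcount show CTM_SRV
# pcs resource enable CTM_SRV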
Thanks
Jaspal
------------------------------
>
> Message: 3
> Date: Sat, 30 Jan 2016 03:48:03 +0100
> From: Jan Pokorný <jpokorny at redhat.com>
> To: users at clusterlabs.org
> Subject: Re: [ClusterLabs] Cluster resources migration from CMAN to
> Pacemaker
> Message-ID: <20160130024803.GA27849 at redhat.com>
> Content-Type: text/plain; charset="utf-8"
>
> On 27/01/16 19:41 +0100, Jan Pokorný wrote:
> > On 27/01/16 11:04 -0600, Ken Gaillot wrote:
> >> On 01/27/2016 02:34 AM, jaspal singla wrote:
> >>> 1) In CMAN, there was meta attribute - autostart=0 (This parameter
> disables
> >>> the start of all services when RGManager starts). Is there any way for
> such
> >>> behavior in Pacemaker?
> >
> > Please be more careful about the descriptions; autostart=0 specified
> > at the given resource group ("service" or "vm" tag) means just not to
> > start anything contained in this very one automatically (also upon
> > new resources being defined, IIUIC), definitely not "all services".
> >
> > [...]
> >
> >> I don't think there's any exact replacement for autostart in pacemaker.
> >> Probably the closest is to set target-role=Stopped before stopping the
> >> cluster, and set target-role=Started when services are desired to be
> >> started.
>
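> (In pcs terms that would be roughly the following -- a sketch only, using the
> ctm_service group name from the command sequence further below:
>
>   pcs resource meta SERVICE-ctm_service-GROUP target-role=Stopped
>   pcs resource meta SERVICE-ctm_service-GROUP target-role=Started
> )
>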
> Beside is-managed=false (as currently used in clufter), I also looked
> at downright disabling "start" action, but this turned out to be a naive
> approach caused by unclear documentation.
>
> Pushing for a bit more clarity (hopefully):
> https://github.com/ClusterLabs/pacemaker/pull/905
>
> >>> 2) Please put some alternatives to exclusive=0 and
> __independent_subtree?
> >>> what we have in Pacemaker instead of these?
>
> (exclusive property discussed in the other subthread; as a recap,
> no extra effort is needed to achieve exclusive=0, exclusive=1 is
> currently a show stopper in clufter as neither approach is versatile
> enough)
>
> > For __independent_subtree, each component must be a separate pacemaker
> > resource, and the constraints between them would depend on exactly what
> > you were trying to accomplish. The key concepts here are ordering
> > constraints, colocation constraints, kind=Mandatory/Optional (for
> > ordering constraints), and ordered sets.
>
> Current approach in clufter as of the next branch:
> - __independent_subtree=1 -> do nothing special (hardly can be
> improved?)
> - __independent_subtree=2 -> for that very resource, set operations
> as follows:
> monitor (interval=60s) on-fail=ignore
> stop interval=0 on-fail=stop
>
> Groups carrying such resources are not unrolled into primitives plus
> constraints, as the above might suggest (also default kind=Mandatory
> for underlying order constraints should fit well).
>
> Please holler if this is not sound.
>
>
> So when put together with some other changes/fixes, current
> suggested/informative sequence of pcs commands goes like this:
>
> pcs cluster auth ha1-105.test.com
> pcs cluster setup --start --name HA1-105_CLUSTER ha1-105.test.com \
> --consensus 12000 --token 10000 --join 60
> sleep 60
> pcs cluster cib tmp-cib.xml --config
> pcs -f tmp-cib.xml property set stonith-enabled=false
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-FSCheck \
> lsb:../../..//data/Product/HA/bin/FsCheckAgent.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-NTW_IF \
> lsb:../../..//data/Product/HA/bin/NtwIFAgent.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-CTM_RSYNC \
> lsb:../../..//data/Product/HA/bin/RsyncAgent.py \
> op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-REPL_IF \
> lsb:../../..//data/Product/HA/bin/ODG_IFAgent.py \
> op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-ORACLE_REPLICATOR \
> lsb:../../..//data/Product/HA/bin/ODG_ReplicatorAgent.py \
> op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-CTM_SID \
> lsb:../../..//data/Product/HA/bin/OracleAgent.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-CTM_SRV \
> lsb:../../..//data/Product/HA/bin/CtmAgent.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-CTM_APACHE \
> lsb:../../..//data/Product/HA/bin/ApacheAgent.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-CTM_HEARTBEAT \
> lsb:../../..//data/Product/HA/bin/HeartBeat.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource create RESOURCE-script-FLASHBACK \
> lsb:../../..//data/Product/HA/bin/FlashBackMonitor.py \
> op monitor interval=30s
> pcs -f tmp-cib.xml \
> resource group add SERVICE-ctm_service-GROUP RESOURCE-script-FSCheck \
> RESOURCE-script-NTW_IF RESOURCE-script-CTM_RSYNC \
> RESOURCE-script-REPL_IF RESOURCE-script-ORACLE_REPLICATOR \
> RESOURCE-script-CTM_SID RESOURCE-script-CTM_SRV \
> RESOURCE-script-CTM_APACHE
> pcs -f tmp-cib.xml resource \
> meta SERVICE-ctm_service-GROUP is-managed=false
> pcs -f tmp-cib.xml \
> resource group add SERVICE-ctm_heartbeat-GROUP \
> RESOURCE-script-CTM_HEARTBEAT
> pcs -f tmp-cib.xml resource \
> meta SERVICE-ctm_heartbeat-GROUP migration-threshold=3 \
> failure-timeout=900
> pcs -f tmp-cib.xml \
> resource group add SERVICE-ctm_monitoring-GROUP \
> RESOURCE-script-FLASHBACK
> pcs -f tmp-cib.xml resource \
> meta SERVICE-ctm_monitoring-GROUP migration-threshold=3 \
> failure-timeout=900
> pcs cluster cib-push tmp-cib.xml --config
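>
> (The staged CIB can also be sanity-checked before the final push, e.g. with
> "crm_verify --xml-file tmp-cib.xml" -- just a side note, not something
> clufter emits itself.)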
>
>
> Any suggestions welcome...
>
> --
> Jan (Poki)