[ClusterLabs] SAP HANA resource start problem
Muhammad Sharfuddin
M.Sharfuddin at nds.com.pk
Fri May 12 06:30:35 EDT 2017
is there a bug in SAP HANA resource ? crm_mon shows that cluster started
the resource and keep the HANA resource in slave state, while in actual
cluster doesn't start the resources, we found following events in the logs:
2017-05-12T01:01:55.194469+05:00 saphdbtst1 crmd[26357]: notice:
Initiating start operation rsc_SAPHana_TST_HDB00_start_0 locally on
saphdbtst1
2017-05-12T01:01:55.195425+05:00 saphdbtst1 lrmd[26354]: notice:
executing - rsc:rsc_SAPHana_TST_HDB00 action:start call_id:26
2017-05-12T01:01:55.487895+05:00 saphdbtst1 su: (to tstadm) root on none
2017-05-12T01:01:55.496584+05:00 saphdbtst1 systemd[1]: Created slice
User Slice of tstadm.
2017-05-12T01:01:55.519314+05:00 saphdbtst1 systemd[1]: Starting User
Manager for UID 1001...
2017-05-12T01:01:55.524831+05:00 saphdbtst1 systemd[1]: Started Session
c11387 of user tstadm.
2017-05-12T01:01:55.558886+05:00 saphdbtst1 systemd[27728]: Reached
target Paths.
2017-05-12T01:01:55.559365+05:00 saphdbtst1 systemd[27728]: Reached
target Timers.
2017-05-12T01:01:55.559740+05:00 saphdbtst1 systemd[27728]: Reached
target Sockets.
2017-05-12T01:01:55.560092+05:00 saphdbtst1 systemd[27728]: Reached
target Basic System.
2017-05-12T01:01:55.560437+05:00 saphdbtst1 systemd[27728]: Reached
target Default.
2017-05-12T01:01:55.560786+05:00 saphdbtst1 systemd[27728]: Startup
finished in 31ms.
2017-05-12T01:01:55.561336+05:00 saphdbtst1 systemd[1]: Started User
Manager for UID 1001.
2017-05-12T01:01:55.961270+05:00 saphdbtst1 systemd[1]: Stopping User
Manager for UID 1001...
2017-05-12T01:01:55.964003+05:00 saphdbtst1 systemd[27728]: Reached
target Shutdown.
2017-05-12T01:01:55.983723+05:00 saphdbtst1 systemd[27728]: Stopped
target Default.
2017-05-12T01:01:55.984039+05:00 saphdbtst1 systemd[27728]: Starting
Exit the Session...
2017-05-12T01:01:55.984333+05:00 saphdbtst1 systemd[27728]: Stopped
target Basic System.
2017-05-12T01:01:55.984615+05:00 saphdbtst1 systemd[27728]: Stopped
target Timers.
2017-05-12T01:01:55.984895+05:00 saphdbtst1 systemd[27728]: Stopped
target Sockets.
2017-05-12T01:01:55.985169+05:00 saphdbtst1 systemd[27728]: Stopped
target Paths.
2017-05-12T01:01:55.990544+05:00 saphdbtst1
SAPHana(rsc_SAPHana_TST_HDB00)[27645]: INFO: RA ==== begin action
start_clone (0.152.17) ====
2017-05-12T01:01:55.995850+05:00 saphdbtst1 systemd[27728]: Received
SIGRTMIN+24 from PID 27878 (kill).
2017-05-12T01:01:55.999861+05:00 saphdbtst1 systemd[1]: Stopped User
Manager for UID 1001.
2017-05-12T01:01:56.000460+05:00 saphdbtst1 systemd[1]: Removed slice
User Slice of tstadm.
2017-05-12T01:01:56.033425+05:00 saphdbtst1 crmd[26357]: notice:
Transition aborted by status-180881403-hana_tst_clone_state doing create
hana_tst_clone_state=DEMOTED: Transient attribute change
2017-05-12T01:01:56.044385+05:00 saphdbtst1 su: (to tstadm) root on none
2017-05-12T01:01:56.052758+05:00 saphdbtst1 systemd[1]: Created slice
User Slice of tstadm.
2017-05-12T01:01:56.075366+05:00 saphdbtst1 systemd[1]: Starting User
Manager for UID 1001...
2017-05-12T01:01:56.082157+05:00 saphdbtst1 systemd[1]: Started Session
c11388 of user tstadm.
2017-05-12T01:01:56.111651+05:00 saphdbtst1 systemd[27928]: Reached
target Sockets.
2017-05-12T01:01:56.112031+05:00 saphdbtst1 systemd[27928]: Reached
target Paths.
2017-05-12T01:01:56.112345+05:00 saphdbtst1 systemd[27928]: Reached
target Timers.
2017-05-12T01:01:56.112640+05:00 saphdbtst1 systemd[27928]: Reached
target Basic System.
2017-05-12T01:01:56.112936+05:00 saphdbtst1 systemd[27928]: Reached
target Default.
2017-05-12T01:01:56.113207+05:00 saphdbtst1 systemd[27928]: Startup
finished in 28ms.
2017-05-12T01:01:56.113480+05:00 saphdbtst1 systemd[1]: Started User
Manager for UID 1001.
2017-05-12T01:01:59.625804+05:00 saphdbtst1 systemd[1]: Stopping User
Manager for UID 1001...
2017-05-12T01:01:59.628078+05:00 saphdbtst1 systemd[27928]: Reached
target Shutdown.
2017-05-12T01:01:59.628471+05:00 saphdbtst1 systemd[27928]: Stopped
target Default.
2017-05-12T01:01:59.628958+05:00 saphdbtst1 systemd[27928]: Stopped
target Basic System.
2017-05-12T01:01:59.643490+05:00 saphdbtst1 systemd[27928]: Stopped
target Sockets.
2017-05-12T01:01:59.643798+05:00 saphdbtst1 systemd[27928]: Stopped
target Timers.
2017-05-12T01:01:59.644092+05:00 saphdbtst1 systemd[27928]: Stopped
target Paths.
2017-05-12T01:01:59.644370+05:00 saphdbtst1 systemd[27928]: Starting
Exit the Session...
2017-05-12T01:01:59.653728+05:00 saphdbtst1 systemd[27928]: Received
SIGRTMIN+24 from PID 28054 (kill).
2017-05-12T01:01:59.657669+05:00 saphdbtst1 systemd[1]: Stopped User
Manager for UID 1001.
2017-05-12T01:01:59.658274+05:00 saphdbtst1 systemd[1]: Removed slice
User Slice of tstadm.
2017-05-12T01:01:59.777090+05:00 saphdbtst1 su: (to tstadm) root on none
2017-05-12T01:01:59.785074+05:00 saphdbtst1 systemd[1]: Created slice
User Slice of tstadm.
2017-05-12T01:01:59.815357+05:00 saphdbtst1 systemd[1]: Starting User
Manager for UID 1001...
2017-05-12T01:01:59.819877+05:00 saphdbtst1 systemd[1]: Started Session
c11389 of user tstadm.
2017-05-12T01:01:59.853494+05:00 saphdbtst1 systemd[28143]: Reached
target Sockets.
2017-05-12T01:01:59.853871+05:00 saphdbtst1 systemd[28143]: Reached
target Timers.
2017-05-12T01:01:59.854199+05:00 saphdbtst1 systemd[28143]: Reached
target Paths.
2017-05-12T01:01:59.854502+05:00 saphdbtst1 systemd[28143]: Reached
target Basic System.
2017-05-12T01:01:59.854805+05:00 saphdbtst1 systemd[28143]: Reached
target Default.
2017-05-12T01:01:59.855137+05:00 saphdbtst1 systemd[28143]: Startup
finished in 28ms.
2017-05-12T01:01:59.855425+05:00 saphdbtst1 systemd[1]: Started User
Manager for UID 1001.
2017-05-12T01:02:02.611336+05:00 saphdbtst1 systemd[1]: Stopping User
Manager for UID 1001...
2017-05-12T01:02:02.611752+05:00 saphdbtst1 systemd[28143]: Stopped
target Default.
2017-05-12T01:02:02.627462+05:00 saphdbtst1 systemd[28143]: Reached
target Shutdown.
2017-05-12T01:02:02.627769+05:00 saphdbtst1 systemd[28143]: Starting
Exit the Session...
2017-05-12T01:02:02.628067+05:00 saphdbtst1 systemd[28143]: Stopped
target Basic System.
2017-05-12T01:02:02.628505+05:00 saphdbtst1 systemd[28143]: Stopped
target Timers.
2017-05-12T01:02:02.629228+05:00 saphdbtst1 systemd[28143]: Stopped
target Sockets.
2017-05-12T01:02:02.629531+05:00 saphdbtst1 systemd[28143]: Stopped
target Paths.
2017-05-12T01:02:02.638313+05:00 saphdbtst1 systemd[28143]: Received
SIGRTMIN+24 from PID 28258 (kill).
2017-05-12T01:02:02.642638+05:00 saphdbtst1 systemd[1]: Stopped User
Manager for UID 1001.
2017-05-12T01:02:02.643555+05:00 saphdbtst1 systemd[1]: Removed slice
User Slice of tstadm.
2017-05-12T01:02:02.650676+05:00 saphdbtst1
SAPHana(rsc_SAPHana_TST_HDB00)[27645]: INFO: LPA: landcape: DOWN, LPA:
wait ==> keep waiting
2017-05-12T01:02:02.659956+05:00 saphdbtst1
SAPHana(rsc_SAPHana_TST_HDB00)[27645]: INFO: RA: landcape: DOWN, LPA:
wait ==> keep waiting
2017-05-12T01:02:02.765628+05:00 saphdbtst1
SAPHana(rsc_SAPHana_TST_HDB00)[27645]: INFO: RA ==== end action
start_clone with rc=0 (0.152.17) (7s)====
2017-05-12T01:02:02.769587+05:00 saphdbtst1 lrmd[26354]: notice:
rsc_SAPHana_TST_HDB00_start_0:27645:stderr [ Error: NIECONN_REFUSED
(Connection refused), NiRawConnect failed in plugin_fopen() ]
2017-05-12T01:02:02.769909+05:00 saphdbtst1 lrmd[26354]: notice:
rsc_SAPHana_TST_HDB00_start_0:27645:stderr [ Error: NIECONN_REFUSED
(Connection refused), NiRawConnect failed in plugin_fopen() ]
2017-05-12T01:02:02.770164+05:00 saphdbtst1 lrmd[26354]: notice:
finished - rsc:rsc_SAPHana_TST_HDB00 action:start call_id:26 pid:27645
exit-code:0 exec-time:7574ms queue-time:0ms
2017-05-12T01:02:02.811710+05:00 saphdbtst1 crmd[26357]: notice: Result
of start operation for rsc_SAPHana_TST_HDB00 on saphdbtst1: 0 (ok)
--
Regards,
Muhammad Sharfuddin
On 05/11/2017 11:20 PM, Muhammad Sharfuddin wrote:
> pacemaker 1.1.15-21.1
> libpacemaker3 1.1.15-21.1
> DB: SAP HANA SPS 12
>
> Manually HANA DB starts and work perfectly, Master/Primary replicates
> to Secondary/Slave perfectly.
>
> But when start the HANA DB pacemaker resource, crm_mon shows that HANA
> DB resource gets started and both the nodes becomes Slave, and keep
> showing them Slave forever. However HANA DB didn't get start from the
> cluster which could be verified by running "sapcontrol -nr 00
> -function GetProcessList" which shows system is not running.
>
> HANA Topology and DB resource configuration:
>
> primitive rsc_SAPHanaTopology_TST_HDB00 ocf:suse:SAPHanaTopology \
> operations $id=rsc_sap2_TST_HDB00-operations \
> op monitor interval=10 timeout=600 \
> op start interval=0 timeout=600 \
> op stop interval=0 timeout=300 \
> params SID=TST InstanceNumber=00
>
> primitive rsc_SAPHana_TST_HDB00 ocf:suse:SAPHana \
> operations $id=rsc_sap_TST_HDB00-operations \
> op start interval=0 timeout=3600 \
> op stop interval=0 timeout=3600 \
> op promote interval=0 timeout=1600 \
> op monitor interval=60 role=Master timeout=700 \
> op monitor interval=61 role=Slave timeout=700 \
> params SID=TST InstanceNumber=00 PREFER_SITE_TAKEOVER=true
> DUPLICATE_PRIMARY_TIMEOUT=600 AUTOMATED_REGISTER=true
>
> ms msl_SAPHana_TST_HDB00 rsc_SAPHana_TST_HDB00 \
> meta is-managed=true notify=true clone-max=2 clone-node-max=1
> target-role=Started interleave=true
>
> clone cln_SAPHanaTopology_TST_HDB00 rsc_SAPHanaTopology_TST_HDB00 \
> meta is-managed=true clone-node-max=1 target-role=Started
> interleave=true
>
> Following events are logged when cluster tries to start the HANA DB
> resource:
> 2017-05-11T15:29:35.775044+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation p_fence_saphdbtst1_monitor_0 locally on
> saphdbtst2
> 2017-05-11T15:29:35.776600+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation p_fence_saphdbtst1_monitor_0 on saphdbtst1
> 2017-05-11T15:29:35.777021+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation p_fence_saphdbtst2_monitor_0 locally on
> saphdbtst2
> 2017-05-11T15:29:35.779302+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation p_fence_saphdbtst2_monitor_0 on saphdbtst1
> 2017-05-11T15:29:35.779770+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_ip_TST_HDB00_monitor_0 locally on
> saphdbtst2
> 2017-05-11T15:29:35.843129+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_ip_TST_HDB00_monitor_0 on saphdbtst1
> 2017-05-11T15:29:35.843567+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_SAPHana_TST_HDB00:0_monitor_0 locally
> on saphdbtst2
> 2017-05-11T15:29:35.845257+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_SAPHana_TST_HDB00:0_monitor_0 on
> saphdbtst1
> 2017-05-11T15:29:35.845682+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_SAPHanaTopology_TST_HDB00:0_monitor_0
> locally on saphdbtst2
> 2017-05-11T15:29:35.847105+05:00 saphdbtst2 crmd[10195]: notice:
> Initiating monitor operation rsc_SAPHanaTopology_TST_HDB00:0_monitor_0
> on saphdbtst1
> 2017-05-11T15:29:35.902758+05:00 saphdbtst2 crmd[10195]: notice:
> Result of probe operation for p_fence_saphdbtst1 on saphdbtst2: 7 (not
> running)
> 2017-05-11T15:29:35.903336+05:00 saphdbtst2 crmd[10195]: notice:
> Result of probe operation for p_fence_saphdbtst2 on saphdbtst2: 7 (not
> running)
> 2017-05-11T15:29:35.938865+05:00 saphdbtst2 crmd[10195]: notice:
> Result of probe operation for rsc_ip_TST_HDB00 on saphdbtst2: 7 (not
> running)
> 2017-05-11T15:29:35.950344+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:35.984612+05:00 saphdbtst2 systemd[1]: Started
> Session c45235 of user tstadm.
> 2017-05-11T15:29:36.432092+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:36.456621+05:00 saphdbtst2 systemd[1]: Started
> Session c45236 of user tstadm.
> 2017-05-11T15:29:36.463414+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:36.468627+05:00 saphdbtst2 systemd[1]: Started
> Session c45237 of user tstadm.
> 2017-05-11T15:29:36.991382+05:00 saphdbtst2
> SAPHana(rsc_SAPHana_TST_HDB00)[10203]: INFO: RA ==== begin action
> monitor_clone (0.152.17) ====
> 2017-05-11T15:29:37.050443+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:37.072682+05:00 saphdbtst2 systemd[1]: Started
> Session c45238 of user tstadm.
> 2017-05-11T15:29:40.077857+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:40.100617+05:00 saphdbtst2 systemd[1]: Started
> Session c45239 of user tstadm.
> 2017-05-11T15:29:40.121749+05:00 saphdbtst2 crmd[10195]: notice:
> Transition aborted by status-180881403-master-rsc_SAPHana_TST_HDB00
> doing create master-rsc_SAPHana_TST_HDB00=5: Transient attribute change
> 2017-05-11T15:29:40.614835+05:00 saphdbtst2 su: (to tstadm) root on none
> 2017-05-11T15:29:40.636628+05:00 saphdbtst2 systemd[1]: Started
> Session c45240 of user tstadm.
> 2017-05-11T15:29:43.363140+05:00 saphdbtst2
> SAPHana(rsc_SAPHana_TST_HDB00)[10203]: INFO: RA ==== end action
> monitor_clone with rc=7 (0.152.17) (8s)====
> 2017-05-11T15:29:43.367148+05:00 saphdbtst2 lrmd[10192]: notice:
> rsc_SAPHana_TST_HDB00_monitor_0:10203:stderr [ Error performing
> operation: No such device or address ]
> 2017-05-11T15:29:43.367490+05:00 saphdbtst2 lrmd[10192]: notice:
> rsc_SAPHana_TST_HDB00_monitor_0:10203:stderr [ Error performing
> operation: No such device or address ]
> 2017-05-11T15:29:43.407874+05:00 saphdbtst2 crmd[10195]: notice:
> Result of probe operation for rsc_SAPHana_TST_HDB00 on saphdbtst2: 7
> (not running)
> 2017-05-11T15:29:43.408223+05:00 saphdbtst2 crmd[10195]: notice:
> saphdbtst2-rsc_SAPHana_TST_HDB00_monitor_0:18 [ Error performing
> operation: No such device or address\nError performing operation: No
> such device or address\n ]
>
>
> Nutshell: crm_mon shows that HANA DB started and both nodes are Slave,
> but DB didn't get started by cluster, sapcontrol command shows that
> HANA is not running on either node.
>
>
>
> --
> Regards,
>
> Muhammad Sharfuddin
>
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20170512/1bbb7799/attachment-0003.html>
More information about the Users
mailing list