[ClusterLabs] Oracle resource agent failure

Wed Apr 5 10:31:34 CEST 2017

Thank you,
We tried taking the resource out of the group.
Unfortunately, the resource that is not corrupted still does not start.
In the logs we see the following warnings:

warning: Forcing testdbc away from krplporacle001 after 1000000 failures (max=1000000)
Apr  5 11:24:00 krplporacle001 pengine[2800]: warning: Forcing testdbc away from krplporacle002 after 1000000 failures (max=1000000)
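
A note for reading these warnings: 1000000 is the value Pacemaker uses for INFINITY, so each line says the resource has reached the maximum fail count on that node and is banned from it. The following is only a minimal sketch for condensing such lines into "resource node failcount" triples; the function name summarize_forced_away is hypothetical, and it assumes the warning format shown above.

```shell
# Summarize "Forcing <resource> away from <node>" warnings from a Pacemaker log.
# Reads log text on stdin and prints "<resource> <node> <failcount>" per match.
# The function name is illustrative; the pattern assumes the message format
# quoted in this thread.
summarize_forced_away() {
    sed -n 's/.*Forcing \([^ ]*\) away from \([^ ]*\) after \([0-9]*\) failures.*/\1 \2 \3/p'
}
```

It could be used as `grep pengine /var/log/messages | summarize_forced_away | sort -u`, assuming the cluster logs to /var/log/messages on these nodes (the thread does not say where the log lives).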

It seems that when testdbc fails to start on one node, the cluster tries it on node 2, where it fails as well.
How can we make it start?
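
One possible approach, offered as a hedged sketch and not as advice given in this thread: a resource whose fail count has reached 1000000 on a node stays banned from that node until the fail count is cleared, so once the underlying database problem has been repaired, the recorded failures can be inspected and then cleaned up. The function name reset_and_retry and the PCS override variable are hypothetical; the pcs subcommands are the ones shipped with pcs on RHEL 7.

```shell
# Inspect and then clear the recorded failures for one resource.
# PCS is a hypothetical override so the sequence can be dry-run (PCS=echo);
# it defaults to the real pcs binary.
reset_and_retry() {
    # Show the fail count currently recorded for the resource.
    ${PCS:-pcs} resource failcount show "$1"
    # Clear the fail count and failed-operation history so the cluster
    # is allowed to try starting the resource again.
    ${PCS:-pcs} resource cleanup "$1"
}
```

Calling `reset_and_retry testdbc` on one cluster node would run both steps. If the start still fails afterwards, `pcs resource debug-start testdbc --full` runs the agent's start action by hand and prints its output, which may show the actual Oracle error.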

pcs resource show testdbc
Resource: testdbc (class=ocf provider=heartbeat type=oracle)
  Attributes: sid=testdbc home=/u01/app/oracle/product/12.1.0.2/db_1 user=oracle monuser=C##OCFMON1 monpassword=C##OCFMON1 monprofile=C##OCFMONPROFILE1
  Operations: start interval=0s timeout=500 on-fail=restart (testdbc-start-interval-0s)
              stop interval=0s timeout=500 on-fail=ignore (testdbc-stop-interval-0s)
              monitor interval=60s timeout=500 on-fail=restart (testdbc-monitor-interval-60s)
[root at krplporacle001 ~]#
From: Muhammad Sharfuddin [mailto:M.Sharfuddin at nds.com.pk]
Sent: Wednesday, April 05, 2017 8:53 AM
To: users at clusterlabs.org; SAYED, MAJID ALI SYED AMJAD ALI
Subject: Re: [ClusterLabs] Oracle resource agent failure

On 04/05/2017 09:53 AM, SAYED, MAJID ALI SYED AMJAD ALI wrote:
Hello,
We are trying to build a two-node active/passive Linux HA cluster using Pacemaker and Corosync on RHEL 7.2, with shared storage (SAN), that will be used for an Oracle database.
We have built an LVM resource, Filesystem resources and a virtual IP.
The database administrator has built a couple of Oracle database and Oracle listener resources.
All these resources are in one group:

Resource Group: oraclegrp
     vgres      (ocf::heartbeat:LVM):   Started krplporacle001
     Cluster_ip (ocf::heartbeat:IPaddr2):       Started krplporacle001
     u02        (ocf::heartbeat:Filesystem):    Started krplporacle001
     u03        (ocf::heartbeat:Filesystem):    Started krplporacle001
     u04        (ocf::heartbeat:Filesystem):    Started krplporacle001
     ls_testdbc (ocf::heartbeat:oralsnr):       Started krplporacle001
     testdbc    (ocf::heartbeat:oracle):        Stopped
     ls_samtest (ocf::heartbeat:oralsnr):       Stopped
     samtest    (ocf::heartbeat:oracle):        Stopped

For testing purposes, he deliberately corrupted the testdbc Oracle resource and transferred the resource to the passive node.
We do not understand why, when one Oracle resource in the group fails, the other one does not start after the service is transferred.
Any help would be much appreciated.
The logs only report that the Oracle resource testdbc is not running; the cluster does not even attempt to start samtest.

Resources in a resource group are started in sequence, so in your case, if testdbc is stopped (for any reason), the cluster will not even attempt to start "ls_samtest" and "samtest", because they depend on the successful startup of "testdbc".
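
The ordering rule can be illustrated with a small sketch that reads a group's member lines (in the format of the status listing quoted earlier) and reports the first member that is not started together with every member listed after it. The function name show_blocked_members is hypothetical, and the sketch assumes its input contains only the member lines of a single group, in listed order.

```shell
# Reads group member lines such as
#   "testdbc    (ocf::heartbeat:oracle):        Stopped"
# on stdin, in the order the group lists them.
# Prints the first member that is not "Started" and every member after it.
show_blocked_members() {
    awk '
        /\(ocf::/ {
            name = $1     # resource id
            state = $3    # "Started", "Stopped", ...
            if (!blocked && state != "Started") {
                blocked = 1
                print name ": first member not started"
                next
            }
            if (blocked) print name ": waits for members listed before it"
        }
    '
}
```

Fed the nine member lines of oraclegrp, it would name testdbc as the first member not started and list ls_samtest and samtest as waiting behind it.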

Hope this helps.

--
Regards,

Muhammad Sharfuddin
Cell: +92-3332144823 | UAN: +92(21) 111-111-142 ext: 112 | NDS.COM.PK<http://www.nds.com.pk>
