[ClusterLabs] Antw: Re: Oracle resource agent failure

Wed Apr 5 04:46:00 EDT 2017

>>> "SAYED, MAJID ALI SYED AMJAD ALI" <sayedma2 at NGHA.MED.SA> wrote on 05.04.2017
at 10:31 in message
<DBE5516C77BE4D4EB6CB2F6B5AB0EB648EBD85F1 at RIYSVMBX-800.KAMC-RD.ngha.med>:
> Thank you,
> We tried taking the resource out of the group.
> But unfortunately the resource that is not corrupted still does not start.

I'm afraid that if your admin corrupted the database ("For testing purpose he has purposely corrupted testdbc oracle resource"), the admin has to repair it before it can start again, especially if a standard recovery like the following does not work:
connect / as sysdba
shutdown abort
startup mount
alter database end backup;
alter database open;

> In Logs we see the following error
> 
> warning: Forcing testdbc away from krplporacle001 after 1000000 failures 
> (max=1000000)
> Apr  5 11:24:00 krplporacle001 pengine[2800]: warning: Forcing testdbc away 
> from krplporacle002 after 1000000 failures (max=1000000)
> 
> It seems that when testdbc fails to start on node 1, the cluster tries it on 
> node 2, where it fails as well.
> How can we make it start?

See above!
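
Even once the database opens again by hand, the cluster keeps refusing to start testdbc until the recorded failures are cleared: the 1000000 in the log is Pacemaker's INFINITY score, which bans the resource from that node. A minimal sketch of checking and clearing them, assuming the pcs version shipped with RHEL 7 (the function name clear_failures is only an illustration, not part of pcs):

```shell
# Hypothetical helper: show, then clear, the recorded failures of one resource.
clear_failures() {
    res="$1"
    # Print the fail count per node (the log's 1000000 is Pacemaker's INFINITY).
    pcs resource failcount show "$res"
    # Forget the fail counts and failed operations so the cluster retries the start.
    pcs resource cleanup "$res"
}
```

Run it as root on either node with "clear_failures testdbc", and only after the database has been repaired; otherwise the start simply fails again on both nodes.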

Regards,
Ulrich

> 
> 
> pcs resource show testdbc
> Resource: testdbc (class=ocf provider=heartbeat type=oracle)
>   Attributes: sid=testdbc home=/u01/app/oracle/product/12.1.0.2/db_1 
> user=oracle monuser=C##OCFMON1 monpassword=C##OCFMON1 
> monprofile=C##OCFMONPROFILE1
>   Operations: start interval=0s timeout=500 on-fail=restart 
> (testdbc-start-interval-0s)
>               stop interval=0s timeout=500 on-fail=ignore 
> (testdbc-stop-interval-0s)
>               monitor interval=60s timeout=500 on-fail=restart 
> (testdbc-monitor-interval-60s)
> [root@krplporacle001 ~]#
> From: Muhammad Sharfuddin [mailto:M.Sharfuddin at nds.com.pk]
> Sent: Wednesday, April 05, 2017 8:53 AM
> To: users at clusterlabs.org; SAYED, MAJID ALI SYED AMJAD ALI
> Subject: Re: [ClusterLabs] Oracle resource agent failure
> 
> 
> On 04/05/2017 09:53 AM, SAYED, MAJID ALI SYED AMJAD ALI wrote:
> Hello,
> We are trying to build a 2-node Active/Passive Linux HA cluster using 
> pacemaker and corosync on RHEL 7.2, with shared storage (SAN) that will be 
> used for an Oracle database.
> We have built an LVM resource, Filesystem resources and a Virtual IP.
> The database administrator has built a couple of Oracle DB and Oracle 
> listener resources.
> All these resources are in one group:
> 
> Resource Group: oraclegrp
>      vgres      (ocf::heartbeat:LVM):   Started krplporacle001
>      Cluster_ip (ocf::heartbeat:IPaddr2):       Started krplporacle001
>      u02        (ocf::heartbeat:Filesystem):    Started krplporacle001
>      u03        (ocf::heartbeat:Filesystem):    Started krplporacle001
>      u04        (ocf::heartbeat:Filesystem):    Started krplporacle001
>      ls_testdbc (ocf::heartbeat:oralsnr):       Started krplporacle001
>      testdbc    (ocf::heartbeat:oracle):        Stopped
>      ls_samtest (ocf::heartbeat:oralsnr):       Stopped
>      samtest    (ocf::heartbeat:oracle):        Stopped
> 
> For testing purposes he has deliberately corrupted the testdbc oracle resource 
> and transferred the resource to the passive node.
> But we are not sure why, when one oracle resource in the group fails, the 
> other one does not start after the service is transferred.
> Any help would be much appreciated.
> The logs only state that the oracle resource for testdbc is not running, but 
> the cluster does not even attempt to start samtest?
> 
> Resources in a resource group are started in sequence, so in your case, if 
> testdbc is stopped (for any reason), the cluster won't even attempt to start 
> "ls_samtest" and "samtest", because they depend on the successful 
> startup of 'testdbc'.
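
The ordering described above can be pictured with a toy model. This is not Pacemaker code, only a sketch of the "start in sequence, stop at the first failure" behaviour; the function name start_group and its output wording are made up for illustration:

```shell
# Toy model: walk the group members in order; after the first failed
# start, every later member is left alone.
# Usage: start_group <failing-resource> <member> [<member> ...]
start_group() {
    failed="$1"; shift
    blocked=0
    for res in "$@"; do
        if [ "$blocked" -eq 1 ]; then
            echo "$res: not attempted"
        elif [ "$res" = "$failed" ]; then
            echo "$res: start failed"
            blocked=1
        else
            echo "$res: started"
        fi
    done
}
```

Calling "start_group testdbc ls_testdbc testdbc ls_samtest samtest" reports ls_testdbc as started, testdbc as failed, and both ls_samtest and samtest as not attempted, which matches the status listing in the original question.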
> 
> Hope this helps.
> 
> --
> Regards,
> 
> Muhammad Sharfuddin
> Cell: +92-3332144823 | UAN: +92(21) 111-111-142 ext: 112 | 
> NDS.COM.PK<http://www.nds.com.pk>
> 