[ClusterLabs] Issues with DB2 HADR Resource Agent

Sun Mar 4 04:00:20 EST 2018

On 02/19/2018 11:25 PM, Dileep V Nair wrote:
> Hello Ondrej,
> 
> I am still having issues with my DB2 HADR on Pacemaker. When I do a
> db2_kill on Primary for testing, initially it does a restart of DB2 on
> the same node. But if I let it run for some days and then try the same
> test, it goes into fencing and then reboots the Primary Node.
> 
> I am not sure how exactly it should behave in case my DB2 crashes on
> Primary.
> 
> Also, if I crash Node 1 (the node itself, not only DB2), Node 2 is
> promoted to Primary, but once Pacemaker is started again on Node 1,
> the DB on Node 1 is also promoted to Primary. Is that expected behaviour?
> 
> Regards,
> 
> *Dileep V Nair*
> Senior AIX Administrator
> Cloud Managed Services Delivery (MSD), India
> IBM Cloud
> 
> ------------------------------------------------------------------------
> E-mail: dilenair at in.ibm.com
> Outer Ring Road, Embassy Manya
> Bangalore, KA 560045
> India

Hello Dileep,

Sorry for the late reply (my email filters sometimes misbehave).

Seeing fencing after db2_kill is interesting, but the question is what
triggered the fencing. Was it a failure of DB2 to stop, or some other
resource failure?

When DB2 has been successfully promoted on one node after the previous
Primary crashed, the node that crashed should detect that it is an
'outdated Primary' in DB2. When this happens, the cluster will not
attempt to promote it to Master and will leave it as Slave. Investigation
on the DB2 side might be needed to determine whether this detection
failed to happen.
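
As a starting point for that DB2-side check, here is a minimal sketch
that asks DB2 which HADR role and state it reports on a node. It assumes
that `db2pd -hadr -db <dbname>` is run as the instance owner and prints
`KEY = VALUE` lines that include `HADR_ROLE` and `HADR_STATE`, as recent
DB2 versions do; older versions use a different layout that this does
not handle. The function names are hypothetical, chosen for illustration.

```python
import subprocess


def parse_hadr_fields(text):
    """Collect 'KEY = VALUE' lines into a dict (output format is assumed)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            fields[key.strip()] = value.strip()
    return fields


def hadr_role_and_state(dbname):
    """Run db2pd for one database and return (role, state).

    Either element is None when the assumed field is missing from the output.
    """
    result = subprocess.run(
        ["db2pd", "-hadr", "-db", dbname],
        capture_output=True, text=True, check=False,
    )
    fields = parse_hadr_fields(result.stdout)
    return fields.get("HADR_ROLE"), fields.get("HADR_STATE")
```

Running `hadr_role_and_state(...)` with your database name on both nodes
after Node 1 rejoins would show whether both databases claim the Primary
role at the same time.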

If you have a procedure that consistently results in this behavior, I
can try it on my test machine to see if I can reproduce it. That may
give a hint as to whether it is more of a cluster issue or a DB2 issue
that needs to be addressed.

-- 
Ondrej Faměra
@Red Hat