Configuring Active- Passive Clustering Oracle Databases in Grid Infrastructure.

Introduction:

For a billing environment I have been asked to change the current Real application solution to a HA solution with an Active – Passive Database in the Cluster ware of Oracle (Grid Infrastructure) (11.2.0.2.0). Even though I have had / seen examples of this working in 11gr1 the fact we use 11Gr2 allowed us to  skip steps needed in the 11gr1 environment  (such as a dedicated virtual Ip address, able to  fail over and a listener that has to come along with that as well). This solution is based on the fact that in this project we have set up scan listener(s) and also that we connect via a Service (taf). Also it should be said that the parameters for the cluster ware have changed again between Oracle 11gr1 and 11gr2. So it was a nice challenge to see, adapt and solve the issues

It is my estimate that a maintenance window of app 1 Hour  per Database will be needed, doing the analyses,  the changes,  the removal and adding it back to the cluster and  do relocate tests.

Ps  i have marked specific  cluster commands in yellow.

Happy Reading.

Step 1 Get info about the (current) clustered database

Note: You should run this script as the user Oracle!!

As a start I have performed this for all the existing databases (on server1) in order to get information about the database that is currently in the cluster ware as a Rac database:

#!/bin/ksh
## 2012 04 16
## M Bruggink
## collecting the resources from the cluster for an active – passive solution in which the db is treated like an application and can relocate.
##

export NLS_DATE_FORMAT=’DD.MM.YYYY HH24:MI:SS’
export NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P
export ORACLE_BASE=/opt/oracle
export ORACLE_HOME=/opt/crs/product/112_ee_64/crs/
export PATH=/usr/local/bin:$ORACLE_HOME/bin:$PATH

export LOG_DIR=/opt/oracle/admin/tools/logs
cd $LOG_DIR

SID_LIST=”<SID1> <SID2> <SID3>”

for curSid in $SID_LIST
do
export ORACLE_SID=$curSid
/opt/crs/product/112_ee_64/crs/bin/crsctl status resource ora.$ORACLE_SID.db -p>ora.$ORACLE_SID.resource
done

Step 2 Remove the current Cluster database from Cluster

Note: You should run this script as the user Oracle!!
Be sure you have created the resource file in step 1 before proceeding.

srvctl stop database -d <<DB_NAME>>
srvctl remove database -d <<DB_NAME>>
Remove the database <<DB_NAME>>? (y/[n]) y

Note.  These commands will stop and remove the database from the Cluster ware as a Rac database so better be sure what you are doing.
BTW the names are case sensitive. At my site they were defined lowercase!! So of course you need to give the srvctl commands in proper case.

In this project we had to give up the Rac environment in place and had to implement the Active passive Solution in which an Instance/Database is controlled as a resource by the Oracle Cluster ware. The Rac solution was not certified to be used from the Software supplier and Rac One was not an option due to the fact that at that point and time this Software was not part of licence agreement with customer , and due to fact that Rac One  was not in the Product catalogue.

Step 3 Copy and Edit the resource file

Note: You should run this script as the user Oracle!!

The files created In step 1 are used as a baseline for creating the .resource file.
The resource file is to be regarded similar as a parfile in datapump or export / import. You have the choice of either use such a resource file when you give the command (and all settings will be in that file) OR you get a very long command line command. J  So in my case I have chosen to use the resource file cause of ease of use (and reuse).

Copy and edit the resource files as prep for Resource. The resource files I have copied to *.config (to have a backup as a source file).

AND

I have copied those config files to a fixed directory:

/opt/crs/product/112_ee_64/crs/crs/public

#### note this directory is also used in the config files as a location for the action script.
#### And I have started editing that config file. In my example this file is ora.<<DB_NAME>>.config.
#### PS!!! ONLY the following parameters are accepted and or needed by the cluster ware.  Documentation on the internet etc. is neither thorough nor complete.  PLUS that there is a change of behaviour in attributes that are valid between Oracle 11gr1 and Oracle 11gr2
But trust me with these settings it worked!!!!! So you have to amend / delete a lot of information that came from that crsctl status in step 1.

BTW files like this resource file NEED to be set up for every database in scope!!!

#### In the end the Configuration looks like this (ora.<<DB_NAME>>.config)
NAME=app.<<DB_NAME>>.db
TYPE=cluster_resource
ACL=owner:oracle:rwx,pgrp:dba:rwx,other::r–
ACTION_SCRIPT=/opt/crs/product/112_ee_64/crs/crs/public/hacluster<<DB_NAME>>.active
ACTIVE_PLACEMENT=0
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=10
DEGREE=1
DESCRIPTION=Resource <<DB_NAME>> Db
ENABLED=1
HOSTING_MEMBERS=SERVER1 SERVER2
LOAD=1
LOGGING_LEVEL=1
PLACEMENT=restricted
RESTART_ATTEMPTS=1
START_DEPENDENCIES=hard(ora.<<DB_NAME>>_DATA.dg,ora.<<DB_NAME>>_FRA.dg,ora.<<DB_NAME>>_REDO.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns) pullup(ora.<<DB_NAME>>_DATA.dg,ora.<<DB_NAME>>_FRA.dg,ora.<<DB_NAME>>_REDO.dg)
START_TIMEOUT=600
STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.<<DB_NAME>>_DATA.dg,shutdown:ora.<<DB_NAME>>_FRA.dg,shutdown:ora.<<DB_NAME>>_REDO.dg)
STOP_TIMEOUT=600
UPTIME_THRESHOLD=1h

NOTE:  HOSTING_MEMBERS:  the primary server should be mentioned FIRST,  the second entry is  the failover  server then.
NOTE: _DEPENDENCIES:  these are the diskgroups , and the asm environment that needs to be there when resource is started.

Step 4 Look and feel of Action script

Every separate Database will become a new cluster resource in this way. That means that every database will have:
· A resource file as described in step 3
· An action script, specific for that database!

And in the resource file you have to point to location and name of this action script!!!

### I first used a script from a great book on Rac that had a su – oracle in it. I did not get it to work properly so in the end I ended up with a different version of that action script:  having removed the su – oracle which I had in there it worked!
### Lesson learned, stick to one plan, but be flexible and change to other approach if you fail..

### cat hacluster<<DB_NAME>>.active
#!/bin/bash
export ORACLE_SID=<<DB_NAME>>1
export ORACLE_HOME=/opt/oracle/product/112_ee_64/db
export PATH=/usr/local/bin:$ORACLE_HOME/bin:$PATH
export ORACLE_OWNER=oracle

case $1 in
‘start’)
$ORACLE_HOME/bin/sqlplus /nolog<<EOF
conn / as sysdba
startup
exit
EOF
RET=0
;;

‘stop’)
$ORACLE_HOME/bin/sqlplus /nolog<<EOF
conn / as sysdba
shutdown immediate
exit
EOF
RET=0
;;

‘clean’)
$ORACLE_HOME/bin/sqlplus /nolog <<EOF
conn / as sysdba
shutdown abort
exit
EOF
RET=0
;;

‘check’)
# check for the existance of the smon process for $ORACLE_SID
# this check could be improved, but was kept short on purpose
found=`ps -ef | grep smon | grep $ORACLE_SID | wc -l`
if [ $found = 0 ]; then
RET=1
else
RET=0
fi
;;
*)
RET=0
;;
esac
if [ $RET -eq 0 ]; then
exit 0
else
exit 1
fi

NOTE: A generic script like this could also perform activities on Linux this way, by adding the required corresponding Linux commands to the specific CASE statement (start, stop, clean).
NOTE:  the check case could be used to send message to monitoring tools like HP Openview.

Step 5 Copy the config file and action script

I have copied the resource file (.resource) and config file (.config) and the action script (.active) to the other node where it is supposed to run as well (using scp).
#### NOTE IT is mandatory to have same script , with same privileges and executable on all the boxes where this  needs to  be  ( in my example  server1 and server2 cause databases are only supposed to relocate on node 1 and 2 ( and not also to  node 3 – 4  which is a pitty).

Step 6 Setup / compare environment on all Servers in scope

o Make sure that directory structure is same on both servers for the diag destination and the audit files.
o Set links if needed for pointing to the Database instead of the instance.
o AND make sure there is a proper init.ora (pointing to a spfile in the asm).
o And a proper password file for the specific Instance out there
o Change Oratab .

Step 7 Cluster ware activities

HA script and config file are located in /opt/crs/product/112_ee_64/crs/crs/public/ as was recommended in documentation.  In the following steps you will have to add the new resource to the cluster and start it.  Hmm and test it of course:

Step 7.1 Add the resource to the cluster ware

Note: You should run this script as the user ORACLE!!
Note:  SET CRS HOME (use ccc)

crsctl add resource app.<<DB_NAME>>.db -type cluster_resource -file /opt/crs/product/112_ee_64/crs/crs/public/ora.<<DB_NAME>>.config

NOTE:-type cluste_resource is MANDATORY in this solution.
NOTE: do NOT use ora .<<DB_NAME>>.db in your naming convention, cause it will be confusing  to others.. Real cluster resources, set up by Oracle CRS use the ora.  The rest of us mortals should not do that.
I checked its present with:
crsctl status  resource –t

#### When I did this and i saw it work, I can tell you first smile on face ..

Step 7.2 Start the resource in the cluster ware

Note: You should run this script as the user ORACLE!!
Note:  SET CRS HOME (use ccc)

Now it is Time to get things working together, let us start the resource.  Keep fingers crossed (as I do for you now)

crsctl start resource app.<<DB_NAME>>.db

#### This worked as well (omg this felt so good)

Step 8 Relocate a resource

Note: You can run this script as the user ORACLE!!
Note:  SET CRS HOME (use ccc)

Ultimately when all databases are setup properly then even a node fall down has to be tested. But the manual relocate performed by the user oracle:

crsctl relocate  resource app.<<DB_NAME>>.db

#### Conclusion it all worked as to be wanted / hoped for / needed  / desired etc. !!!! So big smile and happy DBA!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s