Playing with Cluster Commands for Single Instances in Grid Infrastructure

Introduction.

This weekend (20 – 22 February 2015) I am involved in a big data migration of approximately 900K customers, loading data into environments that I have set up as single instances under control of the Grid Infrastructure (11.2.0.3) on Red Hat Linux. As always during such big operations there is a need for a fall-back plan in case everything breaks. Since I have the luxury of EMC clone technology, a fall-back scenario has been set up: during the week EMC storage clones were created for the databases in scope. These clones are currently kept in permanent sync with the source databases on the machines.

This Friday the application will be stopped. After feedback from the application team I will have to stop the databases via the cluster (GI). As always, as preparation I started to make notes, which I will share and elaborate here, covering the stop, start, and checks of those databases.

Setup:

All my databases have been registered in the GI (Grid Infrastructure) as application resources, since I was not allowed to use RAC or RAC One Node during the setup of these environments. Yet I had to offer higher availability, which is why I implemented a poor man's RAC: each database becomes a resource in the cluster that is capable of failing over to another (specific and specified) node in the cluster.
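For reference, such a resource can be registered with crsctl add resource. Below is only a minimal sketch, assuming the custom action script (with the start/stop/check/clean logic) already exists; the attribute values simply mirror the profile that is listed further down.

### registering a database as a cluster_resource (sketch, action script assumed to exist)
/opt/crs/product/11203/crs/bin/crsctl add resource app.mydb1.db -type cluster_resource -attr "ACTION_SCRIPT=/opt/crs/product/11203/crs/crs/public/ora.mydb1.active,PLACEMENT=restricted,HOSTING_MEMBERS='mysrvr05hr mysrvr04hr',CHECK_INTERVAL=10,RESTART_ATTEMPTS=1,START_TIMEOUT=600,STOP_TIMEOUT=600"

The START_DEPENDENCIES and STOP_DEPENDENCIES on the ASM disk groups go into the same -attr string, in the form shown in the profile below (values containing commas are wrapped in single quotes).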

In the end, when I had my setup in place, the information in the cluster looked pretty much like this:

### status in detail

/opt/crs/product/11203/crs/bin/crsctl status resource app.mydb1.db -p

NAME=app.mydb1.db

TYPE=cluster_resource

ACL=owner:oracle:rwx,pgrp:dba:rwx,other::r--

ACTION_FAILURE_TEMPLATE=

ACTION_SCRIPT=/opt/crs/product/11203/crs/crs/public/ora.mydb1.active

ACTIVE_PLACEMENT=0

AGENT_FILENAME=%CRS_HOME%/bin/scriptagent

AUTO_START=restore

CARDINALITY=1

CHECK_INTERVAL=10

DEFAULT_TEMPLATE=

DEGREE=1

DESCRIPTION=Resource mydb1 DB

ENABLED=1

FAILOVER_DELAY=0

FAILURE_INTERVAL=0

FAILURE_THRESHOLD=0

HOSTING_MEMBERS=mysrvr05hr mysrvr04hr

LOAD=1

LOGGING_LEVEL=1

NOT_RESTARTING_TEMPLATE=

OFFLINE_CHECK_INTERVAL=0

PLACEMENT=restricted

PROFILE_CHANGE_TEMPLATE=

RESTART_ATTEMPTS=1

SCRIPT_TIMEOUT=60

SERVER_POOLS=

START_DEPENDENCIES=hard(ora.MYDB1_DATA.dg,ora.MYDB1_FRA.dg,ora.MYDB1_REDO.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns) pullup(ora.MYDB1_DATA.dg,ora.MYDB1_FRA.dg,ora.MYDB1_REDO.dg)

START_TIMEOUT=600

STATE_CHANGE_TEMPLATE=

STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.MYDB1_DATA.dg,shutdown:ora.MYDB1_FRA.dg,shutdown:ora.MYDB1_REDO.dg)

STOP_TIMEOUT=600

UPTIME_THRESHOLD=1h

As you can see, I have set up the dependencies with the disk groups (START_DEPENDENCIES and STOP_DEPENDENCIES), and I have set PLACEMENT to restricted, so the database can only start on a restricted set of nodes (which I defined in HOSTING_MEMBERS).
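Should the placement or the list of allowed nodes ever need to change, that can be done afterwards with crsctl modify resource. A small sketch, using the same attribute names as in the profile above:

### adjusting placement / hosting members later on (sketch)
/opt/crs/product/11203/crs/bin/crsctl modify resource app.mydb1.db -attr "PLACEMENT=restricted,HOSTING_MEMBERS='mysrvr05hr mysrvr04hr'"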

This evening's action plan will involve:

### Checking my resources for status and where they are running at the moment, so I know where they are when I start my actions. PS: -C 3 is a nice grep option that shows three extra lines of context around each match, so the state lines belonging to each resource stay visible.

/opt/crs/product/11203/crs/bin/crsctl status resource -t|grep app -C 3

 app.mydb1.db

      1        ONLINE  ONLINE       mysrvr05hr                                    

app.mydb2.db

      1        ONLINE  ONLINE       mysrvr04hr                                    

app.mydb3.db

      1        ONLINE  ONLINE       mysrvr02hr                                     

### Checking status at a high level.

/opt/crs/product/11203/crs/bin/crsctl status resource app.mydb1.db

/opt/crs/product/11203/crs/bin/crsctl status resource app.mydb2.db

/opt/crs/product/11203/crs/bin/crsctl status resource app.mydb3.db

In order to enable my colleagues to do the EMC split properly, the application will be stopped. Once I have my go after that, I will stop the databases using GI commands:

### stopping resources:

/opt/crs/product/11203/crs/bin/crsctl stop resource app.mydb1.db

/opt/crs/product/11203/crs/bin/crsctl stop resource app.mydb2.db

/opt/crs/product/11203/crs/bin/crsctl stop resource app.mydb3.db

Once my storage colleague has finished the EMC split (this should take only minutes because the databases have been in sync with production all week), I will put some databases in noarchivelog mode manually to speed up the Data Pump loads. After shutting down those databases again, I will start them using the GI commands:

### starting resources:

/opt/crs/product/11203/crs/bin/crsctl start resource app.mydb1.db

/opt/crs/product/11203/crs/bin/crsctl start resource app.mydb2.db

/opt/crs/product/11203/crs/bin/crsctl start resource app.mydb3.db
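For completeness, the manual switch to noarchivelog mode mentioned above happens before these start commands and looks roughly like this per database. This is a sketch only, run as sysdba on the hosting node while the cluster resource is still offline; changing the log mode only needs the database in mount state.

### putting a database in noarchivelog mode before the load (sketch)
sqlplus / as sysdba
startup mount
alter database noarchivelog;
shutdown immediate
exit

After that the database is started through the cluster again with the crsctl start resource commands shown above.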

### Relocate if needed

## server mysrvr05hr:

crsctl relocate resource app.mydb1.db

## server mysrvr04hr:

crsctl relocate resource app.mydb2.db

### Alternatively, relocating to a named target node:

## server mysrvr05hr:

crsctl relocate resource app.mydb1.db -n mysrvr04hr

## server mysrvr04hr:

crsctl relocate resource app.mydb2.db -n mysrvr05hr

On Saturday I will stop the databases that are in noarchivelog mode again via the cluster and put them back into archivelog mode. After that I have scheduled a level 0 backup with RMAN.
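That Saturday part would look roughly like this per database (again only a sketch; RMAN channels, backup format and catalog settings are deliberately left out):

### back to archivelog mode (sketch, per database)
/opt/crs/product/11203/crs/bin/crsctl stop resource app.mydb1.db
sqlplus / as sysdba
startup mount
alter database archivelog;
shutdown immediate
exit
/opt/crs/product/11203/crs/bin/crsctl start resource app.mydb1.db

### level 0 backup with rman (sketch)
rman target /
backup incremental level 0 database plus archivelog;
exit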

Happy reading,

Mathijs.

11gR2 Database Services and Instance Shutdown

This needs a check with the app supplier. Great note.

jarneil

I’m a big fan of accessing the database via services and there are some nice new features with database services in 11gR2. However I got a nasty shock when performing some patch maintenance on an 11.2.0.1 RAC system that had applications using services. Essentially I did not realise what happens to a service when you shut down an instance for maintenance. Let me demonstrate:

This has the following configuration:

So the service is now online on the node where DBA1 (preferred node in definition) runs:

any examples I’ve seen show what happens to a service when you perform shutdown abort. First let’s see what our tns connection looks like:

Which gives the following in V$SESSION when you connect using this definition:

Let’s abort the node:

Oh, that’s not good. Look what’s happened to my application:

Let’s bring everything back and try a different kind of shutdown, this time using the following:
