Rman Recovery of a Rac Database

Introduction:

As part of a QA  process I am restoring a Rac database.  Rman as a back-up recovery that  has been around some time now (I started working with it as from Oracle 8) will be used for it.  Even though I am aware of debates that this could be done much easier via the Gui I am still tempted to use command line ( scripting for it).  This is the general information about the  environment where this will happen today: The 2 Instance Rac Database is on Linux servers on two nodes. Backups are performed to tape directly using Rman. We are using Networker for that. The environment  i am working with is a 11.2.3.0 with ASM. As part of sharing information below are the steps I performed with some add-on comment.

Preparations:

First of all let us see if this is a Rac database  cause in such a scenario will have to bring down all the other instances and do the recovery  on one node (in my world  i always tend to go for instance 1):

SQL> show parameter cluster

NAME                                             TYPE   VALUE

———————————— ———– ——————————

cluster_database                          boolean    TRUE

cluster_database_instances      integer      2

cluster_interconnects                 string

OK  So in our case that is a yes.

Performing the steps:

Turn it into a one DB- Instance:

In order to do  a restore – recovery we will shutdown all other instances but one and work with it. This is the first mandatory step. It will need a restart of  your remaining instance .

alter system set cluster_database = False scope = SPFILE sid='*';

Stopping – Starting – Stopping and investigating Options:

In Linux I will stop the database so the Grid Infra will no longer be in control of this environment:

srvctl stop database -d MYDBP

Then  via sqlplus  i do a startup mount to see if the environment is happy, then i shut it down again

startup mount;
shutdown;

Now the most important step will be to get  yourself a clear vision which  in place backup you want to use for your  restore. In my case that was the most recent level 0 backup. Backups are already being implemented for this database  and for each backup i keep log files so in the log directory. There I select my most current level 0 backup where we will aim for.

-rw-r–r– 1 oracle dba 20435 Nov 11 20:50 level0_MYDBP1.2012_11_11_20:00:16

-rw-r–r– 1 oracle dba 19868 Nov 12 20:51 level1_MYDBP1.2012_11_12_20:00:27

-rw-r–r– 1 oracle dba 19868 Nov 13 20:55 level1_MYDBP1.2012_11_13_20:00:22

This tells me that the backup ended  Nov 11 20:50 so  that is the time stamp i will be aiming for because in this scenario we will play for a point and time recovery to simulate a customer requesting a restore to a specific point and then before things went bad in the application (side) data.  Examples of such requests would be wrong billing cycle, batch run, dropping data, drop table (indeed recycle_bin is still not on at my site) etc.

Of course you will have to check if a scenario like this would be helpful for your special case.  To me that is the charm of backup – recovery  strategies  you will always have to do quick but thorough analyses of the scenarios available to you. Maybe this post is giving you hints how the restore challenge in a case similar to this could be addressed. But of course you always have to test any solution that you implement.

If you want to script the required step then a script would have following main steps:

  1. Restore control file of the moment you want to restore to.
  2. Restore & recover the database to that very same point and time.
  3. Since this is a point and time recovery and NOT a full recovery, open the Database with resetlogs.
  4. Create a new rman  level 0 backup

Note script also needs:

export NLS_DATE_FORMAT=”DD-MON-YYYY HH24:MI:SS”

1) Restore the controlfile:

Very important  to understand is that only in a point and time scenario you will restore your Control file.  If you aim for a full recovery never ever ever  restore your Controlfile !!!!!!

First step is to make sure the database starts in proper status:

${ORACLE_HOME}/bin/sqlplus "/ as sysdba"  << EOF
shutdown abort;
startup nomount;
exit
EOF

When you do a point in time recovery in  Rman you need mention the specific time to which you want to restore to (1).

As you will see below one control file is restored (2) and then it is replicated to the other location (3) (my spfile has two controlfiles  in place).

${ORACLE_HOME}/bin/rman TARGET  / RCVCAT rman_MYDBP1/MYDBP1@RMAN MSGLOG ${LOGPATH}\/${LOGFILE} << EOF
run {
set until time "TO_DATE('11-11-2012 21:00:00','DD-MM-YYYY HH24:MI:SS')"; (1)
allocate channel t1 type 'SBT_TAPE';
send 'NSR_ENV=(NSR_SERVER=backupserver,NSR_CLIENT=mydbserver,NSR_SAVESET_BROWSE="0032 Days",NSR_SAVESET_RETENTION="0032 Days")';
restore controlfile to '+MYDBP_DATA01/MYDBP/control01.ctl' ; (2)
release channel t1;
allocate channel d1 type disk;
replicate controlfile from '+MYDBP_DATA01/MYDBP/control01.ctl' ; (3)
release channel d1;
}
exit
EOF

Once the control file is back in place we are going to set the new status for the database, because we want it to be mounted before we can do a restore with Rman:

${ORACLE_HOME}/bin/sqlplus "/ as sysdba"  << EOF
shutdown abort;
startup mount;
exit
EOF

2) Restore & recover of the database:

This is the part where things really will be put to motion. As you will see here also we enter the proper time to which we want to return to with our Database (1).

Note: SET PARALLELMEDIARESTORE OFF.  I still have the experience that if you forget to add this , you might be restoring using only one channel, slowing things down  so  make sure you have the parameter in use.

${ORACLE_HOME}/bin/rman TARGET   / RCVCAT rman_MYDBP1/MYDBP1@RMAN MSGLOG ${LOGPATH}\/${LOGFILE} << EOF
SET PARALLELMEDIARESTORE OFF;
run {
set until time "TO_DATE('11-11-2012 21:00:00','DD-MM-YYYY HH24:MI:SS')";(1)
allocate channel t1 type 'SBT_TAPE';
.
.
allocate channel tN type 'SBT_TAPE';
send 'NSR_ENV=( NSR_SERVER=backupserver,NSR_CLIENT=mydbserver1, NSR_SAVESET_BROWSE="0032 Days",NSR_SAVESET_RETENTION="0032 Days")'; 
restore database;
recover database;
release channel t1;
.
.
release channel tN;
}
exit
EOF

Cool Note: Should you need a  rerun of the restore then Rman is (even in ASM) nice to you , it will pick up the pace from where you have left. So if you have restored 60 files out of 100, an new restore action would start with file 61!  I was familiar with this on file systems , but it is great that this also works in 11gR2 with ASM.

3) Time to open the database.

Please make triple double sure that you only do this step when you do a Point-In-Time recovery  !!!

${ORACLE_HOME}/bin/sqlplus "/ as sysdba"  << EOF
ALTER DATABASE OPEN RESETLOGS
/
EOF

Total restore time  took me app. 70 Minutes for a little more then 1 TB db.  Once i checked my logs I have performed some checks, like a database restart after a gentle close (always have to smile when reading or writing that). And I have checked if the temp files of the temp tablespace  was there.

Lasts couple of steps:

  • In sqlplus : alter system set cluster_database = TRUE scope = SPFILE sid='*';
  • In Sqlplus:  stopped the database  gentle with shutdown immediate;
  • in Linux I started the database:  srvctl stop database -d MYDBP

After that happy dba and  big step closer to a next step in QA.

Happy reading,

Mathijs

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s