Rman Archive backup failing with RMAN-03002, RMAN-06059, ORA-19625 in Rac

Introduction:

As member of the  team every so many days I am part of hot line , taking care of Tickets and indeed being the (hopefully friendly) voice in your ear when  you call  the  hot line number of Oracle team at the Company. One of the tickets drew my attention ( hmm maybe cause it was about RMAN because for years I love to work with RMAN) .  An archive backup had failed various times to I was asked to investigate and fix. This Blog will show you the analyses and steps I have performed  to fix this issue.

Details:

At  work we run a shell script to run both level and archive backups and in the logfile that was produced for the Archive backup this was my first clue:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 07/11/2014 07:27:45
RMAN-06059: expected archived log not found, loss of archived log compromises recoverability
ORA-19625: error identifying file /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47588_2.arc
ORA-27037: unable to get file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

###    So  it was my interpretation that the specific archive was not  present in the ASM diskgroup but it was local  present on the other Node in the  RAC  in the $ORACLE_HOME/dbs directory.  My investigations  on the three nodes showed me:

## On first node :

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled <<– wrong cause the archives are supposed  to be in the +MYDB2_FRA1 diskgroup as a ground rule.
Oldest online log sequence     51210
Next log sequence to archive   51213
Current log sequence           51213

On the second node :

SQL> archive log list
Database log mode                 Archive Mode
Automatic archival                  Enabled
Archive destination                 MYDB2_DATA1  <<– wrong cause the + is missing.
Oldest online log sequence     47585
Next log sequence to archive   47588
Current log sequence             47588

## on the 3rd node:

SQL> archive log list
Database log mode                 Archive Mode
Automatic archival                  Enabled
Archive destination                 MYDB2_DATA1 <<– wrong cause the + is missing .
Oldest online log sequence     52542
Next log sequence to archive   52545
Current log sequence             52545

So after that my conclusion was  that  due to the wrong archive_destination on two out of three nodes in my rac the archives where being written on a local filesystem on those two boxes  and on top of that in the $ORACLE_HOME/dbs directory. Well that was not good  !!

## So first steps to be performed  was to correct the archive_destination to:

ALTER SYSTEM SET log_archive_dest_1=’LOCATION=+MYDB2_FRA1′ SCOPE=BOTH SID=’MYDB21′;
ALTER SYSTEM SET log_archive_dest_1=’LOCATION=+MYDB2_FRA1′ SCOPE=BOTH SID=’MYDB2′;
ALTER SYSTEM SET log_archive_dest_1=’LOCATION=+MYDB2_FRA1′ SCOPE=BOTH SID=’MYDB23′;

## Once that was corrected I performed a couple of log switches to see archives being born in the +MYDB2_FRA1 diskgroup. Which was the case so already a bit happy me.

## Since I had noticed that the missing +  in the archive destination caused Oracle to create the archives.
## in $ORACLE_HOME/dbs.
## I had to copy  the archives present there to asm disk group  in asmcmd in the correct sub-folder ( I noticed that the archives had been of the last 2 recent days by the creation date of them in Linux so  I was aware where to put them in ASM):

cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47523_2.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47525_2.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47526_2.arc .

## and I did the same on the other server.

oracle@mysrvr:/opt/oracle/product/11202_ee_64/db/dbs/ [MYDB23]# ls -ltr *.arc .

cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52480_3.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52481_3.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52482_3.arc .

## Next step was how to find out how to register these archives after they have moved to ASM diskgroup in rman. And again fellow bloggers did not let me down so after spending some time on Google surfing   I came up with this:

I connected to the rman environment both target  and catalog and I performed this which was nice since  I only had to point to the directory where all the archives were located now in ASM after my copy action:

catalog start with  ‘+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/’;
catalog start with  ‘+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_11/’;

## Once  that had been done I ran an archive backup with success. And  was happy  that I had solved another Puzzle .

 

As always Happy reading  and till next time,

 

Mathijs

 

 

The good , the bad and the OPatch (applying PSU April 2014 on Rac and Oracle Restart)

Introduction:

 

Last couple of weeks  I have been busy  patching and upgrading Production , Preproduction en test environment and during those activities OPatch was my friend and tool for that.  Looking back an after talking to colleagues I decided to create a post for this .  In my patching activities I had to apply  a recent PSU patch to both the Grid Infra structure and Rdbms , do an Upgrade of the software  and add the latest PSU patch again. In  your  preparations for OPatch I had issues with regard to storage present on the  mount-point of the Grid Infrastructure . So as part of  activities  you should take a look at your file-system size  Since the  PSU patches will need at least  5 GB free space in the mount.

Preparations:

 

  • As was mentioned in the introduction make sure you have at least  5GB ( more is better in this case) in the mount-point where the Grid infra Structure is located . In my case I had /opt/crs/product/11202/crs as a mount with 15GB of space. In this mount the grid software had been installed and One Psu patch had been applied in the old days ( we are talking October 2012 PSU ). And while applying a required PSU (October 2013) ( required for the upgrade to Oracle 11.2.0.3)  there was not enough space to install the software.
  • Since my current platform is Linux ( this is all about patching Rac environments and Oracle Restart env.) I looked at Metalink and downloaded: p6880880_112000_Linux-x86-64.zip. With every PSU patch you install you should ask yourself is my opatch up to date enough , or should  I download a fresh copy  from Metalink. I tend to  check  and to download a fresh copy every time i am my T-shirt “I-m a patch Dba today and I Like it “.
  • In my environment my software installs look pretty much like this :
    • Grid Infra structure is installed in /opt/crs/product//crs
    • Rdbms is installed in /opt/oracle/product/11202_ee_64/db
    • oh and a bit confusing perhaps my ORACLE_BASE is  the same as the home of the ORACLE user ( which is /opt/oracle)

## tips

•    Make a subdirectory for each psu patch you apply if un unzip N psu patches in same directory opatch will apply them every  time again.
•    Is auto really auto , tend to do it with –oh  which still works fine for me.
•    Keep your Opatch tool up to date .

## Setting up your patching :

oracle@mysrvr:/opt/oracle/stage []# dir
drwxr-xr-x  5 oracle dba     4096 Jun 23 13:29 .
drwxr-xr-x 32 oracle dba     4096 Jun 23 15:22 ..
drwxr-xr-x  2 oracle dba     4096 Jun 11 13:32 OPatchlogs
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:28 psuApr2014
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:29 psuOct2013

## inside psuOct2013

oracle@mysrvr:/opt/oracle/stage/psuOct2013 []# ls -ltr
total 288260
-rw-r–r– 1 oracle dba        21 Apr  4  2013 README.txt
drwxr-xr-x 5 oracle dba      4096 Apr  4  2013 16459322
-rw-r–r– 1 oracle dba       450 Oct  9  2013 bundle.xml
drwxrwxr-x 9 oracle dba      4096 Oct 10  2013 17082367
-rw-rw-r– 1 oracle dba    141496 Jan 20 05:18 README.html
-rw-rw-r– 1 oracle dba    136871 Jan 20 05:18 PatchSearch.xml
-rwxr-xr-x 1 oracle dba 294574955 Jun  4 07:28 p17272753_112020_Linux-x86-64.zip

## Inside psuApr2014

oracle@mysrvr:/opt/oracle/stage/psuApr2014 []# ls -ltr
total 586820
drwxr-xr-x  5 oracle dba      4096 Jan  9 16:27 17592127
drwxrwxr-x 12 oracle dba      4096 Feb  5 07:04 18031683
-rw-r–r–  1 oracle dba       450 Feb 10 10:16 bundle.xml
-rw-r–r–  1 oracle dba         0 Feb 10 10:17 README.txt
-rw-rw-r–  1 oracle dba     59977 Apr 15 12:18 README.html
-rw-rw-r–  1 oracle dba    125015 Apr 15 14:17 PatchSearch.xml
-rwxr-xr-x  1 oracle dba 600096863 May 16 15:33 p18139678_112030_Linux-x86-64.zip

 

## Applying  PSU April 2014

unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your GRID_HOME and ORACLE_HOME directory
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp  set up a response file (and make not of the absolute path for that response file because you will need it during opatch apply.
/opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp : that is my absolute path to the response file
unzip   p18139678_112030_Linux-x86-64.zip ( this was PSU april 2014 )
AS ROOT:export PATH=/opt/crs/product/11203/crs/OPatch:$PATH
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH
which opatch ( check if root can run opatch now )
PER NODE in your Cluster as ROOT :
##Crs
opatch auto /opt/oracle/stage/

unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your cdora directory
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
/opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp
unzip   p18139678_112030_Linux-x86-64.zip
export PATH=/opt/crs/product/11203/crs/OPatch:$PATH
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH
which opatch
PER NODE:
##Crs
opatch auto /opt/oracle/stage/

unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your cdora directory
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
/opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp
unzip   p18139678_112030_Linux-x86-64.zip
export PATH=/opt/crs/product/11203/crs/OPatch:$PATH
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH
which opatch
PPER NODE AS ROOT:##Crsopatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

##Rdbms

opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

## Oracle Restart

/opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

/opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

-ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs
##Rdbms
opatch auto /opt/oracle/stage/11203 -ocmrf /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

## Oracle Restart
/opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

/opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

 

And as last recommendation . Check the logfiles that are produced during the OPatch in detail  cause i have seen a situation where the OPatch reported “succeeded”  but a detailed look in the logs showed that one of the patches had not been applied due to lack of space !!!!

 

As always happy reading and have a great day,

 

Mathijs

Changing Directory privileges The story of the ORACLE_BASE in Psu patching with Opatch in 11.2.0.2.0

Introduction

 

This week i have applied   psu patches both on Real Application Cluster and Oracle Restart environments and that brought me to an interesting  observation again. The way we set up the environments in Linux had consequences with regard to Privileges to my directories changing.   Lets go into more detail and see if  reading this post is  of interest to  you too.

 Details.

First let us take a look at the way the servers i work on are set up:

  • When the oracle user logs in on the Linux box his HOME  ( the home of the Oracle user)  is set to /opt/oracle.
  • Long ago  ( long time before  having experience with Rac environments the Company standards dictated that the ORACLE_BASE should always be the /opt/oracle. These standards are still alive for our current installations ( with or without  Real Application Clusters) and that is something i have to respect and have to deal with.
  • When installing the Oracle software for an Oracle Rac environment two Main software trees are important:
  1. The Grid Infra structure software is being installed under /opt/oracle/crs/product//crs
  2. The Rdbms software is installed in /opt/oracle /product/_ee_64/db

The Grid Infra ( aka GI )  (1) has its own software tree because in that required way Root owner ship of the Software can be guaranteed during setup and while running, not impacting the normal Oracle Database software and Software use . ( after all we would not want to run all as root right ( even when that sounds tempting some times 🙂 ).

The Normal Oracle software (2) is installed in the ORACLE_BASE which is correct ( because it does not need root privileges to run) .

However  installing and patching a GI environment is going to be an issue if you keep the ORACLE_BASE to /opt/oracle  and it is the same as the HOME of the Oracle user since after applying psu patch to the Grid Infra  i all over sudden had issues in various ways:

After apply psu for gi :oracle@nlmysrvr/opt []# ls -al

total 101

drwxrwxr-x 19 root   root   4096 Nov 13  2013 .

drwxr-xr-x 28 root   root   4096 May 26 09:52 ..

drwxr-xr-x  3 root   root   4096 Nov 13  2013 bmc

drwxr-xr-x  4 root   hpsmh  4096 Apr  5  2012 compaq

drwxr-xr-x  5 oracle dba    4096 May  3  2012 crs

drwxr-xr-x  3 root   root   4096 Apr 12  2012 emc

lrwxrwxrwx  1 root   root     13 Apr  5  2012 eRunbook -> /opt/novaInfo

drwxr-xr-x 13 root   hpsmh  4096 Apr  5  2012 hp

drwx——  2 root   root  16384 Apr  5  2012 lost+found

drwxr-xr-x  3 root   root   4096 Apr  5  2012 lsi

drwxr-xr-x  9 root   root   1024 Dec 13  2012 networker

drwxr-x—  7 root   root   4096 Apr  5  2012 novaInfo

drwxrwxr-x  6 root   root   4096 Dec  4  2012 nsr

drwxr-x— 33 root   dba   12288 Jun 12 13:22 oracle

drwxr-xr-x  3 root   root   4096 May  3  2012 ORCLfmap

dr-xr-xr-x 25 bin    bin    4096 Jul 10  2013 OV

dr-xr-xr-x 14 bin    bin    4096 Dec 11  2012 perf

drwxr-xr-x  4 root   root   4096 Sep 12  2012 SAN

drwxr-xr-x  5 root   root   4096 Jul 20  2012 semo

drwxr-xr-x  9 root   root   4096 Jul 10  2012 SP

## During my next ssh log in i noticed this:

ssh nlmysrvr

Last login: Thu Jun 12 13:22:15 2014 from homeServer.domain

/usr/bin/xauth:  timeout in locking authority file /opt/oracle/.Xauthority

 

Logins took a long time ..

##

Solution:

As root

chown oracle:dba oracle again on the /opt/oracle directory.

 

I spoke to Oracle consultants on this and they recommended  that best practice would be:

  1. Set up  a new standard in which the User Oracle when logging in would start in a directory like /home/oracle.
  2. Set up a Software tree  for the Rdbms Admin  in an ORACLE_BASE  ( so in our case /opt/oracle would be acceptable).
  3. Install the GI software in its own directory since it will  run as root and it would be out of the ORACLE_BASE (  i learned would be best to unset the environment variable ORACLE_BASE).

 

As we all know , old habits or standards die-hard . So at moment i will settle for every  scenario to unset the ORACLE_BASE variable , do my checks  before , and after installation ( always keep a few sessions ope) and i would tweak the privileges were needed after applying the  psu patch or after running root(upgrade).sh  for the GI installation. But i would not only check :

  • /opt/oracle privileges  after running a psu patch
  • would even when things look ok  go into /opt/oracle/.ssh and make sure that the ..  has 755 as a  value .
  • if not the case would chown  and or chmod to fix things .

 

As always there is never a one solution fits all purposes here . In an ideal world indeed would love to see the home of the Oracle User in /home/Oracle and have an ORACLE_BASE  under /opt/oracle.  But as always we don’t always live in an ideal work so it is better to be aware of possible side effects like the ones i describe here and make it part of your plans to look at the situation before and after applying  patches or installing software . One of the nice quotes i came to appreciate was  success just loves preparation . Make it part of plan to check things when starting to patch .

 

Happy reading and till we meet again in next blog.

 

Mathijs

 

 

ORA-12514, TNS:listener does not.. in 11.2.0.2 and 11.2.0.3 Rac env.

This morning in the Dutch environment we have been bitten by that bug again on one of the databases:

Bug 11772838 – oraagent modifies LOCAL_LISTENER with an incorrect value, when the parameter was set in spfile (Doc ID 11772838.8)

So clearly a case of Monday again and both 11.2.0.2 and 11.2.0.3 Rac environments might be affected!

Customer complained that his connection to the instance was not working:
oracle.javatools.db.DBException: java.sql.SQLException: Listener refused the connection with the following error:
ORA-12514, TNS:listener does not currently know of service requested in connect descriptor

I logged into the  instances and due to the fact that customer could tell me connection to second instance working I compared local_listener again.
And indeed  that one had been altered as is described in the note by the cluster agent last friday  :

SQL> show parameter listener

## wrong
NAME TYPE VALUE
———————————— ———– ——————————
listener_networks string
local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=195.233.333.77)(PORT=1521))))

## correct

local_listener string (DESCRIPTION=(ADDRESS_LIST=(AD
DRESS=(PROTOCOL=TCP)(HOST=mysrvr36r-vip)(PORT=33007))))

The fix is straight forward  from the note ( and the workaround  used is  indeed  to add a blank ( grins saved by a blank  !)

ALTER SYSTEM SET local_listener='(DESCRIPTION= (ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=mysrvr35r-vip)(PORT=33007))))’ SCOPE=BOTH SID=’MYDBWHP1′;

ALTER SYSTEM SET local_listener='(DESCRIPTION= (ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=mysrvr36r-vip)(PORT=33007))))’ SCOPE=BOTH SID=’MYDBWHP2′;

 

I searched the alert file of the instance on the 35r and indeed found  in the alert file:

Fri May 16 21:47:01 2014
ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=195.233.333.77)(PORT=1521))))’ SCOPE=MEMORY SID=’MYDBWHP1′;

Note:
I have checked the cluster logfiles but I cannot find a true issue there .

and both Customer and Dba happy again.

happy reading and best of luck,

Mathijs.

When ORA-01144: File size (6553600 blocks) exceeds … hits you

Introduction.

In February  i had set up a database ( 11.2.02.) on Linux with ASM.  One funny thing there was that the only data tablespace was requested to be in size of a 4.5 TB  just for holding data and indexes. Recently I was asked by the the Application team to break up the tablespace into two tablespaces. In itself that is not that big a deal of course but during recreate of the tablespace i got a nice error messages which triggers me to this mini post :

Details.

This is how I had set up the create / alter of the tablespace at first:

create tablespace MY_DATA datafile  '+MYTS_DATA01' size 50G autoextend off
 EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO autoallocate;

create tablespace MY_DATA
*
ERROR at line 1:
ORA-01144: File size (6553600 blocks) exceeds maximum of 4194303 blocks

After  using Google ( always happy to find blogs that point into the right direction . I found out that the error message was triggered by the db_block_size was only 8k in my database.  (you can also take a look at  ORA-01144 on Create Tablespace or Resize of Datafile (Doc ID 104280.1).

After that the fix  for this issue was that datafile size simply needed to be smaller for my action to become succesfull:

My example for one of the tablespaces now looks like this :
#!/bin/ksh
#

${ORACLE_HOME}/bin/sqlplus /nolog << EOF
set echo on
set feedback on
spool CreaAdd.lst
--
connect / as sysdba
set echo on
set feedback on
set timing on

drop tablespace MY_DATA including contents and datafiles;

create tablespace MY_DATA
  datafile  '+MYTS_DATA01' size 30G autoextend off
EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT AUTO autoallocate;
alter database backup controlfile to trace;
EOF
x=1
while [ $x -le 68 ]
do
 $ORACLE_HOME/bin/sqlplus /nolog << EOF
 set echo on
 set feedback on
 set timing on
 spool step6_MY_DATA_$x.lst
 connect / as sysdba
 alter tablespace MY_DATA add datafile  '+MYTS_DATA01' size 30G autoextend off;
 quit 
EOF
x=$(( $x + 1 ))
done

exit

Happy end .. now i only need wait till the datafiles are added .

Happy reading ,

 

Mathijs.

 

 

What’s in a Name (renaming ASM Diskgroup in 11.2)

Introduction:

Once again  it showed that the old  quote .. what’s  in a name .. is still valid even in an Oracle 11.2.0.3.0 Oracle restart environment  so credits to William S. for that. Well  you might wonder where is he heading with the article with title like this again. Frankly i had set up an Oracle restart environment where during the Installation  you are asked about one ( or more )  Asm Disks to be added to a Diskgroup that will hold the Spfile of the ASM instance.  During that install i made a typo in the name of the Diskgroup and only noticed of course after finishing install.

I searched the Web and came across a real cool scenario which i have explored and guess what  it works !  Below you find the steps i have performed.   Environment was: 11.2.0.3 Enterprise Edition on Red  Hat Linux.

Details:

On the web  I found bloggers sharing the idea that you can indeed rename a diskgroup in 11gr2. The tool for that is on the OS level and is part of you GridInfrastructure software. So it could very well look pretty much like this:

 which renamedg
/opt/crs/product/11203_ee_64/crs/bin/renamedg

Anyhow i had a bit of a struggle with the syntax  but in the end this is the command that does the magic. Note first few attempts failed since  I did not add the asm_diskstring in the command line . Be wise and please use it in a modified if necessary way from the beginning:

 

Step 1 Preparations

As root i stopped the Oracle restart environment :

 ./crsctl stop has

Step 2 Renaming the Diskgroup

Syntax

renamedg phase=both dgname=DUMM_YDATA newdgname=DUMMY_DATA verbose=true asm_diskstring=’/dev/mapper/asm*p1′

As you can see i am renaming my diskgroup DUMM_YDATA to DUMMY_DATA  this way.

 

Output coming from that was :
NOTE: No asm libraries found in the system
Parsing parameters..
Parameters in effect:
Old DG name       : DUMM_YDATA
New DG name          : DUMMY_DATA
Phases               :
Phase 1
Phase 2
Discovery str        : /dev/mapper/asm*p1
Clean              : TRUE
Raw only           : TRUE
renamedg operation: phase=both dgname=DUMM_YDATA newdgname=DUMMY_DATA verbose=true asm_diskstring=/dev/mapper/asm*p1
Executing phase 1
Discovering the group
Performing discovery with string:/dev/mapper/asm*p1
Identified disk UFS:/dev/mapper/asm-spfile01p1 with disk number:0 and timestamp (33001295 453578752)
Checking for hearbeat...
Re-discovering the group
Performing discovery with string:/dev/mapper/asm*p1
Identified disk UFS:/dev/mapper/asm-spfile01p1 with disk number:0 and timestamp (33001295 453578752)
Checking if the diskgroup is mounted or used by CSS
Checking disk number:0
Generating configuration file..
Completed phase 1
Executing phase 2
Looking for /dev/mapper/asm-spfile01p1
Modifying the header
Completed phase 2

Step 3 Where to find the Spfile for ASM in Grid Infra

 

With this first success it was time to move on to step 2. I modified the information in the cluster layer with this once i had looked up the exact name of the spfile with ASMCMD:

Syntax
srvctl modify asm  -p +DUMMY_DATA/asm/asmparameterfile/registry.253.843232019

 

Step 4 Restarted the Has deamon:

As Root:

 ./crsctl start has

 

Step 5 Final activities: Check and cleanup:

 

SQL> show parameter spfile

NAME               TYPE VALUE
------------ ----------- ------------------------------
spfile       string      +DUMMY_DATA/asm/asmparameterfile/registry.253.843232019

 

Checked also the information in the Cluster:

oracle@mysrvr10hr:/opt/oracle [+ASM]# crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DUMMY_DATA.dg
               ONLINE  ONLINE       mysrvr10hr
ora.DUMM_YDATA.dg
               ONLINE  OFFLINE      mysrvr10hr
ora.LISTENER.lsnr
               ONLINE  ONLINE       mysrvr10hr
ora.asm
               ONLINE  ONLINE       mysrvr10hr                 Started
ora.ons
               OFFLINE OFFLINE      mysrvr10hr
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       mysrvr10hr
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       mysrvr10hr

## checked in asmcmd

asmcmd
[Option  -p will be used ]
ASMCMD [+] > spget
+DUMMY_DATA/asm/asmparameterfile/registry.253.843232019

Ok last step, clean up since the cluster still knows about the Diskgroup with the wrong name:

srvctl remove diskgroup -g DUMM_YDATA

 

Mission completed and again a smiling dba here. Please Test before do !

 

Happy reading,

 

Mathijs

Moving from the Asmlib to multi-pathing / creating a shared spfile in ASM – Grid Infra

Introduction

No this is not a note in defense or offense for the ASMLIB. It is merely a registration of the fact that in the past Real application clusters  have been setup with ASMLIB  and these Clusters have turned into trouble(d) environments recently due to wrong allocation of ASMLIB Disks. Since the company standards have changed over the year in favor of the Multi-pathing  it has been decided that the troubled cluster would  have to start using the Multi-pathing . For a general overview of ASMLIB and Multi-pathing :  http://oracle.su/docs/11g/server.112/e10500/asmprepare.htm.

Also it has been decided to alter the standards a that ruled couple of years  ago ( implementing Rac clusters with Asm with ASMLIB) to now a days company standards which embraced the Multi-pathing. So far no one ever did this kind of convert on live systems so it would need a proper scenario a dress-rehearsal test before we would implement this on production environment .

In general terms it was explained to me that ASMLIB is like a layer on TOP of the Multi-Pathing  presenting Disks in ASM with a specific labeling. The asm_diskstring  in an ASMLIB environment would point to those disks with  ‘ORCL:1’ . Where the asm_diskstring would be like  the key which disks should be discovered by the ASM instance to work with. The scenario turned out to be remarkable simple to the Oracle Dbas. Bottom line was that even when the disks have been labeled for ASMLIB  if we would alter the asm_diskstring to point to the disks via ‘/dev/mapper/asm*p1’ that would already do the job since the ASMLIB labels would simply be ignored by multi-path. But as always do not  try this at home unless  you alter it in a controlled way as described below.

Details:

As a preparation i ran this scenario on a preproduction box  and with  that being a good dress-rehearsal implemented this on the production as well.  More details will be described below:

Plan of Approach
1 alter system set asm_diskstring = ‘/dev/mapper/asm*p1’ scope = spfile; Oracle
1 Stopping Crs on all 4 Nodes Oracle
1 Started Crs on mysrvr33r Oracle
1 Started Crs on mysrvr34r Oracle
1 Started Crs on mysrvr35r Oracle
1 Started Crs on mysrvr36r Oracle
2 Stopping Crs on all 4 Nodes Oracle
2 Removing ASMLIB RPMS 33r 34r 35r 36r Linux
2 Server reboot of mysrvr33r Linux
2 Check Cluster on mysrvr33r , all up Oracle
2 Restart of mysrvr34r 35r 36r Linux
2 Check  33r 34r 35r 36r Oracle
3 Stopping Crs on 34r 35r 36r Nodes Oracle
3 Create spfile in asm diskgroup on 33r Oracle
3 Restart of  CRS  on 33r Oracle
3 Check gpnp profile / spfile on 33r Oracle
3 Restart of crs on mysrvr34r 35r 36r Oracle
3 Check gpnp profile / spfile on 33r – 36r Oracle
  1. I have used three waves of activities . In the first wave As as preparation i altered the asm_diskgstring in the spfile of the asm instances already ( of course not yet active till after the next restart ) .  After that i stopped the full cluster on all 4 nodes and started with my first node to see the effects ( and to see the asm instance and database instances to be started ) all nodes had been restarted. And it showed all was running well.
  2. In this action i worked together with the Linux admins. The boxes in scope running on RedHat  they wanted to get rid of the ASMLIB in the kernel as well so in the second wave i shutdown the full cluster one more time, they removed the rpms from the Linux and rebooted  the first box and all was well after that. So after my checks the other three boxes were started in parallel and the end result was a happy  and running cluster again !
  3. In wave three i had to fix some old hurt that these 4 boxes were still working with local spfiles instead of a shared spfile in the asm instance. When this Cluster was build by me some three years ago it was born as a 11.1 cluster environment and it had been set up with local copies of  the spfile:

Setting up a (lost) Spfile in ASM in a Grid infrastructure environment:

##  First step  i thought would be to stop the cluster since i did that on another scenario. Much to my surprise i was recommended to do this in the running environment so i  did  this after i prepared a valid init.ora:

 SQL> create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW'
*
ERROR at line 1:
ORA-29780: unable to connect to GPnP daemon [CLSGPNP_ERR]

## As you can see that did not work.  Investigation brought  following note (Environment Variable ORA_CRS_HOME MUST be UNSET in 11gR2/12c GI (Doc ID 1502996.1) . I checked and indeed that environment Variable was present in my  .profile:

#### bad practice to have ORA_CRS_HOME  set in your .profile if you are using GI so we unset it !!!!!!!!!!!!!!!!!!!

oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# unset  ORA_CRS_HOME
oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# echo $ORA_CRS_HOME
## after that  the create of the spfile worked.
SQL> create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
File created.

##  Lets do some checks  ( in asmcmd)  it showed:

ASMCMD [+] > spget
+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299

## Second check using the gpnptool:

gpnptool get

Warning: some command line parameters were defaulted. Resulting command line:          /opt/crs/product/112_ee_64/crs/bin/gpnptool.bin get -o- http://www.grid-pnp.org/2005/11/gpnp-profile” xmlns:gpnp=”http://www.grid-pnp.org/2005/11/gpnp-profile” xmlns:orcl=”http://www.oracle.com/gpnp/2005/11/gpnp-profile” xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation=”http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd” ProfileSequence=”58″ ClusterUId=”2b7266b0d5797f65ff0fcf4c8e7931d6″ ClusterName=”mysrvr3_cluster” PALocation=””><gpnp:Network-Profile>SPFile=”+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299″/><orcl:OCR-Profile id=”ocr” OCRId=”372102285″/>http://www.w3.org/2000/09/xmldsig#“><ds:CanonicalizationMethod Algorithm=”http://www.w3.org/2001/10/xml-exc-c14n#“/><ds:SignatureMethod Algorithm=”http://www.w3.org/2000/09/xmldsig#rsa-sha1“/><ds:Reference URI=””><ds:Transforms><ds:Transform Algorithm=”http://www.w3.org/2000/09/xmldsig#enveloped-signature“/><ds:Transform Algorithm=”http://www.w3.org/2001/10/xml-exc-c14n#“> <InclusiveNamespaces xmlns=”http://www.w3.org/2001/10/xml-exc-c14n#” PrefixList=”gpnp orcl xsi”/>http://www.w3.org/2000/09/xmldsig#sha1“/>3Tjuts50Gi92r42OMa4Pb17PiYc=B941IphE6D1FqVhc1u/+NwhAM3QXbBRiMT0plxhXyptUnj4mu1T1UFP/5yG+yBIzblquOy4aqxNBthMy7aQW0lyS4QfMZbjWYhYH2nvbrnnyqY/ZoYXOY0QaAYciboALXxJxCzup6ZGxCnsgtT8G/b08z679j8NlMvykdE2pmWY=

## in asmcmd:

ls -l +CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299

[Option  -p will be used ]

Type              Redund  Striped  Time             Sys  Name

ASMPARAMETERFILE  UNPROT  COARSE   MAR 11 23:00:00  Y    registry.253.841967299

## alert log shows:

Tue Mar 11 23:48:18 2014

NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1

NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1

## looks good  on other servers as well ( checked them all and  they showed similar like below:

gpnptool get

## As a next step i moved the existing spfile in the $ORACLE_HOME/dbs

mv spfile+ASM1.ora spfile+ASM1.ora.20140311.old

## then edited a  the init.ora to make it point to the ASM-Diskgroup as well.

vi init+ASM1.ora

## i restarted the cluster and after that i checked in the asm Instance.

SQL> show parameter spfile
NAME                                                      TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile                                                       string               +CLUSTERDATA/mysrvr3_cluster/as
mparameterfile/registry.253.84
1967299

## checks are asmcmd spget  and gpnptool get

##  copied  the init.ora to otherboxes

cat
oracle@mysrvr33r:/opt/crs/product/112_ee_64/crs/dbs [+ASM1]# cat init+ASM1.ora
spfile='+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299'
1062      scp init+ASM1.ora oracle@mysrvr34r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM2.ora
1063      scp init+ASM1.ora oracle@mysrvr35r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM3.ora
1064      scp init+ASM1.ora oracle@mysrvr36r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM4.ora

Mission completed. As always  happy reading And DO test before you  implement!

Mathijs .

ORA-01031: insufficient privileges with logon as sydba

Introduction:

I guess we all sometimes have that famous face palm moment where we do not have a direct success story to offer when a customer asks something . In my case I was asked to offer the sys password to a third-party. I did grumble about it but hey that is the way it is sometimes.  And yes during the time where they have sys-password  not responsible for that database AND I will  have to audit the system after  I take control of it again but that is another story.

Details:

In this case I had altered the password of sys , I tested it locally on the machine and I had given it to the customer and well he came back to me telling me it did not work …

The error they got and the one I researched for:

sqlplus "sys/password@MYB.prod.nl as sysdba"

ERROR:
ORA-01031: insufficient privileges

Well That one made me frown because : 
sqlplus system@MYDB.prod.nl

Connected to:

And then do this locally without connect string:

SQL> connect sys as sysdba 
Enter password: 
Connected.

So with the Web on my side  I was looking for clues-on this one. And again love the on-line community of Oracle because soon I found out  that I should check if password files existed and if the password on that file was set according to the sys password.

I checked and frowned because there was no password file present at all in the $ORACLE_HOME/dbs ! So I created a password file. Tested it and  it worked !

Surely that was a face-palm moment to me like hey I have been there – done that 🙂  but still nice got it working asap after all .

Syntax  IN $ORACLE_HOME/dbs: 
orapwd FILE=orapwMYDB1 password=
After that I copied it to the second node with scp and of course renamed  orapwMYDB1 to  orapwMYDB2.

See. Sometimes it is not that complex a matter to keep customer and Dba happy 🙂

Happy reading,

Mathijs

Are we or Are we not going to relink Oracle software after A Linux Upgrade

Introduction

Recently i have talked to  my managers with this one big topic on my mind . The Linux department is rapidly patching their servers to higher levels of Linux and they ask for support during and after them patching the OS. From management side I needed clear directive  about Oracle team following those activities closely. With Oracle support  I have started to gather information about the should or should not  relink the software . Experience will tell which way was the right one.  Cause of course there is also risks involved when relinking the Oracle  software . Examples from past seem to tell that relink brought trouble with it too.

As a central note i was given  Mos Note Is It Necessary To Relink Oracle Following OS Upgrade? (Doc ID 444595.1)

In there you see two great references :

Note 883299.1 Oracle 11gR2 Relink New Feature

and

Note 220970.1 RAC: Frequently Asked Questions
Note 284785.1 How to Check Whether Oracle Binary/Instance is RAC Enabled and Relink Oracle Binary in RAC

From the rac note ) 220970.1  this is a interesting quote as with regard to  the Grid infra structure environment:

Do I need to relink the Oracle Clusterware / Grid Infrastructure home after an OS upgrade?

Using Oracle Clusterware 10g and 11.1, Oracle Clusterware binaries cannot be relinked. However, the client shared libraries, which are part of the home can be relinked, in most cases there should not be a need to relink them. See Note:743649.1 for more information. 
 
 Using Oracle Grid Infrastructure 11.2 and higher, there are some executables in the Grid home that can and should be relinked after an OS upgrade. The following steps describe how to relink an Oracle Grid Infrastructure for Clusters home: 
 
 As root:
 
 # cd Grid_home/crs/install
 # perl rootcrs.pl -unlock
 
 As the grid infrastructure for a cluster owner:
 
 $ export ORACLE_HOME=Grid_home
 $ Grid_home/bin/relink
 
 As root again:
 
 # cd Grid_home/crs/install
 # perl rootcrs.pl -patch 
 
 Note: If using Oracle Grid Infrastructure for Standalone Environments (Oracle Restart), see the Oracle Documentation for more information: http://docs.oracle.com/cd/E11882_01/install.112/e24321/oraclerestart.htm#BABJCHIJ

So should you relink ? Well frankly I would follow recommends  of Oracle in this matter  with the idea that it needs to be done on low priority boxes like test or preprod first before touching production . And should you decide not to relink due to experiences in the past , well i would recommend to be standby once the servers are patched Os wise and the grid infra structure  and databases are starting .

 

happy reading,

 

Mathijs

Cloning your 11.2 Oracle home from an existing server to a new server

Introduction:

At the moment I am reading a great book on 12c (Pro Oracle Database 12c administration) by Darl Kuhn (http://www.apress.com/9781430257288) . And I really like his style and elaborations on topics . While i was  reading it is like being held a mirror. Indeed most times I would also use the runInstaller interactively to run the installation of the Oracle binaries and yes would even frown if I would not be allowed to use a xterm to do installations. Well Darl you gave me an inspiration for set up of environment i need shortly.  So this is a big applause for him already!

Details:

For a data migration environment i am asked to set up Oracle on a Linux server identical to the production servers that are part of the scope of the project.  After installation I will be working with BCVS to bring copies of a number of databases to this box. The production servers are 11.2.2.0 with  one-off patches and a psu patch ..  And as an important add-on information i have installed the Grid Infrastructure software already on the new box so i have Oracle Restart with an ASM instance running. ( that also means that  i have an OraInst location and an existing Oracle Inventory in place too). So which way would i head.

  1. Install Oracle binaries from scratch , and apply all the patches
  2. Clone  the Rdbms software from a production box.

Since i got time i investigated both options. But in the end i loved the scenario 2 below because I always had done the normal interactive way installation:

First $ORACLE_HOME was checked:

$ echo $ORACLE_HOME
/opt/oracle/product/112_ee_64/db

Next step was to move to the ORACLE_HOME

$ cd $ORACLE_HOME

Almost there. Moved back  ( up ) one level ( so I was in /opt/oracle/product/112_ee_64 directory).

$ cd ..

Was aiming  for the tar utility to copy everything in the subdirectory db (and below)  in the  /opt/oracle/product/112_ee_64/.

Note if you get error messages of files not being included ( happened to me too make sure you do a nohup  tar -cvf myorahome.tar>Mylogfile.lst) Most likely you either ran out of space or did not have enough privileges to run the tar for all files ( so i ran  tar as root in the end).

$ tar -cvf myorahome.tar db
  • tar -cvf is the command  to run with the options -cvf
  • myorahome.tar will be holding all the files in all the subdirectories below the /opt/oracle/product/112_ee_64/
  • db is the subdirectory ( part of my ORACLE_HOME) which I need to be added to the tar file.

That created for a full-sized Rdbms Installation with all patches included a Tarred file  of about  5G. After that  myorahome.tar file was copied to the new server with scp.

The tar file was copied to the already existing /opt/oracle/product/112_ee_64 on the new machine. There the tar file was extracted and the extract created the subdirectory db again ( and all others below 2) as part of the extract:

$ cd /opt/oracle/product/112_ee_64
Note: make sure you have plenty space on the new server ( better check that before extraction.
Next, extract the files:
$ tar -xvf myorahome.tar

Next step was needed to include, to register this cloned installation in the Oracle Inventory. In my case I used the runInstaller inside the new Oracle home. First to make sure that the new Oracle Home was not known in the Inventory by detaching it:

./runInstaller -detachHome ORACLE_HOME="/opt/oracle/product/112_ee_64/db"

Then I added the cloned software to the Inventory:

./runInstaller -silent -ignoreSysPrereqs -attachHome ORACLE_HOME="/opt/oracle/product/112_ee_64/db" ORACLE_HOME_NAME="OraDb11g_home1" LOCAL_NODE="" CLUSTER_NODES="

But of course the proof of the pudding is in tasting it once this had finished:

So  it was needed to check with Opatch if the Orainventory now was aware about all installed patches.

export PATH=/opt/oracle/product/112_ee_64/db/OPatch:$PATH

I checked  opatch was the correct one:

which opatch

Then I checked the patches with the command below:

opach lsinventory -oh /opt/oracle/product/112_ee_64/db

And it showed all the correct patches . So indeed I had cloned the Oracle software in a correct way to this machine . Yah ! And a  big thank you to Darl  for a great book which pointed me in the right direction.

Happy reading,

Mathijs