Altering the Heartbeat in an Oracle RAC environment.

Introduction

When asked about part of the job description of a DBA, I like to take a moment, smile and reply: to serve and to protect the cluster, the databases and, of course, the data. In the matter at hand this means that the mission/assignment is all about increasing high availability and improving the functionality of the cluster interconnect between nodes, and all that for one price: doubling your cluster interconnect IPs on the various layers.

And to be honest, this assignment is even more interesting because, after all, how often is there a need or a challenge to alter IP addresses once you have set up the cluster? The only valid occasions I could think of would be an action like lifting and shifting the servers to other rooms, or, as in this case, because we simply want to improve availability.

Some days before the action was scheduled, I looked into the matter of the required dedicated IPs together with the colleagues from the team. In /etc/hosts, being the source for that, we looked for IPs which had hb in the alias of the naming convention, which is according to standards on OS level. This information was used to set up the IPs on the OS level. Once that task was completed, a first and very important test had to be: can you ping these new and dedicated IPs from every node that is part of the cluster? And since this is a happy-flow scenario, that was of course what happened. During the maintenance window itself the detailed steps seen below were performed to make these addresses known / usable on the cluster layer in the Grid Infrastructure.

Detailed Plan

In this scenario we are using a 4-node RAC cluster on Red Hat Linux with 11.2.0.4 Grid Infrastructure. When looking on the OS level, this is what we found already present in the hosts file. And even better, these IPs were all available and not in use.

grep -i hb /etc/hosts
10.124.10.75 mysrvrahr-hb1.dc.nl mysrvrahr-hb1
10.124.11.75 mysrvrahr-hb2.dc.nl mysrvrahr-hb2
10.124.10.76 mysrvrbhr-hb1.dc.nl mysrvrbhr-hb1
10.124.11.76 mysrvrbhr-hb2.dc.nl mysrvrbhr-hb2
10.124.10.77 mysrvrchr-hb1.dc.nl mysrvrchr-hb1
10.124.11.77 mysrvrchr-hb2.dc.nl mysrvrchr-hb2
10.124.10.78 mysrvrdhr-hb1.dc.nl mysrvrdhr-hb1
10.124.11.78 mysrvrdhr-hb2.dc.nl mysrvrdhr-hb2

The steps below have been followed, based on a great MOS note, in order to complete the tasks needed to make the Grid Infrastructure (cluster) aware of the new IPs. The scenario runs through a number of steps so as to be well prepared, but of course also to be on the safe side before and during the changes on the cluster layer. It is like, hmm, paying respect and being brave but cautious.

Preparation steps:
As of 11.2 Grid Infrastructure, the private network configuration is not only stored in OCR but also in the gpnp profile. Documentation was very clear on this: If the private network is not available or its definition is incorrect, the CRSD process will not start and any subsequent changes to the OCR will be impossible.

  • Therefore care needs to be taken when making modifications to the configuration of the private network.
  • It is important to perform the changes in the correct order.

Note that manual modification of the gpnp profile is not supported, so it is best to stick to the proper actions and not go into hacking mode!
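By the way, if you want to see what is in the gpnp profile without touching it, the Grid home ships a read-only tool for exactly that; a small sketch (the path below is simply my environment's Grid home):

/app/oracle/product/11.2.0/grid/bin/gpnptool get

This prints the profile XML, including the network definitions, to standard out, which is a nice way to document the pre-change situation.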

So let’s take a backup of profile.xml on all cluster nodes before proceeding:

As the grid user (in my case the oracle user), move to the correct directory (cd $GRID_HOME/gpnp/<hostname>/profiles/peer/).

cd /app/oracle/product/11.2.0/grid/gpnp/mysrvrahr/profiles/peer 
cd /app/oracle/product/11.2.0/grid/gpnp/mysrvrbhr/profiles/peer 
cd /app/oracle/product/11.2.0/grid/gpnp/mysrvrchr/profiles/peer 
cd /app/oracle/product/11.2.0/grid/gpnp/mysrvrdhr/profiles/peer 

During startup of the cluster, Oracle relies on this very important XML file
for specific data like the spfile, diskgroups and, of course, the IPs.

cp -p profile.xml profile.xml.bk
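Since that backup has to exist on every node, you can either log in to each node in turn or loop over them from one session; a minimal sketch, assuming passwordless ssh between the nodes as the oracle user:

for node in mysrvrahr mysrvrbhr mysrvrchr mysrvrdhr; do
  ssh ${node} "cd /app/oracle/product/11.2.0/grid/gpnp/${node}/profiles/peer && cp -p profile.xml profile.xml.bk"
done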

Ensure Oracle Clusterware is running on ALL cluster nodes in the cluster and save the current status of the resources (better safe than sorry, and it makes sure you know about the health of the cluster and its resources as they have been defined). So do check the cluster and save the current status of the resources in a file as a pre-change image.

/app/oracle/product/11.2.0/grid/bin/crsctl check cluster -all 
/app/oracle/product/11.2.0/grid/bin/crsctl status resource -t>/tmp/beforeNewIps.lst

As the grid user (in my case the oracle user): get the existing information.

Showing which interfaces are defined in the cluster:
## Below you will see that the current (single) cluster interconnect is set up on subnet 64.18.112.208.

/app/oracle/product/11.2.0/grid/bin/oifcfg getif
bond1 64.18.112.208  global  cluster_interconnect
bond0  195.233.190.64  global  public

The iflist command will show you the network information known on the OS, listing all (or specific) IPs. The interfaces / subnet addresses can be identified for eth interfaces specifically with:

/app/oracle/product/11.2.0/grid/bin/oifcfg iflist|grep -i eth|sort 
64.18.32.0 eth0
10.124.10.0 eth2
10.124.11.0 eth6

or

## check  interfaces / subnets in general:
 /app/oracle/product/11.2.0/grid/bin/oifcfg iflist|sort

Since we now have a good picture of the status of the cluster, and since we know more about the IPs being used (oifcfg getif) and about the IPs present on the system (oifcfg iflist), all is set to add the new cluster_interconnect information. As you can see, both the eth2 and the eth6 addresses are defined, and with the -global parameter the information is shared across the complete cluster, on all nodes.

/app/oracle/product/11.2.0/grid/bin/oifcfg setif -global eth2/10.124.10.0:cluster_interconnect 
/app/oracle/product/11.2.0/grid/bin/oifcfg setif -global eth6/10.124.11.0:cluster_interconnect

Of course there cannot be a change without verifying it, so I checked on all nodes with the command below.

/app/oracle/product/11.2.0/grid/bin/oifcfg getif  

Since we are using 11gR2 Grid Infrastructure, the steps below are to be followed now: shut down Oracle Clusterware on all nodes and disable Oracle Clusterware, as root.

Action is to be performed as the root user: 
sudo su -  
./app/oracle/product/11.2.0/grid/bin/crsctl stop crs 
./app/oracle/product/11.2.0/grid/bin/crsctl disable crs

In this specific scenario my Linux brothers in arms had already made the required network configuration change at OS level, and that great job showed up in the oifcfg iflist command. They made sure that the new interfaces were available on all nodes after their change.

(Check that you can ping the interfaces on all nodes, with a script kindly provided by a Linux team member.)
for x in 10 11;do for xx in 75 76 77 78;do ping -c2 10.124.${x}.${xx}|egrep 'icmp_seq|transmitted';done;echo;done 
for x in a b c d; do for xx in 1 2;do ping -c2 mysrvr${x}hr-hb$xx|egrep 'icmp_seq|transmitted';done;echo;done 

Well, all went well and has been checked, so it is time to restart Oracle Clusterware and, once that is completed, enable Oracle Clusterware again.

On all nodes in the cluster:

## as root user: 
sudo su -  
/app/oracle/product/11.2.0/grid/bin/crsctl start crs

Seeing = believing in this matter, so after some time, check:

/app/oracle/product/11.2.0/grid/bin/crsctl check cluster -all 

In the step below we check the status of the resources in the cluster again and write that information to a file. This "post" operation file is then used to compare the status of the cluster resources before and after.

/app/oracle/product/11.2.0/grid/bin/crsctl status resource -t>/tmp/afterNewIps.lst 
sdiff /tmp/afterNewIps.lst /tmp/beforeNewIps.lst

This compare showed me that a 10g RAC database resource and its services needed my attention, so via the cluster commands I checked and observed their status after starting them with the srvctl command as the oracle user. Once completed, I ran another check as described, and, happy me, all resources in the post-status file were in a similar status (online online) as in the pre-status file.
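For completeness, this is roughly what that looked like; a sketch only, with mydb10g as a hypothetical name for that 10g database resource:

## as the oracle user
srvctl start database -d mydb10g
srvctl start service -d mydb10g
## and verify, including the services
srvctl status database -d mydb10g -v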

as root user: 
sudo su -  
/app/oracle/product/11.2.0/grid/bin/crsctl enable crs

Time to wrap up this scenario. As part of housekeeping remove the old interface:

/app/oracle/product/11.2.0/grid/bin/oifcfg delif -global bond1/64.18.112.208:cluster_interconnect

Verified the environment one more time.

/app/oracle/product/11.2.0/grid/bin/oifcfg getif

The Clusterware had already proven itself, of course, but I checked the databases and listeners as a last sanity check, and that completed the tasks for this time.

Time to inform Apps team that they can perform their sanity checks and start the applications again.

Happy reading and till next time.

Mathijs

Transporting Tablespaces as a Migration to 11.2.0.4 with RMAN.

Introduction:

For one of the projects the question came in to investigate and set up an 11.2.0.4 Real Application Clusters database, with the extra challenge that the migration had to be done cross-platform, from Oracle 10.2.0.3 on the Solaris platform to 11.2.0.4.0 on Linux. From the application provider came the suggestion to investigate a backup-and-restore scenario with an upgrade on the new (Linux) server. Due to the fact that the source environment was 10.2.0.3 on Solaris, and due to the fact that we were heading towards a RAC cluster environment on Linux, that suggestion was the first to be sent to the dustbin.

A normal export / import was the second scenario that was explored. Of course this is a valid scenario, but given the fact that the database was more than 1.x TB, it was not exactly the most favorite way to bring the data across. Still, with scripting, using multiple parfiles, and/or moving partitioned data across in waves, it would be a fair plan B.

From reading up on the subject, I had put my mind to the use of transportable tablespaces as the way forward with this challenging question.

Requirements:

As preparation for the job I requested to have a NAS filesystem mounted between the source server (MySunServer), holding the 10g database, and the target server (MyLinuxcluster). This NAS filesystem would hold the Data Pump dumps to be created, and the scripts and parfiles / config files, as was suggested based on MOS Note 1389592.1. The NAS filesystem was read-writable from both servers. The Perl scripts that come with the note support the transport of the tablespaces, but also help with the conversion from big endian to little endian, and, as a bonus in my case, do the copy into ASM.

Due to the layout of the database in the source environment, RMAN was chosen as the best way forward with the scenario.

As a preparation, an 11.2.0.4 RAC database was set up on the target cluster. This database only holds the normal tablespaces and a small temporary tablespace for the users. (In a TTS solution, the names of the data tablespaces that come across may not already exist in the new environment.) All data / application users were pre-created on the new environment with a new default user tablespace.
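Before filling in the properties file below, it is worth double-checking the source platform id and its endian format in the source database, since that id goes straight into the platformid setting; a minimal sketch in SQL*Plus:

select platform_id, platform_name from v$database;

select platform_id, platform_name, endian_format
  from v$transportable_platform
 where platform_id = (select platform_id from v$database);

For Solaris[tm] OE (64-bit) this returns platform_id 2 with endian format Big, which is why platformid=2 shows up in the configuration below.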

Details & Comments

Configuration file for the Perl scripts:

This is a file that is part of the zip file from the MOS note. It needs to be set up to match your specific needs. I will only show the settings I have used, with their comments:

xtt.properties:
## Reduce Transportable Tablespace Downtime using Incremental Backups
## (Doc ID 1389592.1)

## Properties file for xttdriver.pl

## See documentation below and My Oracle Support Note 1389592.1 for details.
## Tablespaces to transport

## Specify tablespace names in CAPITAL letters.

tablespaces=MYDB_DATA,MYDB_EUC_DATA,MYDB_EUC_INDEX,MYDB_INDEX,MYTS,USERS
##tablespaces=MYCOMPTTS
## Source database platform ID

## platformid

## Source database platform id, obtained from V$DATABASE.PLATFORM_ID

platformid=2

## srclink

## Database link in the destination database that refers to the source

## database. Datafiles will be transferred over this database link using
## dbms_file_transfer.
srclink=TTSLINK

## Location where datafile copies are created during the "-p prepare" step.

## This location must have sufficient free space to hold copies of all
## datafiles being transported.

dfcopydir=/mycomp_mig_db_2_linux/mybox/rman

## backupformat

## Location where incremental backups are created.

backupformat=/mycomp_mig_db_2_linux/mybox/rman

## Destination system file locations

## stageondest

## Location where datafile copies are placed by the user when they are

## transferred manually from the source system. This location must have
## sufficient free space to hold copies of all datafiles being transported.

stageondest=/mycomp_mig_db_2_linux/mybox/rman

## storageondest

## This parameter is used only when Prepare phase method is RMAN backup.

## Location where the converted datafile copies will be written during the

## "-c conversion of datafiles" step. This is the final location of the
## datafiles where they will be used by the destination database.
storageondest=+MYDBP_DATA01/mydbp/datafile
## backupondest

## Location where converted incremental backups on the destination system

## will be written during the "-r roll forward datafiles" step.

## NOTE: If this is set to an ASM location then define properties

##      asm_home and asm_sid below. If this is set to a file system
##       location, then comment out asm_home and asm_sid below
backupondest=+MYDBP_DATA01/mydbp/datafile

## asm_home, asm_sid

## Grid home and SID for the ASM instance that runs on the destination

asm_home=/opt/crs/product/11204/crs

asm_sid=+ASM1
## Parallel parameters

parallel=8

## rollparallel

## Defines the level of parallelism for the -r roll forward operation.

## If undefined, default value is 0 (serial roll forward).

rollparallel=2
## getfileparallel

## Defines the level of parallelism for the -G operation

getfileparallel=4

## desttmpdir

## This should be defined to same directory as TMPDIR for getting the

## temporary files. The incremental backups will be copied to directory pointed
## by stageondest parameter.
desttmpdir=/mycomp_mig_db_2_linux/MYDBP/scripts
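One thing this properties file silently assumes is that the database link named in srclink already exists in the destination database and points back at the source, since dbms_file_transfer pulls the datafiles over it. A minimal sketch of creating it (the connect descriptor and credentials are placeholders, not the real ones from this migration):

create public database link TTSLINK
  connect to system identified by "manager"
  using 'mysunserver:1521/MYDB';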

 

Below, in a table format, you will see the steps performed, with comments.

The steps qualify as:

  • I for Initial steps – activities
  • P for Preparation
  • R for Roll Forward activities
  • T for Transport activities

The server shown in parentheses behind each step is where the action needs to be done.

I1.3 (Source): Identify the tablespace(s) in the source database that will be transported (the application owner needs to support this with schema owner information):
tablespaces=MYDB_DATA,MYDB_EUC_DATA,MYDB_EUC_INDEX,MYDB_INDEX,MYTS,USERS

I1.5 (Source + Target): In my case the project offered an NFS filesystem which I could use: /mycomp_mig_db_2_linux

I1.6 (Source): Together with the MOS note came this zip file: unzip rman-xttconvert.zip.

I1.7 (Source): Tailor the extracted xtt.properties file on the source system to match your environment.

I1.8 (Target): As the oracle software owner, copy all xttconvert scripts and the modified xtt.properties file to the destination system. (This was not needed since we used the NAS filesystem.)

P1.9 (Source + Target): On both environments set up this:
export TMPDIR=/mycomp_mig_db_2_linux/MYDBP/scripts

P2B.1 (Source):
perl xttdriver.pl -p
Note: do NOT use $ORACLE_HOME/perl/bin/perl; this did not work.

P2B.2 (Source): Copy files to destination. N/A since we use NFS.

P2B.3 (Target): On the destination system, logged in as the oracle user with the environment (ORACLE_HOME and ORACLE_SID environment variables) pointing to the destination database, copy the rmanconvert.cmd file created in step 2B.1 from the source system (N/A since we use NFS):
[oracle@dest]$ scp oracle@source:/home/oracle/xtt/rmanconvert.cmd /home/oracle/xtt
and run the convert datafiles step as follows:
perl xttdriver.pl -c

R3.1 (Source): On the source system, logged in as the oracle user with the environment pointing to the source database, run the create incremental step as follows:
perl xttdriver.pl -i

R3.3 (Target): Copy the plan files from the source (since we are using the NAS shared filesystem, there is no need to copy with scp between source and target):
[oracle@dest]$ scp oracle@source:/home/oracle/xtt/xttplan.txt /home/oracle/xtt
[oracle@dest]$ scp oracle@source:/home/oracle/xtt/tsbkupmap.txt /home/oracle/xtt
Then roll the destination datafiles forward:
perl xttdriver.pl -r

R3.4 (Source): On the source system, logged in as the oracle user with the environment pointing to the source database, run the determine new FROM_SCN step as follows:
perl xttdriver.pl -s

R3.5 (Source):
1. If you need to bring the files at the destination database closer in sync with the production system, then repeat the Roll Forward phase, starting with step 3.1.
2. If the files at the destination database are as close as desired to the source database, then proceed to the Transport phase.

T4.0 (Source): As found in note: Alter Tablespace Read Only Hanging When There Are Active TX In Any Tablespace (Doc ID 554832.1). A restart of the database is required to have no active transactions; alternatively, do this during off hours. Actually, during a first test with one dedicated tablespace holding only one object, it took more than 7 hrs: Oracle seems to look and wait for ALL active transactions, not only the ones that would impact the object in the test tablespace I worked with.

T4.1 (Source): On the source system, logged in as the oracle user with the environment pointing to the source database, make the tablespaces being transported READ ONLY:
alter tablespace MYDB_DATA read only;
alter tablespace MYDB_EUC_DATA read only;
alter tablespace MYDB_EUC_INDEX read only;
alter tablespace MYDB_INDEX read only;
alter tablespace MYTS read only;
alter tablespace USERS read only;

T4.2 (Source): Repeat steps 3.1 through 3.3 one last time to create, transfer, convert, and apply the final incremental backup to the destination datafiles:
perl xttdriver.pl -i

T4.2 (Target):
[oracle@dest]$ scp oracle@source:/home/oracle/xtt/xttplan.txt /home/oracle/xtt
[oracle@dest]$ scp oracle@source:/home/oracle/xtt/tsbkupmap.txt /home/oracle/xtt
perl xttdriver.pl -r

T4.3 (Target): On the destination system, logged in as the oracle user with the environment pointing to the destination database, run the generate Data Pump TTS command step as follows:
perl xttdriver.pl -e
This step creates a sample Data Pump network_link transportable import command in the file xttplugin.txt. It holds the list of all the tablespaces you have configured and all their transport_datafiles in detail.
Example of that generated file (cat xttplugin.txt):
impdp directory=MYDB_XTT_DIR logfile=tts_imp.log \
network_link=TTSLINK.PROD.NL transport_full_check=no \
transport_tablespaces=MYCOMPTTS,A,B,C \
transport_datafiles='+MYDBP_DATA01/mycomptts_152.xtf'
Note: in our example, once edited, we chmodded xttplugin.txt to 744 and ran it as a script.

T4.3 (Source): After the object metadata being transported has been extracted from the source database, the tablespaces in the source database may be made READ WRITE again, if desired.

T4.4 (Target): At this step, the transported data is READ ONLY in the destination database. Perform application-specific validation to verify the transported data. Also, run RMAN to check for physical and logical block corruption by running VALIDATE TABLESPACE, as follows (in RMAN):
validate tablespace MYDB_DATA, MYDB_EUC_DATA, MYDB_EUC_INDEX, MYDB_INDEX, MYTS, USERS check logical;

T4.5 (Target): Make the tablespaces read write again:
alter tablespace MYDB_DATA read write;
alter tablespace MYDB_EUC_DATA read write;
alter tablespace MYDB_EUC_INDEX read write;
alter tablespace MYDB_INDEX read write;
alter tablespace MYTS read write;
alter tablespace USERS read write;

T5 (Source + Target): Clean up the NFS filesystem.
Put the source DB in restricted mode as a fallback after the go-live for a couple of days, then put it to tape and decommission it.

When PRCD-1027 and PRCD-1229 are spoiling your rainy day

Introduction:

More than one year ago I had set up an Oracle Restart environment with Grid Infrastructure, ASM and databases all in 11.2.0.2.0, since that was the requirement from the vendor at first. Once the server had been handed over to production, I got the request that it should also host EMC-based clones, and those clones were 11.2.0.3.0. That meant I had to upgrade both the Grid Infrastructure and the database software, and of course the databases as well.

So I geared up, did an upgrade of the GI and the RDBMS software, and of course of the local databases in place. After that the EMC clones were added and everything looked fine.

Until ……….

Error Messages after Server reboot:

Well, until the server got rebooted. After that server reboot, a first sign that things were wrong was that the databases did not start via the Grid Infrastructure, which was not expected!

So there I was again, ready to solve another puzzle, with of course people waiting for the DBs to come online so they could work.

## First clue:

I checked the resource (the database) in the cluster with: crsctl status resource <resource_name> -p

Much to my surprise, that showed the wrong Oracle Home (it was 11.2.0.2.0, the initial Oracle Home before the upgrade). But I was so sure that I had upgraded the database... what did I miss? Even more strange was that the cluster agent kept altering my oratab for that specific database back to the old Oracle Home (and it would almost stick out its tongue at me, telling me: # line has been added by agent).

## Second clue

When I altered the oratab to show the correct Oracle Home, I could start the database via sqlplus, which was indeed my second clue.

After a big face-palm it became clear to me that the clusterware did not have the correct status for that Oracle Home.

## Will srvctl modify do the job?

srvctl modify database -d mydb -o /opt/oracle/product/11203_ee_64/db

##output:

PRCD-1027 : Failed to retrieve database mydb

PRCD-1229 : An attempt to access configuration of database migoms was rejected because its version 11.2.0.2.0 differs from the program version 11.2.0.3.0. Instead run the program from /opt/oracle/product/11202_ee_64/db

Well, that was not expected. Especially since the other clue was that the DB could be started as an 11.2.0.3 database when the Oracle environment was set properly in the oratab.

##Solution:

First I tried:

srvctl modify database -d mydb -o /opt/oracle/product/11203_ee_64/db

but that is wrong as we already saw in this post.

Hmm, then I thought of an expression in German: once you start doing it the right way, things will start to work for you.

Plain and simple, this is what I had to do to make things right again:

srvctl upgrade database -d  mydb -o /opt/oracle/product/11203_ee_64/db

After that I started mydb via the cluster, and she is happy now.
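A quick way to verify that the cluster registry now carries the right home; a minimal sketch:

srvctl config database -d mydb

The output should now show the 11.2.0.3 Oracle home (/opt/oracle/product/11203_ee_64/db) instead of the old 11.2.0.2.0 one.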

## Bottom-line (aka lesson learned).

If you upgrade your databases in an Oracle Restart / RAC cluster environment, make it part of your upgrade plan to upgrade the information about that specific database in the cluster layer.

As always,

Happy Reading and till we meet again.

Mathijs

PRVF-4557 during installation of Oracle Binaries (11.2.0.3)

Introduction:

During an Oracle software install for a new RAC environment, once again I was surprised by Oracle. I had installed the Grid Infrastructure (GI) 11.2.0.3 on Red Hat Linux with success, and I was about to install the Oracle binaries, which should be a piece of cake with the installed GI up and running properly. According to the old quote "never stop learning (or being surprised)", during the preparations the runInstaller managed to give me a surprise and I was unable to continue with the installation. So obviously this surprise needed fixing first.

 

Details:

When running the runInstaller, after choosing the option to install the Oracle binaries on a Real Application Clusters environment, and after a couple of options, I received a pop-up telling me:

PRVF-4557 : Node application "ora.svrb1hr.vip" is offline on node

I was unable to go ahead and had to investigate. As always, Google and the Oracle Community were my brothers in Oracle arms for this, and so I came across this scenario:

First I checked my hosts file to make sure the information was present on both nodes. That showed the following details, which looked OK:

oracle@svrb1hr:/opt/oracle [CRM]# grep  vip /etc/hosts 
10.97.242.32 svrb1hr-vip.myenv.dc-world.de svrb1hr-vip
10.97.242.33 svrb2hr-vip.myenv.dc-world.de svrb2hr-vip

The next step was checking my cluster resources. As mentioned before, the GI install had finished properly the day before, so that really made me wonder:

oracle@svrb1hr:/opt/oracle [CRM]# crsctl status resource -t 
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CLUSTERDATA.dg
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.LISTENER.lsnr
               ONLINE  OFFLINE      svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.asm
               ONLINE  ONLINE       svrb1hr                 Started             
               ONLINE  ONLINE       svrb2hr                 Started             
ora.gsd
               OFFLINE OFFLINE      svrb1hr                                     
               OFFLINE OFFLINE      svrb2hr                                     
ora.net1.network
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.ons
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.registry.acfs
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       svrb1hr                                     
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       svrb2hr                                     
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       svrb1hr                                     
ora.cvu
      1        ONLINE  ONLINE       svrb2hr                                     
ora.svrb1hr.vip
      1        ONLINE  INTERMEDIATE svrb2hr                 FAILED OVER         
ora.svrb2hr.vip
      1        ONLINE  ONLINE       svrb2hr                                     
ora.oc4j
      1        ONLINE  ONLINE       svrb1hr                                     
ora.scan1.vip
      1        ONLINE  ONLINE       svrb1hr                                     
ora.scan2.vip
      1        ONLINE  ONLINE       svrb2hr                                     
ora.scan3.vip
      1        ONLINE  ONLINE       svrb1hr                                     

Well, that was unexpected, because the expectation was that the svrb1hr.vip should be running on the first node. I still don't understand what happened to cause this (and frankly, if you have suggestions as to what happened, please let me know). But I did know it was needed to bring the vip address back to the first server.

The first attempt was to issue the command needed (crs_relocate) on the node where I was already working (node 1).

oracle@svrb1hr:/opt/oracle [CRM]# which crs_relocate
/opt/crs/product/11203/crs/bin/crs_relocate

oracle@svrb1hr:/opt/oracle [CRM]# crs_relocate svrb1hr.vip
CRS-0210: Could not find resource 'svrb1hr.vip'.
### activities needed to be done from the second node

Grumbling about this, but well, at least it was explained what to do next…

So I opened a session against the second node and made sure my Oracle Home was pointing to the GI.

oracle@svrb1hr:/opt/oracle [CRM]# ssh svrb2hr

On the second box  the command crs_relocate was entered:

                     
oracle@svrb2hr:/opt/oracle [CRS]# crs_relocate ora.svrb1hr.vip
Attempting to stop `ora.svrb1hr.vip` on member `svrb2hr`
Stop of `ora.svrb1hr.vip` on member `svrb2hr` succeeded.
Attempting to start `ora.svrb1hr.vip` on member `svrb1hr`
Start of `ora.svrb1hr.vip` on member `svrb1hr` succeeded.
Attempting to start `ora.LISTENER.lsnr` on member `svrb1hr`
Start of `ora.LISTENER.lsnr` on member `svrb1hr` succeeded.

Well, that looked promising, so let's check one more time then:

 

oracle@svrb2hr:/opt/oracle [CRS]#  crsctl status resource -t 
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CLUSTERDATA.dg
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.LISTENER.lsnr
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.asm
               ONLINE  ONLINE       svrb1hr                 Started             
               ONLINE  ONLINE       svrb2hr                 Started             
ora.gsd
               OFFLINE OFFLINE      svrb1hr                                     
               OFFLINE OFFLINE      svrb2hr                                     
ora.net1.network
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.ons
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
ora.registry.acfs
               ONLINE  ONLINE       svrb1hr                                     
               ONLINE  ONLINE       svrb2hr                                     
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       svrb1hr                                     
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       svrb2hr                                     
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       svrb1hr                                     
ora.cvu
      1        ONLINE  ONLINE       svrb2hr                                     
ora.svrb1hr.vip
      1        ONLINE  ONLINE       svrb1hr                                     
ora.svrb2hr.vip
      1        ONLINE  ONLINE       svrb2hr                                     
ora.oc4j
      1        ONLINE  ONLINE       svrb1hr                                     
ora.scan1.vip
      1        ONLINE  ONLINE       svrb1hr                                     
ora.scan2.vip
      1        ONLINE  ONLINE       svrb2hr                                     
ora.scan3.vip
      1        ONLINE  ONLINE       svrb1hr

 

Much better!! After this I restarted the runInstaller from the first node, and indeed the installation of the Oracle binaries was flawless.
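As a side note: crs_relocate is one of the older crs_* commands; in 11.2 the same move can also be expressed through crsctl. A sketch of what I believe the equivalent call to be (untested in this particular scenario):

crsctl relocate resource ora.svrb1hr.vip -n svrb1hr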

 

As always, happy reading,

 

Mathijs

Dropping redo groups in Another Thread

Introduction:

Quite some time ago I had to turn a RAC environment into a single instance / database. However, I noticed there were still redo groups of the second instance present, which I wanted to drop. This is just a quick blog to show the steps I followed:

 

## Checking the redo logs and the thread information:

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#       BYTES  BLOCKSIZE  MEMBERS  ARC  STATUS    FIRST_CHANGE#  FIRST_TIME           NEXT_CHANGE#  NEXT_TIME
----------  ---------  ---------  ----------  ---------  -------  ---  --------  -------------  -------------------  ------------  -------------------
         1          1       1931  2147483648        512        2  YES  INACTIVE     1.3789E+13  14.08.2014 18:46:19    1.3789E+13  18.08.2014 07:08:18
         2          1       1933  2147483648        512        2  YES  INACTIVE     1.3789E+13  18.08.2014 07:08:20    1.3789E+13  18.08.2014 07:08:22
         3          1       1934  2147483648        512        2  NO   CURRENT      1.3789E+13  18.08.2014 07:08:22    2.8147E+14
         4          2        128  2147483648        512        2  YES  INACTIVE      146319194  19.04.2012 23:21:10     146397537  20.04.2012 09:19:04
         5          2        127  2147483648        512        2  YES  INACTIVE      146311735  19.04.2012 22:35:29     146319194  19.04.2012 23:21:10
         6          2        126  2147483648        512        2  YES  INACTIVE      146169044  18.04.2012 23:20:54     146311735  19.04.2012 22:35:29
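If you also want to see which files sit behind each group before dropping anything, v$logfile lists the members per group; a small sketch:

SQL> select group#, member from v$logfile order by group#;

In this environment that would list the ASM members per group, like the +MYDBP1_REDO / +MYDBP1_DATA files that show up in the ORA-00312 messages below.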

 

## Now let's try to drop a group that is part of the 2nd thread

SQL> alter database drop logfile group 4;
alter database drop logfile group 4
*
ERROR at line 1:
ORA-01623: log 4 is current log for instance MYDBP12 (thread 2) - cannot drop
ORA-00312: online log 4 thread 2: '+MYDBP1_REDO/MYDBP1/onlinelog/group_4.260.772812657'
ORA-00312: online log 4 thread 2: '+MYDBP1_DATA/MYDBP1/onlinelog/group_4.267.772812685'

 

 ## Checking Threads

SQL> select THREAD#, STATUS, ENABLED from v$thread;

   THREAD# STATUS ENABLED
---------- ------ --------
         1 OPEN   PUBLIC
         2 CLOSED PUBLIC

 

## And as we saw in the first query, these groups belong to the thread we want to drop:

SQL> select group# from v$log where THREAD#=2;

    GROUP#
----------
         4
         5
         6

 

## First we need to make sure that the thread will not interfere anymore, so we disable it:

SQL> alter database disable thread 2;

Database altered.

 

## Now we can drop the logfile groups:

SQL> alter database drop logfile group 4;

Database altered.

SQL> alter database drop logfile group 5;

Database altered.

SQL> alter database drop logfile group 6;

Database altered.

## Believing is one thing, checking is better, to see if the thread is gone:

SQL> select THREAD#, STATUS, ENABLED from v$thread;

   THREAD# STATUS ENABLED
---------- ------ --------
         1 OPEN   PUBLIC

## Let's check the redo groups:

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#       BYTES  BLOCKSIZE  MEMBERS  ARC  STATUS    FIRST_CHANGE#  FIRST_TIME           NEXT_CHANGE#  NEXT_TIME
----------  ---------  ---------  ----------  ---------  -------  ---  --------  -------------  -------------------  ------------  -------------------
         1          1       1931  2147483648        512        2  YES  INACTIVE     1.3789E+13  14.08.2014 18:46:19    1.3789E+13  18.08.2014 07:08:18
         2          1       1933  2147483648        512        2  YES  INACTIVE     1.3789E+13  18.08.2014 07:08:20    1.3789E+13  18.08.2014 07:08:22
         3          1       1934  2147483648        512        2  NO   CURRENT      1.3789E+13  18.08.2014 07:08:22    2.8147E+14

 

Happy DBA again!

 

Enjoy reading and till we meet again,

 

Mathijs

 

The good, the bad and the OPatch (applying PSU April 2014 on RAC and Oracle Restart)

Introduction:

 

The last couple of weeks I have been busy patching and upgrading production, preproduction and test environments, and during those activities OPatch was my friend and tool for that. Looking back, and after talking to colleagues, I decided to create a post about it. In my patching activities I had to apply a recent PSU patch to both the Grid Infrastructure and the RDBMS, do an upgrade of the software, and add the latest PSU patch again. In the preparations for OPatch I had issues with regard to the storage present on the mount-point of the Grid Infrastructure. So as part of the activities you should take a look at your filesystem size, since the PSU patches will need at least 5 GB free space in the mount.

Preparations:

 

  • As was mentioned in the introduction, make sure you have at least 5 GB (more is better in this case) in the mount-point where the Grid Infrastructure is located. In my case I had /opt/crs/product/11202/crs as a mount with 15 GB of space. In this mount the grid software had been installed and one PSU patch had been applied in the old days (we are talking the October 2012 PSU). And while applying a required PSU (October 2013, required for the upgrade to Oracle 11.2.0.3), there was not enough space to install the software.
  • Since my current platform is Linux (this is all about patching RAC environments and Oracle Restart environments), I looked at Metalink and downloaded p6880880_112000_Linux-x86-64.zip. With every PSU patch you install, you should ask yourself: is my OPatch up to date enough, or should I download a fresh copy from Metalink? I tend to check and download a fresh copy every time I wear my T-shirt "I'm a patch DBA today and I like it".
  • In my environment my software installs look pretty much like this :
    • Grid Infra structure is installed in /opt/crs/product//crs
    • Rdbms is installed in /opt/oracle/product/11202_ee_64/db
    • oh, and perhaps a bit confusing: my ORACLE_BASE is the same as the home of the oracle user (which is /opt/oracle)

## tips

•    Make a subdirectory for each PSU patch you apply; if you unzip N PSU patches into the same directory, opatch will apply them all every time again.
•    Is auto really auto? I tend to run it with -oh, which still works fine for me.
•    Keep your OPatch tool up to date, and check your free space first (see the sketch right below).
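Given the space trouble mentioned above, a quick pre-flight check I would run on each node before unzipping anything; a minimal sketch (the paths are from my environment):

df -h /opt/crs/product/11203/crs /opt/oracle/product/11203_ee_64/db
## expecting at least 5 GB free on the Grid Infrastructure mount before applying a PSU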

## Setting up your patching :

oracle@mysrvr:/opt/oracle/stage []# dir
drwxr-xr-x  5 oracle dba     4096 Jun 23 13:29 .
drwxr-xr-x 32 oracle dba     4096 Jun 23 15:22 ..
drwxr-xr-x  2 oracle dba     4096 Jun 11 13:32 OPatchlogs
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:28 psuApr2014
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:29 psuOct2013

## inside psuOct2013

oracle@mysrvr:/opt/oracle/stage/psuOct2013 []# ls -ltr
total 288260
-rw-r--r-- 1 oracle dba        21 Apr  4  2013 README.txt
drwxr-xr-x 5 oracle dba      4096 Apr  4  2013 16459322
-rw-r--r-- 1 oracle dba       450 Oct  9  2013 bundle.xml
drwxrwxr-x 9 oracle dba      4096 Oct 10  2013 17082367
-rw-rw-r-- 1 oracle dba    141496 Jan 20 05:18 README.html
-rw-rw-r-- 1 oracle dba    136871 Jan 20 05:18 PatchSearch.xml
-rwxr-xr-x 1 oracle dba 294574955 Jun  4 07:28 p17272753_112020_Linux-x86-64.zip

## Inside psuApr2014

oracle@mysrvr:/opt/oracle/stage/psuApr2014 []# ls -ltr
total 586820
drwxr-xr-x  5 oracle dba      4096 Jan  9 16:27 17592127
drwxrwxr-x 12 oracle dba      4096 Feb  5 07:04 18031683
-rw-r--r--  1 oracle dba       450 Feb 10 10:16 bundle.xml
-rw-r--r--  1 oracle dba         0 Feb 10 10:17 README.txt
-rw-rw-r--  1 oracle dba     59977 Apr 15 12:18 README.html
-rw-rw-r--  1 oracle dba    125015 Apr 15 14:17 PatchSearch.xml
-rwxr-xr-x  1 oracle dba 600096863 May 16 15:33 p18139678_112030_Linux-x86-64.zip

 

## Applying PSU April 2014

unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your GRID_HOME and your ORACLE_HOME directory.
Set up a response file and make a note of its absolute path, because you will need it during opatch apply:
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
/opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp (that is my absolute path to the response file)
unzip p18139678_112030_Linux-x86-64.zip (this was PSU April 2014)
AS ROOT:
export PATH=/opt/crs/product/11203/crs/OPatch:$PATH
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH
which opatch (check if root can run opatch now)
PER NODE in your cluster, as ROOT:
##Crs
opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs


##Rdbms

opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

## Oracle Restart

opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

 

And as a last recommendation: check the logfiles that are produced during the OPatch run in detail, because I have seen a situation where OPatch reported "succeeded", but a detailed look in the logs showed that one of the patches had not been applied due to lack of space!
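Besides reading the logs, you can also ask the inventory of each home what actually landed there; a minimal sketch, using patch 18031683 from the staged PSU directory shown earlier:

/opt/crs/product/11203/crs/OPatch/opatch lsinventory -oh /opt/crs/product/11203/crs | grep 18031683
/opt/oracle/product/11203_ee_64/db/OPatch/opatch lsinventory -oh /opt/oracle/product/11203_ee_64/db | grep 18031683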

 

As always happy reading and have a great day,

 

Mathijs

ORA-01031: insufficient privileges with logon as sysdba

Introduction:

I guess we all sometimes have that famous face-palm moment where we do not have a direct success story to offer when a customer asks something. In my case I was asked to hand over the sys password to a third party. I did grumble about it, but hey, that is the way it is sometimes. And yes, during the time they have the sys password I am not responsible for that database, AND I will have to audit the system after I take control of it again, but that is another story.

Details:

In this case I had altered the password of sys, I tested it locally on the machine, and I had given it to the customer. Well, he came back to me telling me it did not work…

The error they got, and the one I researched:

sqlplus "sys/password@MYB.prod.nl as sysdba"

ERROR:
ORA-01031: insufficient privileges

Well, that one made me frown, because this worked:
sqlplus system@MYDB.prod.nl

Connected to:

And then do this locally without a connect string:

SQL> connect sys as sysdba 
Enter password: 
Connected.

So, with the web on my side, I was looking for clues on this one. And again I love the online community of Oracle, because soon I found out that I should check whether a password file existed and whether the password in that file was set according to the sys password.

I checked and frowned, because there was no password file present at all in $ORACLE_HOME/dbs! So I created a password file, tested it, and it worked!

Surely that was a face-palm moment to me like hey I have been there – done that 🙂  but still nice got it working asap after all .

Syntax, in $ORACLE_HOME/dbs:
orapwd FILE=orapwMYDB1 password=
After that I copied it to the second node with scp and of course renamed orapwMYDB1 to orapwMYDB2.
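For a remote "as sysdba" logon to work, the instance must also be allowed to use that password file; a quick sanity check that fits here, assuming the standard parameter setup:

SQL> show parameter remote_login_passwordfile
-- should return EXCLUSIVE (or SHARED); with NONE the password file is ignored
-- and remote sysdba connections will keep failing with ORA-01031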

See, sometimes it is not that complex a matter to keep both the customer and the DBA happy 🙂

Happy reading,

Mathijs