Move Oracle Rac Database to a new diskgroup in Asm (A real life scenario)

Introduction:

Below you will find detailled steps to move a Rac database ( Oracle 11.2.0.3) to a different Diskgroup using a total maintenance mode scenario where the Database is not available during the maintenance window for the appliciaton.

Happy reading,

Mathijs

Detailled  Scenario of a diskgroup move of a Rac Database

## create fresh pfile  to be used  as a basis for a new spfile in the new diskgroup
create pfile=’/opt/oracle/MYDB1/admin/pfile/initMYDB1.ora.20140111_1800′ from spfile;
## create new spfile in the new diskgroup
create spfile=’+MYDB_DATA01′ from pfile=’/opt/oracle/MYDB1/admin/pfile/initMYDB1.ora.20140111_1800′;

## Shutdown the database vi the cluster :
##  all actions being performed after this shutdown using sqlplus till further notice !
srvctl stop database  -d MYDB

## In a second screen make sure your environment is pointing to the ASM environment and go to Asmcmd and go to the new disk group to find new name

asmcmd -p
cd MYDB_DATA01/MYDB/PARAMETERFILE

## it shows:
ls -ltr
spfile.256.836591237

## Copy the file spfile.256.831311031 to the new disk group which will work as a more human readable file name:

cp +MYDB_DATA01/MYDB/PARAMETERFILE/spfile.256.836591237  +MYDB_DATA01/MYDB/spfileMYDB.ora

### After doing that on Linux level alter the location of the spfile in the init.ora in the $ORACLE_HOME/dbs  ON ALL NODES  ( mysrvr25r / mysrvr26r)

cd/opt/oracle/product/11203_ee_64/db/dbs

ls -ltr  initMYDB*

### current content:  initMYDB1.ora check and adapt on second node as well
spfile=’+DATA1/MYDB/spfileMYDB.ora’
## After changing the disk group my new init.ora looks like this:

SPFILE=’+MYDB_DATA01/MYDB/spfileMYDB.ora’

## Start the  database to find out that you did good thing :

SQL> startup

## Working with the control files and Perform Backup:
## shows we have three controlfiles in place
SQL> show parameter control_files

NAME                                                      TYPE VALUE
—————————- ———– ——————————
+DATA1/MYDB/control01.ctl,+DATA1/MYDB/control02.ctl,+DATA1/MYDB/control03.ctl

## Set new location of controlfile in SPFILE:
alter system set control_files=’+MYDB_DATA01/MYDB/control01.ctl’, ‘+MYDB_FRA1/MYDB/control02.ctl’, ‘+MYDB_DATA01/MYDB/control03.ctl’ scope=spfile sid=’*’;
alter system set cluster_database=false scope=spfile;
## Shutdown your database
SQL> shutdown;

## Open Asmcmd again with the environment pointing to +ASM instance:
##  Copy the current control file from +DATA01 to the correct Diskgroups and sync them by this copy
ASMCMD
cp +DATA1/MYDB/control01.ctl  +MYDB_DATA01/MYDB/control01.ctl
cp +DATA1/MYDB/control01.ctl  +MYDB_FRA1/MYDB/control02.ctl
cp +DATA1/MYDB/control01.ctl  +MYDB_DATA01/MYDB/control03.ctl
## check it:
ls -l +MYDB_DATA01/MYDB/control01.ctl
ls -l +MYDB_FRA1/MYDB/control02.ctl
ls -l +MYDB_DATA01/MYDB/control03.ctl

##  Start your database with startup nomount

SQL> startup nomount;

## Start an rman session : Open “rman target /” and restore from old control and mount + open database:
## Not 100 % sure if this step was needed since we copied file in asmcmd already but  it wont hurt and takes little time
rman target /
restore controlfile to ‘+MYDB_DATA01/MYDB/control01.ctl’ from ‘+DATA1/MYDB/control01.ctl’;
restore controlfile to ‘+MYDB_FRA1/MYDB/control02.ctl’   from ‘+DATA1/MYDB/control01.ctl’;
restore controlfile to ‘+MYDB_DATA01/MYDB/control03.ctl’ from ‘+DATA1/MYDB/control01.ctl’;

##This will show
rman target /

Recovery Manager: Release 11.2.0.3.0 – Production on Sat Jan 11 18:53:29 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: MYDB (not mounted)

RMAN> restore controlfile to ‘+MYDB_DATA01/MYDB/control01.ctl’ from ‘+DATA1/MYDB/control01.ctl’;

Starting restore at 11.01.2014 18:54:00
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1010 device type=DISK

channel ORA_DISK_1: copied control file copy
Finished restore at 11.01.2014 18:54:02

RMAN> restore controlfile to ‘+MYDB_FRA1/MYDB/control02.ctl’   from ‘+DATA1/MYDB/control01.ctl’;

Starting restore at 11.01.2014 18:54:33
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
Finished restore at 11.01.2014 18:54:34

RMAN> restore controlfile to ‘+MYDB_DATA01/MYDB/control03.ctl’ from ‘+DATA1/MYDB/control01.ctl’;

Starting restore at 11.01.2014 18:54:51
using channel ORA_DISK_1

channel ORA_DISK_1: copied control file copy
Finished restore at 11.01.2014 18:54:52

## Mount the database via Rman

RMAN> sql ‘alter database mount’;

##This will show

List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Sat Jan 11 18:39:32 2014
LCK0 started with pid=31, OS id=4011
Starting background process RSMN
Sat Jan 11 18:39:33 2014
RSMN started with pid=32, OS id=4015
ORACLE_BASE from environment = /opt/oracle
Sat Jan 11 18:39:33 2014
ALTER DATABASE   MOUNT
This instance was first to mount
NOTE: Loaded library: System
SUCCESS: diskgroup DATA1 was mounted
NOTE: dependency between database MYDB and diskgroup resource ora.DATA1.dg is established
Successful mount of redo thread 1, with mount id 2306013605
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT

## Now it is time to make a backup of the database into the new Disk group (+MYDB_DATA01). If you are in a rac environment make sure all other instances are down.

## Issue the following command in rman because this will create a one to one copy of the database in the new Disk group:

RMAN>backup as copy database format ‘+MYDB_DATA01’;

##This will show:

RMAN> backup as copy database format ‘+MYDB_DATA01’;

Starting backup at 11.01.2014 18:58:11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1766 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00006 name=+DATA1/MYDB/datafile/gemprod.275.786977627
output file name=+MYDB_DATA01/MYDB/datafile/gemprod.260.836593093 tag=TAG20140111T185812 RECID=15 STAMP=836593134
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:45
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=+DATA1/MYDB/datafile/sysaux.262.786977877
output file name=+MYDB_DATA01/MYDB/datafile/sysaux.261.836593139 tag=TAG20140111T185812 RECID=16 STAMP=836593155
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:25
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=+DATA1/MYDB/datafile/system.269.786977709
output file name=+MYDB_DATA01/MYDB/datafile/system.262.836593163 tag=TAG20140111T185812 RECID=17 STAMP=836593175
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=+DATA1/MYDB/datafile/undotbs1.266.786977783
output file name=+MYDB_DATA01/MYDB/datafile/undotbs1.263.836593179 tag=TAG20140111T185812 RECID=18 STAMP=836593187
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00007 name=+DATA1/MYDB/datafile/undotbs2.267.786977883
output file name=+MYDB_DATA01/MYDB/datafile/undotbs2.264.836593195 tag=TAG20140111T185812 RECID=19 STAMP=836593202
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+DATA1/MYDB/datafile/users.259.786977783
output file name=+MYDB_DATA01/MYDB/datafile/users.265.836593209 tag=TAG20140111T185812 RECID=20 STAMP=836593211
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile copy
input datafile file number=00005 name=+DATA1/MYDB/datafile/tools.263.786977709
output file name=+MYDB_DATA01/MYDB/datafile/tools.266.836593213 tag=TAG20140111T185812 RECID=21 STAMP=836593215
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile copy
copying current control file
output file name=+MYDB_DATA01/MYDB/controlfile/backup.267.836593215 tag=TAG20140111T185812 RECID=22 STAMP=836593217
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 11.01.2014 19:00:18
channel ORA_DISK_1: finished piece 1 at 11.01.2014 19:00:19
piece handle=+MYDB_DATA01/MYDB/backupset/2014_01_11/nnsnf0_tag20140111t185812_0.268.836593219 tag=TAG20140111T185812 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 11.01.2014 19:00:19

## Once that has finished issue the following command in Rman (note this switch to command performs the  … set newname for you  which is great:

RMAN>switch database to copy;

##This will show you:

RMAN> switch database to copy;

datafile 1 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/system.262.836593163”
datafile 2 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/sysaux.261.836593139”
datafile 3 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/undotbs1.263.836593179”
datafile 4 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/users.265.836593209”
datafile 5 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/tools.266.836593213”
datafile 6 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/gemprod.260.836593093”
datafile 7 switched to datafile copy “+MYDB_DATA01/MYDB/datafile/undotbs2.264.836593195”

##When that is finished issue following command in rman :
RMAN>sql ‘alter database open’;

## Start / open  Second instance as well via Sqlplus  as a check on the second box.
alter system set cluster_database=true scope=spfile;
startup

##Your alert file has been updated with following information:
Sat Jan 11 19:02:35 2014
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/system.269.786977709 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 1 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/sysaux.262.786977877 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 2 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/undotbs1.266.786977783 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 3 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/users.259.786977783 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 4 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/tools.263.786977709 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 5 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/gemprod.275.786977627 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 6 complete to datafile copy
checkpoint is 13560558508638
WARNING: cataloging database area datafile
+DATA1/MYDB/datafile/undotbs2.267.786977883 as recovery area datafilecopy.
This datafilecopy is accounted into used space. Consider incrementing
db_recovery_file_dest_size parameter value by size of datafile.
Switch of datafile 7 complete to datafile copy
checkpoint is 13560558508638
Sat Jan 11 19:04:21 2014
alter database open

## Open a Sqlplus session and check for the %create% parameter:

show parameter create
SQL> show parameter db_create

NAME                                 TYPE        VALUE
———————————— ———– ——————————
db_create_file_dest                  string
db_create_online_log_dest_1          string
db_create_online_log_dest_2          string
db_create_online_log_dest_3          string
db_create_online_log_dest_4          string
db_create_online_log_dest_5          string

alter system set db_create_file_dest=’+MYDB_DATA01′ sid=’*’;
alter system set db_create_online_log_dest_1=’+MYDB_FRA1′ sid=’*’;

## check again
show parameter db_create
show parameter cluster

##Now it is time to work with the temp files. You will have to create a New temp tablespace in the new disk group, make that the default one and drop the old one:
set lines 200
col tablespace_name format a40
col file_name format a80
select tablespace_name,FILE_NAME, bytes/1024/1024 MB from dba_temp_files;

## This shows: FILE_NAME
TABLESPACE_NAME                          FILE_NAME                                                                                MB
—————————————- ——————————————————————————– ———-
TEMP                                     +DATA1/MYDB/tempfile/temp.268.774880131                                             8192

##This will create a new temp tablespace , in the new disk group , make it the default tablespace and drop old

create temporary tablespace TEMP02 tempfile size 1024m;
alter database default temporary tablespace TEMP02;
drop tablespace TEMP ;
create temporary tablespace TEMP tempfile size 8192m;
alter database default temporary tablespace TEMP;
drop tablespace TEMP02;

##Check it again:
set lines 200
col tablespace_name format a40
col file_name format a80
select tablespace_name,FILE_NAME, bytes/1024/1024 MB from dba_temp_files;

## Working with the redo logs:
## In this step first we have to add new members to each group (to each thread (in a rac)). After that and after switching the log files you can delete the members in the old disk group
## First check the environment:

Set lines 2000
select * from v$log;
SQL>
GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIME          NEXT_CHANGE# NEXT_TIME
———- ———- ———- ———- ———- ———- — —————- ————- ——————- ———— ——————-
1          1      28288  209715200        512          2 YES INACTIVE            1.3559E+13 08.01.2014 14:54:59   1.3559E+13 08.01.2014 15:24:59
2          1      28289  209715200        512          2 YES INACTIVE            1.3559E+13 08.01.2014 15:24:59   1.3559E+13 08.01.2014 15:54:59
3          1      28290  209715200        512          2 NO  CURRENT             1.3559E+13 08.01.2014 15:54:59   2.8147E+14
5          2      28097  209715200        512          2 YES INACTIVE            1.3559E+13 08.01.2014 15:24:53   1.3559E+13 08.01.2014 15:54:52
6          2      28098  209715200        512          2 NO  CURRENT             1.3559E+13 08.01.2014 15:54:52   2.8147E+14
7          2      28095  209715200        512          2 YES INACTIVE            1.3559E+13 08.01.2014 14:24:51   1.3559E+13 08.01.2014 14:54:53
8          2      28096  209715200        512          2 YES INACTIVE            1.3559E+13 08.01.2014 14:54:53   1.3559E+13 08.01.2014 15:24:53

## and
col member format a80
set pagesize 33
select GROUP#,MEMBER from v$logfile order by 1;
GROUP# MEMBER
———- ——————————————————————————–
1 +DATA1/MYDB/onlinelog/group_1.260.774880079
1 +MYDB_FRA1/MYDB/onlinelog/group_1.258.774880081
2 +DATA1/MYDB/onlinelog/group_2.264.774880083
2 +MYDB_FRA1/MYDB/onlinelog/group_2.257.774880085
3 +DATA1/MYDB/onlinelog/group_3.261.774880087
3 +MYDB_FRA1/MYDB/onlinelog/group_3.256.774880089
5 +DATA1/MYDB/onlinelog/group_5.271.774880749
5 +MYDB_FRA1/MYDB/onlinelog/group_5.259.774880751
6 +DATA1/MYDB/onlinelog/group_6.272.774880753
6 +MYDB_FRA1/MYDB/onlinelog/group_6.260.774880755
7 +DATA1/MYDB/onlinelog/group_7.273.774880757
7 +MYDB_FRA1/MYDB/onlinelog/group_7.261.774880759
8 +DATA1/MYDB/onlinelog/group_8.274.774880763
8 +MYDB_FRA1/MYDB/onlinelog/group_8.262.774880765

##First we add new members to the correct , new disk group:
alter database add logfile member ‘+MYDB_DATA01’ to group 1;
alter database add logfile member ‘+MYDB_DATA01’ to group 2;
alter database add logfile member ‘+MYDB_DATA01’ to group 3;
alter database add logfile member ‘+MYDB_DATA01’ to group 5;
alter database add logfile member ‘+MYDB_DATA01’ to group 6;
alter database add logfile member ‘+MYDB_DATA01’ to group 7;
alter database add logfile member ‘+MYDB_DATA01’ to group 8;

###Check again:
select GROUP#,MEMBER from v$logfile order by 1;

## Perform some switches to make sure the new members have been in use  ( archive log current performs logswitch in  the whole  Database )

alter system archive log current;

alter system archive log current;

alter system archive log current;

alter system archive log current;

alter system archive log current;

alter system archive log current;

alter system archive log current;

alter system archive log current;

## First check the environment again:
Set lines 2000
select * from v$log;

##It is time to drop the members from the old ( data1 )
select GROUP#,MEMBER from v$logfile order by 1;

GROUP# MEMBER
———- ——————————————————————————–
1 +DATA1/MYDB/onlinelog/group_1.260.774880079
1 +MYDB_FRA1/MYDB/onlinelog/group_1.258.774880081
2 +DATA1/MYDB/onlinelog/group_2.264.774880083
2 +MYDB_FRA1/MYDB/onlinelog/group_2.257.774880085
3 +DATA1/MYDB/onlinelog/group_3.261.774880087
3 +MYDB_FRA1/MYDB/onlinelog/group_3.256.774880089
5 +DATA1/MYDB/onlinelog/group_5.271.774880749
5 +MYDB_FRA1/MYDB/onlinelog/group_5.259.774880751
6 +DATA1/MYDB/onlinelog/group_6.272.774880753
6 +MYDB_FRA1/MYDB/onlinelog/group_6.260.774880755
7 +DATA1/MYDB/onlinelog/group_7.273.774880757
7 +MYDB_FRA1/MYDB/onlinelog/group_7.261.774880759
8 +DATA1/MYDB/onlinelog/group_8.274.774880763
8 +MYDB_FRA1/MYDB/onlinelog/group_8.262.774880765

##So we have to drop the redo members that point to the old ( DATA01 ) Disk group ( but the group can not be current !!):

GROUP# MEMBER
———- ——————————————————————————–
1 +DATA1/MYDB/onlinelog/group_1.260.774880079
2 +DATA1/MYDB/onlinelog/group_2.264.774880083
3 +DATA1/MYDB/onlinelog/group_3.261.774880087
5 +DATA1/MYDB/onlinelog/group_5.271.774880749
6 +DATA1/MYDB/onlinelog/group_6.272.774880753
7 +DATA1/MYDB/onlinelog/group_7.273.774880757
8 +DATA1/MYDB/onlinelog/group_8.274.774880763

alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_1.260.774880079’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_2.264.774880083’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_3.261.774880087’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_5.271.774880749’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_6.272.774880753’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_7.273.774880757’;
alter database drop logfile member ‘+DATA1/MYDB/onlinelog/group_8.274.774880763’;

##Checked again
select GROUP#,MEMBER from v$logfile order by 1;

##Working in the clusterware:

After these activities I tried Stopping and starting via srvctl ( this is an oracle restart environment crashed ) ..   Alertlog was having error messages and the start failed …   I did notice that the environment was using the old SPFILE in +Data01 again … I checked the spfile which was wrong again..

cd  $ORACLE_HOME/db/dbs

cat initMYDB1.ora
spfile=’+DATA1/MYDB/spfileMYDB.ora’

## so the clusteragent had altered my changes
##altered init.ora again and started that worked ..

##In sqlplus:

select name from v$controlfile
union
select name from v$datafile
union
select name from v$tempfile
union
select member from v$logfile
union
select filename from v$block_change_tracking
union
select name from v$flashback_database_logfile;

## This shows:

NAME
——————————————————————————–
+DATA1/MYDB/changetracking/ctf.265.774880707
+DATA1/MYDB/control01.ctl
+DATA1/MYDB/control02.ctl
+DATA1/MYDB/control03.ctl
+DATA1/MYDB/datafile/gemprod.275.786977627
+DATA1/MYDB/datafile/sysaux.262.786977877
+DATA1/MYDB/datafile/system.269.786977709
+DATA1/MYDB/datafile/tools.263.786977709
+DATA1/MYDB/datafile/undotbs1.266.786977783
+DATA1/MYDB/datafile/undotbs2.267.786977883
+DATA1/MYDB/datafile/users.259.786977783
+DATA1/MYDB/onlinelog/group_1.260.774880079
+DATA1/MYDB/onlinelog/group_2.264.774880083
+DATA1/MYDB/onlinelog/group_3.261.774880087
+DATA1/MYDB/onlinelog/group_5.271.774880749
+DATA1/MYDB/onlinelog/group_6.272.774880753
+DATA1/MYDB/onlinelog/group_7.273.774880757
+DATA1/MYDB/onlinelog/group_8.274.774880763
+DATA1/MYDB/tempfile/temp.268.774880131
+MYDB_FRA1/MYDB/onlinelog/group_1.258.774880081
+MYDB_FRA1/MYDB/onlinelog/group_2.257.774880085
+MYDB_FRA1/MYDB/onlinelog/group_3.256.774880089
+MYDB_FRA1/MYDB/onlinelog/group_5.259.774880751
+MYDB_FRA1/MYDB/onlinelog/group_6.260.774880755
+MYDB_FRA1/MYDB/onlinelog/group_7.261.774880759
+MYDB_FRA1/MYDB/onlinelog/group_8.262.774880765

##  means another action point as with regard to the block change tracking
SELECT filename, status, bytes
FROM v$block_change_tracking;
2
FILENAME                                                                         STATUS          BYTES
——————————————————————————– ———- ———-
+DATA1/MYDB/changetracking/ctf.265.774880707                                  ENABLED      11599872

## Disable it and enable it on the new location ( the new disk group )
ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE ‘+MYDB_DATA01’;

## In an Oracle restart or Rac Environment you need to check the Clusterware setup now since it has knowledge about spfile, disk groups being used etc.
## First check the configuration in the Clusterware for the database:
srvctl config database -d MYDB
Database unique name: MYDB
Database name:
Oracle home: /opt/oracle/product/11203_ee_64/db
Oracle user: oracle
Spfile: +DATA1/MYDB/spfileMYDB.ora
Domain: prod.vis
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: MYDB
Database instances: MYDB1,MYDB2
Disk Groups: DATA1,MYDB_FRA1
Mount point paths:
Services: MYDB_TAF.prod.vis
Type: RAC
Database is administrator managed

##So we have to perform two action points:
·        Make the spfile point to the correct disk group ( our new +MYDB_DATA01)
·        Disk groups attribute still knows about that data1 disk group ( and it should not)

##First modification will be to inform the Clusterware which spfile to use:
srvctl modify database -d MYDB -p ‘+MYDB_DATA01/MYDB/spfileMYDB.ora’

##After that similar action for the disk groups:
srvctl modify database -d  MYDB -a ‘MYDB_DATA01,MYDB_FRA1’

## don’t believe it check it
srvctl config database -d MYDB

##This looks better so now let’s do a stop & start with srvctl
srvctl stop database -d MYDB
srvctl start database -d MYDB

##That worked !  happy dba

ORA-01110: data file 2504 Errors occurred during index rebuild

This morning i came across this :

ORA-01110: data file 2504 Errors occurred during index rebuild

I examined and I followed steps below after that index rebuild was success again.

## first checked for the properties since this is a 10g environment:
COLUMN property_name FORMAT A30
COLUMN property_value FORMAT A30
COLUMN description FORMAT A50
SET LINESIZE 200

SELECT *
FROM   database_properties
WHERE  property_name like ‘%TABLESPACE’;SQL> SQL> SQL> SQL> SQL>   2    3  

PROPERTY_NAME                  PROPERTY_VALUE                 DESCRIPTION
—————————— —————————— ————————————————–
DEFAULT_TEMP_TABLESPACE        TEMP                           Name of default temporary tablespace
DEFAULT_PERMANENT_TABLESPACE   SYSTEM                         Name of default permanent tablespace

##Created a new temporary tablespace:
CREATE TEMPORARY TABLESPACE TEMP2 TEMPFILE  ‘/db/MYDB/temp/temp_99.dbf’ size 1024M;

##Made this the new default temporary tablespace
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp2;

## checked if temp tablespace was in use (would have to kill those sessions in case of)
SELECT USERNAME, SESSION_NUM, SESSION_ADDR FROM V$SORT_USAGE;

##Dropped the old tablespace.
DROP TABLESPACE temp INCLUDING CONTENTS AND DATAFILES;

## Removed the file on Unix. i do know i might have settled for a reuse, given the fact of possible corrupt thought this was better.
rm temp_01.dbf

## Created  new temp tablespace.
CREATE TEMPORARY TABLESPACE TEMP TEMPFILE ‘/db/MYDB/temp/temp_01.dbf’ size 2000M;

## Removed all temp files on the os
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_02.dbf
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_03.dbf
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_04.dbf
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_05.dbf
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_06.dbf
oracle@mysrvr1:/db/MYDB/temp [OPTMYDB]# rm  temp_07.dbf

##  temp files added
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_02.dbf’ size 2000m;
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_03.dbf’ size 2000m;
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_04.dbf’ size 2000m;
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_05.dbf’ size 2000m;
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_06.dbf’ size 2000m;
ALTER TABLESPACE temp ADD TEMPFILE ‘/db/MYDB/temp/temp_07.dbf’ size 2000m;

## Default tablespace was put back to temp again
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp;

##  Temp2 tablespace dropped
DROP TABLESPACE temp2 INCLUDING CONTENTS AND DATAFILES;

## This tablepace also a temp one had issues in same way:
DROP TABLESPACE tools_temp  INCLUDING CONTENTS AND DATAFILES;

rm  tools_temp_01.dbf

## tools_temp hatte auch macke deshalb die auch raus und rein.
CREATE TEMPORARY TABLESPACE TOOLS_TEMP  TEMPFILE ‘/db/MYDB/temp/tools_temp_01.dbf’ size 1024M;

## All back to normal again .

Happy reading,

Mathijs

TNS-12514 in an Oracle Rac Environment

Introduction:

On a frequent bases team members do hotline service where they act as a first line of contact when there are issues with the Oracle Environment on the various boxes. In the last week of 2013 Monday 30st of December  it was my turn to do that duty when one of the interesting  issues showed that  even caused a Priority One incident on the server srvr_33r which is part of a  4 node Rac cluster with the  database MYDB. So once again time to gear up and go investigate what was going on. Users received the following Error Message:  TNS-12514: TNS:listener does not know of service requested in connect descriptor

Summary:

In the end it showed that the Oracle Instance on the 33r Box was suffering from a communication issue with its ASM instance since this ASM instance had issues with its shared_pool ( in the alert file of the Oracle Instance at the start of the incident i found : ORA-04031: unable to allocate 760 bytes of shared memory (“shared pool”,”unknown object”,”KKSSP^928″,”kglss”).

Due to that  shared_pool issue the Instance was unable to communicate with its ASM Instance when it needed the information about  its Local listener in the spfile. That made that information unavailable to the Oracle instance so the Cluster agent performed this :  ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.20.30.72)(PORT=1521))))’ SCOPE=MEMORY SID=’MYDB1′;

So It altered my LOCAL_LISTENER in memory to the default listener on the cluster, listening on 1521 with NO services defined for this specific database !!

Even worse was that the environment was hit by an Oracle Bug ( which was pointed out by a team member to me : Bug 11772838 – oraagent modifies LOCAL_LISTENER with an incorrect value, when the parameter was set in spfile (Doc ID 11772838.8)

I have implemented the workaround for the listener and will have to schedule a maintenance window on the cluster to fix the Memory issues in the ASM instance(s).

Details:

Environment setting:  4 node 11.2.0.3 Grid Infrastructure  with a mix of  Databases (and versions) present. Every database has a dedicated listener on a specific port. And services for that database register with that dedicated listener. Every database has its spfile in an ASM Diskgroup.

When i started performing checks i first looked at the listener on the first Node ( Instance1 is on srvr_33r)

# lsnrctl status listener_MYDB
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 30-DEC-2013 10:02:51
 Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srvr_33r-vip)(PORT=20008)(IP=FIRST)))
 STATUS of the LISTENER
 ------------------------
 Alias LISTENER_MYDB
 Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
 Start Date 23-OCT-2013 01:21:27
 Uptime 68 days 9 hr. 41 min. 24 sec
 Trace Level off
 Security ON: Local OS Authentication
 SNMP OFF
 Listener Parameter File /opt/oracle/product/11203_ee_64/db/network/admin/listener.ora
 Listener Log File /opt/oracle/MYDB/admin/network/log/listener_MYDB.log
 Listening Endpoints Summary...
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.30.72)(PORT=20008)))
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.30.71)(PORT=20008)))
 Services Summary...
 Service "MYDB" has 1 instance(s).
 Instance "MYDB", status UNKNOWN, has 1 handler(s) for this service...
 The command completed successfully

 

And i compared that to the information in the listener on the other node: 

(Instance 2 is on srvr_34r)
oracle@srvr_34r:/opt/oracle [MYDB2]# lsnrctl status listener_MYDB
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 30-DEC-2013 10:00:27
 Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=srvr_34r-vip)(PORT=20008)(IP=FIRST)))
 STATUS of the LISTENER
 ------------------------
 Alias LISTENER_MYDB
 Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
 Start Date 23-OCT-2013 03:10:15
 Uptime 68 days 7 hr. 50 min. 12 sec
 Trace Level off
 Security ON: Local OS Authentication
 SNMP OFF
 Listener Parameter File /opt/oracle/product/11203_ee_64/db/network/admin/listener.ora
 Listener Log File /opt/oracle/MYDB/admin/network/log/listener_MYDB.log
 Listening Endpoints Summary...
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.30.75)(PORT=20008)))
 (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.20.30.73)(PORT=20008)))
 Services Summary...
 Service "MYDB" has 1 instance(s).
 Instance "MYDB", status UNKNOWN, has 1 handler(s) for this service...
 Service "MYDB.prod.nl" has 1 instance(s).
 Instance "MYDB2", status READY, has 1 handler(s) for this service...
 Service "MYDB_TAF.prod.nl" has 1 instance(s).
 Instance "MYDB2", status READY, has 1 handler(s) for this service...
 The command completed successfully

 

Conclusion: So in short the listener on srvr_33r for this specific database was unaware of its services even though  the services where / are in place.

HOWEVER  when I checked for the local listener on the srvr_33r server   in an sqlplus session  this is what I saw:

SQL> show parameter listener
NAME TYPE VALUE
 ------------------------------------ ----------- ------------------------------
 listener_networks string
 local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.20.30.72)(PORT=1521))))
 remote_listener string srvr_3rcl-scan:33100

Which proves that the Spfile was at this morning time during incident ( December 30 as of 09:00)  no longer aware of its LOCAL listener !!! In the end we fixed the issue by adding the local listener again and after that issues with connectivity where resolved.

Note when I check the listener Logs I do see that listener was unaware of services at the start of  the incident .

/opt/oracle/MYDB/admin/network/log/listener_MYDB.log
30-DEC-2013 00:14:40 * (CONNECT_DATA=(SERVICE_NAME=MYDB.prod.nl)(SERVER=DEDICATED)(CID=(PROGRAM=c:\oraclexe\app\oracle\product\11.2.0\server\bin\ORACLE.EXE)(HOST=X1234567)(USER=SYSTEM))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.60.205.57)(PORT=58445)) * establish * MYDB.prod.nl * 12514
TNS-12514: TNS:listener does not currently know of service requested in connect descriptor
30-DEC-2013 00:14:40 * (CONNECT_DATA=(SERVICE_NAME=MYDB.prod.nl)(SERVER=DEDICATED)(CID=(PROGRAM=c:\oraclexe\app\oracle\product\11.2.0\server\bin\ORACLE.EXE)(HOST=X1234567)(USER=SYSTEM))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.60.205.57)(PORT=58446)) * establish * MYDB.prod.nl * 12514
TNS-12514: TNS:listener does not currently know of service requested in connect descriptor

In the alert logs of the Instance I found this information and I think that this has caused our PRiO 1 incident due to the instance could not talk to the asm instance:

view /opt/oracle/MYDB/admin/Arch/alert_MYDB1.log.20131230_000158
See Note 411.1 at My Oracle Support for error and packaging details.
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
Dumping diagnostic data in directory=[cdmp_20131229112802], requested by (instance=1, osid=29870 (M000)), summary=[incident=48997].
Sun Dec 29 11:28:07 2013
Errors in file /opt/oracle/MYDB/diag/rdbms/MYDB/MYDB1/trace/MYDB1_m000_29870.trc  (incident=48998):
ORA-04031: unable to allocate 760 bytes of shared memory ("shared pool","unknown object","KKSSP^928","kglss")
Incident details in: /opt/oracle/MYDB/diag/rdbms/MYDB/MYDB1/incident/incdir_48998/MYDB1_m000_29870_i48998.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
Sun Dec 29 11:28:08 2013
Dumping diagnostic data in directory=[cdmp_20131229112808], requested by (instance=1, osid=29870 (M000)), summary=[incident=48998].
Sun Dec 29 11:28:33 2013
ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=10.20.30.72)(PORT=1521))))' SCOPE=MEMORY SID='MYDB1';

Addendum:

After Altering the local listener to the required and correct listener i closed the ticket. Much to my surprise after couple of hours to be kicked by the same issue on the same database again. A  Colleague pointed me to Mos to  Bug 11772838 – oraagent modifies LOCAL_LISTENER with an incorrect value, when the parameter was set in spfile (Doc ID 11772838.8) .  So that proved i would need to make another step to have a proper workaround in place before i could start plan a maintenance window that would help to fix the cause of all this ( which was the Oracle Instance unable to communicate to its ASM instance when it needed  information about  the local listener in its spfile).

As Always Happy reading,

Mathijs