The Patch and the Furious (Patching Grid Infra 11.2.0.3 and Rdbms 11.2.0.3 with PSU April 2014)

Introduction:

In June 2014 I upgraded quite a few environments in our billing landscape that were still on 11.2.0.2 for both Grid Infrastructure and Rdbms on RedHat Linux. My first scenario, after reading up and setting up a first environment, was to install the mandatory one-off patch 12539000 (well, mandatory for that patch level) and then upgrade to 11.2.0.3 with PSU April 2014. But of course a Dba should be flexible in his approach, so as a form of adapting while doing I decided it would be best to first apply PSU Oct 2012 in full and then migrate to 11.2.0.3 on both my Grid Infra and Rdbms environments. Since I had agreed with the customer that the latest PSU would be applied, I aimed for implementing April 2014.

On request I received the following documents from MOS as must-reads before going into the upgrade:

———————————————————————————————————————————
Things to Consider Before Upgrading to 11.2.0.3/11.2.0.4 Grid Infrastructure/ASM (Doc ID 1363369.1)
ORAchk Upgrade Readiness Assessment (Doc ID 1457357.1)
Before upgrade database check this note: Complete Checklist for Manual Upgrades to 11gR2 (Doc ID 837570.1)
Upgrade / Downgrade Assistant: Oracle Database/Client (Doc ID 1561791.2)
———————————————————————————————————————————-

Detailed Plan and comments:

Installation and patching the software
Current setup:
"Grid Infrastructure Patch Set Update 11.2.0.2.5 (includes DB PSU 11.2.0.2.5)"
"Database Patch Set Update: 11.2.0.2.5 (13343424)"
Before beginning, use the Oracle-provided "orachk" utility to check your environment for upgrade readiness. Reference: ORAchk 11.2.0.3 Upgrade Readiness Assessment (Doc ID 1457357.1) and Things to Consider Before Upgrading to 11.2.0.3/11.2.0.4 Grid Infrastructure/ASM (Doc ID 1363369.1).
Command: orachk -u -o pre
Of course cluvfy is your friend, so use it as part of the preparation (see also Things to Consider Before Upgrading to 11.2.0.3/11.2.0.4 Grid Infrastructure/ASM, Doc ID 1363369.1):
runcluvfy.sh stage -pre crsinst -upgrade -n mysrvr01hr,mysrvr02hr -rolling -src_crshome /opt/crs/product/112_ee_64/crs -dest_crshome /opt/crs/product/11203_ee_64/crs -dest_version 11.2.0.3.0 -fixup -fixupdir /tmp -verbose > /tmp/11203Upgrade.lst
New Grid Infra Oracle Home: /opt/crs/product/11203_ee_64/crs
The orachk script pointed out that I would need this one-off patch before the migration: PATCH 12539000.
The patch must be applied to GI home and is recommended for 11.2.0.2 database homes. It’s available on top of all DB PSUs which are part of corresponding GI PSU.
It is recommended to apply it with "opatch auto" using the latest 11.2 OPatch (patch 6880880), as the ROOT user:
export PATH=/opt/crs/product/112_ee_64/crs/OPatch/:$PATH
which opatch
$GRID_HOME/OPatch/opatch auto /opt/oracle/stage -oh /opt/crs/product/112_ee_64/crs
$GRID_HOME/OPatch/opatch auto /opt/oracle/stage -oh /opt/oracle/product/112_ee_64/db

When opatch asks the following question, enter ‘yes’ without quote:
Enter ‘yes’ if you have unzipped this patch to an empty directory to proceed (yes/no):yes

If opatch auto fails, apply it manually:
1. Stop all databases manually as database users: /opt/oracle/product/112_ee_64/db/bin/srvctl stop database -d <dbname>
2. Unlock the GI home as root user:
For GI Cluster environment:
# $GRID_HOME/crs/install/rootcrs.pl -unlock

For GI Standalone (Oracle Restart):
# $GRID_HOME/crs/install/roothas.pl -unlock

3. Apply the patch to GI and database homes:

As oracle user: /opt/crs/product/112_ee_64/crs/OPatch/opatch napply -oh /opt/crs/product/112_ee_64/crs -local /opt/oracle/stage/12539000

As database user:   /opt/crs/product/112_ee_64/crs/OPatch/opatch napply -oh /opt/oracle/product/112_ee_64/db   -local /opt/oracle/stage/12539000

4. Lock GI home as root user:

For GI Cluster environment:

# $GRID_HOME/rdbms/install/rootadd_rdbms.sh
# $GRID_HOME/crs/install/rootcrs.pl -patch

For GI Standalone (Oracle Restart):

# $GRID_HOME/rdbms/install/rootadd_rdbms.sh
# $GRID_HOME/crs/install/roothas.pl -patch

5. Start all databases manually as database users: <DB_HOME>/bin/srvctl start database -d <dbname>

Since my ORACLE_BASE (/opt/oracle) is the same as the home directory of the Oracle user (/opt/oracle), I had the issue that after the patching the owner of /opt/oracle had become root!!:
drwxr-x--- 31 root dba   4096 May 23 09:21 oracle
This meant trouble if I ended my session and opened a new ssh session from my admin box (I would be unable to log in), so I added a check before signing off, because this directory has to be owned by oracle:dba.
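A small check along those lines (a minimal sketch, assuming /opt/oracle is the Oracle user's home and dba is its primary group) could look like this:

## as root, before ending the patching session
ls -ld /opt/oracle
## if the owner has flipped to root, put it back
chown oracle:dba /opt/oracle
ls -ld /opt/oracle    # should show oracle:dba again before you sign off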
Patching the Grid Infrastructure:
As they say, better safe than sorry, so make a copy of the CRS software before you start.
## as root:
tar -c --recursion -pP --file=/opt/oracle/export/CRSbackup.tar /opt/crs/product/
If you have had an existing installation on your system, and you are using the same user account for this installation, then unset the following environment variables: ORA_CRS_HOME, ORACLE_HOME, ORA_NLS10, TNS_ADMIN, and any other environment variable set for the Oracle installation user that is connected with Oracle software homes.
unset ORA_CRS_HOME
unset ORACLE_HOME
unset ORA_NLS10
unset TNS_ADMIN
In RAC environments, upgrade the Grid Infrastructure node by node.
Checks you can perform before and after upgrade:
crsctl query has softwareversion
crsctl query crs softwareversion
crsctl query has releaseversion
crsctl query crs releaseversion
/opt/oracle/admin/tools/cSpfile.ksh This shell script creates a copy of the spfile as an init.ora.
After that I created migration pfiles with larger settings for the shared pool. Recommended minimum value: shared_pool_size > 600M (preferably 1024M).
/opt/oracle/admin/tools/cSrvctlActNew.ksh status / stop / config This is a shell script to check the current status in the clusterware.
/opt/oracle/admin/tools/cSrvctlAct.ksh config Check the current setup in the clusterware
Dry run: start ./runInstaller to check the prerequisites before the install and correct issues if needed. Of course cluvfy is part of the runInstaller activities. Check for warnings and errors and correct them.
In the ./runInstaller GUI use the option: upgrade an existing Oracle Clusterware and Oracle ASM installation. Note: 11.2.0.3 is an OUT of place installation, which means you will have to install it in a separate Oracle Home.
Install the 11.2.0.3 Rdbms software in /opt/oracle/product/11203_ee_64/db: /opt/oracle/stage/database/runInstaller
unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your new GRID_HOME directory. Copy and install the latest OPatch into both the GI and Rdbms homes, since after this installation we will have to apply the PSU patch.
Create the response file needed during opatch in the OPatch/ocm/bin subdirectory of each home:
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
/opt/crs/product/11203_ee_64/crs/OPatch/ocm/bin/ocm.rsp
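For reference, such a response file is typically generated with the emocmrsp tool that ships in the OPatch/ocm/bin directory; a sketch, run once per home with the paths used above:

cd /opt/crs/product/11203_ee_64/crs/OPatch/ocm/bin
./emocmrsp -no_banner -output /opt/crs/product/11203_ee_64/crs/OPatch/ocm/bin/ocm.rsp
cd /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin
./emocmrsp -no_banner -output /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp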
export PATH=/opt/crs/product/11203_ee_64/crs/OPatch:$PATH Opatch runs as root , Set Path
which opatch Check version of Opatch
opatch auto /opt/oracle/stage -ocmrf /opt/crs/product/11203_ee_64/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203_ee_64/crs Run Opatch
Note: on the dry run the system would not come down since instances were still open. I used my tools to stop them, after which the patching continued.
OPatch had an ISSUE with the fact that the rdbms 11.2.0.3 home had no resources yet (it was software only, so the patching crashed). The recommendation is to run this part of the PSU after the migration. DO NOT run it during the GI upgrade!!!!
You may check the results with opatch lsinventory while pointing to the GRID_HOME, or with the "orachk" utility. Reference: ORAchk 11.2.0.3 Upgrade Readiness Assessment (Doc ID 1457357.1).
/opt/oracle/product/orachk -u -o post
Upgrade the Databases:
Perform a MANUAL upgrade for the databases in scope. During an earlier upgrade DBUA messed up by adding a local_listener entry to the init.ora, and the Grid agent kept altering the oratab. That is why I recommend against DBUA for bulk upgrades. I would script the upgrade using a fixed ORACLE_HOME (the new one) and a dedicated init.ora / spfile for the migration.
Steps for a manual upgrade (the preferred way!):
Run utlu112i.sql AND dbupgdiag.sql (both located in /opt/oracle/admin/tools; dbupgdiag.sql can be obtained from MOS).
Create a new pfile from the spfile and check whether there are parameters that need to be increased (shared_pool_size is a good candidate, and job_queue_processes must be 0 during the migration). If everything else is fine, simply set job_queue_processes = 0 in your spfile. In my case the spfile already had production-sized values, so no change was needed for shared_pool_size; job_queue_processes = 20 is the value that is needed again afterwards.
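As an illustration (the database name MYDB and the pfile path are hypothetical), preparing such a migration pfile could look like this:

## run as the database owner, with the environment of the current (old) home
sqlplus -S "/ as sysdba" <<'EOF'
create pfile='/opt/oracle/admin/MYDB/pfile/initMYDB.mig' from spfile;
EOF
## then edit initMYDB.mig and set, for the duration of the upgrade:
##   shared_pool_size    >= 600M   (preferably 1024M)
##   job_queue_processes = 0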
1) Start sqlplus and run the catupgrd.sql script from the NEW $ORACLE_HOME/rdbms/admin. As a preparation, make sure the listener.ora, all init.oras (spfiles) and the password file are present in the proper subdirectories of your NEW ORACLE_HOME.
sqlplus ” / as sysdba “
spool /tmp/upgrade<DB>.log
startup upgrade;
If you had to prepare an init<sid>.mig as a migration pfile, use: startup upgrade pfile='<path>/init<sid>.mig'. When you perform a startup upgrade, these parameters are altered by Oracle:
ALTER SYSTEM enable restricted session;
ALTER SYSTEM SET _system_trig_enabled=FALSE SCOPE=MEMORY;
Autotune of undo retention is turned off.

ALTER SYSTEM SET _undo_autotune=FALSE SCOPE=MEMORY;

ALTER SYSTEM SET undo_retention=900 SCOPE=MEMORY;

ALTER SYSTEM SET aq_tm_processes=0 SCOPE=MEMORY;

ALTER SYSTEM SET enable_ddl_logging=FALSE SCOPE=MEMORY;

Resource Manager disabled during database migration: plan '' not set

ALTER SYSTEM SET resource_manager_plan='' SCOPE=MEMORY;

ALTER SYSTEM SET recyclebin='OFF' DEFERRED SCOPE=MEMORY;

Resource Manager disabled during database migration

replication_dependency_tracking turned off (no async multimaster replication found)

set echo on
@?/rdbms/admin/catupgrd.sql;
After catupgrd.sql finishes, it will shut down the database.
2) Check catupgrd.sql spool file for errors.
3) Restart the database in normal mode.
4)   @$ORACLE_HOME/rdbms/admin/catuppst.sql; Post steps for the migration.
5)   @$ORACLE_HOME/rdbms/admin/utlrp.sql;
alter system set "_use_adaptive_log_file_sync"=FALSE scope=both; Requested by the customer.
set lines 2000
select instance_name from v$instance; Check sanity of upgrade.
select * from v$version; Check sanity of upgrade.
select COMP_NAME,VERSION,STATUS,MODIFIED from dba_registry order by 1; Check sanity of upgrade all the installed components should be valid !
select * from DBA_REGISTRY_HISTORY order by action_time desc; Check if catbundle ran (shows the most recent entry first).
Check $ORACLE_HOME/dbs for the presence of a correct init.ora; it should point to the spfile in the ASM diskgroup.
srvctl upgrade database -d <Db> -o /opt/oracle/product/11203_ee_64/db Inform clusterware about altered Oracle home.
srvctl modify database -d <Db> -p '+DATA/<Db>/spfile<Db>.ora' make sure Clusterware knows about the spfile; alter it if needed.
srvctl modify database -d <Db> -o '/opt/oracle/product/11203_ee_64/db' make sure Clusterware knows about the new Oracle Home.
If you have a listener per database, make sure it is started from the NEW Oracle Home with the correct listener.ora.
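A sketch of that check, assuming a per-database listener named LISTENER_MYDB (the name is hypothetical):

## as the oracle user, with the environment pointing to the new home
export ORACLE_HOME=/opt/oracle/product/11203_ee_64/db
export TNS_ADMIN=$ORACLE_HOME/network/admin
$ORACLE_HOME/bin/lsnrctl stop   LISTENER_MYDB
$ORACLE_HOME/bin/lsnrctl start  LISTENER_MYDB
$ORACLE_HOME/bin/lsnrctl status LISTENER_MYDB    # confirm it now runs out of the 11.2.0.3 home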
/opt/oracle/admin/tools/cSrvctlAct.ksh status Check status of db in cluster with shell script.
/opt/oracle/admin/tools/cSrvctlAct.ksh config Check configuration of db in cluster.
As a test, stop and start the database via srvctl stop/start database -d <Db>, or via the shell scripts /opt/oracle/admin/tools/cSrvctlAct.ksh stop and /opt/oracle/admin/tools/cSrvctlAct.ksh start.
ln -s /opt/networker/lib/libnwora.so libobk.so in the new Oracle Home. Check that the Networker library is present in the new Oracle Home.
Run an archive or control file backup as a test.
rman catalog username/password@alias
RMAN> UPGRADE CATALOG;
 Apply PSU patch to rdbms:
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH Opatch runs as root , Set Path
which opatch Check version of Opatch
opatch auto /opt/oracle/stage -ocmrf /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db
Run catbundle.sql psu apply:
1. cd $ORACLE_HOME/rdbms/admin
2. sqlplus /nolog
3. SQL> CONNECT / AS SYSDBA
4. SQL> STARTUP
5. SQL> @catbundle.sql psu apply
6. SQL> QUIT
After applying a PSU patch this is needed for each database using the patched ORACLE_HOME. BUT it was not needed in my scenario, where I had installed and patched the empty 11.2.0.3 ORACLE_HOME first and migrated the databases into that new ORACLE_HOME afterwards.
FallBack scenarios:
Grid Infra fall back :
Downgrading to a release 11.2.0.1 or later release:
Use the command syntax Grid_home/perl/bin/perl rootcrs.pl -downgrade -oldcrshome oldGridHomePath -version oldGridversion, where oldGridhomepath is the path to the previous release Oracle Grid Infrastructure home, and oldGridversion is the release to which you want to downgrade. For example:
# /u01/app/11.2.0/grid/perl/bin/perl rootcrs.pl -downgrade -oldcrshome /u01/app/11.2.0/grid -version 11.2.0.1.0
If you want to stop a partial or failed 11g release 2 (11.2) installation and restore the previous release Oracle Clusterware, then use the -force flag with this command.
After the rootcrs.pl -downgrade script has completed on all remote nodes, on the local node use the command syntax Grid_home/crs/install/rootcrs.pl -downgrade -lastnode -oldcrshome pre11.2_crs_home -version pre11.2_crs_version [-force], where pre11.2_crs_home is the home of the earlier Oracle Clusterware installation, and pre11.2_crs_version is the release number of the earlier Oracle Clusterware installation.
For example:
# /u01/app/11.2.0/grid/perl/bin/perl rootcrs.pl -downgrade -lastnode -oldcrshome
/u01/app/crs -version 11.1.0.6.0
This script downgrades the OCR. If you want to stop a partial or failed 11g Release 2 (11.2) installation and restore the previous release Oracle Clusterware, then use the -force flag with this command.
Log in as the Grid infrastructure installation owner, and run the following commands, where /u01/app/grid is the location of the new (upgraded) Grid home (11.2):
Grid_home/oui/bin/runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList -silent CRS=false ORACLE_HOME=/u01/app/grid
As the Grid infrastructure installation owner, run the command ./runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList -silent CRS=true ORACLE_HOME=pre11.2_crs_home, where pre11.2_crs_home represents the home directory of the earlier Oracle Clusterware installation.
For example:
Grid_home/oui/bin/runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList -silent CRS=true ORACLE_HOME=/u01/app/crs
DownGrade Rdbms – Databases
The database software 11.2.0.2 should still be in place so no need for extra action.
If you are downgrading to Oracle Database 11g Release 1 (11.1), then the COMPATIBLE initialization parameter must be set to 11.0.0 or lower.
Remove Unsupported Parameters from Server Parameter File (SPFILE)
Perform a Full Backup of Oracle Database Before Downgrading
Downgrading Oracle Database to an Earlier Release
1. If you have enabled Oracle Database Vault on your database, then you must:
Grant the Database Vault DV_PATCH_ADMIN role for the SYS account.
Disable Database Vault before downgrading the database.
2. If you previously had upgraded the database and then used the DBMS_DST PL/SQL package to update the database time zone version, then you must apply the patch for the same time zone file version into the earlier release's Oracle home before downgrading.
As an example scenario, assume that a release 10.2.0.4 database on Linux x64 using DSTv4 had been upgraded to release 11.2.0.2, and DBMS_DST was then run to update this database to DSTv14. Then, before downgrading from release 11.2.0.3 to 10.2.0.4, you need to apply on the release 10.2.0.4 side the DSTv14 patch for 10.2.0.4 for Linux x64. This ensures that your TIMESTAMP WITH TIME ZONE data is not logically corrupted during retrieval.
To find which time zone file version your database is using, run:
SELECT value$ FROM sys.props$ WHERE NAME = 'DST_PRIMARY_TT_VERSION';
3. If you had set the ORA_TZFILE environment variable to the full path name of the timezone.dat file when you upgraded to Oracle Database 11g Release 2 (11.2), then you must unset it if you subsequently downgrade your database.
Two time zone files are included in the Oracle home directory:
The default time zone file at
$ORACLE_HOME/oracore/zoneinfo/timezonelrg.dat
A smaller time zone file at
$ORACLE_HOME/oracore/zoneinfo/timezone.dat
If you do not unset the ORA_TZFILE variable, then connecting to the database using the smaller time zone file might produce the following errors:
SP2-1503: Unable to initialize Oracle call interface
SP2-0152: ORACLE may not be functioning properly
4. If you have Oracle Application Express on your database, then you must copy the apxrelod.sql file from the Oracle Database 11g Release 2 (11.2) ORACLE_HOME/apex/ directory to a directory outside of the Oracle home, such as the temporary directory on your system.
Make a note of the new location of this file.
5. If you have created objects based on fixed objects, then drop these objects to avoid possible ORA-00600 errors. You can re-create these objects after the downgrade.
6. Log in to the system as the owner of the Oracle Database 11g Release 2 (11.2) Oracle home directory.
7. This step is required only if Enterprise Manager Database Control is already configured for the database.
Stop Database Control, as follows:
a. Set the ORACLE_UNQNAME environment variable to the database unique name.
b. Run the following command:
ORACLE_HOME/bin/emctl stop dbconsole
If the database being downgraded is an Oracle RAC database, then perform this step on all instances.
8. If you are downgrading an Oracle RAC database to 10g Release 1 (10.1), then you must remove extra voting disks before shutting down the Oracle Clusterware stack.
1. To see the number of voting disks used and to list voting disk paths, run the following command:
Oracle_Clusterware_Home/bin/crsctl query css votedisk
2. Remove each additional voting disk you find in the previous step by running the following command, where path is the voting disk path listed in the File Name column displayed in the previous step:
Oracle_Clusterware_Home/bin/crsctl delete css votedisk path
For example:
Oracle_Clusterware_Home/bin/crsctl delete css votedisk /share/cluster2/vote_cluster2.dbf
Do not delete the last voting disk.
9. At a system prompt, change to the ORACLE_HOME/rdbms/admin directory.
Note:
If you are downgrading a cluster database, then shut down the instance completely and change the CLUSTER_DATABASE initialization parameter to FALSE. After the downgrade, you must set this parameter back to TRUE.
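A minimal sketch of that parameter flip, assuming the database uses an spfile:

## on one instance, before shutting everything down for the downgrade
sqlplus -S "/ as sysdba" <<'EOF'
alter system set cluster_database=false scope=spfile sid='*';
EOF
## after the downgrade is finished, set it back:
## alter system set cluster_database=true scope=spfile sid='*';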
10. Using SQL*Plus, connect to the database instance as a user with SYSDBA privileges.
11. Start up the instance in DOWNGRADE mode:
SQL> STARTUP DOWNGRADE
You might be required to use the PFILE option to specify the location of your initialization parameter file.
12. If you have Enterprise Manager configured in your database, then drop the Enterprise Manager user:
DROP USER sysman CASCADE;
13. Set the system to spool results to a log file so you can track the changes and issues:
SQL> SPOOL downgrade.log
14. Run catdwgrd.sql:
SQL> @catdwgrd.sql
The following are notes about running the script:
You must use the version of the script included with Oracle Database 11g Release 2 (11.2).
You must run the script in the Oracle Database 11g Release 2 (11.2) environment.
The script downgrades all Oracle Database components in the database to the major release or Oracle Database 11g patch release from which you originally upgraded.
If you encounter any problems when you run the script, or any of the scripts in the remaining steps, then correct the causes of the problems and rerun the script. You can rerun any of the scripts described in this chapter as many times as necessary.
If the downgrade for a component fails, then an ORA-39709 error is displayed and the SQL*Plus session terminates without downgrading the Oracle Database data dictionary. All components must be successfully downgraded before the Oracle Database data dictionary is downgraded. You must identify and fix the problem before rerunning the catdwgrd.sql script.
15. Turn off the spooling of script results to the log file:
SQL> SPOOL OFF
Then, check the spool file and verify that there were no errors generated during the downgrade. You named the spool file in Step 13 and the suggested name was downgrade.log. Correct any problems you find in this file and rerun the downgrade script if necessary.
Note:
If you want to save the results of the first time you ran the downgrade script, then before you rerun it be sure to rename downgrade.log to something else.
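A quick way to scan such a spool file for problems (a minimal sketch):

## list any ORA-/PLS- errors found in the downgrade spool
grep -Ei 'ORA-[0-9]+|PLS-[0-9]+' downgrade.log | sort | uniq -c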
16. Shut down the instance:
SQL> SHUTDOWN IMMEDIATE
17. Exit SQL*Plus.
18. If your operating system is Linux or UNIX, then change the following environment variables to point to the directories of the release to which you are downgrading:
ORACLE_HOME
PATH
You should also check that your oratab file and any client scripts that set the value of ORACLE_HOME point to the downgraded Oracle home.
See Also:
Oracle Database Installation Guide for your operating system for information about setting other important environment variables on your operating system
19. If your operating system is Windows, then complete the following steps:
a. Stop all Oracle services, including the OracleServiceSID Oracle service of the Oracle Database 11g Release 2 (11.2) database, where SID is the instance name.
For example, if your SID is ORCL, then enter the following at a command prompt:
C:\> NET STOP OracleServiceORCL
See Also:
Oracle Database Net Services Administrator’s Guide
b. Delete the Oracle service at a command prompt by issuing the ORADIM command. For example, if your SID is ORCL, then enter the following command:
C:\> ORADIM -DELETE -SID ORCL
c. Create the Oracle service of the database that you are downgrading at a command prompt using the ORADIM command.
C:\> ORADIM -NEW -SID SID -INTPWD PASSWORD -MAXUSERS USERS
     -STARTMODE AUTO -PFILE ORACLE_HOME\DATABASE\INITSID.ORA
This syntax includes the following variables: SID, PASSWORD, USERS, and ORACLE_HOME.
For example, if you are downgrading to Oracle Database 10g Release 2 (10.2), if your SID is ORCL, your PASSWORD is TWxy5791, the maximum number of USERS is 10, and the ORACLE_HOME directory is C:\ORANT, then enter the following command:
C:\> ORADIM -NEW -SID ORCL -INTPWD TWxy5791 -MAXUSERS 10
     -STARTMODE AUTO -PFILE C:\ORANT\DATABASE\INITORCL.ORA
Note:
You are not required to change any Windows Registry settings when downgrading a database. The ORADIM utility makes all necessary changes automatically.
20. Restore the configuration files (for example, parameter files, password files, and so on) of the release to which you are downgrading.
If this is an Oracle RAC database, execute the following command to return the database to single instance mode:
set CLUSTER_DATABASE=FALSE
Note:
If you are downgrading a cluster database, then perform this step on all nodes in which this cluster database has instances configured. Set the CLUSTER_DATABASE initialization parameter to FALSE. After the downgrade, you must set this initialization parameter back to TRUE.
21. At a system prompt, change to the ORACLE_HOME/rdbms/admin directory of the previous release.
22. Start SQL*Plus.
23. Connect to the database instance as a user with SYSDBA privileges.
24. Start up the instance:
SQL> STARTUP UPGRADE
25. Set the system to spool results to a log file to track changes and issues:
SQL> SPOOL reload.log
26. Run catrelod.sql:
SQL> @catrelod.sql
The catrelod.sql script reloads the appropriate version of all of the database components in the downgraded database.
27. If you are downgrading to Oracle Database 11g Release 1 (11.1.0.6), run the xsrelod.sql script:
SQL> @xsrelod.sql
Running the xsrelod.sql script avoids the following error:
PLS-00306: wrong number or types of arguments in call
to 'INVALIDATE_DSD_CACHE' DBMS_XS_DATA_SECURITY_EVENTS
PL/SQL: Statement ignored
Note:
The PLS-00306 error is not an issue for Oracle Database release 11.1.0.7 or later.
28. If you are downgrading to Oracle Database 10g Release 1 (10.1.0.5) and you have XDB in your database, then run the dbmsxdbt.sql script:
@dbmsxdbt.sql
Running the dbmsxdbt.sql script avoids the following errors when reloading XDB.DBMS_XDBT package:
PLS-00201: identifier 'CTXSYS.CTX_OUTPUT' must be declared
PLS-00201: identifier 'CTX_DDL' must be declared
29. Run the apxrelod.sql script from the location to which you copied it in step 4:
SQL> @apxrelod.sql
Running the apxrelod.sql script avoids package APEX_030200.WWV_FLOW_HELP being INVALID due to the following error:
PLS-00201: identifier 'CTX_DDL' must be declared
30. Turn off the spooling of script results to the log file:
SQL> SPOOL OFF
31. Shut down and restart the instance for normal operation:
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
You might be required to use the PFILE option to specify the location of your initialization parameter file.
32. Perform this step if the database is configured for Oracle Label Security and you are downgrading to Oracle Database 10g Release 1 (10.1).
a. Copy the olstrig.sql script from the Oracle Database 11g Release 2 (11.2) Oracle home to the Oracle home of the version to which the database is to be downgraded.
b. Run olstrig.sql to re-create DML triggers on tables with Oracle Label Security policies.
SQL> @olstrig.sql
See the Oracle Label Security Administrator’s Guide for more information.
33. Run the utlrp.sql script:
SQL> @utlrp.sql
The utlrp.sql script recompiles all existing PL/SQL modules that were previously in an INVALID state, such as packages, procedures, types, and so on.
34. Exit SQL*Plus.
35. If you are downgrading a cluster database, then you must run the following command to downgrade the Oracle Clusterware database configuration:
$ srvctl downgrade database -d db-unique-name -o old_ORACLE_HOME -t to_relnum
where db-unique-name is the database name (not the instance name), old_ORACLE_HOME is the location of the old Oracle home in which the downgraded database will be running, and to_relnum is the database release number to which the database is being downgraded (for example: 11.2.0.1.0).
Caution:
By default, any named user may create a server pool. To restrict the operating system users that have this privilege, Oracle strongly recommends that you add specific users to the CRS Administrators list.
See Also: Oracle Clusterware Administration and Deployment Guide for more information about adding users to the CRS Administrators list.

Looking back at these environments now that they are patched, my recommendation can only be that they should be patched more often, but as we all know that is sometimes easier said than done.
As always, don't just believe it, test it first!!!

 

PS: the fallback scenario has not been used nor implemented; it has been added here for reference.

Happy reading and till next time.

Mathijs

 

 

 

Rman Archive backup failing with RMAN-03002, RMAN-06059, ORA-19625 in Rac

Introduction:

As a member of the team, every so many days I am part of the hotline, taking care of tickets and being the (hopefully friendly) voice in your ear when you call the hotline number of the Oracle team at the company. One of the tickets drew my attention (hmm, maybe because it was about RMAN, and for years I have loved working with RMAN). An archive backup had failed various times, so I was asked to investigate and fix it. This blog will show you the analysis and the steps I performed to fix this issue.

Details:

At work we run a shell script for both level and archive backups, and in the logfile produced for the archive backup this was my first clue:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 07/11/2014 07:27:45
RMAN-06059: expected archived log not found, loss of archived log compromises recoverability
ORA-19625: error identifying file /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47588_2.arc
ORA-27037: unable to get file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

### So my interpretation was that the specific archive was not present in the ASM diskgroup but was present locally on another node of the RAC, in the $ORACLE_HOME/dbs directory. My investigation on the three nodes showed me:

## On first node :

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled <<-- wrong, because the archives are supposed to be in the +MYDB2_FRA1 diskgroup as a ground rule.
Oldest online log sequence     51210
Next log sequence to archive   51213
Current log sequence           51213

On the second node :

SQL> archive log list
Database log mode                 Archive Mode
Automatic archival                  Enabled
Archive destination                 MYDB2_DATA1  <<-- wrong, because the + is missing.
Oldest online log sequence     47585
Next log sequence to archive   47588
Current log sequence             47588

## on the 3rd node:

SQL> archive log list
Database log mode                 Archive Mode
Automatic archival                  Enabled
Archive destination                 MYDB2_DATA1 <<-- wrong, because the + is missing.
Oldest online log sequence     52542
Next log sequence to archive   52545
Current log sequence             52545

So after that my conclusion was that, due to the wrong archive destination on two out of three nodes in my RAC, the archives were being written to a local filesystem on those two boxes, and on top of that into the $ORACLE_HOME/dbs directory. Well, that was not good!!

## So the first step to be performed was to correct the archive destination:

ALTER SYSTEM SET log_archive_dest_1='LOCATION=+MYDB2_FRA1' SCOPE=BOTH SID='MYDB21';
ALTER SYSTEM SET log_archive_dest_1='LOCATION=+MYDB2_FRA1' SCOPE=BOTH SID='MYDB2';
ALTER SYSTEM SET log_archive_dest_1='LOCATION=+MYDB2_FRA1' SCOPE=BOTH SID='MYDB23';

## Once that was corrected I performed a couple of log switches to see archives being created in the +MYDB2_FRA1 diskgroup. Which was the case, so I was already a bit happier.
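A minimal sketch of that check (run on one of the instances; the query is just one way to confirm where new archives land):

sqlplus -S "/ as sysdba" <<'EOF'
alter system archive log current;
select thread#, sequence#, name
from   v$archived_log
where  completion_time > sysdate - 1/24
order  by thread#, sequence#;
EOF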

## Since the missing + in the archive destination had caused Oracle to create the archives in $ORACLE_HOME/dbs,
## I had to copy the archives present there to the ASM diskgroup with asmcmd, into the correct sub-folder
## (the Linux creation dates showed the archives were from the last two days, so I knew where to put them in ASM):

cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47523_2.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47525_2.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47526_2.arc .

## and I did the same on the other server.

oracle@mysrvr:/opt/oracle/product/11202_ee_64/db/dbs/ [MYDB23]# ls -ltr *.arc .

cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52480_3.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52481_3.arc .
cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_52482_3.arc .
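For context, a sketch of how such a copy can be run from within asmcmd (the target folder is one of the date folders used further below; adjust to where the archives belong):

## as the owner of the grid software, environment pointing to the ASM instance
asmcmd -p
ASMCMD [+] > cd +MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/
ASMCMD [+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/] > cp /opt/oracle/product/11202_ee_64/db/dbs/MYDB2_DATA1MYDB2_776281254_47523_2.arc .
ASMCMD [+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/] > exit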

## The next step was to find out how to register these archives in rman after they had been moved to the ASM diskgroup. And again fellow bloggers did not let me down, so after spending some time surfing Google I came up with this:

I connected with rman to both the target and the catalog and ran the following, which was nice since I only had to point to the directories where all the archives were now located in ASM after my copy action:

catalog start with '+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/';
catalog start with '+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_11/';
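A sketch of that session (the catalog schema and connect alias are placeholders):

rman target / catalog rcat_mydb2/password@RCATDB
RMAN> catalog start with '+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_10/';
RMAN> catalog start with '+MYDB2_FRA1/MYDB2/ARCHIVELOG/2014_07_11/';
RMAN> list archivelog all completed after 'sysdate-2';
RMAN> exit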

## Once that had been done I ran an archive backup with success, and I was happy that I had solved another puzzle.

 

As always Happy reading  and till next time,

 

Mathijs

 

 

The good , the bad and the OPatch (applying PSU April 2014 on Rac and Oracle Restart)

Introduction:

 

The last couple of weeks I have been busy patching and upgrading production, preproduction and test environments, and during those activities OPatch was my friend and tool. Looking back, and after talking to colleagues, I decided to create a post about this. In my patching activities I had to apply a recent PSU patch to both the Grid Infrastructure and the Rdbms, do an upgrade of the software, and apply the latest PSU patch again. During the preparations for OPatch I had issues with the storage available on the mount point of the Grid Infrastructure. So as part of your activities you should take a look at your file-system size, since the PSU patches will need at least 5 GB of free space in the mount.
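A quick pre-check along those lines (mount points as used in this setup, adjust to your own):

## free space in the Grid Infrastructure and Rdbms mounts
df -h /opt/crs/product/11203/crs /opt/oracle/product/11203_ee_64/db
## size of the staged PSU
du -sh /opt/oracle/stage/psuApr2014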

Preparations:

 

  • As was mentioned in the introduction, make sure you have at least 5GB (more is better in this case) in the mount point where the Grid Infrastructure is located. In my case I had /opt/crs/product/11202/crs as a mount with 15GB of space. In this mount the grid software had been installed and one PSU patch had been applied in the old days (we are talking the October 2012 PSU). And while applying a required PSU (October 2013, required for the upgrade to Oracle 11.2.0.3) there was not enough space to install the software.
  • Since my current platform is Linux (this is all about patching RAC environments and Oracle Restart environments) I looked on Metalink and downloaded p6880880_112000_Linux-x86-64.zip. With every PSU patch you install you should ask yourself whether your OPatch is up to date enough, or whether you should download a fresh copy from Metalink. I tend to check and download a fresh copy every time I am wearing my T-shirt "I'm a patch Dba today and I like it".
  • In my environment my software installs look pretty much like this :
    • Grid Infra structure is installed in /opt/crs/product//crs
    • Rdbms is installed in /opt/oracle/product/11202_ee_64/db
    • oh, and a bit confusingly perhaps, my ORACLE_BASE is the same as the home of the ORACLE user (which is /opt/oracle)

## tips

•    Make a subdirectory for each PSU patch you apply; if you unzip N PSU patches into the same directory, opatch auto will try to apply all of them every time again.
•    Is auto really auto? I tend to run it with -oh, which still works fine for me.
•    Keep your OPatch tool up to date.

## Setting up your patching :

oracle@mysrvr:/opt/oracle/stage []# dir
drwxr-xr-x  5 oracle dba     4096 Jun 23 13:29 .
drwxr-xr-x 32 oracle dba     4096 Jun 23 15:22 ..
drwxr-xr-x  2 oracle dba     4096 Jun 11 13:32 OPatchlogs
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:28 psuApr2014
drwxr-xr-x  2 oracle dba     4096 Jun 23 13:29 psuOct2013

## inside psuOct2013

oracle@mysrvr:/opt/oracle/stage/psuOct2013 []# ls -ltr
total 288260
-rw-r--r-- 1 oracle dba        21 Apr  4  2013 README.txt
drwxr-xr-x 5 oracle dba      4096 Apr  4  2013 16459322
-rw-r--r-- 1 oracle dba       450 Oct  9  2013 bundle.xml
drwxrwxr-x 9 oracle dba      4096 Oct 10  2013 17082367
-rw-rw-r-- 1 oracle dba    141496 Jan 20 05:18 README.html
-rw-rw-r-- 1 oracle dba    136871 Jan 20 05:18 PatchSearch.xml
-rwxr-xr-x 1 oracle dba 294574955 Jun  4 07:28 p17272753_112020_Linux-x86-64.zip

## Inside psuApr2014

oracle@mysrvr:/opt/oracle/stage/psuApr2014 []# ls -ltr
total 586820
drwxr-xr-x  5 oracle dba      4096 Jan  9 16:27 17592127
drwxrwxr-x 12 oracle dba      4096 Feb  5 07:04 18031683
-rw-r--r--  1 oracle dba       450 Feb 10 10:16 bundle.xml
-rw-r--r--  1 oracle dba         0 Feb 10 10:17 README.txt
-rw-rw-r--  1 oracle dba     59977 Apr 15 12:18 README.html
-rw-rw-r--  1 oracle dba    125015 Apr 15 14:17 PatchSearch.xml
-rwxr-xr-x  1 oracle dba 600096863 May 16 15:33 p18139678_112030_Linux-x86-64.zip

 

## Applying  PSU April 2014

unzip /opt/oracle/stage/p6880880_112000_Linux-x86-64.zip in your GRID_HOME and ORACLE_HOME directory (this refreshes OPatch in both homes).
Create the response files:
/opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp
/opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp
unzip p18139678_112030_Linux-x86-64.zip (this was PSU April 2014) into its own staging subdirectory.
AS ROOT:
export PATH=/opt/crs/product/11203/crs/OPatch:$PATH
export PATH=/opt/oracle/product/11203_ee_64/db/OPatch:$PATH
which opatch (check that root can run opatch now)

PER NODE in your cluster, AS ROOT:
##Crs
opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

##Rdbms
opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/oracle/product/11203_ee_64/db/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

## Oracle Restart
opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/crs/product/11203/crs

opatch auto /opt/oracle/stage/psuApr2014 -ocmrf /opt/crs/product/11203/crs/OPatch/ocm/bin/ocm.rsp -oh /opt/oracle/product/11203_ee_64/db

 

And as a last recommendation: check the logfiles that are produced during the OPatch run in detail, because I have seen a situation where OPatch reported "succeeded" but a detailed look in the logs showed that one of the patches had not been applied due to a lack of space!!!!

 

As always happy reading and have a great day,

 

Mathijs

Changing directory privileges: the story of the ORACLE_BASE in PSU patching with OPatch in 11.2.0.2.0

Introduction

 

This week I applied PSU patches on both Real Application Cluster and Oracle Restart environments, and that brought me to an interesting observation again. The way we set up the environments in Linux had consequences with regard to the privileges on my directories changing. Let's go into more detail and see if reading this post is of interest to you too.

 Details.

First let us take a look at the way the servers i work on are set up:

  • When the oracle user logs in on the Linux box his HOME (the home of the Oracle user) is set to /opt/oracle.
  • Long ago (a long time before having experience with RAC environments) the company standards dictated that the ORACLE_BASE should always be /opt/oracle. These standards are still alive for our current installations (with or without Real Application Clusters) and that is something I have to respect and deal with.
  • When installing the Oracle software for an Oracle RAC environment, two main software trees are important:
  1. The Grid Infrastructure software is installed under /opt/crs/product//crs
  2. The Rdbms software is installed in /opt/oracle/product/_ee_64/db

The Grid Infra (aka GI) (1) has its own software tree because that way root ownership of the software can be guaranteed during setup and while running, without impacting the normal Oracle Database software and its use (after all, we would not want to run everything as root, right, even when that sounds tempting sometimes 🙂 ).

The Normal Oracle software (2) is installed in the ORACLE_BASE which is correct ( because it does not need root privileges to run) .

However, installing and patching a GI environment is going to be an issue if you keep the ORACLE_BASE at /opt/oracle and it is the same as the HOME of the Oracle user, since after applying the PSU patch to the Grid Infra I all of a sudden had issues in various ways:

After applying the PSU for GI:
oracle@nlmysrvr:/opt []# ls -al

total 101
drwxrwxr-x 19 root   root   4096 Nov 13  2013 .
drwxr-xr-x 28 root   root   4096 May 26 09:52 ..
drwxr-xr-x  3 root   root   4096 Nov 13  2013 bmc
drwxr-xr-x  4 root   hpsmh  4096 Apr  5  2012 compaq
drwxr-xr-x  5 oracle dba    4096 May  3  2012 crs
drwxr-xr-x  3 root   root   4096 Apr 12  2012 emc
lrwxrwxrwx  1 root   root     13 Apr  5  2012 eRunbook -> /opt/novaInfo
drwxr-xr-x 13 root   hpsmh  4096 Apr  5  2012 hp
drwx------  2 root   root  16384 Apr  5  2012 lost+found
drwxr-xr-x  3 root   root   4096 Apr  5  2012 lsi
drwxr-xr-x  9 root   root   1024 Dec 13  2012 networker
drwxr-x---  7 root   root   4096 Apr  5  2012 novaInfo
drwxrwxr-x  6 root   root   4096 Dec  4  2012 nsr
drwxr-x--- 33 root   dba   12288 Jun 12 13:22 oracle
drwxr-xr-x  3 root   root   4096 May  3  2012 ORCLfmap
dr-xr-xr-x 25 bin    bin    4096 Jul 10  2013 OV
dr-xr-xr-x 14 bin    bin    4096 Dec 11  2012 perf
drwxr-xr-x  4 root   root   4096 Sep 12  2012 SAN
drwxr-xr-x  5 root   root   4096 Jul 20  2012 semo
drwxr-xr-x  9 root   root   4096 Jul 10  2012 SP

## During my next ssh login I noticed this:

ssh nlmysrvr

Last login: Thu Jun 12 13:22:15 2014 from homeServer.domain

/usr/bin/xauth:  timeout in locking authority file /opt/oracle/.Xauthority

 

Logins took a long time ..

##

Solution:

As root

chown oracle:dba /opt/oracle (set the ownership of the /opt/oracle directory back again).

 

I spoke to Oracle consultants on this and they recommended  that best practice would be:

  1. Set up  a new standard in which the User Oracle when logging in would start in a directory like /home/oracle.
  2. Set up a Software tree  for the Rdbms Admin  in an ORACLE_BASE  ( so in our case /opt/oracle would be acceptable).
  3. Install the GI software in its own directory, since it will run as root and it would be outside of the ORACLE_BASE (I learned it would be best to unset the environment variable ORACLE_BASE).

 

As we all know, old habits and standards die hard. So for the moment I will settle, in every scenario, for unsetting the ORACLE_BASE variable, doing my checks before and after the installation (always keep a few sessions open), and tweaking the privileges where needed after applying the PSU patch or after running root(upgrade).sh for the GI installation. But I would not only check:

  • the /opt/oracle privileges after running a PSU patch,
  • but would, even when things look ok, go into /opt/oracle/.ssh and make sure that the parent directory (..) has 755 as its mode (see the sketch below),
  • and if that is not the case, would chown and/or chmod to fix things.
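A minimal sketch of those checks and fixes (the 755 mode is the value mentioned above; run it right after the patch or root script finishes):

## as root, with a spare session kept open just in case
ls -ld /opt/oracle /opt/oracle/.ssh
chown oracle:dba /opt/oracle
chmod 755 /opt/oracle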

 

As always, there is never one solution that fits all purposes here. In an ideal world I would indeed love to see the home of the Oracle user in /home/oracle and have an ORACLE_BASE under /opt/oracle. But we don't always live in an ideal world, so it is better to be aware of possible side effects like the ones I describe here and make it part of your plans to look at the situation before and after applying patches or installing software. One of the nice quotes I came to appreciate is: success just loves preparation. Make it part of your plan to check things when you start to patch.

 

Happy reading and till we meet again in next blog.

 

Mathijs

 

 

What’s in a Name (renaming ASM Diskgroup in 11.2)

Introduction:

Once again it showed that the old quote .. what's in a name .. is still valid, even in an Oracle 11.2.0.3.0 Oracle Restart environment, so credits to William S. for that. Well, you might wonder where he is heading with an article with a title like this again. Frankly, I had set up an Oracle Restart environment where during the installation you are asked for one (or more) ASM disks to be added to a diskgroup that will hold the spfile of the ASM instance. During that install I made a typo in the name of the diskgroup and of course only noticed it after finishing the install.

I searched the web and came across a really cool scenario which I explored, and guess what, it works! Below you find the steps I performed. The environment was: 11.2.0.3 Enterprise Edition on Red Hat Linux.

Details:

On the web I found bloggers sharing the idea that you can indeed rename a diskgroup in 11gR2. The tool for that is at the OS level and is part of your Grid Infrastructure software. So it could very well look pretty much like this:

 which renamedg
/opt/crs/product/11203_ee_64/crs/bin/renamedg

Anyhow, I had a bit of a struggle with the syntax, but in the end this is the command that does the magic. Note that my first few attempts failed since I did not add the asm_diskstring on the command line. Be wise and use it (modified if necessary) from the beginning:

 

Step 1 Preparations

As root i stopped the Oracle restart environment :

 ./crsctl stop has

Step 2 Renaming the Diskgroup

Syntax

renamedg phase=both dgname=DUMM_YDATA newdgname=DUMMY_DATA verbose=true asm_diskstring='/dev/mapper/asm*p1'

As you can see i am renaming my diskgroup DUMM_YDATA to DUMMY_DATA  this way.

 

Output coming from that was :
NOTE: No asm libraries found in the system
Parsing parameters..
Parameters in effect:
Old DG name       : DUMM_YDATA
New DG name          : DUMMY_DATA
Phases               :
Phase 1
Phase 2
Discovery str        : /dev/mapper/asm*p1
Clean              : TRUE
Raw only           : TRUE
renamedg operation: phase=both dgname=DUMM_YDATA newdgname=DUMMY_DATA verbose=true asm_diskstring=/dev/mapper/asm*p1
Executing phase 1
Discovering the group
Performing discovery with string:/dev/mapper/asm*p1
Identified disk UFS:/dev/mapper/asm-spfile01p1 with disk number:0 and timestamp (33001295 453578752)
Checking for hearbeat...
Re-discovering the group
Performing discovery with string:/dev/mapper/asm*p1
Identified disk UFS:/dev/mapper/asm-spfile01p1 with disk number:0 and timestamp (33001295 453578752)
Checking if the diskgroup is mounted or used by CSS
Checking disk number:0
Generating configuration file..
Completed phase 1
Executing phase 2
Looking for /dev/mapper/asm-spfile01p1
Modifying the header
Completed phase 2

Step 3 Where to find the Spfile for ASM in Grid Infra

 

With this first success it was time to move on to the next step. I modified the information in the cluster layer with the following, once I had looked up the exact name of the spfile with ASMCMD:

Syntax
srvctl modify asm  -p +DUMMY_DATA/asm/asmparameterfile/registry.253.843232019
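Looking up that exact file name beforehand can be done from asmcmd, for example (a sketch):

asmcmd -p
ASMCMD [+] > ls -l +DUMMY_DATA/asm/asmparameterfile/
ASMCMD [+] > exit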

 

Step 4 Restart the HAS daemon:

As Root:

 ./crsctl start has

 

Step 5 Final activities: Check and cleanup:

 

SQL> show parameter spfile

NAME               TYPE VALUE
------------ ----------- ------------------------------
spfile       string      +DUMMY_DATA/asm/asmparameterfile/registry.253.843232019

 

Checked also the information in the Cluster:

oracle@mysrvr10hr:/opt/oracle [+ASM]# crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DUMMY_DATA.dg
               ONLINE  ONLINE       mysrvr10hr
ora.DUMM_YDATA.dg
               ONLINE  OFFLINE      mysrvr10hr
ora.LISTENER.lsnr
               ONLINE  ONLINE       mysrvr10hr
ora.asm
               ONLINE  ONLINE       mysrvr10hr                 Started
ora.ons
               OFFLINE OFFLINE      mysrvr10hr
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       mysrvr10hr
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       mysrvr10hr

## checked in asmcmd

asmcmd
[Option  -p will be used ]
ASMCMD [+] > spget
+DUMMY_DATA/asm/asmparameterfile/registry.253.843232019

Ok last step, clean up since the cluster still knows about the Diskgroup with the wrong name:

srvctl remove diskgroup -g DUMM_YDATA

 

Mission completed and again a smiling dba here. Please Test before do !

 

Happy reading,

 

Mathijs

Moving from the Asmlib to multi-pathing / creating a shared spfile in ASM – Grid Infra

Introduction

No this is not a note in defense or offense for the ASMLIB. It is merely a registration of the fact that in the past Real application clusters  have been setup with ASMLIB  and these Clusters have turned into trouble(d) environments recently due to wrong allocation of ASMLIB Disks. Since the company standards have changed over the year in favor of the Multi-pathing  it has been decided that the troubled cluster would  have to start using the Multi-pathing . For a general overview of ASMLIB and Multi-pathing :  http://oracle.su/docs/11g/server.112/e10500/asmprepare.htm.

Also it has been decided to alter the standards a that ruled couple of years  ago ( implementing Rac clusters with Asm with ASMLIB) to now a days company standards which embraced the Multi-pathing. So far no one ever did this kind of convert on live systems so it would need a proper scenario a dress-rehearsal test before we would implement this on production environment .

In general terms it was explained to me that ASMLIB is like a layer on TOP of the multi-pathing, presenting disks to ASM with a specific labeling. The asm_diskstring in an ASMLIB environment would point to those disks with 'ORCL:*'; the asm_diskstring is like the key that decides which disks the ASM instance discovers and works with. The scenario turned out to be remarkably simple for the Oracle Dbas. The bottom line was that even though the disks had been labeled for ASMLIB, altering the asm_diskstring to point to the disks via '/dev/mapper/asm*p1' would already do the job, since the ASMLIB labels would simply be ignored when reading the disks through multi-path. But as always, do not try this at home unless you alter it in a controlled way as described below.
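In practice that switch is a single parameter change per ASM instance, roughly like this (a sketch; the diskstring pattern is the one used at this site, and the change only becomes active after the planned restart):

## as the owner of the grid software, connected to the local ASM instance
sqlplus -S "/ as sysasm" <<'EOF'
alter system set asm_diskstring = '/dev/mapper/asm*p1' scope = spfile;
EOF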

Details:

As a preparation I ran this scenario on a preproduction box, and with that being a good dress-rehearsal, implemented it on production as well. More details are described below:

Plan of Approach (wave / action / performed by)
1  alter system set asm_diskstring = '/dev/mapper/asm*p1' scope = spfile;  Oracle
1  Stop CRS on all 4 nodes                                                 Oracle
1  Start CRS on mysrvr33r                                                  Oracle
1  Start CRS on mysrvr34r                                                  Oracle
1  Start CRS on mysrvr35r                                                  Oracle
1  Start CRS on mysrvr36r                                                  Oracle
2  Stop CRS on all 4 nodes                                                 Oracle
2  Remove the ASMLIB RPMs on 33r 34r 35r 36r                               Linux
2  Server reboot of mysrvr33r                                              Linux
2  Check the cluster on mysrvr33r, all up                                  Oracle
2  Restart of mysrvr34r 35r 36r                                            Linux
2  Check 33r 34r 35r 36r                                                   Oracle
3  Stop CRS on the 34r 35r 36r nodes                                       Oracle
3  Create the spfile in the ASM diskgroup on 33r                           Oracle
3  Restart of CRS on 33r                                                   Oracle
3  Check gpnp profile / spfile on 33r                                      Oracle
3  Restart of CRS on mysrvr34r 35r 36r                                     Oracle
3  Check gpnp profile / spfile on 33r - 36r                                Oracle
  1. I used three waves of activities. In the first wave, as preparation, I already altered the asm_diskstring in the spfile of the ASM instances (of course not yet active until after the next restart). After that I stopped the full cluster on all 4 nodes and started with my first node to see the effects (and to see the ASM instance and database instances start); then all nodes were restarted, and it showed all was running well.
  2. In this wave I worked together with the Linux admins. The boxes in scope run on RedHat, and they wanted to get rid of ASMLIB in the kernel as well, so in the second wave I shut down the full cluster one more time, they removed the rpms from Linux and rebooted the first box, and all was well after that. After my checks the other three boxes were started in parallel, and the end result was a happy and running cluster again!
  3. In wave three I had to fix some old hurt: these 4 boxes were still working with local spfiles instead of a shared spfile for the ASM instance. When this cluster was built by me some three years ago it was born as an 11.1 cluster environment and had been set up with local copies of the spfile:

Setting up a (lost) Spfile in ASM in a Grid infrastructure environment:

## My first thought was that the first step would be to stop the cluster, since I did that in another scenario. Much to my surprise it was recommended to do this in the running environment, so I did, after preparing a valid init.ora:

 SQL> create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW'
*
ERROR at line 1:
ORA-29780: unable to connect to GPnP daemon [CLSGPNP_ERR]

## As you can see, that did not work. Investigation brought up the following note: Environment Variable ORA_CRS_HOME MUST be UNSET in 11gR2/12c GI (Doc ID 1502996.1). I checked and indeed that environment variable was present in my .profile:

#### bad practice to have ORA_CRS_HOME  set in your .profile if you are using GI so we unset it !!!!!!!!!!!!!!!!!!!

oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# unset  ORA_CRS_HOME
oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# echo $ORA_CRS_HOME
## after that  the create of the spfile worked.
SQL> create spfile ='+CLUSTERDATA'  from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
File created.

##  Lets do some checks  ( in asmcmd)  it showed:

ASMCMD [+] > spget
+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299

## Second check using the gpnptool:

gpnptool get

Warning: some command line parameters were defaulted. Resulting command line: /opt/crs/product/112_ee_64/crs/bin/gpnptool.bin get -o-
(gpnp profile XML trimmed; the relevant entries were ClusterName="mysrvr3_cluster", ProfileSequence="58" and SPFile="+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299", followed by the OCR profile and the XML signature block)

## in asmcmd:

ls -l +CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299

[Option  -p will be used ]

Type              Redund  Striped  Time             Sys  Name

ASMPARAMETERFILE  UNPROT  COARSE   MAR 11 23:00:00  Y    registry.253.841967299

## alert log shows:

Tue Mar 11 23:48:18 2014

NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1

NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1

## Looks good on the other servers as well (I checked them all with the command below and they showed similar output):

gpnptool get

## As a next step I moved the existing local spfile in $ORACLE_HOME/dbs out of the way:

mv spfile+ASM1.ora spfile+ASM1.ora.20140311.old

## Then edited the init.ora to make it point to the spfile in the ASM diskgroup as well.

vi init+ASM1.ora

## i restarted the cluster and after that i checked in the asm Instance.

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299

## The checks to run are asmcmd spget and gpnptool get.

## Then I copied the init.ora to the other boxes:

oracle@mysrvr33r:/opt/crs/product/112_ee_64/crs/dbs [+ASM1]# cat init+ASM1.ora
spfile='+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299'
scp init+ASM1.ora oracle@mysrvr34r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM2.ora
scp init+ASM1.ora oracle@mysrvr35r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM3.ora
scp init+ASM1.ora oracle@mysrvr36r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM4.ora
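## As an extra cross check (a small sketch, assuming all ASM instances are up) every instance can be asked at once which spfile it is running on:

SQL> select inst_id, value from gv$parameter where name = 'spfile';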

Mission completed. As always, happy reading, and DO test before you implement!

Mathijs .

When Rman shows errors like RMAN-20220 and RMAN-06004

Introduction:

On one of the production databases we are using BCVs (Business Continuity Volumes) in the EMC boxes to make Rman backups. Basically this is done by putting the source (production) database into begin backup mode and then splitting the mirror. After the split the source environment is put back into end backup mode and the BCV is mounted on a different server to make the Rman level backups. It is also important to mention that we are using an Rman catalog database where every source database has its own schema holding the catalog for that database only.

On the source database the archives are put to tape.
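For readers new to the concept, the sequence on the source side looks roughly like this (a minimal sketch; the actual mirror split is an EMC-specific step and is only indicated as a placeholder comment):

SQL> alter database begin backup;
-- split the BCV mirror on the storage array here (EMC specific, not shown)
SQL> alter database end backup;
-- the BCV copy is then mounted on the backup server, where the Rman level backup runs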

For a great explanation of the concept I recommend the following blog post by one of my heroes, Martin Bach:

http://martincarstenbach.wordpress.com/2011/05/24/offloading-production-backups-to-a-different-storage-array/

In my environment I was alerted that the backup of the control file on the backup server was no longer running. So it was time again to gear up, go out there and investigate.

Details:

On the backup server we use a tailored script to do the level backup, and that script also includes a backup of the control file. Looking at the log files I see that this Rman (BCV) level backup is doing two things:

  • It produces a level backup, which is successful,
  • After that a copy of the control file is registered in the catalog and put to tape. That part is now failing every time (even though the log files report the backup itself as successful).

In the log files on the backup server, after a successful level backup, I see the following error:

 RMAN> run {
2> debug off;
3> allocate channel d1 type disk;
4> catalog controlfilecopy '/opt/oracle/admin/backup/MYDB1/backup_controlfile_MYDB1_2014_02_01_23:20.bck';
5> release channel d1;
6> allocate channel t1  type  'SBT_TAPE';
7> send 'NSR_ENV=(NSR_SERVER=MYNSR_SERVER,NSR_CLIENT=MYDB_SERVER, NSR_DATA_VOLUME_POOL=REP#0032#G02#LTO3, NSR_GROUP=DAT_0032_G02_
LTO3_TSR_local_01, NSR_SAVESET_BROWSE="0032 Days", NSR_SAVESET_RETENTION="0032 Days")';
8> backup
9> format 'ctrl_level1_%d_201402012208_3605724822_%p_%U'
10> controlfilecopy '/opt/oracle/admin/backup/MYDB1/backup_controlfile_MYDB1_2014_02_01_23:20.bck';
11> release channel t1;
12> }
Debugging turned off

allocated channel: d1
channel d1: sid=593 devtype=DISK

cataloged control file copy
control file copy filename=/opt/oracle/admin/backup/MYDB1/backup_controlfile_MYDB1_2014_02_01_23:20.bck recid=1280 stamp=83
8423230

released channel: d1

allocated channel: t1
channel t1: sid=593 devtype=SBT_TAPE
channel t1: NMO v5.0.0.0

sent command to channel: t1

Starting backup at 01-FEB-14
released channel: t1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 02/01/2014 23:20:35
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20220: control file copy not found in the recovery catalog
RMAN-06090: error while looking up control file copy: /opt/oracle/admin/backup/MYDB1/backup_controlfile_MYDB1_2014_02_01_23:20.bck

At first I did not see why this was failing, so it went under investigation, but it already felt like a nice puzzle.

When I checked the production side to get a complete overview, I was in for a shock. In the log files of the archive backups I found a clue to what was going on, but I also noticed that those backups were failing. In the log files I saw:

...
released channel: d1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of crosscheck command at 02/02/2014 23:23:30
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20011: target database incarnation is not current in recovery catalog

Hmm, wait a minute, did we just see a valuable clue to solve the case?

So after that I started investigating and surfing the web. After connecting to the catalog (schema) in the Rman database I ran the following query:

SQL> SELECT name, DBID, RESETLOGS_TIME FROM rc_database;
NAME           DBID RESETLOGS_TIME
-------- ---------- -------------------
MYDB1   3605724822 31.01.2014 07:05:51

That made me frown, because according to the catalog there had been a resetlogs as recently as 31.01.2014?

Then I checked the incarnation:

SQL> SELECT dbid, name, dbinc_key, resetlogs_change#, resetlogs_time FROM rc_database_incarnation;
DBID NAME      DBINC_KEY RESETLOGS_CHANGE# RESETLOGS_TIME
---------- -------- ---------- ----------------- -------------------
3605724822 MYDB1            2        7.9895E+12 21.07.2009 17:18:42
3605724822 MYDB1           73                 1 04.04.2008 15:11:18
3605724822 MYDB1           74           2461875 19.05.2008 16:28:20
3605724822 MYDB1      2553841        1.0587E+13 31.01.2014 07:05:51

Not good at all, because apparently the catalog held different information (showing a recent resetlogs) than the production database (whose last resetlogs dated back to 2009). And that was also the information I saw when I queried the production database itself; the v$database view exposes the resetlogs information in these columns:

RESETLOGS_CHANGE# NUMBER System change number (SCN) at open resetlogs
RESETLOGS_TIME DATE Timestamp of open resetlogs
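To compare, this is the kind of query to run on the production (target) database itself as sysdba (a minimal sketch using the standard v$ views):

SQL> select dbid, resetlogs_change#, resetlogs_time from v$database;
SQL> select incarnation#, resetlogs_change#, resetlogs_time, status from v$database_incarnation;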

In the Rman environment the command list incarnation only returned an empty line and a prompt, which was unexpected too.

After giving it some thought and consulting a colleague I decided to stick to the simple scenario (see the sketch after this list), where:

  • I performed an export of the Rman schema for that database on the Rman catalog database server, and I will keep that export for the upcoming 4 weeks in case a restore from the old catalog is needed.
  • I dropped the Rman user for that specific database in the Rman catalog database (drop user rman_Mydb1 cascade).
  • I registered the database again in a new Rman schema in the Rman catalog database.
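For completeness, a rough sketch of those three steps; the schema name rman_MYDB1 and the RMAN connect string are taken from the logs above, while the Data Pump directory, dump file name and tablespace names are illustrative assumptions:

# 1. export the old catalog schema as a fallback (DATA_PUMP_DIR and the dump file name are assumptions)
expdp system/*****@RMAN schemas=RMAN_MYDB1 directory=DATA_PUMP_DIR dumpfile=rman_mydb1_backup.dmp logfile=rman_mydb1_backup.log

# 2. drop the old catalog owner and recreate it (tablespace names are assumptions)
sqlplus system/*****@RMAN
SQL> drop user rman_mydb1 cascade;
SQL> create user rman_mydb1 identified by ***** default tablespace rman_data temporary tablespace temp quota unlimited on rman_data;
SQL> grant recovery_catalog_owner to rman_mydb1;
SQL> exit

# 3. create a fresh catalog and register the database again
rman TARGET / RCVCAT rman_MYDB1/*****@RMAN
RMAN> create catalog;
RMAN> register database;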

## After that it was time to check things:

rman TARGET  / RCVCAT rman_MYDB1/*****@RMAN
Recovery Manager: Release 10.2.0.3.0 - Production on Mon Feb 3 17:15:36 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
connected to target database: MYDB1 (DBID=3605724822)
connected to recovery catalog database
RMAN> list incarnation;
List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1       73      MYDB1   3605724822       PARENT  1          04.04.2008 15:11:18
1       74      MYDB1   3605724822       PARENT  2461875    19.05.2008 16:28:20
1       2       MYDB1   3605724822       CURRENT 7989501317585 21.07.2009 17:18:42

Looked OK to me. After that I ran an archive backup and checked the result of the scheduled Rman level backup the very next day. Once again the concept worked like a charm, so I was a happy DBA again.

Happy reading,

Mathijs