Introduction:

With spring 2017 around new initiatives are developed. As a preparation to start doing Database upgrades to 12C it will be a mandatory step to upgrade the Cluster-ware ( Grid-Infrastructure) first before doing the database part. So in this case very happy me that finally the time has come that one of the customers requests to upgrade a number of Clusters to 12C Grid-infrastructure. In this document will share thoughts , and my plan to tackle this interesting puzzle. Since the first Cluster upgrade will happen pretty soon (this week) the document might evolve with the lessons learned of that first upgrade. Happy reading in advance.

Preparations:

It could be some text of a fortune cookie but every success just loves preparation so in this case that will not be any different. First thing to do was to identify a scope of clusters that had to be upgraded. Together with customer an inventory list had to be created and in the end 10 Clusters have been defined as part of scope for this action. 8 Test clusters and 2 production environments . Interesting detail will be that all Clusters have been patched pretty recently all holding 11.2.0.4 Grid infrastructure with some extra challenge that the below Operating system will come in two flavors (being Red Hat Linux server release 5.11 (Tikanga) and 6.5 (Santiago). Curious in advance already to see if these different versions of Red Hat will have an influence of the steps to be performed. In the details below you will find more details on detailed preparations and actions of the upgrade.

Operating System:

One of the first steps to investigate is of course to find out if the Operating versions at hand are supported ones for the Upgrade. Oracle support confirmed that even though it would be recommended to upgrade the 5.11 Red Hat version first to Red Hat 7, it should work with the 5.11 version at hand. The 6.5 Os version was okay anyhow. The project decided however that an OS upgrade of the 5.11 boxes would delay things so upgrading the OS will be done in a different project.

Storage:

Before even considering to run the upgrade of the grid-infrastructure some extra time needs to be spend to investigate the storage in place in the Cluster for such upgrade. Often the Oracle software is first set up locally on each box on Volume group VG0 but with the out-of-place-installation these days that might become a challenge if there is not enough local storage present anymore in the box. Due to standards those root disk become nearly untouchable. For my project this storage requirement has been defined as an absolute minimum which means there will be a need for extra local storage per node or even for San storage per node which will be presented as required mount points to me. If such storage would not (or no longer be present locally) I have to request and received additional storage for it.

/app/grid	/app/oracle	/var/opt/oracle	/tmp	San 4 lvm dbs
50GB	70GB	32M	1GB

Short explain for this:

/app/grid : 12C Grid-Infra software will be installed.
/app/oracle: For the 12C Database software.
/var/opt/oracle and /tmp: required minimum space.
San 4 lvm dbs:  will be setup for 4GB mountpoints, for each Instance on the local node in order to hold logfiles.

When migrating to 12C and coming from 11G please be informed that you might need extra storage in your OCR – VOTING disk group due to a new feature as well. This new repository database will have to be implemented during the upgrade. This Grid Infrastructure management repository (GIMR) database has become mandatory in Oracle GI 12.1.0.2. Data files associated with it will be created in same diskgroup as OCR or voting. (Average growth per day per node = app 750 MB so a 4 node cluster would lead at default retention of 3 days to app 9 GB storage requirement in OCR or VOTING diskgroup). A fortunate Note is that retention can be changed. Well in my case this means that more ASM disks will need to be added to the specific disk group. At work most OCR and VOTING diskgroups are set up as bare minimum ( in normal redundancy with three disks each like 4 GB each). ( extra info on this topic: https://blogs.oracle.com/UPGRADE/entry/grid_infrastructure_management_repository_gimr)

Detailed preparations and health checks.

One of the quotes in IT sometimes is that you should not touch a well running system. Well in this case I would like to add but if you do, come well prepared. In this case i have put the focus on the three below tools to prove that the current system is in a good shape to run the upgrade which is also to be regarded as a health check of the environment. These preps are based on the Mos note (1579762.1) from from reading Chapter 13 in the great book “Expert Oracle Rac 12C” by Syed Jaffar Hussain, Tariq Farooq,Riyaj Shamsudeen and Kai Yu. ( ISBN-13 (electronic): 978-1-4302-5045-6).

Opatch
RACcheck: Orachk
Runcluvfy

Opatch

Using opatch in order to make sure that the Orainventory is in good shape on all nodes in the cluster. Command issued is investiging the current gridinfrastructure:

opatch lsinventory -oh /opt/crs/product/11204/crs -detail

-oh means for the specific ORACLE_HOME.

-detail shows all details.

RACcheck: Orachk

I have looked on Metalink and Downloaded and installed this tool on the cluster (nodes).

orachk Version

12.2.0.1.2_20161215

Following Quick start guide for this tool:

http://docs.oracle.com/cd/E68491_01/OEXUG/quick-start-guide.htm#OEXUG-GUID-CB4224DA-F389-4E9C-AB6A-C57F46A80C61

Clear information to be found In mos :

ORAchk Upgrade Readiness Assessment (Doc ID 1457357.1)

With the tool downloaded below steps have been performed:

According to documentation the tool needs to be copied, unpacked (and installed) in suptools subdirectory of the cluster software installation.

scp orachk.zip oracle@mysrvr23hr:/opt/crs/product/11204/crs/suptools
scp orachk.zip oracle@mysrvr24hr:/opt/crs/product/11204/crs/suptools

Once unzipped the tool can run in two modes, a pre upgrade mode and a post upgrade mode:

./orachk u -o pre |tee Orachk_pre_20170124.log
./orachk u -o post |tee Orachk_post_20170124.log

Note: the tee command will also create a log file holding all the steps – progress information during run time.
Note: /opt/oracle/.orachk should be empty before stat otherwise:‘Another instance of orachk is running on:: #  message.

Runcluvfy

Working with runcluvfy is like meeting an old friend again. Yet each time it is a bit of struggle to find optimal syntax – parameters to be used for your set up.

#Wrong setup was
./runcluvfy.sh stage -pre crsinst -upgrade -n mysrvr23hr,mysrvr24hr -rolling -fixup -src_crshome /opt/crs/product/11204/crs -dest_home /app/grid/product/12102/grid -dest_version 12.1.0 -verbose

## working version
./runcluvfy.sh stage -pre crsinst -n mysrvr23hr,mysrvr24hr -verbose|tee runcluvfy_20170130_pre.lst
Or
./runcluvfy.sh stage -pre crsinst -upgrade -rolling -src_crshome /opt/crs/product/11204/crs -dest_crshome /app/grid/product/12102/grid -dest_version 12.1.0.2.0 -verbose|tee runcluvfy_20170130_preUpgrade.lst

Upgrade steps:

Now it will become to plan and set up your upgrade steps after the confidence build on the preparation. In the upgrade multiple approaches will be possible. But my goal in this is plain and simple, minimum Impact on Cluster and on the databases hosted on that cluster so I will be aiming for this Scenario: rolling upgrade ASM + Clusterware. A baseline for such will be the below URL:

https://docs.oracle.com/database/121/CWLIN/procstop.htm#CWLIN10001

Working according to company standards will require to use following specific settings for an $ORACLE_BASE, $ORACLE_HOME for the GI installation and a different $ORACLE_HOME for the database software.

oracle@mysrvrhr:/home/oracle [CRS]# echo $ORACLE_BASE
/app/oracle
oracle@mysrvrhr:/home/oracle [CRS]# echo $ORACLE_HOME
/app/grid/product/12102/grid

oracle@mysrvrhr:/home/oracle [MYDB1]# echo $ORACLE_HOME
/app/oracle/product/12102/db

Below in the bullets will go through the steps and comment where needed.

Due to Grid Infrastructure management repository (GIMR) database I had to add larger disks to VOTING diskgroup to have enough storage in place (the steps on how to add the new disks and drop the old ones are too detailed for this blog (after all it is a blog and not a book 🙂 so I will have to blog about that in a separate blog).
Check /tmp because upgrade requires at least 1GB present in /tmp. Either clean up or have /tmp extended. (use ls -lSh command).
check ocr integrity by :

cluvfy comp ocr -n all -verbose

Check backup of ocr and voting disk in the cluster:
```
ocrconfig -showbackup
```

Note: this command can be performed as ORACLE user and will shows info similar to the information below. Interesting aspect here was that I issued the command on the first node ( but the automated back-ups are all on node 11hr).

oracle@mysrvr09hr:/opt/oracle [CRS]# ocrconfig -showbackup
mysrvr11hr 2017/04/21 05:20:36 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup00.ocr
mysrvr11hr 2017/04/21 01:20:29 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup01.ocr
mysrvr11hr 2017/04/20 21:20:07 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup02.ocr
mysrvr11hr 2017/04/20 01:19:42 /opt/crs/product/11204/crs/cdata/mysrvr03cl/day.ocr
mysrvr11hr 2017/04/12 17:16:11 /opt/crs/product/11204/crs/cdata/mysrvr03cl/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available

As the root user Run a Manual Backup of the OCR information. Run the ocrconfig -manualbackup command on a node where the Oracle Cluster-ware stack is up and running to force Oracle Cluster-ware to perform a backup of OCR at any time, rather than wait for the automatic backup. Note: The -manualbackup option is especially useful when you want to obtain a binary backup on demand, such as before you make changes to OCR. The OLR only supports manual backups. NOTE: In 11gR2, the voting files are backed up automatically as part of OCR. Oracle recommends NOT used dd command to backup or restore as this can lead to loss of the voting disk.

mysrvr09hr:root:/root # cd /opt/crs/product/11204/crs/bin/
mysrvr09hr:root:/opt/crs/product/11204/crs/bin # ./ocrconfig -manualbackup
mysrvr11hr 2017/04/21 09:12:40 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup_20170421_091240.ocr

## Checking a second time will now also show a manual backup 2 b in place:
mysrvr09hr:root:/opt/crs/product/11204/crs/bin # ./ocrconfig -showbackup
mysrvr11hr 2017/04/21 05:20:36 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup00.ocr
mysrvr11hr 2017/04/21 01:20:29 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup01.ocr
mysrvr11hr 2017/04/20 21:20:07 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup02.ocr
mysrvr11hr 2017/04/20 01:19:42 /opt/crs/product/11204/crs/cdata/mysrvr03cl/day.ocr
mysrvr11hr 2017/04/12 17:16:11 /opt/crs/product/11204/crs/cdata/mysrvr03cl/week.ocr
mysrvr11hr 2017/04/21 09:12:40 /opt/crs/product/11204/crs/cdata/mysrvr03cl/backup_20170421_091240.ocr

Last line is now showing the manual backup
(since it is showing the format (backup_yyyymmdd_hhmmss.ocr)

Check Location of OCR and Voting Disk (need to be in a diskgroup )

##How:
cat /etc/oracle/ocr.loc

## Shows output similiar to this
## (if ocr is already mirrored in other Diskgroup with normal Redundancy)
#Device/file getting replaced by device +OCR
ocrconfig_loc=+VOTE
ocrmirrorconfig_loc=+OCR

##How: 
crsctl query css votedisk

## Will show 3 voting disks in Disk group Vote due to Normal redundancy (and 3 Disk)
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
 1. ONLINE 36b26f862b9a4f54bfba3096e3d50afa (/dev/mapper/asm-vote01) [VOTE]
 2. ONLINE 9d45d791c1124febbf0a093d5a185c13 (/dev/mapper/asm-vote02) [VOTE]
 3. ONLINE 1b7e510a302e4f03bfdea942d55d7067 (/dev/mapper/asm-vote03) [VOTE]
Located 3 voting disk(s).

## check in ASM:
select a.name dg_name,
a.GROUP_NUMBER dg_number,
a.state dg_state,
b.DISK_NUMBER d_number, 
b.name d_name,
b.mount_status
d_mount_status,
b.header_status
d_header_status,
b.mode_status
d_mode_status,
b.state d_state,
b.FAILGROUP
d_failgroup,
b.path d_path
from
v$asm_diskgroup a,
v$asm_disk b
where
a.GROUP_NUMBER(+) = b.GROUP_NUMBER
order by 2,4;

Unset environment Variables:

unset ORACLE_BASE 
unset ORACLE_HOME 
unset GI_HOME 
unset ORA_CRS_HOME 
unset TNS_ADMIN
unset ORACLE_SID
unset ORA_NLS10

Check active crs version and software version:

## using the current CRS to document current active - and software version
/opt/crs/product/11204/crs/bin/crsctl query crs activeversion
/opt/crs/product/11204/crs/bin/crsctl query crs softwareversion

Performing a Standard Upgrade from an Earlier Release

## Use the following procedure to upgrade the cluster from an earlier release:
Start the installer, and select the option to upgrade an existing Oracle Clusterware and Oracle ASM installation.
On the node selection page, select all nodes.
Select installation options as prompted.
Note: Oracle recommends that you configure root script automation,
so that the sh script can be run automatically during the upgrade.
Run root scripts, using either automatically or manually:

Running root scripts automatically:
TIP: If you have configured root script automation,
then use the pause between batches to relocate services from the nodes running the previous release to the new release.
Comment Mathijs: I have not decided yet on this automation step.
In the documentation read as prep for the upgrade you see the option to create multiple batches:
like batch 1 starting node,
batch 2 all but last node,
batch 3 last node.
I will use both the automated way for one cluster and then use the below manual (old school method mentioned below) on another cluster.

Running root scripts manually:
If you have not configured root script automation, then when prompted,
run the rootupgrade.sh script on each node in the cluster that you want to upgrade.

If you run root scripts manually, then run the script on the local node first.
The script shuts down the earlier release installation, replaces it with the new Oracle Clusterware release, and starts the new Oracle Clusterware installation.
After the script completes successfully, you can run the script in parallel on all nodes except for one, which you select as the last node.
When the script is run successfully on all the nodes except the last node, run the script on the last node.
After running the sh script on the last node in the cluster, if you are upgrading from a release earlier than Oracle Grid Infrastructure 11g Release 2 (11.2.0.2),
and left the check box labeled ASMCA checked, which is the default, then Oracle Automatic Storage Management Configuration Assistant ASMCA runs automatically,
and the Oracle Grid Infrastructure upgrade is complete.
If you unchecked the box during the interview stage of the upgrade, then ASMCA is not run automatically.

If an earlier release of Oracle Automatic Storage Management (Oracle ASM) is installed, then the installer starts ASMCA to upgrade Oracle ASM to 12c Release 1 (12.1).
You can choose to upgrade Oracle ASM at this time, or upgrade it later.
Oracle recommends that you upgrade Oracle ASM at the same time that you upgrade Oracle Clusterware.
Until Oracle ASM is upgraded, Oracle Databases that use Oracle ASM cannot be created and the Oracle ASM management tools in the Oracle Grid Infrastructure 12c Release 1 (12.1) home (for example, srvctl) do not work.

Note:
Because the Oracle Grid Infrastructure home is in a different location than the former Oracle Clusterware and Oracle ASM homes,
update any scripts or applications that use utilities, libraries, or other files that reside in the Oracle Clusterware and Oracle ASM homes.

Check active crs version and software version:

/opt/crs/product/11204/crs/bin/crsctl query crs activeversion
/opt/crs/product/11204/crs/bin/crsctl query crs softwareversion

Post upgrade checks:

 ps -ef|grep d.bin should show daemons started from 12C.

Thoughts on Rollback:

Of course each migration will be as good as its preparation. But still your plan should at least hold the steps for a rollback in case you might not make it to a successful completed task. Below you will find the steps mentioned in general.

On all remote nodes, use the command syntax Grid_home/crs/install/rootcrs.sh -downgrade to stop the 12c Release 1 (12.1).

On the local node use the command syntax Grid_home/crs/install/rootcrs.sh -downgrade -lastnode

On any of the cluster member nodes where the rootupgrade.sh script has run successfully:

cd /u01/app/12.1.0/grid/oui/bin
./runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList
-silent CRS=false ORACLE_HOME=/u01/app/12.1.0/grid

On any of the cluster member nodes where the rootupgrade script has run successfully:
In Old ORACLE_HOME (the earlier Oracle Clusterware installation).$ cd /opt/crs/product/11204/crs/oui/bin/
$ ./runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList -silent CRS=true ORACLE_HOME=/u01/app/crs

Start the Oracle Clusterware stack manually:
On each node, start Oracle Clusterware from the earlier release Oracle Clusterware home:
/opt/crs/product/11204/crs/bin/crsctl start crs

As always thank you for taking an interest in my blog. Happy reading and till the next time.

Mathijs

Mathijs Bruggink