Mission-Critical System with Oracle RAC

Introduction.

A brief discussion of a case study for mission-critical RAC systems spread across various buildings.

In this case study there are two storage boxes, each in a different location, referred to as mx0530 and mx0531. When assigning the LUNs in this concept, the naming convention below was chosen as an identifier for them.

If you really want to take it to the next level, an additional LUN is needed as a quorum disk:

The information used for this case study was shared on the web by Markus; a big thank you to him for that.

As always enjoy,

Mathijs

Changing the Heartbeat in Oracle RAC.

 
1. As of 11.2 Grid Infrastructure, the private network configuration is stored not only in the OCR but also in the gpnp profile. If the private network is not available or its definition is incorrect, the CRSD process will not start and any subsequent changes to the OCR will be impossible. Therefore care needs to be taken when modifying the private network configuration, and it is important to perform the changes in the correct order. Also note that manual modification of the gpnp profile is not supported.
Please take a backup of profile.xml on all cluster nodes before proceeding, as the grid user:
cd $GRID_HOME/gpnp/<hostname>/profiles/peer/
cd /app/oracle/product/12.x.x/grid/gpnp/mysrvrahr/profiles/peer
cd /app/oracle/product/12.x.x/grid/gpnp/mysrvrbhr/profiles/peer
cp -p profile.xml profile.xml.bk
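The backup step above can be sketched as a small helper; the grid-home layout follows the case study, while the function name and the dated suffix are my own additions:

```shell
# Minimal sketch: take a dated backup of every gpnp profile.xml under a
# grid home before touching the interconnect. Run on each node as the
# grid software owner.
backup_gpnp_profiles() {
  grid_home=$1
  stamp=$(date +%Y%m%d%H%M%S)
  for f in "$grid_home"/gpnp/*/profiles/peer/profile.xml; do
    [ -f "$f" ] || continue
    cp -p "$f" "$f.bk.$stamp" && echo "backed up: $f"
  done
}

# Example (case-study path):
# backup_gpnp_profiles /app/oracle/product/12.x.x/grid
```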
2. Ensure Oracle Clusterware is running on ALL cluster nodes and save the current status of the resources:
/app/oracle/product/12.x.x/grid/bin/crsctl check cluster -all
/app/oracle/product/12.x.x/grid/bin/crsctl status resource -t>/tmp/beforeNewIps.lst
3. As the grid user (always curious who that might be; to me it was the oracle user, btw), get the existing information. Here you will see only one interconnect in place, for example:
/app/oracle/product/12.x.x/grid/bin/oifcfg getif
bond1  172.18.112.208  global  cluster_interconnect
bond0  195.233.190.64  global  public
##
The interfaces / subnet addresses can be identified with the following command, for the eth interfaces specifically:
/app/oracle/product/12.x.x/grid/bin/oifcfg iflist|grep -i eth|sort
eth0  172.18.32.0
eth2  192.168.10.0
eth6  192.168.11.0
or check the interfaces / subnets in general at OS level with oifcfg:
/app/oracle/product/12.x.x/grid/bin/oifcfg iflist|sort
4. Add the new cluster_interconnect information:
/app/oracle/product/12.x.x/grid/bin/oifcfg setif -global eth2/192.168.10.0:cluster_interconnect,asm
/app/oracle/product/12.x.x/grid/bin/oifcfg setif -global eth6/192.168.11.0:cluster_interconnect,asm
5. Verify the change:
/app/oracle/product/12.x.x/grid/bin/oifcfg getif

With this information checked and in place, it is time to set up new listeners for ASM, since the original ASM listener created during the installation used eth0, and eth0 will be dropped (removed from the cluster configuration) in the steps below:

The existing listener ASMNET1LSNR will be replaced by the new ASMNET122LSNR listeners:
srvctl add listener -asmlistener -l ASMNET1221LSNR -subnet 192.168.10.0 (as mentioned, this is the eth2 interface that we are going to use).
srvctl add listener -asmlistener -l ASMNET1222LSNR -subnet 192.168.11.0 (as mentioned, this is the eth6 interface that we are going to use).
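When more subnets are involved, the two srvctl commands above can be wrapped in a small loop; this is only a sketch (the helper name and the name:subnet argument format are my own), and srvctl must be on the PATH of the grid software owner:

```shell
# Sketch: add one ASM listener per new private subnet.
# Each argument has the form LISTENER_NAME:SUBNET.
add_asm_listeners() {
  for spec in "$@"; do
    name=${spec%%:*}
    subnet=${spec#*:}
    srvctl add listener -asmlistener -l "$name" -subnet "$subnet"
  done
}

# Example (case-study names and subnets):
# add_asm_listeners ASMNET1221LSNR:192.168.10.0 ASMNET1222LSNR:192.168.11.0
```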

As always, seeing is believing: use crsctl status resource -t to see the details.
Note: each new ASM listener is created as a resource and is in status OFFLINE on all nodes in the cluster at this point.

In the next step we will remove the old ASM listener, using the -f option to suppress errors and messages with regard to dependencies.

srvctl update listener -listener ASMNET1LSNR_ASM -asm -remove -force
I checked again with crsctl status resource -t to make sure the old resource was gone.

Removing the old ASM listener
In the MOS note there is a small inconsistency: it claims that as a next step the old ASM listener should be stopped. I could still see the listener process at OS level on the machine(s) (ps -ef | grep -i inherit), but I was not able to stop it, since the cluster resource was already gone and lsnrctl did not work. Solution: when I skipped this step and stopped and started the cluster (which is mandatory in this scenario anyway), the listener was gone on all nodes.
This is the command the note suggests, but it does NOT work here: lsnrctl stop ASMNET1LSNR_ASM
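To check for such a leftover listener process at OS level, a one-liner along these lines could be used (the bracketed first letter is a common trick so grep does not match its own process; the fallback message is my own addition):

```shell
# Sketch: list leftover TNS listener processes at OS level;
# '[i]nherit' prevents grep from matching itself in the process list.
ps -ef | grep -i '[i]nherit' || echo "no leftover listener processes"
```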
Check configuration before restarting GI:

First command:
srvctl config listener -asmlistener
Name: ASMNET122LSNR_ASM
Type: ASM Listener
Owner: oracle
Subnet: 192.168.10.0
Home: <CRS home>
End points: TCP:1527
Listener is enabled.
Listener is individually enabled on nodes:
Listener is individually disabled on nodes:

Second command:
srvctl config asm
ASM home: <CRS home>
Password file: +VOTE/orapwASM
Backup of Password file:
ASM listener: LISTENER
ASM instance count: ALL
Cluster ASM listener: ASMNET122LSNR_ASM
6. In Grid Infrastructure: shut down Oracle Clusterware on all nodes and disable it, as the root user (in my example I was allowed to sudo):
sudo su -
/app/oracle/product/12.x.x/grid/bin/crsctl stop crs
/app/oracle/product/12.x.x/grid/bin/crsctl disable crs
7. Make the network configuration change at OS level as required and ensure the new interface is available on all nodes after the change (ping the interfaces on all nodes):
for x in 10 11;do for xx in 75 76 77 78;do ping -c2 192.168.${x}.${xx}|egrep 'icmp_seq|transmitted';done;echo;done
for x in a b c d; do for xx in 1 2;do ping -c2 mysrvr${x}hr-hb$xx|egrep 'icmp_seq|transmitted';done;echo;done
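The ping loops above print the raw results per address; a variant that fails fast when any heartbeat address does not answer could look like this (a sketch; the helper name is my own, the example addresses are the case-study ones):

```shell
# Sketch: ping each address twice and report OK/FAIL; returns non-zero
# if any address in the list is unreachable.
check_reachable() {
  rc=0
  for host in "$@"; do
    if ping -c2 -W2 "$host" >/dev/null 2>&1; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      rc=1
    fi
  done
  return $rc
}

# Example (case-study heartbeat addresses and names):
# check_reachable 192.168.10.75 192.168.11.75 mysrvrahr-hb1 mysrvrbhr-hb1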
8. Restart Oracle Clusterware on all nodes as the root user:
sudo su -
/app/grid/product/12.x.x/grid/bin/crsctl start crs
9. Check the cluster and compare the resource status with the earlier snapshot:
/app/oracle/product/12.x.x/grid/bin/crsctl check cluster -all
/app/oracle/product/12.x.x/grid/bin/crsctl status resource -t>/tmp/afterNewIps.lst
sdiff /tmp/afterNewIps.lst /tmp/beforeNewIps.lst
Enable Oracle Clusterware on all nodes as the root user:
/app/oracle/product/12.x.x/grid/bin/crsctl enable crs
10. Remove the old interface if required:
/app/oracle/product/12.x.x/grid/bin/oifcfg delif -global bond1/172.18.112.208:cluster_interconnect
11. Verify the removal:
/app/oracle/product/12.x.x/grid/bin/oifcfg getif

Upgrading to 19c GI from 12.2

Required software:

GI: LINUX.X64_193000_grid_home.zip
RDBMS: LINUX.X64_193000_db_home.zip
Patch 30899722 (GI Release Update, April 2020): p30899722_190000_Linux-x86-64.zip
OPatch: p6880880_200000_Linux-x86-64.zip
Autonomous Health Framework (AHF), including TFA and ORAchk/EXAchk: AHF-LINUX_v20.1.3.zip

The current installation has to be checked with two tools: orachk and cluvfy. Both are included in the 19c software, but it is good practice to download the latest versions from MOS (note 1268927.1).

Note: AHF has two requirements: approximately 5-10 GB of storage, AND the full directory hierarchy where you install it needs to be owned by root. It is recommended to create a file system owned root:root at /var/SP/ahf and perform the install there.

oracle@mysrvr:/app/oracle/stage [MYDB]# ls -al *

-rw-r--r--.  1 oracle dba  165 Jun 24 11:30 status.file

Patch 28553832 was needed on mysrvr, since it had not been patched since 2018:

total 433452
drwxr-xr-x. 3 oracle dba      4096 Jun 24 11:59 .
drwxr-xr-x. 9 oracle dba      4096 Jun 24 11:58 ..
drwxr-x---. 4 oracle dba      4096 Dec 25  2018 28553832
-rw-r--r--. 1 oracle dba 443838687 Jun 24 11:59 p28553832_12201190115OCWJAN2019RU_Linux-x86-64.zip

Patch 30899722 for April 2020:

total 12
drwxr-xr-x. 3 oracle dba 4096 Jun 19 11:43 .
drwxr-xr-x. 9 oracle dba 4096 Jun 24 11:58 ..
drwxr-x---. 4 oracle dba 4096 Jun 24 11:00 Patch_30899722_GI_RELEASE_UPDATE_19.7.0.0.0__14_Apr_2020

Patch CVU, only needed on a new cluster:

total 286872
-rwxr-x---.  1 oracle dba 293648959 Jun 24 10:19 cvupack_Linux_x86_64.zip

To install GI:

total 2821488
drwxr-xr-x. 3 oracle dba       4096 Jun 24 10:20 .
drwxr-xr-x. 9 oracle dba       4096 Jun 24 11:58 ..
drwxr-xr-x. 2 oracle dba       4096 Jun 24 10:22 DELETEME
-rwxr-x---. 1 oracle dba 2889184573 Jun 19 11:38 LINUX.X64_193000_grid_home.zip

Latest version of OPatch:

total 231280
drwxr-xr-x. 2 oracle dba      4096 Jun 24 10:54 .
drwxr-xr-x. 9 oracle dba      4096 Jun 24 11:58 ..
-rwxr-x---. 1 oracle dba 118408624 Jun 24 10:54 p6880880_200000_Linux-x86-64.zip

Database software RDBMS:

total 2988008
drwxr-xr-x. 2 oracle dba       4096 Jun 19 11:40 .
drwxr-xr-x. 9 oracle dba       4096 Jun 24 11:58 ..
-rwxr-x---. 1 oracle dba 3059705302 Jun 19 11:43 LINUX.X64_193000_db_home.zip

Latest TFA:

total 258556
drwxr-xr-x. 2 oracle dba      4096 Jun 19 14:30 .
drwxr-xr-x. 9 oracle dba      4096 Jun 24 11:58 ..
-rw-r--r--. 1 oracle dba 264751391 Jun 19 14:30 TFA-LINUX_v19.2.1.zip

Running preparations:

./cluvfy stage -pre crsinst -n <Nodes>  -verbose > /tmp/results_cluvfy_001.txt
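The saved report can get long; to review only the problem lines, a small filter could be used (a sketch; the helper name is my own, and the pattern matches the FAILED/WARNING markers and PRVG/PRVE error codes seen in the output below):

```shell
# Sketch: extract only failed and warning checks from a saved cluvfy
# report for a quick review.
cluvfy_failures() {
  egrep -i 'FAILED|WARNING|PRV[GE]-' "$1"
}

# Example:
# cluvfy_failures /tmp/results_cluvfy_001.txt
```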

This showed the types of errors below, which was OK given that this is an existing box with clusterware already running (after checking, I proceeded with the installation):

Failures were encountered during execution of CVU verification request "stage -pre crsinst".

Verifying Group Existence: asmadmin ...FAILED
mysrvr: PRVG-10461 : Group "asmadmin" selected for privileges "OSASM" does not exist on node "mysrvr".

Verifying Group Existence: asmdba ...FAILED
mysrvr: PRVG-10461 : Group "asmdba" selected for privileges "OSDBA" does not exist on node "mysrvr".

Verifying Group Membership: asmadmin ...FAILED
mysrvr: PRVG-10460 : User "oracle" does not belong to group "asmadmin" selected for privileges "OSASM" on node "mysrvr".

Verifying Group Membership: asmdba ...FAILED
mysrvr: PRVG-10460 : User "oracle" does not belong to group "asmdba" selected for privileges "OSDBA" on node "mysrvr".

Verifying Node Connectivity ...WARNING
mysrvr: PRVG-11069 : IP address "169.254.0.2" of network interface "idrac" on the node "mysrvr" would conflict with HAIP usage.

Verifying Domain Sockets ...FAILED
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_monitor_ag_mysrvr_" exists on node "mysrvr".
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_css_ctrllcl_mysrvr_" exists on node "mysrvr".
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_mysrvr_EVMD" exists on node "mysrvr".
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_mysrvr_CSSD" exists on node "mysrvr".
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_agent_ag_mysrvr_" exists on node "mysrvr".
mysrvr: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_mysrvr_INIT" exists on node "mysrvr".

Verifying ASM Filter Driver configuration ...WARNING
mysrvr: PRVE-10237 : Existence of files "/lib/modules/3.10.0-693.el7.x86_64/extra/oracle/oracleafd.ko,/lib/modules/3.10.0-862.9.1.el7.x86_64/weak-updates/oracle/oracleafd.ko,/opt/oracle/extapi/64/asm/orcl/1/libafd12.so" is not expected on node "mysrvr" before Clusterware installation or upgrade.
mysrvr: PRVE-10239 : ASM Filter Driver "oracleafd" is not expected to be loaded on node "mysrvr" before Clusterware installation or upgrade.

Grid Infrastructure installation:

Before actually starting the GUI, follow these steps:

- GRID_HOME:

mkdir -p /app/grid/product/19c/grid

cd /app/grid/product/19c/grid

ls -la

If the directory is not empty:

rm -rf *

rm -rf .patch_storage

## Unzip the file in /app/grid/product/19c/grid

unzip /app/oracle/stage/GI/LINUX.X64_193000_grid_home.zip

OPatch version.

For the installation, get the latest version of OPatch from MOS and add it to the stage directory.

cd /app/grid/product/19c/grid

mkdir OPatch_old_20200513

cp -pir OPatch/* OPatch_old_20200513/

unzip /app/oracle/stage/OPatch/p6880880_200000_Linux-x86-64.zip

-> respond with "A"

The command below will patch the GI software with the April 2020 RU and will start the GUI afterwards.

cd /app/grid/product/19c/grid/

unset ORACLE_HOME

unset ORACLE_BASE

export ORACLE_BASE=/app/oracle

./gridSetup.sh -applyPSU /app/oracle/stage/30899722/Patch_30899722_GI_RELEASE_UPDATE_19.7.0.0.0__14_Apr_2020/30899722

## After patching, the installer started and requested me to stop the databases that were using the current ASM.

## Stopping the instances that are all using the same Oracle home:

srvctl stop home -o /app/oracle/product/12201/db -s /app/oracle/stage/status.file -n $(uname -n) -stopoption immediate

## after that I restarted the installer:

/app/grid/product/19c/grid/gridSetup.sh &

In the next screens, keep selecting Next until the prerequisite checks run. There I ended up with this:

This message made it all too clear to me that patching was needed. After completing that, I could do a retry.

Patch 28553832:

Oracle Clusterware 12C Release 2 (12.2.0.1.0OCWJAN2019RU)

Patch for Bug# 28553832 for Linux-x86-64 platform

This patch is RAC Rolling Installable.

Released: 25 December , 2018

6 Bugs Fixed by This Patch

This patch includes the following bug fixes:

13852018 DB12; NEED TEST PATCH FOR DB12 FROM SE FOR EVERY CANDIDATE DB LABEL

The log of current session can be found at:

  /app/oracle/crsdata/mysrvr/crsconfig/roothas_2020-06-24_01-03-35PM.log

2020/06/24 13:03:36 CLSRSC-595: Executing upgrade step 1 of 12: 'UpgPrechecks'.
2020/06/24 13:03:39 CLSRSC-595: Executing upgrade step 2 of 12: 'GetOldConfig'.
2020/06/24 13:03:41 CLSRSC-595: Executing upgrade step 3 of 12: 'GenSiteGUIDs'.
2020/06/24 13:03:41 CLSRSC-595: Executing upgrade step 4 of 12: 'SetupOSD'.
2020/06/24 13:03:41 CLSRSC-595: Executing upgrade step 5 of 12: 'PreUpgrade'.
ASM has been upgraded and started successfully.
2020/06/24 13:04:38 CLSRSC-595: Executing upgrade step 6 of 12: 'UpgradeAFD'.
2020/06/24 13:06:43 CLSRSC-595: Executing upgrade step 7 of 12: 'UpgradeOLR'.
clscfg: EXISTING configuration version 0 detected.
Creating OCR keys for user 'oracle', privgrp 'dba'..
Operation successful.
2020/06/24 13:06:47 CLSRSC-595: Executing upgrade step 8 of 12: 'UpgradeOCR'.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node mysrvr successfully pinned.
2020/06/24 13:06:50 CLSRSC-595: Executing upgrade step 9 of 12: 'CreateOHASD'.
2020/06/24 13:06:51 CLSRSC-595: Executing upgrade step 10 of 12: 'ConfigOHASD'.
2020/06/24 13:06:51 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'
2020/06/24 13:07:34 CLSRSC-595: Executing upgrade step 11 of 12: 'UpgradeSIHA'.
mysrvr     2020/06/24 13:08:33     /app/oracle/crsdata/mysrvr/olr/backup_20200624_130833.olr     3633918477
mysrvr     2018/11/05 21:55:51     /app/grid/product/12201/grid/cdata/mysrvr/backup_20181105_215551.olr     2960767134
2020/06/24 13:08:34 CLSRSC-595: Executing upgrade step 12 of 12: 'InstallACFS'.
2020/06/24 13:10:03 CLSRSC-327: Successfully configured Oracle Restart for a standalone server

## Clicked ok


srvctl start home -o /app/oracle/product/12201/db -s /app/oracle/stage/status.file -n $(uname -n)

Checks:

oracle@mysrvr:/opt/oracle/diag/asm/+asm/+ASM/trace [+ASM]#  crsctl query has releaseversion

Oracle High Availability Services release version on the local node is [19.0.0.0.0]

oracle@mysrvr:/opt/oracle/diag/asm/+asm/+ASM/trace [+ASM]# crsctl query has releasepatch

Oracle Clusterware release patch level is [3633918477] and the complete list of patches [30869156 30869304 30894985 30898856 ] have been applied on the local node. The release patch string is [19.7.0.0.0].

oracle@mysrvr:/opt/oracle/diag/asm/+asm/+ASM/trace [+ASM]#  crsctl query has softwarepatch

Oracle Clusterware patch level on node mysrvr is [3633918477].
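In a multi-node cluster it is worth confirming that every node reports the same patch level. A sketch of such a check (the helper name is my own, and the example assumes passwordless ssh between the case-study nodes):

```shell
# Sketch: read one "crsctl query has softwarepatch" line per node on
# stdin, extract the bracketed patch level (e.g. [3633918477]), and
# succeed only when all nodes report the same value.
same_patch_level() {
  levels=$(sed -n 's/.*\[\([0-9][0-9]*\)\].*/\1/p' | sort -u)
  [ -n "$levels" ] && [ "$(printf '%s\n' "$levels" | wc -l)" -eq 1 ]
}

# Example (assumed node names):
# for n in mysrvrahr mysrvrbhr; do
#   ssh "$n" /app/grid/product/19c/grid/bin/crsctl query has softwarepatch
# done | same_patch_level && echo "patch levels match"
```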