The return of the relink: Grid Infrastructure and RDBMS relink

Introduction:

This week I was part of the debate again: do we or don't we relink after a major activity such as a Linux kernel upgrade? I was asked to do the relink after the Linux upgrade on a RAC cluster. As always, I thought it would be wise to make notes during the day as a plan to be performed during the night. In this blog you will find the steps I performed on a two-node RAC cluster with 11.2.0.4 Grid Infrastructure and two Oracle software trees holding 11.2.0.4 RDBMS and 11.1 RDBMS.

With regard to relinking, the discussion in the team went like this: 1) we might break things by relinking, and 2) we don't have the resources to do that for every server. My recommendation is to follow Oracle in this and relink the Grid Infrastructure right after the OS has been upgraded. If something breaks during the upgrade and the relink that follows, at least you know where it came from and can deal with it from there. Whereas if you do not relink your software right after such a major change to the OS, you might still get hit in the dark in the upcoming weeks, and you would then need to figure out what caused it.

You can even debate whether it is necessary to stop resources such as listeners and databases gracefully before shutting down the cluster, or whether it is enough to perform a checkpoint in your database and just shut down CRS. I have used both approaches and never had issues so far. But I can imagine that heavily used, busy systems might prefer the graceful shutdown before shutting down GI.

 

Below you will find my steps. As always, happy reading, and till we meet again,

Mathijs.

 

Detailed Plan:

 

mysrvrar / mysrvrbr Steps 1 to 8 will be performed on both nodes in my cluster, one after the other, with some delay in between to make sure no cluster panic occurs.
1 crsctl status resource -t > /tmp/BeforeWork.lst Check your cluster so that you can compare it with what it looks like after the relinking, and put the output into a file. I often end up on clusters that I do not work with on a daily basis, so I tend to make this overview before I start working on the cluster.
2 cSpfile.ksh This is a home-made script that performs several activities: it creates an spfile, does a checkpoint and switches the logfile right before the cluster node is shut down.
3 emctl stop agent
4 srvctl stop home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrar This stops all resources that were started from the 11.2.0.4 home and keeps a record of them in the file /tmp/statusRDBMS. This will be convenient when starting them again.
5
6 srvctl stop instance -d MYDBCM -i MYDBCM1 This is a shared cluster, so some customers require the 11.2.0.4 software and some the 11.1 software. The 11.1 databases have to be stopped individually.
srvctl stop instance -d MYDBCMAC -i MYDBCMAC1
7 srvctl stop listener -n mysrvrar -l listener_MYDBCM1 It is common to have a listener per database, so I stop the 11.1 listeners in the proper way as well.
srvctl stop listener -n mysrvrar -l listener_MYDBCMAC1
8 As root: Dealing with the cluster means you have to log on as root, or run sudo su - from the oracle user to become root, in order to stop the clusterware on the cluster node.
9 cd /opt/crs/product/11204/crs/bin
10 ./crsctl disable crs During this maintenance Linux will be patched and rebooted several times, so I was asked to make sure that the Grid Infrastructure does not start at each reboot until we are ready.
11 ./crsctl stop crs Last step of the preparation before the Linux guys patch the machines: shutting down the Grid Infrastructure. Time to take a 2hr sleep.
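The Oracle-owned part of the preparation (steps 4, 6 and 7) can be wrapped in a small script with a dry-run switch, so the plan can be reviewed during the day and executed at night. This is only a sketch: the names DRY_RUN, DB_HOME_112, DB_HOME_111, run, srvctl_from and stop_node_resources are made up for illustration, and it assumes that each resource is managed with the srvctl of the home it runs from.

```shell
#!/bin/sh
# Sketch only: stop the Oracle-owned resources on one node (steps 4, 6 and 7).
# DRY_RUN, DB_HOME_112, DB_HOME_111, run, srvctl_from and stop_node_resources
# are illustrative names; they do not come from the original plan.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them
DB_HOME_112=${DB_HOME_112:-/opt/oracle/product/11204_ee_64/db}
DB_HOME_111=${DB_HOME_111:-/opt/oracle/product/111_ee_64/db}

# Print the command in dry-run mode, otherwise execute it and report failures.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "DRY-RUN: $*"
    else
        "$@" || { echo "FAILED: $*" >&2; return 1; }
    fi
}

# Call srvctl from a given home with ORACLE_HOME pointing at that home
# (assumption: each resource is handled by the srvctl of its own home).
srvctl_from() {
    home=$1; shift
    run env ORACLE_HOME="$home" "$home/bin/srvctl" "$@"
}

stop_node_resources() {
    node=$1   # for example mysrvrar
    inst=$2   # instance number on that node, for example 1

    # Step 4: stop everything started from the 11.2.0.4 home, keep a state file.
    srvctl_from "$DB_HOME_112" stop home -o "$DB_HOME_112" \
        -s /tmp/statusRDBMS -n "$node" || return 1

    # Step 6: the 11.1 instances are stopped one by one.
    for db in MYDBCM MYDBCMAC; do
        srvctl_from "$DB_HOME_111" stop instance -d "$db" -i "${db}${inst}" || return 1
    done

    # Step 7: then the listener of each 11.1 database.
    for db in MYDBCM MYDBCMAC; do
        srvctl_from "$DB_HOME_111" stop listener -n "$node" \
            -l "listener_${db}${inst}" || return 1
    done
}
```

Calling stop_node_resources mysrvrar 1 with the default DRY_RUN=1 only prints the five srvctl commands; with DRY_RUN=0 it runs them and stops at the first failure.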
Time to Relink the software on the two nodes Starting the relink on the first node, performing steps 9 and following. I will complete all steps needed on the first node and see to it that the Grid Infrastructure is started before moving on to the second node.
12 CHECK IF CRS IS DOWN otherwise REPEAT step 4 After returning to the cluster, still check that CRS is down, because it is better to be safe than sorry.
13 As root: In order to relink the Grid Infra you have to become the root user again.
14 cd /opt/crs/product/11204/crs/bin as root
15 cd /opt/crs/product/11204/crs/crs/install
16 perl rootcrs.pl -unlock Earlier this night the GI was shut down for Linux patching. When you run perl rootcrs.pl -unlock it will try to shut down the GI, so in my case I got a message that the system was not able to stop CRS.
17 As the grid infrastructure for a cluster owner: This was a bit tricky, because the owner of the Grid Infra in my case is oracle, so don't try this as root. Better to open a second window as oracle for the steps below.
18 export ORACLE_HOME=/opt/crs/product/11204/crs As the Oracle user.
cd /opt/crs/product/11204/crs/bin As the Oracle user.
19 relink Relink will also write a relink log which you can tail.
20 [Step 1] Log into the UNIX system as the Oracle software owner: Once the GI software has been relinked, it is time to relink the Oracle homes (in my case an 11.1 and an 11.2 software tree). In my case I logged on as the oracle user.
21 [STEP 2] Verify that your $ORACLE_HOME is set correctly:
22 For all Oracle Versions and Platforms, perform this basic environment check first:
export ORACLE_HOME=/opt/oracle/product/11204_ee_64/db Oracle 11.2.0.4
export ORACLE_HOME=/opt/oracle/product/111_ee_64/db Oracle 11.1
cd $ORACLE_HOME
pwd Check the environment.
23 [Step 3] Verify and/or Configure the UNIX Environment for proper relinking:
Set LD_LIBRARY_PATH to include $ORACLE_HOME/lib LD_LIBRARY_PATH needs to be in place, so when relinking both Oracle versions make sure you set the environment correctly for each home.
export LD_LIBRARY_PATH=/opt/oracle/product/11204_ee_64/db/lib
echo $LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/oracle/product/111_ee_64/db/lib
echo $LD_LIBRARY_PATH
24 [Step 4] For all Oracle Versions and UNIX Platforms:
Verify that you performed Step 2 correctly: Check , check and check again
env | grep -i LD_ Make sure that you see the correct absolute path for $ORACLE_HOME in the variable definitions.
25 [Step 5] For all Oracle Versions and UNIX Platforms:
Verify umask is set correctly:
umask This must return 022. If it does not, set umask to 022.
umask 022
umask
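The checks from Steps 2 to 5 can be bundled into one function that is run once per Oracle home before relinking. This is a minimal sketch; check_relink_env is a made-up name, and the function only tests what the steps above describe (ORACLE_HOME is a directory, LD_LIBRARY_PATH contains $ORACLE_HOME/lib, umask is 022).

```shell
#!/bin/sh
# Sketch: pre-relink environment check, mirroring Steps 2 to 5 above.
# check_relink_env is an illustrative name, not an Oracle-supplied tool.

check_relink_env() {
    rc=0

    # Step 2: ORACLE_HOME must be set and point to an existing directory.
    if [ -z "$ORACLE_HOME" ] || [ ! -d "$ORACLE_HOME" ]; then
        echo "ORACLE_HOME is unset or not a directory: '$ORACLE_HOME'" >&2
        rc=1
    fi

    # Steps 3 and 4: $ORACLE_HOME/lib must be one of the LD_LIBRARY_PATH entries.
    case ":$LD_LIBRARY_PATH:" in
        *":$ORACLE_HOME/lib:"*) ;;
        *) echo "LD_LIBRARY_PATH does not contain $ORACLE_HOME/lib" >&2; rc=1 ;;
    esac

    # Step 5: umask prints 0022 or 022 depending on the shell,
    # so strip leading zeros before comparing.
    mask=$(umask | sed 's/^0*//')
    if [ "$mask" != "22" ]; then
        echo "umask is $(umask), expected 022" >&2
        rc=1
    fi

    return $rc
}
```

A possible use is check_relink_env && $ORACLE_HOME/bin/relink all, so that the relink only starts when the environment of the current home looks right.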
26 [Step 6] Run the OS Commands to Relink Oracle:
Important Notes:
* Before relinking Oracle, shut down both the database and the listener.
* The following commands will output a lot of text to your session window. To capture this output for upload to support, redirect the output to a file.
* If relinking a client installation, it’s expected that some aspects of the following commands will fail if the components were not originally installed.
27 For all UNIX platforms:
Oracle 8.1.X, 9.X.X, 10.X.X or 11.X.X
————————————-
$ORACLE_HOME/bin/relink all Oracle 11.1
$ORACLE_HOME/bin/relink oracle 11.2
writing relink log to: /opt/oracle/product/11204_ee_64/db/install/relink.log
28 How to Tell if Relinking Was Successful: If relinking was successful, the make command will eventually return to the OS prompt without an error. There will NOT be a ‘Relinking Successful’ type message. I ran a tail on the logfiles in a second window while relink was running and did not see any issues. As the note says, wait for the prompt to return (with no comments or messages) and you are good to go.
29 As root again: Since I am relinking both the GI and the RDBMS, I have moved this step (starting the GI again) until after the RDBMS relinking has finished, because of course during the relink of the RDBMS the environment (databases, listeners) has to be down!
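Because there is no success message, it can help to scan the relink log for suspicious lines once the prompt has returned. The sketch below does that with grep. The pattern list is my own assumption, not an official Oracle list, so it may miss problems or flag harmless lines; relink_log_errors and check_relink_log are made-up names.

```shell
#!/bin/sh
# Sketch: scan a relink log for lines that usually point to a failed link step.
# The patterns are an assumption, not an official Oracle list; adjust as needed.

relink_log_errors() {
    # -i ignore case, -n show line numbers, -E extended regular expressions
    grep -inE 'error|fatal|undefined reference|cannot find|make(\[[0-9]+\])?: \*\*\*' "$1"
}

check_relink_log() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    if relink_log_errors "$log"; then
        echo "suspicious lines found in $log" >&2
        return 1
    fi
    echo "no suspicious lines in $log"
}
```

For the 11.2.0.4 home in this plan the call would be check_relink_log /opt/oracle/product/11204_ee_64/db/install/relink.log. A clean result is no proof of a good relink, it only means none of the assumed patterns showed up.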
30 cd /opt/crs/product/11204/crs/crs/install/
31 perl rootcrs.pl -patch This perl rootcrs.pl -patch will also start the cluster on this node again. NOTE: we had an issue where this was hanging on the first node. It turned out that the second node was up and running after all (my Linux colleague had issued a crsctl disable crs from an old, inactive clusterware installation that was still present on the box). So in this specific scenario I stopped CRS again on the second node, and then the script continued on the first node.
32 crsctl enable crs If you used disable crs, enable it again so that the GI will start after a node reboot.
33 As Oracle
emctl start agent Agent was already running so no manual action needed.
34 srvctl start home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrar This will start all resources that were running from the 11.2.0.4 home. The resources were saved previously in the /tmp/statusRDBMS file.
35 srvctl start instance -d MYDBCM -i MYDBCM1 Starting the 11.1 Resources.
srvctl start instance -d MYDBCMAC -i MYDBCMAC1
36 srvctl start listener -n mysrvrar -l listener_MYDBCM1 Starting the 11.1 Resources.
srvctl start listener -n mysrvrar -l listener_MYDBCMAC1
37 As Oracle User on the second node once it is relinked:
38 srvctl start instance -d MYDBCM -i MYDBCM2 Starting the 11.1 Resources.
srvctl start instance -d MYDBCMAC -i MYDBCMAC2
39 srvctl start listener -n mysrvrbr -l listener_REQMOD2 Starting the 11.1 Resources.
srvctl start listener -n mysrvrbr -l listener_MYDBCM2
srvctl start home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrbr
crsctl status resource -t Check your cluster again and compare the result with the status before. Hopefully all resources will show ONLINE ONLINE, or at least show the situation as it was before. There might be an extra activity if you are using services that were relocated during the action. In that case you will have to relocate them back to their original location.
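The comparison with the overview from step 1 can be done with a plain diff. In this sketch /tmp/AfterWork.lst is a file name I made up to match /tmp/BeforeWork.lst, and compare_cluster_state is an illustrative helper, not part of the Oracle tooling.

```shell
#!/bin/sh
# Sketch: compare the resource overview taken before the work with a new one.
# compare_cluster_state and /tmp/AfterWork.lst are illustrative names.

compare_cluster_state() {
    before=$1
    after=$2
    if diff "$before" "$after" >/dev/null; then
        echo "cluster state is the same as before"
        return 0
    fi
    echo "cluster state differs from the situation before:"
    diff "$before" "$after"
    return 1
}

# Possible use after the last node is back:
#   crsctl status resource -t > /tmp/AfterWork.lst
#   compare_cluster_state /tmp/BeforeWork.lst /tmp/AfterWork.lst
```

Any line that shows up in the diff, for example a service that is now running on the other node, is a candidate for the relocate mentioned above.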

 

 
