No, this is not a note in defense of or against ASMLIB. It merely records the fact that in the past Real Application Clusters were set up with ASMLIB, and that some of these clusters have recently turned into troubled environments due to wrong allocation of ASMLIB disks. Since the company standards have changed over the years in favor of multipathing, it was decided that the troubled cluster would have to start using multipathing. For a general overview of ASMLIB and multipathing: http://oracle.su/docs/11g/server.112/e10500/asmprepare.htm.
It was also decided to replace the standard that ruled a couple of years ago (implementing RAC clusters on ASM with ASMLIB) with the present-day company standard, which embraces multipathing. So far no one had ever done this kind of conversion on live systems, so it needed a proper scenario and a dress-rehearsal test before we would implement it in the production environment.
In general terms it was explained to me that ASMLIB is like a layer on top of multipathing, presenting disks to ASM with a specific label. In an ASMLIB environment the asm_diskstring points to those disks with a value such as 'ORCL:*'. The asm_diskstring is the key that tells the ASM instance which disks it should discover and work with. The scenario turned out to be remarkably simple for the Oracle DBAs. The bottom line was that even when the disks have been labeled for ASMLIB, altering the asm_diskstring to point to the disks via '/dev/mapper/asm*p1' already does the job, since the ASMLIB labels are simply ignored when the disks are discovered through the multipath devices. But as always: do not try this at home unless you alter it in a controlled way, as described below.
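To get a feel for which devices a discovery string such as '/dev/mapper/asm*p1' would pick up, the wildcard can be tried against a few names in plain shell. This is only a sketch: the device names are made up for illustration, the function name is hypothetical, and shell pattern matching is used as a stand-in for the ASM discovery itself, on the assumption that a simple `*` behaves the same way in both.

```shell
#!/bin/sh
# Sketch only: shell pattern matching as a stand-in for ASM disk discovery.
# The device names used below are hypothetical examples.

# Succeeds (exit status 0) when the path matches the multipath discovery pattern.
matches_diskstring() {
    case "$1" in
        /dev/mapper/asm*p1) return 0 ;;
        *) return 1 ;;
    esac
}

for dev in /dev/mapper/asm_data01p1 /dev/mapper/asm_data01 \
           /dev/mapper/rootvg-lv01 /dev/oracleasm/disks/DATA01; do
    if matches_diskstring "$dev"; then
        echo "discovered: $dev"
    else
        echo "skipped:    $dev"
    fi
done
```

Only the first name ends up as "discovered": it is the only one that lives under /dev/mapper, starts with asm and ends with the partition suffix p1.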
As preparation I ran this scenario on a preproduction box, and with that being a good dress rehearsal I implemented it on production as well. The details are described below:
Plan of Approach
|Wave|Activity|Team|
|---|---|---|
|1|alter system set asm_diskstring = '/dev/mapper/asm*p1' scope = spfile;|Oracle|
|1|Stop CRS on all 4 nodes|Oracle|
|1|Start CRS on mysrvr33r|Oracle|
|1|Start CRS on mysrvr34r|Oracle|
|1|Start CRS on mysrvr35r|Oracle|
|1|Start CRS on mysrvr36r|Oracle|
|2|Stop CRS on all 4 nodes|Oracle|
|2|Remove ASMLIB RPMs on 33r 34r 35r 36r|Linux|
|2|Server reboot of mysrvr33r|Linux|
|2|Check cluster on mysrvr33r, all up|Oracle|
|2|Restart of mysrvr34r 35r 36r|Linux|
|2|Check 33r 34r 35r 36r|Oracle|
|3|Stop CRS on 34r 35r 36r|Oracle|
|3|Create spfile in ASM disk group on 33r|Oracle|
|3|Restart of CRS on 33r|Oracle|
|3|Check gpnp profile / spfile on 33r|Oracle|
|3|Restart of CRS on mysrvr34r 35r 36r|Oracle|
|3|Check gpnp profile / spfile on 33r to 36r|Oracle|
- I used three waves of activities. In the first wave, as preparation, I altered the asm_diskstring in the spfile of the ASM instances (of course not active until the next restart). After that I stopped the full cluster on all 4 nodes and started the first node to see the effect, and to confirm that the ASM instance and the database instances came up. Then the remaining nodes were started as well, and all was running well.
- In this action I worked together with the Linux admins. The boxes in scope run Red Hat, and the admins wanted to get rid of ASMLIB in the kernel as well. So in the second wave I shut down the full cluster one more time, they removed the RPMs from Linux and rebooted the first box, and all was well after that. After my checks the other three boxes were started in parallel, and the end result was a happy and running cluster again!
- In wave three I had to fix an old sore: these 4 boxes were still working with local spfiles instead of a shared spfile in an ASM disk group. When I built this cluster some three years ago it was born as an 11.1 cluster environment, and it had been set up with local copies of the spfile.
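The order of the first wave could be sketched as a small dry-run script. It only prints the commands and runs nothing. The node names and the alter system statement come from the plan; `crsctl stop crs`, `crsctl start crs` and `crsctl check crs` are the usual Grid Infrastructure commands run as root on each node, but connecting with `sqlplus / as sysasm` and the function name are my assumptions, not necessarily what was typed on the day.

```shell
#!/bin/sh
# Dry-run sketch of wave 1: it only prints the commands, it does not run them.
# Node names come from the plan; the connect method is an assumption.
NODES="mysrvr33r mysrvr34r mysrvr35r mysrvr36r"

print_wave1() {
    # Step 1: change the discovery string in the ASM spfile (active after restart).
    echo "sqlplus / as sysasm: alter system set asm_diskstring = '/dev/mapper/asm*p1' scope = spfile;"
    # Step 2: stop the clusterware stack on every node (as root).
    for node in $NODES; do
        echo "$node: crsctl stop crs"
    done
    # Step 3: start the nodes one at a time and check each before moving on.
    for node in $NODES; do
        echo "$node: crsctl start crs"
        echo "$node: crsctl check crs"
    done
}

print_wave1
```

Printing first and executing by hand keeps the controlled, node-by-node character of the approach: the first node is the canary, and the others only follow once ASM and the databases are confirmed up.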
Setting up a (lost) Spfile in ASM in a Grid infrastructure environment:
## My first thought was to stop the cluster, since I did that in another scenario. Much to my surprise I was recommended to do this in the running environment, so I did, after preparing a valid init.ora:
SQL> create spfile ='+CLUSTERDATA' from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
create spfile ='+CLUSTERDATA' from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW'
*
ERROR at line 1:
ORA-29780: unable to connect to GPnP daemon [CLSGPNP_ERR]
## As you can see, that did not work. Investigation turned up the following note: Environment Variable ORA_CRS_HOME MUST be UNSET in 11gR2/12c GI (Doc ID 1502996.1). I checked, and indeed that environment variable was present in my .profile:
#### It is bad practice to have ORA_CRS_HOME set in your .profile if you are using GI, so we unset it!
oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# unset ORA_CRS_HOME
oracle@mysrvr33r:/opt/oracle/+ASM1/admin/pfile [+ASM1]# echo $ORA_CRS_HOME

## After that the create of the spfile worked.
SQL> create spfile ='+CLUSTERDATA' from pfile = '/opt/oracle/+ASM1/admin/pfile/init+ASM1.NEW' ;
File created.
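The same check can be put into a small shell function to run before any spfile or gpnp work. This is a sketch using only standard shell features; the function name is hypothetical, and it has to be called in the current shell (not in a subshell) for the unset to take effect.

```shell
#!/bin/sh
# Sketch: make sure ORA_CRS_HOME is not set before working with Grid Infrastructure.
# The function name is hypothetical; only standard shell features are used.

ensure_ora_crs_home_unset() {
    # ${VAR+set} expands to "set" only when the variable exists, even if it is empty.
    if [ -n "${ORA_CRS_HOME+set}" ]; then
        echo "ORA_CRS_HOME was set to '${ORA_CRS_HOME}', unsetting it for this session"
        unset ORA_CRS_HOME
    else
        echo "ORA_CRS_HOME is not set"
    fi
}

ensure_ora_crs_home_unset
```

Note that this only cleans up the current session. The line that sets the variable in .profile still has to be removed, otherwise it is back at the next login.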
## Let's do some checks. In asmcmd it showed:
ASMCMD [+] > spget
+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299
## Second check using the gpnptool:
Warning: some command line parameters were defaulted. Resulting command line:
/opt/crs/product/112_ee_64/crs/bin/gpnptool.bin get -o-
http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="58" ClusterUId="2b7266b0d5797f65ff0fcf4c8e7931d6" ClusterName="mysrvr3_cluster" PALocation="">
<gpnp:Network-Profile>
SPFile="+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299"/>
<orcl:OCR-Profile id="ocr" OCRId="372102285"/>
http://www.w3.org/2000/09/xmldsig#">
<ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/>
<ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<ds:Reference URI="">
<ds:Transforms>
<ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
<ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/>
http://www.w3.org/2000/09/xmldsig#sha1"/>
3Tjuts50Gi92r42OMa4Pb17PiYc=
B941IphE6D1FqVhc1u/+NwhAM3QXbBRiMT0plxhXyptUnj4mu1T1UFP/5yG+yBIzblquOy4aqxNBthMy7aQW0lyS4QfMZbjWYhYH2nvbrnnyqY/ZoYXOY0QaAYciboALXxJxCzup6ZGxCnsgtT8G/b08z679j8NlMvykdE2pmWY=
## in asmcmd:
ls -l +CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299
[Option -p will be used ]
Type Redund Striped Time Sys Name
ASMPARAMETERFILE UNPROT COARSE MAR 11 23:00:00 Y registry.253.841967299
## alert log shows:
Tue Mar 11 23:48:18 2014
NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1
NOTE: updated gpnp profile ASM diskstring: /dev/mapper/asm*p1
## Looks good on the other servers as well (I checked them all and they showed similar output).
## As a next step I moved the existing spfile in $ORACLE_HOME/dbs out of the way:
mv spfile+ASM1.ora spfile+ASM1.ora.20140311.old
## Then I edited the init.ora to make it point to the ASM disk group as well.
## I restarted the cluster and after that I checked in the ASM instance.
SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      +CLUSTERDATA/mysrvr3_cluster/as
                                                 mparameterfile/registry.253.84
                                                 1967299
## checks are asmcmd spget and gpnptool get
## Copied the init.ora to the other boxes:
oracle@mysrvr33r:/opt/crs/product/112_ee_64/crs/dbs [+ASM1]# cat init+ASM1.ora
spfile='+CLUSTERDATA/mysrvr3_cluster/asmparameterfile/registry.253.841967299'
scp init+ASM1.ora oracle@mysrvr34r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM2.ora
scp init+ASM1.ora oracle@mysrvr35r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM3.ora
scp init+ASM1.ora oracle@mysrvr36r:/opt/crs/product/112_ee_64/crs/dbs/init+ASM4.ora
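With more nodes the copy step gets repetitive, so it could be written as a loop. The sketch below is a dry run that only prints the scp commands. Node names and paths are the ones from the session above; the function name and the numbering scheme (second node gets +ASM2, and so on) are my reading of it, so check the output before running anything for real.

```shell
#!/bin/sh
# Dry-run sketch: prints one scp command per remaining node instead of running it.
# Paths and node names are taken from the session above.
DBS_DIR=/opt/crs/product/112_ee_64/crs/dbs

print_copy_commands() {
    n=2   # the first node already has init+ASM1.ora, so numbering starts at 2
    for node in mysrvr34r mysrvr35r mysrvr36r; do
        echo "scp init+ASM1.ora oracle@${node}:${DBS_DIR}/init+ASM${n}.ora"
        n=$((n + 1))
    done
}

print_copy_commands
```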
Mission completed. As always, happy reading, and DO test before you implement!