Missing or Corrupted SPFILE for the ASM Instance in RAC

Introduction:

The old saying goes: always expect the unexpected. Well, this time was another proof of that. While patching an 8-node cluster we came across two issues on the first node, both requiring a workaround. The first issue was that after applying the January 2019 patch the cluster did not start; that workaround is not part of this note. The second issue was that, once the first workaround was in place, the ASM instance on the first node would not start. This note explains the steps we followed to create a new spfile for the ASM instance(s) in a RAC cluster.

General information:

This is the scenario we found ourselves in: rolling patching had started on the first node with the January 2019 patch on Oracle 12.2 (GI and RDBMS). The first node was patched, but CRS would not start (and, to be honest, I never liked opatchauto a lot). Oracle Support provided a workaround, but after that the ASM instance still would not start. Comparing environments revealed one very significant difference in a memory setting on this cluster (most likely a MEMORY_* parameter, while we are using HugePages on that cluster), which prevented ASM from starting once the patching on the first node had completed.

As a workaround we created a pfile (which I altered), and once the cluster on node one was up we started the ASM instance with that pfile.

However, we could no longer update the spfile for the other ASM instances, since Oracle told us that you cannot make changes to the spfile in rolling upgrade mode. That meant that we performed the patching on all 8 nodes, and once CRS was up on a node we had to start ASM with a copy of the init.ora we had used on the first node. In our case that left us with 8 nodes each holding a local copy of the init.ora, which did not make us happy at all. That brought us to the scenario below, in which you bring ASM back to using an spfile.

Important note: since Oracle 11.2 the GPnP profile is the key for such a change!

Drawing on the old days, I came up with the scenario below: create a pfile, alter that file to meet your needs, and bring it back as an spfile for the ASM instance:

Scenario

  • Can be used with a missing or corrupted spfile.
  • Can be used with an existing spfile that has wrong settings which you cannot alter because patching has already started (spfile updates are prohibited in a rolling upgrade scenario). So it is probably best practice to analyse the spfile before patching.
  • Important: this scenario requires that the FULL cluster stack is down, and you work on ONE node only!
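On the point about analysing the spfile before patching: here is a minimal sketch, assuming you have first dumped the ASM spfile to a text pfile (for example with `create pfile='/tmp/asm_check.ora' from spfile;` in the ASM instance). The function name `list_memory_params` and the file path are made up for illustration. It only lists the MEMORY_TARGET style parameters, since a MEMORY_* setting is what we suspected in our case (Automatic Memory Management cannot use HugePages on Linux).

```shell
#!/usr/bin/env bash
# List memory-related parameters found in a text pfile so they can be
# reviewed before patching starts.
# NOTE: function name and file path are hypothetical, for illustration only.
list_memory_params() {
    local pfile="$1"
    # Match lines such as "*.memory_target=..." or "+ASM1.memory_max_target=...",
    # case-insensitively, skipping lines where a '#' precedes the parameter.
    grep -i -E '^[^#]*memory_(max_)?target' "$pfile"
}
```

Usage would be something like `list_memory_params /tmp/asm_check.ora || echo "no MEMORY_* settings found"`; grep returns non-zero when nothing matches.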

#### With spget in asmcmd you can check the current location of the spfile.

ASMCMD [+] > spget

+VOTE/mysrvr18cl/ASMPARAMETERFILE/registry.253.978015605

#### Created and altered the pfile on the first node (and copied it to all other nodes during the workaround).

oracle@mysrvr1dr:/app/grid/product/12201/grid/dbs []# cd /app/oracle/admin/+ASM1/pfile

oracle@mysrvr1dr:/app/oracle/admin/+ASM1/pfile []# ls -ltr

total 4

-rw-r--r--. 1 oracle dba 2433 Feb  8 09:23 initASM.ora

##### Starting the cluster, first attempt. I recalled that the cluster needed to be in some kind of restricted mode for this, so the whole cluster was stopped and then the command below was issued. Oracle showed mercy and told me the correct syntax:

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl start crs restrict

Parse error:

  ‘restrict’ is an invalid argument

Usage:

  crsctl start crs [-excl [-nocrs | -cssonly]] | [-wait | -waithas | -nowait] | [-noautostart]

     Start OHAS on this server

where

     -excl        Start Oracle Clusterware in exclusive mode

     -nocrs       Start Oracle Clusterware in exclusive mode without starting CRS

     -nowait      Do not wait for OHAS to start

     -wait        Wait until startup is complete and display all progress and status messages

     -waithas     Wait until startup is complete and display OHASD progress and status messages

     -cssonly     Start only CSS

     -noautostart Start only OHAS

## Then started the cluster in exclusive mode. That failed too, since it tried to start the ASM instance, which was still using the original spfile with the incorrect settings.

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl start crs  -excl

  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr1dr’
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr1dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr1dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr1dr’
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr1dr’ succeeded
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr1dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr1dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr1dr’ succeeded
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr1dr’
  • CRS-2674: Start of ‘ora.asm’ on ‘mysrvr1dr’ failed
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr1dr’
  • ORA-15077: could not locate ASM instance serving a required diskgroup
  • CRS-2674: Start of ‘ora.storage’ on ‘mysrvr1dr’ failed
  • CRS-2679: Attempting to clean ‘ora.storage’ on ‘mysrvr1dr’
  • CRS-2681: Clean of ‘ora.storage’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.ctssd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.ctssd’ on ‘mysrvr1dr’ succeeded
  • CRS-4000: Command Start failed, or completed with errors.
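With this much output it is easy to miss which resource actually failed. A small sketch that pulls out only the resources reported with CRS-2674 (start failed) is shown here. It assumes the plain ASCII single quotes that crsctl itself prints, and the function name `failed_starts` is made up for this example.

```shell
#!/usr/bin/env bash
# Print the names of resources whose start failed (message CRS-2674),
# reading crsctl output from stdin.
# NOTE: the function name is hypothetical; the pattern assumes ASCII quotes.
failed_starts() {
    sed -n "s/.*CRS-2674: Start of '\([^']*\)' on '[^']*' failed.*/\1/p"
}
```

You could use it as `./crsctl start crs -excl 2>&1 | tee /tmp/crs_start.log | failed_starts`, which keeps the full log in a file and prints just the failing resource names.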

## Next attempt: with the whole cluster down, on the first node I performed:

  • mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl start crs -excl -nocrs
  • ## In this -nocrs mode we were now able to start the +ASM1 instance manually.
  • SQL> STARTUP PFILE='/app/oracle/admin/+ASM1/pfile/initASM.ora';
  • ## Once ASM had started, create the new SPFILE
  • SQL> create spfile='+VOTE' from pfile='/app/oracle/admin/+ASM1/pfile/initASM.ora';
  • ## The alert log showed the following, which is good since it shows that the GPnP profile was updated accordingly
  • 2019-02-08T16:31:32.839547+01:00
  • NOTE: updated gpnp profile ASM SPFILE to
  • NOTE: header on disk 0 advanced to format #2 using fcn 0.0
  • NOTE: header on disk 2 advanced to format #2 using fcn 0.0
  • NOTE: updated gpnp profile ASM diskstring: /dev/mapper/ASM_*
  • NOTE: updated gpnp profile ASM diskstring: /dev/mapper/ASM_*
  • 2019-02-08T16:31:34.381619+01:00
  • NOTE: updated gpnp profile ASM SPFILE to +VOTE/mysrvr18cl/ASMPARAMETERFILE/registry.253.999707493
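To keep the order of the steps straight during a stressful maintenance window, the sequence above can be written down as a dry-run checklist. This is only a sketch: it prints the commands and does not execute anything. The paths and the disk group are the ones from this post, and the function name `print_plan` is made up.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the recovery steps from this post, executes nothing.
# NOTE: paths and disk group are taken from this post; adjust to your site.
GRID_HOME=/app/grid/product/12201/grid
PFILE=/app/oracle/admin/+ASM1/pfile/initASM.ora
DISKGROUP=+VOTE

print_plan() {
    cat <<EOF
# 1. as root, with the full cluster stack down on ALL nodes:
$GRID_HOME/bin/crsctl start crs -excl -nocrs
# 2. as the Grid owner, inside: sqlplus / as sysasm
STARTUP PFILE='$PFILE';
CREATE SPFILE='$DISKGROUP' FROM PFILE='$PFILE';
# 3. verify the new spfile location:
asmcmd spget
# 4. as root, restart the stack normally:
$GRID_HOME/bin/crsctl stop crs
$GRID_HOME/bin/crsctl start crs
EOF
}

print_plan
```

Printing rather than executing is a deliberate choice here: steps 1 and 4 run as root and step 2 runs as the Grid owner, so a single script would need to switch users anyway.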

##### Checked our activities  in ASMCMD:

  • oracle@mysrvr1dr:/app/oracle/admin/+ASM1/pfile [+ASM1]# asmcmd
  • [Option  -p will be used ]
  • ASMCMD [+] > spget
  • +VOTE/mysrvr18cl/ASMPARAMETERFILE/registry.253.999707493

### Checked our activities in gpnptool

oracle@mysrvr1dr:/app/oracle/admin/+ASM1/pfile [+ASM1]# gpnptool get

Warning: some command line parameters were defaulted. Resulting command line:

         /app/grid/product/12201/grid/bin/gpnptool.bin get -o-

<?xml version="1.0" encoding="UTF-8"?><gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="7" ClusterUId="afc024ecfd5ffff8ffbeda0a212bebe1" ClusterName="mysrvr18cl" PALocation=""><gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="198.19.11.0" Adapter="bond0" Use="public"/><gpnp:Network id="net2" IP="192.168.10.0" Adapter="eth3" Use="asm,cluster_interconnect"/><gpnp:Network id="net3" IP="192.168.11.0" Adapter="eth5" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile><orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/><orcl:ASM-Profile id="asm" DiscoveryString="/dev/mapper/ASM_*" SPFile="+VOTE/mysrvr18cl/ASMPARAMETERFILE/registry.253.999707493" Mode="remote" Extended="false"/><ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/><ds:Reference URI=""><ds:Transforms><ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/><ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"> <InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>QH9UPO559zhufkrc7tFxQts6oF0=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>aL2hOnxyLt5YwMcPjGg8LUDx2KD97Y75eLv+

yqvcfQ5O705K8ceQPCnwnsTs4Wn5E1jNeYCEzXnrVp5zM3hMbz9LdEEP2GKk9XJInQprWc39z7JKxm4uEw

NX3Ocs54FqxP1JdBX7PRiMh/

ePd8CoJIVtIaVMD29giX078uGwXcQ=</ds:SignatureValue></ds:Signature></gpnp:GPnP-Profile>
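The profile comes out as one long line of XML, so finding the SPFile attribute by eye is no fun. Below is a small sketch that extracts just that attribute from the ASM-Profile element. It assumes the profile uses plain ASCII double quotes, as gpnptool prints it, and the function name `gpnp_spfile` is made up for this example.

```shell
#!/usr/bin/env bash
# Print the SPFile attribute of the orcl:ASM-Profile element from a GPnP
# profile read on stdin. Lines without that element produce no output.
# NOTE: the function name is hypothetical; ASCII double quotes are assumed.
gpnp_spfile() {
    sed -n 's/.*<orcl:ASM-Profile[^>]*[[:space:]]SPFile="\([^"]*\)".*/\1/p'
}
```

Something like `gpnptool get -o- | gpnp_spfile` should then print the same path that `spget` reports in asmcmd; if the two differ, the profile was not updated.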

### Since we had started the cluster with -excl -nocrs, it was time to stop the cluster and start it normally

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl stop crs

  • CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.crsd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.crsd’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.crf’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.drivers.acfs’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.gpnpd’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.mdnsd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.drivers.acfs’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.crf’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.gpnpd’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.ctssd’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.storage’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.storage’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.mdnsd’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.ctssd’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.cssd’ on ‘mysrvr1dr’

### Starting the cluster on the first node in normal mode

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl start crs

### Had a small issue, so decided to stop the cluster on node 1 with the force option

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl stop crs  -f

  • CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.mdnsd’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.gpnpd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.mdnsd’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.gpnpd’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.ctssd’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.evmd’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.asm’ on ‘mysrvr1dr’
  • CRS-2673: Attempting to stop ‘ora.drivers.acfs’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.drivers.acfs’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.ctssd’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.evmd’ on ‘mysrvr1dr’ succeeded
  • CRS-2677: Stop of ‘ora.asm’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.cluster_interconnect.haip’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.cssd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.cssd’ on ‘mysrvr1dr’ succeeded
  • CRS-2673: Attempting to stop ‘ora.gipcd’ on ‘mysrvr1dr’
  • CRS-2677: Stop of ‘ora.gipcd’ on ‘mysrvr1dr’ succeeded
  • CRS-2793: Shutdown of Oracle High Availability Services-managed resources on ‘mysrvr1dr’ has completed
  • CRS-4133: Oracle High Availability Services has been stopped.

#### Time to start the cluster in normal mode for all nodes

mysrvr1dr:root:/root $ cd /app/grid/product/12201/grid/bin

mysrvr1dr:root:/app/grid/product/12201/grid/bin $ ./crsctl start cluster -all

  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr6dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr6dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr4dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr2dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr3dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr8dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr2dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr4dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr7dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr3dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr8dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr7dr’
  • CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘mysrvr5dr’
  • CRS-2672: Attempting to start ‘ora.evmd’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr4dr’ succeeded
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr8dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr4dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr4dr’
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr2dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr3dr’ succeeded
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr6dr’ succeeded
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr7dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr2dr’
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr6dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr2dr’
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr3dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr6dr’
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr7dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr3dr’
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr4dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr7dr’
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr8dr’ succeeded
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr2dr’ succeeded
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr6dr’ succeeded
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr3dr’ succeeded
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr7dr’ succeeded
  • CRS-2676: Start of ‘ora.cssdmonitor’ on ‘mysrvr5dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.cssd’ on ‘mysrvr5dr’
  • CRS-2672: Attempting to start ‘ora.diskmon’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.diskmon’ on ‘mysrvr5dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr6dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr2dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr4dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr8dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr3dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr7dr’ succeeded
  • CRS-2676: Start of ‘ora.evmd’ on ‘mysrvr5dr’ succeeded
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr8dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr8dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr2dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr2dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr2dr’
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr5dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr5dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr8dr’ succeeded
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr2dr’ succeeded
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr7dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr7dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr7dr’
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr5dr’ succeeded
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr4dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr4dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr4dr’
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr3dr’ succeeded
  • CRS-2676: Start of ‘ora.cssd’ on ‘mysrvr6dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr3dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr3dr’
  • CRS-2672: Attempting to start ‘ora.ctssd’ on ‘mysrvr6dr’
  • CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘mysrvr6dr’
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr7dr’ succeeded
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr8dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr8dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr4dr’ succeeded
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr3dr’ succeeded
  • CRS-2676: Start of ‘ora.ctssd’ on ‘mysrvr6dr’ succeeded
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr2dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr2dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr2dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr2dr’
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr8dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr8dr’
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr8dr’ succeeded
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr2dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr2dr’
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr5dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr5dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr2dr’ succeeded
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr5dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr5dr’
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr7dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr7dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr7dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr7dr’
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr5dr’ succeeded
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr7dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr7dr’
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr6dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr6dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr6dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr6dr’
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr4dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr4dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr4dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr4dr’
  • CRS-2676: Start of ‘ora.cluster_interconnect.haip’ on ‘mysrvr3dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.asm’ on ‘mysrvr3dr’
  • CRS-2676: Start of ‘ora.asm’ on ‘mysrvr3dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.storage’ on ‘mysrvr3dr’
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr7dr’ succeeded
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr6dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr6dr’
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr4dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr4dr’
  • CRS-2676: Start of ‘ora.storage’ on ‘mysrvr3dr’ succeeded
  • CRS-2672: Attempting to start ‘ora.crsd’ on ‘mysrvr3dr’
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr6dr’ succeeded
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr4dr’ succeeded
  • CRS-2676: Start of ‘ora.crsd’ on ‘mysrvr3dr’ succeeded
  • CRS-4690: Oracle Clusterware is already running on ‘mysrvr1dr’ (this is fine, since we kept the cluster running on node 1)

CRS-4000: Command Start failed, or completed with errors.

### Checks performed:

On each node: ps -ef|grep d.bin

On each node: crsctl stat res -t -init

On a node: crsctl check cluster -all
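With 8 nodes the output of the last check gets long, so here is a sketch that filters it down to the problem lines. It assumes the usual layout of `crsctl check cluster -all` (a line with the node name followed by a colon, then one CRS- message per service, each ending in "is online" when healthy). That layout, the sample message in the usage note, and the function name `offline_services` are assumptions for this example.

```shell
#!/usr/bin/env bash
# Read "crsctl check cluster -all" output on stdin and print every CRS
# message that does not end in "is online", prefixed with its node name.
# Returns 1 if at least one such line was found, 0 otherwise.
# NOTE: function name is hypothetical; the output layout is an assumption.
offline_services() {
    awk '
        # A line like "mysrvr1dr:" starts the block for one node.
        /^[^ ]+:$/ { node = substr($0, 1, length($0) - 1); next }
        # Any CRS message that is not an "is online" line is reported.
        /^CRS-[0-9]+:/ && !/is online$/ { print node ": " $0; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

A run such as `crsctl check cluster -all | offline_services && echo "all services online"` then stays quiet apart from the final message when everything is up, and otherwise prints lines like `mysrvr2dr: CRS-4535: Cannot communicate with Cluster Ready Services`.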

Happy reading,

And till next time

Mathijs

ASM Instance Not Starting after Cluster Node Reboot

Introduction.

I was once again involved in a situation where the RAC cluster did not start after a reboot of the server during a maintenance window. As always, that was a true challenge. In such cases the alert log of the node and the ohasd log will be your best friends (well, together with Metalink and Google, of course).

Details:

After an OS patching action on one of the nodes of one of my 11.2 RACs (Grid Infrastructure) I was contacted: could I please take a look, because the clusterware was not starting. A first investigation showed that this statement was not entirely true. The clusterware itself had been started, but the ohasd log file showed that it was not able to start the ASM resource due to ORA-01031: insufficient privileges.

This is what it showed:

## /opt/crs/product/11.2.0.2_a/crs/log/Mysrvr1r/ohasd [+ASM1]# view ohasd.log
2013-08-06 11:05:01.643: [    AGFW][1980881216] {0:0:2} Received the reply to the message: RESOURCE_CLEAN[ora.asm 1 1] ID 4100:411 from the agent /opt/crs/product/11.2.0.2_a/crs/bin/oraagent_oracle
2013-08-06 11:05:01.644: [    AGFW][1980881216] {0:0:2} Agfw Proxy Server sending the reply to PE for message:RESOURCE_CLEAN[ora.asm 1 1] ID 4100:410
2013-08-06 11:05:01.644: [   CRSPE][1991387456] {0:0:2} Received reply to action [Clean] message ID: 410
2013-08-06 11:05:01.644: [   CRSPE][1991387456] {0:0:2} Got agent-specific msg: ORA-01031: insufficient privileges
2013-08-06 11:05:01.646: [    AGFW][1980881216] {0:0:2} Received the reply to the message: RESOURCE_CLEAN[ora.asm 1 1] ID 4100:411 from the agent /opt/crs/product/11.2.0.2_a/crs/bin/oraagent_oracle
2013-08-06 11:05:01.646: [    AGFW][1980881216] {0:0:2} Agfw Proxy Server sending the reply to PE for message:RESOURCE_CLEAN[ora.asm 1 1] ID 4100:410
2013-08-06 11:05:01.646: [   CRSPE][1991387456] {0:0:2} Received reply to action [Clean] message ID: 410
2013-08-06 11:05:01.829: [    AGFW][1980881216] {0:0:2} Received the reply to the message: RESOURCE_CLEAN[ora.asm 1 1] ID 4100:411 from the agent /opt/crs/product/11.2.0.2_a/crs/bin/oraagent_oracle
2013-08-06 11:05:01.829: [    AGFW][1980881216] {0:0:2} Agfw Proxy Server sending the last reply to PE for message:RESOURCE_CLEAN[ora.asm 1 1] ID 4100:410
2013-08-06 11:05:01.829: [   CRSPE][1991387456] {0:0:2} Received reply to action [Clean] message ID: 410
2013-08-06 11:05:01.829: [   CRSPE][1991387456] {0:0:2} RI [ora.asm 1 1] new internal state: [STABLE] old value: [CLEANING]
2013-08-06 11:05:01.829: [   CRSPE][1991387456] {0:0:2} CRS-2681: Clean of 'ora.asm' on 'Mysrvr1r' succeeded

That did not look all too good. My first step was to try connecting to the ASM instance on that box via sqlplus (sqlplus / as sysasm). That also returned ORA-01031: insufficient privileges.

I had to giggle, because when looking for that message on the web I ended up at my own blog. Which proves once again that by sharing in the Oracle community you help yourself as well as others. Basically I focused on three Metalink notes that might apply:

Troubleshooting ORA-1031: Insufficient Privileges While Connecting As SYSDBA [ID 730067.1]

UNIX: Checklist for Resolving Connect AS SYSDBA Issues [ID 69642.1]

UNIX: Diagnostic C program for ORA-1031 from CONNECT INTERNAL / AS SYSDBA [ID 67984.1]

The third note (67984.1) was my bingo! It proved that my group id (dba) had been changed from 101 to some other value by an LDAP lookup. I asked the Linux colleague to disable these lookups, and after that the ASM instance started, and all the instances as well. As a workaround, they added the oracle user to nss_initgroups_ignoreusers in /etc/ldap.conf to prevent this from happening again.
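A quick way to spot this kind of problem after OS patching, before digging into the Metalink notes, is to check whether the OS still reports the expected numeric group id for the oracle user. The following is a minimal sketch; the function name `user_has_gid` is made up, and the user oracle and gid 101 are simply the values from this post.

```shell
#!/usr/bin/env bash
# Return 0 if the given numeric gid is among the group ids that the OS
# currently resolves for the user (primary and supplementary), 1 otherwise.
# NOTE: the function name is hypothetical, for illustration only.
user_has_gid() {
    local user="$1" gid="$2" g
    # "id -G" prints all numeric group ids of the user, separated by spaces.
    for g in $(id -G "$user"); do
        [ "$g" = "$gid" ] && return 0
    done
    return 1
}
```

On the affected box a call such as `user_has_gid oracle 101 || echo "oracle no longer has gid 101, check the LDAP/NSS configuration"` would have flagged the changed group id straight away.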

Happy reading,

Mathijs