Thoughts on Adding a Single Instance To Oracle Clusterware (Grid Infra).

General: Configuring Active/Passive Clustering for Oracle Database ( Single Instance in Crs)

The Oracle database can easily be configured to use the Cluster ware framework for high availability.

Using Grid Infrastructure to protect a database resource is a very cost-effective way of setting up an Active/passive cluster. As an added advantage, using only one vendor’s software stack to implement the cluster can make troubleshooting easier. Staff already familiar with RAC will easily be able to set up and run this configuration because it uses a same software stack: all commands and log files are in familiar locations, and troubleshooting does not rely on the input from other teams.

To set up an active/passive cluster with 11g Release 2, you need to initially install Grid Infrastructure on all nodes of the cluster. Grid Infrastructure will give the user with a cluster logical volume manager: ASM. If for some reason another file system is required, you can choose the supported cluster file systems, including ACFS. Using a cluster file system that is mounted concurrently to both cluster nodes offers the advantage of not having to remount the database files and binaries in case of a node failure. Some configurations we saw suffered from extended fail over periods caused by required file system checks before the file system could be remounted.

On top of the Grid Infrastructure build, you do a local installation of the RDBMS. It is important that you do not choose a cluster installation when prompted so; otherwise, you risk violating your license agreement with Oracle.

Continue reading

Steps to Recreate my Central Inventory in a Real Applications Cluster environment on Linux

Introduction:

In the last quarter of 2012 i have set up  a 4 Node Real Application Cluster on Linux which was great to do as always. In day-to-day business I had lost track of this environment  which in itself is not that bad of course. However this week I needed the environment as a preparation for patching so it would be mandatory to check with opatch last patch installed in the grid infra structure and in the rdbms homes that have been set up on those boxes. Well it was to be expected that other dbas would work on those boxes, maybe / likely might have installed new patches so it was really like getting back together with old friends to see what was under the hood.

Well it was a bit of disappointment that the inventory destination  (/opt/oracle/orInventory) was simply empty . Opatch did not like it either cause it  exited with a returncode 73. But then again on the bright side this meant i could take note of the necessary steps  performed  and turn it into a blog.

Continue reading

When Starting a Rac database brings you: ORA-29760: instance_number parameter not specified.

Introduction

Recently was supporting a colleague who was unable to start a Real Application Database after he had  created the environment by manual scripts. As i already mentioned in another post nothing wrong with exploring both methods ( either using DBCA (with care but hey any tool will come with a manual and  some shortcomings) or manual. Well if you decide to work manually  of course it will take maybe even  more discipline in setting up init.ora files , registering the database and its instances into the Grid Infra structure layer etc.  But generally speaking there is a challenge in either approach you choose. Well on to our case.  The Database would not start via the clusterware (with srvctl ) but manually  it was possible to start either of the two instances in this Rac.

Investigation:

Well first glimpse to see what might be going on:

SQL> select * from gv$instance;

INST_ID INSTANCE_NUMBER INSTANCE_NAME HOST_NAME VERSION STARTUP_TIME STATUS PAR THREAD# ARCHIVE LOG_SWITCH_WAIT
———- ————— —————- —————————————————————- —————– ——————- ———— — ———- ——- —————
LOGINS SHU DATABASE_STATUS INSTANCE_ROLE ACTIVE_ST BLO
———- — —————– —————— ——— —
1 1 MYDB1 server1hr 11.2.0.3.0 11.04.2013 16:14:31 OPEN YES 1 STARTED
ALLOWED NO ACTIVE PRIMARY_INSTANCE NORMAL NO

2 2 MYDB2 server2hr 11.2.0.3.0 11.04.2013 16:25:54 OPEN YES 2 STARTED
ALLOWED NO ACTIVE PRIMARY_INSTANCE NORMAL NO

That query  showed both instances running. 

Then the colleague tried:

srvctl start database -d  … 

PRCR-1079 : Failed to start resource ora.mydb.db

CRS-5017: The resource action “ora.mydb.db start” encountered the following error:
ORA-29760: instance_number parameter not specified
. For details refer to “(:CLSN00107:)” in “/opt/crs/product/11203_ee_64/crs/log/server1hr/agent/crsd/oraagent_oracle/oraagent_oracle.log”.

CRS-2674: Start of ‘ora.mydb.db’ on ‘server1hr’ failed
CRS-2632: There are no more servers to try to place resource ‘ora.mydb.db’ on that would satisfy its placement policy
CRS-5017: The resource action “ora.mydb.db start” encountered the following error:
ORA-29760: instance_number parameter not specified
. For details refer to “(:CLSN00107:)” in “/opt/crs/product/11203_ee_64/crs/log/server2hr/agent/crsd/oraagent_oracle/oraagent_oracle.log”.

CRS-2674: Start of ‘ora.mydb.db’ on ‘server2hr’ failed

Well then it was time to Google  for ORA-29760: instance_number parameter not specified

A Fellow blogger  mentioned the following Mos Note to solve this issue. And it once again proved that if you really settle for doing manual setup of you Rac environment you have to be well prepared and consistent in the naming of your parameters in you configuration file ( init.ora as a base for an spfile) and the way you enter services in the cluster layer. It once again proved that  names are registered in the way you enter them.  In plain English it is Case sensitive so  the old saying : what you type is what you get is very applicable.

MOS note ‘ORA-29760: instance_number parameter not specified’ When Starting the Database with Srvctl [ID 749515.1]

Happy reading and as always best of luck.

Mathijs.

11.2 Rac – Instance unwilling to start with Error ORA-00132

Introduction:

This week  had one of those days where after working together with customer and Oracle Consultant  one  of  the Rac Databases needed a restart due to a variable change.  Together with customer it was decided to restart second  instance first so first took care of the  environment by enabling a Maintenance mode so our monitoring tool would not strike me with tickets.

The second instance was   friendly:

  • the shutdown: srvctl stop instance -d -i went smooth .
  • the startup: srvctl start instance -d -i went smooth .

Then i  performed same steps on first Instance:

  • the shutdown: srvctl stop instance -d -i went smooth .
  • the startup: srvctl start instance -d -i   failed  with an error message that puzzled me a lot .

ORA-00132: syntax error or unresolved network name ‘myserver-scan:33000’

I checked my scan listener settings:

oracle@myserver-s1:/opt/oracle [CRS]#
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node myserver-s4
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node myserver-s2
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node myserver-s3

oracle@myserver-s1:/opt/oracle [CRS]# srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:33000
SCAN Listener LISTENER_SCAN2 exists. Port: TCP:33000
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:33000

Crs agent log showed same error and i could not start the instance via the  srvctl tool. The error suggested that my remote listener ( i am using scan ) was the issue.. But the  remote listener was not even part of  my tnsnames.ora, since my spfile value for the remote listener = myserver-scan:33000.

That was soo confusing. Well  I attempted to start the Instance via sqlplus  and that worked. I checked the cluster and it reported  ONLINE ONLINE on the resource so  the first pressure was off.

To solve it , make sure that  the sqlnet.ora is similar on all environments and it seems to need the ezconnect information in there  ( in names.directory_path).([ID 1318026.1])

Happy reading

Mathijs

Is it really the good – the – bad – and the Gui Dba building Racs?

Introduction

From my experience for some Database Administrators / sites it  still seem to be more tough /  hardcore to use the command line instead of using tools that  could do the same for you only with a lot less effort. And maybe that is also regarded to be a risk cause it is so easy. Well my thoughts on this is that a healthy mix on the use Gui – tools and command lines should  be an asset  to every administrator. And indeed before even starting debate when using the Gui of course you should always know what you are doing .. but then again that is always the case is it not ?

Creating an Rac manually – using DBCA.

Personally i think that the DBCA tool is  your best way forward if you are creating a new Rac environment. Of course  this tool also has also its limits ( such as on a 4 Node environment it will  by default set your cluster_database_instances default to 4 even when you build only a two node Rac)  but in general it is best way to work cause not only will create the Database , and the instances on the various Nodes  but it also adds the information to the 11G Grid infrastructure ( the Cluster ware ). And  for documentation purposes you can / should also always generate the create scripts in a default ( for me always /opt/oracle//admin/create) directory.

Yesterday i had to support a colleague who decided to create a two node RAC by hand . Indeed  even  when i suggested to him that using the DBCA would be helpful and fast in his actions it was his call to do it all by hand. But then he got stuck in the process with various actions.

If you decide to do  a create manually ( as they say maybe you should try to not use the Gui just cause it is written in this blog).  I think following steps are important:

  1. Work accurately ( setting up init oras , one for single Instance – Db) add the Instance specific information (in my case undo tablespace and local_listener ) were needed. Create spfile from that pfile.
  2. Start – create the single instance-db in full with the spfile.
  3. Once the single instance is alive start alter your spfile:
    1. alter system set cluster_database=true scope = spfile:
    2. alter system set instance_number= 1 scope = spfile sid=”instance1″ ;
    3. alter system set instance_number= 2 scope = spfile sid=”instance2″ ;
    4. alter system set thread = 1 scope = spfile sid=”instance1″ ;
    5. alter system set thread = 2 scope = spfile sid=”instance2″ ;
  4. Add at least the 3 redo groups to the second thread( alter database add logfile thread 2 group x:)
  5. set up the remote_listener with the scan listener
  6. alter system enable thread 2 ;
  7. Add a Taf service.
  8. As a check stop and start and stop both instances manually to see if the database is shared as it should be between the two instances.
  9. Add the information to the cluster:

srvctl  add database -d DB -o /opt/oracle/product/oracleVersion

srvctl  add instance -d DB -i DB1 -n Node1 srvctl  add instance -d DB -i DB2 -n Node2

srvctl modify instance -d DB -i DB1 -s +ASM1 srvctl modify instance -d DB -i DB2 -s +ASM2

srvctl  add service -d DB -s DB_TAF.domain -r DB1,DB2 srvctl start database -d DB srvctl start  service -d DB -s DB_TAF.domain

Ok, are  you still with me , do i rest my case here that it is a lot of work to do this the manual way ? And that these steps can be very  lets say error prone if you do not work 100%  accurate..

Yesterdays event where indeed not successful at first . Amongst other issues He got :

PRCR-1079 : Failed to start resource ora.db.db
CRS-5017: The resource action “ora.db.db start” met the following error:
ORA-29760: instance_number parameter not specified

Once again Google was our friend it showed post of a fellow blogger with regard to a Mos note:

MOS note ‘ORA-29760: instance_number parameter not specified’ When Starting the Database with Srvctl [ID 749515.1].

After that . Things worked .

Bottom-line of this story ? Plain and simple use the right and easiest tool for the Job you need to do. If that is a Gui  there is no dishonor in that . If you decide to go manual create , make sure you work accurately  hmm and have Google available for when things come up during the process.

Happy Reading ,

Mathijs

The return of ASM communication Issues ((ORA-01031: insufficient privileges) (WARNING: ASM communication(the aftermath)

Introduction

January  5th I wrote a post on  the issues we faced with ASM instance which would not let me log in as sqlplus /  as sysasm at specific point and time during which time alert log of the databases  on  that box would also be sending warnings to the alert log “.. ASM communication error”. With information on the web (Metalink)  a solution and a workaround had been offered and implemented.  For example on that specific box the oinstall  gid was lacking in the first place (primary os group is dba (oracle:dba) so I had th Linux colleague added the oinstall onthat box. And  as a workaround  I created a tnsnames entry  and connected via: sys@asm as sysasm that was also working well.  So at that point and time we all thought , case closed.

Well…… Not entirely cause the issue showed again recently and even though the workaround (using the connect string method was working)  I was not a happy Database Administrator with it.  I opened a Tar with Oracle  but  I was going in circles with it this time.

Work Info

Last Friday the Issue showed again on a box in one of the clusters. An internal mail was sent within our team about this and a very interesting clue came back from one of the Colleagues who had similar experience in different project. He came up with following information on MOS:

Troubleshooting ORA-1031: Insufficient Privileges While Connecting As SYSDBA [ID 730067.1]
UNIX: Checklist for Resolving Connect AS SYSDBA Issues [ID 69642.1]
UNIX: Diagnostic C program for ORA-1031 from CONNECT INTERNAL / AS SYSDBA [ID 67984.1]

Actually especially  last Note 67984.1 was very useful cause it showed  that during time of issue the gid  ( group Id ) was no longer valid due to an Ldap call.

With the Output of that note and the analyses after that it turned out that the NCSD daemon (http://www.linux.ncsu.edu/realm_linux/usersguide-EL4/ch04s06.php) might be part of the issue when something like that was queried on the OS:

# getent group dba
 101
 # getent group 5000
 dba
#getent group dba
 5000

When the Linux administrator configured the correct (exception) information in /etc/ldap.conf the problem vanished and the Phantom hunt ended.

Happy end

Bottom line of this:

  • Never believe in phantoms, thinks like described happen for a reason.
  • Always be willing to communicate with in the team and beyond cause communication might bring a so-called aha – Erlebnis (déjà vu).
  • Standardize, standardize,  standardize when you are using Ldap and local configurations cause you really let the ghost out of the machine otherwise.
  • A special thank  you to the colleagues who started the internal mail and to the one who shared his experiences with the team.

Happy  reading,

Mathijs

ASM communication Issues ((ORA-01031: insufficient privileges) (WARNING: ASM communication error: op 0 state 0x0 (15055))

Introduction:

A couple of months ago  I had the issue that in an 11.2.0.2 environment  with Grid Infra structure  and ASM on Linux , without any specific reason at a specific point and time would  see  messages  about communication error between the databases and  the ASM instance and  I was not able to connect to the ASM instance with a sqlplus / as sysasm. I have opened a tar back then and even got myself  a fresh bug 14767353 number but no answers. So  But practically this remained unsolved. Friday 4th  I had same issue again, so it was time to get back to arms and investigate this.

Below you will find the steps , what i saw and how I solved it with the help of my friends(Google and Metalink).

Environment: 4 Node GI 11.2.0.2.0 on Red Hat Linux  with ASM. Installations performed as oracle:dba

Continue reading

Food for thoughts Rac One a first Step to explore

Introduction.

Some months ago I have implemented an Active Passive Solution on Oracle 11gr2 for on one of the customers.  Basically this was almost reliving old days where  tools like HP MC Service Guard (using packages where one or more database and listener configurations are stored)  or  Veritas Clusterware  (where and Agent controls the status of the resources being a database and or a listener). They would  monitor  the health of the Database/Listener,  would stop and start the Database/ Listener. With the technology at hand at that point and time ( implemented 11.2.0.2 Grid Infra with ASM and 11.2.0.2 Rdbms )  it was not   having all the required   setups to implement anything else.  Now we are 9 months further in time,  standards have been adapted and we have latest Oracle version at hand ( being 11.2.0.3) so it is time for a fresh start. Besides that recently I have been asked to  implement the Active Passive solution on more environments (since it does cover  part off HA and it controls the monitoring  of the Database/ Instance  by HP-Openview).

Continue reading

Some Like it DBCA others prefer manual JVM Installs

Introduction

For a couple of databases  I had to  install the Java Virtual Machine after the Database had been created. First attempt was  using the DBCA  gui  to get that done so i started it and  made the obvious choices  to work with A Rac database , to configure the database options  and selected the  Database I wanted to work with ( in the list only databases showing no instances).  DBCA was not so nice to me this time, telling me: “service name or Instance name not specified”. Hmm  well since i knew i can do without GUI  anyhow i decided to take the manual route and leave DBCA  be  for the moment. So lets saddle up and do it the old  – still – going strong  command line way .

My environment is a Linux 5 enterprise edition, my Oracle version is 11.2.3.0. My Database is a two Instance Rac Database.

Manual Installation:

In order to  install the JVM  i decided it would be best to have the database  in an exclusive way so after contacting customer  and getting his okay  my first step was to shut down the Rac database. I had two screens open  one to tail the alertlog and in  the other in the Linux Prompt i had :

srvctl stop database -d MYDB

Alert log showed that the database was coming down smoothly so it was time to set up a script to get things working:

— Start of Installation the JVM manually.
— Important note this will only work this way if there is no JVM or part of JVM present in the database
— Investigate MOS if you suspect a former wrong installation  or partly installation before proceeding

spool InstallJvm.lst;
set echo on
-- connecting as sysdba cause this is required for this install 
-- and mount the database.
connect / as sysdba
startup mount

-- set_system_trig_enabled to false and open the database
alter system set "_system_trig_enabled" = false scope=memory;
alter database open;

-- 28= 'JAVA SOURCE', 29= 'JAVA CLASS', 
--30= 'JAVARESOURCE', 32='JAVA DATA'
select obj#, name 
from obj$ 
where type#=28 
or type#=29 
or type#=30 
or namespace=32;

@?/javavm/install/initjvm.sql
select count(*), object_type 
from all_objects 
where object_type like '%JAVA%' 
group by object_type;

@?/xdk/admin/initxml.sql
select count(*), object_type 
from all_objects 
where object_type 
like '%JAVA%' 
group by object_type;

@?/xdk/admin/xmlja.sql
select count(*), object_type 
from all_objects 
where object_type 
like '%JAVA%' 
group by object_type;

@?/rdbms/admin/catjava.sql
select count(*), object_type 
from all_objects 
where object_type like '%JAVA%' 
group by object_type;

@?/rdbms/admin/catexf.sql
select count(*), object_type 
from all_objects 
where object_type like '%JAVA%' 
group by object_type;

shutdown immediate
set echo off
spool off
exit
starting the database again via the cluster:
srvctl start database -d MYDB

##  Time to go and check Health of JVM in Oracle : 
select comp_name, version, status 
from dba_registry;

COMP_NAME                                    VERSION         STATUS 
-------------------------------------------- --------------- ----------- 
Oracle Expression Filter                     11.2.0.3.0      VALID 
Oracle Application Express                   4.1.1.00.23     VALID 
Oracle XML Database                          11.2.0.3.0      VALID 
Oracle Database Catalog Views                11.2.0.3.0      VALID 
Oracle Database Packages and Types           11.2.0.3.0      VALID 
JServer JAVA Virtual Machine                 11.2.0.3.0      VALID 
Oracle XDK                                   11.2.0.3.0      VALID 
Oracle Database Java Packages                11.2.0.3.0      VALID 

##More proof: 
select owner, status, count(*) 
from all_objects 
where object_type like '%JAVA%' 
group by owner, status;   

OWNER                STATUS  COUNT(*)
-------------------- ------- ---------- 
SYS                    VALID    21580 --> In 11GR2 approx. # of objects.
EXFSYS                 VALID       48 

select role from dba_roles where role like '%JAVA%'; 

ROLE
------------------------------ 
JAVAUSERPRIV 
JAVAIDPRIV 
JAVASYSPRIV 
JAVADEBUGPRIV 
JAVA_ADMIN 
JAVA_DEPLOY 
6 rows selected.  -->  in 11GR2 # of roles.

select owner,status, count(status) 
from dba_objects 
group by owner,status 
order by 1,2; 

OWNER                          STATUS  COUNT(STATUS) 
------------------------------ ------- ------------- 
APEX_040100                    VALID            3341 
APPQOSSYS                      VALID               5 
DBSNMP                         VALID              55 
ERUNBK11                       VALID               5 
EXFSYS                         VALID             191 
FLOWS_FILES                    VALID              13 
ORACLE_OCM                     VALID               8 
OUTLN                          VALID              10 
PUBLIC                         VALID           24187 
SYS                            VALID           31674 
SYSTEM                         VALID             563 
XDB                            VALID            1658 

## If needed consider utlrp to see if recompile is helpful.  
@?/rdbms/admin/utlrp

 

As always Happy reading. Mathijs