Thursday, 28 March 2013

Trace File Analyzer (TFA) for Oracle cluster databases

Trace File Analyzer (TFA) for Oracle cluster databases

Monday, 25 March 2013

Wednesday, 13 March 2013

Create ASM device on the NFS mounted file system

Recently I was running some tests in my lab environment consisting of two Oracle databases on separate RHEL 6.2 VMs. The databases were configured in a Data Guard configuration and used ASM as storage. However, I ran into an issue where my standby database was becoming out of sync with the primary database because the standby was out of disk space. At the time I couldn't request additional storage for my standby (that's what happens when you work at 1:00 am while everyone works normal hours) and I needed to complete my testing. The standby server had some NFS mounted shares that had plenty of space however. This is where social interaction is of benefit because I can recall having several discussions with fellow colleagues on twitter about choosing to use NFS as ASM devices and under what conditions one would choose such a route. Yury also wrote up a blog post about it.

This seemed like a good opportunity to test it out. Only this time I will be mixing ASM block devices on a cooked filesystem along with ASM raw devices (these were actually created on top of a NFS datastore but presented to the guests as LUNs).

Please note this is for educational and testing purposes only. I wouldn't recommend doing this in your production environment. Use at your own risk.

Mount the NFS volume with the correct mount options 
[root@el6 ~]#  mount -o rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,nfsvers=3,timeo=600 /oradata

Check the existing ASM devices
[grid@el6 ~]$ asmcmd lsdsk -G FLASH

Create a new ASM device on the NFS mounted file system 
[grid@el6 ~]$ dd if=/dev/zero of=/oradata/nfs_asm-disk1 bs=1K count=1000000
1000000+0 records in
1000000+0 records out
1024000000 bytes (1.0 GB) copied, 547.629 s, 1.9 MB/s

Check the value of the ASM_DISKSTRING initialization parameter. This tells ASM where to look for ASM devices. 
[grid@el6 ~]$ sqlplus / as sysasm
SQL*Plus: Release Production on Thu Jan 17 02:37:46 2013
Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Automatic Storage Management option
SQL> show parameter asm_diskstring
------------------------------------ ----------- ------------------------------
asm_diskstring      string
SQL> alter system set asm_diskstring='/dev/oracleasm/disks/*','/oradata/nfs_asm-disk1';
System altered.
SQL> alter diskgroup flash add disk '/oradata/nfs_asm-disk1';
Diskgroup altered.

List the ASM devices in the diskgroup. Here you'll see both the LUN and new device on the NFS volume in the same diskgroup.
[grid@el6 ~]$ asmcmd lsdsk -G FLASH

Note: If you see errors similar to the following in your alert log, you may need to change the ownership and permissions on your NFS mounted ASM devices so that they are owned by the "ASM" user.
SUCCESS: disk FLASH_0001 (1.3915921253) added to diskgroup FLASH
Thu Jan 17 02:47:45 2013
Direct NFS: write FAILED 13
Errors in file /u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_mmon_4037.trc:
ORA-17500: ODM err:KGNFS WRITE FAIL:Permission denied
WARNING: Write Failed. group:1 disk:1 AU:4 offset:114688 size:16384
Errors in file /u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_mmon_4037.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 1 logical extent 0 of file 256 in group 1 on disk 1 allocation unit 4 
Errors in file /u01/app/oracle/diag/rdbms/orcl1/orcl1/trace/orcl1_mmon_4037.trc:
ORA-00206: error in writing (block 15, # blocks 1) of control file
ORA-00202: control file: '+FLASH/orcl1/controlfile/current.256.768325217'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
minact-scn: useg scan erroring out with error e:221

Now your ASM will be playing nicely together with both NFS and LUNs.

ORA-00020: maximum number of processes (%s) exceeded

Ever received this error message in your alert log? I have on a couple of occasions. Perhaps an application server started spawning more database connections than it normally does in turn creating more processes on the database. The Oracle description of this error is:
Error: ORA 20 Text: maximum number of processes exceeded  
Cause: An operation requested a resource that was unavailable. The maximum number of processes is specified by the initialization parameter PROCESSES. When this maximum is reached, no more requests are processed. Action: Try the operation again in a few minutes. If this message occurs often, shut down Oracle, increase the PROCESSES parameter in the initialization parameter file, and restart Oracle.

But what if you cannot connect to your database to shut it down and increase the parameter?

oradba01t[labdb01]-/home/oracle/>sqlplus  /as sysdba
SQL*Plus: Release - Production on Mon Jun 6 11:25:05 2011
Copyright (c) 1982, 2008, Oracle.  All rights reserved.
ORA-00020: maximum number of processes (%s) exceeded

I always like to look at the alert log to check for any additional details.

Using adrci I tailed the alert log and sure enough we had our ORA-00020 error all over.
2011-06-06 11:17:36.042000 -05:00
ORA-00020: No more process state objects available
ORA-20 errors will not be written to the alert log for
 the next minute. Please look at trace files to see all
 the ORA-20 errors.
Process m000 submission failed with error = 20
2011-06-06 11:24:35.878000 -05:00
ORA-00020: maximum number of processes 0 exceeded
ORA-20 errors will not be written to the alert log for
 the next minute. Please look at trace files to see all
 the ORA-20 errors.
adrci> exit

How do I connect to the database to increase my processes parameter? Well I remember Tanel Poder blogged about "How to log on even when SYSDBA can't do so?"

oradba01t[labdb01]-/home/oracle/>sqlplus -prelim "/as sysdba"

SQL*Plus: Release - Production on Mon Jun 6 11:29:54 2011

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


I was able to log on to the instance without getting the error message. Now to see if I can shut it down and start it up to increase the parameter.

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
ORA-01012: not logged on
Process ID: 0
Session ID: 0 Serial number: 0

It appears that the shutdown abort was successful but the mount command did not complete. Disconnect from the previously connected session and logon normally to start the instance.
oradba01t[labdb01]-/home/oracle>sqlplus / as sysdba

SQL*Plus: Release - Production on Mon Jun 6 11:33:07 2011

Copyright (c) 1982, 2008, Oracle.  All rights reserved.

Connected to an idle instance.

SQL>startup mount
ORACLE instance started.

Total System Global Area 4175568896 bytes
Fixed Size                  2160352 bytes
Variable Size            3489663264 bytes
Database Buffers          671088640 bytes
Redo Buffers               12656640 bytes
Database mounted.

Now we can go ahead and increase our processes parameter. Actually I opted not to increase the parameter since I knew the root cause of the processes being exceeded. This was a Enterprise Manager Grid Control repository and due to some OMS processes had caused the database processes to be exceeded.

Tuesday, 5 March 2013

Brief of Adding node to Oracle RAC Cluster 11.2

Assume , node as been prepared with pre-requisite like user,group,kernel,ntp,ssh,vip,ip,/etec/hosts etc.

We should now be in a position to add the node, so lets run the cluster verification utility from one of the existing nodes, to make sure everything is correct:

node1_oracle$> /tmp/oracle/grid/ stage -pre nodeadd -n racnode3

Make sure this returns a successful message.

We can now add our node into Oracle Clusterware, so from the existing node 1, run:

node1_oracle$> /home/grid/app/11.2.0/grid/oui/bin/ “CLUSTER_NEW_NODES={racnode3}” “CLUSTER_NEW_VIRTUAL_HOSTNAMES={racnode3-vip}” -silent

We now need to run the root scripts on node3:

node3#> /home/oracle/app/oraInventory/
node3#> /home/grid/app/11.2.0/grid/

Make sure the node is added, by using the cluster verification tool on node 1:

node1_oracle$> /tmp/oracle/grid/ stage -post nodeadd -n racnode3

This should return as successful.

We now need to extend the Oracle Home to the third node, so from the first node:

node1_oracle$> /home/oracle/app/oracle/product/11.2.0/dbhome_1/oui/bin/ -silent “CLUSTER_NEW_NODES={racnode3}”

We now need to run the root scripts on node3:

node3#> /home/oracle/app/oracle/product/11.2.0/dbhome_1/

We now need to add the instance to the node. So we run dbca on node 1:

oracle_node1$> export ORACLE_HOME=/home/oracle/app/oracle/product/11.2.0/dbhome_1
oracle_node1$> export PATH=$PATH:$ORACLE_HOME/bin
oracle_node1$> dbca &

Select Oracle RAC database
click on Next
Select Instance Management
click on Next
Select Add an Instance
click on Next
Make sure TESTDB is selected, enter sys for the username and enter the password
click on Next
click on Next
make sure the TESTDB3 is shown along with racnode3
click on Next
click on Finish
click on OK

Now use the cluster verification tool to confirm the addition:

oracle_node1$> export ORACLE_HOME=/home/oracle/app/oracle/product/11.2.0/dbhome_1
oracle_node1$> /tmp/oracle/grid/ comp admprv -o db_config -d $ORACLE_HOME -n racnode1,racnode2,racnode3

You should now have completed the addition of a third node into your cluster.