- Again without any specific hardware (everything runs on a simple PC)
Interconnect network:
In RAC parlance this is the private network between the nodes (the heartbeat network).
On each virtual machine I added a 'Host-only network' interface.
In my case I took the 192.168.160.0/24 network and changed the interfaces from DHCP to static as follows (using system-config-network, for example):
RAC1: eth1 192.168.160.101 / 255.255.255.0
RAC2: eth1 192.168.160.102 / 255.255.255.0
#service network restart
RAC1 $ ping 192.168.160.102
OK
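Optionally, the private addresses can also be given names in /etc/hosts on both nodes (the -priv names are just an illustrative convention; the rest of this post uses the IPs directly):
# cat >> /etc/hosts
192.168.160.101 rac1-priv
192.168.160.102 rac2-priv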
Users & system setup
Reference: Database Installation Guide for Linux
On each node:
# yum install gcc elfutils-libelf-devel glibc-devel libaio-devel libstdc++-devel unixODBC unixODBC-devel gcc-c++
# groupadd dba
# groupadd oinstall
# groupadd asmdba
# useradd -m oracle -g oinstall -G dba,asmdba
# passwd oracle
# cat >> /etc/security/limits.conf
oracle hard nofile 65536
oracle soft nofile 65536
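The installation guide also lists process limits; I did not need them for this small setup, but if the prerequisite check complains, these are the documented values:
oracle soft nproc 2047
oracle hard nproc 16384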
# cat >> /etc/sysctl.conf
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 4194304
fs.aio-max-nr = 1048576
# /sbin/sysctl -p
# mkdir /opt/oracle
# chown oracle.oinstall /opt/oracle/
SSH connection without password
RAC1# su - oracle
RAC1$ ssh-keygen -b 2048
(press Enter for an empty passphrase)
Repeat on RAC2 (this creates the .ssh directory and the private/public key pair)
RAC1$ scp .ssh/id_rsa.pub rac2:/home/oracle/.ssh/authorized_keys
RAC2$ chmod 600 /home/oracle/.ssh/authorized_keys
RAC2$ scp .ssh/id_rsa.pub rac1:/home/oracle/.ssh/authorized_keys
RAC1$ chmod 600 /home/oracle/.ssh/authorized_keys
Then ssh works without password from one node to the other
RAC1$ ssh rac2
OUI (the Oracle Universal Installer) also needs passwordless SSH from each node to itself, so on each node we also append our own public key:
$ cd ~/.ssh && cat id_rsa.pub >> authorized_keys
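A quick sanity check, run as oracle on each node; it should print both hostnames without any password prompt:
$ for h in rac1 rac2; do ssh $h hostname; done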
Choose some config names and IPs
In my case:
cluster name = raccluster
public hostname1 = rac1 192.168.0.201
public hostname2 = rac2 192.168.0.202
virtual hostname1 = rac1-vip 192.168.0.211
virtual hostname2 = rac2-vip 192.168.0.212
virtual IP: racvip 192.168.0.203
SCAN addresses: rac-scan 192.168.0.213 192.168.0.214 192.168.0.215
(defined through DNS; see my DNS post if, like me, you forgot how...)
# vi /etc/nsswitch.conf
hosts: dns files
# service nscd restart
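For reference, a rough sketch of the corresponding A records in the DNS zone (zone name and TTLs omitted; the matching PTR records are needed in the reverse zone too, see the DNS post for the real setup):
rac1      IN A 192.168.0.201
rac2      IN A 192.168.0.202
rac1-vip  IN A 192.168.0.211
rac2-vip  IN A 192.168.0.212
rac-scan  IN A 192.168.0.213
rac-scan  IN A 192.168.0.214
rac-scan  IN A 192.168.0.215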
Create some directories:
# mkdir -p /u01/app/11.2.0/grid
# chown -R oracle:oinstall /u01/app/11.2.0/grid
# mkdir -p /u01/app/oracle/
# chown -R oracle:oinstall /u01/app/oracle/
# chmod -R 775 /u01/app/oracle/
NTPD:
ntpd is needed and it needs the special slewing option (-x):
# vi /etc/sysconfig/ntpd
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
# service ntpd restart
# chkconfig ntpd on
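A quick check that ntpd is running and actually talking to its servers:
# ntpq -p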
Setup ASMLib
In this example we use ASM (alternatives are OCFS2, GFS...). We install ASMLib, which is just the lower-level software (kernel driver and low-level utilities); the rest of ASM is installed through the 'grid' software.
rac1 & rac2:
wget http://oss.oracle.com/projects/oracleasm/dist/files/RPMS/rhel5/x86/2.0.5/2.6.18-238.el5/oracleasm-2.6.18-238.el5-2.0.5-1.el5.i686.rpm
wget http://oss.oracle.com/projects/oracleasm-support/dist/files/RPMS/rhel5/x86/2.1.7/oracleasm-support-2.1.7-1.el5.i386.rpm
wget http://download.oracle.com/otn_software/asmlib/oracleasmlib-2.0.4-1.el5.i386.rpm
rpm -i oracleasm-support-2.1.7-1.el5.i386.rpm oracleasmlib-2.0.4-1.el5.i386.rpm oracleasm-2.6.18-238.el5-2.0.5-1.el5.i686.rpm
ASMLib configuration (note: the documentation is missing the '-i' option):
# /usr/sbin/oracleasm configure -i
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: done
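The disk handed to createdisk below is a partition on the shared iSCSI LUN from the previous post; assuming it appears as /dev/sdd on both nodes, create a single partition on it first (on one node only) and load the driver if it is not already loaded:
rac1# fdisk /dev/sdd   (n, p, 1, accept the defaults, w)
rac1# /usr/sbin/oracleasm init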
rac1# /usr/sbin/oracleasm createdisk ASMDISK1 /dev/sdd1
rac1# /usr/sbin/oracleasm listdisks
ASMDISK1
rac2# /usr/sbin/oracleasm scandisks
rac2# /usr/sbin/oracleasm listdisks
ASMDISK1
I can see the ASM disk on both nodes. Good!
Grid Installation
The grid software contains ASM and Oracle Clusterware. In this test setup I used the same 'oracle' user (with hindsight I should have used a separate 'grid' user; it is much cleaner to separate the grid/clusterware from the database itself).
$ export ORACLE_BASE=/u01/app/oracle/
$ export ORACLE_HOME=/u01/app/11.2.0/grid
$ ./runInstaller
I met this error: [INS-40910] Virtual IP: entered is invalid.
This is misleading; in my case it was due to bad reverse DNS resolution...
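A quick way to check both forward and reverse resolution before retrying (the host command comes with bind-utils):
$ host rac1-vip
$ host 192.168.0.211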
Then run the root script as root on each node; it starts a bunch of services and uses the ASM disk:
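With the default inventory location the invocation looks something like this (rac1 first, then rac2):
rac1# /u01/app/oraInventory/orainstRoot.sh
rac1# /u01/app/11.2.0/grid/root.sh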
ohasd is starting
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2672: Attempting to start 'ora.DATA.dg' on 'rac1'
CRS-2672: Attempting to start 'ora.registry.acfs' on 'rac1'
Verifications:
$ ./crsctl check cluster -all
**************************************************************
rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
Oracle processes after clusterware + ASM (grid) install
Oh man, Oracle it works but it is not really lightweight... We haven't installed any real DB yet!
root 8521 1 0 07:47 ? 00:00:00 /bin/sh /etc/init.d/init.ohasd run
root 8544 1 0 07:47 ? 00:00:04 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
root 9842 1 0 07:48 ? 00:00:01 /u01/app/11.2.0/grid/bin/orarootagent.bin
oracle 10624 1 0 07:51 ? 00:00:03 /u01/app/11.2.0/grid/bin/oraagent.bin
oracle 10639 1 0 07:51 ? 00:00:00 /u01/app/11.2.0/grid/bin/mdnsd.bin
oracle 10651 1 0 07:51 ? 00:00:00 /u01/app/11.2.0/grid/bin/gipcd.bin
oracle 10662 1 0 07:51 ? 00:00:01 /u01/app/11.2.0/grid/bin/gpnpd.bin
root 10677 1 0 07:51 ? 00:00:01 /u01/app/11.2.0/grid/bin/cssdmonitor
root 10694 1 0 07:51 ? 00:00:01 /u01/app/11.2.0/grid/bin/cssdagent
oracle 10696 1 0 07:51 ? 00:00:00 /u01/app/11.2.0/grid/bin/diskmon.bin -d -f
oracle 10715 1 0 07:51 ? 00:00:03 /u01/app/11.2.0/grid/bin/ocssd.bin
root 10792 1 0 07:52 ? 00:00:00 /u01/app/11.2.0/grid/bin/octssd.bin
oracle 10852 1 0 07:52 ? 00:00:00 asm_pmon_+ASM1
oracle 10854 1 0 07:52 ? 00:00:00 asm_vktm_+ASM1
oracle 10858 1 0 07:52 ? 00:00:00 asm_gen0_+ASM1
oracle 10860 1 0 07:52 ? 00:00:00 asm_diag_+ASM1
oracle 10862 1 0 07:52 ? 00:00:00 asm_ping_+ASM1
oracle 10864 1 0 07:52 ? 00:00:00 asm_psp0_+ASM1
oracle 10866 1 0 07:52 ? 00:00:00 asm_dia0_+ASM1
oracle 10868 1 0 07:52 ? 00:00:00 asm_lmon_+ASM1
oracle 10870 1 0 07:52 ? 00:00:00 asm_lmd0_+ASM1
oracle 10873 1 0 07:52 ? 00:00:00 asm_lms0_+ASM1
oracle 10877 1 0 07:52 ? 00:00:00 asm_lmhb_+ASM1
oracle 10879 1 0 07:52 ? 00:00:00 asm_mman_+ASM1
oracle 10881 1 0 07:52 ? 00:00:00 asm_dbw0_+ASM1
oracle 10883 1 0 07:52 ? 00:00:00 asm_lgwr_+ASM1
oracle 10885 1 0 07:52 ? 00:00:00 asm_ckpt_+ASM1
oracle 10887 1 0 07:52 ? 00:00:00 asm_smon_+ASM1
oracle 10889 1 0 07:52 ? 00:00:00 asm_rbal_+ASM1
oracle 10891 1 0 07:52 ? 00:00:00 asm_gmon_+ASM1
oracle 10893 1 0 07:52 ? 00:00:00 asm_mmon_+ASM1
oracle 10895 1 0 07:52 ? 00:00:00 asm_mmnl_+ASM1
oracle 10897 1 0 07:52 ? 00:00:00 /u01/app/11.2.0/grid/bin/oclskd.bin
oracle 10900 1 0 07:52 ? 00:00:00 asm_lck0_+ASM1
root 10912 1 0 07:52 ? 00:00:08 /u01/app/11.2.0/grid/bin/crsd.bin reboot
oracle 10928 1 0 07:52 ? 00:00:01 /u01/app/11.2.0/grid/bin/evmd.bin
oracle 10930 1 0 07:52 ? 00:00:00 asm_asmb_+ASM1
oracle 10932 1 0 07:52 ? 00:00:00 oracle+ASM1_asmb_+asm1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
root 10958 1 0 07:52 ? 00:00:00 /u01/app/11.2.0/grid/bin/oclskd.bin
oracle 10960 1 0 07:52 ? 00:00:01 oracle+ASM1_ocr (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 11017 10928 0 07:52 ? 00:00:00 /u01/app/11.2.0/grid/bin/evmlogger.bin -o /u01/app/11.2.0/grid/evm/log/evmlogger.info -l /u01/app/11.2.0/grid/evm/log/evmlogger.log
oracle 11220 1 0 07:53 ? 00:00:02 /u01/app/11.2.0/grid/bin/oraagent.bin
root 11388 1 0 07:53 ? 00:00:11 /u01/app/11.2.0/grid/bin/orarootagent.bin
oracle 11415 1 0 07:53 ? 00:00:00 /u01/app/11.2.0/grid/opmn/bin/ons -d
oracle 11416 11415 0 07:53 ? 00:00:00 /u01/app/11.2.0/grid/opmn/bin/ons -d
oracle 11467 1 0 07:53 ? 00:00:03 /u01/app/11.2.0/grid/jdk/jre//bin/java -Doracle.supercluster.cluster.server=eonsd -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/u01/app/11.2.0/grid/srvm/admin/logging.properties -classpath /u01/app/11.2.0/grid/jdk/jre//lib/rt.jar:/u01/app/11.2.0/grid/jlib/srvm.jar:/u01/app/11.2.0/grid/jlib/srvmhas.jar:/u01/app/11.2.0/grid/jlib/supercluster.jar:/u01/app/11.2.0/grid/jlib/supercluster-common.jar:/u01/app/11.2.0/grid/ons/lib/ons.jar oracle.supercluster.impl.cluster.EONSServerImpl
oracle 11609 1 0 07:53 ? 00:00:00 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN2 -inherit
oracle 11620 1 0 07:54 ? 00:00:00 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN3 -inherit
oracle 12474 1 0 08:05 ? 00:00:00 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
ASM peek:
With ORACLE_SID set to "+ASM1" and $ORACLE_HOME/bin in the PATH:
$ asmcmd
ASMCMD> ls
DATA/
ASMCMD> du
Used_MB Mirror_used_MB
263 263
...
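For a slightly more detailed peek, assuming the OSASM group was left as dba during the grid install (so OS authentication works):
$ sqlplus / as sysasm
SQL> select name, state, total_mb, free_mb from v$asm_diskgroup;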
Database install
Again with the same 'oracle' user, but with a different ORACLE_HOME:
$ export ORACLE_BASE=/u01/app/oracle/
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ ./runInstaller
-Create & configure
-Server Class
-RAC type
-Typical
-Storage type ASM, and location on DATA
-Global name: DTST
It went OK except for two problems.
Problem 1)
ORA-00845: MEMORY_TARGET not supported on this system (actually not enough shm!)
Added to /etc/fstab:
shmfs /dev/shm tmpfs size=1200m 0 0
and did it manually:
# mount -t tmpfs shmfs -o size=1200m /dev/shm
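To confirm the new size took effect:
# df -h /dev/shm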
Problem 2)
A strange error:
CRS-5804: Communication error with agent process
CRS-2632: There are no more servers to try to place resource 'ora.dtst.db' on that would satisfy its placement policy
RAC2$ ./srvctl status database -d DTST
Instance DTST1 is running on node rac1
Instance DTST2 is not running on node rac2
Tried to restart, expecting to see the error...
$ ./srvctl stop database -d DTST
$ ./srvctl start database -d DTST
$ ./srvctl status database -d DTST
Instance DTST1 is running on node rac1
Instance DTST2 is running on node rac2
but it went OK this time. I should have investigated this, but skipped it for now...
Verifications
The documentation suggests this:
$ cd /u01/app/11.2.0/grid/bin
$ ./crsctl status resource -w "TYPE co 'ora'" -t
what an intuitive command!!!
(alternative: "./crsctl stat resource" is less nice, but I have trouble remembering the other one)
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.eons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.registry.acfs
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE rac1
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE rac1
ora.dtst.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
ora.oc4j
1 OFFLINE OFFLINE
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
ora.scan2.vip
1 ONLINE ONLINE rac1
ora.scan3.vip
1 ONLINE ONLINE rac1
Proper RAC shutdown
# ./crsctl stop cluster -all
"This command attempts to gracefully stop resources managed by Oracle
Clusterware while attempting to stop the Oracle Clusterware stack."Conclusion
Ok this was simplistic, but we do have our test RAC system working without any special hardware, using the iSCSI target from the previous post.
At this point we can back up the two virtual machines (rac1, rac2), as well as the file used for the iSCSI disk, and experiment at will...