Following our test iSCSI setup without hardware, here is an example of a typical VCS/Oracle fail-over setup. This is on RHEL5.
Setup heartbeat network
You need 2 more virtual network cards on node1 and node2, preferably on separate logical networks:
If needed, re-run the VMware config script (/usr/bin/vmware-config.pl) to create 2 local 'host-only' subnets, 192.168.130.0 and 192.168.131.0, because I suspect LLT may not work on the same bridged network
Then add 2 more network cards to each VM
Assign the addresses (system-config-network). In our example we will use:
node1:
192.168.130.160/24 (eth1)
192.168.131.160/24 (eth2)
node2:
192.168.130.161/24 (eth1)
192.168.131.161/24 (eth2)
run system-config-network and set it up accordingly, then '/etc/init.d/network restart'
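for reference, the resulting RHEL5 interface config should look roughly like this (sketch for eth1 on node1; eth2 and node2 are analogous, HWADDR lines omitted):
# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.130.160
NETMASK=255.255.255.0
ONBOOT=yes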
from node1 perform basic tests:
ping 192.168.130.161
ping 192.168.131.161
VCS prerequisites
note: this is for VCS 5.1 on RHEL5.5; check the install manual if your versions differ
# yum install compat-libgcc compat-libstdc++ glibc-2.5 libgcc glibc libgcc libstdc++ java-1.4.2
append to /etc/hosts, for easier admin, on each node:
192.168.0.201 node1
192.168.0.202 node2
ssh keys:
ssh-keygen -t dsa (on each node)
node1# scp /root/.ssh/id_dsa.pub node2:/root/.ssh/authorized_keys2
node2# scp /root/.ssh/id_dsa.pub node1:/root/.ssh/authorized_keys2
Verify you can connect without password from node1 to node2, and the other way around
node1# ssh node2
Update .bash_profile
PATH=/opt/VRTS/bin:$PATH; export PATH
MANPATH=/usr/share/man:/opt/VRTS/man; export MANPATH
kernel panic
sysctl -w kernel.panic=10
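to make the setting survive reboots, it can also be appended to /etc/sysctl.conf, for example:
# echo "kernel.panic = 10" >> /etc/sysctl.conf
# sysctl -p | grep kernel.panic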
precheck (from the VCS cd, or tar.gz extracted):
./installvcs -precheck node1 node2
VCS install
./installvcs
choices:
I : install
1) Veritas Cluster Server (VCS)
3) Install all Veritas Cluster Server rpms - 322 MB required
Enter the 64 bit RHEL5 system names separated by spaces: [q,?] node1 node2
(enter a license key, or install without one for 60 days)
Would you like to configure VCS on node1 node2 [y,n,q] (n) y
Enter the unique cluster name: [q,?] vmclu160
Enter a unique Cluster ID number between 0-65535: [b,q,?] (0) 160
Enter the NIC for the first private heartbeat link on node1: [b,q,?] eth1
eth1 has an IP address configured on it. It could be a public NIC on node1.
Are you sure you want to use eth1 for the first private heartbeat link?
[y,n,q,b,?] (n) y
Is eth1 a bonded NIC? [y,n,q] (n)
Would you like to configure a second private heartbeat link? [y,n,q,b,?] (y)
Enter the NIC for the second private heartbeat link on node1: [b,q,?] eth2
eth2 has an IP address configured on it. It could be a public NIC on node1.
Are you sure you want to use eth2 for the second private heartbeat link?
[y,n,q,b,?] (n) y
Is eth2 a bonded NIC? [y,n,q] (n)
Would you like to configure a third private heartbeat link? [y,n,q,b,?] (n) n
Do you want to configure an additional low priority heartbeat link?
[y,n,q,b,?] (n) y
Enter the NIC for the low priority heartbeat link on node1: [b,q,?] (eth0)
Is eth0 a bonded NIC? [y,n,q] (n)
Are you using the same NICs for private heartbeat links on all systems?
[y,n,q,b,?] (y) y
Cluster information verification:
Cluster Name: vmclu160
Cluster ID Number: 160
Private Heartbeat NICs for node1:
link1=eth1
link2=eth2
Low Priority Heartbeat NIC for node1: link-lowpri=eth0
Private Heartbeat NICs for node2:
link1=eth1
link2=eth2
Low Priority Heartbeat NIC for node2: link-lowpri=eth0
Is this information correct? [y,n,q,b,?] (y) y
Virtual IP can be specified in RemoteGroup resource, and can be used to
connect to the cluster using Java GUI
The following data is required to configure the Virtual IP of the Cluster:
A public NIC used by each system in the cluster
A Virtual IP address and netmask
Do you want to configure the Virtual IP? [y,n,q,?] (n) y
Active NIC devices discovered on node1: eth0 eth1 eth2
Enter the NIC for Virtual IP of the Cluster to use on node1: [b,q,?] (eth0)
Is eth0 to be the public NIC used by all systems? [y,n,q,b,?] (y)
Enter the Virtual IP address for the Cluster: [b,q,?] 192.168.0.203
Enter the NetMask for IP 192.168.0.203: [b,q,?] (255.255.255.0)
Would you like to configure VCS to use Symantec Security Services? [y,n,q] (n)
Do you want to set the username and/or password for the Admin user
(default username = 'admin', password='password')? [y,n,q] (n) y
Enter the user name: [b,q,?] (admin)
Enter the password:
Enter again:
Do you want to add another user to the cluster? [y,n,q] (n) n
For this test setup, answer 'n' to the SMTP and Global Cluster questions, then let the installer start VCS
node1# hastatus -sum
-- SYSTEM STATE
-- System               State                Frozen

A  node1                RUNNING              0
A  node2                RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  node1                Y          N               ONLINE
B  ClusterService  node2                Y          N               OFFLINE
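optionally, verify that both LLT heartbeat links and the GAB membership are healthy (standard VCS utilities, output will vary):
node1# lltstat -nvv | head
node1# gabconfig -a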
Oracle binary install
Reference: Database Installation Guide for Linux
On each node:
# yum install gcc elfutils-libelf-devel glibc-devel libaio-devel libstdc++-devel unixODBC unixODBC-devel gcc-c++
# groupadd dba
# groupadd oinstall
# groupadd asmdba
# useradd -m oracle -g oinstall -G dba,asmdba
# passwd oracle
# cat >> /etc/security/limits.conf
oracle hard nofile 65536
oracle soft nofile 65536
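the 11gR2 installer prerequisite checks usually also want process and stack limits for the oracle user; the values below are the ones commonly quoted in the Oracle install guide, adjust to whatever your checks request:
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft stack 10240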
For RHEL5.5:
# cat >> /etc/sysctl.conf
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 4194304
fs.aio-max-nr = 1048576
# /sbin/sysctl -p
# mkdir /opt/oracle
# chown oracle.oinstall /opt/oracle/
As ORACLE user:
On this example setup, Oracle binaries are installed on both nodes, in /opt/oracle
extract the Oracle 11gR2 distribution, ensure you have an X connection, and run:
$ ./runInstaller
install database software only
single instance database installation
enterprise
(select options -> only partitioning)
oracle base=/opt/oracle/app/oracle
sw loc=/opt/oracle/app/oracle/product/11.2.0/dbhome_1
leave other defaults
$ cat >> ~/.bash_profile
export ORACLE_HOME=/opt/oracle/app/oracle/product/11.2.0/dbhome_1
export PATH=$ORACLE_HOME/bin:$PATH
Oracle instance install
mount the shared disk on /database (as root)
# mkdir /database
# chown oracle.dba /database (do these on both nodes)
# mount /dev/sdd1 /database/ (do this only on node1, to create the instance)
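if the shared iSCSI disk was not already partitioned and formatted in the previous post, that has to be done once before this mount, roughly:
node1# fdisk /dev/sdd (create a single primary partition, sdd1)
node1# mkfs.ext3 /dev/sdd1
node2# partprobe /dev/sdd (so node2 sees the new partition table)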
create the instance with dbca (as oracle)
$ dbca
create a test database, especially set:
Use common location for all database files: /database
$ export ORACLE_SID=DTST (and add this to oracle .bash_profile on both nodes)
$ sqlplus "/ as sysdba"
SQL> select * from dual;
SQL> shutdown immediate
copy spfile to the other node (as oracle):
$ scp /opt/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/spfileDTST.ora node2:/opt/oracle/app/oracle/product/11.2.0/dbhome_1/dbs
copy also the directory structure created for audit logs:
$ scp -r /opt/oracle/app/oracle/admin/DTST node2:/opt/oracle/app/oracle/admin/DTST
(it seems the /opt/oracle/app/oracle/diag/rdbms/dtst structure for traces etc. is created automatically)
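if dbca created a password file, copy it too, since node2 will need it for remote SYSDBA connections (the file name follows the orapw<SID> convention):
$ scp /opt/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/orapwDTST node2:/opt/oracle/app/oracle/product/11.2.0/dbhome_1/dbs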
set up $ORACLE_HOME/network/admin/listener.ora with the virtual IP we will use, here 192.168.0.204:
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.204)(PORT = 1521))
    )
  )

ADR_BASE_LISTENER = /opt/oracle/app/oracle

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = DTST)
      (ORACLE_HOME = /opt/oracle/app/oracle/product/11.2.0/dbhome_1)
      (SID_NAME = DTST)
    )
  )
and tnsnames.ora:
DTST =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.204)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = DTST)
    )
  )
copy both files to the other node (as oracle):
$ scp /opt/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/*.ora node2:/opt/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin
VCS service group config for Oracle
umount shared disk
# umount /database
update /etc/VRTSvcs/conf/config/main.cf and add the following service group:
group OraGroup (
        SystemList = { node1 = 0, node2 = 1 }
        AutoStartList = { node1, node2 }
        )

        DiskReservation DR_ora (
                Disks @node1 = { "/dev/sdd" }
                Disks @node2 = { "/dev/sdd" }
                FailFast = 1
                )

        Mount Mount_oraprod_dfiles (
                MountPoint = "/database"
                BlockDevice = "/dev/sdd1"
                FSType = ext3
                FsckOpt = "-n"
                )

        IP IP_oraprod (
                Device = eth0
                Address = "192.168.0.204"
                NetMask = "255.255.255.0"
                )

        NIC NIC_oraprod (
                Device = eth0
                )

        Netlsnr LSNR_oraprod_lsnr (
                Owner = oracle
                Home = "/opt/oracle/app/oracle/product/11.2.0/dbhome_1"
                TnsAdmin = "/opt/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin"
                Listener = LISTENER
                )

        Oracle ORA_oraprod (
                Sid = DTST
                Owner = oracle
                Home = "/opt/oracle/app/oracle/product/11.2.0/dbhome_1"
                Pfile = "/opt/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/initDTST.ora"
                StartUpOpt = STARTUP
                )

        IP_oraprod requires NIC_oraprod
        LSNR_oraprod_lsnr requires IP_oraprod
        LSNR_oraprod_lsnr requires ORA_oraprod
        Mount_oraprod_dfiles requires DR_ora
        ORA_oraprod requires Mount_oraprod_dfiles
note: this is a simple setup: no Veritas Volume Manager and no DetailMonitoring of the DB (i.e. a hung database is not detected as a failure)
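also make sure the top of main.cf includes the types definitions for the agents used here; with the bundled and Oracle enterprise agents installed it typically contains:
include "types.cf"
include "OracleTypes.cf"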
check:
# hacf -verify /etc/VRTSvcs/conf/config/
stop/start the cluster to re-read the main.cf (in prod we could use hacf -cftocmd .../config/ and run main.cmd etc.)
# hastop -all
# hastart
# hastart (on node2)
verify log while starting:
# tail -f /var/VRTSvcs/log/engine_A.log
# hastatus -sum
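if OraGroup does not come online by itself via the AutoStartList, it can be checked and brought online by hand:
# hagrp -state OraGroup
# hagrp -online OraGroup -sys node1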
Failover quick test
Simulate a problem on oracle
$ su - oracle
$ sqlplus "/ as sysdba"
SQL> shutdown immediate
and verify that fail-over works.
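independently of the crash test, a controlled switchover can also be forced from the VCS side to check that everything moves cleanly:
node1# hagrp -switch OraGroup -to node2
node1# hastatus -sum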
connect to the database on node2 for real and check that it is OPEN:
SQL> select * from dual;
D
-
X
SQL> select STATUS from V$instance;
STATUS
------------
OPEN
Of course we also have to test with an external client, using the VIP.
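for example, from a client that has the DTST entry above in its tnsnames.ora (scott/tiger is only a placeholder, use any valid account):
$ tnsping DTST
$ sqlplus scott/tiger@DTST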
Conclusion:
We have a working test cluster without special hardware, using the iSCSI target from the previous post.
At this point we can back up the 2 virtual machines (rac1, rac2), as well
as the file used for the iSCSI disk, and experiment at will...