Friday, October 26, 2012

Test HA (cluster, RAC...) without a shared disk array

In my job I work quite often with HA: Veritas cluster, Linux cluster, Oracle RAC...

But I don't always have a shared disk array available for tests; these are usually too expensive (10K+) to buy just for sandbox testing.

As a result, people lack practice with these HA setups, nobody dares to touch them, and nobody is confident when problems arise!

My solution is to use any host or VM as an iSCSI target acting as the shared array, and to build the whole HA setup on it:

Fake "shared array" configuration

Install "target" packages:

Use any host or VM running Linux. All the necessary packages are available in popular distributions:

# yum install scsi-target-utils (red hat and similar)

# apt-get install tgt (debian/ubuntu)

Create a fake device

If you don't have a partition handy, create a large file which will act as a device:

# dd if=/dev/zero of=/sharedarray/device1.bin bs=1024k count=1000
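If disk space on the test host is tight, a sparse file may be enough for a sandbox. The helper below is only a sketch: the name make_backing_file is made up for illustration, and it assumes GNU dd, where count=0 combined with seek extends the file without writing any data.

```shell
# Hypothetical helper: create a sparse backing file of SIZE_MB mebibytes.
# Usage: make_backing_file FILE SIZE_MB
make_backing_file() {
    file=$1
    size_mb=$2
    mkdir -p "$(dirname "$file")"
    # count=0 copies no blocks; seek sets the final size (assumes GNU dd).
    dd if=/dev/zero of="$file" bs=1024k count=0 seek="$size_mb" 2>/dev/null
}

# Same size as the dd command above:
# make_backing_file /sharedarray/device1.bin 1000
```

Blocks are allocated only when the initiators actually write to them, so the file starts out taking almost no space.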

Make it available through iSCSI:

Edit /etc/tgt/targets.conf to declare this file/device as a target.
Here you have to invent an IQN (iSCSI Qualified Name). For example, I did:

# vi /etc/tgt/targets.conf
    backing-store /sharedarray/device1.bin
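For reference, a target stanza in targets.conf wraps the backing-store line in a target block named after the IQN. The sketch below prints such a stanza; render_target_conf is a hypothetical helper, and the IQN in the example is invented purely for illustration (the usual convention is iqn.YYYY-MM.reversed-domain:identifier).

```shell
# Hypothetical helper: print a targets.conf stanza for one backing store.
# Usage: render_target_conf IQN BACKING_STORE
render_target_conf() {
    iqn=$1
    store=$2
    cat <<EOF
<target $iqn>
    backing-store $store
</target>
EOF
}

# Example with a made-up IQN (choose your own):
# render_target_conf iqn.2012-10.com.example:sharedarray.device1 \
#     /sharedarray/device1.bin >> /etc/tgt/targets.conf
```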

Restart the service:

# service tgt restart      (/etc/init.d/tgt restart in older Ubuntu versions)

Verify that something listens on port 3260 (the standard iSCSI port):

# netstat -plnt | grep 3260
tcp        0      0  *               LISTEN      6924/tgtd      
tcp6       0      0 :::3260                 :::*                    LISTEN      6924/tgtd   
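When the sandbox gets rebuilt often, the same check can be scripted. The helper below is a sketch under assumptions: has_listener is a hypothetical name, and it expects 'netstat -plnt' style output on stdin with the local address in column 4 and the state in column 6, as in the tcp6 line above.

```shell
# Hypothetical helper: succeed if stdin (netstat -plnt output) shows
# a socket in LISTEN state whose local address ends in :PORT.
# Usage: netstat -plnt | has_listener 3260
has_listener() {
    port=$1
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# netstat -plnt | has_listener 3260 && echo "tgtd is listening"
```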

Here we have our test shared array! Of course there is no RAID here, nor redundant network access, but this is enough for testing, even fail-over etc.
Next, let's connect the nodes...

Connect a first node

Install "initiator" packages:

The initiator is the "client" in iSCSI parlance.

# yum install iscsi-initiator-utils   (red hat)
# apt-get install open-iscsi-utils open-iscsi  (ubuntu - one for iscsiadm, one for the daemon)

Start and discover our array's device

# /etc/init.d/iscsi start  (red hat)
# /etc/init.d/iscsi-network-interface start  (ubuntu)

The following commands show what has been discovered; so far, nothing:
# iscsiadm -m node
# iscsiadm -m discovery -P 1

Let's discover all disks available at our address:
# iscsiadm -m discoverydb -t sendtargets -p --discover

OR we can do it like so, to specify the IQN:
# iscsiadm -m discovery -t sendtargets -p,1

In all cases, the info is now persisted in /var/lib/iscsi and visible with the following commands:
# iscsiadm -m node,1

# iscsiadm -m discovery -P 1
        Iface Name: default

Log in to the device

We can now log in (connect) to the drive:

# iscsiadm -m node --targetname "" --portal "" --login
Logging in to [iface: default, target:, portal:,3260]
Login to [iface: default, target:, portal:,3260]: successful
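The discovery and login steps can be wrapped in one small function, which helps when repeating them on the second node. This is a sketch, not a reference script: connect_node is a hypothetical name, and the PORTAL and IQN defaults are placeholders (a documentation-range address and an invented IQN) to replace with your own target host and the IQN defined in targets.conf.

```shell
# Placeholders to adapt: target host address:port and the IQN you invented.
PORTAL=${PORTAL:-192.0.2.10:3260}
IQN=${IQN:-iqn.2012-10.com.example:sharedarray.device1}

# Discover the targets offered by the portal, then log in to one of them.
connect_node() {
    iscsiadm -m discovery -t sendtargets -p "$PORTAL" &&
    iscsiadm -m node --targetname "$IQN" --portal "$PORTAL" --login
}

# Run as root on each node:
# connect_node
```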


And now the disk is visible:

# fdisk -l
Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 1048 MB, 1048576000 bytes
33 heads, 61 sectors/track, 1017 cylinders
Units = cylinders of 2013 * 512 = 1030656 bytes

Disk /dev/sdd doesn't contain a valid partition table

This survives reboot, with the default configuration on RHEL at least. Otherwise check /etc/iscsi/iscsid.conf and the doc in /usr/share/doc/iscsi-initiator-utils-

Second node

Repeat the above, and test through reboot.

Create a partition, and make sure it is visible on both sides

# fdisk /dev/sdd
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1017, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1017, default 1017): +900M

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Quick test

We can do a quick test, creating a filesystem and mounting it on each node in turn:

# mkfs.ext3 /dev/sdd1
# mount /dev/sdd1 /mnt
# echo "hello world" > /mnt/helloworld.txt
# umount /dev/sdd1

Check that we can mount and read it on the second node. (Note: ext3 is not a cluster filesystem, so make sure it is no longer mounted on node 1!)

If /dev/sdd1 is not seen at this point, you may need '/etc/init.d/iscsi restart' or a reboot.

# mount /dev/sdd1 /mnt
# cat /mnt/helloworld.txt
hello world
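Since ext3 gives no protection against a double mount, a quick check on node 1 before mounting on node 2 can save a corrupted filesystem. The helper below is a sketch: is_mounted is a hypothetical name, it only looks at the local mount table (it cannot see what the other node is doing), and its optional second argument exists just so it can be pointed at another file than /proc/mounts.

```shell
# Hypothetical helper: succeed if DEVICE is listed as a mount source.
# Usage: is_mounted DEVICE [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
is_mounted() {
    device=$1
    mounts=${2:-/proc/mounts}
    awk -v dev="$device" '$1 == dev { found = 1 } END { exit !found }' "$mounts"
}

# On node 1, before mounting on node 2:
# is_mounted /dev/sdd1 && echo "still mounted here, run umount first"
```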

To be continued...

We now have the most costly piece of a typical HA setup.

In the next posts, I'll do a fail-over Oracle/Veritas cluster setup, and later a RAC setup.




Additional notes, in case you need to rework the iSCSI config

Log out first:

  • either from all nodes of a portal:

# iscsiadm -m node --logout -p,3260
Logging out of session [sid: 1, target:, portal:,3260]
Logout of [sid: 1, target:, portal:,3260] successful.

  • or specific IQN:

# iscsiadm -m node --logout
Logging out of session [sid: 1, target:, portal:,3260]
Logout of [sid: 1, target:, portal:,3260] successful.

# iscsiadm -m session
iscsiadm: No active sessions.

and 'fdisk' doesn't show the disk anymore

Delete the entries:

# iscsiadm -m node -o delete
# iscsiadm -m node
iscsiadm: no records found!

and /var/lib/iscsi has been cleaned a bit (at least the IQN is removed)
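The logout and delete steps can be combined for a single target. As before, this is a sketch under assumptions: disconnect_node is a hypothetical name, and PORTAL and IQN are placeholders to replace with the values used at login.

```shell
# Placeholders to adapt: must match the portal and IQN used at login.
PORTAL=${PORTAL:-192.0.2.10:3260}
IQN=${IQN:-iqn.2012-10.com.example:sharedarray.device1}

# Close the session, then remove the persisted node record.
disconnect_node() {
    iscsiadm -m node --targetname "$IQN" --portal "$PORTAL" --logout &&
    iscsiadm -m node --targetname "$IQN" --portal "$PORTAL" -o delete
}

# Unmount any filesystem on the device first, then run as root:
# disconnect_node
```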

From there we can re-create a new config (for example, a larger device).

