1 Objectives

2 Convert machines from plain to drbd

If our virtual machine is going to be running some important services, we need to make it "fully redundant".

We can tell Ganeti to convert our machine from the 'plain' disk template (a single copy of the disk in the LVM store of the primary node) to 'drbd', where data is written at the same time on the primary and secondary nodes.

This is done with a single command, but if the machine is running we first need to shut it down temporarily.

(Remember: all gnt-* commands must be run on the MASTER node.)

# gnt-instance shutdown wordpressX

Waiting for job 277 for wordpressX ...

Wait 15-30 seconds, and check that the instance is down:

# gnt-instance list wordpressX
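Rather than re-running the list command by hand, the wait can be scripted. The following is only a sketch: the function name wait_for_down and the GNT_INSTANCE override variable are made up for illustration, and it assumes that the `status` field prints ADMIN_down exactly as the Status column does.

```shell
#!/bin/sh
# Sketch: poll an instance until Ganeti reports it as ADMIN_down.
# GNT_INSTANCE is a hypothetical override (handy for dry runs);
# by default the real gnt-instance command is used.
GNT_INSTANCE="${GNT_INSTANCE:-gnt-instance}"

# wait_for_down INSTANCE [TRIES] [DELAY_SECONDS]
# Returns 0 once the status is ADMIN_down, 1 if it never gets there.
wait_for_down() {
    instance="$1"
    tries="${2:-12}"
    delay="${3:-5}"
    while [ "$tries" -gt 0 ]; do
        # Print only the status field, without the header line.
        status=$("$GNT_INSTANCE" list --no-headers -o status "$instance" | tr -d '[:space:]')
        if [ "$status" = "ADMIN_down" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

On the master node you could then run: wait_for_down wordpressX && echo "instance is down"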

If the Status column of the gnt-instance list output shows ADMIN_down, you can now run the command to convert from plain to drbd:

# gnt-instance modify -t drbd -n hostY.ws.nsrc.org wordpressX

Notice that we provide, as a parameter to -n, the name of the node that we will be replicating to. This node should already be part of the cluster (we did this at the end of the Ganeti install lab). For this exercise we suggest you pick the next node in the cluster (e.g. if the primary is on host3 then pick host4, or if you are the last node in your cluster then wrap around to the first).
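The "next node, wrapping around" rule is simple modular arithmetic. As a small sketch (the function name next_node is made up, and it assumes the nodes are numbered 1 to TOTAL and named hostN.ws.nsrc.org):

```shell
#!/bin/sh
# Sketch: print the suggested secondary node for a given primary.
# Assumes nodes are numbered 1..TOTAL and named hostN.ws.nsrc.org.

# next_node CURRENT TOTAL
next_node() {
    current="$1"
    total="$2"
    # CURRENT % TOTAL is 0 for the last node, so adding 1 wraps to host1.
    next=$(( current % total + 1 ))
    echo "host${next}.ws.nsrc.org"
}
```

For example, in a six-node cluster, next_node 3 6 prints host4.ws.nsrc.org and next_node 6 6 prints host1.ws.nsrc.org.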

As soon as you run the gnt-instance modify command, the conversion begins, and you will see output similar to this:

Sat Jan 18 11:33:23 2014 Converting template to drbd
Sat Jan 18 11:33:23 2014 Creating additional volumes...
Sat Jan 18 11:33:23 2014 Renaming original volumes...
Sat Jan 18 11:33:24 2014 Initializing DRBD devices...
Sat Jan 18 11:33:26 2014  - INFO: Waiting for instance wordpressX to sync disks
Sat Jan 18 11:33:27 2014  - INFO: - device disk/0:  1.80% done, 1m 1s remaining (estimated)
Sat Jan 18 11:34:27 2014  - INFO: - device disk/0: 90.10% done, 6s remaining (estimated)
Sat Jan 18 11:34:34 2014  - INFO: - device disk/0: 99.30% done, 0s remaining (estimated)
Sat Jan 18 11:34:34 2014  - INFO: - device disk/0: 99.70% done, 0s remaining (estimated)
Sat Jan 18 11:34:34 2014  - INFO: - device disk/0: 100.00% done, 0s remaining (estimated)
Sat Jan 18 11:34:34 2014  - INFO: Instance wordpressX's disks are in sync
Modified instance wordpressX
 - disk_template -> drbd
Please don't forget that most parameters take effect only at the next (re)start of the instance initiated by ganeti; restarting from within the instance will not be enough.

While the synchronization is taking place, try running the command drbd-overview on either the primary or secondary node (not the master) to get another view of the synchronization progress.
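If drbd-overview is not available, the same progress figure can usually be read from /proc/drbd. The sketch below assumes the DRBD 8.x layout, where a resyncing device shows a line containing "sync'ed: NN.N%"; the function name sync_percent is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract resync progress from /proc/drbd-style text on stdin.
# Assumes the DRBD 8.x format, e.g.:
#     [==>.................] sync'ed: 17.4% (3384/4092)M
# Prints one percentage per resyncing device, and nothing when no
# resync is in progress.
sync_percent() {
    sed -n "s/.*sync'ed: *\([0-9.]*\)%.*/\1/p"
}
```

On the primary or secondary node this would be used as: sync_percent < /proc/drbd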

Note: if you do not want to wait until the disks are synchronized on both sides, add the --no-wait-for-sync option to the gnt-instance modify command. This lets you start the instance immediately. The tradeoff is that disk access will be slower until the sync finishes, and the machine's disk will not be fully replicated on both the primary and secondary nodes until then.

Time to restart the instance!

# gnt-instance start wordpressX

OK, how do we know that the replication is really taking place?

One way is to use the ifstat tool to see how much bandwidth is being used on our br-rep network, which DRBD uses to copy data from the primary to the secondary node. On your ganeti nodes:

# apt-get install ifstat

Once the ifstat tool is installed, run it to see the network traffic on your network interfaces:

# ifstat -i br-rep

      br-rep
 KB/s in  KB/s out
    0.00      0.00
    2.23     78.96
    0.00      0.00
    0.00      0.00

Now, log in to your guest VM, either at the console or via SSH, and create a large file (here 20MB of random data):

# dd if=/dev/urandom of=/tmp/test bs=1024k count=20; sync

Keep an eye on your host window, and watch the bandwidth usage on br-rep:

    0.00      0.00
   86.08   4886.99
  394.32  14893.37
  359.16  16467.22
   23.04    881.35

You should see the bandwidth utilization peak for a few seconds while the data you created is being replicated. If you miss it, just run the dd command again.
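If the peak scrolls past too quickly, the ifstat output can be filtered so that only the highest outbound rate is reported. This is a sketch that assumes the two-column "KB/s in / KB/s out" layout shown above; the function name peak_out is made up for illustration.

```shell
#!/bin/sh
# Sketch: read ifstat output for a single interface on stdin and
# print the highest value seen in the "KB/s out" column.
# Header lines are skipped because their second field is not a number.
peak_out() {
    awk '$2 ~ /^[0-9]+(\.[0-9]+)?$/ { if ($2 + 0 > max) max = $2 + 0 }
         END { printf "%.2f\n", max }'
}
```

For example, "ifstat -i br-rep 1 30 | peak_out" samples once per second for 30 seconds and then prints the peak, which gives you time to run the dd command in the guest.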

3 Migration!

Let's migrate the VM from the primary to the secondary node. You need to log in to the MASTER node to issue this command, even if the primary and secondary nodes for this VM are elsewhere.

root@hostX:~# gnt-instance migrate wordpressX

Instance wordpressX will be migrated. Note that migration might
impact the instance if anything goes wrong (e.g. due to bugs in the
hypervisor). Continue?
y/[n]/?: y

Sat Jan 18 04:37:05 2014 Migrating instance wordpressX
Sat Jan 18 04:37:05 2014 * checking disk consistency between source and target
Sat Jan 18 04:37:06 2014 * switching node hostY.ws.nsrc.org to secondary mode
Sat Jan 18 04:37:06 2014 * changing into standalone mode
Sat Jan 18 04:37:06 2014 * changing disks into dual-master mode
Sat Jan 18 04:37:07 2014 * wait until resync is done
Sat Jan 18 04:37:07 2014 * preparing hostY.ws.nsrc.org to accept the instance
Sat Jan 18 04:37:08 2014 * migrating instance to hostY.ws.nsrc.org
Sat Jan 18 04:37:10 2014 * starting memory transfer
Sat Jan 18 04:37:15 2014 * memory transfer complete
Sat Jan 18 04:37:15 2014 * switching node hostX.ws.nsrc.org to secondary mode
Sat Jan 18 04:37:16 2014 * wait until resync is done
Sat Jan 18 04:37:16 2014 * changing into standalone mode
Sat Jan 18 04:37:16 2014 * changing disks into single-master mode
Sat Jan 18 04:37:17 2014 * wait until resync is done
Sat Jan 18 04:37:17 2014 * done

root@hostX:~#

If you do this while you are connected to the guest using VNC, your VNC session will drop. However, if you are connected to the guest using SSH or the web, you should not see any interruption at all.

Verify that the instance is now running on the secondary node:

# gnt-instance list -o +network_port
Instance    Hypervisor OS   Primary_node      Status     Memory Network_port
wordpressX  kvm        noop hostY.ws.nsrc.org running      512M 11XXX

# gnt-instance list -o name,pnode,snodes
Instance    Primary_node      Secondary_Nodes
wordpressX  hostY.ws.nsrc.org hostX.ws.nsrc.org

Reconnect to the VNC console on the new primary node, using the port as listed in the output above (which will not have changed).
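The primary-node check can also be scripted, which is useful if you migrate back and forth several times. This is a sketch only: the function name primary_is and the GNT_INSTANCE override variable are made up for illustration, and the `pnode` field is the same one used in the list command above.

```shell
#!/bin/sh
# Sketch: succeed only if INSTANCE currently has EXPECTED_NODE as primary.
# GNT_INSTANCE is a hypothetical override (handy for dry runs);
# by default the real gnt-instance command is used.
GNT_INSTANCE="${GNT_INSTANCE:-gnt-instance}"

# primary_is INSTANCE EXPECTED_NODE
primary_is() {
    # Print only the primary node field, without the header line.
    actual=$("$GNT_INSTANCE" list --no-headers -o pnode "$1" | tr -d '[:space:]')
    [ "$actual" = "$2" ]
}
```

On the master node: primary_is wordpressX hostY.ws.nsrc.org && echo "migration succeeded"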

To migrate the instance back again, just repeat the command on the master node:

root@hostX:~# gnt-instance migrate wordpressX