We are going to simulate a number of failure situations, and recover from them.

Try to replicate the scenarios on your own hosts.

1 Scenario 1: Loss of Secondary Node

1.1 Part 1: Loss of network connectivity

1.1.1 Description

H2 is currently the primary node for our debian instance, while H1 is the cluster master. We will simulate the loss of network connectivity to H2 by shutting the node down (from the cluster's point of view the effect is the same), fail the debian instance over to H1, and finally bring H2 back into the cluster once it has been repaired.

1.1.2 Process

On the master node (H1), check where the debian instance is running:

# gnt-instance list -o name,pnode,snodes,status

Instance Primary_node   Secondary_Nodes Status
debian   H2.ws.nsrc.org H1.ws.nsrc.org  running

Now shut down the node H2. On H2:

# halt -p

Back on the master (H1), list the instances again:

# gnt-instance list -o name,pnode,snodes,status

Instance Primary_node   Secondary_Nodes Status
debian   H2.ws.nsrc.org H1.ws.nsrc.org  ERROR_nodedown

As you will notice, commands are now quite slow. Let's start by marking H2 as offline:

# gnt-node modify -O yes H2.ws.nsrc.org

Modified node H2.ws.nsrc.org
 - master_candidate -> False
 - offline -> True

It will take a little while, but now most commands will run faster, as Ganeti stops trying to contact the node that has been marked offline.

Try running gnt-instance list and gnt-node list again.

Also re-run gnt-cluster verify
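
If you want to double-check the new flags on H2 (an optional extra, not part of the original exercise), gnt-node list can display them directly; the field names below should match those accepted by gnt-node list -o (see the gnt-node man page if your version differs):

# gnt-node list -o name,offline,master_candidate

H2 should now be listed as offline and no longer a master candidate.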

1.1.3 Recovery

If you attempt to migrate, you will be told:

# gnt-instance migrate debian

Failure: prerequisites not met for this operation:
error type: wrong_state, error details:
Can't migrate, please use failover: Error 7: Failed connect to 10.10.0.X:1811; No route to host
# gnt-instance failover debian

Note: it is possible that the failover will succeed, but in case you see this message:

Sat Jan 18 20:57:55 2014 Failover instance debian
Sat Jan 18 20:57:55 2014 * checking disk consistency between source and target
Failure: command execution error:
Disk 0 is degraded on target node, aborting failover

... you will need to force the operation, using the --ignore-consistency option. This option exists precisely for failing instances over off a dead node. Note that it can be dangerous: errors in shutting down the instance will be ignored, which could result in the instance running on two machines in parallel (on disconnected DRBD drives).

# gnt-instance failover --ignore-consistency debian

There will be much more output this time. Pay particular attention to any warnings: these are normal, since the H2 node is down, but we did mark it as offline.

Sat Jan 18 21:03:15 2014 Failover instance debian
Sat Jan 18 21:03:15 2014 * checking disk consistency between source and target

[ ... messages ... ]

Sat Jan 18 21:03:27 2014 * activating the instance's disks on target node H1.ws.nsrc.org

[ ... messages ... ]

Sat Jan 18 21:03:33 2014 * starting the instance on the target node H1.ws.nsrc.org
# gnt-instance list -o name,pnode,snodes,status

Instance Primary_node   Secondary_Nodes Status
debian   H1.ws.nsrc.org H2.ws.nsrc.org  running

1.1.4 Re-adding the failed node

Ok, let's say H2 has been fixed.

We need to re-add it to the cluster. We do this using the gnt-node add --readd command.

From the gnt-node man page:

In case you're readding a node after hardware failure, you can use the --readd parameter. In this case, you don't need to pass the secondary IP again, it will be reused from the cluster. Also, the drained and offline flags of the node will be cleared before re-adding it.

# gnt-node add --readd H2.ws.nsrc.org

[ ... question about SSH ...]

Sat Jan 18 22:09:43 2014  - INFO: Readding a node, the offline/drained flags were reset
Sat Jan 18 22:09:43 2014  - INFO: Node will be a master candidate

We're good! It could take a while to re-sync the DRBD data if a lot of disk activity (writing) has taken place on debian, but this will happen in the background.
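
If you want to watch the resync happening (optional, and not part of the original exercise), two useful places to look are gnt-instance info, which reports the per-disk status, and /proc/drbd on the nodes themselves. On the master:

# gnt-instance info debian

and on H1 or H2:

# cat /proc/drbd

The DRBD status line shows the connection state and, while resyncing, a progress indicator.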

Let's try and migrate debian back to H2:

# gnt-instance migrate debian

Test that the migration has worked.
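
A couple of ways to check (suggestions only; any equivalent test is fine): list the instance again and confirm that H2 is once more its primary node, and/or connect to its console:

# gnt-instance list -o name,pnode,snodes,status
# gnt-instance console debian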

Note: if you are certain that the node H2 is healthy (let's say it was just a power failure, and no corruption has happened on its filesystem or disks), you could simply do (DON'T DO THIS NOW!):

# gnt-node modify -O no H2.ws.nsrc.org

Sat Jan 18 22:08:45 2014  - INFO: Auto-promoting node to master candidate
Sat Jan 18 22:08:45 2014  - WARNING: Transitioning node from offline to online state without using re-add. Please make sure the node is healthy!

But you would be warned about this.

1.2 Alternate decisions

1.2.1 Completely removing H2 from the cluster

If we were certain that H2 cannot be fixed, and won't be back online, we could delete H2 from the cluster. To do this:

# gnt-node remove H2.ws.nsrc.org

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Instance debian is still running on the node, please remove first

Ok, we are not allowed to remove H2, because Ganeti can see that we still have an instance (debian) associated with it.

This is different from simply marking the node offline: it means we are permanently getting rid of H2, so we need to decide what to do with the DRBD instances that were associated with it.

So what do we do now? If we had a third node (H3), we could use gnt-node evacuate. Read the man page for gnt-node and look for the section about the evacuate subcommand.

gnt-node evacuate is used to move all DRBD instances from one failed node to others. But we don't have a third node for now!
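
For reference only (don't run this now): if a hypothetical third node H3.ws.nsrc.org existed, moving the instances that use H2 as their DRBD secondary over to H3 could look roughly like this (option names as per the gnt-node man page; check the exact syntax for your Ganeti version):

# gnt-node evacuate -s -n H3.ws.nsrc.org H2.ws.nsrc.org

Here -s selects the instances that have H2 as their secondary node, and -n names the node that should replace it.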

Since we don't have a third node, we first need to temporarily convert our instance debian from DRBD to the plain (non-redundant) disk template. Unfortunately, this requires shutting down the VM instance.

# gnt-instance shutdown debian

Wait until it is down (you will see some WARNINGs again), then:

# gnt-instance modify -t plain -n H1.ws.nsrc.org debian

Sat Jan 18 21:40:54 2014 Converting template to plain
Sat Jan 18 21:40:54 2014 Removing volumes on the secondary node...
Sat Jan 18 21:40:54 2014 Removing unneeded volumes on the primary node...
Modified instance debian
 - disk_template -> plain

(WARNINGs removed in the output above)

We should now be able to remove the node:

# gnt-node remove H2.ws.nsrc.org

More WARNINGs! But did it work?

# gnt-node list

Node              DTotal DFree MTotal MNode MFree Pinst Sinst
H1.ws.nsrc.org  29.1G 12.6G   995M  145M  672M     2     0

Yes, H2 is gone.

Note: Ganeti will modify /etc/hosts on your remaining nodes, and remove the line for H2!

By the way, we can now restart our debian instance:

# gnt-instance start debian

Test that it comes up normally.
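
Looking ahead (for reference only, don't run this now): if H2 is eventually repaired, or a replacement node becomes available, it can be added back as a brand new node and the debian instance converted back to DRBD to restore redundancy. A rough sketch, assuming the node is called H2.ws.nsrc.org again (gnt-node add may also need the -s option if your cluster uses a separate secondary network):

# gnt-node add H2.ws.nsrc.org
# gnt-instance shutdown debian
# gnt-instance modify -t drbd -n H2.ws.nsrc.org debian
# gnt-instance start debian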

2 Scenario 2: Loss of Master Node

Let's imagine a slightly more critical scenario: the crash of the master node.

Let's shut down the master node!

On H1:

# halt -p

The node is now down. VMs still running on other nodes are unaffected, but you are not able to make any changes to the cluster (stop, start, modify or add VMs, change the cluster configuration, etc.).
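
You can see this for yourself (an optional check, not part of the original exercise): on H2, try any cluster command, for example:

# gnt-instance list

Since H2 is not (yet) the master, it should refuse and point you at H1 with a message similar to the one shown at the end of this scenario; but H1 is down, so nothing can be done there either.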

2.1 Promoting slave

Let's assume that H1 is not coming back right now, and we need to promote a new master.

You will first need to decide which of the remaining nodes will become the master. If you are only running 2 nodes, then it's rather obvious that H2 will become the master.

Read about master-failover: man gnt-cluster, find the MASTER-FAILOVER section.

To promote the slave:

# gnt-cluster master-failover

Note here that you will NOT be asked to confirm the operation!

Also note that since we are running in a 2-node configuration, we may have to specify the --no-voting option: there is no other remaining node in the cluster, so no voting can take place anyway.

At this point, the slave node (H2) is now master. You can verify this using the gnt-cluster getmaster command.
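
For example (the output below is simply what we expect, assuming the failover to H2 succeeded):

# gnt-cluster getmaster
H2.ws.nsrc.org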

From this point on, recovering the downed machine is similar to what we did in the first scenario. But to be on the safe side, once H1 has been booted again, log in to it and try running a cluster command (for example, gnt-node list) on H1.

Normally, even though H1 was down while the promotion of H2 happened, Ganeti on H1 will find out, when it starts back up, that H1 is no longer the master. The command run on H1 should therefore fail with:

This is not the master node, please connect to node 'H2.ws.nsrc.org' and
rerun the command

This means that H1 is well aware that H2 is now the master.
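
To finish up (a suggestion, not part of the original exercise): on the new master H2, re-check the overall cluster state, and if H1 needs to be re-integrated (for example, if it was marked offline in the meantime), the same re-add procedure we used for H2 in Scenario 1 applies:

# gnt-cluster verify
# gnt-node add --readd H1.ws.nsrc.org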