Ceph failure demo

Demonstration of catastrophic failure

If time is available, the instructor will do a class demo of a catastrophic failure of Ceph when it runs out of disk space. This is a reference for the commands used to recover.

Create a large VM
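One way to create the big test VM from the CLI (a sketch — the VMID 100, the name bigvm, the 34 GiB disk size and the storage name vmpool are taken from the rbd du output further down; adjust to the class setup):

```shell
# Create "bigvm" as VMID 100 (hypothetical ID) with a 34 GiB disk
# allocated on the Ceph-backed storage "vmpool", then start it.
qm create 100 --name bigvm --memory 2048 --net0 virtio,bridge=vmbr0
qm set 100 --scsi0 vmpool:34
qm start 100
```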

Consume all Ceph space

Make sure some other Ceph-backed participant VMs are running and doing something observable, e.g. pinging 8.8.8.8 in a terminal.

Inside bigvm:

dd if=/dev/urandom of=/bigfile bs=4M status=progress

Monitor overall Ceph usage on Datacenter > Ceph

Monitor OSD usage on nodeX > Ceph > OSD, especially the Used (%) column, as one OSD will likely reach 100% before the others

It is also worth watching ceph -w
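The same monitoring can be done from a shell on the monitor node (standard Ceph commands; the 5-second refresh interval is just a suggestion):

```shell
# Per-OSD utilisation, refreshed every 5 seconds:
watch -n 5 ceph osd df

# Stream cluster events; nearfull/full warnings appear here as OSDs fill:
ceph -w

# One-shot detail of any current health warnings:
ceph health detail
```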

Watch the cluster health degrading (you can pause the dd with Ctrl-Z and resume with fg, but the VM still has to flush dirty blocks from RAM)

When the pool fills, VMs will eventually start to freeze. (Try Ctrl-C to stop the ping, then touch /test to attempt writing a file)

On mon (nodeX1):

# ceph df
--- RAW STORAGE ---
CLASS    SIZE   AVAIL    USED  RAW USED  %RAW USED
hdd    96 GiB  24 GiB  72 GiB    72 GiB      74.99
TOTAL  96 GiB  24 GiB  72 GiB    72 GiB      74.99

--- POOLS ---
POOL    ID  PGS   STORED  OBJECTS     USED   %USED  MAX AVAIL
vmpool   1  128   23 GiB    5.94k   68 GiB  100.00        0 B
.mgr     2    1  1.5 MiB        2  4.5 MiB  100.00        0 B
# rbd du -p vmpool
NAME              PROVISIONED  USED
vm-100-cloudinit        4 MiB    4 MiB
vm-100-disk-0          34 GiB   21 GiB
vm-101-disk-0         3.5 GiB  1.9 GiB
<TOTAL>                37 GiB   23 GiB

Stop bigvm, then try to delete it in the Proxmox GUI. If the cluster has properly wedged, the delete may fail or hang (although this may have improved in newer versions of Ceph).
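The CLI equivalent, assuming bigvm is VMID 100 as above (a sketch, not a required step):

```shell
# Stop the VM, then try to destroy it and its disks:
qm stop 100
qm destroy 100 --purge
# If the cluster is full, removing the RBD image may hang or fail until
# the full ratio is raised — see the recovery section below.
```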

Recovery: Deleting data when OSDs are full

On mon:

ceph osd dump | grep full_ratio     # check the current ratios
ceph osd set-full-ratio 0.97        # temporarily allow writes again
ceph -w                             # watch, wait until healthy
... remove or trim volume(s) as required
ceph osd set-full-ratio 0.95        # restore the default
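The "remove or trim volume(s)" step might look like this (image and pool names from the rbd du output above; a sketch, not a prescribed procedure):

```shell
# Option A: delete the big test image outright to free space:
rbd rm vmpool/vm-100-disk-0

# Option B: keep the VM and reclaim space from inside the guest instead
# (requires discard to be enabled on the virtual disk):
#   rm /bigfile && fstrim -av
```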

References