diff options
Diffstat (limited to 'agents/virt/docs/README')
-rw-r--r-- | agents/virt/docs/README | 125 |
1 files changed, 125 insertions, 0 deletions
diff --git a/agents/virt/docs/README b/agents/virt/docs/README new file mode 100644 index 0000000..e2b19bc --- /dev/null +++ b/agents/virt/docs/README @@ -0,0 +1,125 @@ +TODO: update + +I. Fence_xvm - Virtual machine fencing agent + +Fence_xvm is an agent which establishes a communications link between +a cluster of virtual machines (VC) and a cluster of domain0/physical +nodes which are hosting the virtual cluster. Its operations are +fairly simple. + + (a) Start a listener service. + (b) Send a multicast packet requesting that a VM be fenced. + (c) Authenticate client. + (e) Read response. + (f) Exit with success/failure, depending on the response received. + +If any of the above steps fail, the fencing agent exits with a failure +code and fencing is retried by the virtual cluster at a later time. +Because of the simplicty of fence_xvm, it is not necessary that +fence_xvm be run from within a virtualized guest - all it needs is +libnspr and libnss and a shared private key (for authentication; we +would hate to receive a false positive response from a node not in the +cluster!). + + +II. Fence_virtd - Virtual machine fencing host + +Fence_virtd is a daemon which runs on physical hosts (e.g. in domain0) +of the cluster hosting the virtual cluster. It listens on a port +for multicast traffic from virtual cluster(s), and takes actions. +Multiple disjoint virtual clusters can coexist on a single physical +host cluster, but this requires multiple instances of fence_virtd. + +NOTE: fence_virtd *MUST* be run on ALL nodes in a given cluster which +will be hosting virtual machines if fence_xvm is to be used for +fencing! + +There are a couple of ways the multicast packet is handled, +depending on the state of the host OS. It might be hosting the VM, +or it might not. Furthermore, the VM might "reside" on a host which +has failed. + +In order to be able to guarantee safe fencing of a VM even if the +last- known host is down, we must store the last-known locations of +each virtual machine in some sort of cluster-wide way. For this, we +use the corosync CPG API. Every few seconds, fence_virtd queries the +hypervisor via libvirt and stores any local VM states and sends those +states over CPG to all other members. In the event of a physical node +failure (which consequently causes the failure of one or more guests), +we can then read the stored VM state corresponding to the guest we need +to fence to find out the previous owner. With that information, we can +infer if the known host node has been fenced. If so, then the VM is clean +as well. The physical cluster must, therefore, have fencing in order for +fence_virtd to work. + +Operation of a node hosting a VM which needs to be fenced: + + (a) Receive multicast packet + (b) Authenticate multicast packet + (c) Open connection to host contained within multicast + packet. + (d) Authenticate server. + (e) Carry out fencing operation (e.g. call libvirt to destroy or + reboot the VM; there is no "on" method at this point). + (f) If operation succeeds, send success response. + +Operation of high-node-ID: + + (a) Receive multicast packet + (b) Authenticate multicast packet + (c) Read VM state from stored CPG messages + (d) Check liveliness of nodeID hosting VM (if alive, do nothing) + (e) Open connection to host contained within multicast + packet. + (f) Check with CMAN to see if last-known host has been fenced. + (g) If last-known host has been fenced, send success response. + (h) Authenticate server & send response. + +NOTE: There is always a possibility that a VM is started again +before the fencing operation and CPG update for that VM +occurs. If the VM has booted and rejoined the cluster, fencing will +not be necessary. If it is in the process of booting, but has not +yet joined the cluster, fencing will also not be necessary - because +it will not be using cluster resources yet. + + +III. Security considerations + +While fencing is generally expected to run on a more or less trusted +network, there are cases where it may not be. + +* The multicast packet is subject to replay attacks, but because no +fencing action is taken based solely on the information contained +within the packet, this should not allow an attacker to maliciously +fence a VM from outside the cluster, though it may be possible to +cause a DoS of fence_virtd if enough multicast packets are sent. + +* The only currently supported authentication mechanisms are simple +challenge-response based on a shared private key and pseudorandom +number generation. + +* An attacker with access to the shared key(s) can easily fence any +known VM, even if they are not on a cluster node. + +* Different shared keys should be used for different virtual +clusters on the same subnet (whether in the same physical cluster +or not). Additionally, multiple fence_virtd instances must be run +(each listening on a different multicast IP + port combination). + +IV. Configuration + +Generate a random key file. An example of how to generate it is: + + dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4096 count=1 + +Distribute the generated key file to all domUs in a cluster as well +as all dom0s which will be hosting that particular cluster of domUs. +The key should not be placed on shared file systems (because shared +file systems require the cluster, which requires fencing...). + +Start fence_virtd on all hosts + +Configure fence_xvm on the domU cluster... + +rest...tbd + |