1 files changed, 125 insertions, 0 deletions
diff --git a/agents/virt/docs/README b/agents/virt/docs/README
new file mode 100644
index 0000000..e2b19bc
--- /dev/null
+++ b/agents/virt/docs/README
@@ -0,0 +1,125 @@
+TODO: update
+
+I. Fence_xvm - Virtual machine fencing agent
+
+Fence_xvm is an agent which establishes a communications link between
+a cluster of virtual machines (VC) and a cluster of domain0/physical
+nodes which are hosting the virtual cluster.  Its operations are
+fairly simple.
+
+  (a) Start a listener service.
+  (b) Send a multicast packet requesting that a VM be fenced.
+  (c) Authenticate client.
+  (e) Read response.
+  (f) Exit with success/failure, depending on the response received.
+
+If any of the above steps fail, the fencing agent exits with a failure
+code and fencing is retried by the virtual cluster at a later time.
+Because of the simplicty of fence_xvm, it is not necessary that
+fence_xvm be run from within a virtualized guest - all it needs is
+libnspr and libnss and a shared private key (for authentication; we
+would hate to receive a false positive response from a node not in the
+cluster!).
+
+
+II. Fence_virtd - Virtual machine fencing host
+
+Fence_virtd is a daemon which runs on physical hosts (e.g. in domain0)
+of the cluster hosting the virtual cluster.  It listens on a port
+for multicast traffic from virtual cluster(s), and takes actions.
+Multiple disjoint virtual clusters can coexist on a single physical
+host cluster, but this requires multiple instances of fence_virtd.
+
+NOTE: fence_virtd *MUST* be run on ALL nodes in a given cluster which
+will be hosting virtual machines if fence_xvm is to be used for 
+fencing!
+
+There are a couple of ways the multicast packet is handled,
+depending on the state of the host OS.  It might be hosting the VM,
+or it might not.  Furthermore, the VM might "reside" on a host which
+has failed.
+
+In order to be able to guarantee safe fencing of a VM even if the
+last- known host is down, we must store the last-known locations of
+each virtual machine in some sort of cluster-wide way.  For this, we
+use the corosync CPG API.  Every few seconds, fence_virtd queries the
+hypervisor via libvirt and stores any local VM states and sends those
+states over CPG to all other members.  In the event of a physical node
+failure (which consequently causes the failure of one or more guests),
+we can then read the stored VM state corresponding to the guest we need
+to fence to find out the previous owner. With that information, we can
+infer if the known host node has been fenced.  If so, then the VM is clean
+as well. The physical cluster must, therefore, have fencing in order for
+fence_virtd to work.
+
+Operation of a node hosting a VM which needs to be fenced:
+  
+  (a) Receive multicast packet
+  (b) Authenticate multicast packet
+  (c) Open connection to host contained within multicast
+      packet.
+  (d) Authenticate server.
+  (e) Carry out fencing operation (e.g. call libvirt to destroy or
+      reboot the VM; there is no "on" method at this point).
+  (f) If operation succeeds, send success response.
+
+Operation of high-node-ID:
+
+  (a) Receive multicast packet
+  (b) Authenticate multicast packet
+  (c) Read VM state from stored CPG messages
+  (d) Check liveliness of nodeID hosting VM (if alive, do nothing)
+  (e) Open connection to host contained within multicast
+      packet.
+  (f) Check with CMAN to see if last-known host has been fenced.
+  (g) If last-known host has been fenced, send success response.
+  (h) Authenticate server & send response.
+
+NOTE: There is always a possibility that a VM is started again
+before the fencing operation and CPG update for that VM
+occurs.  If the VM has booted and rejoined the cluster, fencing will
+not be necessary.  If it is in the process of booting, but has not
+yet joined the cluster, fencing will also not be necessary - because
+it will not be using cluster resources yet.
+
+
+III. Security considerations
+
+While fencing is generally expected to run on a more or less trusted
+network, there are cases where it may not be.
+
+* The multicast packet is subject to replay attacks, but because no
+fencing action is taken based solely on the information contained
+within the packet, this should not allow an attacker to maliciously
+fence a VM from outside the cluster, though it may be possible to
+cause a DoS of fence_virtd if enough multicast packets are sent.
+
+* The only currently supported authentication mechanisms are simple
+challenge-response based on a shared private key and pseudorandom
+number generation.
+
+* An attacker with access to the shared key(s) can easily fence any
+known VM, even if they are not on a cluster node.
+
+* Different shared keys should be used for different virtual
+clusters on the same subnet (whether in the same physical cluster
+or not).  Additionally, multiple fence_virtd instances must be run
+(each listening on a different multicast IP + port combination).
+
+IV.  Configuration
+
+Generate a random key file.  An example of how to generate it is:
+
+    dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4096 count=1
+
+Distribute the generated key file to all domUs in a cluster as well
+as all dom0s which will be hosting that particular cluster of domUs.
+The key should not be placed on shared file systems (because shared
+file systems require the cluster, which requires fencing...).
+
+Start fence_virtd on all hosts
+
+Configure fence_xvm on the domU cluster...
+
+rest...tbd
+