Diffstat (limited to 'agents/virt/docs')
-rw-r--r--  agents/virt/docs/README            | 125
-rw-r--r--  agents/virt/docs/TODO              |   7
-rw-r--r--  agents/virt/docs/architecture.txt  |  16
-rw-r--r--  agents/virt/docs/fence_virt.txt    | 127
4 files changed, 275 insertions, 0 deletions
diff --git a/agents/virt/docs/README b/agents/virt/docs/README
new file mode 100644
index 0000000..e2b19bc
--- /dev/null
+++ b/agents/virt/docs/README
@@ -0,0 +1,125 @@
+TODO: update
+
+I. Fence_xvm - Virtual machine fencing agent
+
+Fence_xvm is an agent which establishes a communications link between
+a cluster of virtual machines (VC) and a cluster of domain0/physical
+nodes which are hosting the virtual cluster. Its operations are
+fairly simple.
+
+  (a) Start a listener service.
+  (b) Send a multicast packet requesting that a VM be fenced.
+  (c) Authenticate client.
+  (e) Read response.
+  (f) Exit with success/failure, depending on the response received.
+
+If any of the above steps fail, the fencing agent exits with a failure
+code and fencing is retried by the virtual cluster at a later time.
+Because of the simplicity of fence_xvm, it is not necessary that
+fence_xvm be run from within a virtualized guest - all it needs is
+libnspr and libnss and a shared private key (for authentication; we
+would hate to receive a false positive response from a node not in the
+cluster!).
+
+
+II. Fence_virtd - Virtual machine fencing host
+
+Fence_virtd is a daemon which runs on physical hosts (e.g. in domain0)
+of the cluster hosting the virtual cluster. It listens on a port
+for multicast traffic from virtual cluster(s), and takes actions.
+Multiple disjoint virtual clusters can coexist on a single physical
+host cluster, but this requires multiple instances of fence_virtd.
+
+NOTE: fence_virtd *MUST* be run on ALL nodes in a given cluster which
+will be hosting virtual machines if fence_xvm is to be used for
+fencing!
+
+There are a couple of ways the multicast packet is handled,
+depending on the state of the host OS. It might be hosting the VM,
+or it might not. Furthermore, the VM might "reside" on a host which
+has failed.
+
+In order to be able to guarantee safe fencing of a VM even if the
+last-known host is down, we must store the last-known locations of
+each virtual machine in some sort of cluster-wide way. For this, we
+use the corosync CPG API. Every few seconds, fence_virtd queries the
+hypervisor via libvirt and stores any local VM states and sends those
+states over CPG to all other members. In the event of a physical node
+failure (which consequently causes the failure of one or more guests),
+we can then read the stored VM state corresponding to the guest we need
+to fence to find out the previous owner. With that information, we can
+infer if the known host node has been fenced. If so, then the VM is clean
+as well. The physical cluster must, therefore, have fencing in order for
+fence_virtd to work.
+
+Operation of a node hosting a VM which needs to be fenced:
+
+  (a) Receive multicast packet
+  (b) Authenticate multicast packet
+  (c) Open connection to host contained within multicast
+      packet.
+  (d) Authenticate server.
+  (e) Carry out fencing operation (e.g. call libvirt to destroy or
+      reboot the VM; there is no "on" method at this point).
+  (f) If operation succeeds, send success response.
+
+Operation of high-node-ID:
+
+  (a) Receive multicast packet
+  (b) Authenticate multicast packet
+  (c) Read VM state from stored CPG messages
+  (d) Check liveliness of nodeID hosting VM (if alive, do nothing)
+  (e) Open connection to host contained within multicast
+      packet.
+  (f) Check with CMAN to see if last-known host has been fenced.
+  (g) If last-known host has been fenced, send success response.
+  (h) Authenticate server & send response.
+
+NOTE: There is always a possibility that a VM is started again
+before the fencing operation and CPG update for that VM
+occurs. If the VM has booted and rejoined the cluster, fencing will
+not be necessary. If it is in the process of booting, but has not
+yet joined the cluster, fencing will also not be necessary - because
+it will not be using cluster resources yet.
+
+
+III. Security considerations
+
+While fencing is generally expected to run on a more or less trusted
+network, there are cases where it may not be.
+
+* The multicast packet is subject to replay attacks, but because no
+fencing action is taken based solely on the information contained
+within the packet, this should not allow an attacker to maliciously
+fence a VM from outside the cluster, though it may be possible to
+cause a DoS of fence_virtd if enough multicast packets are sent.
+
+* The only currently supported authentication mechanisms are simple
+challenge-response based on a shared private key and pseudorandom
+number generation.
+
+* An attacker with access to the shared key(s) can easily fence any
+known VM, even if they are not on a cluster node.
+
+* Different shared keys should be used for different virtual
+clusters on the same subnet (whether in the same physical cluster
+or not). Additionally, multiple fence_virtd instances must be run
+(each listening on a different multicast IP + port combination).
+
+IV. Configuration
+
+Generate a random key file. An example of how to generate it is:
+
+    dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4096 count=1
+
+Distribute the generated key file to all domUs in a cluster as well
+as all dom0s which will be hosting that particular cluster of domUs.
+The key should not be placed on shared file systems (because shared
+file systems require the cluster, which requires fencing...).
+
+Start fence_virtd on all hosts
+
+Configure fence_xvm on the domU cluster...
+
+rest...tbd
+
diff --git a/agents/virt/docs/TODO b/agents/virt/docs/TODO
new file mode 100644
index 0000000..17456cf
--- /dev/null
+++ b/agents/virt/docs/TODO
@@ -0,0 +1,7 @@
+High Priority / Blockers for v1.0:
+
+* endian-clean / 64-bit clean data structure analysis
+
+Future Stuff:
+
+* clean up development bits so third parties can develop plugins
diff --git a/agents/virt/docs/architecture.txt b/agents/virt/docs/architecture.txt
new file mode 100644
index 0000000..54fda11
--- /dev/null
+++ b/agents/virt/docs/architecture.txt
@@ -0,0 +1,16 @@
+The actual architecture of fence_virtd is very simple. We have a set
+of listener plugins which listen for fencing requests for virtual
+machines.
+
+These plugins are assigned callbacks which are entry functions in to
+the backend plugins. The backend plugins perform the actual fencing
+request.
+
+In the middle, we have only enough code to provide basic integration
+functions between the listener and backend plugins. This includes a
+very simple configuration plugin which we pass to each of the plugins.
+
+Because we are passing function pointers in to the plugins themselves
+for configuration (rather than having the plugins call an API directly,
+for example), we are able to swap out the configuration subsystem for
+other, more full-featured configuration systems, such as libccs.
diff --git a/agents/virt/docs/fence_virt.txt b/agents/virt/docs/fence_virt.txt
new file mode 100644
index 0000000..e554ce4
--- /dev/null
+++ b/agents/virt/docs/fence_virt.txt
@@ -0,0 +1,127 @@
+We need a fencing agent which can work in a variety of guest cluster
+configurations and host configurations.
+
+Requirements
+
+1. Nonrequirement of guest to host networking. Virtual machines
+   may be configured to run using a network unknown to the host
+   operating system. Therefore, the ability to run without network
+   communication between the guest and the host is required.
+
+2. Ease of configuration. The absolute minimum possible configuration
+   must be available.
+
+3. Nonrequirement of host clustering software. Multiple layers of
+   configuration sucks. While I fundamentally disagree with the general
+   idea that running CMAN on the host constitutes a "heavyweight
+   cluster", perception is important.
+
+4. Ability to support RHEV-M, oVirt server, and other virtual machine
+   management technologies. This is beneficial from a security standpoint
+   since it is assumed the management server will be aware of what VMs
+   are allowed to fence what other VMs.
+
+5. Upgrade compatibility with fence_xvm from a configuration standpoint.
+   This may be provided by a symlink over fence_xvm. If this feature
+   can not be provided as a matter of design, a method to convert an
+   existing fence_xvm/fence_xvmd configuration to fence_virt must be
+   present.
+
+
+Guest to Host Interaction
+-------------------------
+
+The proposal is to use various communications media plugins in order
+to facilitate flexibility with respect to how virtual machine
+environments are configured.
+
+There are at least 3 simple plugins for guest/client to host/server
+communications:
+
+ * Direct serial. The guest sends fencing requests out via /dev/ttySX
+   in the guest. The host is listening on a Unix domain socket[1],
+   and forwards fencing requests accordingly.
+
+   This satisfies most of the requirements, but adds a conundrum
+   when configuring guest clusters, as /dev/ttySX may be /dev/ttySY
+   on another guest. So, either we must account for this per-guest
+   configuration discrepancy or we must make it an administrative
+   requirement to provide the same serial device on each host.
+
+ * Multicast. This violates the networking requirement, but this is
+   okay since this method of operation is optional. This operational
+   mode provides for one of the simpler configurations: all that is
+   needed is the guest's name or UUID. The guest to host
+   communications operates in the same manner as fence_xvm/fence_xvmd,
+   except that there is an implied requirement on restricting the
+   multicast packets accepted to be from the local guests.
+
+ * VM Channel over Serial. This works like direct serial, but
+   instead of owning the whole device, the device may be shared between
+   multiple applications. The server subscribes to a channel and
+   listens for fencing requests on the channel; the client in the
+   guest OS connects to the channel and issues fencing requests across
+   it. One interesting thing is that it may be possible to provide
+   unprivileged users the ability to fence using this method (I
+   do not claim to know if this is useful or not).
+
+
+Host to Hypervisor interaction
+------------------------------
+
+Similar to the way we have plugins for guest to host interaction,
+we also have plugins which actually do the real work. These plugins
+are responsible for all of the actual real work performed, including
+tracking VMs if required, forwarding requests to the appropriate hosts
+or management services, and handling the responses.
+
+We propose at least 5 plugins in this case:
+
+ * Libvirt (local-only). There is no intracommunication and no
+   migration support is provided.
+
+ * Cluster CPG (+ libvirt). This is the way fence_xvmd
+   operates today. This setup has the most requirements on the
+   infrastructure, as it requires guest to host networking _and_
+   host-to-host clustering in order to keep track of virtual
+   machines. The benefit is that it is self-contained and requires
+   no external management nodes. VM states are stored so that other
+   CPG group members know the locations of other VMs and can make
+   some decisions about whether a VM is dead based on whether a host
+   is dead (i.e. if fencing is in use or can be performed on the
+   host).
+
+ * Libvirt-QMF ... ??? Subscription to the appropriate cluster
+   specific AMQP channel is required on the host side, but this
+   handles routing the message very easily. The fencing request
+   is forwarded to the other listeners on the channel, the VM owner
+   takes the action requested and returns a value. When new VMs
+   are created, the event is broadcast out via the AMQP channel so
+   other hosts know the locations of other VMs and can make some
+   decisions about whether a VM is dead based on whether a host
+   is dead (i.e. if fencing is in use or can be performed on the
+   host).
+
+ * oVirt Manager. The request is forwarded to the oVirt Manager
+   and the oVirt manager is responsible for taking the appropriate
+   action and responding to the request.
+
+ * RHEV-M. The request is forwarded to the RHEV-M node, which is
+   responsible for taking the appropriate action and responding to
+   the request.
+
+
+These plugins have no requirements on which guest to host communication
+plugin is used (you could, if you wanted, use 'direct serial' with
+'cluster cpg', or 'multicast' with 'RHEV-H' for example).
+
+These plugins must also be able to discover where appropriate. For
+example, the cpg plugin can only be used if corosync/openais
+is running. A defined plugin preference order should be specified/documented
+so that the host daemon behaves in a predictable manner in absence of
+host-side configuration data (about which plugin to use).
+
+
+[1] TCP was also explored, however, the security is much better
+    using a Unix domain socket, despite the additional complexity
+    of listening for VM creation events.
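
Reviewer note: the README's "Operation of high-node-ID" steps, together with its description of VM states shared over CPG, can be read as a small decision procedure. The Python sketch below is only an illustration of that reading, not the fence_virtd implementation. `ClusterView`, `Decision`, and `decide` are hypothetical names, the sets stand in for what corosync/CMAN would report, and staying silent for a VM with no stored state is an assumption the README does not spell out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    DEFER_TO_OWNER = "owner is alive; do nothing and let it handle the request"
    SUCCESS = "owner was fenced, so the VM is considered clean"
    NO_RESPONSE = "cannot prove the VM is off; the guest cluster will retry"


@dataclass
class ClusterView:
    """Hypothetical stand-in for state a host learns from CPG and CMAN."""
    vm_owner: dict = field(default_factory=dict)    # VM name -> last-known host
    live_nodes: set = field(default_factory=set)    # hosts currently in the cluster
    fenced_nodes: set = field(default_factory=set)  # hosts known to have been fenced

    def record_state(self, host, vms):
        # Mirrors the periodic "send local VM states to all members" step.
        for vm in vms:
            self.vm_owner[vm] = host


def decide(view, vm):
    """Decide how a non-owning host answers a fencing request for `vm`."""
    owner = view.vm_owner.get(vm)
    if owner is None:
        # Assumption: with no stored state, nothing can be inferred.
        return Decision.NO_RESPONSE
    if owner in view.live_nodes:
        return Decision.DEFER_TO_OWNER
    if owner in view.fenced_nodes:
        return Decision.SUCCESS
    return Decision.NO_RESPONSE


view = ClusterView(live_nodes={"host1"}, fenced_nodes={"host2"})
view.record_state("host1", ["vm-a"])
view.record_state("host2", ["vm-b"])
view.record_state("host3", ["vm-c"])  # host3 is down but not yet fenced
```

The `host3` case shows why the README insists that the physical cluster have fencing: until the dead host is fenced, no success response can safely be sent for its guests.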