Remus provides transparent, comprehensive high availability to
ordinary virtual machines running on
the Xen virtual machine monitor. It
does this by maintaining a completely up-to-date copy of a running
VM on a backup server, which automatically activates if the primary
server fails. Key features:
- The backup VM is an exact copy of the primary VM. If the
primary fails, the VM continues running on the backup host as if
no failure had occurred.
- The backup is completely up-to-date. Even active TCP
sessions are maintained without interruption.
- Protection is transparent. Existing guests can be
protected without modifying them in any way.
For a full description and evaluation, see our
NSDI paper.
Remus is feature-complete, supporting both PV and HVM guests in 32-
and 64-bit modes. But it is still young, and we have many improvements
in mind. Here's the short-term development plan:
- Clean up modifications to xc_domain_save:
  - Remove ugly (but easier to maintain out-of-tree) gotos by
    extracting per-round checkpoint code into a separate function.
  - Abstract checkpoint output into user-specified function pointers.
- Incorporate network buffering into netback rather
than relying on IMQ.
- Carefully benchmark the persistent suspend thread under both
high and low load to quantify its advantage over creating a new
thread for each suspend.
- Consider folding the control script into xend with
an xm front end.
- Support external heartbeat monitors
(e.g., Linux-HA) as an
alternative to the simple, in-band monitor bundled with Remus.
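
The xc_domain_save cleanup items in the plan above could look roughly like the following. This is only a sketch of the general idea, not the actual libxc code or interface: the names checkpoint_output, checkpoint_round, run_checkpoints, and mem_sink are hypothetical, and the "pages" are just an opaque byte buffer. It shows one checkpoint round extracted into its own function, with all output going through caller-supplied function pointers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table; NOT the real libxc interface. */
struct checkpoint_output {
    /* Write len bytes of checkpoint data; return 0 on success, -1 on error. */
    int (*write)(void *ctx, const void *buf, size_t len);
    /* Called once all data for a round has been written. */
    int (*commit)(void *ctx);
    /* Opaque state handed back to both callbacks. */
    void *ctx;
};

/* One checkpoint round as a separate function: errors are reported
 * through the return value, so the caller's loop needs no gotos. */
static int checkpoint_round(const struct checkpoint_output *out,
                            const void *pages, size_t len)
{
    if (out->write(out->ctx, pages, len) != 0)
        return -1;
    return out->commit(out->ctx);
}

/* Run up to max_rounds rounds, stopping at the first failure.
 * Returns the number of rounds that completed. */
static unsigned run_checkpoints(const struct checkpoint_output *out,
                                const void *pages, size_t len,
                                unsigned max_rounds)
{
    unsigned done = 0;
    while (done < max_rounds && checkpoint_round(out, pages, len) == 0)
        done++;
    return done;
}

/* Example sink (an assumption for illustration): an in-memory buffer. */
struct mem_sink {
    unsigned char data[4096];
    size_t used;
    unsigned rounds;
};

static int mem_write(void *ctx, const void *buf, size_t len)
{
    struct mem_sink *s = ctx;
    if (len > sizeof(s->data) - s->used)
        return -1;              /* buffer full */
    memcpy(s->data + s->used, buf, len);
    s->used += len;
    return 0;
}

static int mem_commit(void *ctx)
{
    struct mem_sink *s = ctx;
    s->rounds++;
    return 0;
}
```

A caller would fill in a struct checkpoint_output with its own write and commit functions (for a socket, a file, or, as here, a memory buffer) and pass it to run_checkpoints. Keeping the destination behind function pointers means the per-round code does not need to know where the checkpoint data goes.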
- 2009-11-09: Remus has been merged into the official
Xen repository and is expected to be included in the next major
release! Updated (simpler!) installation and usage instructions
are coming soon.
- 2009-11-05: Remus 0.9 released! Now supports HVM, 64-on-32, 32-on-64, and
the latest version of xen-unstable (changeset 20399).
- 2009-05-14: Supports 64-on-64.
- 2009-05-11: Initial port to Xen unstable (now 3.4.0-rc4) completed.
- 2009-03-30: First release (against xen-3.2-testing).
The current release is Remus 0.9. The source is
here.
Like Xen, we use Mercurial
to manage the source code. The repositories
are here.
See the README in the
distribution repository.