[buildd-tools-devel] RFC: Recent and future work on schroot

Sun Jul 26 21:21:29 UTC 2009

Recent work
-----------

The major recent change has been the merging of the union filesystem
support from Jan-Marek Glogowski.  I'd like to release a development
version (1.3.0) for further testing once I have confirmation that what
was merged is working.  Could anyone verify this?  Jan?

I'd like to get any regressions or defects related to filesystem
union support fixed soon if possible.  The current schroot master
is AFAICT fully functional for anyone who wants to test it; I just
don't have the means to test unionfs support myself [aufs appears to
cause Oopses on 2.6.30, though it might be an unrelated issue since
the Oops is in ext4/dquot].

Chroot Facets
-------------

One thing that has come out of this merge is the fact that the current
object model of chroot features is not scalable and needs fixing.  When
schroot was originally written to support a few different types of
filesystems, it made sense to have a chroot base class and use
inheritance to specialise this basic chroot for different use cases.
However, when we started to add a number of optional features common
to several (but not all) chroot types, it was decided to use
multiple inheritance to allow different types to get access to
different features, being a logical extension to the existing scheme.
This worked fairly well until the addition of union filesystems, not
due to union support specifically, just that at this point we reached
a limit of how complex the scheme could realistically be made.
Chaining up in all the various derived methods is time consuming, and
it turns out that writing copy constructors when using virtual
inheritance is a recipe for disaster: it's the responsibility of the
most-derived class to ensure the virtual bases are copied, and with
multiple virtual bases this proved to be very difficult, if not
impossible (but it was required for source chroot cloning).
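
To illustrate the copy constructor problem, here is a minimal sketch.
The class names (chroot_base, with_source, with_union, leaf_broken,
leaf_fixed) are hypothetical stand-ins, not the real schroot classes.
It shows that a virtual base initialiser written in an intermediate
class is ignored when that class is not the most-derived one, so the
leaf class silently default-constructs the virtual base unless it
copies it explicitly.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the common chroot base class.
struct chroot_base
{
  std::string name;
  chroot_base() : name("unset") {}
  chroot_base(const chroot_base& rhs) : name(rhs.name) {}
  virtual ~chroot_base() {}
};

// Two optional features, each inheriting the base virtually.
struct with_source : virtual chroot_base
{
  with_source() {}
  // This chroot_base(rhs) initialiser only takes effect when
  // with_source is itself the most-derived type.
  with_source(const with_source& rhs) : chroot_base(rhs) {}
};

struct with_union : virtual chroot_base
{
  with_union() {}
  with_union(const with_union& rhs) : chroot_base(rhs) {}
};

// Looks reasonable, but the virtual base is default-constructed.
struct leaf_broken : with_source, with_union
{
  leaf_broken() {}
  leaf_broken(const leaf_broken& rhs) : with_source(rhs), with_union(rhs) {}
};

// The most-derived class has to copy the virtual base itself.
struct leaf_fixed : with_source, with_union
{
  leaf_fixed() {}
  leaf_fixed(const leaf_fixed& rhs)
    : chroot_base(rhs), with_source(rhs), with_union(rhs) {}
};

void check_copies()
{
  leaf_broken a;
  a.name = "sid";
  leaf_broken b(a);
  assert(b.name == "unset");  // the name was lost in the copy

  leaf_fixed c;
  c.name = "sid";
  leaf_fixed d(c);
  assert(d.name == "sid");    // the name survived the copy
}
```

Every new leaf type has to repeat the leaf_fixed pattern for every
virtual base it can reach, which is where the scheme stops scaling.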

After considering the possibilities, I think the most robust and
flexible approach would be to move from inheritance to containment.
This will allow flexible extension of the basic chroot class without
any mucking around with multiple virtual inheritance.  To this end,
I've created a new class called chroot_facet which is just a set of
pure virtual functions which all extensions must implement (keyfile
serialisation, environment setting, session flags and human-readable
description of settings).  The chroot class just contains a vector
of pointers to these objects and can use them without any idea
about their internals.  There's a templated interface to introspect
and use the functionality of the facets.  So far I've just converted
a single thing (personality(2) support) as a proof of concept.
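
As a rough sketch of the containment idea (not the actual schroot
code), the following shows a facet base class, two facets, and a
container with a templated lookup.  Assumptions: the interface is
reduced to a single describe() method, the names basic_chroot,
add_facet, describe and chroot_facet_union are made up for
illustration, and std::shared_ptr stands in for std::tr1::shared_ptr.

```cpp
#include <memory>
#include <string>
#include <vector>

// Reduced, hypothetical facet interface.  The real one would also
// cover keyfile serialisation, environment and session flags.
class chroot_facet
{
public:
  virtual ~chroot_facet() {}
  virtual std::string describe() const = 0;
};

class chroot_facet_personality : public chroot_facet
{
public:
  explicit chroot_facet_personality(const std::string& p) : persona(p) {}
  std::string describe() const override { return "Personality: " + persona; }
private:
  std::string persona;
};

class chroot_facet_union : public chroot_facet
{
public:
  std::string describe() const override { return "Union: enabled"; }
};

// The container only knows about the abstract interface.
class basic_chroot
{
public:
  void add_facet(std::shared_ptr<chroot_facet> facet)
  {
    facets.push_back(facet);
  }

  // Return the first facet of type T, or a null pointer if absent.
  template <typename T>
  std::shared_ptr<T> get_facet() const
  {
    for (const auto& facet : facets)
      if (auto found = std::dynamic_pointer_cast<T>(facet))
        return found;
    return std::shared_ptr<T>();
  }

private:
  std::vector<std::shared_ptr<chroot_facet>> facets;
};
```

A caller asks for the facet it cares about and checks for a null
pointer, so an optional feature costs nothing in the class hierarchy.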

There's a longer-term goal here as well as maintainability and
robustness.  Currently schroot has a flexible and extensible chroot
setup scheme which allows us to support many different *sources*, but
ultimately it's just wrapping chroot(2)/fork(2)/execve(2).  I'd like
to support other types of virtualisation (kvm is my initial aim), and
this will require modular support for how we run commands, since for
kvm we can't just do an exec.  Facets would allow us to make that
aspect of chroot command execution replaceable as well.

This is how the session code implements optional personality support:

  // Get personality facet from the chroot.
  std::tr1::shared_ptr<chroot_facet_personality> pfac =
    session_chroot->get_facet<chroot_facet_personality>();
  // Set the personality if personality support is present.
  if (pfac)
    pfac->get_persona().set();

  http://git.debian.org/?p=users/rleigh/schroot.git;a=shortlog;h=refs/heads/chroot-facet

Now that the basic infrastructure is known to work (testsuite passes
with flying colours), the following will be done:

1) Make it easier to use with typedefs to avoid lots of
   std::tr1::shared_ptr<chroot_facet_xxx>() all over the place.
   Depending on g++'s support for C++0x, I might look at using the
   auto type if it's working in Lenny.  Also implement create
   methods to automatically wrap in shared_ptr.
2) Convert leaf types to facets (source, session, union)
3) Convert chroot types to facets (plain, directory, loopback, file,
   block-device, lvm-snapshot etc.)
4) Convert chroot setup code in sbuild::session to facet (abstract
   setup and command execution for kvm support)
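
Item 1 in the list above might end up looking something like this.
It is only a sketch: the ptr typedef, the create() signature and the
name() accessor are assumptions, and std::shared_ptr is used in place
of std::tr1::shared_ptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

class chroot_facet_personality
{
public:
  // Short alias so callers need not spell out the shared_ptr type.
  typedef std::shared_ptr<chroot_facet_personality> ptr;

  // Factory method: the only way to construct a facet, so every
  // instance is owned by a shared_ptr from the start.
  static ptr create(const std::string& persona)
  {
    return ptr(new chroot_facet_personality(persona));
  }

  const std::string& name() const { return persona; }

private:
  explicit chroot_facet_personality(const std::string& p) : persona(p) {}
  std::string persona;
};

// With C++0x auto the caller does not name the pointer type at all.
inline std::string example_use()
{
  auto pfac = chroot_facet_personality::create("linux32");
  assert(pfac);
  return pfac->name();
}
```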

Any thoughts or comments on this are welcome.

schroot client-server protocol using IETF secsh-filexfer protocol
-----------------------------------------------------------------

Now for something more adventurous which might be a bit more
controversial (comments again welcome).

When supporting only chroots, we can freely copy files to/from the
host and run commands at will.  When using a virtualised system such
as kvm we need to log in over the network using e.g. ssh.  This
probably means using ssh to execute commands and scp/sftp to copy
files.

A design issue with schroot relating to its origins as a more
advanced dchroot replacement is that it is a setuid-root program
which runs in a single shot.  That is, it does its job and exits.
However, this is suboptimal for some tasks (session support)
and means our locking strategies for e.g. session files are not
as robust as they could be (though they work well for all
current uses, they don't scale for future planned uses).  This design
prevents us using some features which would be very useful such as
per-process namespaces (inherited by processes, but there is no
continuity between processes running sessions) and it's hard if not
impossible to support other stuff such as kvm where you'll have the VM
running all the time.

For these uses, a server is needed, which can manage all the
resources all the time, rather than opportunistically locking them
every time you run schroot.  If we go this route, we'll need a
means of having a client talk to the server, and this implies having
some sort of protocol they can use to communicate.  I'm averse to
wheel reinvention where not required, so I've been looking for an
existing protocol I could use for the task, and I think I've
found one in the SSH File Transfer Protocol
(http://tools.ietf.org/html/draft-ietf-secsh-filexfer-13).

This protocol is particularly useful in that it is the basis of sftp,
and implements file transfer in a particularly elegant way, by
proxying the standard UNIX system calls over the wire.  Rather than
transfer a file, you'll open()/write()/seek()/close() etc.  This
flexibility means not only can it be used for transferring files,
it can be used to multiplex stdin/out/err over the connection to
allow streaming of the normal I/O.  It's not limited to just that,
though.
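
To give a feel for how a system call is proxied over the wire, here
is a small sketch that serialises an SSH_FXP_WRITE request using the
general packet layout from the draft (uint32 length, byte type,
uint32 request-id, then type-specific fields, all big-endian).  The
helper names (bytes, put_u32, put_u64, put_string, make_write) are
invented for this example.  Treating a handle as a stream such as
stdout would be a schroot-specific convention, not something the
draft defines.

```cpp
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<unsigned char> bytes;

// Packet type number for a write request, as given in the draft.
const unsigned char SSH_FXP_WRITE = 6;

// Append a big-endian 32-bit integer.
void put_u32(bytes& out, std::uint32_t v)
{
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<unsigned char>((v >> shift) & 0xff));
}

// Append a big-endian 64-bit integer.
void put_u64(bytes& out, std::uint64_t v)
{
  for (int shift = 56; shift >= 0; shift -= 8)
    out.push_back(static_cast<unsigned char>((v >> shift) & 0xff));
}

// Append a length-prefixed string.
void put_string(bytes& out, const std::string& s)
{
  put_u32(out, static_cast<std::uint32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

// Build a complete write request: handle, file offset, data.
bytes make_write(std::uint32_t request_id, const std::string& handle,
                 std::uint64_t offset, const std::string& data)
{
  bytes body;
  body.push_back(SSH_FXP_WRITE);
  put_u32(body, request_id);
  put_string(body, handle);
  put_u64(body, offset);
  put_string(body, data);

  // The leading length counts everything after the length field.
  bytes packet;
  put_u32(packet, static_cast<std::uint32_t>(body.size()));
  packet.insert(packet.end(), body.begin(), body.end());
  return packet;
}
```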

Another benefit of this protocol is that it's designed to be easy
to write custom extensions (SSH_FXP_EXTENDED), version those
extensions, and negotiate what extensions are available when you start
up a connection.  This means that it will be simple to add additional
commands which schroot needs (list chroots, list sessions, run command,
start session, stop session etc.) and this would be extensible.
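
A schroot-specific command could then be carried in an
SSH_FXP_EXTENDED packet, which holds a request-id, the extension
name as a string, and extension-specific data.  The sketch below
builds such a packet.  The extension name
"list-chroots@schroot.example" and the helper names are purely
hypothetical; nothing like this has been defined yet.

```cpp
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<unsigned char> bytes;

// Packet type number for an extended request, as given in the draft.
const unsigned char SSH_FXP_EXTENDED = 200;

// Append a big-endian 32-bit integer.
void put_u32(bytes& out, std::uint32_t v)
{
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<unsigned char>((v >> shift) & 0xff));
}

// Append a length-prefixed string.
void put_string(bytes& out, const std::string& s)
{
  put_u32(out, static_cast<std::uint32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

// Build an extended request carrying an opaque payload.
bytes make_extended(std::uint32_t request_id, const std::string& name,
                    const bytes& payload)
{
  bytes body;
  body.push_back(SSH_FXP_EXTENDED);
  put_u32(body, request_id);
  put_string(body, name);
  body.insert(body.end(), payload.begin(), payload.end());

  // The leading length counts everything after the length field.
  bytes packet;
  put_u32(packet, static_cast<std::uint32_t>(body.size()));
  packet.insert(packet.end(), body.begin(), body.end());
  return packet;
}

// Hypothetical schroot command with no arguments.
bytes make_list_chroots(std::uint32_t request_id)
{
  return make_extended(request_id, "list-chroots@schroot.example", bytes());
}
```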

I've not started any real work on this yet; there's a couple of commits
here (sftp/), but they are awful--just me writing down my thoughts as a
set of skeleton headers.

  http://git.debian.org/?p=users/rleigh/schroot.git;a=shortlog;h=refs/heads/client-server-protocol

This is obviously quite a large task (though the protocol itself is
pretty simple, the main work will just be object serialisation logic).
The sftp in openssh is just a couple of hundred lines of C.

There are some other perhaps not so obvious benefits:

1) It gives us a VM-independent means of copying files into the VM,
   whether or not we have direct access.  Setup scripts will be able
   to use a (schroot-provided) mechanism to copy files in and out,
   and this mechanism will also be available to users for scripting.
2) Since we implement a superset of the SFTP protocol, you can
   theoretically FUSE mount your virtual environment on the host system
   independently of the means of the VM implementation.  Very cool!

Once this is tested and debugged, there's also some other possible
uses.  If you think about how sbuild works, you can consider it to
be similar to a mail or print server (for example).  You submit
a job, it queues it, schedules the build, and then you get a reply.
Like schroot, it's something that could be implemented very nicely as
a server, and using this protocol it can even stream back the build
logs if monitoring interactively as well as be used to transfer the
source packages and built binaries in and out of the build environment.
It would also make sbuild no longer require *any* root access for the
user, either inside or outside the build environment (you'd just give
access to an sbuild system user).  Using it for buildd interaction is
also a fairly obvious extension.

I'm not actually intending to do this just yet; I just
wanted to throw down my thoughts to let you know what I'm thinking.
Any comments welcome!

Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.