[buildd-tools-devel] Some schroot (union) fixes

Mon Aug 3 12:05:45 UTC 2009

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 1 Aug 2009, Roger Leigh wrote:

> I've applied most of the patches now.  Could you please check if this is
> working correctly for you.
>
> I've detailed which were applied.  Some were not applied, and I've given
> the reasons why.  I'm particularly interested in some more information
> about patch 3, and if patch 6 is still OK.  Some were not applied
> because they were superceded by changes during the facet migration.

I just rebased my patch stack against master. I had to revert
"sbuild::chroot_facet_session: Move deserialisation code from
chroot_facet_source" (45364d48302ab5a4f3152159955c217306146a8c).

I fixed the compile error it introduced (chroot_facet_source ->
chroot_facet_session ), but I got deprecation errors on "schroot -l -a"
for both of my union chroots. Since I'm not sure about the real intention
of the change, I'll leave the real fix to you. Reverting works for me.

Which leaves me with three simple patches pending against current tip,
except old Nr. 3. I'll send them afterwards to the list.

> [PATCH 03/11] [sbuild_session] Don't rollback non-run scripts
>
> Not applied.  I still have significant issues with this patch, as I've
> mentioned previously:
>
> * Why do you need to use a child process to run the scripts when this
>   can all be done directly?  We have:

>   main process
>   â””â”€â”€ run_parts child
>       â”œâ”€â”€ script1
>       â”œâ”€â”€ script2
>       â””â”€â”€ script3
>
>   where this is entirely sufficient:
>
>   main process
>   â”œâ”€â”€ script1
>   â”œâ”€â”€ script2
>   â””â”€â”€ script3
>
>   I really don't understand what this extra complexity gains us.

The child code is not mine. It's the main reason, why the failed script
handling got a little bit more complex, involving pipe communication.

> * Why is it important to roll back the scripts from a defined point?
>
>   Since the setup scripts may fail *during setup-stop* at any point,
>   we don't know the failure point from one schroot invocation to the
>   next.  As a result, it is a design choice that all setup scripts
>   must be idempotent.  That is, they must be capable of repeated
>   invocation during the setup-stop stage, and function correctly all
>   the time.

As you can't "undo" a stop, there is no cleanup run, which could stop at
the failed script. Same for recovery and begin. It's only usefull with
automatic chroots.

>   Again, I don't understand what this gains us.

With a negative intention: it papers over a bug with non-union loopback
chroot. I don't see the point in stoping a script which didn't even start.

1. Start a session "chroot -c loop" -> ok (id1)
2. Open a second terminal
3. Start a second session "chroot -c loop" -> fails and kills session id1

As it's written in the comment to the patch, I guess it's an error in the
15killprocs script, which kills the first sessions processes.

So without the patch, the whole "stop" stack is run happily killing the
first session. With the patch, the first session is preserved and the
second fails.

> I'm not opposed to the idea of the patch, but I don't like the current
> implementation.  There's no reason why the class can't store a vector
> containing each command and its exit status and allow the user to
> query that and get the name of the failing command (or of all
> commands).

This was never the intention. I just wanted to prevent running non-started
stop scripts.

> Also, bear in mind run_parts is an implementation of the run-parts(8)
> interface.  I'm happy to add additional functionality to add more of
> the real run-parts(8) functionality, but if it deviates from what
> run-parts offers, I want a really good reason (and use case) for it.

For me it's not a new use case, but a bugfix.

The patch just makes sense for automatic sessions, as this is the only
operation, which runs more then one setup_type via setup_chroot. All other
operations don't have a failed state.

The code is still compatible with run-parts(8); it just offers to run from
a defined offset in the script directory, which run-parts(8) does not.

Probably I'll recheck the bug and try to fix the real error.

Regards,
Jan-Marek
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFKdtKbj6MK58wZA3cRAgxEAJ9vcWnhlBL33G0zWiF+LqD+2x6pbQCgi2Lr
UoPB8eBun1KrwQHRr8nXpd0=
=tsrO
-----END PGP SIGNATURE-----