[Soc-coordination] update-manager progress report week 6

Stephan Peijnik debian at sp.or.at
Fri Jul 3 14:54:34 UTC 2009


The last two weeks were quite productive for me, and I was able to get a
few important things done (full listing with more details available at
[0]):

* Checking for updates / downloading package lists
* Major rewrite of the IPC mechanisms (replaced callback functions by
handler classes)
* A lot of documentation and unit tests
* Ubuntu specific code: changelog fetching
* Changelog fetching and presentation in general
* Dynamic selection of frontend, backend and distribution specific
module via command line switches
* Automatic distribution detection
* Some UI changes/refinements
* Debian packaging updates so version 0.200.0~pre0 can be built
* Some work on threading issues in python-apt

As you can see this list got rather long, and I haven't listed every
single bit of work I've done. However, I'd like to give you some insight
into the things that I consider most important (and interesting) right
now.

What stalled my work the most was downloading package lists in
combination with a rather serious python-apt bug (or maybe we should
call that a missing feature).

After reworking the IPC mechanisms I decided to use separate worker
threads to do the python-apt work. My first try failed horribly, as I
wasn't aware that Gtk is anything but thread-safe. So I read a lot of
more or less useful tutorials and howtos on the net and got cache
reloading to work using gtk.gdk.threads_enter and gtk.gdk.threads_leave.

These two little functions must be called around any code that modifies
the Gtk UI from a thread other than the main thread (the one which
called gtk.main in the first place). They acquire and release a global
lock, so that the UI gets updated correctly.

This method worked fine with cache reloading, but didn't work at all
with downloading package lists. The UI would freeze and only get updated
when either the background thread finished or some other unidentifiable
random event occurred.

At first I thought it was my fault and some of my code using
threads_enter and threads_leave was flawed, so I decided to take another
approach.
glib's idle_add function (available through gobject.idle_add in Python)
lets one inject a function call into the Gtk main thread, so it gets
executed when the main loop comes around to do so.
Now again, I wasn't having any luck with that method and I soon felt
that something else must be wrong, but still I tried another approach.
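
To illustrate the idle_add idea, here is a minimal sketch in plain
Python. It does not use Gtk at all: MainLoop is a hypothetical stand-in
for the Gtk main loop, and download_lists and update_progress are
made-up names, not update-manager code. The point is only that the
worker thread queues calls and the main thread executes them.

```python
import queue
import threading


class MainLoop:
    """Tiny stand-in for the Gtk main loop (NOT the real gobject API)."""

    def __init__(self):
        self._pending = queue.Queue()

    def idle_add(self, func, *args):
        # Safe to call from any thread: the call is only queued here.
        self._pending.put((func, args))

    def quit(self):
        self._pending.put(None)

    def run(self):
        # Executes queued calls one by one in the thread that called run().
        while True:
            item = self._pending.get()
            if item is None:
                break
            func, args = item
            func(*args)


def update_progress(log, percent):
    # Stand-in for a UI update; records which thread executed it.
    log.append((percent, threading.current_thread().name))


def download_lists(loop, log):
    # Worker thread: never touches the "UI" (log) directly.
    for percent in (0, 50, 100):
        loop.idle_add(update_progress, log, percent)
    loop.idle_add(loop.quit)


def run_demo():
    loop = MainLoop()
    log = []
    worker = threading.Thread(target=download_lists, args=(loop, log))
    worker.start()
    loop.run()
    worker.join()
    return log, threading.current_thread().name
```

Note that with the real gobject.idle_add the callback is called again
for as long as it returns True, so one-shot callbacks should return
False (or nothing).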

This third approach (which I left in the code, because it looks cleaner
than the others to me) causes the handler class to emit a glib/gobject
signal, using idle_add to make sure the signal is emitted in gtk.main's
thread and works correctly. Guess what, I still didn't have any luck and
started profiling update-manager. No luck with that either as no
functions showed up that took long enough.

So I started digging around on the net and (because I suspected
python-apt to be the bad guy) in the python-apt code base and soon
discovered that the working cache reloading code called
Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS every now and then.
After some catching up with Python's documentation on C APIs it became
clear what was going wrong. Python contains a global interpreter lock
(GIL) that only allows a single thread to run at a time, unless the
running thread explicitly notes that it won't do anything that could
harm another thread or the global interpreter state.
Python's own C code makes extensive use of these two macros to allow
other threads to run, but python-apt's code (especially progress.cc) did
not.
So, after a discussion with my mentor Michael (we actually found out
what was causing the problem together) I decided to give my approach a
try, allowing other threads to work as long as python-apt doesn't invoke
any callbacks or change object attributes.

Long story short: this fixed my problem and the UI updated smoothly
again, whilst python-apt was doing its job in the background, but
that cost me about a week of work-time.
The patch fixing this problem has been incorporated into Michael's
python-apt repository at [1] now.
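
The effect of releasing the GIL can be observed from pure Python. The
following is only a sketch with made-up names (fetch_lists,
count_ui_updates), not the python-apt patch itself: time.sleep stands in
for a long-running C call that releases the GIL while it waits, which is
what C code between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS
does.

```python
import threading
import time


def fetch_lists(done):
    # Stand-in for a long-running python-apt call. time.sleep() releases
    # the GIL while waiting, so other threads can run in the meantime.
    time.sleep(0.2)
    done.set()


def count_ui_updates():
    done = threading.Event()
    worker = threading.Thread(target=fetch_lists, args=(done,))
    worker.start()
    updates = 0
    while not done.is_set():
        updates += 1      # stand-in for one main loop iteration
        time.sleep(0.01)
    worker.join()
    return updates
```

If the worker held the GIL for the whole call, the loop in
count_ui_updates could not make progress until the worker was done,
which matches the frozen UI described above.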


The next thing I would like to write about is the newly added dynamic
distribution detection code and dynamic loading of frontend, backend and
distribution-specific modules.

Basically this idea came up during a chat with Michael. The result is
that these two methods now work together. Dynamic loading of frontend,
backend and dist-specific modules allows using implementations other
than the default ones, whilst an additional "Auto" distribution module
(which is update-manager's default) detects the system's distribution
using lsb_release and loads the correct module accordingly.

There is hardly any magic in this code; it is quite plain and
straightforward.
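
A rough sketch of how such detection and loading can look in Python is
shown below. The function names and the DIST_MODULES mapping are
assumptions made up for illustration (the module names in it are
placeholders, not update-manager's real modules); only the lsb_release
call and importlib are real.

```python
import importlib
import subprocess

# Hypothetical mapping from distributor ID to an importable module name.
DIST_MODULES = {
    "Debian": "dist_debian",
    "Ubuntu": "dist_ubuntu",
}


def detect_distribution(run=subprocess.check_output):
    """Return the distributor ID printed by lsb_release, e.g. 'Debian'."""
    output = run(["lsb_release", "-i", "-s"])
    return output.decode().strip()


def load_dist_module(name=None, modules=DIST_MODULES,
                     detect=detect_distribution):
    """Import the module for `name`; auto-detect when name is None."""
    if name is None:
        name = detect()
    try:
        module_name = modules[name]
    except KeyError:
        raise ValueError("unsupported distribution: %r" % name)
    return importlib.import_module(module_name)
```

Passing a name explicitly corresponds to selecting a module via a
command line switch, whereas leaving it out corresponds to the "Auto"
behaviour.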

Finally, I would like to go into depth on the new IPC mechanism [2][3].
According to my original design I created a single callback function per
action (cache reloading, update checking) that would take an Enum as its
first parameter, defining the state of the operation.
Even though this looked quite nice in the beginning (especially because
I was quite fond of my Enum implementation :)) it turned out to be quite
complex.
So, at one point, I started replacing all callback functions with
handler interfaces and their implementations. Now a single operation,
such as checking for updates, has multiple handler methods it invokes,
for example "downloading started", "item updated", "item finished",
"downloading failed" and so on. This not only makes the code less
complex, but also easier to read and understand.
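
The handler idea can be sketched as follows. All class, method and
function names here are hypothetical and chosen for illustration; the
actual interfaces are the ones documented at [2] and [3].

```python
class DownloadHandler:
    """Hypothetical handler interface: one method per event."""

    def download_started(self):
        pass

    def item_finished(self, uri):
        pass

    def download_failed(self, uri, reason):
        pass

    def download_finished(self):
        pass


class RecordingHandler(DownloadHandler):
    """Example implementation that only records the events it sees."""

    def __init__(self):
        self.events = []

    def download_started(self):
        self.events.append("started")

    def item_finished(self, uri):
        self.events.append("finished " + uri)

    def download_failed(self, uri, reason):
        self.events.append("failed " + uri)


def download_lists(uris, handler, fetch):
    """Fetch every URI and report progress through the handler."""
    handler.download_started()
    for uri in uris:
        try:
            fetch(uri)
        except IOError as err:
            handler.download_failed(uri, str(err))
            return False
        handler.item_finished(uri)
    handler.download_finished()
    return True
```

A frontend only overrides the methods it cares about, instead of
switching on a state value inside one big callback.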

My TODO list for the next week is:

* Support for downloading and installing of updates
* Checking that everything is documented
* Even more unit tests
* Pylint checking
* If time permits and everything else works correctly: working on an
aptdaemon backend

Finishing the first item on my list should give me usable update-manager
packages that could go into Debian experimental rather sooner than
later, to get some people to actually test the code.

NOTE:

Even though you can roll your own packages from update-manager's sources
already, this code depends on python-apt >= 0.7.10.4, which has not been
released yet, so don't forget to fetch that one from its repository [1]
and build and install it beforehand.
I recommend doing this upgrade *only* on development systems, as the
code has not seen extensive testing yet and (especially the python-apt
upgrade) might break other packages.
If you feel like testing I would welcome you to do so, but don't expect
everything to be working correctly and please send me bug reports,
including the output of "update-manager -d".

I guess that's it for this week.

-- Stephan


[0] http://blog.peijnik.at/2009/07/02/update-manager-weekly-update-5/
[1] https://code.launchpad.net/~mvo/python-apt/python-apt--mvo
[2] http://update-manager.alioth.debian.org/doc/current/api/UpdateManager/Backend/index.html#UpdateManager.Backend.CacheProgressHandler
[3] http://update-manager.alioth.debian.org/doc/current/api/UpdateManager/Backend/index.html#UpdateManager.Backend.ListProgressHandler
[4] http://update-manager.alioth.debian.org/doc/current/api/UpdateManager/Backend/index.html#UpdateManager.Backend.CommitProgressHandler



