[Soc-coordination] Report 2 - PyPI to Debian repository converter

Natalia Frydrych natalia.frydrych at gmail.com
Sun Jun 17 22:04:53 UTC 2012


Hello,

This is my second progress report on the project PyPI to Debian
Repository Converter, mentored by Piotr Ożarowski.

Work:
-----
Over the past two weeks I’ve worked mainly on planning the overall
algorithm of my program, designing the plugin API in detail and making
the first attempts to implement it, adjusting the previously written
PyPI-related functions, and improving the logging mechanism.

During discussions with my mentor about the algorithm of the program, I
collected our joint decisions and his guidance in a sketch (because it
was easier and faster for me). Unfortunately, it isn’t a UML
description, only an illustrative diagram made in Google Docs, but I
think it may be helpful for those interested, so I enclose it[1].

As mentioned in my last report, source tarballs are successfully
downloaded from PyPI. The next step was to change the names of packages
to comply with Debian policy[2]. If a name does not comply, it is
changed using a regular expression (e.g. “Python Bytecode Verifier”
becomes “python-bytecode-verifier”). Another regular expression tries to
translate version numbers into ones that dpkg --compare-versions orders
correctly (e.g. “1alpha” is renamed to “1~alpha”).
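
The two regular expressions themselves are not shown in this report. A
minimal Python sketch of what such rules might look like follows; the
function names and the exact patterns are assumptions made for
illustration, not the code from the repository.

```python
import re


def debianize_name(name):
    """Hypothetical helper: make a PyPI name look like a Debian package name."""
    # Lowercase, then collapse every run of characters outside the set
    # allowed in Debian package names (a-z, 0-9, '+', '-', '.') into '-'.
    name = re.sub(r"[^a-z0-9+.-]+", "-", name.lower())
    # Drop any '-' left over at either end.
    return name.strip("-")


# Assumed list of pre-release markers; a real tool may need more or fewer.
_PRERELEASE = re.compile(
    r"(?<=\d)[-_.]?(alpha|beta|rc|dev|pre|a|b|c)(?=\d|$)", re.IGNORECASE)


def debianize_version(version):
    """Hypothetical helper: put '~' before a pre-release marker so that
    dpkg sorts the pre-release before the final release."""
    return _PRERELEASE.sub(r"~\1", version)
```

With these patterns, debianize_name("Python Bytecode Verifier") gives
"python-bytecode-verifier" and debianize_version("1alpha") gives
"1~alpha", matching the two examples above.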

At this point, I have also decided to repack zip packages to tar.gz and
to compress (previously uncompressed) tar archives with gzip.
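
As a rough illustration of the repacking step, the sketch below uses
only the Python standard library. The function names are hypothetical
and this is not necessarily how the tool does it.

```python
import gzip
import os
import shutil
import tarfile
import tempfile
import zipfile


def zip_to_tar_gz(zip_path, tar_gz_path):
    """Repack a zip archive as a gzip-compressed tarball."""
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack the zip into a scratch directory ...
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        # ... and add its top-level entries to a new tar.gz.
        with tarfile.open(tar_gz_path, "w:gz") as tarball:
            for entry in sorted(os.listdir(tmp)):
                tarball.add(os.path.join(tmp, entry), arcname=entry)


def gzip_tar(tar_path, tar_gz_path):
    """Compress an existing uncompressed tar archive with gzip."""
    with open(tar_path, "rb") as src, gzip.open(tar_gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Note that zipfile does not restore Unix permission bits on extraction,
so a real implementation may have to handle executable files
separately.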

After renaming the tarballs to the desired form (that is, a format as
close as possible to e.g. “lxml_2.3.4.orig.tar.gz”), I copy them from
the “tarballs” directory to the “builds” directory, where I unpack them
and finally have access to the setup.py file.
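
The staging step could be sketched as below. The helper names are
hypothetical, and the code assumes the tarball unpacks into a single
top-level directory, which is typical for source distributions but not
guaranteed.

```python
import os
import shutil
import tarfile


def orig_tarball_name(name, version):
    """Build a name of the form <name>_<version>.orig.tar.gz."""
    return "%s_%s.orig.tar.gz" % (name, version)


def stage_for_build(tarball, name, version, builds_dir="builds"):
    """Copy a tar.gz into builds_dir, unpack it there and return the
    directory that contains setup.py (or None if there is none)."""
    os.makedirs(builds_dir, exist_ok=True)
    staged = os.path.join(builds_dir, orig_tarball_name(name, version))
    shutil.copy2(tarball, staged)
    with tarfile.open(staged, "r:gz") as archive:
        # Assumption: the first member lives under the single top-level dir.
        top_level = archive.getnames()[0].split("/")[0]
        # Use the safer "data" extraction filter where Python provides it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        archive.extractall(builds_dir, **kwargs)
    source_dir = os.path.join(builds_dir, top_level)
    if os.path.isfile(os.path.join(source_dir, "setup.py")):
        return source_dir
    return None
```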

Plugins:
--------
The next very important step of the program is to run the plugin that
converts the package. I’ve decided to use a plugin system so that my
program can be easily extended in the future. Plugins should know the
path to the dir with setup.py and the name and version of the package
(so that’s what I pass to the constructor). I’ve limited the role of
conversion plugins to generating the debian/ directory: although
converters typically try to build a source or binary package, it seems
to me better to let plugins create the debian/ directory only. I’ve
designed a base class, which includes these methods:
* __init__: constructor, takes as input the path to the dir with the
setup.py file (usually the dir created by extracting the tarball) and
the name and version of the package;
* configure: called before conversion to let plugins initialize,
configure, etc. (e.g. the stdeb plugin can create its config file at
this stage); it should return True if the plugin was successfully
initialized/configured;
* prepare_command: called by the “run” method, returns a string
containing the command that “run” uses to start the conversion;
* run: main method, converts the package. It should return a
(True/False, log from stdout, log from stderr) tuple, where the first
element is True if the conversion was successful. Most plugins should
not override this method (and should use prepare_command instead);
* overrides: called after run, it can be used to fix/improve the files
generated by the plugin in the debian/ directory.
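
The method list above could translate into a base class along these
lines. This is a sketch written for a current Python 3, not the class
from the repository: the class names, the use of subprocess and the
demo command are all assumptions.

```python
import shlex
import subprocess
import sys


class ConversionPluginBase:
    """Sketch of a conversion plugin base class (names are assumptions)."""

    def __init__(self, path, name, version):
        self.path = path        # dir that contains setup.py
        self.name = name        # package name
        self.version = version  # package version

    def configure(self):
        """Called before conversion; return True if the plugin is ready."""
        return True

    def prepare_command(self):
        """Return the command line (a string) that run() will execute."""
        raise NotImplementedError

    def run(self):
        """Run the command; return (success, stdout log, stderr log)."""
        proc = subprocess.run(shlex.split(self.prepare_command()),
                              cwd=self.path, capture_output=True, text=True)
        return proc.returncode == 0, proc.stdout, proc.stderr

    def overrides(self):
        """Called after run() to fix up the files generated in debian/."""


class DemoPlugin(ConversionPluginBase):
    """Hypothetical plugin whose 'conversion' only prints a number."""

    def prepare_command(self):
        return "%s -c %s" % (shlex.quote(sys.executable),
                             shlex.quote("print(42)"))
```

A concrete plugin then only has to supply prepare_command, as DemoPlugin
does; calling DemoPlugin(".", "demo", "1.0").run() returns a tuple whose
first element is True and whose stdout log contains "42".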

At this point I stopped for a moment for further planning and to check
how plugins would work in practice. Initial implementations of the first
two conversion plugins are ready: stdeb[3] and pkgme[4]. I then wrote
the plugin management function, laid the foundations of the
BuildPluginBase class and implemented the dpkg[5] build plugin.
The feeling when, for the first time, debian/ directories began to
appear one after another and binary packages started to build -
priceless :-)

Of course, this solution is still underdeveloped and requires a lot of
work. At the moment the conversion plugins I use work on packages that
support Python 2, and I have to adjust the converters to support
Python 3 packaging[6] (which is one of the announced goals of my
project). Let me mention here that while implementing the pkgme plugin
I’ve already fixed one bug[7] and received a warm response after filing
a new feature request[8]. As I’ve noted above, having plugins create
only the debian/ dir will be a big simplification for me.

When it comes to build plugins, I plan to implement, in addition to
dpkg-buildpackage, among others pbuilder[9] and sbuild[10].

In its current form, the algorithm looks as follows: when conversion by
the plugin with the highest priority doesn’t work, the program tries
the one with the next lower priority. I’m also going to introduce
hooks, which will improve the efficiency of plugins (e.g. the pkgme
plugin author can also prepare stdeb hooks that improve build dependency
detection using pkgme’s internals). When conversion of a package has
finished successfully (that is, the debian/ dir has been created), a
build plugin (also chosen according to priority) is started.
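
The fallback logic described above can be sketched as follows. The
"priority" attribute, the convention that a larger number means higher
priority, the assumption that build plugins return the same kind of
tuple from run(), and the stub class are all made up for the example.

```python
class StubPlugin:
    """Hypothetical stand-in for a real conversion or build plugin."""

    def __init__(self, name, priority, succeeds):
        self.name = name
        self.priority = priority
        self.succeeds = succeeds

    def run(self):
        # Same shape as the plugin API: (success, stdout log, stderr log).
        if self.succeeds:
            return True, "%s ok" % self.name, ""
        return False, "", "%s failed" % self.name


def run_by_priority(plugins):
    """Try plugins from highest to lowest priority and return the first
    one whose run() reports success, or None if all of them fail."""
    for plugin in sorted(plugins, key=lambda p: p.priority, reverse=True):
        ok, _stdout, _stderr = plugin.run()
        if ok:
            return plugin
    return None


def convert_and_build(converters, builders):
    """Start a build plugin only after some conversion plugin succeeded."""
    converter = run_by_priority(converters)
    if converter is None:
        return None, None
    return converter, run_by_priority(builders)
```

Hooks are left out of this sketch, since their design is not described
yet.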

I’ve started designing the database structure to store the most
important information obtained from plugins (stdout and stderr logs,
return code, test results, etc.)
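
Since the design is not finished, the following is only one possible
shape for such a table, sketched with the sqlite3 module from the
Python standard library. The table name, the columns and the helper
function are assumptions; test results are left out.

```python
import sqlite3

# Hypothetical schema: one row per plugin run on one package version.
SCHEMA = """
CREATE TABLE IF NOT EXISTS plugin_runs (
    id          INTEGER PRIMARY KEY,
    package     TEXT NOT NULL,
    version     TEXT NOT NULL,
    plugin      TEXT NOT NULL,
    return_code INTEGER,
    stdout      TEXT,
    stderr      TEXT
)
"""


def record_run(conn, package, version, plugin, return_code, stdout, stderr):
    """Store the outcome of one plugin run."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO plugin_runs "
            "(package, version, plugin, return_code, stdout, stderr) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (package, version, plugin, return_code, stdout, stderr))
```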

Summary:
--------
It was a really nice moment when my tool became able to convert and
build a lot of packages, but now I have to focus on improving the
quality and performance of each component.

Plans:
------
In the next few days the most important task is to design the detailed
structure of the database and to write the ORM corresponding to it.
Next, based on information from the database, I want to code the
appropriate behavior for each possible state of a package. Then I'll
try to improve many components of my tool (which, for now, have basic
functionality only). I also plan to implement test/validation plugins
(e.g. lintian[11], lintian4py[12], unit tests, ...) and a few new
command-line options, such as: --skip-existing (to skip packages
already present in Debian), --distro (to change the default
distribution in generated packages), --package (to convert only
selected packages, or a specific version if given) and --clean (to
clean up temporary files).
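
The planned options could be declared with argparse roughly as below.
The option names come from the list above; the program name, the help
texts and the NAME==VERSION form for --package are assumptions.

```python
import argparse


def build_parser():
    """Sketch of a parser for the planned command-line options."""
    parser = argparse.ArgumentParser(prog="pypi2deb")  # prog name assumed
    parser.add_argument("--skip-existing", action="store_true",
                        help="skip packages already present in Debian")
    parser.add_argument("--distro", metavar="NAME",
                        help="distribution to use in generated packages")
    # Assumed syntax: NAME or NAME==VERSION; the option may be repeated.
    parser.add_argument("--package", action="append", default=[],
                        metavar="SPEC",
                        help="convert only this package (NAME or "
                             "NAME==VERSION)")
    parser.add_argument("--clean", action="store_true",
                        help="clean up temporary files")
    return parser
```

For example, parsing ["--skip-existing", "--package", "lxml==2.3.4"]
sets skip_existing to True and package to ["lxml==2.3.4"].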

My repository can be followed at: https://gitorious.org/pypi2deb

--------
[1] http://img513.imageshack.us/img513/3820/packaging.png
[2] http://www.debian.org/doc/debian-policy/ch-binary.html
[3] https://github.com/astraw/stdeb
[4] https://launchpad.net/pkgme
[5] http://www.debian.org/doc/manuals/debian-faq/ch-pkgtools.en.html
[6] http://wiki.debian.org/Python/Packaging
[7] https://bugs.launchpad.net/pkgme/+bug/1010411
[8] https://bugs.launchpad.net/pkgme/+bug/1010419
[9] http://pbuilder.alioth.debian.org/
[10] http://packages.debian.org/sid/sbuild
[11] http://lintian.debian.org/
[12] http://jwilk.net/software/lintian4python



