[Teammetrics-discuss] Phase I: Statistics for mailing lists on Alioth

Sukhbir Singh sukhbir.in at gmail.com
Tue May 31 09:54:26 UTC 2011


Hi!

Please update your local 'teammetrics' Git repository as I have pushed
some new changes.

Here is what works now:

+ You can create a configuration file and specify the lists to be
parsed. Run the script and it will guide you on how to create the
file.
+ The script will download mbox archives automatically from the lists
specified. You only need to specify the name of the list and it will
handle everything on its own; it will fetch the months the list was
active for, download the gzip archives and extract the mbox files.
+ Once we have the mbox files ready, we parse them to get the frequency.
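
For reference, the parsing step in the last point could look roughly
like the sketch below. It is not the code in the repository: the
function name count_posts is hypothetical, and it assumes "frequency"
means the number of messages per sender per month. It also assumes
ordinary From headers; if an archive obscures addresses as
"user at host", they would need to be normalized first.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr, parsedate_to_datetime


def count_posts(mbox_path):
    """Count messages per (year-month, sender address) in one mbox file.

    Hypothetical helper: the grouping key is an assumption, not
    something the original script is known to use.
    """
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # parseaddr returns ('', '') when the header is missing.
            _, address = parseaddr(message.get("From", ""))
            try:
                when = parsedate_to_datetime(message.get("Date", ""))
            except (TypeError, ValueError):
                continue  # skip messages without a usable Date header
            if not address:
                continue  # skip messages without a usable sender
            counts[(when.strftime("%Y-%m"), address.lower())] += 1
    finally:
        box.close()
    return counts
```

Lowercasing the address keeps "Alice@Example.org" and
"alice@example.org" from being counted as two different posters.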

So we now get the frequency from a standardized mbox format, as we
wanted. This completes about 80% of this phase. And the best part
about this script is that it can be used for any mailing list that
runs on GNU Mailman.
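
To illustrate why this carries over to other Mailman lists: Mailman's
pipermail archives are published per month under a predictable name
such as 2011-May.txt.gz. The sketch below builds such a URL and
unpacks a downloaded archive. The function names and the base URL
passed in are assumptions for illustration, and the real script finds
the active months on its own rather than taking them as arguments.

```python
import gzip
import shutil

MONTHS = ("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November",
          "December")


def archive_url(base_url, list_name, year, month):
    """Build the URL of one monthly pipermail archive.

    base_url is an assumption supplied by the caller, for example the
    host that serves the list archives.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return "{}/pipermail/{}/{}-{}.txt.gz".format(
        base_url.rstrip("/"), list_name, year, MONTHS[month - 1])


def extract_archive(gz_path, mbox_path):
    """Decompress a downloaded .txt.gz archive into a plain mbox file."""
    with gzip.open(gz_path, "rb") as src, open(mbox_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return mbox_path
```

Streaming with copyfileobj avoids loading a whole month of mail into
memory at once.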

What is left:

- Calculating the MD5 or SHA1 checksums of the lists so that they are
not downloaded again.
- Pushing this information into a database.
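
A minimal sketch of the checksum bookkeeping, under stated
assumptions: digests are kept in a small JSON file, keyed by archive
file name, and SHA-1 is used. The function names and the JSON store
are hypothetical. The sketch only detects whether a local archive
differs from what was recorded last time; how that decision is used to
avoid a download is left open.

```python
import hashlib
import json
import os


def file_sha1(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, store_path):
    """Compare a file against its recorded digest.

    Returns True if the digest matches the one in the JSON store.
    Otherwise records the new digest and returns False.
    """
    known = {}
    if os.path.exists(store_path):
        with open(store_path) as handle:
            known = json.load(handle)
    name = os.path.basename(path)
    current = file_sha1(path)
    if known.get(name) == current:
        return True
    known[name] = current
    with open(store_path, "w") as handle:
        json.dump(known, handle, indent=2, sort_keys=True)
    return False
```

Only the archive for the current month is expected to change between
runs, so most files would report as unchanged.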

But before that, I need you to have a look at this thoroughly and see
if any changes are to be made. You can test it out with the Alioth
lists and please feel free to suggest anything. The code is such that
making changes is easy so don't worry about it.

Note that when the downloading of mbox archives takes place, it will
just say 'Downloading...' and show no progress bar. If you want an
indication of progress, please let me know. If you want to keep it
simple, we can probably leave it like this or output the progress to a
log file (perhaps?).
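
If the log file route is chosen, one possible shape is sketched below.
It assumes the download goes through urllib.request.urlretrieve, whose
reporthook argument is called with the block count, block size and
total size. The name make_progress_hook and the log format are
hypothetical, and the script may well download differently.

```python
import logging


def make_progress_hook(logger, name, step=25):
    """Return a urlretrieve-style hook that logs progress in steps.

    A line is logged each time the percentage reaches the next
    multiple of `step`, so the log stays short.
    """
    state = {"next": 0}

    def hook(block_count, block_size, total_size):
        if total_size <= 0:
            return  # the server did not report a size
        percent = min(100, block_count * block_size * 100 // total_size)
        if percent >= state["next"]:
            logger.info("%s: %d%%", name, percent)
            state["next"] = percent - percent % step + step

    return hook


# Possible usage (not executed here; url and target are placeholders):
#   logging.basicConfig(filename="download.log", level=logging.INFO)
#   hook = make_progress_hook(logging.getLogger("download"), target)
#   urllib.request.urlretrieve(url, target, reporthook=hook)
```

The terminal output can then stay at a plain 'Downloading...' while
the details go to the log.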

Comments on the code are most welcome, and I intend to clean up a
little once you give the green signal about the working/design.

I hope you like it!

--
Sukhbir.

Repository: https://alioth.debian.org/scm/browser.php?group_id=100628


