[Teammetrics-discuss] Help with Git statistics

Andreas Tille andreas at an3as.eu
Mon Jul 18 16:21:42 UTC 2011


[Charles and Manuel in CC because I consider them Git experts of Debian
 Med team]

Hi,

I will try to summarise a problem we are facing with Git repositories.  One
part of our team metrics GSoC project is to measure the commitment of
a Debian Med team member to the actual packaging.  We now have people
who are upstream as well, and by the nature of Git their upstream commits
end up in the same repository as the packaging-related code.  Our first
approach was to ignore those committers who are not users of Alioth (and
thus cannot commit to the clone at Alioth) and are probably not Debian Med
team members.

Sukhbir Singh wrote in PM:

SS> I did some testing with the 'getent passwd' method suggested by Alioth
SS> admins. Here are the results using the 'debian-med' Git repository.
SS> 
SS> 1. If we use their approach, we get the names of the users and then
SS> filter the results -- if the name is in 'getent passwd', save to
SS> database, else skip.
SS> 
SS>           name          | sum
SS> ------------------------+------
SS>  Andreas Hildebrandt    | 1103
SS>  Charles Plessy         |  698
SS>  Aaron M. Ucko          |  283
SS>  Shaun Jackman          |  157
SS>  Michael Hanke          |  128
SS>  Rafael Laboissiere     |  117
SS>  Andreas Tille          |   26

This looks familiar with the exception of Andreas Hildebrandt.  He is
upstream of the ballview package and does a small portion of the
packaging.  So counting only users on Alioth does not give us correct
numbers for Debian packaging work: he is over-represented in the stats.
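
As a rough illustration of the filter described above, the following
Python sketch keeps only those commit authors whose name appears in the
full-name (GECOS) field of 'getent passwd' output.  Matching on the
GECOS full name is an assumption, and the helper names gecos_names and
filter_known_authors are hypothetical; this is not the actual
teammetrics code.

```python
def gecos_names(passwd_text):
    """Collect full names from `getent passwd` style output.

    Each line is expected as name:passwd:uid:gid:gecos:home:shell.
    Assumption: the full name is the first comma-separated GECOS subfield.
    """
    names = set()
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        full_name = fields[4].split(",")[0].strip()
        if full_name:
            names.add(full_name)
    return names


def filter_known_authors(commit_counts, passwd_text):
    """Keep only authors whose name is listed in the passwd output."""
    known = gecos_names(passwd_text)
    return {name: n for name, n in commit_counts.items() if name in known}
```

Such a filter can only say whether an author has an account.  It cannot
tell packaging commits from upstream commits made by the same person,
which is exactly the problem seen here.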

To prove my point I wrote:

AT> who without a question worked on the ballview package (see the changelog
AT> of this package) but I doubt that anybody has more commits than Charles
AT> Plessy who is our busiest committer.  I'm basing my statement on the
AT> fact that Andreas Hildebrandt did not show up in the statistics I
AT> based on the postings to the debian-med-commit list - where he would
AT> have been if those numbers were real.  For instance
AT>
AT> listarchives=# SELECT project, yearmonth, count(*) from listarchive where author = 'Andreas Hildebrandt' GROUP BY project, yearmonth;
AT>       project      | yearmonth  | count
AT> -------------------+------------+-------
AT>  debian-med-commit | 2009-09-01 |     3
AT>  debian-med-commit | 2010-01-01 |     4
AT>  x                 | 2010-01-01 |     1
AT>  debian-med-commit | 2010-02-01 |     4
AT>  debian-med-commit | 2009-08-01 |     3
AT>  newmaint          | 2009-09-01 |     1
AT>  med               | 2010-11-01 |     1
AT>  med               | 2008-12-01 |     1
AT>  debian-med-commit | 2009-07-01 |    12
AT>  med               | 2008-11-01 |     1
AT> (10 rows)
AT> 
AT> shows that he has fewer than 30 commits and a few other postings.
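
To make the totals explicit, here is a small Python sketch that sums the
rows of the query result above per project.  The rows are copied from
the output; the variable names are only illustrative.

```python
from collections import Counter

# (project, yearmonth, count) rows copied from the query output above
rows = [
    ("debian-med-commit", "2009-09-01", 3),
    ("debian-med-commit", "2010-01-01", 4),
    ("x", "2010-01-01", 1),
    ("debian-med-commit", "2010-02-01", 4),
    ("debian-med-commit", "2009-08-01", 3),
    ("newmaint", "2009-09-01", 1),
    ("med", "2010-11-01", 1),
    ("med", "2008-12-01", 1),
    ("debian-med-commit", "2009-07-01", 12),
    ("med", "2008-11-01", 1),
]

# Sum the per-month counts for each list
totals = Counter()
for project, _yearmonth, count in rows:
    totals[project] += count

commit_mails = totals["debian-med-commit"]    # 26
other_postings = sum(totals.values()) - commit_mails    # 5
```

That gives 26 commit mails and 5 postings to other lists, far away from
the 1103 counted in the Git repository.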

SS> - Is it possible that, 'Andreas Hildebrandt' _did_ make more commits
SS> to the Git repo than 'plessy'? Since when does the mailing list record
SS> Git commits? (I notice it does, but from when?)
SS> - 'Andreas Hildebrandt' is in the Alioth users list, I just checked,
SS> so we can't exclude him automatically. Isn't he part of the team?

AT> Hmmm, thinking about this I guess you have to set the commit mail per
AT> Git repository.  This is mentioned in the Debian Med policy document,
AT> but it may well be that people forgot to do so.
AT> 
AT> The Ballview packaging was moved from SVN to Git last year.  We had
AT> exactly one Debian package release since then.  The diff in the changelog
AT> is:
AT> 
AT> --- ../changelog        2011-03-09 16:17:21.000000000 +0100
AT> +++ changelog.Debian    2011-06-23 11:34:03.000000000 +0200
AT> @@ -1,13 +1,31 @@
AT> -BALL is now managed via git, please see
AT> -http://git.debian.org/?p=debian-med/ball.git;a=summary
AT> +ball (1.4.0-1) unstable; urgency=low
AT> +
AT> +  [Andreas]
AT> +  * Updated to new upstream release 1.4.0
AT> +  * Cherry-pick upstream patch for linguist files
AT> +  * Cherry-pick upstream patch for new DSO linking scheme
AT> +  * Updated policy to 3.9.2
AT> +  * Split out arch-independent data package
AT> +  * Depend on libgl1-mesa-glx instead of libgl1-mesa-swx11
AT> +  * Introduce suitable conflicts against older versions where necessary
AT> +  * Install cmake BALL exports and config
AT> +  * build-depend on doxygen-latex instead of texlive-* (Closes: 616200)
AT> +  * Split build dependencies into Build-Deps and Build-Deps-Indep
AT> +  [Steffen]
AT> +  * Moved all build instructions to debian/rules.
AT> +  * Corrected email address
AT> +
AT> + -- Andreas Hildebrandt <ahildebr at uni-mainz.de>  Tue, 21 Jun 2011 18:29:51 +0200
AT> 
AT>  ball (1.3.2-3) UNRELEASED; urgency=low
AT> 
AT>    [Steffen Moeller]
AT>    * Updated policy to 3.8.4 (no changes required)
AT>    * Added libgl1-mesa-swx11 explicitly as a dependency to ballview
AT> +  [Andreas Hildebrandt]
AT> +  * Depend on python-sip-dev instead of python-sip4-dev (Closes: #611072)
AT> 
AT> - -- Andreas Hildebrandt <anhi at bioinf.uni-sb.de>  Fri, 21 May 2010 11:29:05 +0200
AT> + -- Andreas Hildebrandt <anhi at bioinf.uni-sb.de>  Tue, 01 Mar 2011 08:30:58 +0100
AT> 
AT>  ball (1.3.2-2) unstable; urgency=low
AT> 
AT> 
AT> This is NOT like 1000 commits ...

SS> - 'Andreas Hildebrandt' is in the Alioth users list, I just checked,
SS> so we can't exclude him automatically.

AT> It is correct that we can not exclude him automatically.

SS> Isn't he part of the team?

AT> Well, you become a member of the team by just doing something and the
AT> changelog says Andreas has done something.  However, he is a very silent
AT> member and just cares for this single package.

SS> So what should we do?  The 1000 in his query result refers to 1000
SS> changes he has made.  It's your call as to what is to be done!  Tell
SS> me and I will do it as they say -- you are the boss ;)

So my question to the experts is:  Do you see any better way to separate
the commits which are concerning debian/ from those in the upstream code?
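
One possible approach, given only as a sketch: limit "git log" to the
debian/ path, so that only commits touching the packaging directory are
counted per author.  The function names parse_author_lines and
packaging_commits are hypothetical, and the sketch assumes that the
packaging lives in a top-level debian/ directory and that author names
are spelled consistently in the commits.

```python
import subprocess
from collections import Counter


def parse_author_lines(log_output):
    """Count commits per author from output with one author name per line."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def packaging_commits(repo_path, subdir="debian"):
    """Count commits per author that touch `subdir` in the repo at repo_path.

    Runs: git -C <repo_path> log --format=%an -- <subdir>
    The path argument after "--" restricts the log to commits that
    change files below that directory.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%an", "--", subdir],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_author_lines(result.stdout)
```

Calling packaging_commits() with the path of a local clone returns a
Counter, and its most_common() method lists the authors ordered by the
number of commits touching debian/.  Commits that change both debian/
and upstream files would still be counted, so this is an approximation.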

Kind regards

        Andreas.

-- 
http://fam-tille.de


