[Teammetrics-discuss] Comparison between the old code and the new code.

Andreas Tille andreas at an3as.eu
Thu Sep 8 19:18:05 UTC 2011


On Thu, Sep 08, 2011 at 11:59:43PM +0530, Sukhbir Singh wrote:
> 
> Well, there should not be a bug because we are just fetching the
> messages from Gmane in a range() loop. The only possible explanation
> is that Gmane has deleted some messages, perhaps due to its own spam
> filter or implementation?

Yes, that's what I wanted to say: Gmane is just lacking some data (for
whatever reason) and your code has no chance to fetch it.
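
As a rough sketch of how such gaps could be spotted (this is not the
actual teammetrics code; `find_gaps` and its arguments are hypothetical
names): given the first and last article number covered by the range()
loop and the set of numbers that were actually retrieved, the missing
ones fall out directly.

```python
def find_gaps(first, last, fetched):
    """Return the article numbers in [first, last] that were not fetched.

    `fetched` is any iterable of article numbers that were retrieved
    successfully (an assumption about how the fetch loop records them).
    """
    fetched = set(fetched)
    return [n for n in range(first, last + 1) if n not in fetched]


if __name__ == "__main__":
    # Articles 4, 7 and 8 are absent from the retrieved set.
    print(find_gaps(1, 10, {1, 2, 3, 5, 6, 9, 10}))  # [4, 7, 8]
```

A non-empty result only says that the server did not deliver those
numbers; it does not say why they are missing.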

> But whether this results in significant change
> in numbers is to be investigated (I doubt it though).
> 
> However, the database does seem to be populated with all the messages
> and `SELECT COUNT(*)` returns a number that matches the number of
> articles in Gmane. So why this is happening, I am not sure because the
> end result from the database *matches* up to the article count from
> Gmane. And I have verified this IIRC and I will do it again.

IMHO the way to verify is the following:  Find a mailing list where the
old and the new code report a different number of postings for a given
author in a specific month.  Look up the web archives on
lists.debian.org and Gmane to see where the mail is missing.  It seems
like a good idea to pick one of those lists which do not have that many
postings but still show a clear difference.
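
A minimal sketch of how candidates for such a manual check could be
found, assuming the results of both code versions can be exported as
(mailing list, month, author) rows, one per posting.  The function name
and the row layout are assumptions for illustration, not the real
teammetrics schema.

```python
from collections import Counter


def find_discrepancies(old_rows, new_rows):
    """Compare per-(list, month, author) posting counts of two data sets.

    Each row is a (mailing_list, 'YYYY-MM', author) tuple, one per posting.
    Returns (key, old_count, new_count) tuples for every key whose counts
    differ, smallest counts first so they are easy to check by hand.
    """
    old, new = Counter(old_rows), Counter(new_rows)
    diffs = [
        (key, old[key], new[key])
        for key in set(old) | set(new)
        if old[key] != new[key]  # a Counter yields 0 for unknown keys
    ]
    diffs.sort(key=lambda d: (max(d[1], d[2]), d[0]))
    return diffs


if __name__ == "__main__":
    # Made-up sample data: one posting by "alice" is missing in the new set.
    alice = ("debian-med", "2011-08", "alice")
    bob = ("debian-med", "2011-08", "bob")
    for key, n_old, n_new in find_discrepancies([alice] * 3 + [bob] * 2,
                                                [alice] * 2 + [bob] * 2):
        print(key, n_old, n_new)
```

The first entries of the result are the cheapest ones to look up in the
web archives, since only a handful of mails have to be compared.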

> Heh, when will that happen is only a guess!

Sure.  But I pinged yesterday - let's see how the next ping cycle will
work.
 
> How do you recommend I test NNTPstat to find out where the problem is?

As I said: Just find a specific case by digging the archive manually.

Kind regards

       Andreas. 

-- 
http://fam-tille.de


