[Teammetrics-discuss] Error in liststat.py

Andreas Tille andreas at an3as.eu
Sat Aug 6 22:29:36 UTC 2011


On Sat, Aug 06, 2011 at 12:29:43PM +0530, Sukhbir Singh wrote:
> >   - it might need certain locale to parse a string which might
> >     by chance work on your system
> 
> Yes, I have locale support for some Asian languages. And the subject
> can be in that, it's possible ;)

... and this is SPAM, as I suspected, because those lists are English ...
 
> > I'm also starting to wonder if we always need to download mboxes from
> > past years.  Well, there could be some removals from SPAM removal
> 
> This led me to think -- why do we need the mbox archives saved locally
> at all? Don't you think it would be better if after creating them and
> parsing them, we should delete them? Because they are not used
> afterwards in anyway. Even for hash comparison, the hashes are used
> from the hash file (lists.hash) and the cached mbox archives have no
> role to play once their SHA1 is calculated.
> 
> This will take care of two problems:
> 
> 1. We can open a mbox as a string as: `string = gzip.open(file)` (what
> you suggested).
> 2. No need for compression.
> 
> Your thoughts?

We should really not keep things which are no longer needed.
However, in the testing phase we frequently drop the database and
need to recreate it (the last change was storing Message-IDs - BTW,
a good idea!).  So to speed things up it would probably be quite
helpful to have those mboxes (compressed, as they are originally)
cached on the system.  This saves download time as well as bandwidth
on the Debian servers.  With that in mind, I would regard downloading
only those mboxes which have "potentially" changed as a reasonable
thing to do.  However, if you think implementing this might introduce
errors, we should not change a running system.
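
To make the caching idea concrete, here is a minimal sketch, not the
actual liststat.py code.  It assumes the contents of lists.hash have
already been loaded into a dict mapping archive file names to SHA1 hex
digests (the real file format is not shown in this thread), and the
function names are hypothetical.  It checks whether a cached,
still-compressed mbox differs from its recorded hash, and reads the
archive without unpacking it on disk.

```python
import gzip
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA1 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, known_hashes):
    """True if the cached archive is new or differs from its recorded hash.

    `known_hashes` is an assumed in-memory form of lists.hash:
    {archive file name: SHA1 hex digest}.
    """
    return known_hashes.get(os.path.basename(path)) != sha1_of_file(path)


def read_cached_mbox(path):
    """Return the decompressed bytes of a gzip-compressed mbox archive."""
    # gzip.open() returns a file object, so read() is needed to get the data.
    with gzip.open(path, "rb") as handle:
        return handle.read()
```

With something like this, recreating the database only needs to
re-parse the cached archives, and only archives for which
has_changed() is true need their hash entry updated.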
 
> > Also the output is identical.  Just do your test on blends.debian.net.
> 
> And some good news: Lars from Gmane says that NNTP puts the least load
> possible and we can go ahead with that. So we were right :)

OK.
 
> I will also look up 'lists.debian.org' this weekend.

Great.  I hope to be able to break the stubbornness of the listmasters ...

Kind regards

      Andreas. 

-- 
http://fam-tille.de


