[Teammetrics-discuss] Converter for mboxes (Was: Debian mailing lists archives as mbox)

Christian PERRIER bubulle at debian.org
Tue Aug 30 19:50:41 UTC 2011


Quoting Cord Beermann (cord at debian.org):
> Hello! You (Christian Perrier) wrote:
> 
> >> what in this sense means "completely"?  Do you need *all*
> >
> > Honestly, I have no idea as I don't know about the details of the  
> > handling of the spam report mail address. IIRC, these things are handled  
> > by Cord Beermann. Maybe only the Message-ID is enough.
> 
> We have more than one way to nominate Mails for a Spamreview. 
> 
> http://wiki.debian.org/Teams/ListMaster/ListArchiveSpam#Methods_to_Nominate_Spam_for_the_Review-Process
> 
> 
> The current problem is that we currently lack a backend for methods 2
> and 3, so that input currently isn't directly used. We work on that.

The most efficient one is indeed, in my experience, method 3. I used it
the following way:

- get list archives as mboxes
- process them through my own spam filter (crm114), trained with years and
years of spam and ham in Debian lists (cat mbox | formail -s <filter_command>)
- open them with mutt, with a special colour for mails tagged as spam
by CRM114
- check coloured messages, tag them, then bounce them all to the spam
nomination address
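
As a rough illustration of the "check tagged messages" step, here is a
minimal Python sketch that lists the messages in an mbox which a filter has
already marked as spam. It is not the workflow described above (which uses
mutt interactively), only a scripted approximation. The header name
X-CRM114-Status and the "SPAM" prefix are assumptions about what the filter
writes, and spam_candidates is a hypothetical helper name; adjust both to
your setup. Nothing is bounced anywhere: the function only reports
candidates for a human to review.

```python
import mailbox

# Assumption: the filter run via formail has already added a status header
# to every message. The header name and the "SPAM" prefix are illustrative.
STATUS_HEADER = "X-CRM114-Status"


def spam_candidates(mbox_path, header=STATUS_HEADER):
    """Return (Message-ID, Subject) pairs for messages tagged as spam."""
    candidates = []
    # create=False: fail instead of silently creating a missing mbox file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            status = str(msg.get(header, ""))
            if status.strip().upper().startswith("SPAM"):
                candidates.append((msg.get("Message-ID"), msg.get("Subject")))
    finally:
        box.close()
    return candidates
```

This only works if the archive still carries the original headers and the
mbox "From " separator lines; with stripped headers there is no status
header or Message-ID left to read.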

About 10-20 times more efficient than web-based methods...but it
*requires* access to archives as mboxes *without* filtered headers
(otherwise, formail, mutt and crm114 would be confused).

So, indeed, mbox archives with filtered headers are...plain useless for
such purposes..:-)
