[Mailvoting-devel] Design doc (mail processing part)

Filipus Klutiero chealer at gmail.com
Wed Jan 21 07:47:06 UTC 2009


> Hi,
>
> So, we need to start working on it somehow :-). I thought I would just
> start writing up some kind of design doc for each part of the
> system that I can think of, and we'll iron out the details as we go
> along. This mail discusses the mail processing part (by the way, my
> language of choice is Python, so if you prefer something else, speak
> up now :-).
>
> This describes the system for processing of the incoming mail, which
> should be PGP-signed. I'm thinking about a generic script which can
> read from an arbitrary file descriptor (mostly I was thinking about
> plugging it into the procmail delivery chain somewhere). We
> probably should limit the size of the incoming message that we are
> going to process (say, reject everything over 100K in size), and
> discard on the spot everything which does not have a PGP signature (to
> get rid of spam), and store a copy of every message with signature in
> an mbox (for recovery purposes). So far I played with pyme to
> implement the signature verification for two cases: when a signed
> message is included inline in the body, and when it is a PGP/MIME
> message with signature included as a part of a multipart message. Once
> we verify the signature, we should parse the signed body. We expect
> the signed body to contain the message which the sender wants to score,
> along with some pseudo-header, as there were some concerns about
> signing someone else's message and sending it in. I suggest adopting the
> following structure of the signed body:
>
> --------------------------------------
> <Arbitrary text>
> Tag: <space- or comma-separated tags>
> <Arbitrary text>
> <Empty line>
> <Envelope-From line for the tagged message>
> <Original headers (optionally, body) of the tagged message>
> ---------------------------------------
>
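
A minimal sketch of how the body layout above might be parsed once the
signature has been verified. It assumes that the Tag pseudo-header starts
in column one, and that the tagged message begins at the first envelope
line ('From ' or '>From ') that follows an empty line after the Tag line.
The name parse_signed_body is made up for illustration and is not part of
any existing code.

```python
import re

# "Tag:" pseudo-header, assumed to start at the beginning of a line.
TAG_RE = re.compile(r"^Tag:\s*(.*)$")
# Envelope-From line of the tagged message, possibly quoted as ">From ".
ENVELOPE_RE = re.compile(r"^>?From ")


def parse_signed_body(text):
    """Return (tags, tagged_message_lines) for an already verified body.

    Raises ValueError if the body does not follow the proposed layout.
    """
    lines = text.splitlines()
    tags = None
    for i, line in enumerate(lines):
        if tags is None:
            match = TAG_RE.match(line)
            if match:
                # Tags are space- or comma-separated.
                tags = [t for t in re.split(r"[,\s]+", match.group(1)) if t]
            continue
        # After the Tag line, look for an envelope line that directly
        # follows an empty line; everything from there is the tagged mail.
        if not lines[i - 1].strip() and ENVELOPE_RE.match(line):
            return tags, lines[i:]
    raise ValueError("no Tag line followed by an envelope-From line")
```

Anything that raises ValueError here would simply be dropped by the
caller, in the same way as an unsigned message.
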
> Tags may be a fixed set of strings, something like 'offensive',
> 'rude', 'repetitive', 'constructive', etc. Any negative tags will
> cause the message to be scored negatively (the negative score due
> to one tagging message should be fixed, i.e. independent of the number
> of tags); positive ones correspondingly score it up. Only one
> vote for a particular message is allowed for each signer; the latest
> vote overrides all previous ones (to simplify the data structures).
> The Envelope-From line should match the regexp '^From ' or '^>From '.
> Original headers should include at least Envelope-From, 'From', 'To',
> 'Message-ID', 'Date', 'List-Id' and 'Subject' headers. It would be
> very nice if we could provide a link to the mailing list archives
> based on this information, though I am not sure how easy that is.
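
The scoring rule described above could be sketched as follows. The tag
polarity sets, the +1/-1 values, the VoteStore class and the use of a key
fingerprint to identify the signer are all assumptions made for this
example. So is the choice to treat a vote that mixes positive and negative
tags as negative; the proposal does not say what should happen then.

```python
# Hypothetical polarity sets; the proposal only gives example tag names.
NEGATIVE_TAGS = {"offensive", "rude", "repetitive"}
POSITIVE_TAGS = {"constructive"}


def vote_value(tags):
    """Map the tags of one vote to a fixed value, whatever their number."""
    tags = set(tags)
    unknown = tags - NEGATIVE_TAGS - POSITIVE_TAGS
    if unknown:
        raise ValueError("unknown tags: %s" % ", ".join(sorted(unknown)))
    if tags & NEGATIVE_TAGS:
        return -1  # assumption: any negative tag makes the vote negative
    if tags & POSITIVE_TAGS:
        return 1
    return 0


class VoteStore:
    """Keeps one vote per (message, signer); the latest vote wins."""

    def __init__(self):
        self._votes = {}  # (message_id, signer) -> vote value

    def record(self, message_id, signer, tags):
        # Overwriting the dictionary entry discards any earlier vote.
        self._votes[(message_id, signer)] = vote_value(tags)

    def score(self, message_id):
        return sum(
            value
            for (mid, _signer), value in self._votes.items()
            if mid == message_id
        )
```

Keying the dictionary on the (message, signer) pair is what makes the
"latest vote overrides" rule cheap: no vote history needs to be kept.
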
>
> Let me know what you think, and share your own ideas.
>
> Best regards,
> --
> Jurij Smakov

Sorry for breaking the thread.

I know neither Python nor mail, so I'm only commenting on the feedback given.

I looked at how GroupLens worked
(https://eprints.kfupm.edu.sa/38712/1/38712.pdf). GroupLens asked for a 
simple rating from 1 to 5 of messages. Now GroupLens is not a perfect 
reference. I for one am more interested in the filtering potential of the 
system than in the "educational" potential. Simple tags may be good for the 
educational part, but if we eventually also want a good filtering potential, 
translating tags into ratings may be non-trivial. So I'd keep room for a 
rating.

A rating could be given with

Rating: x

or more modularly with

NumericProperty: score 5

(where the NumericProperty name is really just a placeholder for something 
better)

That said, if you're only interested in tags for now, what you propose does 
not prevent adding more pseudo-headers later.
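
To illustrate that last point, here is a rough sketch of a pseudo-header
reader that accepts Tag lines together with either spelling suggested
above. The function name, the returned dictionary layout and the choice to
store "Rating: x" under the key "rating" are assumptions for the example,
and values are assumed to be integers.

```python
import re

# Pseudo-headers recognised by this sketch; NumericProperty is only the
# placeholder name used in this mail.
PSEUDO_HEADER_RE = re.compile(r"^(Tag|Rating|NumericProperty):\s*(.*)$")


def parse_pseudo_headers(lines):
    """Collect tags and numeric properties from pseudo-header lines."""
    result = {"tags": [], "numeric": {}}
    for line in lines:
        match = PSEUDO_HEADER_RE.match(line)
        if not match:
            continue  # arbitrary text around the pseudo-headers is ignored
        name, value = match.group(1), match.group(2).strip()
        if name == "Tag":
            result["tags"].extend(t for t in re.split(r"[,\s]+", value) if t)
        elif name == "Rating":
            result["numeric"]["rating"] = int(value)
        else:
            # "NumericProperty: score 5" -> property name, then its value.
            prop, number = value.split()
            result["numeric"][prop] = int(number)
    return result
```

A malformed number makes int() or the unpacking raise ValueError, which a
caller could treat like any other invalid vote.
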


