[Teammetrics-discuss] Next phase: Handling spam

Sukhbir Singh sukhbir.in at gmail.com
Thu Jun 9 20:51:25 UTC 2011


Hi,

> To give you an idea I select some random data sets in the database on
> blends.debian.net

I see.

> I just kept the names without e-mail address.  When thinking about it
> I'm a bit unsure whether it is finally a good idea to throw away the
> e-mail address.  We could store this in addition.

We keep the email addresses then.

> So the only chance we have is to have another lookup list - perhaps
> this should be rather done in the database itself rather than in a
> config file.  Following this strategy enables to change the names
> using an SQL UPDATE query.

But again, and sorry for bringing this up: doesn't this make it manual?
An UPDATE query will have to be run manually for each author? In most
cases:

    'charles-debian-nospam', 'plessy', 'charles-guest'

A split on '-' will give the same name, 'charles', for the first and
last of these (but not for 'plessy'). I mean, I was wondering if we
could do better. Sure, this might work for our case, but as we are
doing it from scratch, somehow it doesn't feel right. Anyway, I need
to research this more deeply to see if I can find a pattern.
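
A minimal sketch of the two ideas above, combined: split on '-' first,
then fall back to a lookup list for names the split cannot unify. The
function name canonical_name and the ALIASES mapping are hypothetical,
made up for illustration; in the proposal quoted above the lookup list
would live in a database table rather than in a Python dict.

```python
# Hypothetical lookup list mapping a known alias to a canonical key.
# The single entry is illustrative only; a real list would be
# maintained by hand (e.g. in a database table).
ALIASES = {"plessy": "charles"}


def canonical_name(raw, aliases=ALIASES):
    """Return a normalized key for a raw author name."""
    key = raw.strip().lower()
    # An exact alias match takes priority over the split heuristic.
    if key in aliases:
        return aliases[key]
    # Heuristic from the mail: keep only the part before the first '-'.
    head = key.split("-", 1)[0]
    return aliases.get(head, head)


names = ["charles-debian-nospam", "plessy", "charles-guest"]

# Group the raw names by their normalized key.
groups = {}
for name in names:
    groups.setdefault(canonical_name(name), []).append(name)
```

With an empty lookup list the split alone still leaves 'plessy' as a
separate author, which is the manual part that remains. The split
would also wrongly truncate genuinely hyphenated names, so it can only
ever be a first guess.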

> Well, I have not actively used it.  However, it was some kind of useful
> to detect some SPAM patterns.  I do not really mind for the moment but
> keeping it does not harm.

Ok. Sure, why not! I thought maybe you had some plans for this!


