[Teammetrics-discuss] Please strip quotation marks from names

Andreas Tille andreas at an3as.eu
Fri Mar 15 08:35:47 UTC 2013


Hi Sukhbir,

I ran into this:

teammetrics=# SELECT name, count(*) from listarchives where project like '%neurodeb%' and name like '%Yaro%' group by name order by name ;
         name         | count 
----------------------+-------
 Yaroslav Halchenko   |   289
 'Yaroslav Halchenko' |     2

which I tried to work around via

 $ git diff
diff --git a/etc/teammetrics/names.list b/etc/teammetrics/names.list
index 854e2f7..9142f54 100644
--- a/etc/teammetrics/names.list
+++ b/etc/teammetrics/names.list
@@ -650,7 +650,7 @@ Wouter van Heyst                :           larstiq-guest
 Xavier Oswald                   :           xoswald
 Xavier Vello                    :           wdgt-guest
 Y Giridhar Appaji Nag           :           appaji-guest
-Yaroslav Halchenko              :           yoh-guest, yoh
+Yaroslav Halchenko              :           yoh-guest, yoh, %Yaroslav Halchenko%
 Yask Gupta                      :           yask-guest
 Yavor Doganov                   :           yavor-guest
 Youhei Sasaki                   :           uwabami-guest


but that's a rather stupid hack, and there are other occurrences of this
problem (see below).  Some are surely not relevant for our actual
statistics, but the problem is hidden in every list.  Please make sure
that quotation marks are removed before a name is inserted into the
database.

Kind regards

        Andreas.


Here is a more complete view of the problem:

                                     name                                      | count 
-------------------------------------------------------------------------------+-------
 ' ALLAN W. BART                                                               |    38
 'cduck' Chris Grierson                                                        |    20
 '2+                                                                           |    14
 "Steffen Möller"                                                              |    13
 'maximilian attems'                                                           |    11
 "Traduz" - Portuguese Translation Team                                        |     9
 'Mash                                                                         |     9
 "Martin_Tanzer"@dvs-berlin.de                                                 |     8
 "barreno_e"@tsm.es                                                            |     7
 'Jason White'                                                                 |     5
 'Thomas Krennwallner'                                                         |     5
 'APM02 - 'securiQ.Watchdog' Demon'                                            |     4
 'Clive Menzies'                                                               |     4
 'Martin F Krafft'                                                             |     3
 'sean finney'                                                                 |     3
 'HUX01 - 'Watchdog' Demon'                                                    |     3
 "dheape"dheape55 at hotmail.com                                               |     3
 'Andreas Tille'                                                               |     3
 'Jochen Schulz'                                                               |     3
 '05) Rosebowl (Beavers Andrew Queisser                                        |     2
 'Julian Gilbey'                                                               |     2
 ' Dr. Debra Winslow                                                           |     2
 'Yaroslav Halchenko'                                                          |     2
 'martin f krafft'                                                             |     2

BTW, looking at this, it seems to make sense to remove quotation marks only
when the same mark appears at both the beginning and the end of the string.
Other strings like

  ' ALLAN W. BART
  'cduck' Chris Grierson

should probably be overridden manually in /etc/teammetrics/names.list,
whereas entries like

  '2+
  'Mash
  '05) Rosebowl (Beavers Andrew Queisser

look rather spammy, and we do not care about spam anyway.
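
To illustrate the rule above, here is a minimal sketch in Python.  It assumes
the name is available as a plain string before it is written to the database;
the function name strip_matching_quotes is made up for this example and is not
part of the existing teammetrics code.

```python
def strip_matching_quotes(name):
    """Remove one pair of identical quotation marks wrapping the whole name.

    Names with only a leading quote, or with a quoted nickname in front,
    are returned unchanged so they can be handled manually.
    """
    name = name.strip()
    # Only act when the first and last character are the same quote mark.
    if len(name) >= 2 and name[0] == name[-1] and name[0] in ("'", '"'):
        inner = name[1:-1].strip()
        # Keep the original if nothing but quotes and whitespace is left.
        if inner:
            return inner
    return name


if __name__ == "__main__":
    samples = [
        "'Yaroslav Halchenko'",
        '"Steffen Möller"',
        "' ALLAN W. BART",
        "'cduck' Chris Grierson",
    ]
    for sample in samples:
        print(repr(sample), "->", repr(strip_matching_quotes(sample)))
```

The first two samples lose their surrounding quotes; the last two are left
untouched because their first and last characters differ.  Note that this only
compares the outermost characters, so a name that merely starts and ends with
a quoted word would also be stripped and would still need a manual override.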

-- 
http://fam-tille.de


