[Soc-coordination] applying for Aptitude search ranking and presentation

KUTLU EMRE YILMAZ keylmz at gmail.com
Mon Mar 23 21:41:10 UTC 2009


Hi all,

I'm a 4th-year undergrad student in CS and I'm interested in the Aptitude
search ranking implementation project.

To be fair, I haven't used Xapian before.

My thesis topic is a comparison of the Turkish information retrieval
performance of the Lemur and Terrier toolkits with respect to their different
retrieval algorithms.

I have also used Lucene in my IR course, so I'm familiar with IR and believe I
can do my best for this project.

So I have some background in IR terminology.

As for what I can add to this project: I see that Xapian has the Okapi
algorithm, and I can try to improve the ranking of results by trying all the
things that affect IR performance: tokenization, stemming, maybe misspellings
(python - ptyhon), specific AND/OR boolean queries, or differently weighted
queries.
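
As a rough illustration of the misspelling idea (python - ptyhon), here is a
minimal sketch using Python's standard difflib module. It is not how Xapian or
Aptitude handle spelling; the function name suggest and the small vocabulary
are assumptions made up for the example.

```python
import difflib

# Hypothetical vocabulary; in practice it would come from the package index.
PACKAGE_TERMS = ["python", "perl", "java", "ruby"]

def suggest(term, vocabulary, cutoff=0.6):
    """Return vocabulary entries that closely resemble `term`, best first."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)

# A transposed query term is mapped back to its closest known term.
print(suggest("ptyhon", PACKAGE_TERMS))
```

The cutoff controls how aggressive the correction is; a higher value returns
fewer, safer suggestions.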

Also, I got good results in my experiments with Lemur's tf-idf model weighted
with a modified Okapi weighting function, so I can try different weighting
algorithms.
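
To make the weighting discussion concrete, here is a minimal sketch of the
textbook Okapi BM25 formula in plain Python. It is not Xapian's implementation
and not the modified function from my Lemur experiments; the idf variant (with
+1 inside the logarithm so it stays non-negative) and the defaults k1=1.2 and
b=0.75 are assumptions chosen for the example.

```python
import math

def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_len,
               k1=1.2, b=0.75):
    """Score one document (a list of tokens) against a list of query terms."""
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)           # term frequency in this document
        if tf == 0:
            continue
        n = doc_freq.get(term, 0)            # number of documents containing term
        idf = math.log((num_docs - n + 0.5) / (n + 0.5) + 1.0)
        # Length normalisation: longer documents need more occurrences.
        norm = tf + k1 * (1.0 - b + b * doc_len / avg_len)
        score += idf * tf * (k1 + 1.0) / norm
    return score
```

Trying a different weighting algorithm then amounts to swapping the idf or the
normalisation term and comparing the resulting rankings.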

Before implementing a new ranking heuristic, I would first like to try the
things I mentioned above, but you are the professionals, and I would also be
glad to implement some different unigram language models for Xapian.
I believe that unigrams can give better results for short queries like
package names, which don't fluctuate as much as natural-language words that
take on different meanings.
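
A minimal sketch of what a unigram language model ranking could look like,
assuming query-likelihood scoring with Jelinek-Mercer smoothing. The smoothing
choice, the function name query_log_likelihood and the value lam=0.5 are
assumptions for illustration, not something Xapian provides under this name.

```python
import math
from collections import Counter

def query_log_likelihood(query_terms, doc_terms, collection_counts,
                         collection_len, lam=0.5):
    """Log P(query | document) under a smoothed unigram model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    total = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts.get(term, 0) / collection_len
        # Mix the document model with the collection model.
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            return float("-inf")   # term unseen in the whole collection
        total += math.log(p)
    return total
```

Documents are ranked by this value in descending order; the collection model
keeps a document from scoring zero just because one query term is missing.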

"java sdk" --> the probability of the word "sdk" coming after "java" will be
higher than that of "rails" coming after "java". Maybe I can do this by
modelling the collection; here my collection would be the filenames in the
repository and their descriptions.
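
The "java sdk" example can be sketched as a conditional next-word probability
estimated from the collection. This is a minimal version with add-one
smoothing; the sample texts below are made up for the example and do not come
from the real repository.

```python
from collections import Counter

def train_bigrams(texts):
    """Count word pairs and their left-hand contexts over a list of strings."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for text in texts:
        tokens = text.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # words that have a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, vocab

def next_word_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

# Made-up sample collection (hypothetical package names and descriptions).
SAMPLE = ["java sdk tools", "java sdk documentation",
          "ruby rails framework", "java runtime"]
contexts, bigrams, vocab = train_bigrams(SAMPLE)
print(next_word_prob("java", "sdk", contexts, bigrams, vocab) >
      next_word_prob("java", "rails", contexts, bigrams, vocab))
```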

Maybe we can create two fields, one for the filename and the other for the
description of the package and what it does, and then combine these two fields
in search.
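
A minimal sketch of the two-field idea in plain Python, assuming a simple
weighted sum of per-field term counts. The weights, the dictionary layout and
the package "libfoo" are hypothetical; a real version would use the search
engine's own field support instead.

```python
import re

def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

def field_score(query_terms, text):
    """Count how often the query terms occur in one field."""
    tokens = tokenize(text)
    return sum(tokens.count(term) for term in query_terms)

def combined_score(query, package, name_weight=2.0, desc_weight=1.0):
    """Weighted sum of the name-field and description-field scores."""
    terms = tokenize(query)
    return (name_weight * field_score(terms, package["name"])
            + desc_weight * field_score(terms, package["description"]))

# Hypothetical records; a match in the name counts more than in the description.
PACKAGES = [
    {"name": "python-xapian",
     "description": "Xapian search engine interface for Python"},
    {"name": "libfoo",
     "description": "helper library used by python scripts"},
]
ranked = sorted(PACKAGES, key=lambda p: combined_score("python", p), reverse=True)
print([p["name"] for p in ranked])
```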

Please don't hesitate to send your feedback, whether bad or good.

I really wish to work on this project and I hope I have explained myself
well.

Thanks,

Emre