Interesting meeting this afternoon with Martin Rajman on information retrieval techniques in relation to the STAMPS project. One of the first references I picked up was link mining, the data mining activity of constructing links between documents.
Then Dr. Rajman gave me two references on the weighting schemes currently used to produce rankings over certain data structures: Prosit (also known as DFR, Divergence From Randomness) and Okapi (a variant of TF*IDF).
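To make the Okapi reference concrete for myself, here is a minimal sketch of the commonly cited Okapi BM25 formula. The function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the smoothed IDF variant are my own assumptions for illustration, not something Dr. Rajman specified.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper: the parameter defaults are common textbook values.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    ["map", "of", "lausanne"],
    ["social", "map", "map"],
    ["semantic", "web"],
]
scores = bm25_scores(["map"], docs)
```

With these toy documents, the one that repeats "map" scores highest and the one that never mentions it scores zero, which is the TF*IDF-like behaviour the scheme is meant to capture.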
He briefly mentioned a technique they are working on at LIA that combines a selection technique with a Random Walk data structure, which should be based on probabilistic formulas.
Basically, the structural part of the graph is built from how people would have walked through our graph over many interactions.
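I do not know the details of the LIA technique, so the following is only my own sketch of the general idea: count how often users move from one node to another, turn the counts into transition probabilities, and run a PageRank-style random walk with uniform restarts. The names `transition_probs` and `visit_distribution`, the damping value, and the toy sessions are all assumptions of mine.

```python
from collections import defaultdict

def transition_probs(sessions):
    """Turn observed walks (lists of node ids) into per-node transition probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for walk in sessions:
        for src, dst in zip(walk, walk[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, outs in counts.items():
        total = sum(outs.values())
        probs[src] = {dst: c / total for dst, c in outs.items()}
    return probs

def visit_distribution(probs, nodes, damping=0.85, iters=50):
    """Power iteration for a random walk with uniform restarts (assumed model)."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1 - damping) / n for v in nodes}
        for src in nodes:
            outs = probs.get(src)
            if not outs:
                # Node with no observed outgoing step: spread its mass uniformly.
                for v in nodes:
                    nxt[v] += damping * rank[src] / n
                continue
            for dst, p in outs.items():
                nxt[dst] += damping * rank[src] * p
        rank = nxt
    return rank

# Toy user sessions; every node they mention must also appear in `nodes`.
sessions = [["a", "b", "c"], ["a", "b"], ["c", "b"]]
nodes = ["a", "b", "c"]
probs = transition_probs(sessions)
rank = visit_distribution(probs, nodes)
```

In this toy example nobody ever walks into "a", so it ends up with only the restart mass, while "b" and "c" share the rest. The structure emerges purely from the recorded interactions.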
An interesting comment Dr. Rajman made is that the different layers / properties we are trying to combine (social, semantic, geometric) are most valuable when kept separate. They may be useful at different times during the user interaction: for instance, the user may start by looking for a specific area on the map and then s/he might want to zoom in on a more specific subset of the results, perhaps this time focussing more on the social layer and using it to trim the results. We should enquire into how to use pertinence propagation across the different layers.
Another interesting thought from Dr. Rajman is that words are interesting to study because their occurrences do not follow Gaussian distributions.