SIGIR is a huge conference. I really enjoyed participating in this meeting and listening to these great talks. In particular, I enjoyed the session on user experience: it was pertinent to my research, and also the most critiqued and discussed. There is a struggle to accept that the metrics built to measure system performance do not match the experience of the humans using that system.
My Day 1 notes are in the extended part of this post [or here]. Enjoy!
Tags: human computer interaction, information retrieval
TITLE OF PAPER: Keynote and User Experience papers
PRESENTED BY:
CONFERENCE: SIGIR
DATE: August 7, 2006
LOCATION: University of Washington, Seattle, USA
—
REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
Jamie Callan
introduces Keith van Rijsbergen, author of the book Information Retrieval. Fabio Crestani was one of his students.
KEITH VAN RIJSBERGEN
KEYNOTE: QUANTUM HAYSTACKS
Keith worked with Gerry Salton. He mentions two books that highly influenced his work: G. Salton, Automatic Information Organization and Retrieval, 1968, and N. Jardine & R. Sibson, Mathematical Taxonomy, 1971.
We need theories to combine logic and probability. Top-down versus bottom-up (principled theories). The Sciences of the Artificial.
There is a need for test collections. Relevance is not a static notion: how do we handle relevance when it is dynamic? The probability of replication is the ability to port a result obtained in one test situation to another context. Relevance can be considered as an event or as a property.
Work on clustering has gone backwards. You can start thinking about clustering without introducing algorithms. Similarity measurements are still an open question. Can we measure the effectiveness of the clustering? Can we describe it mathematically?
The cluster hypothesis: cluster-based retrieval has as its foundation a hypothesis which states that closely associated documents tend to be relevant to the same request.
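To make "closely associated" concrete, here is a minimal sketch of one possible document-similarity measure (cosine similarity over term counts). This is my own illustration, not something from the talk:

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two documents given as token lists."""
    vec_a, vec_b = Counter(doc_a), Counter(doc_b)
    common = set(vec_a) & set(vec_b)
    dot = sum(vec_a[t] * vec_b[t] for t in common)
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Under the cluster hypothesis, documents relevant to the same request
# should score high with each other on a measure like this.
print(cosine_similarity("the quick brown fox".split(),
                        "the quick red fox".split()))  # 0.75
```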
What are you trying to measure? An underlying conjoint structure mapped to a numerical representation. The E and F measures serve to compare point and curve results -> interpolation.
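For reference, van Rijsbergen's F measure (with the E measure as its complement, E = 1 - F) collapses precision and recall into a single point value. A small sketch, with beta weighting recall against precision:

```python
def f_measure(precision, recall, beta=1.0):
    """van Rijsbergen's F measure combining precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def e_measure(precision, recall, beta=1.0):
    """The E measure is simply 1 - F."""
    return 1.0 - f_measure(precision, recall, beta)

print(f_measure(0.5, 0.25))  # 0.333...
```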
The IR demon is the equivalent of Maxwell's demon for probabilistic information retrieval.
EUGENE AGICHTEIN
The goal is to harness rich user interactions with search results to improve the quality of search.
Linking implicit interactions and explicit judgments [Fox et al, TOIS 2005]
The browsing behavior of individual users is influenced by many factors. Rich user interaction space.
One of the goals is to predict user preferences. RankNet is a neural net trained specifically for ranking. Another goal of the work was to understand which presentation features were helpful for retrieval.
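As a rough sketch of the pairwise idea behind RankNet (Burges et al.): the model is trained with a cross-entropy loss on the score difference of a document pair. This toy version is my own illustration, not the code used in the work:

```python
import math

def ranknet_pairwise_loss(score_i, score_j, target=1.0):
    """Pairwise cross-entropy loss; target = 1.0 means document i
    should be ranked above document j."""
    diff = score_i - score_j
    p_ij = 1.0 / (1.0 + math.exp(-diff))   # modelled P(i ranked above j)
    eps = 1e-12                            # avoid log(0)
    return -(target * math.log(p_ij + eps)
             + (1.0 - target) * math.log(1.0 - p_ij + eps))

# If the model scores the preferred document higher, the loss is small.
print(ranknet_pairwise_loss(2.0, 0.5))   # ~0.20
print(ranknet_pairwise_loss(0.5, 2.0))   # ~1.70
```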
ANDREW TURPIN
User Performance versus Precision Measures for Simple Search Tasks.
Do metrics match user experience?
We can calculate the Mean Average Precision for a bunch of queries and then do some statistics on top of those.
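As a reminder of how MAP is computed (binary relevance assumed), a minimal sketch:

```python
def average_precision(ranked_relevance):
    """Average precision for one query; ranked_relevance is a list of
    0/1 judgments in rank order."""
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(queries):
    """MAP: the mean of average precision over a set of queries."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two toy result lists with the same relevant documents ranked differently.
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 0, 1]]))  # ~0.667
```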
Assumption: having more relevant documents high in the list is good. Do users want more than one relevant document? Do users read lists top to bottom? Who determines relevance? Binary? Conditional or state-based?
MAP is tractable but it does not reflect the user's experience. In their 2000 experiment they showed that even when MAP differed, user experiences were the same.
In this year's experiment, they artificially constructed result lists with different MAP values. They found that the time required to find the relevant results was the same.
The conclusion is that the metrics in use, like MAP, P@1, and P@5, do not allow us to compare IR systems: the assumption that an increase in MAP translates into an increase in user performance or satisfaction is not true.
EUGENE AGICHTEIN
Web Search Ranking -> users can help by indicating which results are more relevant.
User behavior in the wild is not reliable. How can we integrate user interaction into ranking?
Personalization -> rerank
Collaborative filtering -> directhit
General ranking
They used RankNet [Burges et al.]. Their idea was to integrate user behavior directly into the ranker.
Incorporating user behavior into the ranking algorithm dramatically improves relevance.
GUANG FENG
AggregateRank: Bringing Order to Websites
PageRank
HostRank is a variant of PageRank which, instead of page-level links, uses the addresses of the machines on which the pages are hosted. According to this research we only need to rank websites; it is not a good idea to base the ranking on individual pages.
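For context, a generic power-iteration PageRank sketch applied to a host-level graph; this only illustrates the HostRank intuition and is not the AggregateRank method from the paper:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of out-links.
    Applied to hosts rather than pages, this resembles HostRank."""
    nodes = set(links) | {t for outs in links.values() for t in outs}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outs in links.items():
            if outs:
                share = damping * rank[node] / len(outs)
                for target in outs:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Toy host-level graph: keys are hosts, values the hosts they link to.
print(pagerank({"a.com": ["b.com"], "b.com": ["a.com", "c.com"], "c.com": ["a.com"]}))
```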
Structural re-ranking
Given an initially retrieved list in response to a query, the goal of re-ranking is to obtain high precision at top ranks using the similarity between the documents.
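One simple way to picture structural re-ranking is to promote documents that are similar to many other documents in the retrieved list (a centrality-style heuristic). A hedged sketch, not the authors' exact method:

```python
def rerank_by_centrality(docs, similarity):
    """Re-rank an initially retrieved list so that documents similar to
    many other retrieved documents move toward the top.
    `docs` is a list of document ids, `similarity` a function (id, id) -> float."""
    def centrality(d):
        return sum(similarity(d, other) for other in docs if other != d)
    return sorted(docs, key=centrality, reverse=True)

# Toy similarity based on shared words in (hypothetical) document texts.
texts = {"d1": {"apple", "fruit"}, "d2": {"apple", "pie"}, "d3": {"car", "engine"}}
sim = lambda a, b: len(texts[a] & texts[b])
print(rerank_by_centrality(["d3", "d1", "d2"], sim))  # d1/d2 move above d3
```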
NIE
Topical link analysis
Rottentomatoes does not answer a query about tomatoes. Traditional link analysis does not help because the web site is famous for entertainment, not for food.
The idea is to include some topic information in the ranking.
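A crude illustration of that idea (my own sketch, not the paper's model): weight a page's link-based authority by how well its topic profile matches the query's topic.

```python
def topical_score(link_score, topic_profile, query_topic):
    """Weight link-based authority by the page's affinity to the query's topic.
    topic_profile: dict topic -> proportion, e.g. {"food": 0.1, "movies": 0.9}."""
    return link_score * topic_profile.get(query_topic, 0.0)

# rottentomatoes has high link authority, but little of it is about food.
print(topical_score(0.9, {"movies": 0.95, "food": 0.05}, "food"))   # 0.045
print(topical_score(0.4, {"food": 0.8, "gardening": 0.2}, "food"))  # 0.32
```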
SERGEI VASSILVITSKII
Relevance Feedback in Web Search
Web search is a non-interactive system; exceptions are spell checking and query suggestions. The idea is to collect feedback from the user in a very lightweight way: using smileys.
Hypothesis: relevant pages tend to point to other relevant pages, and irrelevant pages tend to be pointed to by other irrelevant pages.
The algorithm uses existing information about the relevance of some pages to your initial query. From these initial pages, it computes the graph of connected pages. If any of these are returned by the query, it adjusts the ranking to reflect this knowledge.
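A rough sketch of how such feedback could be propagated under the hypothesis above; the propagation rule and the boost factor are my own simplifications, not the presented algorithm:

```python
def adjust_ranking(results, scores, labeled, links, boost=0.5):
    """Nudge retrieval scores using pages whose relevance is already known.
    `labeled` maps page -> +1 (relevant) or -1 (irrelevant); pages linked
    from labeled pages inherit a fraction of that judgment."""
    inferred = dict(labeled)
    for page, label in labeled.items():
        for target in links.get(page, []):
            inferred.setdefault(target, label * boost)
    adjusted = {doc: scores[doc] + inferred.get(doc, 0.0) for doc in results}
    return sorted(results, key=lambda d: adjusted[d], reverse=True)

# Toy example: "b" is linked from a known-relevant page and moves up.
print(adjust_ranking(["a", "b"], {"a": 1.0, "b": 0.9},
                     {"seed": 1}, {"seed": ["b"]}))  # ['b', 'a']
```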
FERNANDO DIAZ
Improving the Estimation of Relevance Models Using Large External Corpora
The Mixture of Relevance Models is a technique for combining external corpora with the target corpus to improve ranking on the target corpus.
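The combination can be pictured as a linear interpolation of term distributions from the two relevance models; a minimal sketch where lambda and the toy distributions are purely illustrative:

```python
def mix_relevance_models(target_model, external_model, lam=0.7):
    """Interpolate P(w|R) estimated on the target corpus with an estimate
    from a large external corpus."""
    vocab = set(target_model) | set(external_model)
    return {w: lam * target_model.get(w, 0.0)
               + (1 - lam) * external_model.get(w, 0.0)
            for w in vocab}

print(mix_relevance_models({"jaguar": 0.4, "car": 0.6},
                           {"jaguar": 0.2, "cat": 0.8}))
```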
TAO TAO
Regularized Estimation of Mixture Models for Robust Pseudo-Relevance Feedback
Queries are usually very short. Pseudo feedback is a way of expanding the query. Parameter sensitivity is a major challenge of pseudo feedback.
How can we make pseudo feedback more robust? Can we automatically set the parameters?
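To ground what pseudo feedback means in practice, a bare-bones sketch of query expansion from the top-ranked documents; the cutoff k and the number of expansion terms are exactly the kind of sensitive parameters mentioned above:

```python
from collections import Counter

def pseudo_feedback_expand(query_terms, ranked_docs, k=5, n_terms=3):
    """Expand a query with the most frequent terms from the top-k
    retrieved documents (each document is a list of tokens)."""
    counts = Counter()
    for doc in ranked_docs[:k]:
        counts.update(t for t in doc if t not in query_terms)
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return list(query_terms) + expansion

docs = [["jaguar", "engine", "speed"], ["jaguar", "car", "engine"]]
print(pseudo_feedback_expand(["jaguar"], docs))  # adds "engine", "speed", "car"
```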