Yesterday, I attended the Geographical Information Retrieval workshop, which was part of the SIGIR conference. There were lots of interesting papers, some of which are very close to my thesis’ interests.
This workshop will address all aspects of Geographic Information Retrieval – that is the provision and evaluation of methods to identify geographic scope, retrieve and relevance rank documents or other resources from both unstructured and partially structured collections on the basis of queries specifying both theme and geographic scope.
As an overall comment, I can say that although this is the third edition of the event, this particular discipline is still struggling to find its natural audience and support. In most cases the results are only partial or superficial, partly because there are no datasets around that we can play with. (A notable exception is GeoCLEF2005.)
My notes of the talks are in the extended section (or here).
Tags: human computer interaction, information retrieval
CONFERENCE: Geographical Information Retrieval workshop
DATE: August 10, 2006
LOCATION: University of Washington, Seattle, USA
—
REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
JOHANNES LEVELING
PAPER: On Metonymy Recognition for Geographic IR
A metonymy is a figure of speech in which a speaker uses one entity to refer to another that is related to it, e.g. using the name of a city to refer to the team from that city.
Markert and Nissim (2002) offer a list of these figures.
They started from the major classes: literal, metonymic, mixed.
They tried to model these figures by training a classifier for IR experiments, using different features: the lemma of the verb, the prefix and the suffix, the order of the verbs, etc.
They took a standard IR approach with TF-IDF measures. This gave good performance.
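As a reminder of what a TF-IDF measure computes, here is a minimal sketch in Python. It is not the authors' system: the toy documents, the whitespace tokenisation and the plain log(N / df) form of the IDF are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumptions (not from the paper): whitespace tokenisation,
    term frequency normalised by document length, idf = log(N / df).
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenised:
        df.update(set(tokens))
    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy documents, invented for this example.
docs = [
    "essen beat dortmund in the final",
    "the museum in essen opened today",
    "dortmund opened a new stadium",
]
weights = tf_idf(docs)
```

A term that occurs in only one document (like "beat") gets a higher weight than one shared by several documents (like "essen"), and a term present in every document gets weight zero.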
Q: Gazetteers of Alexandria Digital Library project [1]
Q: Metonymy does not cover all the ambiguity; you might also have homonymy. There are homographs like ‘Essen’ in German, which is the name of a city but also the verb for eating.
SIMON OVERELL
PAPER: Identifying and grounding descriptions of places.
Given a description of an object or concept can we identify whether it is a place?
Disambiguation approaches can be split into three groups: rule-based, data-driven and bootstrapping. They used Wikipedia as the basis for disambiguation, trying to build a co-occurrence mechanism to match Wikipedia locations with Getty TGN IDs. They implemented four baseline references.
They measure the co-occurrence of an ambiguous word like Cambridge with the place the word was taken from, like the web site of the University of Washington. This helped to disambiguate Cambridge in OR from Cambridge in England.
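The co-occurrence idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the candidate labels, the context terms and all the counts are invented, and the simple "sum of counts" score is a stand-in, not the scoring used in the paper.

```python
from collections import Counter

# Hypothetical co-occurrence counts: how often each candidate place
# appears together with a context term. Toy numbers, not real data.
COOCCURRENCE = {
    "Cambridge (England)": Counter({"england": 40, "university": 25, "london": 15}),
    "Cambridge (OR)": Counter({"washington": 12, "university": 20, "seattle": 8}),
}

def disambiguate(context_terms, cooccurrence=COOCCURRENCE):
    """Pick the candidate whose co-occurrence counts best match the context."""
    def score(candidate):
        counts = cooccurrence[candidate]
        # A Counter returns 0 for terms it has never seen.
        return sum(counts[term] for term in context_terms)
    return max(cooccurrence, key=score)

# Context taken from the (hypothetical) page where "Cambridge" was found.
context = "university of washington seattle".split()
best = disambiguate(context)
```

With these toy counts the context from a University of Washington page favours the second candidate, while a context mentioning England or London favours the first.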
[2] http://www.doc.ic.ac.uk/~seo01
YI LI
PAPER: Exploring Probabilistic Toponym Resolution for Geographical Information Retrieval
This paper presents a method for disambiguating text containing geographical terms and entities. They describe a probabilistic resolution mechanism.
DIANA SANTOS
PAPER: The place of place in Geographical IR
The vagueness and context dependencies of natural language are usually considered bad because they pose big challenges for Information Retrieval, but they are great features of the language that make it more adaptable to different situations.
VIVIAN ZHANG
PAPER: Geomodification in Query rewriting
Web searchers signal their geographic intent by using place names in search queries and in reformulating their queries.
They studied how web searchers reformulated their web searches and wrote an automatic reformulation engine, which was implemented and presented to users for evaluation.
BRUNO MARTINS
PAPER: Handling Location in Search Engine Queries
Query formulation in GIR can take two different forms: from the map and from the query string.
Their work is concerned with parsing geographical queries when the context of the query was not specified by the user.
They propose an ontology to disambiguate the context in which the query is matched.
They try to split the query into a triple containing <what>, <relationship> and <where>.
They propose to use different techniques to define the different scopes the query maps to.
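To make the <what>, <relationship>, <where> split concrete, here is a small pattern-based sketch. It is an assumption-laden illustration, not the authors' parser: the list of relationship phrases is hypothetical (a real system would take them from its ontology), and place names are not checked against any gazetteer.

```python
import re

# Hypothetical spatial relationship phrases.
RELATIONSHIPS = ["north of", "south of", "close to", "within", "near", "in"]

# Longest phrases first, so multi-word relationships are tried before short ones.
_ALTERNATION = "|".join(
    re.escape(phrase) for phrase in sorted(RELATIONSHIPS, key=len, reverse=True)
)
QUERY_PATTERN = re.compile(
    r"^(?P<what>.+?)\s+(?P<rel>" + _ALTERNATION + r")\s+(?P<where>.+)$",
    re.IGNORECASE,
)

def split_query(query):
    """Return (what, relationship, where), or None if no relationship is found."""
    match = QUERY_PATTERN.match(query.strip())
    if match is None:
        return None
    return match.group("what"), match.group("rel").lower(), match.group("where")
```

For example, split_query("hotels in Lisbon") yields ("hotels", "in", "Lisbon"), while a query without any relationship phrase, such as "weather", yields None and would need the context to be inferred some other way.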
The prototype: http://local.tumba.pt
Research page: http://xldb.fc.ul.pt
STEVEN SCHOCKAERT
PAPER: Towards Fuzzy Spatial Reasoning in Geographic IR System
This research tries to address the identification of vague spatial information in queries. They want to use information extracted from web pages to expand their localization on the map. They tried to come up with a conceptual model to give a specific meaning to natural language definitions of closeness.
They defined a series of fuzzy functions that could discriminate between resources that can be considered “close” or not.
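A fuzzy function for "close" could look something like the sketch below. The shape (fully close up to one distance, then falling linearly to not close at all) and both thresholds are assumptions for illustration; the notes do not record the functions actually used in the paper.

```python
def close_membership(distance_km, full=1.0, zero=5.0):
    """Degree (between 0 and 1) to which a distance counts as "close".

    Assumed shape: 1.0 up to `full` km, falling linearly to 0.0 at `zero` km.
    The thresholds are illustrative, not taken from the paper.
    """
    if distance_km <= full:
        return 1.0
    if distance_km >= zero:
        return 0.0
    return (zero - distance_km) / (zero - full)
```

Instead of a yes/no answer, a resource 3 km away gets a membership of 0.5 with these thresholds, so results can be ranked by how "close" they are.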
QI ZHANG
PAPER: Detecting Geographical Serving Area of Web Resources
The paper looks at the geographical distribution of online users who are interested in a given web site (example [3]). They propose different strategies to infer the position of the server and the position of the users of that server.
The server can be identified using log files or other sources. The user can be identified using the IP address of the machine they are connecting from.
Using this inferred information, it is possible to disambiguate query terms or, conversely, to use the query terms to infer the position of the user or the server.
JOHN FRANK
PAPER: Cartographic Information retrieval
Automatically generating beautiful maps is beyond the state of the art. One of the problems is how to match a given dataset to geographical positions.
MetaCarta sells a geotagger and a geoparser.
Questions:
1. How should IR label relief representation? What does it mean for a text to be aligned with a map? [Imhof’s Cartographic Relief Representation]
2. How should document icons be aligned with a map? Buildings are not buildings. Can GIR achieve Imhof-level excellence in relief labeling?
3. How should GIR data layers be aligned with geographic layers?
OpenLayers is an AJAX library that wraps all the cartographic services and allows everybody to build services with maps [4].
4. How to Lie with Maps by Mark Monmonier. How do we go from low-level details to high-level details?
5. How does relevance apply to cartography?
6. What’s the value of a georef? They released a public API to interact with MetaCarta [5]. Are georefs to these places all of equal value? Is extracting and resolving a rare georef more valuable? How might the value be measured?
JULIEN LESBEGUIERIES
PAPER: Associating spatial patterns to text-units for summarizing geographic information.
ALESSANDRO SORO
PAPER: Range-capable Distributed Hash Tables.
From wikipedia:
In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person’s name), find the corresponding value (e.g. that person’s telephone number). It works by transforming the key using a hash function into a hash, a number that the hash table uses to locate the desired value.
The goal of the paper is to define a novel indexing strategy called RDHT (Range-capable Distributed Hash Table).
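To see why range queries are awkward for an ordinary distributed hash table, consider the sketch below. It is not the RDHT scheme from the paper (the notes do not describe it); it only contrasts hash-based placement with a simple order-preserving placement. The number of nodes, the key space and both placement functions are assumptions for illustration.

```python
import hashlib

NODES = 4  # hypothetical number of peers

def hashed_node(key):
    """Classic DHT-style placement: the node is chosen from a hash of the key."""
    digest = hashlib.sha1(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NODES

def ordered_node(key, key_space=1000):
    """Order-preserving placement: contiguous key intervals share a node."""
    return min(key * NODES // key_space, NODES - 1)

def nodes_for_range(low, high, placement):
    """Set of nodes that must be contacted to answer low <= key <= high."""
    return {placement(k) for k in range(low, high + 1)}
```

Because a hash function deliberately destroys the ordering of keys, neighbouring keys typically end up on different nodes, so a range lookup under hashed_node has to contact many peers. Under ordered_node the range 100 to 120 lives on a single node, at the cost of the load balancing that hashing provides.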
[6] http://opensource.csr4.it/
LEONARDO ANDRADE
PAPER: Relevance Ranking for Geographic IR
The goal of this work was to study how geographic information about places could be combined with textual ranking.
The study showed that geographic ranking seems to be useful but some queries are more geographical than others.
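One simple way to combine the two kinds of evidence is a weighted sum, sketched below. This is an assumption on my part, not the formula from the paper: the scores are taken to be already normalised to the range 0 to 1, and geo_weight is a hypothetical parameter that could be raised for queries that are "more geographical".

```python
def combined_score(text_score, geo_score, geo_weight=0.5):
    """Linear interpolation between textual and geographic relevance."""
    return (1.0 - geo_weight) * text_score + geo_weight * geo_score

def rank(docs, geo_weight=0.5):
    """Sort (doc_id, text_score, geo_score) tuples by combined score, best first."""
    return sorted(
        docs,
        key=lambda d: combined_score(d[1], d[2], geo_weight),
        reverse=True,
    )

# Toy scores, invented for this example.
docs = [("a", 0.9, 0.1), ("b", 0.6, 0.8), ("c", 0.3, 0.9)]
```

With geo_weight=0.0 the ranking is purely textual (a, b, c); with the default of 0.5 the document that is decent on both counts (b) moves to the top.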
DAVIDE BUSCALDI
PAPER: Inferring Geographical Ontologies from Multiple Resources for Geographical Information Retrieval
They were interested in combining results from WordNet with geographical information retrieval. WordNet was used to expand the query at first and then to expand the index terms.
In order to reduce the features they tried to use Wikipedia.
Their results did not show any improvement or success.
RAY LARSON
PAPER: GeoCLEF: An Overview of CLEF Cross-language Geographic Information Retrieval Track
The aim of GeoCLEF is to provide the necessary framework in which to evaluate GIR systems for search tasks involving both spatial and multilingual aspects.
JENS GRAUPMANN
PAPER: GeoSphereSearch: Context-Aware Geographic web Search
This paper presents a platform for geographical queries on the web. Instead of assigning a footprint to each document, it considers the circular Euclidean space in which the documents are placed. The similarity and ranking of each document are computed by placing each document in the center and evaluating the distance of the others from that center.
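My reading of the distance-based ranking, as a rough sketch: take one document as the center and order the others by their Euclidean distance from it. The two-dimensional coordinates and the document ids are hypothetical, and the real system surely does more than this.

```python
import math

def rank_around(center, docs):
    """Rank the other documents by Euclidean distance from `center`.

    `docs` maps a document id to hypothetical (x, y) coordinates.
    """
    cx, cy = docs[center]
    others = [
        (math.hypot(x - cx, y - cy), doc_id)
        for doc_id, (x, y) in docs.items()
        if doc_id != center
    ]
    # Sorting the (distance, id) pairs puts the nearest document first.
    return [doc_id for _, doc_id in sorted(others)]

# Toy positions, invented for this example.
docs = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 0.0)}
```

Centering on "a" gives the order c, b; centering on "b" gives c, a, so the ranking depends on which document is placed in the center.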
CHUFENG CHEN
PAPER: A Location Data Annotation System for Personal Photograph Collections
This paper describes a system for storing and retrieving images in personal collections. The images can be annotated manually, or the annotations can be inferred automatically by matching the time and location stamps against a geographical gazetteer.
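Matching a location stamp against a gazetteer can be as simple as a nearest-neighbour lookup, as in this sketch. It is only an illustration: the three-entry gazetteer with approximate coordinates is made up for the example, the time stamp is ignored, and the paper's actual matching procedure is not recorded in these notes.

```python
import math

# Hypothetical mini-gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Seattle": (47.61, -122.33),
    "Portland": (45.52, -122.68),
    "Vancouver": (49.28, -123.12),
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def annotate(photo_position, gazetteer=GAZETTEER):
    """Return the name of the gazetteer entry nearest to the photo's location stamp."""
    return min(
        gazetteer,
        key=lambda name: haversine_km(photo_position, gazetteer[name]),
    )
```

A photo stamped at (47.65, -122.30) would be annotated with "Seattle". A real system would probably also want a maximum distance, so that photos taken far from any known place are left unannotated.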
—
REFERENCES:
[1] http://www.alexandria.ucsb.edu/
[2] http://www.doc.ic.ac.uk/~seo01
[3] http://www.newzealand.com/
[4] http://www.openlayers.org/
[5] http://labs.metacarta.com/
[6] http://opensource.csr4.it/
http://geourl.com/
…