Geographical Information Retrieval workshop

Yesterday, I attended the Geographical Information Retrieval workshop, which was part of the SIGIR conference. There were lots of interesting papers, some of which are very close to my thesis’ interests.

This workshop will address all aspects of Geographic Information Retrieval – that is the provision and evaluation of methods to identify geographic scope, retrieve and relevance rank documents or other resources from both unstructured and partially structured collections on the basis of queries specifying both theme and geographic scope.

As an overall comment, I can say that although this is the third edition of the event, this particular discipline is still struggling to find its natural audience and support. In most cases the results are only partial or superficial, partly because there are no datasets around that we can play with. (A notable exception is GeoCLEF2005.)

My notes of the talks are in the extended section (or here).

Tags: human computer interaction, information retrieval

CONFERENCE: Geographical Information Retrieval workshop

DATE: August 10, 2006

LOCATION: University of Washington, Seattle, USA

—

REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:

JOHANNES LEVELING

PAPER: On Metonymy Recognition for Geographic IR

A metonymy is a figure of speech in which a speaker uses one entity to refer to another that is related to it, e.g. using the name of a city to refer to the team from that city.

Markert and Nissim (2002) offer a list of these figures.

They started from the major classes: literal, metonymic, mixed.

They tried to model these classes by training a classifier for IR experiments using different features: the lemma of the verb, the prefix and the suffix, the order of the verbs, etc.

They took a standard IR approach with TF-IDF measures, which performed well.
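For readers who have not met TF-IDF, here is a minimal sketch of the weighting. It is the textbook formula (term frequency times the log of inverse document frequency), not the exact variant used in the paper, and the `tf_idf` helper name is my own.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document; documents are token lists."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return weights
```

A term that appears in every document gets weight zero, which is why very common words contribute little to the ranking.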

Q: Gazetteers of Alexandria Digital Library project [1]

Q: Metonymy does not cover all the ambiguity; you might also have homonymy. Think of homographs like ‘Essen’ in German, which is the name of a city but also the word for eating.

SIMON OVERELL

PAPER: Identifying and grounding description of places.

Given a description of an object or concept can we identify whether it is a place?

Disambiguation approaches can be split into three groups: rule-based, data-driven and bootstrapping. They used Wikipedia as the basis for disambiguation, trying to build a co-occurrence mechanism to match the wiki location with Getty TGN IDs. They implemented four baselines for reference.

They measure the co-occurrence of an ambiguous word like Cambridge with the place the word was taken from, like the web site of the University of Washington. This helped to disambiguate Cambridge in OR from Cambridge in England.
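A minimal sketch of how co-occurrence-based disambiguation can work, as far as I understood it from the talk. The counts, the candidate names and the `disambiguate` helper are invented for illustration; they are not the authors' Wikipedia-derived model or data.

```python
from collections import Counter

# Hypothetical co-occurrence counts: how often each candidate sense of an
# ambiguous place name appears together with a context term.
# These numbers are made up for illustration only.
COOCCURRENCE = {
    "Cambridge, England": Counter({"england": 40, "university": 25, "london": 10}),
    "Cambridge, USA": Counter({"usa": 35, "university": 20, "washington": 5}),
}


def disambiguate(context_terms, cooccurrence=COOCCURRENCE):
    """Return the candidate whose co-occurrence counts best match the context."""
    scores = {
        candidate: sum(counts[term.lower()] for term in context_terms)
        for candidate, counts in cooccurrence.items()
    }
    # max() keeps the first candidate on ties, so ordering acts as a default sense.
    return max(scores, key=scores.get)
```

Calling `disambiguate(["Washington", "USA"])` picks the candidate whose counts overlap most with the context in which the word was found.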

[2] http://www.doc.ic.ac.uk/~seo01

YI LI

PAPER: Exploring Probabilistic Toponym Resolution for Geographical Information Retrieval

This paper presents a method for disambiguating text containing geographical terms and entities. They describe a probabilistic resolution mechanism.

DIANA SANTOS

PAPER: The place of place in Geographical IR

The vagueness and context dependencies of natural language are usually considered bad because they pose big challenges for Information Retrieval, but they are great features of language that make it more adaptable to different situations.

VIVIAN ZHANG

PAPER: Geomodification in Query rewriting

Web searchers signal their geographic intent by using place names in search queries and in reformulating their queries.

They studied how web searchers reformulated their web searches and wrote an automatic reformulation engine, which was implemented and presented to users for evaluation.

BRUNO MARTINS

PAPER: Handling Location in Search Engine Queries

Query formulation in GIR takes two different forms: from the map and from the query string.

Their work is concerned with parsing geographical queries when the context of the query was not specified by the user.

They propose an ontology to disambiguate the context where the query is matched.

They try to split the query into a triple containing <what>, <relationship> and <where>.
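A rough sketch of what such a split could look like. The fixed list of relationship phrases and the `split_query` helper are my own assumptions; the authors rely on an ontology, not on a hard-coded list like this one.

```python
import re

# Hypothetical list of spatial relationship phrases; a real system would
# take these from an ontology or gazetteer rather than a fixed list.
RELATIONSHIPS = ["north of", "south of", "close to", "near", "in"]


def split_query(query, relationships=RELATIONSHIPS):
    """Split a query into (what, relationship, where), or None if no match."""
    # Try longer phrases first so "north of" is not shadowed by "in".
    for rel in sorted(relationships, key=len, reverse=True):
        match = re.search(r"\b" + re.escape(rel) + r"\b", query, re.IGNORECASE)
        if match:
            what = query[: match.start()].strip()
            where = query[match.end():].strip()
            if what and where:
                return (what, rel, where)
    return None
```

A query with no recognizable relationship returns `None`, which is the case where the context has to be inferred by other means.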

They propose to use different techniques to define the different scopes the query maps to.

The prototype: http://local.tumba.pt

Research page: http://xldb.fc.ul.pt

STEVEN SCHOCKAERT

PAPER: Towards Fuzzy Spatial Reasoning in Geographic IR System

This research tries to address the vagueness of spatial information in queries. They want to use information extracted from web pages to expand their localization on the map. They tried to come up with a conceptual model that gives a specific meaning to natural-language definitions of closeness.

They defined a series of fuzzy functions that could discriminate between resources that can be considered “close” or not.
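To give an idea of what a fuzzy function for “close” might look like, here is a minimal sketch. The linear shape and the two thresholds are assumptions of mine, not the functions defined in the paper.

```python
def closeness(distance_km, full_km=1.0, zero_km=5.0):
    """Fuzzy membership degree of "close" for a distance in kilometres.

    Returns 1.0 up to full_km, 0.0 from zero_km onwards, and decreases
    linearly in between. The thresholds are illustrative assumptions.
    """
    if distance_km <= full_km:
        return 1.0
    if distance_km >= zero_km:
        return 0.0
    return (zero_km - distance_km) / (zero_km - full_km)
```

Instead of a yes/no answer, each resource gets a degree between 0 and 1, so a place 3 km away is “somewhat close” instead of being cut off by a hard radius.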

QI ZHANG

PAPER: Detecting Geographical Serving Area of Web Resources

The paper studies the geographical distribution of online users who are interested in a given web site (example [3]). They propose different strategies to infer the position of the server and the position of the users of that server.

The server can be located using log files or other sources. The user can be located using the IP address of the machine they are connecting from.

This inferred information can be used to disambiguate query terms or, conversely, the query terms can be used to infer the position of the user or the server.

JOHN FRANK

PAPER: Cartographic Information retrieval

Automatically generating beautiful maps is beyond the state of the art. One of the problems is how to match a given dataset to geographical positions.

Metacarta sells a geotagger and a geoparser.

Questions:

1. How should IR label relief representation? What does it mean for a text to be aligned with a map? [Imhof’s Cartographic Relief Representation]

2. How should document icons be aligned with a map? Buildings are not buildings. Can GIR achieve Imhof-level excellence in relief labeling?

3. How should the GIR data layer be aligned with geographic layers?

OpenLayers is an AJAX library that wraps all the cartographic services and allows everybody to build services with maps [4].

4. How to Lie with Maps by Mark Monmonier. How do we go from low-level details to high-level details?

5. How does relevance apply to cartography?

6. What’s the value of a georef? They released a public API to interact with Metacarta [5]. Are georefs to these places all of equal value? Is extracting and resolving a rare georef more valuable? How might the value be measured?

JULIEN LESBEGUERIES

PAPER: Associating spatial patterns to text-units for summarizing geographic information.

ALESSANDRO SORO

PAPER: Range-capable Distributed Hash Tables.

From wikipedia:

In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person’s name), find the corresponding value (e.g. that person’s telephone number). It works by transforming the key using a hash function into a hash, a number that the hash table uses to locate the desired value.

The goal of the paper is to define a novel indexing strategy called RDHT (Range-capable Distributed Hash Table).
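To see why range queries are awkward for a plain hash-based index, here is a small sketch. It is not the RDHT strategy from the paper, only an illustration of the problem; `NUM_NODES`, `node_for` and `nodes_for_range` are names I made up.

```python
import hashlib

NUM_NODES = 4  # hypothetical number of nodes in the distributed table


def node_for(key, num_nodes=NUM_NODES):
    """Map a key to a node by hashing it, as a plain DHT-style lookup would."""
    digest = hashlib.sha1(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def nodes_for_range(low, high, num_nodes=NUM_NODES):
    """Nodes that must be contacted to answer low <= key <= high.

    With plain hashing every key in the range has to be hashed separately,
    because neighbouring keys are not stored on neighbouring nodes.
    """
    return {node_for(key, num_nodes) for key in range(low, high + 1)}
```

A single lookup touches exactly one node, but a range of nearby keys is typically scattered across many of them, which matters for geographic data where queries often ask for everything inside an interval of coordinates.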

[6] http://opensource.csr4.it/

LEONARDO ANDRADE

PAPER: Relevance Ranking for Geographic IR

The goal of this work was to work out how geographic relevance could be combined with textual ranking.

The study showed that geographic ranking seems to be useful but some queries are more geographical than others.
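One simple way to combine the two signals is a weighted sum. This is a sketch under my own assumptions (scores normalized to [0, 1], a linear mix, a `geo_weight` parameter); I do not know the exact combination used in the paper.

```python
def combined_score(text_score, geo_score, geo_weight=0.5):
    """Linear mix of a textual and a geographic relevance score, both in [0, 1]."""
    if not 0.0 <= geo_weight <= 1.0:
        raise ValueError("geo_weight must be between 0 and 1")
    return (1.0 - geo_weight) * text_score + geo_weight * geo_score


def rank(documents, geo_weight=0.5):
    """Sort (doc_id, text_score, geo_score) tuples by combined score, best first."""
    return sorted(
        documents,
        key=lambda doc: combined_score(doc[1], doc[2], geo_weight),
        reverse=True,
    )
```

Since some queries are more geographical than others, `geo_weight` would ideally be chosen per query: close to 1 for a strongly geographic query and close to 0 for a purely thematic one.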

DAVIDE BUSCALDI

PAPER: Inferring Geographical Ontologies from Multiple Resources for Geographical Information Retrieval

They were interested in combining results from WordNet with geographical information retrieval. WordNet was used first to expand the query and then to expand the index terms.

In order to reduce the features they tried to use Wikipedia.

Their results did not show any improvement.

RAY LARSON

PAPER: GeoCLEF: An Overview of CLEF Cross-language Geographic Information Retrieval Track

The aim of GeoCLEF is to provide the necessary framework in which to evaluate GIR systems for search tasks involving both spatial and multilingual aspects.

JENS GRAUPMANN

PAPER: GeoSphereSearch: Context-Aware Geographic web Search

This paper presents a platform for geographical queries on the web. Instead of assigning a footprint to each document, it considers the circular Euclidean space in which the documents are placed. The similarity and ranking of each document are computed by putting that document at the center and evaluating the distance of the other documents from it.

CHUFENG CHEN

PAPER: A Location Data Annotation System for Personal Photograph Collections

This paper describes a system for storing and retrieving images in personal collections. The images can be annotated manually, or the annotations can be inferred automatically by matching the time and location stamps against a geographical gazetteer.
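A minimal sketch of the automatic case: pick the gazetteer entry nearest to the photo's location stamp. The tiny gazetteer, its approximate coordinates and the `annotate` helper are assumptions for illustration, not the system described in the paper.

```python
import math

# Tiny hypothetical gazetteer: place name -> (latitude, longitude) in degrees.
# Coordinates are approximate and only here for illustration.
GAZETTEER = {
    "Seattle": (47.61, -122.33),
    "Lausanne": (46.52, 6.63),
    "Lisbon": (38.72, -9.14),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def annotate(lat, lon, gazetteer=GAZETTEER):
    """Return the gazetteer place name closest to a photo's location stamp."""
    return min(gazetteer, key=lambda name: haversine_km(lat, lon, *gazetteer[name]))
```

A real system would also use the time stamp and a maximum distance, so that a photo taken far from any known place is left unlabeled.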

—

REFERENCES:

[1] http://www.alexandria.ucsb.edu/

[2] http://www.doc.ic.ac.uk/~seo01

[3] http://www.newzealand.com/

[4] http://www.openlayers.org/

[5] http://labs.metacarta.com/

http://geourl.com/

…

Mauro Cherubini

Professor at the University of Lausanne, Switzerland