Smartalbum: a multi-modal photo annotation system

T. Tan, J. Chen, P. Mulhem, and M. Kankanhalli. Smartalbum: a multi-modal photo annotation system. In MULTIMEDIA ’02: Proceedings of the tenth ACM international conference on Multimedia, pages 87–88, New York, NY, USA, 2002. ACM. [PDF]

——-

Applications supporting the annotation of pictures with voice include: SmartAlbum (Tan et al., 2002), which unifies two indexing approaches, namely content-based and speech; Chen et al. (2001), who proposed the use of a structured speech syntax to annotate photographs in four different fields, namely event, location, people, and date/time; Show&Tell (Srihari et al., 1999), which uses speech annotations to index and retrieve both personal and medical images; and FotoFile (Kuchinsky et al., 1999), which extends annotation to more general multimedia objects.

This demonstration presents a novel application (called SmartAlbum) for photo indexing and retrieval that unifies two different image indexing approaches. The system uses two modalities to extract information about a digital photograph; i.e. content-based and speech annotation for image description. The result is a powerful image retrieval tool that has capabilities beyond what current single-mode retrieval systems can offer. We show on a corpus of 1200 images the interest of our approach.
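The abstract does not detail how the two modalities' retrieval scores are unified. As a generic illustration only (this is a standard late-fusion sketch with a hypothetical weight, not SmartAlbum's actual method), combining ranked results from a content-based index and a speech-annotation index could look like:

```python
def fuse_scores(content_scores, speech_scores, w=0.5):
    """Late fusion of two retrieval modalities: a weighted sum of
    per-image scores, with missing scores defaulting to 0.
    Returns image names ranked by the fused score, best first."""
    images = set(content_scores) | set(speech_scores)
    fused = {img: w * content_scores.get(img, 0.0)
                  + (1 - w) * speech_scores.get(img, 0.0)
             for img in images}
    return sorted(fused, key=fused.get, reverse=True)

# "b.jpg" wins: a weak content match is compensated by a strong speech match.
ranking = fuse_scores({"a.jpg": 0.9, "b.jpg": 0.2}, {"b.jpg": 0.8})
print(ranking)  # ['b.jpg', 'a.jpg']
```

The weight `w` is an assumption; a real system would tune it, or learn it per query.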

Simplifying the management of large photo collections

A. Girgensohn, J. Adcock, M. Cooper, J. Foote, and L. Wilcox. Simplifying the management of large photo collections. In Proceedings of INTERACT’03, pages 196–203. IOS Press, 2003. [PDF]

——–

This paper contains useful references on the observation that event segmentation of pictures is a valid criterion for organizing them. The paper reports an algorithm that was compared with those of Graham et al. (2002), Platt et al. (2002), and Loui and Savakis (2000).

With digital still cameras, users can easily collect thousands of photos. Our goal is to make organizing and browsing photos simple and quick, while retaining scalability to large collections. To that end, we created a photo management application concentrating on areas that improve the overall experience without neglecting the mundane components of such an application. Our application automatically divides photos into meaningful events such as birthdays or trips. Several user interaction mechanisms enhance the user experience when organizing photos. Our application combines a light table for showing thumbnails of the entire photo collection with a tree view that supports navigating, sorting, and filtering photos by categories such as dates, events, people, and locations. A calendar view visualizes photos over time and allows for the quick assignment of dates to scanned photos. We fine-tuned our application by using it with large personal photo collections provided by several users.
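The common baseline for this kind of event segmentation is time-gap thresholding on photo timestamps. As a minimal sketch (a fixed-gap baseline, not the paper's actual adaptive algorithm; the six-hour threshold is an assumption):

```python
from datetime import datetime, timedelta

def segment_events(timestamps, gap=timedelta(hours=6)):
    """Split photo timestamps into events: a new event starts whenever
    the gap between consecutive photos exceeds `gap`."""
    events, current = [], []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > gap:
            events.append(current)
            current = []
        current.append(ts)
    if current:
        events.append(current)
    return events

photos = [datetime(2003, 7, 1, 10, 0), datetime(2003, 7, 1, 10, 30),
          datetime(2003, 7, 2, 18, 0), datetime(2003, 7, 2, 18, 5)]
print(len(segment_events(photos)))  # 2: the July 1 morning and the July 2 evening
```

Adaptive variants, like the ones compared in the paper, replace the fixed `gap` with a threshold that depends on the local density of photo-taking.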

Modality preference – learning from users

R. Wasinger and A. Krüger. Modality preference – learning from users. In Proceedings of the User Experience Design for Pervasive Computing workshop, Munich, Germany, 11 May 2005. [PDF]

——-

This paper describes a qualitative comparison of input modalities for mobile applications. The authors conducted a user study in which about 50 users were asked to fill in questionnaires on a PDA. The questions targeted products that the users could find in a shop and therefore had to be answered on site. The users were divided into different groups, and each group was exposed to a different input modality: speech, handwriting, and intra- and extra-gestures.

Interestingly, speech was reported as the most comfortable modality to use in comparison with handwriting, but also the modality that most exposed privacy. Users reported being uncomfortable speaking sensitive information to their devices. Handwriting was seen as conflicting with other manual activities that needed to be carried out while shopping.

Motivating annotation for personal digital photo libraries: Lowering barriers while raising incentives

J. Kustanowitz and B. Shneiderman. Motivating annotation for personal digital photo libraries: Lowering barriers while raising incentives. Technical Report HCIL-2004-18, University of Maryland, College Park, MD, USA, 2005. [PDF]

——–

The main argument of this paper is that annotation is a tedious activity. Few users take the time to annotate their collection carefully and systematically, and therefore thousands of pictures lie unorganized in folders, pretty much as printed pictures lie in shoeboxes. The authors highlight how part of the problem is that users themselves do not understand what they can do differently once pictures are annotated.

To raise motivation for tagging pictures, this paper describes several applications for visualizing collections of annotated pictures. For instance, Birthday Collage is an application that automatically pulls together pictures taken at birthdays for the same person.

The value of personal digital photo libraries grows immensely when users invest effort to annotate their photos. Frameworks for understanding annotation requirements could guide improved strategies that would motivate more users to invest the necessary effort. We propose one framework for annotation techniques along with the strengths and weaknesses of each one, and a second framework for target user groups and their motivations. Several applications are described that provide useful and information-rich representations, but which require good annotations, in the hope of providing incentives for high quality annotation. We describe how annotations make possible four novel presentations of personal photo collections: (1) Birthday Collage to show growth of a child over several years, (2) FamiliarFace to show family trees of photos, (3) Kaleidoscope to show photos of related people in an appealing tableau, and (4) TripPics to show photos from a sequential story such as a vacation trip.

Kustaniwitz_BirthdayCollage.jpg

Direct annotation: A drag-and-drop strategy for labeling photos

B. Shneiderman and H. Kang. Direct annotation: A drag-and-drop strategy for labeling photos. In IV ’00: Proceedings of the International Conference on Information Visualisation, pages 88–96, Washington, DC, USA, 2000. IEEE Computer Society. [PDF]

———–

The basic argument of the authors of this paper is that large picture collections need annotation in order to support efficient retrieval. However, annotation is a tedious and time-consuming task. The authors present a basic HCI technique that allows the user to annotate without much typing: keywords can be input once and for all through a list, and then become available to be dragged and dropped onto pictures.

The system has the advantage of supporting labeling of regions of the pictures (see figure below) but has the disadvantage of reduced scalability.

Annotating photos is such a time-consuming, tedious and error-prone data entry task that it discourages most owners of personal photo libraries. By allowing users to drag labels such as personal names from a scrolling list and drop them on a photo, we believe we can make the task faster, easier and more appealing. Since the names are entered in a database, searching for all photos of a friend or family member is dramatically simplified. We describe the user interface design and the database schema to support direct annotation, as implemented in our PhotoFinder prototype.
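The abstract mentions a database schema supporting direct annotation but does not reproduce it. A hypothetical minimal schema in the same spirit (the table and column names are my illustration, not PhotoFinder's actual schema) could be:

```python
import sqlite3

# A reusable label list plus an annotation table that records where on
# the photo each label was dropped, enabling region labeling.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photos (photo_id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE labels (label_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE annotations (
    photo_id INTEGER REFERENCES photos(photo_id),
    label_id INTEGER REFERENCES labels(label_id),
    x REAL, y REAL  -- drop position, as a fraction of the image size
);
""")
conn.execute("INSERT INTO photos VALUES (1, 'family_reunion.jpg')")
conn.execute("INSERT INTO labels VALUES (1, 'Grandma')")
conn.execute("INSERT INTO annotations VALUES (1, 1, 0.4, 0.2)")

# Finding all photos of a friend or family member becomes a simple join:
rows = conn.execute("""
    SELECT p.filename FROM photos p
    JOIN annotations a ON a.photo_id = p.photo_id
    JOIN labels l ON l.label_id = a.label_id
    WHERE l.name = 'Grandma'
""").fetchall()
print(rows)  # [('family_reunion.jpg',)]
```

Storing labels once in their own table is what makes the drag-and-drop list reusable across photos and keeps name spellings consistent for search.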

Shneiderman_Direct-annotation.png

The Genetic Map of Europe

Biologists have constructed a genetic map of Europe showing the degree of relatedness between its various populations. This genetic map bears many similarities to the geographical map. The major genetic differences are between the populations of northern and southern Europe.

Researchers say that the pattern of genetic differences reflects the colonization history of Europe, which dates back 45,000 years. The map also identifies the existence of two genetically isolated groups: the Finns and the Italians.

The former arose because the Finnish population was at one time very small and then expanded, bearing the atypical genetics of its few founders. The latter separates the Italians (yellow, bottom center) from the rest, and may reflect the role of the Alps in impeding the free flow of people between Italy and the rest of Europe.

[more]

genetic-map-of-europe.jpg

Google “Dream” Phone

This week the long-awaited Google Phone produced by HTC was approved by the American FCC for use in the United States. This means that commercialization might start as early as October this year.

I thought it was a good idea to write a short post about it because we have reached a critical mass of demos and applications circulating on the Net. Google’s philosophy is clear, and it is the opposite of that of Apple or Nokia: using open-source solutions and leveraging open-source movements to create the critical ecology around their product.

They also pulled together a nice team of designers, who released a clever set of features for the interface. This phone will nicely integrate Google’s products that we use and love, like gMaps, gMail and so on. However, it will also provide new experience opportunities for users. For instance, the HTC device should incorporate a compass that should allow an augmented-reality experience of gMaps’ Street View: turning the device around should adapt the view of the map so that the virtual experience closely resembles the physical one. Plus, the application will overlay street names on the view, giving useful information to the user.

The gPhone is also full of clever UI features that are going to make the user experience both richer and simpler. For instance, access to notifications was aggregated in a single part of the interface: a pull-down tile on the menu bar.

In sum, I am looking forward to putting my hands on the first commercial model and writing a full review of the system. For now, I celebrate what appear to be clever design decisions.

android-home-screen.jpg

people tend to stick to their own size

One reason Helena and I would never be close friends is that I am about half as tall as she. People tend to stick to their own size group because it’s easier on the neck. Unless they are romantically involved, in which case the size difference is sexy. It means: I am willing to go the distance for you.

Miranda July. No one belongs here more than you. Canongate, Edinburgh, UK, 2007.

[web site]

Photo annotation on a camera phone

A. Wilhelm, Y. Takhteyev, R. Sarvas, N. V. House, and M. Davis. Photo annotation on a camera phone. In CHI ’04: CHI ’04 extended abstracts on Human factors in computing systems, pages 1403–1406, New York, NY, USA, 2004. ACM. [PDF]

——

This paper argues for using the multimedia capabilities of phones to help with the indexing and retrieval of pictures. The authors built a system capable of attaching metadata, such as the user’s position, and user-defined keywords to each picture. This information was stored on a remote server and then used during retrieval.

To evaluate their design, they conducted a user study in which they deployed the system to a group of 55 participants. They found that one of the major problems was the unpredictability of the network connection, which made interaction with the remote server lengthy. From the study they also learned that participants used their phone’s camera extensively, but the value they attributed to many pictures was very little. Instead, the users took advantage of the networking capabilities of the phone to share events of their life with their friends. The capture of these moments would not have been possible without the continuous availability of the mobile phone.

Interviewed participants expressed more interest in sharing and browsing the captured images than in retrieving specific pictures. They were generally not interested in fully annotating pictures with keywords.

The ubiquitous camera: An in-depth study of camera phone use

T. Kindberg, M. Spasojevic, R. Fleck, and A. Sellen. The ubiquitous camera: An in-depth study of camera phone use. IEEE Pervasive Computing, 4(2):42–50, 2005. [PDF]

——-

This paper describes a study of how people use camera phones. The authors interviewed 34 subjects, trying to understand what they photographed and why. They developed a number of categories for the subjects of pictures, dividing them into people and scenes/things. The authors found that people took pictures with their mobiles for affective reasons: to enrich a mutual experience by sharing an image with those who were present at the time of capture. They also took pictures to make an absent friend or family member aware of an event.

Additionally, the authors found that many pictures were taken for functional reasons: supporting a mutual task with co-located friends, a remote task for a friend, or a personal task.

Two thirds of the images examined were captured to share, mainly for affective reasons, and the sharing mainly took place when people were face-to-face.

The authors highlight that the key value of a camera phone is the ability to spontaneously show images, and suggest that finding and browsing images should therefore be simple and fast. The portability and pervasiveness of capture of this content has tremendous value for interconnection with digital and public displays, and for absent people.

The interviewed participants reacted positively to the idea of enriching this content with contextual information. Finally, the authors found that browsing images on the subjects’ phones was ineffective for retrieval when many pictures were present.

Kindberg_Camera-Phone.jpg