Psykinematix: Visual Psychophysics for OSX

Psykinematix is a new OpenGL-based software package dedicated to visual psychophysics, running on Mac OS X. It consists of a single stand-alone application that requires no programming skills to create and run complex experiments.

Easy to use, subject-friendly, powerful and reliable, Psykinematix runs standard psychophysical protocols, presents complex stimuli, collects subjects’ responses, and analyzes results on the fly. Psykinematix is also a great learning tool for introducing visual perception and illustrating psychophysical concepts to students.

Psykinematix_logo.jpg

Image retrieval: current techniques, promising directions and open issues

Rui, Y., Huang, T., and Chang, S. Image retrieval: current techniques, promising directions and open issues. Journal of Visual Communication and Image Representation 10, 4 (April 1999), 39–62. [PDF]

——–

This article summarizes many years of research in the field of image information retrieval. It describes open challenges and the state of the art in the field.

One of the main difficulties stems from the rich content of images and the subjectivity of human perception. According to the authors, this creates a mismatch between the metadata annotations produced by different techniques and the retrieval efficacy and satisfaction perceived by the user.

To improve image information retrieval systems, “humans have to be in the loop”. The authors cite a good deal of work conducted to this end (e.g., QBIC interactive region segmentation, the interactive FourEyes, the dynamic feature vector recomputation of WebSEEK, the MARS and PicHunter relevance feedback, and so forth).

The ultimate end user of an image retrieval system is a human; the study of how humans perceive image content at a psychophysical level is therefore crucial.

Additionally, the authors refer to Rogowitz et al. (1998), who conducted a series of experiments analyzing human psychophysical perception of image content. According to their results, even though visual features do not capture the whole semantic meaning of the images, they do correlate strongly with the semantics. These results encourage the development of metrics to achieve semantically meaningful retrievals.

Human perception-driven, similarity-based access to image databases

Celebi, M. E., and Aslandogan, Y. A. Human perception-driven, similarity-based access to image databases. In Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference (Clearwater Beach, Florida, May 15–17 2005), I. Russell and Z. Markov, Eds., pp. 245–251. [PDF]

——–

In this work, the authors used human perception of similarity as a guide in optimizing an image distance function in a content-based image retrieval system. A psychophysical experiment was designed to measure the perceived similarity of each image with every other image in the database. The weights of the distance function were optimized by means of a genetic algorithm, using the distance matrix obtained from the subjective experiments. With the optimized distance function, the retrieval performance of the system improved significantly.

In this study, the authors focused on shape similarity. However, the authors argue that the same approach can be used to develop similarity functions based on other low-level features such as color or texture.
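To make the idea concrete, here is a minimal sketch of fitting distance weights to human similarity judgments with a small genetic algorithm. It is not the authors' implementation: the feature vectors, the "human" distance matrix (generated here from hidden weights as a stand-in for subject ratings), the weighted Euclidean form of the distance, and all parameter values are assumptions made for illustration.

```python
import random

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def mismatch(weights, features, human):
    """Sum of squared differences between computed and human-rated distances."""
    n = len(features)
    return sum(
        (weighted_distance(features[i], features[j], weights) - human[i][j]) ** 2
        for i in range(n) for j in range(i + 1, n)
    )

def evolve_weights(features, human, pop_size=30, generations=60, seed=0):
    """Tiny elitist genetic algorithm over non-negative weight vectors."""
    rng = random.Random(seed)
    dim = len(features[0])
    population = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    population[0] = [1.0] * dim  # the unweighted distance is one starting candidate
    for _ in range(generations):
        population.sort(key=lambda w: mismatch(w, features, human))
        parents = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p, q)]   # uniform crossover
            k = rng.randrange(dim)
            child[k] = max(0.0, child[k] + rng.gauss(0, 0.1))  # mutate one weight
            children.append(child)
        population = parents + children
    return min(population, key=lambda w: mismatch(w, features, human))

# Synthetic stand-in data (hypothetical): 8 "images" with 3 features each.
data_rng = random.Random(1)
FEATURES = [[data_rng.random() for _ in range(3)] for _ in range(8)]
HIDDEN = [1.0, 0.0, 0.25]  # pretend subjects ignore the second feature
HUMAN = [[weighted_distance(a, b, HIDDEN) for b in FEATURES] for a in FEATURES]

best = evolve_weights(FEATURES, HUMAN)
print("fitted weights:", [round(w, 2) for w in best])
print("mismatch, unweighted:", round(mismatch([1.0] * 3, FEATURES, HUMAN), 4))
print("mismatch, fitted:", round(mismatch(best, FEATURES, HUMAN), 4))
```

Because the better half of each generation is carried over unchanged, the fitted weights can never match the judgments worse than the unweighted starting candidate.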

This paper contains relevant references to image retrieval systems trained on human perception.

Celebi_exp-gui.jpg

Emulating human perception of motion similarity

Tang, J. K. T., Leung, H., Komura, T., and Shum, H. P. H. Emulating human perception of motion similarity. Comput. Animat. Virtual Worlds 19, 3-4 (2008), 211–221. [PDF]

———-

Evaluating the similarity of motions is useful for motion retrieval, motion blending, and performance analysis of dancers and athletes. Euclidean distance between corresponding joints has been widely adopted in measuring similarity of postures and hence motions. However, such a measure does not necessarily conform to the human perception of motion similarity.
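The Euclidean baseline mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a motion is a list of frames, each frame is a list of 3D joint positions in the same joint order, and the two motions are already time-aligned and of equal length. The function names and the toy data are hypothetical.

```python
import math

def posture_distance(pose_a, pose_b):
    """Sum of Euclidean distances between corresponding joints of two postures."""
    return sum(math.dist(p, q) for p, q in zip(pose_a, pose_b))

def motion_distance(motion_a, motion_b):
    """Mean posture distance over frames (assumes time-aligned motions)."""
    if len(motion_a) != len(motion_b):
        raise ValueError("motions must have the same number of frames")
    total = sum(posture_distance(a, b) for a, b in zip(motion_a, motion_b))
    return total / len(motion_a)

# Toy motion: two frames, two joints per frame (hypothetical data).
walk = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.1, 0.0, 0.0), (0.1, 1.0, 0.0)],
]
# The same motion translated by (3, 4, 0): every joint moves by 5 units.
shifted = [[(x + 3.0, y + 4.0, z) for x, y, z in pose] for pose in walk]

print(motion_distance(walk, walk))
print(motion_distance(walk, shifted))
```

The translated copy shows one reason such a measure can disagree with perception: the two motions look identical to a viewer, yet the joint-wise distance between them is large.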

In this paper, the authors propose a new similarity measure based on machine learning techniques. They make use of questionnaire results in which subjects answered whether arbitrary pairs of motions appear similar or not. Using the relative distances between the joints as the basic features, they train the system to compute the similarity of arbitrary pairs of motions. Experimental results show that their method outperforms methods based on Euclidean distance between corresponding joints.

Their method is applicable to content-based retrieval of human motion in large-scale database systems. It is also applicable to e-Learning systems that automatically evaluate the performance of dancers and athletes by comparing the subjects’ motions with those of experts.

Tang_MotionPerception.jpg

Life, Ideas, Future, Together (LIFT) 2009: Where has the future gone?

Finally, I found some time to write a short report about the inspiring talks I attended at LIFT09. Patrick Gyger presented the first talk. He is the director of the “Maison d’Ailleurs”, a science-fiction museum in Yverdon, Switzerland. He talked about how the futuristic ideas imagined by science fiction at the beginning of the century did not make it into the real future.

The future was sketched as a stylish one. What happened to these visions? Science fiction as a genre started at the beginning of the 20th century. In these early attempts the future was not a function but a style. Flying cars did exist: many prototypes were produced and certified in the past. Why did they not work? Because they answered not a real need but only a dream. Other things did make it to the present, for example the wristwatch and cybernetics. The future did not take the form that was designed for it, but some of its functions are here. We no longer have an urgent need for utopias: in developed countries we live in too much comfort, with food, lodging… Changing our material world alone will not change our lives. We require a societal change.

Nicolas Nova, my friend and colleague back at EPFL, extended this initial flow of ideas with a talk titled “The Recurring Failure of Holy Grail”. He talked about many products that were designed and presented as examples of futuristic technology but that failed to become mass products. He presented three examples: 1) the videophone, 2) the intelligent fridge, and 3) location-based services. Nicolas tried to sketch possible reasons for the failure of these products. He described how researchers were overoptimistic and how they had little knowledge of similar previous attempts. He described how easy it is to get trapped in the zeitgeist of the futuristic wave surrounding an invention. Additionally, he pointed out how we often forget the development and adoption cycles that new technologies need to go through. Finally, he pointed out that we generally have a poor understanding of “users”. I totally subscribe to Nicolas’ idea that there is a need to document failures just as we document successes.

David Rose presented a number of “Ambient Devices”, little gadget interfaces that were somewhat inspired by fictional stories. He described how fiction sometimes foreshadows innovation. He presented a number of “enchanted objects” designed around some basic fictional powers or desires, the first being the desire to know. For example, the desire to know the truth leads to the invention of a truth machine, like the mirror in Snow White. In this spirit he presented the Single pixel browser, a sphere that changes color according to some information available somewhere on the Internet. Its principle is that summarization is more valuable because it requires less time and attention. David presented a number of other inspiring projects, like an Internet-connected pill container or sensors embedded in the fabric of the home.

Lee Bryant gave a presentation titled: “The twentieth century was wrong”. His main message was that many products, campaigns, initiatives treat people like objects/mass. This is wrong. Models based on social networks, where every member contributes to the group, are much more interesting and proven to work. Let’s stick to that.

Juliana Rotich talked about citizen journalism: “Globalism, Mobiles and the Cloud”. She described how volunteers around the world can provide objective and localized news that is not controlled by mainstream media. She described the project GlobalVoices. For this platform to work, another project had to be started, the LINGUA translation project, because English does not equate with global. She described a couple of examples of services that are extremely popular in Africa, like MobInfo in Kenya. A citizen journalist is not simply a person with a mobile phone: they still need some journalistic skills.

Carlo Ratti directs the SENSEable City Lab at MIT. He described a number of projects that he conducted around the theme of “Future Cities”. GreenWheel in Copenhagen is a system that enhances a bicycle so that some of the energy accumulated while pedaling can be reused to propel the bike. Trash Track is a project that tracks the movement of trash through a city to improve sanitation systems. He finally described the Digital Water Pavilion built for the expo in Zaragoza.

Anne Galloway described the core of her PhD research: envisioning the future city. She argues that many services and products that researchers develop should be considered as gifts to the users. Expectations, promises and hopes are things that we do: these are GIFTED opportunities. For example, sensor technologies can allow citizens to map environmental conditions, and citizens can use these data to take political action. This gift needs us to want to act as data collectors, and it needs us to have the ability to make sense of the data we collect. Most of the time people do not want these gifts, hence their failure. Additionally, gifted opportunities also imply gifted risks: when active citizenship requires access to technology, people without access effectively become non-citizens.

Finally, Baba Wame talked about “how African women have embraced dating websites in Cameroon”. He explained, in a hilarious speech, how they use this new technology to escape the difficult situations they live in and how they appropriate this technology even if they are, in most cases, illiterate.

Understanding video interactions in YouTube

Benevenuto, F., Duarte, F., Rodrigues, T., Almeida, V. A., Almeida, J. M., and Ross, K. W. Understanding video interactions in youtube. In MM ’08: Proceeding of the 16th ACM international conference on Multimedia (New York, NY, USA, 2008), ACM, pp. 761–764. [PDF]

———–

This paper reports a number of experiments conducted to understand user behavior in a social network created essentially by video interactions. The authors crawled and analyzed interactions on YouTube, finding many peculiarities of this kind of system.

For instance, unlike other social networks that exhibit a significant degree of symmetry, the user interaction network shows a structure similar to the Web graph, where pages with high in-degree tend to be authorities and pages with high out-degree act as hubs directing users to recommended pages. This analysis is very useful for detecting spammers, which might be nodes with very high out-degree.

Similarly, unlike other social networks, the network in YouTube exhibits disassortative mixing, where high-degree nodes preferentially connect with low-degree ones and vice versa.
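Disassortative mixing can be quantified as a negative correlation between the degrees found at the two ends of each edge. The sketch below is a simplified, undirected version written for illustration; the paper's network is directed, and the function name and toy graph are assumptions.

```python
from collections import Counter

def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each edge.

    Undirected simplification: each edge is counted in both directions.
    Assumes the graph is not regular (otherwise the variance is zero).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# A star: one hub connected to three low-degree nodes (toy data).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(degree_assortativity(star))  # -1.0: fully disassortative
```

Values near -1 mean that high-degree nodes attach mostly to low-degree ones, values near +1 mean that nodes attach to others of similar degree.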

The authors also used a clustering coefficient (CC) to show the presence of small communities in the video-response network. Specifically, 80% of all nodes in the entire user interaction network have CC = 0, meaning that highly responsive users do not necessarily have social links with the contributors of the videos that they are responding to. Therefore, there might not exist a sense of community among the users that receive video responses from a responsive user.
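As a reminder of what the clustering coefficient measures, here is a small sketch of the local coefficient of a node in an undirected graph: the fraction of pairs of its neighbours that are themselves connected. The paper may use a directed variant; the adjacency dictionary and names below are illustrative assumptions.

```python
from itertools import combinations

def local_clustering(adjacency, node):
    """Fraction of neighbour pairs of `node` that are linked to each other."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to connect
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))

# Toy graph: a triangle a-b-c plus a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

for node in sorted(graph):
    print(node, round(local_clustering(graph, node), 3))
```

Node b sits in a closed triangle and gets CC = 1, while d has a single neighbour and gets CC = 0, which is the situation reported for most users in the interaction network.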

The analysis also highlighted three kinds of anti-social behavior: a) submitting videos with a long list of misleading tags; b) posting video responses that are unrelated to the original video; and c) boosting the ranking of one's own videos to make them highly visible.

Finally, the authors used an inter-reference distance (IRD) to characterize users’ behavior: in the sequence of video responses to video i, it is the total number of responses that appear between two video responses from the same user. The authors compared this metric with a manual coding of users as social or anti-social and showed that the metric correctly identifies 80% of anti-social users.
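One plausible reading of the IRD definition can be sketched as follows: for each user, count the responses that fall between two consecutive responses of that user to the same video. Whether the paper computes it exactly this way is an assumption, and the function name and sample sequence are hypothetical.

```python
def inter_reference_distances(responders):
    """Map each user to the gaps between their consecutive responses.

    `responders` is the ordered list of user ids that responded to one video.
    A gap is the number of responses by anyone that appear between two
    consecutive responses from the same user. Users who responded only once
    do not appear in the result.
    """
    last_seen = {}
    distances = {}
    for position, user in enumerate(responders):
        if user in last_seen:
            gap = position - last_seen[user] - 1
            distances.setdefault(user, []).append(gap)
        last_seen[user] = position
    return distances

# Toy response sequence for one video (hypothetical user ids).
sequence = ["a", "b", "c", "a", "a"]
print(inter_reference_distances(sequence))  # {'a': [2, 0]}
```

Under this reading, a user who posts many responses back to back yields a run of very small distances, which is the kind of pattern one would expect from ranking boosting.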

Benevenuto_YouTube.jpg