Waisda? Video Labeling Game: Evaluation Report

Monday, January 18th, 2010

The Waisda? (which translates to What’s that?) video labeling game was launched in May 2009. It invites users to tag what they see and hear, and players receive points for a tag if it matches a tag that their opponent has entered. Waisda? is the world’s first operational video labeling game. The underlying assumption is that tags are most probably valid if there’s mutual agreement. Over 2,000 people played the game, and within six months over 340k tags were added to over 600 items from the archive. Initial findings were published earlier, when the pilot period was still running. This evaluation report (PDF download, in Dutch) includes a quantitative and qualitative analysis of the tags, as well as a usability study of the game environment and a study into the incentives that apply to people playing the game. The evaluation report was written by Lotte Belice Baltussen, in collaboration with Maarten Brinkerink and Johan Oomen of the Netherlands Institute for Sound and Vision R&D Department. Researchers at the VU University Amsterdam, Business Web & Media Section, also provided crucial input. The VU University Amsterdam carries out this research in light of its involvement in the PrestoPRIME European research project.

The evaluation report provides evidence that crowdsourcing video annotation in a serious, social game setting can indeed enhance retrieval of video in archives. It features success factors organizations need to take into account in setting up services that aim to actively engage their audiences online. The main conclusions are listed below:



Video Labeling Game Waisda?: Preliminary results and ongoing research

Friday, October 23rd, 2009

Ten months ago the entry “The Wisdom of the Crowds in the Audiovisual Archive Domain” was posted on this research blog. In it, the interest of the Images for the Future consortium in creating social and open archives was discussed. Due to the large-scale digitisation of archival materials, the opportunities for offering public access to these materials have increased dramatically. One of the ways in which archives can provide access is by creating opportunities for social tagging. This allows people to annotate archival materials with their own key terms (tags). This is beneficial not only for the original tagger, who can use their own tags to find these self-annotated materials more easily in the future, but also for other users who search the user-annotated collection with similar search terms. The tags that are added by users might overlap with the metadata that is produced by experts, or generate new terms and consequently new ways of looking at and finding archival materials. This might bridge the semantic gap[1] between the vocabulary that annotation experts use and the ways in which the general public refers to and interprets (audio-visual) information.

The debate on whether tagging and other crowdsourcing possibilities will actually contribute to the accessibility of archives, or whether they will just cause chaos and make finding materials more complicated and murky, is still in full swing.[2] There have been some pilot projects on tagging and other crowdsourced metadata which generated some interesting and encouraging data. Notably, partners of the Flickr: The Commons project used this popular photo sharing website to make their collections more accessible, and to collect annotations from the public. The Nationaal Archief (the Dutch national archive, and one of the Images for the Future partners) and Spaarnestad Photo were part of this project. The results were promising[3], but more hard data is needed to show in what way tagging can be beneficial to archives. Thus, the result of the Images for the Future consortium’s interest in user-generated metadata was the development of a tagging game through which moving images can be annotated by the public. Through this game, a dataset will be gathered that can be used to answer some of the questions raised in this ongoing and topical debate.

Waisda? What’s that?

The online video labeling game Waisda? (which translates to What’s that?) is an initiative managed by the Netherlands Institute for Sound and Vision in close collaboration with the Dutch public broadcaster KRO (Catholic Radio Broadcasting). The game was developed by internet agency Q42. When the six-month pilot project ends in December, the VU University Amsterdam will research various possibilities for implementing these user-generated tags, and will develop new versions of the game with improved and extended game design and interface options.

Waisda? was launched in May and allows players to annotate Polygoon newsreel journals and KRO programmes such as Boer zoekt Vrouw (Farmer Wants a Wife), Spoorloos (Find my Family) and Memories. Recently the archive of Barend en Van Dorp, a popular Dutch talk show (broadcast from 1990 to 2006), was added to Waisda?

The basis of the game is simple. Players go to the Waisda? website and are presented with a selection of four different episodes of the programmes mentioned above. They can choose any of these programmes to start tagging. The programmes do not start from the beginning, but are played sequentially on the website. Players therefore drop in at whatever point the video has reached. This means they never have to wait for a game to start, but can start tagging straight away whenever they want. Players are asked to tag what they see and hear and receive points for a tag if it matches a tag that their opponent has typed in. The reasoning behind this is that a tag is probably valid if at least two people agree on it. This is the same assumption that is made in the case of the Games with a Purpose that were developed by Luis von Ahn, now professor of Computer Science at Carnegie Mellon University.[4] His ESP game demonstrator, in which two players add tags to a picture, was so successful that Google has licensed it under the name ‘Google Image Labeler’[5].
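The matching rule described above can be illustrated with a minimal sketch. It is not the actual Waisda? implementation: the function names, the normalisation step and the one-point-per-match scoring are assumptions made purely for illustration.

```python
def normalise(tag):
    """Lower-case a tag and collapse whitespace so trivial variants compare equal."""
    return " ".join(tag.lower().split())


def agreed_tags(player_tags, opponent_tags):
    """Return the tags both players entered, i.e. the tags that earn points."""
    mine = {normalise(t) for t in player_tags}
    theirs = {normalise(t) for t in opponent_tags}
    return mine & theirs


def score(player_tags, opponent_tags, points_per_match=1):
    """Hypothetical scoring: a fixed number of points for every agreed tag."""
    return points_per_match * len(agreed_tags(player_tags, opponent_tags))


if __name__ == "__main__":
    player = ["Tractor", "farmer", "cow "]
    opponent = ["cow", "tractor", "field"]
    print(sorted(agreed_tags(player, opponent)))
    print(score(player, opponent))
```

Only the tags that both players typed count; everything else is kept out of the score, which reflects the assumption that agreement between two people is a sign of validity.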

Right now the Netherlands Institute for Sound and Vision is working on a preliminary evaluation of Waisda? This involves both performing more research on how the game itself can be improved and analysing the crowdsourced tags that were added by the players in the last months. Since Waisda? was launched in May 2009, well over five hundred videos have been tagged by a total of almost 2,300 unique players. There are 150 registered players, most of whom return frequently to play the game. So far over 14,000 unique tags have been added via Waisda? and this number still rises every day. In the end, the dataset generated by the people who play Waisda? will be used for in-depth research that will result in recommendations for improving the accessibility of, and search functionalities for, audiovisual archives with crowdsourced metadata.

Tag analysis and research

The tags that were added so far were compared to the terms in the GTAA thesaurus the Netherlands Institute for Sound and Vision uses to classify the audiovisual materials in their archive, and almost 15 % of the tags provided a perfect match. This may not seem like a big number at first glance, but the GTAA contains only very specific terms like person names, genres and topics, and it was therefore not expected that many tags would match with this professional thesaurus. The tags were also compared to another database called Cornetto which contains the bulk of all official Dutch words, and another 45 % of the tags matched. There is some overlap between the Sound and Vision thesaurus and Cornetto, but still well over half the tags added via Waisda? are definitely usable based upon this first simple quantitative analysis.
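As a rough illustration of this kind of quantitative comparison, the sketch below checks a list of tags against two word lists. The vocabularies and tags are toy stand-ins, not data from the GTAA or Cornetto, and the function name is hypothetical. To avoid counting overlapping terms twice, it is assumed here that a tag found in the thesaurus is not counted again for the general lexicon.

```python
def match_report(tags, gtaa_terms, cornetto_words):
    """Split unique tags into thesaurus matches, lexicon-only matches and the rest."""
    unique = {t.strip().lower() for t in tags}
    in_gtaa = {t for t in unique if t in gtaa_terms}
    # Tags already matched in the thesaurus are excluded to avoid double counting.
    in_cornetto_only = {t for t in unique - in_gtaa if t in cornetto_words}
    unmatched = unique - in_gtaa - in_cornetto_only
    return {"gtaa": in_gtaa, "cornetto_only": in_cornetto_only, "unmatched": unmatched}


if __name__ == "__main__":
    # Toy data, invented for this example.
    gtaa = {"beatrix", "agriculture"}
    cornetto = {"cow", "farmer", "agriculture", "illegitimate", "children"}
    tags = ["Beatrix", "cow", "farmer", "tracter", "illegitimate children", "cow"]

    report = match_report(tags, gtaa, cornetto)
    total = len({t.strip().lower() for t in tags})
    for group, members in report.items():
        print(f"{group}: {len(members)} of {total} unique tags")
```

The unmatched group is where misspellings and multi-word tags end up, which is why the following paragraphs look at those cases separately.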

This does not imply that the other half is not usable. There are, for instance, tags that contain spelling or typing errors but still point to relevant terms. After analysing a representative, random sample of the tags, a little under 10 % of them turned out to contain an error. It is expected that this percentage will eventually be lower, since there are players who re-enter an erroneous tag correctly after realising their mistake, in order to still receive points.

There are also tags that consist of more than one word and that are not recognised as correct terms in the Sound and Vision thesaurus or Cornetto. For example, the tag ‘illegitimate children’ does not appear in the GTAA thesaurus or the Cornetto vocabulary, but the individual words do appear in Cornetto. Thus, multi-word tags that do not match either vocabulary can still prove to be very useful once they are split into their individual words.
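A minimal sketch of this splitting step, assuming a simple whitespace split and a toy vocabulary (the function name and the data are hypothetical, not part of the Waisda? analysis):

```python
def salvage_words(tag, vocabulary):
    """Return the individual words of a tag that appear in the vocabulary.

    If the whole tag is already a known term, it is returned unchanged.
    """
    cleaned = " ".join(tag.lower().split())
    if cleaned in vocabulary:
        return [cleaned]
    return [word for word in cleaned.split() if word in vocabulary]


if __name__ == "__main__":
    vocabulary = {"illegitimate", "children", "farmer"}  # toy stand-in for Cornetto
    print(salvage_words("illegitimate children", vocabulary))
    print(salvage_words("farmer", vocabulary))
    print(salvage_words("xyzzy children", vocabulary))
```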

Another area that requires additional research is tags that appear in multiple categories of the thesaurus and Cornetto. The tag ‘link’ means ‘dangerous’ or ‘connection’ in Dutch, among other things, and the term is therefore ambiguous. To find out which meaning the tagger intended, one solution would be to analyse the tags that were added to that video in proximity to ‘link’. If ‘scary’ and ‘exciting’ were added besides ‘link’, it is possible to semantically determine that in this case the meaning ‘dangerous’ is the most plausible. Ideally, semantic software can be used to make these determinations automatically.
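The idea of using neighbouring tags as context can be sketched as follows. This is only an illustration under stated assumptions: tags are assumed to carry a timestamp in seconds, "proximity" is assumed to mean a fixed time window, and the cue words listed for each sense are invented for the example rather than taken from Cornetto or any semantic software.

```python
def nearby_tags(entries, target_time, window=10.0):
    """Collect the tags entered within `window` seconds of `target_time`."""
    return {tag for time, tag in entries if abs(time - target_time) <= window}


def pick_sense(context, senses):
    """Pick the sense whose cue words overlap most with the context tags.

    Returns None when no cue word occurs in the context at all.
    """
    overlaps = {sense: len(cues & context) for sense, cues in senses.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None


if __name__ == "__main__":
    # Hypothetical cue words per sense of the ambiguous tag 'link'.
    senses = {
        "dangerous": {"scary", "exciting", "risk"},
        "connection": {"chain", "network", "relation"},
    }
    # (seconds into the video, tag) pairs, invented for this example.
    entries = [(12.0, "scary"), (14.5, "link"), (15.0, "exciting"), (90.0, "network")]

    context = nearby_tags(entries, target_time=14.5)
    print(pick_sense(context, senses))
```

In this toy run ‘scary’ and ‘exciting’ fall inside the window while ‘network’ does not, so the ‘dangerous’ sense wins. A real system would need a principled sense inventory and a way to handle ties.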

These and other topics are analysed in a follow-up research project that is executed in close collaboration with the VU University Amsterdam. This project is part of the European PrestoPRIME programme, in which various partners are collaborating to “research and develop practical solutions for the long-term preservation of digital media objects, programmes and collections.”[6] The university’s research will take three years, and will result in in-depth advice on how to process and implement user generated content such as the Waisda? tags, as well as new implementations for game and interface design, which will be discussed later on in this article.



The Wisdom of the Crowds in the Audiovisual Archive Domain

Thursday, December 18th, 2008

BCK – social tagging by pulguita (CC-BY-SA)

Our consortium partner, the Dutch National Archives, recently joined Flickr: The Commons. Within Flickr: The Commons, leading archives across the globe upload items from their collections to Flickr and invite visitors to add tags and comments to them. This has been a major success: in six weeks the 500 photos of the National Archives have been viewed 600,000 times and 1,200 tags have been added. Putting material out in the open, as the Dutch National Archives did on Flickr, raises questions. Are general users qualified enough to complete or even replace the annotations made by archivists? Who is responsible for the outcome of the annotation process? How do we motivate users to annotate the material? These questions remain partly unanswered. This article tries to shed some light on possible future directions.

Users nowadays create their own content. In a recent lecture at the University of Toronto, David Weinberger described Web 2.0 as a radical change in thinking about information. There is no longer one truth provided by one source, but instead there’s an ecosystem of truths provided by many. This ecosystem is constantly changing and evolving. Content becomes similar to connection and metadata becomes data.[1] Users are no longer passive visitors of websites but active creators. They give new meaning to existing information by linking different sources and using personal preferences to arrange information.

Social Archives?
The change in the perception of the Internet as a medium offers a lot of opportunities for Images for the Future. Digitising the archival material takes a lot of time, money and manpower. The digitised material has to be accessible to users. This includes not only the presentation of the material through different services but also the metadata, which make it searchable and add context. The creation of metadata is a very labour-intensive process and not very efficient when it’s done solely by archivists. Data mining technology might be a solution. Archives are also exploring how they can put the ‘wisdom of the crowds’ to use.[2]

One of the practices of social software the consortium is interested in is social tagging. Tagging allows users to label different forms of content, and the tags can also be used by other people to search that content. There are various ways to show tags, for instance as a tag cloud, where font size indicates the number of times a tag is used.
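The tag cloud idea can be made concrete with a small sketch. The linear scaling and the font-size range below are assumptions chosen for illustration; actual tag clouds often use other scales, such as logarithmic ones.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=30):
    """Map each tag to a font size that grows linearly with how often it was used."""
    counts = Counter(tags)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for tag, count in counts.items():
        if span == 0:
            # All tags are equally frequent, so they all get the smallest size.
            sizes[tag] = min_size
        else:
            sizes[tag] = round(min_size + (count - lowest) * (max_size - min_size) / span)
    return sizes


if __name__ == "__main__":
    tags = ["cow"] * 5 + ["farmer"] * 3 + ["tractor"]
    for tag, size in tag_cloud_sizes(tags).items():
        print(f"{tag}: {size}pt")
```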

One of the tasks of an audiovisual archive is to arrange the information and to embed it in a context. Because of social software, the role of an archive is changing. Annemieke de Jong from Sound and Vision describes these changes in the article Users, Producers & Other Tags. Instead of producing the metadata, documentalists and archivists increasingly classify and correct the metadata produced by others. Part of the metadata arises during digitisation; other metadata can be created by outside experts, crowdsourcing (an open call to an undefined group of people) and social tagging. According to De Jong, annotation by users can save time and money. “Free tagging by the general public could be of enormous help in making our collections accessible, on clip level and from multiple viewpoints.”[3]

Experts vs. the General User
There is a tension between the traditional annotation system and the social tagging system. Although the word ‘social’ suggests a community spirit among users, most of them are driven by self-interest.[4] Social taggers use tags primarily to save information that is relevant for their own purpose and add every kind of tag they like. This creates a folksonomy, a free-form system of tags modified by many users. Archivists, on the other hand, are experts who use metadata from a thesaurus, or a closed vocabulary. Their main goal is to make the information accessible for others, not for themselves. A thesaurus is usually thoroughly designed by a small group of people.

Folksonomies are based on the principle of the ‘wisdom of the crowd’: if a lot of people share the same opinion, it must be a correct one. Quality is defined by the majority and not by expertise. Wikipedia is built on this assumption. Articles contain general information and little weight is given to expert opinion. To Weinberger, the wisdom of the crowds is more credible than the wisdom of one, if the process of the creation of the content is visible to users.[5] Not everybody involved in Web 2.0 shares this opinion. One of the founders of Wikipedia, Larry Sanger, wanted more possibilities for expert contribution. He therefore started a new open encyclopedia, Citizendium, where experts have more authority.[6] This example shows that not everyone supports the wisdom of the crowds. Social tagging challenges the notion of quality and the value of expert opinion. De Jong states that social tagging should not be a substitute for annotation by experts. Instead there should be an exchange of information between the two systems.[7]

The different nature of the two systems makes it hard to combine them. Sound and Vision is involved in several projects that research the possibilities of combining both systems. The institute participates in the consortium MultimediaN. One of its projects is Spiegle, an alternative search engine to Google. Spiegle combines various levels of metadata like the user profile, the platform and the features of a collection. Unlike Google, this search engine provides very narrow results, which makes it easier for users to find the right content. Sound and Vision also participated in the project Total Content Recommendation with the Telematica Institute in Enschede. The project explored the possibilities of social tagging of audiovisual content and the incentives of users to tag the content. In addition, Sound and Vision participates in PrestoPRIME, a project in the 7th Framework programme of the EU for the research and development of long-term preservation of new media, which will start in January 2009. One of the goals of this project is to establish interoperability between various databases by enabling information exchange between different systems of metadata. Different forms of annotation are connected with each other, creating a semantic system of tags. Semantic tagging makes the annotation process much more efficient in the long term and eventually bridges the gap between different annotation systems.

Motivation of Users
The success of social software has two reasons. It creates weak ties between users and it operates in an open and free environment. Because of the openness of the software, users can choose their level of participation in a community.[8] Users have to be motivated to participate very actively. During the Total Content Recommendation project, Lex van Velsen and Mark Melenhorst of the Telematica Institute researched the incentives of users to tag video content. Following the classification of Cameron Marlow, research scientist at Facebook, they define six different incentives:

  1. Future retrieval
  2. Contribution and sharing
  3. Attract attention
  4. Play and competition
  5. Self presentation
  6. Opinion expression[9]

The authors tested these motives with two groups. Although the groups were very small, the authors conclude that users didn’t tag for play and competition or for self-presentation. The tagging sites that were tested didn’t have a game element, so it was to be expected that users didn’t use those sites for play and competition.

Someone who has done a lot of research on crowdsourcing and games is Luis von Ahn, a professor of Computer Science at Carnegie Mellon University. His research is based on the assumption that “computers still don’t possess the basic conceptual intelligence or perceptual capabilities that humans take for granted.”[10] Computers aren’t able to solve problems that are relatively easily solved by most humans. Von Ahn calls this human computation. The human brain can be treated as a processor in a distributed system that can perform a small part of a massive computation. To become part of this massive computation, people do require incentives to solve these kinds of problems, like a game.[11]

An example is the ESP Game. In this game, two random players see the same image, which they need to label. The goal of the game is to use the same label as your partner. Von Ahn also developed a game called Peekaboom to determine the place of an object within an image. Other games he developed are Verbosity, a game for collecting commonsense facts, and Phetch, a game for collecting image descriptions for the visually impaired.[12] Other researchers have developed games to collect metadata for audio content, like the Listen Game, Tag-a-Tune and MajorMiner. The Listen Game is based on a list of tags players can use to annotate the material. The other two games are similar to the ESP Game: the added tags are compared with those from a database or a different player. In both games players are free to use any tag they like.[13] There are also a few tagging initiatives based on video content. Yahoo’s Video Tag Game is based on the same principle as the ESP Game. Players earn points by adding similar tags.[14] This game, developed by Yahoo Research, is still in an exploratory stage. The game VideoTag, developed by Stacey Greenaway as an MSc research project, is already operational as a single-player game where players collect points by adding tags. Some tags are pitfalls (tags that are too obvious) that lower the player’s score. All these tagging games are examples of generating metadata in a playful way. Von Ahn sees great possibilities for these kinds of games. However, a game designed to solve a problem should produce the right solution and be fun at the same time.[15]

Future Archives
The future archive has to be an open archive to survive the transition from Web 1.0 to Web 2.0 and beyond, where users are able to use and label content for their own purpose. Archivists still have a role in evaluating and contextualizing the metadata created by general users. In order to stimulate the creation of different forms of metadata, archives could develop creative concepts like games to encourage users to create new metadata. Until now a lot of research has been done on gathering metadata from general users. Quantity seems more important than quality. Most of the games are modelled on the ESP Game. As von Ahn stated, tagging games should provide the right solution. But what is the right solution? Is it the metadata created by experts, or is it the wisdom of the crowd? And if we know the right solution, how do we control the tagging process to get it? Are players able to provide the right solution, or is it necessary for archives to check the metadata that is produced? If that’s the case, is a tagging game profitable enough? Further research should focus on these questions in order to gain more insight into the possibilities social tagging has for archives. Sound and Vision will release a video tagging environment in early 2009, in collaboration with the VU University Amsterdam and the broadcaster KRO, in order to answer some of the questions raised above.


  • Ahn, L. von. “Games with a Purpose.” Computer (Vol. 39:6, June 2006). pp. 92-94. URL
  • Jong, A. de. “Users, producers & other tags. Trends and developments in metadata creation.” Lecture at the FIAT/IFTA conference (October 2007). URL
  • Lusenet, Y. de. Geven en nemen. Archiefinstellingen en het sociale web. (Den Haag: Taskforce archieven, 2008).
  • Mechant, P. “Culture ‘2.0’: Social and Cultural Exploration through the use of Folksonomies and Weak Cooperation.” Cultuur 2.0. (Amsterdam: Virtueel Platform, 2007). URL
  • Siorpaes, K. & Hepp, M. “Games with a purpose for the semantic web.” Intelligent Systems (Vol. 23:3, May 2008). pp. 50-60. URL
  • Surowiecki, J. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. (New York: Random House, 2004).
  • Turnbull, D. e.a. “Five approaches to collecting tags for music.” ISMIR Conference (2008). pp. 225-230. URL
  • Velsen, L. van & Melenhorst, M. “User Motives for Tagging Video Content.” (2008). URL
  • Wartena, C. & Brussee, R. “Instance-based mapping between thesauri and folksonomies.” Proceedings of the 7th International Semantic Web Conference (ISWC’08) (2008). URL
  • Weinberger, D. “Knowledge at the End of the Information Age.” Bertha Bassam Lecture at the University of Toronto (2008). URL
  • Zwol, R. van. e.a. “Video Tag Game.” 17th International World Wide Web Conference (WWW developer track) (ACM Press, 2008).

[1] Weinberger, D. (2008).

[2] Surowiecki, J. (2004).

[3] Jong, A. de. (2007).

[4] Velsen, L. van & Melenhorst, M. (2008) p. 2.

[5] Weinberger, D. (2008).

[6] Lusenet, Y. de. (2008) p. 19-20.

[7] Jong, A. de. (2007).

[8] Mechant, P. (2007) p. 24.

[9] Velsen, L. van & Melenhorst, M. (2008) p. 2.

[10] Ahn, L. von. (2006) p. 96.

[11] Ibid. p. 96.

[12] Siorpaes, K. & Hepp, M. (2008) p. 51.

[13] Turnbull, D. e.a. (2008) p. 227.

[14] Zwol, R. van. e.a. (2008) p. 1-2.

[15] Ahn, L. von. (2006) p. 96-98.


Freebase: the semantic web application

Monday, November 26th, 2007

Another Wikipedia-style online encyclopedia has seen the light. But Freebase is something new. Its creator, the company Metaweb, is setting out to create a vast public database intended to be read by computers rather than people. Users still play an important role in Freebase. They set the types of relations between pieces of information. People add metadata instead of data. In this way, information will be structured to make it possible for software to define relationships and even meaning. In the words of TechCrunch’s Michael Arrington: This is cool unless it gets consciousness and kills us all.

How does it work?
When logged in (registration has been open to the public since November), you can add information on companies, movies, places, restaurants, etc., just as in Wikipedia. But you enter not only the data, but also the type of the information. For example, I chose to add a company to the database. When I entered Knowledgeland and told Freebase it’s a company, a new template with a lot of predefined structure came up, because Metaweb has defined a whole set of additional data that is typically associated with a company. I can choose to fill in the empty fields, such as employees. When I then click on the name of an employee, its relation with the company and its type are automatically established. Employees become persons, places become locations, etc. And all these new topics come with their own predefined fields. Searching has become a lot more intuitive because you can use the same fields for narrowing down the results. A search such as ‘show me all the companies in Amsterdam’ is done with two clicks.
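The workflow described above can be mimicked with a small, purely illustrative data structure. This is not Freebase’s API or its actual schema: the `SCHEMA` dictionary, the `TopicStore` class, its method names and the employee name are all hypothetical and only serve to show how typed fields let new topics receive a type automatically and make field-based search possible.

```python
# Hypothetical schema: every type lists its predefined fields and, for each
# field, the type that values of that field are expected to have.
SCHEMA = {
    "company": {"employees": "person", "location": "location"},
    "person": {"employer": "company"},
    "location": {},
}


class TopicStore:
    """A toy store of typed topics, loosely inspired by the workflow above."""

    def __init__(self, schema):
        self.schema = schema
        self.topics = {}  # name -> {"type": str, "fields": {field: [names]}}

    def add_topic(self, name, type_name):
        """Create a topic with an empty template of the fields for its type."""
        if name not in self.topics:
            fields = {field: [] for field in self.schema[type_name]}
            self.topics[name] = {"type": type_name, "fields": fields}
        return self.topics[name]

    def set_field(self, name, field, value):
        """Fill in a field; the value becomes a topic of the expected type."""
        topic = self.topics[name]
        value_type = self.schema[topic["type"]][field]
        self.add_topic(value, value_type)
        topic["fields"][field].append(value)

    def find(self, type_name, field, value):
        """Return the names of all topics of a type whose field contains value."""
        return sorted(
            name
            for name, topic in self.topics.items()
            if topic["type"] == type_name and value in topic["fields"].get(field, [])
        )


if __name__ == "__main__":
    store = TopicStore(SCHEMA)
    store.add_topic("Knowledgeland", "company")
    store.set_field("Knowledgeland", "location", "Amsterdam")
    store.set_field("Knowledgeland", "employees", "Example Employee")  # made-up name

    print(store.topics["Example Employee"]["type"])
    print(store.find("company", "location", "Amsterdam"))
```

Because every field declares the type of its values, filling in a field is enough to create a correctly typed topic, and the same fields double as search filters.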

Open for everyone
Freebase has already sucked in data from Wikipedia and other sources, and users can fill in their data too. Currently Freebase counts almost 3 million topics. More than 1,200 relationships in the form of types have been established between these topics within 68 domains. Just as with Google, developers can extract information from Freebase and add it to their web applications. The information users add is licensed under the Creative Commons Attribution License or Public Domain. Because the information is structured, other web applications can use Freebase to display its information in new ways.

Freebase is interesting not only for its collective intelligence. The workflow of entering metadata is highly intuitive and can function as a blueprint for crowdsourcing purposes. Archives don’t need to worry about the types of relations: users create them on the fly.

Perhaps Freebase marks the start of a new era in gathering information. Perhaps not. But one thing is sure: Freebase is potentially the Google killer for harvesting collective intelligence.

Introduction to Freebase (screencast)

Related posts
- Freebase @ Techcrunch
- Tim O’Reilly about Freebase