Video Labeling Game Waisda?: Preliminary results and ongoing research

Friday, October 23rd, 2009

Ten months ago the entry “The Wisdom of the Crowds in the Audiovisual Archive Domain” was posted on this research blog. It discussed the interest of the Images for the Future consortium in creating social and open archives. Due to the large-scale digitisation of archival materials, the opportunities for offering public access to these materials have increased dramatically. One of the ways in which archives can provide access is by creating opportunities for social tagging. This allows people to annotate archival materials with their own key terms (tags). This benefits not only the original tagger, who can use their own tags to find these self-annotated materials more easily in the future, but also other users who search the user-annotated collection with similar search terms. The tags that are added by users might overlap with the metadata that is produced by experts, or generate new terms and consequently new ways of looking at and finding archival materials. This might bridge the semantic gap[1] between the vocabulary that annotation experts use and the ways in which the general public refers to and interprets (audio-visual) information.

The debate on whether tagging and other crowdsourcing possibilities will actually contribute to the accessibility of archives, or whether they will just cause chaos and make finding materials more complicated and murky, is still in full swing.[2] There have been some pilot projects on tagging and other crowdsourced metadata which generated some interesting and encouraging data. Notably, partners of the Flickr: The Commons project used this popular photo sharing website to make their collections more accessible and to collect annotations from the public. The Nationaal Archief (the Dutch national archive, and one of the Images for the Future partners) and Spaarnestad Photo were part of this project. The results were promising[3], but more hard data is needed to show in what way tagging can be beneficial to archives. The Images for the Future consortium’s interest in user-generated metadata therefore led to the development of a tagging game through which moving images can be annotated by the public. Through this game, a dataset will be gathered that can be used to answer some of the questions raised in this ongoing and topical debate.

Waisda? What’s that?

The online video labeling game Waisda? (which translates to What’s that?) is an initiative managed by the Netherlands Institute for Sound and Vision in close collaboration with the Dutch public broadcaster KRO (Catholic Radio Broadcasting). The game was developed by internet agency Q42. When the six-month pilot project ends in December, the VU University Amsterdam will research various possibilities for implementing these user-generated tags, and will develop new versions of the game with improved and extended game design and interface options.

Waisda? was launched in May and allows players to annotate Polygoon newsreel journals and KRO programmes such as Boer zoekt Vrouw (Farmer Wants a Wife), Spoorloos (Find my Family) and Memories. Recently the archive of Barend en Van Dorp, a popular Dutch talk show (broadcast from 1990 to 2006), was added to Waisda?

The basis of the game is simple. Players go to the Waisda? website and are presented with a selection of four different episodes of the programmes mentioned above. They can choose any of these programmes to start tagging. The programmes do not start from the beginning, but are played sequentially on the website. Players therefore drop in at whatever point the video happens to be at. This means they never have to wait for a game to start, but can start tagging straight away whenever they want. Players are asked to tag what they see and hear, and receive points for a tag if it matches a tag that their opponent has typed in. The reasoning behind this is that a tag is probably valid if at least two people agree on it. This is the same assumption that is made in the Games with a Purpose that were developed by Luis von Ahn, now professor of Computer Science at Carnegie Mellon University.[4] His ESP game demonstrator, in which two players add tags to a picture, was so successful that Google has licensed it under the name ‘Google Image Labeler’[5].
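The agreement rule described above can be illustrated with a small Python sketch. It is not the Waisda? implementation: the record layout, the normalisation step and the window_s parameter (how close in time two identical tags must be typed to count as a match) are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagEntry:
    player: str
    tag: str
    time_s: float  # position in the video (seconds) at which the tag was typed


def normalise(tag: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(tag.lower().split())


def matched_entries(entries, window_s=10.0):
    """Return the entries that agree with an entry typed by a different player.

    window_s is a hypothetical parameter: identical tags only count as a
    match if they were typed within window_s seconds of each other.
    """
    matched = []
    for entry in entries:
        for other in entries:
            if (
                other.player != entry.player
                and normalise(other.tag) == normalise(entry.tag)
                and abs(other.time_s - entry.time_s) <= window_s
            ):
                matched.append(entry)
                break  # one agreeing opponent is enough
    return matched


# Made-up example data, not taken from the Waisda? dataset.
entries = [
    TagEntry("anna", "Tractor", 12.0),
    TagEntry("bas", "tractor", 15.5),
    TagEntry("anna", "cow", 20.0),
    TagEntry("bas", "farmer", 21.0),
]
agreed = matched_entries(entries)
```

In this made-up session only the two ‘tractor’ entries would earn points, because they are the only tags on which two different players agree.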

Right now the Netherlands Institute for Sound and Vision is working on a preliminary evaluation of Waisda? This involves both performing more research on how the game itself can be improved and analysing the crowdsourced tags that were added by the players in recent months. Since Waisda? was launched in May 2009, well over five hundred videos have been tagged by a total of almost 2,300 unique players. There are 150 registered players, most of whom return frequently to play the game. So far over 14,000 unique tags have been added via Waisda? and this number is still rising every day. In the end, the dataset generated by the people who play Waisda? will be used for in-depth research that will result in recommendations for improving the accessibility of, and search functionalities for, audiovisual archives with crowdsourced metadata.

Tag analysis and research

The tags that were added so far were compared to the terms in the GTAA thesaurus that the Netherlands Institute for Sound and Vision uses to classify the audiovisual materials in its archive, and almost 15 % of the tags provided a perfect match. This may not seem like a big number at first glance, but the GTAA contains only very specific terms like person names, genres and topics, and it was therefore not expected that many tags would match this professional thesaurus. The tags were also compared to another database called Cornetto, which contains the bulk of all official Dutch words, and another 45 % of the tags matched. There is some overlap between the Sound and Vision thesaurus and Cornetto, but still well over half the tags added via Waisda? are definitely usable based upon this first simple quantitative analysis.
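A comparison of this kind can be sketched in a few lines of Python. The function name, the exact-match rule on lower-cased tags and the tiny word lists are illustrative assumptions; the real GTAA and Cornetto resources are far larger and are not reproduced here. The sketch counts tags found in the thesaurus separately from tags found only in the general lexicon, so that the overlap between the two vocabularies is not counted twice.

```python
def match_report(tags, thesaurus_terms, lexicon_terms):
    """Return the share of unique tags matched by each vocabulary.

    Tags are compared after lower-casing and trimming. A tag found in the
    thesaurus is counted there only, even if the lexicon also contains it.
    """
    unique = {t.strip().lower() for t in tags if t.strip()}
    if not unique:
        return {"thesaurus": 0.0, "lexicon_only": 0.0, "unmatched": 0.0}

    in_thesaurus = unique & thesaurus_terms
    in_lexicon_only = (unique & lexicon_terms) - in_thesaurus
    unmatched = unique - thesaurus_terms - lexicon_terms

    total = len(unique)
    return {
        "thesaurus": len(in_thesaurus) / total,
        "lexicon_only": len(in_lexicon_only) / total,
        "unmatched": len(unmatched) / total,
    }


# Placeholder vocabularies standing in for GTAA and Cornetto.
thesaurus = {"boer"}
lexicon = {"boer", "koe", "trekker"}
tags = ["Boer", "koe", "trekker", "boer", "trekkr"]

report = match_report(tags, thesaurus, lexicon)
```

With these placeholder lists, one of the four unique tags matches the thesaurus, two match only the lexicon, and the misspelt ‘trekkr’ remains unmatched.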

This does not imply that the other half is unusable. There are, for instance, tags that contain spelling or typing errors but still point to relevant terms. An analysis of a representative random sample of the tags showed that a little under 10 % of them contained an error. It is expected that this percentage will eventually be lower, since some players re-enter an erroneous tag correctly after realising their mistake, in order to still receive points.
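One possible way to link a misspelt tag to the term it was probably meant to be is approximate string matching. The sketch below uses get_close_matches from Python's standard difflib module. The helper name suggest_correction, the cutoff value of 0.8 and the three-word vocabulary are assumptions chosen for illustration, not part of the analysis described here.

```python
from difflib import get_close_matches


def suggest_correction(tag, vocabulary, cutoff=0.8):
    """Return the vocabulary term closest to tag, or None if nothing is close.

    A tag that is already in the vocabulary is returned unchanged. The
    cutoff (between 0 and 1) is an assumed threshold: higher values only
    accept near-identical spellings.
    """
    tag = tag.strip().lower()
    if tag in vocabulary:
        return tag
    candidates = get_close_matches(tag, sorted(vocabulary), n=1, cutoff=cutoff)
    return candidates[0] if candidates else None


# Placeholder vocabulary standing in for the real word lists.
vocabulary = {"trekker", "koe", "boer"}

corrected = suggest_correction("trekkr", vocabulary)
no_match = suggest_correction("xyzzy", vocabulary)
```

Suggestions produced this way would still need manual checking, since a short typo can be equally close to several valid words.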

There are also tags that consist of more than one word and that are not recognised as correct terms in the Sound and Vision thesaurus or Cornetto. For example, the tag ‘illegitimate children’ does not appear in the GTAA thesaurus or the Cornetto vocabulary, but the individual words do appear in Cornetto. Thus, tags that consist of multiple terms and do not match either vocabulary can still prove to be very useful once they are split into their separate words.
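The splitting step can be sketched as follows. The function name and the whitespace-based split are assumptions, and the two-word vocabulary simply mirrors the ‘illegitimate children’ example rather than any real word list.

```python
def recover_multiword_tag(tag, vocabulary):
    """Split an unmatched multi-word tag and look up each word separately.

    Returns the list of individual words if the whole tag is not in the
    vocabulary but every separate word is. Returns an empty list otherwise,
    including when the tag already matches as a whole.
    """
    normalised = " ".join(tag.lower().split())
    if normalised in vocabulary:
        return []  # nothing to recover, the tag matches as a whole
    words = normalised.split()
    if len(words) > 1 and all(word in vocabulary for word in words):
        return words
    return []


# Placeholder vocabulary mirroring the example in the text.
vocabulary = {"illegitimate", "children"}

recovered = recover_multiword_tag("Illegitimate  children", vocabulary)
```

A real implementation would probably also have to deal with compounds and inflected forms, which a plain whitespace split does not cover.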

Another area that requires additional research is tags that appear in multiple categories of the thesaurus and Cornetto. The tag ‘link’ means ‘dangerous’ or ‘connection’ in Dutch, among other things, and the term is therefore ambiguous. To find out which meaning the tagger intended, one solution would be to analyse the tags that were added to that video in proximity to ‘link’. If ‘scary’ and ‘exciting’ were added alongside ‘link’, it is possible to determine semantically that the meaning ‘dangerous’ is the most plausible in this case. Ideally, semantic software can be used to make these determinations automatically.
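A very simple form of this kind of disambiguation is to count how many neighbouring tags are associated with each possible meaning. In the sketch below, the cue words listed per sense are invented for the ‘link’ example, and the overlap count is only one of many possible scoring methods. It is meant as an illustration of the idea, not as the approach the research project will take.

```python
def most_plausible_sense(neighbour_tags, sense_cues):
    """Pick the sense whose cue words overlap most with the neighbouring tags.

    sense_cues maps each candidate sense to a set of words associated with
    it (a hypothetical, hand-made resource here). Returns None when no sense
    has any overlap or when several senses tie for the best score.
    """
    neighbours = {t.strip().lower() for t in neighbour_tags}
    scores = {sense: len(cues & neighbours) for sense, cues in sense_cues.items()}
    best = max(scores.values(), default=0)
    winners = [sense for sense, score in scores.items() if score == best]
    if best == 0 or len(winners) != 1:
        return None
    return winners[0]


# Invented cue words for the two senses of 'link' mentioned in the text.
link_senses = {
    "dangerous": {"scary", "exciting", "risky"},
    "connection": {"chain", "network", "website"},
}

sense = most_plausible_sense(["scary", "exciting"], link_senses)
```

Returning None for ties and for tags without informative neighbours keeps the sketch from guessing when the surrounding tags give no evidence either way.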

These and other topics are being analysed in a follow-up research project that is carried out in close collaboration with the VU University Amsterdam. This project is part of the European PrestoPRIME programme, in which various partners are collaborating to “research and develop practical solutions for the long-term preservation of digital media objects, programmes and collections.”[6] The university’s research will take three years, and will result in in-depth advice on how to process and implement user-generated content such as the Waisda? tags, as well as new implementations for game and interface design, which will be discussed later on in this article.



