Our consortium partner, the Dutch National Archives, recently joined Flickr: The Commons. Within Flickr: The Commons, leading archives across the globe upload items from their collections to Flickr and invite visitors to add tags and comments to them. This has been a major success: in six weeks, the 500 photos of the National Archives have been viewed 600,000 times and 1,200 tags have been added. Putting material out in the open, as the Dutch National Archives did on Flickr, raises questions. Are general users qualified to complete or even replace the annotations made by archivists? Who is responsible for the outcome of the annotation process? How do we motivate users to annotate the material? These questions remain partly unanswered. This article tries to shed some light on possible directions.
Users nowadays create their own content. In a recent lecture at the University of Toronto, David Weinberger described Web 2.0 as a radical change in thinking about information. There is no longer one truth provided by one source; instead there is an ecosystem of truths provided by many. This ecosystem is constantly changing and evolving. Content becomes similar to connection and metadata becomes data. Users are no longer passive visitors of websites but active creators. They give new meaning to existing information by linking different sources and by arranging information according to personal preferences.
This change in how the Internet is perceived as a medium offers many opportunities for Images for the Future. Digitizing archival material takes a lot of time, money and manpower. The digitized material has to be accessible to users. This includes not only the presentation of the material through different services but also the metadata that makes it searchable and adds context. Creating metadata is a very labor-intensive process, and not very efficient when it is done solely by archivists. Data mining technology might be a solution. Archives are also exploring how they can put the ‘wisdom of the crowds’ to use.
One of the social software practices the consortium is interested in is social tagging. Tagging allows users to label different forms of content, and the resulting tags can also be used by other people to search for content. There are various ways to display tags, for instance a tag cloud in which the font size indicates the number of times a tag has been used.
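The tag cloud idea can be illustrated with a minimal sketch. It assumes a simple linear mapping from usage count to font size; the function name `tag_cloud_sizes`, the pixel bounds and the sample tags are hypothetical and are not taken from Flickr or any archive system.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_px=12, max_px=36):
    """Map each tag to a font size that grows linearly with its usage count.

    min_px and max_px are illustrative bounds, not values from a real site.
    """
    # Treat "Ship" and "ship " as the same tag and skip empty entries.
    counts = Counter(tag.strip().lower() for tag in tags if tag.strip())
    if not counts:
        return {}

    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for tag, n in counts.items():
        # If every tag is used equally often, fall back to the smallest size.
        scale = (n - lowest) / span if span else 0.0
        sizes[tag] = round(min_px + scale * (max_px - min_px))
    return sizes


sample_tags = ["harbour", "Rotterdam", "ship", "harbour", "ship", "harbour"]
sizes = tag_cloud_sizes(sample_tags)
for tag, px in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{tag}: {px}px")
```

Real tag clouds often use a logarithmic scale instead, so that a handful of very popular tags do not dwarf all the others; the linear version is only the simplest case.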
One of the tasks of an audiovisual archive is to arrange information and to embed it in a context. Because of social software, the role of an archive is changing. Annemieke de Jong of Sound and Vision describes these changes in the article “Users, Producers & Other Tags”. Instead of producing the metadata themselves, documentalists and archivists increasingly classify and correct metadata produced by others. Some metadata arises during digitization; other metadata can be created by outside experts, by crowdsourcing (an open call to an undefined group of people) and by social tagging. According to De Jong, annotation by users can save time and money. “Free tagging by the general public could be of enormous help in making our collections accessible, on clip level and from multiple viewpoints.”
Experts vs. the General User
There is a tension between the traditional annotation system and the social tagging system. Although the word ‘social’ suggests a community spirit among users, most of them are driven by self-interest. Social taggers use tags primarily to save information that is relevant to their own purposes, and they add any kind of tag they like. This creates a folksonomy, a free-form system of tags modified by many users. Archivists, on the other hand, are experts who take their metadata from a thesaurus, or closed vocabulary. Their main goal is to make the information accessible to others, not to themselves. A thesaurus is usually carefully designed by a small group of people.
Folksonomies are based on the principle of the ‘wisdom of the crowd’: if a lot of people share the same opinion, it must be a correct one. Quality is defined by the majority and not by expertise. Wikipedia is built on this assumption. Articles contain general information, and expert opinion is given little extra weight. To Weinberger, the wisdom of the crowd is more credible than the wisdom of one, provided the process by which the content is created is visible to users. Not everybody involved in Web 2.0 shares this opinion. One of the founders of Wikipedia, Larry Sanger, wanted more room for expert contribution. He therefore started a new open encyclopedia, Citizendium, where experts have more authority. This example shows that not everyone supports the wisdom of the crowd. Social tagging challenges the notion of quality and the value of expert opinion. De Jong states that social tagging should not be a substitute for annotation by experts. Instead there should be an exchange of information between the two systems.
The different nature of the two systems makes it hard to combine them. Sound and Vision is involved in several projects that investigate ways of doing so. The institute participates in the consortium MultimediaN. One of its projects is Spiegle, a search engine intended as an alternative to Google. Spiegle combines various levels of metadata, such as the user profile, the platform and the features of a collection. Unlike Google, this search engine provides very narrow results, which makes it easier for users to find the right content. Sound and Vision also participated in the project Total Content Recommendation with the Telematica Institute in Enschede. The project explored the possibilities of social tagging of audiovisual content and the incentives of users to tag that content. In addition, Sound and Vision participates in PrestoPRIME, a project in the 7th Framework Programme of the EU for research and development on the long-term preservation of new media, which will start in January 2009. One of the goals of this project is to establish interoperability between various databases by enabling information exchange between different systems of metadata. Different forms of annotation are connected with each other, creating a semantic system of tags. Semantic tagging makes the annotation process much more efficient in the long term and could eventually bridge the gap between different annotation systems.
Motivation of Users
The success of social software has two causes: it creates weak ties between users, and it operates in an open and free environment. Because of the openness of the software, users can choose their level of participation in a community. Users have to be motivated to participate very actively. During the Total Content Recommendation project, Lex van Velsen and Mark Melenhorst of the Telematica Institute researched the incentives of users to tag video content. Following the classification of Cameron Marlow, research scientist at Facebook, they define six different incentives:
- Future retrieval
- Contribution and sharing
- Attract attention
- Play and competition
- Self presentation
- Opinion expression
The authors tested these motives with two groups. Although the groups were very small, the authors conclude that users did not tag for play and competition or for self-presentation. The tagging sites that were tested did not have a game element, so it is hardly surprising that users did not use those sites for play and competition.
Someone who has done a lot of research on crowdsourcing and games is Luis von Ahn, a professor of Computer Science at Carnegie Mellon University. His research is based on the assumption that “computers still don’t possess the basic conceptual intelligence or perceptual capabilities that humans take for granted.” Computers are not able to solve problems that most humans solve relatively easily. Von Ahn calls the use of people for such tasks human computation. The human brain can be treated as a processor in a distributed system, performing a small part of a massive computation. To take part in this massive computation, however, people need an incentive to solve these kinds of problems, such as a game.
An example is the ESP Game. In this game, two randomly paired players see the same image, which they need to label. The goal of the game is to use the same label as your partner. Von Ahn also developed a game called Peekaboom to determine the location of an object within an image. Other games he developed are Verbosity, a game for collecting commonsense facts, and Phetch, a game for collecting image descriptions for the visually impaired. Other researchers have developed games to collect metadata for audio content, like the Listen Game, Tag-a-Tune and MajorMiner. The Listen Game is based on a list of tags players can use to annotate the material. The other two games are similar to the ESP Game: the added tags are compared with those from a database or from a different player. In both games players are free to use any tag they like. There are also a few tagging initiatives based on video content. Yahoo’s Video Tag Game is based on the same principle as the ESP Game. Players earn points by adding similar tags. This game, developed by Yahoo Research, is still in an exploratory stage. The game VideoTag, developed by Stacey Greenaway as an MSc research project, is already operational as a single-player game in which players collect points by adding tags. Some tags are pitfalls (tags that are too obvious) that lower the player’s score. All these tagging games are examples of generating metadata in a playful way. Von Ahn sees great possibilities for these kinds of games. However, a game designed to solve a problem should produce the right solution and be fun at the same time.
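The agreement mechanism behind ESP-style games can be sketched in a few lines. This is a simplified illustration, not von Ahn’s implementation: it assumes both players’ guesses arrive as ordered lists and are compared turn by turn, and the names `normalise` and `first_agreement` are hypothetical.

```python
def normalise(label):
    """Lower-case a label and collapse whitespace so 'Water ' matches 'water'."""
    return " ".join(label.lower().split())


def first_agreement(labels_a, labels_b):
    """Return the first label that both players have entered, or None.

    The two guess lists are interleaved to approximate players typing in turn.
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(labels_a), len(labels_b))):
        if i < len(labels_a):
            guess = normalise(labels_a[i])
            if guess:
                if guess in seen_b:
                    return guess  # player A matched an earlier guess by B
                seen_a.add(guess)
        if i < len(labels_b):
            guess = normalise(labels_b[i])
            if guess:
                if guess in seen_a:
                    return guess  # player B matched an earlier guess by A
                seen_b.add(guess)
    return None


player_a = ["boat", "Water", "harbour"]
player_b = ["ship", "crane", "water "]
print(first_agreement(player_a, player_b))
```

Because neither player can see the other’s guesses, a label only counts once two people arrive at it independently, which is what gives the resulting tag some claim to being a sensible description of the image.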
The future archive has to be an open archive in order to survive the transition from Web 1.0 to Web 2.0 and beyond, where users are able to use and label content for their own purposes. Archivists still have a role in evaluating and contextualizing the metadata created by general users. To stimulate the creation of different forms of metadata, archives could develop creative concepts, like games, that encourage users to create new metadata. Until now a lot of research has been done on gathering metadata from general users. Quantity seems more important than quality. Most of the games are modelled on the ESP Game. As von Ahn stated, tagging games should provide the right solution. Further research should focus on that right solution. But what is the right solution? Is it the metadata created by experts, or is it the wisdom of the crowd? And if we know the right solution, how do we control the tagging process to get it? Are players able to provide the right solution, or is it necessary for archives to check the metadata that is produced? If that is the case, is a tagging game profitable enough? Further research should focus on these questions in order to gain more insight into the possibilities social tagging offers archives. Sound and Vision will release a video tagging environment in early 2009, in collaboration with the Free University Amsterdam and the broadcaster KRO, in order to answer some of the questions raised above.
- Ahn, L. von. “Games with a Purpose.” Computer (Vol. 39:6, June 2006). pp. 92-94. URL
- Jong, A. de. “Users, producers & other tags. Trends and developments in metadata creation.” Lecture at the FIAT/IFTA conference (October 2007). URL
- Mechant, P. “Culture ‘2.0’: Social and Cultural Exploration through the use of Folksonomies and Weak Cooperation.” Cultuur 2.0. (Amsterdam: Virtueel Platform, 2007). URL
- Lusenet, Y. de. Geven en nemen. Archiefinstellingen en het sociale web. (Den Haag: Taskforce archieven, 2008).
- Siorpaes, K. & Hepp, M. “Games with a purpose for the semantic web.” Intelligent Systems (Vol. 23:3, May 2008). pp. 50-60. URL
- Surowiecki, J. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. (New York: Random House, 2004).
- Turnbull, D. et al. “Five approaches to collecting tags for music.” ISMIR Conference (2008). pp. 225-230. URL
- Velsen, L. van & Melenhorst, M. “User Motives for Tagging Video Content.” (2008). URL
- Wartena, C. & Brussee, R. “Instance-based mapping between thesauri and folksonomies.” Proceedings of the 7th International Semantic Web Conference (ISWC’08) (2008). URL
- Weinberger, D. “Knowledge at the End of the Information Age.” Bertha Bassam Lecture at the University of Toronto (2008). URL
- Zwol, R. van et al. “Video Tag Game.” 17th International World Wide Web Conference (WWW developer track) (ACM Press, 2008).
 Weinberger, D. (2008)
 Surowiecki, J. (2004)
 Jong, A. de. (2007).
 Velsen, L. van & Melenhorst, M. (2008) p.2.
 Weinberger, D. (2008)
 Lusenet, Y. de. (2008) p.19-20.
 Jong, A. de. (2007).
 Mechant, P. (2007) p. 24.
 Velsen, L. van & Melenhorst, M. (2008) p.2.
 Ahn, L. von. (2006) p.96.
 Ibid. p.96.
 Siorpaes, K. & Hepp, M. (2008) p. 51.
 Turnbull, D. et al. (2008) p. 227.
 Zwol, R. van et al. (2008) p.1-2.
 Ahn, L. von. (2006) p.96-98.