It has been a busy week here at the National Archive. On the 18th of December the National Archive and Spaarnestad Photo launched new photographs on The Commons on Flickr. This time the photographs cover a variety of subjects. First, as part of the National Archive’s Afscheid van Indië project (www.afscheidvanIndië.nl), the National Archive published a selection of photographs of the Dutch East Indies on Flickr. Second, the National Archive published a number of photographs by the famous Dutch photographer Willem van de Poll, which can also be viewed on The Commons. Finally, getting into the Christmas spirit, our partner Spaarnestad Photo published some photos with a Christmas theme.
The National Archive has now been online for almost two months, and so far it has generated about 650,000 views, about 1,400 tags and 250 comments.
The initiative has caused quite a stir in both the Dutch and the international archival community. Last week I gave a talk to a critical audience of photojournalists at a conference of the European Federation of Journalists (EFJ). The title of the conference was “Photo Journalists: an endangered species in Europe? Development of an European sustainable quality agenda for photo journalism.” Photojournalists from all over Europe gathered to discuss their profession and what they see as its possible decline.
Although the audience was critical, it was very interesting to hear the photographers’ point of view on initiatives like The Commons on Flickr and on big digitization projects like “Images for the Future”. There was general agreement that digitization is needed to preserve historical photo collections, but much less agreement about how to handle the copyright issues. It is clear that solutions in the area of general licensing need to be found.
What struck me was that the photographers and copyright holders who were present were not up to date with general licensing methods and Open Content initiatives like Creative Commons. The main lesson for me was that archives and heritage institutions can benefit from maintaining a regular dialogue with photographers and copyright holders. This dialogue will allow both parties to inform each other about these sorts of initiatives and enable them to work together on solutions for the copyright issues encountered by big digitization projects.
If you would like to find out more about my talk, you can find my presentation on SlideShare.
In the meantime you can still see all our photographs at http://www.flickr.com/photos/nationaalarchief/.
Copyright specialist, National Archive “Beelden voor de Toekomst”
Our consortium partner, the Dutch National Archives, recently joined Flickr: The Commons. Within Flickr: The Commons, leading archives across the globe upload items from their collections to Flickr and invite visitors to add tags and comments to them. This has been a major success: in six weeks the 500 photos of the National Archives have been viewed 600,000 times and 1,200 tags have been added. Putting material out in the open, as the Dutch National Archives did on Flickr, raises questions. Are general users qualified enough to complete or even replace the annotations made by archivists? Who is responsible for the outcome of the annotation process? How do we motivate users to annotate the material? These questions remain partly unanswered. This article tries to shed some light on possible directions.
Users nowadays create their own content. In a recent lecture at the University of Toronto, David Weinberger described Web 2.0 as a radical change in thinking about information. There is no longer one truth provided by one source; instead there’s an ecosystem of truths provided by many. This ecosystem is constantly changing and evolving. Content becomes similar to connection and metadata becomes data. Users are no longer passive visitors of websites but active creators. They give new meaning to existing information by linking different sources and using personal preferences to arrange information.
This change in the perception of the Internet as a medium offers a lot of opportunities for Images for the Future. Digitizing the archival material takes a lot of time, money and manpower. The digitized material has to be accessible to users. This includes not only the presentation of the material through different services but also the metadata that makes it searchable and adds context. The creation of metadata is a very labor-intensive process and not very efficient when it is done solely by archivists. Data mining technology might be a solution. Archives are also exploring how they can put the ‘wisdom of the crowds’ to use.
One of the practices of social software the consortium is interested in is social tagging. Tagging allows users to label different forms of content, and the tags can also be used by other people to search for content. There are various ways to show tags, for instance a tag cloud in which font size indicates the number of times a tag is used.
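To make the tag cloud idea concrete, here is a minimal sketch in Python of how font sizes could be derived from tag counts. The function name `tag_cloud_sizes`, the point range of 10 to 32 and the linear scaling are assumptions made for illustration; they do not describe how Flickr or any archive actually renders its tag clouds.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_pt=10, max_pt=32):
    """Map each tag to a font size in points that grows with its usage count.

    Hypothetical helper: linear scaling between min_pt and max_pt.
    """
    # Normalise case and whitespace so "Amsterdam" and "amsterdam" count as one tag.
    counts = Counter(tag.strip().lower() for tag in tags if tag.strip())
    if not counts:
        return {}

    lowest, highest = min(counts.values()), max(counts.values())
    sizes = {}
    for tag, count in counts.items():
        if highest == lowest:
            # All tags are equally popular: use a middle size for every tag.
            sizes[tag] = (min_pt + max_pt) // 2
        else:
            scale = (count - lowest) / (highest - lowest)
            sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Made-up tags, as users might add them to a photograph.
example_tags = ["amsterdam", "bicycle", "Amsterdam", "1950s", "amsterdam", "bicycle"]
sizes = tag_cloud_sizes(example_tags)
for tag, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{tag}: {size}pt")
```

A real tag cloud would often use logarithmic rather than linear scaling so that a few very popular tags do not dwarf the rest; the linear version is used here only because it is the easiest to follow.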
One of the tasks of an audiovisual archive is to arrange information and to embed it in a context. Because of social software, the role of an archive is changing. Annemieke de Jong from Sound and Vision describes these changes in the article Users, Producers & Other Tags. Instead of producing the metadata themselves, documentalists and archivists increasingly classify and correct metadata produced by others. Part of the metadata arises during digitization; other metadata can be created by outside experts, by crowdsourcing (an open call to an undefined group of people) and by social tagging. According to De Jong, annotation by users can save time and money. “Free tagging by the general public could be of enormous help in making our collections accessible, on clip level and from multiple viewpoints.”
Experts vs. the General User
There is a tension between the traditional annotation system and the social tagging system. Although the word “social” suggests a community spirit among users, most of them are driven by self-interest. Social taggers use tags primarily to save information that is relevant for their own purposes, and they add any kind of tag they like. This creates a folksonomy, a free-form system of tags modified by many users. Archivists, on the other hand, are experts who use metadata from a thesaurus, or closed vocabulary. Their main goal is to make the information accessible to others, not to themselves. A thesaurus is usually thoroughly designed by a small group of people.
Folksonomies are based on the principle of the ‘wisdom of the crowd’: if a lot of people share the same opinion, it must be a correct one. Quality is defined by the majority and not by expertise. Wikipedia is built on this assumption. Articles contain general information and expert opinion carries little extra weight. To Weinberger, the wisdom of the crowd is more credible than the wisdom of one, provided the process by which the content is created is visible to users. Not everybody involved in Web 2.0 shares this opinion. One of the founders of Wikipedia, Larry Sanger, wanted more possibilities for expert contribution. He therefore started a new open encyclopedia, Citizendium, where experts have more authority. This example shows that not everyone supports the wisdom of the crowd. Social tagging challenges the notion of quality and the value of expert opinion. De Jong states that social tagging should not be a substitute for annotation by experts. Instead, there should be an exchange of information between the two systems.
The different nature of the two systems makes it hard to combine them. Sound and Vision is involved in several projects that research the possibilities of doing so. The institute participates in the consortium MultimediaN. One of its projects is Spiegle, an alternative to the Google search engine. Spiegle combines various levels of metadata, such as the user profile, the platform and the features of a collection. Unlike Google, this search engine provides very narrow results, which makes it easier for users to find the right content. Sound and Vision also participated in the project Total Content Recommendation with the Telematica Institute in Enschede. The project explored the possibilities of social tagging of audiovisual content and the incentives for users to tag the content. Finally, Sound and Vision participates in PrestoPRIME, a project in the 7th Framework Programme of the EU for research and development on the long-term preservation of new media, which will start in January 2009. One of the goals of this project is to establish interoperability between various databases by enabling information exchange between different systems of metadata. Different forms of annotation are connected with each other, creating a semantic system of tags. Semantic tagging makes the annotation process much more efficient in the long term and could eventually bridge the gap between different annotation systems.
Motivation of Users
The success of social software has two reasons: it creates weak ties between users, and it operates in an open and free environment. Because of the openness of the software, users can choose their level of participation in a community. Users have to be motivated to participate very actively. During the Total Content Recommendation project, Lex van Velsen and Mark Melenhorst of the Telematica Institute researched the incentives for users to tag video content. Following the classification of Cameron Marlow, research scientist at Facebook, they define six different incentives:
- Future retrieval
- Contribution and sharing
- Attract attention
- Play and competition
- Self presentation
- Opinion expression
The authors tested these motives with two groups. Although the groups were very small, the authors conclude that users didn’t tag for play and competition or for self-presentation. The tagging sites that were tested didn’t have a game element, so it was to be expected that users didn’t use those sites for play and competition.
Someone who has done a lot of research on crowdsourcing and games is Luis von Ahn, a professor of Computer Science at Carnegie Mellon University. His research is based on the assumption that “computers still don’t possess the basic conceptual intelligence or perceptual capabilities that humans take for granted.” Computers aren’t able to solve some problems that are relatively easily solved by most humans. Letting people solve them is what Von Ahn calls human computation: the human brain can be treated as a processor in a distributed system that performs a small part of a massive computation. To become part of this massive computation, people do require an incentive to solve these kinds of problems, such as a game.
An example is the ESP Game. In this game, two random players see the same image, which they need to label. The goal of the game is to use the same label as your partner. Von Ahn also developed a game called Peekaboom to determine the place of an object within an image. Other games he developed are Verbosity, a game for collecting commonsense facts, and Phetch, a game for collecting image descriptions for the visually impaired. Other researchers have developed games to collect metadata for audio content, like the Listen Game, Tag-a-Tune and MajorMiner. The Listen Game is based on a list of tags players can use to annotate the material. The other two games are similar to the ESP Game: the added tags are compared with those from a database or from a different player. In both games players are free to use any tag they like. There are also a few tagging initiatives based on video content. Yahoo’s Video Tag Game is based on the same principle as the ESP Game: players earn points by adding similar tags. This game, developed by Yahoo Research, is still in an exploratory stage. The game VideoTag, developed by Stacey Greenaway as an MSc research project, is already operational as a single-player game in which players collect points by adding tags. Some tags are pitfalls (tags that are too obvious) that lower the player’s score. All these tagging games are examples of generating metadata in a playful way. Von Ahn sees great possibilities for these kinds of games. However, a game designed to solve a problem should produce the right solution and be fun at the same time.
The future archive has to be an open archive to survive the transition from Web 1.0 to Web 2.0 and beyond, where users are able to use and label content for their own purposes. Archivists still have a role in evaluating and contextualizing the metadata created by general users. To stimulate the creation of different forms of metadata, archives could develop creative concepts, such as games, that encourage users to create new metadata. Until now a lot of research has been done on gathering metadata from general users. Quantity seems more important than quality. Most of the games are modelled on the ESP Game. As von Ahn stated, tagging games should provide the right solution. But what is the right solution? Is it the metadata created by experts, or is it the wisdom of the crowd? And if we know the right solution, how do we control the tagging process to get it? Are players able to provide the right solution, or is it necessary for archives to check the metadata that is produced? If that’s the case, is a tagging game profitable enough? Further research should focus on these questions in order to gain more insight into the possibilities social tagging has for archives. Sound and Vision will release a video tagging environment in early 2009, in collaboration with the Free University Amsterdam and the broadcaster KRO, in order to answer some of the questions raised above.
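The matching principle behind the ESP Game can be sketched in a few lines of Python. This is only an illustration of the idea described above (two players label the same image independently and succeed as soon as they have a label in common); the function name `first_agreement` and the turn-by-turn interleaving are assumptions, not von Ahn’s actual implementation.

```python
def normalize(label):
    """Compare labels case-insensitively and ignore surrounding whitespace."""
    return label.strip().lower()


def first_agreement(labels_a, labels_b):
    """Return the first label that both players have entered, or None.

    Hypothetical sketch: the two lists hold each player's guesses in the
    order they were typed, and guesses are interleaved one at a time.
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(labels_a), len(labels_b))):
        if i < len(labels_a):
            guess = normalize(labels_a[i])
            if guess in seen_b:
                return guess  # player A matched an earlier guess of player B
            seen_a.add(guess)
        if i < len(labels_b):
            guess = normalize(labels_b[i])
            if guess in seen_a:
                return guess  # player B matched an earlier guess of player A
            seen_b.add(guess)
    return None


# Made-up guesses for one image.
player_a = ["dog", "grass", "ball"]
player_b = ["puppy", "Ball", "dog"]
agreed = first_agreement(player_a, player_b)
print(agreed)
```

The agreed label is the useful by-product: because two people who cannot communicate both chose it, it is likely to be a reasonable description of the image and can be stored as metadata.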
- Ahn, L. von. “Games with a Purpose.” Computer (Vol. 39:6, June 2006). pp. 92-94. URL
- Jong, A. de. “Users, producers & other tags. Trends and developments in metadata creation.” Lecture at the FIAT/IFTA conference (October 2007). URL
- Mechant, P. “Culture ‘2.0’: Social and Cultural Exploration through the use of Folksonomies and Weak Cooperation.” Cultuur 2.0. (Amsterdam: Virtueel Platform, 2007). URL
- Lusenet, Y. de. Geven en nemen. Archiefinstellingen en het sociale web. (Den Haag: Taskforce archieven, 2008).
- Siorpaes, K. & Hepp, M. “Games with a purpose for the semantic web.” Intelligent Systems (Vol. 23:3, May 2008). pp. 50-60. URL
- Surowiecki, J. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. (New York: Random House, 2004).
- Turnbull, D. et al. “Five approaches to collecting tags for music.” ISMIR Conference (2008). pp. 225-230. URL
- Velsen, L. van & Melenhorst, M. “User Motives for Tagging Video Content.” (2008). URL
- Wartena, C. & Brussee, R. “Instance-based mapping between thesauri and folksonomies.” Proceedings of the 7th International Semantic Web Conference (ISWC’08) (2008). URL
- Weinberger, D. “Knowledge at the End of the Information Age.” Bertha Bassam Lecture at the University of Toronto (2008). URL
- Zwol, R. van et al. “Video Tag Game.” 17th International World Wide Web Conference (WWW developer track) (ACM Press, 2008).
 Weinberger, D. (2008).
 Surowiecki, J. (2004).
 Jong, A. de. (2007).
 Velsen, L. van & Melenhorst, M. (2008) p. 2.
 Weinberger, D. (2008).
 Lusenet, Y. de. (2008) pp. 19-20.
 Jong, A. de. (2007).
 Mechant, P. (2007) p. 24.
 Velsen, L. van & Melenhorst, M. (2008) p. 2.
 Ahn, L. von. (2006) p. 96.
 Ibid. p. 96.
 Siorpaes, K. & Hepp, M. (2008) p. 51.
 Turnbull, D. et al. (2008) p. 227.
 Zwol, R. van et al. (2008) pp. 1-2.
 Ahn, L. von. (2006) pp. 96-98.
After nine months, the Library of Congress (LoC) has released a detailed report on its Flickr pilot. In January 2008 the LoC and Flickr launched Flickr Commons. The LoC uploaded a few thousand historical photos, which have drawn more than 10 million views, 7,166 comments and more than 67,000 tags, according to the new report from the project team. The project had an unexpected impact:
“The pilot spurred many positive yet unexpected outcomes, especially Flickr members’ willingness to devote great effort to photo-related detective work and their level of engagement with historical images. Further, Flickr members have often drawn on personal histories to connect with the pictures, including memories of farming practices, grandparents’ lives, women’s roles in World War II, and the changing landscape of local neighborhoods.”
If you want to read in more detail how the LoC organised and experienced the pilot, you can download the full LoC Flickr report here.
On December 6th, the German Federal Archive and the online encyclopedia Wikipedia announced their cooperation in making 100,000 digitized images publicly available under a Creative Commons licence (CC-BY-SA), in exchange for linking the photos to Wikipedia’s Persondata. This is a big step towards opening up public content and data.
In September 2007 the German Federal Archive had already made 113,000 images available in its own online digital archive. In total the Federal Archive holds approximately 11 million still pictures, aerial photographs and posters from modern German history. The cooperation with Wikipedia is the next big step for the German Federal Archive in opening up the archive, as its vice president, Dr. Angelika Menne-Haritz, said during the press conference.
The photos are not of the highest resolution: about 800 pixels on the longest side. Even so, this is an enormous addition to the commons. According to Wikimedia Commons, the repository of free-content images, sound and other multimedia files used on Wikipedia, the donation of 100,000 images by the German Federal Archive is the single largest one to Wikimedia Commons so far. This is even more than the archival project Flickr Commons currently makes available in cooperation with 16 archival partners around the world.
Click here for the image gallery: http://www.bild.bundesarchiv.de/
Creative Commons License
The images from the German Federal Archive are licensed under the Creative Commons Attribution ShareAlike 3.0 Germany License (CC-BY-SA). This means that you are free to share and remix the images on the condition that you give attribution and distribute the result under a similar or compatible license. The Federal Archive can do this because it owns sufficient rights to the images to be able to grant this kind of license. Using such a free license for archival material is really exciting. Few archives work with Creative Commons licences. Rare examples are the McCord Museum and the Brabants Historisch Informatiecentrum. The archival project Flickr Commons, by contrast, works with “no known copyright restrictions”.
The other part of the cooperation between the German Federal Archive and Wikipedia is a tool for linking people from a list compiled by the Federal Archive to the German Wikipedia Persondata and to the person authority file of the German National Library. This is something German Wikipedia has been doing since 2005. Around 27% of the 100,000 photos have already been done. The expectation is that, because the cooperation is now public, the pace will pick up. Moreover, users will add new information to the images. You can find the To Do list here.
Though Markus, project leader of Creative Commons Germany, says that this is only a small revolution by German standards, it could very well set an example for other archives to make their content publicly available and grow as a result. It will be very interesting to see where the photos turn up and in which (rich) context, because that will make a strong argument for archives to experiment with this.
Below you’ll see some interesting reading material which could be useful one way or the other for our project Images for the Future (and of course other digitization projects). Click here for previous links. Some of the entries are in Dutch.
1. Heritage Online at Fear it? Fix it! 2008
2. Erfgoedpunt haakt in op toenemende interesse voor lokale geschiedenis
3. Reageren op Groenboek Auteursrecht in de kenniseconomie
4. Monty Python Puts All Its Content On YouTube To Increase Sales Of Scarce Goods
5. Are Copyright Holders Purposely Putting Content On P2P In Order To Demand Money?
6. Now ‘Online’: Europeana, Europe’s Digital Library
7. Teleblik nu ook voor onderbouw basisonderwijs
8. Handbook on cultural web user interaction
9. Hulu to Match YouTube’s Revenue: Ten Observations For The Future of Media
10. LIFE Magazine Photo Archives Arrive in Google Image Search
Last Friday, XS4ALL hosted FIFI 2008 at the Westergasfabriek Amsterdam. With this event, XS4ALL (the first company to offer internet to individuals in the Netherlands) celebrated its 15th birthday. The event consisted of workshops on open-source creativity, FABLAB, Web 3.0 and more.
FIFI was organised around three themes: Toolbox for the future (about the opportunities and possibilities of the Internet and technology), Secured? let yourself hack (data security is often not sufficient; how secure is secure?) and Privacy: secret or property? (how to deal with company information and private information; the Internet knows everything about everybody, is that ok?).
Within the theme Toolbox for the future, Johan Oomen (Netherlands Institute for Sound and Vision) gave a talk, “Heritage Online”, on the opportunities digitisation offers to heritage institutions across the globe. The slides (in Dutch) are available on SlideShare.