Archive for December, 2008

The National Archive and Spaarnestad Photo release new photos on The Commons on Flickr.

Monday, December 22nd, 2008

It has been a busy week here at the National Archive. On the 18th of December the National Archive and Spaarnestad Photo launched new photographs on The Commons on Flickr. This time the photographs cover various subjects. Firstly, as part of the National Archive’s Afscheid van Indië project (www.afscheidvanIndië.nl), the National Archive published some photographs of the Dutch East Indies on Flickr. Secondly, the National Archive published a number of photographs by the famous Dutch photographer Willem van de Poll, which can also be viewed on The Commons. Then, getting into the Christmas spirit, our partner Spaarnestad Photo published some photos with a Christmas theme.

The National Archive has now been on-line for almost two months and so far it has generated about 650 000 views, about 1400 tags and 250 comments.

The initiative has caused quite a stir in both the Dutch and the international archival community. Last week I gave a talk to a critical audience of photo journalists at a conference of the European Federation of Journalists (EFJ). The title of the conference was: “Photo Journalists: an endangered species in Europe? Development of an European sustainable quality agenda for photo journalism.” Photo journalists from all over Europe gathered to discuss their profession, and what they see as its possible decline.

Although they were a critical audience, it was very interesting to hear the photographers’ point of view on initiatives like The Commons on Flickr and big digitization projects like “Images for the Future”. There was general agreement that digitization is needed to preserve historical photo collections, but much less agreement about how to handle the copyright issues. It is clear that solutions in the area of general licensing need to be found.

What struck me was that the photographers and copyright holders who were present were not up to date with general licensing methods and Open Content initiatives like Creative Commons. The main lesson for me was that archives and heritage institutions can benefit from maintaining a regular dialogue with photographers and copyright holders. This dialogue will allow both parties to inform each other about these sorts of initiatives and to work together on solutions for the copyright issues encountered by big digitization projects.

If you would like to find out more about my talk, you can find my presentation on SlideShare.

In the meantime you can still see all our photographs at http://www.flickr.com/photos/nationaalarchief/.

Maaike Toonen

Copyright specialist, National Archive “Beelden voor de Toekomst”

 

The Wisdom of the Crowds in the Audiovisual Archive Domain

Thursday, December 18th, 2008

BCK – social tagging by pulguita (CC-BY-SA)

Our consortium partner, the Dutch National Archives, recently joined Flickr: The Commons. Within Flickr: The Commons, leading archives across the globe upload items from their collections to Flickr and invite visitors to add tags and comments to them. This has been a major success: in six weeks the 500 photos of the National Archives have been viewed 600,000 times and 1,200 tags have been added. Putting material out in the open, as the Dutch National Archives did on Flickr, raises questions. Are general users qualified enough to complete or even replace the annotations made by archivists? Who is responsible for the outcome of the annotation process? How do we motivate users to annotate the material? These questions remain partly unanswered. This article tries to shed some light on possible directions.

Users nowadays create their own content. In a recent lecture at the University of Toronto, David Weinberger described Web 2.0 as a radical change in thinking about information. There is no longer one truth provided by one source, but instead there’s an ecosystem of truths provided by many. This ecosystem is constantly changing and evolving. Content becomes similar to connection and metadata becomes data.[1] Users are no longer passive visitors of websites but active creators. They give new meaning to existing information by linking different sources and using personal preferences to arrange information.

Social Archives?
The change in the perception of the Internet as a medium offers many opportunities for Images for the Future. Digitizing the archival material takes a lot of time, money and manpower. The digitized material has to be accessible to users. This includes not only the presentation of the material through different services but also the metadata that make it searchable and add context. The creation of metadata is a very labor-intensive process and not very efficient when it is done solely by archivists. Data mining technology might be a solution. Archives are also exploring how they can put the ‘wisdom of the crowds’ to use.[2]

One of the practices of social software the consortium is interested in is social tagging. Tagging allows users to label different forms of content, and other people can use those labels to search the content. There are various ways to show tags, for instance a tag cloud, in which font size indicates the number of times a tag has been used.
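As a rough illustration of the tag cloud idea, the sketch below counts how often each tag occurs and scales a font size linearly between a minimum and a maximum. The function name, the point sizes and the sample tags are assumptions made for this example; they are not taken from Flickr or from any consortium system.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_pt=10, max_pt=32):
    """Map each tag to a font size that grows with how often it was used.

    min_pt and max_pt are illustrative defaults, not values from any real site.
    """
    # Treat "Amsterdam" and "amsterdam " as the same tag.
    counts = Counter(t.strip().lower() for t in tags if t.strip())
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for tag, n in counts.items():
        # Linear scaling; if every tag is equally frequent, use the smallest size.
        scale = (n - lowest) / span if span else 0.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Hypothetical tags added by visitors to a set of photos.
example_tags = ["windmill", "Amsterdam", "windmill", "canal", "amsterdam", "windmill"]
sizes = tag_cloud_sizes(example_tags)
for tag, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{tag}: {size}pt")
```

Linear scaling is only one option; with very skewed tag counts a logarithmic scale may give a more readable cloud.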

One of the tasks of an audiovisual archive is to arrange the information and to embed it in a context. Because of social software, the role of an archive is changing. Annemieke de Jong from Sound and Vision describes these changes in the article Users, Producers & Other Tags. Instead of producing the metadata, documentalists and archivists increasingly classify and correct the metadata produced by others. Part of the metadata arises during digitization; other metadata can be created by outside experts, crowdsourcing (an open call to an undefined group of people) and social tagging. According to De Jong, annotation by users can save time and money. “Free tagging by the general public could be of enormous help in making our collections accessible, on clip level and from multiple viewpoints.”[3]

Experts vs. the General User
There is a tension between the traditional annotation system and the social tagging system. Although the word social suggests a community spirit among users, most of them are driven by self-interest.[4] Social taggers use tags primarily to save information that is relevant for their own purposes and add any kind of tag they like. This creates a folksonomy, a free-form system of tags modified by many users. Archivists, on the other hand, are experts who use metadata from a thesaurus, or closed vocabulary. Their main goal is to make the information accessible for others, not for themselves. A thesaurus is usually thoroughly designed by a small group of people.

Folksonomies are based on the principle of the ‘wisdom of the crowd’: if a lot of people share the same opinion, it must be a correct one. Quality is defined by the majority and not by expertise. Wikipedia is built on this assumption. Articles contain general information and expert opinion carries little extra weight. To Weinberger, the wisdom of the crowds is more credible than the wisdom of one, if the process of the creation of the content is visible to users.[5] Not everybody involved in Web 2.0 shares this opinion. One of the founders of Wikipedia, Larry Sanger, wanted more possibilities for expert contribution. He therefore started a new open encyclopedia, Citizendium, where experts have more authority.[6] This example shows that not everyone supports the wisdom of the crowds. Social tagging challenges the notion of quality and the value of expert opinion. De Jong states that social tagging should not be a substitute for annotation by experts. Instead there should be an exchange of information between the two systems.[7]

The different nature of the two systems makes it hard to combine them. Sound and Vision is involved in several projects that research the possibilities of combining both systems. The institute participates in the consortium MultimediaN. One of its projects is Spiegle, a search engine intended as an alternative to Google. Spiegle combines various levels of metadata, like the user profile, the platform and the features of a collection. Unlike Google, this search engine provides very narrow results, which makes it easier for users to find the right content. Sound and Vision also participated in the project Total Content Recommendation with the Telematica Institute in Enschede. The project explored the possibilities of social tagging of audiovisual content and the incentives of users to tag the content. Sound and Vision also participates in PrestoPRIME, a project in the 7th Framework Programme of the EU for the research and development of long-term preservation of new media, which will start in January 2009. One of the goals of this project is to establish interoperability between various databases by enabling information exchange between different systems of metadata. Different forms of annotation are connected with each other, creating a semantic system of tags. Semantic tagging makes the annotation process much more efficient in the long term and eventually bridges the gap between different annotation systems.

Motivation of Users
The success of social software has two reasons: it creates weak ties between users and it operates in an open and free environment. Because of the openness of the software, users can choose their level of participation in a community.[8] Users have to be motivated to participate very actively. During the Total Content Recommendation project, Lex van Velsen and Mark Melenhorst of the Telematica Institute researched the incentives of users to tag video content. Following the classification of Cameron Marlow, research scientist at Facebook, they define six different incentives:

  1. Future retrieval
  2. Contribution and sharing
  3. Attract attention
  4. Play and competition
  5. Self presentation
  6. Opinion expression[9]

The authors tested these motives with two groups. Although the groups were very small, the authors conclude that users didn’t tag for play and competition or for self-presentation. The tagging sites that were tested didn’t have a game element, so it was to be expected that users didn’t use those sites for play and competition.

Someone who has done a lot of research on crowdsourcing and games is Luis von Ahn, a professor of Computer Science at Carnegie Mellon University. His research is based on the assumption that “computers still don’t possess the basic conceptual intelligence or perceptual capabilities that humans take for granted.”[10] Computers aren’t able to solve some problems that most humans solve relatively easily. Von Ahn calls using people to solve such problems human computation. The human brain can be treated as a processor in a distributed system that can perform a small part of a massive computation. To become part of this massive computation, people do require incentives to solve these kinds of problems, such as a game.[11]

An example is the ESP Game. In this game, two random players see the same image, which they need to label. The goal of the game is to use the same label as your partner. Von Ahn also developed a game called Peekaboom to determine the place of an object within an image. Other games he developed are Verbosity, a game for collecting commonsense facts, and Phetch, a game for collecting image descriptions for the visually impaired.[12] Other researchers have developed games to collect metadata for audio content, like the Listen Game, Tag-a-Tune and MajorMiner. The Listen Game is based on a list of tags players can use to annotate the material. The other two games are similar to the ESP Game: the added tags are compared with those from a database or a different player. In both games players are free to use any tag they like.[13] There are also a few tagging initiatives based on video content. Yahoo’s Video Tag Game is based on the same principle as the ESP Game. Players earn points by adding similar tags.[14] This game, developed by Yahoo Research, is still in an exploratory stage. The game VideoTag, developed by Stacey Greenaway as an MSc research project, is already operational as a single-player game in which players collect points by adding tags. Some tags are pitfalls (tags that are too obvious) that lower the player’s score. All these tagging games are examples of generating metadata in a playful way. Von Ahn sees great possibilities for these kinds of games. However, a game designed to solve a problem should produce the right solution and be fun at the same time.[15]
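The two game mechanics described above, agreement between two players and pitfall tags that cost points, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the actual implementation of the ESP Game or VideoTag: the function names, the normalisation of labels and the point values are all hypothetical.

```python
def normalise(label):
    """Lower-case a label and collapse whitespace so 'Dog ' and 'dog' match."""
    return " ".join(label.lower().split())


def agreed_labels(player_a, player_b):
    """ESP-style agreement: keep only labels that both players entered.

    Results follow the order in which player A typed them.
    """
    b_labels = {normalise(label) for label in player_b}
    agreed = []
    for label in player_a:
        key = normalise(label)
        if key and key in b_labels and key not in agreed:
            agreed.append(key)
    return agreed


def pitfall_score(tags, pitfalls, points_per_tag=10, pitfall_penalty=5):
    """VideoTag-style scoring: each new tag earns points, pitfall tags cost points.

    The point values are assumptions chosen for the example.
    """
    banned = {normalise(p) for p in pitfalls}
    unique_tags = dict.fromkeys(normalise(t) for t in tags if t.strip())
    score = 0
    for tag in unique_tags:
        score += -pitfall_penalty if tag in banned else points_per_tag
    return score


# Two hypothetical players labelling the same image.
matches = agreed_labels(["Dog", "grass ", "ball"], ["park", "dog", "Ball"])
# One hypothetical player tagging a video; "video" is treated as too obvious.
score = pitfall_score(["tram", "video", "Amsterdam"], pitfalls={"video"})
print(matches, score)
```

Only the agreed labels would be stored as metadata in the first case, which is how agreement between independent players acts as a simple quality check.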

Future Archives
The future archive has to be an open archive to survive the transition from Web 1.0 to Web 2.0 and beyond, where users are able to use and label content for their own purposes. Archivists still have a role in evaluating and contextualizing the metadata created by general users. To stimulate the creation of different forms of metadata, archives could develop creative concepts like games that encourage users to create new metadata. Until now a lot of research has been done on gathering metadata from general users. Quantity seems more important than quality. Most of the games are modelled on the ESP Game. As von Ahn stated, tagging games should provide the right solution. Further research should focus on the right solution. But what is the right solution? Is it the metadata created by experts, or is it the wisdom of the crowd? And if we know the right solution, how do we control the tagging process to get it? Are players able to provide the right solution or is it necessary for archives to check the metadata that is produced? If that’s the case, is a tagging game profitable enough? Further research should focus on these questions in order to gain more insight into the possibilities social tagging has for archives. Sound and Vision will release a video tagging environment in early 2009 in collaboration with the Free University Amsterdam and the broadcaster KRO in order to answer some of the questions raised above.

Literature

  • Ahn, L. von. “Games with a Purpose,” Computer (Vol. 39:6, June 2006). pp. 92-94. URL
  • Jong, A. de. “Users, producers & other tags. Trends and developments in metadata creation.” Lecture at the FIAT/IFTA conference (October 2007) URL
  • Mechant, P. “Culture ‘2.0’: Social and Cultural Exploration through the use of Folksonomies and Weak Cooperation.” Cultuur 2.0. (Amsterdam: Virtueel Platform, 2007). URL
  • Lusenet, Y. de. Geven en nemen. Archiefinstellingen en het sociale web. (Den Haag: Taskforce archieven, 2008).
  • Siorpaes, K. & Hepp, M. “Games with a purpose for the semantic web.” Intelligent Systems (Vol 23:3, May 2008) pp. 50-60. URL
  • Surowiecki, J. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. (New York: Random House, 2004).
  • Turnbull, D. e.a. “Five approaches to collecting tags for music.” ISMIR Conference (2008). pp.225-230. URL
  • Velsen, L. van & Melenhorst, M. “User Motives for Tagging Video Content.” (2008). URL
  • Wartena, C. & Brussee, R. “Instance-based mapping between thesauri and folksonomies.” Proceedings of the 7th International Semantic Web Conference (ISWC’08) (2008). URL
  • Weinberger, D. “Knowledge at the End of the Information Age.” Bertha Bassam Lecture at the University of Toronto (2008). URL
  • Zwol, R. van. e.a. “Video Tag Game.” 17th International World Wide Web Conference (WWW developer track) (ACM Press, 2008).

[1] Weinberger, D. (2008)
[2] Surowiecki, J. (2004)

[3] Jong, A. de. (2007).

[4] Velsen, L. van & Melenhorst, M. (2008) p.2.

[5] Weinberger, D. (2008)

[6] Lusenet, Y. de. (2008) p.19-20.

[7] Jong, A. de. (2007).

[8] Mechant, P. (2007) p. 24.

[9] Velsen, L. van & Melenhorst, M. (2008) p.2.

[10] Ahn, L. von. (2006) p.96.

[11] Ibid. p.96.

[12] Siorpaes, K. & Hepp, M. (2008) p. 51.

[13] Turnbull, D. e.a. (2008) p. 227.

[14] Zwol, R. van. e.a. (2008) p.1-2.

[15] Ahn, L. von. (2006) p.96-98.

 

Library of Congress releases report on Flickr pilot

Tuesday, December 16th, 2008

After nine months, the Library of Congress (LoC) has released a detailed report on its Flickr pilot. In January 2008 the LoC and Flickr launched Flickr Commons. The LoC uploaded a few thousand historical photos, which have drawn more than 10 million views, 7,166 comments and more than 67,000 tags, according to the new report from the project team. The project had an unexpected impact:

“The pilot spurred many positive yet unexpected outcomes—especially Flickr members’ willingness to devote great effort to photo-related detective work and their level of engagement with historical images. Further, Flickr members have often drawn on personal histories to connect with the pictures, including memories of farming practices, grandparents’ lives, women’s roles in World War II, and the changing landscape of local neighborhoods”

Photo: Library of Congress, Germany Schaefer, Washington AL (baseball), 1911.

If you want to read in more detail how the LoC organised and experienced the pilot, you can download the full LoC Flickr report here.

 

German Federal Archive publishes photos on Wikipedia under Creative Commons license

Monday, December 8th, 2008

On December 6th, the German Federal Archive and the online encyclopedia Wikipedia announced their cooperation in making 100,000 digitized images publicly available under a Creative Commons licence (CC-BY-SA) in exchange for linking the photos to Wikipedia’s Persondata. A big step towards opening up public content and data.

The commons

In September 2007 the German Federal Archive already made 113,000 images available in its own online digital archive. In total the Federal Archive keeps approximately 11 million still pictures, aerial photographs and posters from modern German history. The cooperation with Wikipedia is the next big step for the German Federal Archive in opening up the archive, as the vice president of the German Federal Archive, Dr. Angelika Menne-Haritz, said during the press conference.

Photo: Raimond Spekking, Creative Commons Attribution ShareAlike 3.0.

The photos are not of the highest resolution, about 800 pixels on the longest side, but this is an enormous addition to the commons. According to Wikimedia Commons, the repository of free-content images, sound and other multimedia files used on Wikipedia, the donation by the German Federal Archive of 100,000 images is the single largest one to Wikimedia Commons so far. This is even more than the archival project Flickr Commons currently makes available in cooperation with 16 archival partners around the world.

Click here for the image gallery: http://www.bild.bundesarchiv.de/

Photo: Mitglieder des Deutschen Reichstag, German Federal Archive (1889). Author: Braatz, Julius. Creative Commons Attribution ShareAlike 3.0.

Creative Commons License

The images from the German Federal Archive are licensed under the Creative Commons Attribution ShareAlike 3.0 Germany License (CC-BY-SA). This means that you are free to share and remix the images on the condition that you give attribution and distribute the result under a similar or compatible license. The Federal Archive can do this because it owns sufficient rights to the images to be able to grant this kind of license. Using such a free license for archival material is really exciting. Few archives work with Creative Commons licences. Rare examples are the McCord Museum and the Brabants Historisch Informatiecentrum. The archival project Flickr Commons, by contrast, works with “no known copyright restrictions”.

Persondata

The other part of the cooperation between the German Federal Archive and Wikipedia is a tool for linking people from a list compiled by the Federal Archive to the German Wikipedia Persondata and to the person authority file of the German National Library, something German Wikipedia has been doing since 2005. Around 27% of the 100,000 photos have already been linked. The expectation is that, now that the cooperation is public, the pace will pick up. Moreover, users will add new information to the images. You can find the To Do list here.

Conclusion

Although Markus, project leader of Creative Commons Germany, says that this is only a small revolution by German standards, it could very well set an example for other archives to make their content publicly available and thereby grow bigger. It will be very interesting to see where the photos turn up and in which (rich) context, because that will make a strong argument for archives to experiment with this.

 

Interesting links

Monday, December 1st, 2008

Below you’ll see some interesting reading material which could be useful one way or the other for our project Images for the Future (and of course other digitization projects). Click here for previous links. Some of the entries are in Dutch.

1. Heritage Online at Fear it? Fix it! 2008
2. Erfgoedpunt haakt in op toenemende interesse voor lokale geschiedenis
3. Reageren op Groenboek Auteursrecht in de kenniseconomie
4. Monty Python Puts All Its Content On YouTube To Increase Sales Of Scarce Goods
5. Are Copyright Holders Purposely Putting Content On P2P In Order To Demand Money?
6. Now ‘Online’: Europeana, Europe’s Digital Library
7. Teleblik nu ook voor onderbouw basisonderwijs
8. Handbook on cultural web user interaction
9. Hulu to Match YouTube’s Revenue: Ten Observations For The Future of Media
10. LIFE Magazine Photo Archives Arrive in Google Image Search

 

Heritage Online at Fear it? Fix it! 2008

Monday, December 1st, 2008

Last Friday, XS4ALL hosted FIFI 2008 at the Westergasfabriek in Amsterdam. With this event, XS4ALL (the first company to offer internet to individuals in the Netherlands) celebrated its 15th birthday. The event consisted of workshops on open-source creativity, FABLAB, Web 3.0 and more.

FIFI was organised around three themes, Toolbox for the future (about opportunities and possibilities of Internet and technology), Secured? let yourself hack (data security is often not sufficient, how secure is secure?) and Privacy: secret or property? (how to deal with company information and private information. The Internet knows everything about everybody, is that ok?).

Within the theme Toolbox for the future, Johan Oomen (Netherlands Institute for Sound and Vision) gave a talk, “Heritage Online”, on the opportunities digitisation offers to heritage institutions across the globe. The slides (in Dutch) are available on SlideShare.
