Sharing is caring keynote ‘Enriching cultural heritage collections through a Participatory Commons’

Enriching cultural heritage collections through a Participatory Commons platform: a provocation about collaborating with users

Mia Ridge, Open University. Contact me: @mia_out or http://miaridge.com/

[I was invited to Copenhagen to talk about my research on crowdsourcing in cultural heritage at the 3rd international Sharing is Caring seminar on April 1. I’m sharing my notes in advance to make life easier for those awesome people following along in a second or third language, particularly since I’m delivering my talk via video.]

Today I’d like to present both a proposal for something called the ‘Participatory Commons’ and a provocation (or conversation starter). There’s a paradox in our hopes for deeper audience engagement through crowdsourcing: projects that don’t grow with their participants will lose them as those participants develop new skills and interests and move on. This talk presents some options for dealing with this paradox and suggests that a Participatory Commons provides a way to take a sector-wide view of active engagement with heritage content and to redefine our sense of what it means when everybody wins.

I’d love to hear your thoughts about this – I’ll be following the hashtag during the session and my contact details are above.

Before diving in, I wanted to reflect on some lessons from my work in museums on public engagement and participation.

My philosophy for crowdsourcing in cultural heritage (aka what I’ve learnt from making crowdsourcing games)

One thing I’ve learnt over the past few years is that museums can be intimidating places. When we ask for help with things like tagging or describing our collections, people want to help, but they worry about getting it wrong and looking stupid, or about harming the museum.

The best technology in the world won’t solve a single problem unless it’s empathically designed and accompanied by social solutions. This isn’t a talk about technology, it’s a talk about people – what they want, what they’re afraid of, how we can overcome all that to collaborate and work together.

Dora’s Lost Data

So a few years ago I explored the potential of crowdsourcing games to make helping a museum less scary and more fun. In this game, ‘Dora’s Lost Data’, players meet a junior curator who asks them to tag objects so they’ll be findable in Google. Games aren’t the answer to everything, but identifying barriers to participation is always important. You have to understand your audiences: their motivations for starting and continuing to participate, and the fears, anxieties and uncertainties that prevent them participating. [My games were hacked together outside of work hours; more information is available at My MSc dissertation: crowdsourcing games for museums. If you’d like to see properly polished metadata games, check out Tiltfactor’s http://www.metadatagames.org/#games]

Mutual wins – everybody’s happy

My definition of crowdsourcing: cultural heritage crowdsourcing projects ask the public to undertake tasks that cannot be done automatically, in an environment where the activities, goals (or both) provide inherent rewards for participation, and where their participation contributes to a shared, significant goal or research area.

It helps to think of crowdsourcing in cultural heritage as a form of volunteering. Participation has to be rewarding for everyone involved. That sounds simple, but focusing on the audiences’ needs can be difficult when there are so many organisational needs competing for priority and limited resources for polishing the user experience. Further, as many projects discover, participant needs change over time…

What is a Participatory Commons and why would we want one?

First, I have to introduce you to some people. These are composite stories (personas) based on my research…

Two archival historians, Simone and Andre. Simone travels to archives in her semester breaks to stock up on research material, taking photos of most documents ‘in case they’re useful later’ and transcribing key text from others. Andre is often at the next table, also looking for material for his research. The documents he collected for his last research project would be useful for Simone’s current book, but they’ve never met and he has no way of sharing that part of his ‘personal research collection’ with her. Currently, each of these highly skilled researchers takes their cumulative knowledge away with them at the end of the day, leaving no trace of their work in the archive itself. Next…

Two people from a nearby village, Martha and Bob. They joined their local history society when they retired and moved to the village. They’re helping find out what happened to children from the village school’s class of 1898 in the lead-up to and during World War I. They are using census returns and other online documents to add records to a database the society’s secretary set up in Excel. Meanwhile…

A family historian, Daniel. He has a classic ‘shoebox archive’ – a box containing his grandmother Sarah’s letters and diary, describing her travels and everyday life at the turn of the century. He’s transcribing them and wants to put them online to share with his extended family. One day he wants to make a map for his kids that shows all the places their great-grandmother lived and visited. Finally, there’s…

Crowdsourcer Nisha. She has two young kids and works for a local authority. She enjoys playing games like Candy Crush on her mobile, and after the kids have gone to bed she transcribes ship logs on the Old Weather website while watching TV with her husband. She finds it relaxing, feels good about contributing to science and enjoys the glimpses of life at sea. Sites like Old Weather use ‘microtasks’ – tiny, easily accomplished tasks – and crowdsourcing to digitise large amounts of text.

Helping each other?

None of our friends above know it, but they’re all looking at material from roughly the same time and place. Andre and Simone could help each other by sharing the documents they’ve collected over the years. Sarah’s diaries include the names of many children from her village that would help Martha and Bob’s project, and Nisha could help everyone if she transcribed sections of Sarah’s diary.

Connecting everyone’s efforts for the greater good: Participatory Commons

This image shows the two main aspects of the Participatory Commons: the different sources for content, and the activities that people can do with that content.

The Participatory Commons (image: Mia Ridge)

The Participatory Commons is a platform where content from different sources can be aggregated. Access to shared resources underlies the idea of the ‘Commons’, particularly material that is not currently suitable for sites like Europeana, like ‘shoebox archives’ and historians’ personal record collections. So if the ‘Commons’ part refers to shared resources, how is it participatory?

The Participatory Commons interface supports a range of activities: the types of tasks historians typically do, like assessing and contextualising documents; activities that specialists or the public can do, like identifying particular people, places, events or things in sources; and typical crowdsourcing tasks like full-text transcription or structured tagging.

By combining the energy of crowdsourcing with the knowledge historians create, on a platform that can store or link to primary sources from museums, libraries and archives alongside ‘shoebox archives’, the Commons could help make our shared heritage more accessible to all. As a platform that makes material about ordinary people available alongside official archives, and as an interface for enjoyable, meaningful participation in heritage work, the Commons could be a basis for ‘open source history’, redressing some of the absences in official archives while improving the quality of all records.

As a work in progress, this idea of the Participatory Heritage Commons has two roles: an academic thought experiment to frame my research, and a provocation for GLAMs (galleries, libraries, archives and museums) to think outside their individual walls. As a vision for ‘open source history’, it’s inspired by community archives, public history, participant digitisation and history from below… This combination of a large underlying repository and more intimate interfaces could be quite powerful. Capturing some of the knowledge generated when scholars access collections would benefit both archives and other researchers.

‘Niche projects’ can be built on a Participatory Commons

As a platform for crowdsourcing, the Participatory Commons provides efficiencies of scale in the backend work for verifying and validating contributions, managing user accounts, forums, etc. But that doesn’t mean that each user would experience the same front-end interface.

Niche projects build on the Participatory Commons
(quick and dirty image: Mia Ridge)

My research so far suggests that tightly-focused projects are better able to motivate participants and create a sense of community. These ‘niche’ projects may be related to a particular location, period or topic, or to a particular type of material. The success of the New York Public Library’s What’s on the Menu project, designed around a collection of historic menus, and of the British Library’s Georeferencer project, designed around its historic map collection, demonstrates the value of defining projects around niche topics.

The best crowdsourcing projects use carefully designed interactions tailored to the specific content, audience and data requirements of a given project, and these interactions are usually purpose-built. For example, the Zooniverse family of projects uses much of the same underlying software, but each project is designed around specific tasks on specific types of material, whether classifying simple galaxy types, plankton or animals on the Serengeti, or transcribing ship logs or military diaries.

The Participatory Commons is not only a collection of content; it also allows ‘niche’ projects to be layered on top, presenting more focused sets of content through specialist interfaces designed around the content, audience and purpose.
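To make the layering idea concrete, here’s a minimal sketch of how niche projects could be defined as focused views over one shared pool of records. This is my illustration only – all the class and field names are hypothetical, not a description of any existing system:

```python
# A sketch of the Participatory Commons idea: one shared pool of records
# aggregated from many sources, with 'niche' projects layered on top as
# focused views. Hypothetical names throughout.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class CommonsRecord:
    """One item in the shared pool: a museum object, an archival document,
    or a page from a 'shoebox archive'."""
    record_id: str
    source: str                # e.g. 'museum', 'archive', 'shoebox'
    period: str                # e.g. '1890s'
    place: str
    transcription: str = ""    # enriched by participants over time
    tags: set = field(default_factory=set)

@dataclass
class NicheProject:
    """A tightly-focused front end over the shared Commons: it selects a
    subset of records and offers a task suited to that material, while user
    accounts, validation and forums stay in the shared backend."""
    name: str
    task: str                  # e.g. 'transcribe', 'tag', 'identify people'
    matches: Callable = lambda record: True

    def records(self, commons):
        return [r for r in commons if self.matches(r)]

commons = [
    CommonsRecord("r1", "shoebox", "1890s", "the village"),  # Sarah's diary
    CommonsRecord("r2", "archive", "1890s", "the village"),  # census return
    CommonsRecord("r3", "museum", "1950s", "elsewhere"),
]

# A project like Martha and Bob's sees only its focused slice, but any
# transcription added through it enriches the shared record for everyone.
class_of_1898 = NicheProject(
    name="Class of 1898",
    task="transcribe",
    matches=lambda r: r.period == "1890s" and r.place == "the village",
)
print([r.record_id for r in class_of_1898.records(commons)])  # ['r1', 'r2']
```

The point of the sketch is the direction of flow: the niche interface narrows what each participant sees, but every contribution lands back in the shared record, where the next project or researcher can build on it.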

Barriers

But there are still many barriers to consider, including copyright and technical issues and important cultural issues around authority, reliability, trust, academic credit and authorship. [There’s more background on this at my earlier post on historians and the Participatory Commons and Early PhD findings: Exploring historians’ resistance to crowdsourced resources.]

Now I want to set the idea of the Participatory Commons aside for a moment, and return to crowdsourcing in cultural heritage. I’ve been looking for factors in the success or otherwise of crowdsourcing projects, from grassroots, community-led projects to big glamorous institutionally-led sites.

I mentioned that Nisha found transcribing text relaxing. Like many people who start transcribing text, she found herself getting interested in the events, people and places mentioned in the text. Forums or other methods for participants to discuss their questions seem to help keep participants motivated, and they also provide somewhere for a spark of curiosity to grow (as in this forum post). We know that some people on crowdsourcing projects like Old Weather get interested in history, and even start their own research projects.

Crowdsourcing as gateway to further activity

You can see that happening on other crowdsourcing projects too. For example, Herbaria@home aims to document historical herbarium collections within museums based on photographs of specimen cards. So far participants have documented over 130,000 historic specimens. In the process, some participants also found themselves becoming interested in the people whose specimens they were documenting.

As a result, the project has expanded to include biographies of the original specimen collectors. It was able to accommodate this new interest through a project wiki, which has a combination of free text and structured data linking records between the transcribed specimen cards and individual biographies.

‘Levels of Engagement’ in citizen science

There’s a consistent enough pattern in science crowdsourcing projects that there’s a model from ‘citizen science’ that outlines different stages participants can move through, from undertaking simple tasks, through joining in community discussion, to ‘working independently on self-identified research projects’.[1]

Is this ‘mission accomplished’?

This is Nick Poole’s word cloud based on 40 museum mission statements. With words like ‘enjoyment’, ‘access’ and ‘learning’ appearing in museum missions, doesn’t this mean that turning transcribers into citizen historians while digitising and enhancing collections is a success? Well, yes, but…

Paths diverge; paradox ahead?

There’s a tension between GLAMs’ desire to invite people to ‘go deeper’, to find their own research interests and begin to become citizen historians, and their desire to ask people to help with tasks GLAMs have set to support their own work. Heritage organisations can try to channel that impulse to start research into questions about their own collections, but sometimes it feels like we’re asking people to do our homework for us. The scaffolds put in place to help make tasks easier may start to feel like a constraint.

Who has agency?

If people move beyond simple tasks into more complex tasks that require a greater investment of time and learning, then issues of agency – participants’ ability to make choices about what they’re working on and why – start to become more important. Would Wikipedia have succeeded if it dictated what contributors had to write about? We shouldn’t mistake volunteers for a workforce just because they can be impressively dedicated contributors.

Participatory project models

Turning again to citizen science – this time to public participation in scientific research – we have a model for categorising participatory projects according to the amount of control participants have over the design of the project itself, or, to look at it another way, how much authority the organisation has ceded to the crowd. This model contains three categories: ‘contributory’, where the public contributes data to a project designed by the organisation; ‘collaborative’, where the public can help refine project design and analyse data in a project led by the organisation; and ‘co-creative’, where the public can take part in all or nearly all processes, and all parties design the project together.[2]

As you can imagine, truly co-creative projects are rare. It seems cultural organisations find it hard to truly collaborate with members of the public, for many understandable reasons. The level of transparency required and the investment of time for negotiating mutual interests, goals and capabilities increase as collaboration deepens. Institutional constraints and lack of time to engage in deep dialogue with participants make it difficult to find shared goals that work for all parties. It seems GLAMs sometimes try to take shortcuts and end up making decisions for the group, which means their ‘co-creative’ project is actually merely ‘collaborative’.

New challenges

When participants start to outgrow the tasks that originally got them hooked, projects face a choice. Some projects are experimenting with setting challenges for participants. Here you see ‘mysteries’ set by the UK’s Museum of Design in Plastics, and by San Francisco Public Library on Historypin. Finding the right match between the challenge set and the object can be difficult without some existing knowledge of the collection, and it can require a lot of ongoing time to encourage participants. Putting the mystery under the nose of the person who has the knowledge or skills to solve it is another challenge that projects like this will have to tackle.

Working with existing communities of interest is a good start, but it also takes work to figure out where they hang out online (or in person) and understand how they prefer to work. GLAMs sometimes fall into the trap of choosing the technology first, or trying something because it’s trendy; it’s better to start with the intersection between your content and the preferences of potential audiences.

But is it wishful thinking to hope that others will be interested in answering the questions GLAMs are asking?

A tension?

Should projects accept that some people will move on as they develop new interests, and concentrate on recruiting new participants to replace them? Do they try to find more interesting tasks or new responsibilities for participants, such as helping moderate discussions, or checking and validating other people’s work? Or should they find ways for the project to grow as participants’ skills and knowledge increase? It’s important to make these decisions mindfully, as the default is otherwise to accept a level of turnover as participants move on.

To return to lessons from citizen science, possible areas for deeper involvement include choosing or defining questions for study, analysing or interpreting data and drawing conclusions, and discussing results and asking new questions.[3] However, heritage organisations might have to accept that the questions people want to ask might not involve their collections, and that these citizen historians’ new interests might not leave time for their previous crowdsourcing tasks.

Why is a critical mass of content in a Participatory Commons useful?

And now we return to the Participatory Commons and the question of why a critical mass of content would be useful.

Increasingly, the old divisions between museum, library and archive collections don’t make sense. For most people, content is content, and they don’t understand why a pamphlet about a village fete in 1898 would be described and accessed differently depending on whether it had ended up in a museum, library or archive catalogue.

Basing niche projects on a wider range of content creates opportunities for different types of tasks and levels of responsibility. Projects that provide a variety of tasks and roles can support a range of different levels and types of participant skills, availability, knowledge and experience.

A critical mass of material is also important for the discoverability of heritage content. Even the most sophisticated researcher turns to Google sometimes, and if your content doesn’t come up in the first few results, many researchers will never know it exists. It’s easy to say but less easy to make a reality: the easier it is to find your collections, the more likely it is that researchers will use them.

Commons as party?

More importantly, a critical mass of content in a Commons allows us to re-define ‘winning’. If participation is narrowly defined as belonging to individual GLAMs, then when a citizen historian moves on to a project that doesn’t involve your collection it can seem like you’ve lost a collaborator. But the people who developed a new research interest through a project at one museum might find they end up using records from the archive down the road, and transcribing or enhancing its records during their investigation. If all the institutions in the region shared their records on the Commons, or let researchers take and share photos while using their collections, the researcher has a critical mass of content for their research and, hopefully as a side-effect, their activities will improve links between collections. If the Commons allows GLAMs to take a sector-wide view, then someone moving on to a different collection becomes a moment to celebrate, a form of graduation. In our wildest imagination, the Commons could be like a fabulous party where you never know what interesting people and things you’ll discover…

To conclude – by designing platforms that allow people to collect and improve records as they work, we’re helping everybody win.

Thank you! I’m looking forward to hearing your thoughts.


[1]M. Jordan Raddick et al., ‘Citizen Science: Status and Research Directions for the Coming Decade’, in astro2010: The Astronomy and Astrophysics Decadal Survey, vol. 2010, 2009, http://www8.nationalacademies.org/astro2010/DetailFileDisplay.aspx?id=454.

[2]Rick Bonney et al., Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education. A CAISE Inquiry Group Report (Washington D.C.: Center for Advancement of Informal Science Education (CAISE), July 2009), http://caise.insci.org/uploads/docs/PPSR%20report%20FINAL.pdf.

[3]Bonney et al., Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education. A CAISE Inquiry Group Report.


Image credits in order of appearance: Glider, Library of Congress; Great hall, Library of Congress; Curzona Allport, Tasmanian Archive and Heritage Office; Hålanda Church, Västergötland, Sweden, Swedish National Heritage Board; Postmaster General James A. Farley During National Air Mail Week, 1938, Smithsonian Institution; Canterbury Bankstown Rugby League Football Club’s third annual Ball, Powerhouse Museum.

Early PhD findings: Exploring historians’ resistance to crowdsourced resources

I wrote up some early findings from my PhD research for conferences back in 2012, when I was working on questions around ‘but will historians really use resources created by unknown members of the public?’. People keep asking me for copies of my notes (and I’ve noticed people citing an online video version, which isn’t ideal) and since they might be useful and any comments would help me write up the final thesis, I thought I’d be brave and post my notes.

A million caveats apply – these were early findings; my research questions and focus have changed, and I’ve interviewed more historians and reviewed many more participative history projects since then; as a short paper it doesn’t address methods etc.; and obviously it’s only a tiny part of a huge topic… (If you’re interested in crowdsourcing, you might be interested in other writing related to scholarly crowdsourcing and collaboration from my PhD, or my edited volume on ‘Crowdsourcing our cultural heritage’.) So, with those health warnings out of the way, here it is. I’d love to hear from you, whether with critiques, suggestions, or just stories about how it relates to your experience. And obviously, if you use this, please cite it!

Exploring historians’ resistance to crowdsourced resources

Scholarly crowdsourcing may be seen as a solution to the backlog of historical material to be digitised, but will historians really use resources created by unknown members of the public?

The Transcribe Bentham project describes crowdsourcing as ‘the harnessing of online activity to aid in large scale projects that require human cognition’ (Terras, 2010a). ‘Scholarly crowdsourcing’ is a related concept that generally seems to involve the collaborative creation of resources through collection, digitisation or transcription. Crowdsourcing projects often divide up large tasks (like digitising an archive) into smaller, more manageable tasks (like transcribing a name, a line, or a page); this method has helped digitise vast numbers of primary sources.
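To illustrate that division of labour – a generic sketch under my own assumptions, not any particular project’s code – a transcription pipeline might split a digitised page into line-level microtasks, collect several independent transcriptions of each, and only accept text once enough volunteers agree:

```python
# A generic illustration of microtasking with redundancy; hypothetical
# function names, not drawn from any real project's codebase.
from collections import Counter

def make_microtasks(page_id, line_images, redundancy=3):
    """Split one large task (a page) into small ones (single lines), each
    to be transcribed independently by `redundancy` volunteers."""
    return [
        {"task_id": f"{page_id}-line-{i}", "image": image, "needed": redundancy}
        for i, image in enumerate(line_images)
    ]

def accept(transcriptions, agreement=2):
    """Accept a line's text once enough independent transcriptions match;
    otherwise return None so the task can be queued for further review."""
    text, votes = Counter(transcriptions).most_common(1)[0]
    return text if votes >= agreement else None

tasks = make_microtasks("page-42", ["line0.jpg", "line1.jpg", "line2.jpg"])
print(len(tasks))                                          # 3
print(accept(["Mary Smith", "Mary Smith", "Mary Smyth"]))  # Mary Smith
print(accept(["Mary Smith", "Mary Smyth", "M. Smith"]))    # None
```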

My doctoral research was inspired by a vision of ‘participant digitization’, a form of scholarly crowdsourcing that seeks to capture the digital records and knowledge generated when researchers access primary materials in order to openly share and re-use them. Unlike many crowdsourcing projects which are designed for tasks performed specifically for the project, participant digitization harnesses the transcription, metadata creation, image capture and other activities already undertaken during research and aggregates them to create re-usable collections of resources.

Research questions and concepts

When Howe clarified his original definition, stating that the ‘crucial prerequisite’ in crowdsourcing is ‘the use of the open call format and the large network of potential laborers’, a ‘perfect meritocracy’ based not on external qualifications but on ‘the quality of the work itself’, he created a challenge for traditional academic models of authority and credibility (Howe 2006, 2008). Furthermore, how does anonymity or pseudonymity (defined here as often long-standing false names chosen by users of websites) complicate the process of assessing the provenance of information on sites open to contributions from non-academics? An academic might choose to disguise their identity to mask their research activities from competing peers, from a desire to conduct early exploratory work in private or simply because their preferred username was unavailable; but when contributors are not using their real names they cannot derive any authority from their personal or institutional identity. Finally, which technical, social and scholarly contexts would encourage researchers to share (for example) their snippets of transcription created from archival documents, and to use content transcribed by others? What barriers exist to participation in crowdsourcing or prevent the use of crowdsourced content?

Methods

I interviewed academic and family/local historians about how they evaluate, use, and contribute to crowdsourced and traditional resources to investigate how a resource based on ‘meritocracy’ disrupts current notions of scholarly authority, reliability, trust, and authorship. These interviews aimed to understand current research practices and probe more deeply into how participants assess different types of resources, their feelings about resources created by crowdsourcing, and to discover when and how they would share research data and findings.

I sought historians investigating the same country and time period in order to have a group of participants who faced common issues with the availability and types of primary sources from early modern England. I focused on academic and ‘amateur’ family or local historians because I was interested in exploring the differences between them to discover which behaviours and attitudes are common to most researchers and which are particular to academics and the pressures of academia.

I recruited participants through personal networks and social media, and conducted interviews in person or on Skype. At the time of writing, 17 participants have been interviewed for up to 2 hours each. It should be noted that these results are of a provisional nature and represent a snapshot of on-going research and analysis.

Early results

I soon discovered that citizen historians are perfect examples of Pro-Ams: ‘knowledgeable, educated, committed, and networked’ amateurs ‘who work to professional standards’ (Leadbeater and Miller, 2004; Terras, 2010b).

How do historians assess the quality of resources?

Participants often simply said they drew on their knowledge and experience when sniffing out unreliable documents or statements. When assessing secondary sources, their tacit knowledge of good research and publication practices was evident in common statements like ‘[I can tell from] it’s the way it’s written’. They also cited the presence and quality of footnotes, and the depth and accuracy of information as important factors. Transcribed sources introduced another layer of quality assessment – researchers might assess a resource by checking for transcription errors that are often copied from one database to another. Most researchers used multiple sources to verify and document facts found in online or offline sources.

When and how do historians share research data and findings?

It appears that between accessing original records and publishing information, there are several key stages where research data and findings might be shared. Stages include acquiring and transcribing records, producing visualisations like family trees and maps, publishing informal notes and publishing synthesised content or analysis; whether a researcher passes through all the stages depends on their motivation and audience. Information may change formats between stages, and since many claim not to share information that has not yet been sufficiently verified, some information would drop out before each stage. It also appears that in later stages of the research process the size of the potential audience increases and the level of trust required to share with them decreases.

For academics, there may be an additional, post-publication stage when resources are regarded as ‘depleted’ – once they have published what they need from them, they would be happy to share them. Family historians meanwhile see some value in sharing versions of family trees online, or in posting names of people they are researching to attract others looking for the same names.

Sharing is often negotiated through private channels and personal relationships. Methods of controlling sharing include showing people work in progress on a screen rather than sending it to them and using email in preference to sharing functionality supplied by websites – this targeted, localised sharing allows the researcher to retain a sense of control over early stage data, and so this is one key area where identity matters. Information is often shared progressively, and getting access to more information depends on your behaviour after the initial exchange – for example, crediting the provider in any further use of the data, or reciprocating with good data of your own.

When might historians resist sharing data?

Participants gave a range of reasons for their reluctance to share data. Being able to convey the context of creation and the qualities of the source materials is important for historians who may consider sharing their ‘depleted’ personal archives – not being able to provide this means they are unlikely to share. Being able to convey information about data reliability is also important. Some information about the reliability of a piece of information is implicitly encoded in its format (for example, in pencil in notebooks versus electronic records), hedging phrases in text, in the number of corroborating sources, or a value judgement about those sources. If it is difficult to convey levels of ‘certainty’ about reliability when sharing data, it is less likely that people will share it – participants felt a sense of responsibility about not publishing (even informally) information that hasn’t been fully verified. This was particularly strong in academics. Some participants confessed to sneaking forbidden photos of archival documents they ran out of time to transcribe in the archive; unsurprisingly it is unlikely they would share those images.

Overall, if historians do not feel they would get information of equal value back in exchange, they seem less likely to share. Professional researchers do not want to give away intellectual property, and feel sharing data online is risky because the protocols of citation and fair use are presently uncertain. Finally, researchers did not always see a point in sharing their data. Family history content was seen as too specific and personal to have value for others; academics may realise the value of their data within their own tightly-defined circles but not realise that their records may have information for other biographical researchers (i.e. people searching by name) or other forms of history.

Which concerns are particular to academic historians?

Reputational risk is an issue for some academics who might otherwise share data. One researcher said: ‘we are wary of others trawling through our research looking for errors or inconsistencies. […] Obviously we were trying to get things right, but if we have made mistakes we don’t want to have them used against us. In some ways, the less you make available the better!’. Scholarly territoriality can be an issue – if there is another academic working on the same resources, their attitude may affect how much others share. It is also unclear how academic historians would be credited for their work if it was performed under a pseudonym that does not match the name they use in academia.

What may cause crowdsourced resources to be under-used?

In this research, ‘amateur’ and academic historians shared many of the same concerns for authority, reliability, and trust. The main reported cause of under-use (for all resources) is not providing access to original documents as well as transcriptions. Researchers will use almost any information as pointers or leads to further sources, but they will not publish findings based on that data unless the original documents are available or the source has been peer-reviewed. Checking the transcriptions against the original is seen as ‘good practice’, part of a sense of responsibility ‘to the world’s knowledge’.

Overall, the identity of the data creator is less important than expected – for digitised versions of primary sources, reliability is not vested in the identity of the digitiser but in the source itself. Content found on online sites is tested against a set of finely-tuned ideas about the normal range of documents rather than the authority of the digitiser.

Cite as:

Ridge, Mia. “Early PhD Findings: Exploring Historians’ Resistance to Crowdsourced Resources.” Open Objects, March 19, 2014. http://www.openobjects.org.uk/2014/03/early-phd-findings-exploring-historians-resistance-to-crowdsourced-resources/.

References

Howe, J. (undated). Crowdsourcing: A Definition http://crowdsourcing.typepad.com

Howe, J. (2006). Crowdsourcing: A Definition. http://crowdsourcing.typepad.com/cs/2006/06/crowdsourcing_a.html

Howe, J. (2008). Join the crowd: Why do multinationals use amateurs to solve scientific and technical problems? The Independent. http://www.independent.co.uk/life-style/gadgets-and-tech/features/join-the-crowd-why-do-multinationals-use-amateurs-to-solve-scientific-and-technical-problems-915658.html

Leadbeater, C., and Miller, P. (2004). The Pro-Am Revolution: How Enthusiasts Are Changing Our Economy and Society. Demos, London, 2004. http://www.demos.co.uk/files/proamrevolutionfinal.pdf

Terras, M. (2010a) Crowdsourcing cultural heritage: UCL’s Transcribe Bentham project. Presented at: Seeing Is Believing: New Technologies For Cultural Heritage. International Society for Knowledge Organization, UCL (University College London). http://eprints.ucl.ac.uk/20157/

Terras, M. (2010b). “Digital Curiosities: Resource Creation via Amateur Digitization.” Literary and Linguistic Computing 25, no. 4 (October 14, 2010): 425–438. http://llc.oxfordjournals.org/cgi/doi/10.1093/llc/fqq019

2013 in review: crowdsourcing, digital history, visualisation, and lots and lots of words

A quick and incomplete summary of my 2013 for those days when I wonder where the year went… My PhD was my main priority throughout the year, but the slow increase in word count across my thesis is probably only of interest to me and my supervisors (except where I’ve turned down invitations to concentrate on my PhD). Various other projects have spanned the years: my edited volume on ‘Crowdsourcing our Cultural Heritage’, working as a consultant on the ‘Let’s Get Real’ project with Culture24, and continuing to work with the Open University Digital Humanities Steering Group and ACH, and to chair the Museums Computer Group.

In January (and April/June) I taught all-day workshops on ‘Data Visualisation for Analysis in Scholarly Research‘ and ‘Crowdsourcing in Libraries, Museums and Cultural Heritage Institutions‘ for the British Library’s Digital Scholarship Training Programme.

In February I was invited to give a keynote on ‘Crowd-sourcing as participation‘ at iSay: Visitor-Generated Content in Heritage Institutions in Leicester (my event notes). This was an opportunity to think through the impact of the ‘close reading’ people do while transcribing text or describing images, crowdsourcing as a form of deeper engagement with cultural heritage, and the potential for ‘citizen history’ this creates (also finally bringing together my museum work and my PhD research). This later became an article for Curator journal, From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing (proof copy available at http://oro.open.ac.uk/39117). I also ran a workshop on ‘Data visualisation for humanities researchers’ with Dr. Elton Barker (one of my PhD supervisors) for the CHASE ‘Going Digital‘ doctoral training programme.

In March I was in the US for THATCamp Feminisms in Claremont, California (my notes), to do a workshop on Data visualisation as a gateway to programming, and I gave a paper on ‘New Challenges in Digital History: Sharing Women’s History on Wikipedia’ at the ‘Women’s History in the Digital World’ conference at Bryn Mawr, Philadelphia (posted as ‘New challenges in digital history: sharing women’s history on Wikipedia – my draft talk notes’). I also wrote an article for Museum Identity magazine, Where next for open cultural data in museums?.

In April I gave a paper, ‘A thousand readers are wanted, and confidently asked for’: public participation as engagement in the arts and humanities, on my PhD research at Digital Impacts: Crowdsourcing in the Arts and Humanities (see also my notes from the event), and a keynote on ‘A Brief History of Open Cultural Data’ at GLAM-WIKI 2013.

In May I gave an online seminar on crowdsourcing (with a focus on how it might be used in teaching undergraduates wider skills) for the NITLE Shared Academics series. I gave a short paper on ‘Digital participation and public engagement’ at the London Museums Group‘s ‘Museums and Social Media’ at Tate Britain on May 24, and was in Belfast for the Museums Computer Group’s Spring meeting, ‘Engaging Visitors Through Play‘ then whipped across to Venice for a quick keynote on ‘Participatory Practices: Inclusion, Dialogue and Trust‘ (with Helen Weinstein) for the We Curate kick-off seminar at the start of June.

In June the Collections Trust and MCG organised a Museum Informatics event in York and we organised a ‘Failure Swapshop‘ the evening before. I also went to Zooniverse’s ZooCon (my notes on the citizen science talks) and to Canterbury Cathedral Archives for a CHASE event on ‘Opening up the archives: Digitization and user communities’.

In July I chaired a session on Digital Transformations at the Open Culture 2013 conference in London on July 2, gave an invited lightning talk at the Digital Humanities Oxford Summer School 2013, ran a half-day workshop on ‘Designing successful digital humanities crowdsourcing projects’ at the Digital Humanities 2013 conference in Nebraska, and had an amazing time making what turned out to be Serendip-o-matic at the Roy Rosenzweig Center for History and New Media at George Mason University’s One Week, One Tool in Fairfax, Virginia (my posts on the process), with a museumy road trip via Amtrak and Greyhound to Chicago, Cleveland and Pittsburgh in between the two events.

In August I tidied up some talk notes for publication as ‘Tips for digital participation, engagement and crowdsourcing in museums‘ on the London Museums Group blog.

October saw the publication of my Curator article and Creating Deep Maps and Spatial Narratives through Design with Don Lafreniere and Scott Nesbit for the International Journal of Humanities and Arts Computing, based on our work at the Summer 2012 NEH Advanced Institute on Spatial Narrative and Deep Maps: Explorations in the Spatial Humanities. (I also saw my family in Australia and finally went to MONA).

In November I presented on ‘Messy understandings in code’ at Speaking in Code at UVA’s Scholars’ Lab, Charlottesville, Virginia, gave a half-day workshop on ‘Data Visualizations as an Introduction to Computational Thinking’ at the University of Manchester and spoke at the Digital Humanities at Manchester conference the next day. Then it was down to London for the MCG’s annual conference, Museums on the Web 2013 at Tate Modern. Later that month I gave a talk on ‘Sustaining Collaboration from Afar’ at Sustainable History: Ensuring today’s digital history survives.

In December I went to Hannover, Germany for the Herrenhausen Conference: “(Digital) Humanities Revisited – Challenges and Opportunities in the Digital Age”, where I presented on ‘Creating a Digital History Commons through crowdsourcing and participant digitisation’ (my lightning talk notes and poster are probably the best representation of how my PhD research on public engagement through crowdsourcing and historians’ contributions to scholarly resources through participant digitisation are coming together). In the final days of 2013, I went back to my old museum metadata games, updated them to include images from the British Library and took a first pass at making them responsive for mobile and tablet devices.

DHOxSS: ‘From broadcast to collaboration: the challenges of public engagement in museums’

I’m just back from giving a lightning talk for the Cultural Connections strand of the Digital.Humanities@Oxford Summer School 2013, and since the projector wasn’t working to show my examples during my talk, I thought I’d share my notes (below) and some quick highlights from the other presentations.

Mark Doffman said that it’s important that academic work challenges and provokes, but that you should make sure you get headlines for the right reasons, not e.g. for how much the project costs. He concluded that impact is about provocation, not just getting people to say your work is wonderful.

Gurinder Punn of the university’s Isis Innovation made the point that intellectual property and expertise can be transferred into businesses by consulting through your department or personally. (And it’s not just for senior academics – one of the training sessions offered to PhD students at the Open University is ‘commercialising your research’).

Giles Bergel @ChapBookPro spoke on the Broadside Ballads Online (blog), explaining that folksong scholarship is often outside academia – there’s a lot of vernacular scholarship and all sorts of domain specialists including musicians. They’ve considered crowdsourcing but want to be in a position to take the contributions as seriously as any print accession. They also have an image-match demonstrator from Oxford’s Visual Geometry Group which can be used to find similar images on different ballad sheets.

Christian von Goldbeck-Stier offered some reflections on working with conductors as part of his research on Wagner. And perfectly for a summer’s day:

Christian quotes Wilde on beauty: “one of the great facts of the world, like sunlight, or springtime…” http://t.co/8qGE9tLdBZ #dhoxss
— Pip Willcox (@pipwillcox) July 11, 2013

My talk notes: ‘From broadcast to collaboration: the challenges of public engagement in museums’

I’m interested in academic engagement from two sides – for the past decade or so I was a museum technologist; now I’m a PhD student in the Department of History at the Open University, where I’m investigating the issues around academic and ‘amateur’ historians and scholarly crowdsourcing.

As I’ve moved into academia, I’ve discovered there’s often a disconnect between academia and museum practice (to take an example I know well), and that their different ways of working can make connecting difficult, even before they try to actually collaborate. But it’s worth it because the reward is more relevant, cutting-edge research that directly benefits practitioners in the relevant fields and has greater potential impact.

I tend to focus on engagement through participation and crowdsourcing, but engagement can be as simple as blogging about your work in accessible terms: sharing the questions that drive your research, how you’ve come to some answers, and what that means for the world at large; or writing answers to common questions from the public alongside journal articles.

Plan it

For a long time, museums worked with two publics: visitors and volunteers. They’d ask visitors what they thought in ‘have your say’ interactives but, to be honest, they often didn’t listen to the answers. They’d also work with volunteers, but sometimes valued volunteers’ productivity more than the other kinds of knowledge they brought. But things are more positive these days – you’ve already heard a lot about crowdsourcing as a key example of more productive engagement.

Public engagement works better when it’s incorporated into a project from the start. Museums are exploring co-curation – working with the public to design exhibitions. Museums are recognising that they can’t know everything about a subject, and figuring out how to access knowledge ‘out there’ in the rest of the world. In the Oramics project at the Science Museum (e.g. Oramics to Electronica or Engaging enthusiasts online), electronic musicians were invited to co-curate an exhibition to help interpret an early electronic instrument for the public. 

There’s a model from ‘Public Participation in Scientific Research’ (or ‘citizen science’) I find useful in my work when thinking about how much agency the public has in a project, and it’s also useful for planning engagement projects. Where can you benefit from questions or contributions from the public, and how much control are you willing to give up? 

  • Contributory projects designed by scientists, with participants involved primarily in collecting samples and recording data;
  • Collaborative projects in which the public is also involved in analyzing data, refining project design, and disseminating findings;
  • Co-created projects are designed by scientists and members of the public working together, and at least some of the public participants are involved in all aspects of the work.
(Source: Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education (full report, PDF, 3 MB))

Do it

Museums have learnt that engaging the public means getting out of their venues (and their comfort zones). One example is Wikipedians-in-Residence, including working with Wikipedians to share images, hold events and contribute to articles (e.g. The British Museum and Me, A Wikipedian-in-Residence at the British Museum, The Children’s Museum’s Wikipedian in Residence).
It’s not always straightforward – museums don’t do ‘neutral’ points of view, which is a key goal for Wikipedia. Museums are object-centric, Wikipedia is knowledge-centric. Museums are used to individual scholarship and institutional credentials, Wikipedia is consensus-driven and your only credentials are your editing history and your references. Museums are slowly learning to share authority, to trust the values of other platforms. You need to invest time to learn what drives the other groups, how to talk with them and you have to be open to being challenged.

Mean it

Done right, engagement should be transformative for all sides. According to the National Co-ordinating Centre for Public Engagement, engagement ‘is by definition a two-way process, involving interaction and listening, with the goal of generating mutual benefit.’ Saying something is ‘open to the public’ is easy; making efforts to make sure that it’s intellectually and practically accessible takes more effort; active outreach is a step beyond open. It’s not the same as marketing – it may use the same social media channels, but it’s a conversation, not a broadcast. It’s hard to fake being truly engaged (and it’s rude) so you have to mean it – doing it cynically doesn’t help anyone.

Asking people to do work that helps your mission is a double win. For example, Brooklyn Museum’s ‘Freeze Tag’ asks members of their community to help moderate tags entered by people elsewhere – they’re trusting members of the community to clean up content for them.

Enjoy it

My final example is the National Library of Ireland on Flickr Commons, who do a great job of engaging people in Irish history, partly through their enthusiasm for the subject and partly through the effort they put into collating comments and updating their records, showing how much they value contributions. 

Almost by definition, any collaboration around engagement will be with people who are interested in your work, and they’ll bring new perspectives to it. You might end up working with international peers, academics from different disciplines, practitioner groups, scholarly amateurs or kids from the school down the road. And it’s not all online – running events is a great way to generate real impact and helps start conversations with potential for future collaboration.

You might benefit too! Talking about your research sometimes reminds you why you were originally interested in it… It’s a way of looking back and seeing how far you’ve come. It’s also just plain rewarding seeing people benefit from your research, so it’s worth doing well.


Thanks again to Pip Willcox for the invitation to speak, and to the other speakers for their fascinating perspectives. Participation and engagement lessons from cultural heritage and academia are a bit of a hot topic at the moment – there’s more on this (including notes from a related paper I gave with Helen Weinstein) at Participatory Practices.

Setting off small fireworks: leaving space for curiosity

Remember when blog posts didn’t need titles, didn’t need to be long or take ages to write, and had nothing to do with your ‘personal brand’? I’ve realised that while I’m writing up the PhD I’ll barely blog at all if I don’t blog like it’s 2007 and just share interesting stuff when I’ve got a moment. Here goes…

I’ve been interested in the role of curiosity in engaging people with museum collections since I evaluated museum ‘tagging’ crowdsourcing games for my MSc project and learnt that the randomness of the objects presented made players really curious about what would appear next, and in turn that curiosity was one reason they kept playing. It turns out other metadata game designers have noticed the same effect. Flanagan and Carini (2012) wrote: ‘Curiosity and doubt are key design opportunities. … In a number of instances, players became so curious about the images they were tagging that they would tag images with inquiry phrases, such as “want to know more about this culture.”‘

I returned to ‘curiosity’ for a talk I gave at the iSay conference in Leicester, where I related it to Raddick et al’s (2009) ‘Levels of Engagement’ in citizen science, where Level 2 is participation in community discussion (e.g. forums on crowdsourcing sites) and Level 3 is ‘working independently on self-identified research projects’. To me, this suggested you should leave room for curiosity and wonder to develop – it might turn into a new personal journey for the participant or visitor, or even a new research question for a crowdsourcing project.

The reason I’m posting now is that I just came across Langer’s definition of ‘mindfulness’, quoted in Csikszentmihalyi and Hermanson (1995): the ‘state of mind that results from drawing novel distinctions, examining information from new perspectives, and being sensitive to context. It is an open, creative, probabilistic state of mind in which the individual might be led to finding differences among things thought similar and similarities among things thought different’ (Langer 1993, p.44). Further:

‘Exhibits that facilitate mindfulness display information in context and present various viewpoints. For example, Langer (1993, p.47) contrasts the statement “The three main reasons for the Civil War were…” with the statement “From the perspective of the white male living in the twentieth century, the main reasons for the Civil War were…” (p.47). The latter approach calls for thoughtful comparisons. For example, How did women feel during the Civil War? the old? the old from the North? the black male today? and so on.’

I don’t know about you, but my curiosity was piqued and my mind started going in lots of different directions. The second question carefully creates a gap just big enough to let a hundred new questions through and is a brilliant example of why both museum interpretation and participatory projects should leave room for curiosity…

Works cited:

  • Csikszentmihalyi, Mihaly, and Kim Hermanson. 1995. “Intrinsic Motivation in Museums: Why Does One Want to Learn?” In Public Institutions for Personal Learning: Establishing a Research Agenda, edited by John Falk and Lynn D. Dierking, 66 – 77. Washington D.C.: American Association of Museums. [This is seriously ace, track down a copy if you can]
  • Flanagan, Mary, and Peter Carini. 2012. “How Games Can Help Us Access and Understand Archival Images.” American Archivist 75 (2): 514–537.
  • Raddick, M. Jordan, Georgia Bracey, K. Carney, G. Gyuk, K. Borne, J. Wallin, and S Jacoby. 2009. “Citizen Science: Status and Research Directions for the Coming Decade.” In Astro2010: The Astronomy and Astrophysics Decadal Survey. Vol. 2010. http://www8.nationalacademies.org/astro2010/DetailFileDisplay.aspx?id=454.

(Ok, so a post with references is not exactly blogging like it’s 2006, but you’ve got to start somewhere…)
(Someone is literally setting off fireworks somewhere nearby. I have no idea why.)
(And yeah, I am working on a Saturday night. Friends don’t let friends do PhDs, innit.)

We’re all looking at the stars: citizen science projects at ZooCon13

Last Saturday I escaped my desk to head to the Physics department at the University of Oxford and be awed by what we’re learning about space (and more terrestrial subjects) through citizen science projects run by Zooniverse at ZooCon13. All the usual caveats about notes from events apply – in particular, assume any errors are mine and that everyone was much more intelligent and articulate than my notes make them sound. These notes are partly written for people in cultural heritage and the humanities who are interested in the design of crowdsourcing projects, and while I enjoyed the scientific presentations I am not even going to attempt to represent them!  Chris Lintott live-blogged some of the talks on the day, so check out ‘Live from ZooCon‘ for more. If you’re familiar with citizen science you may well know a lot of these examples already – and if you’re not, you can’t really go wrong by looking at Zooniverse projects.

Aprajita Verma kicked off with SpaceWarps and ‘Crowd-sourcing the Discovery of Gravitational Lenses with Citizen Scientists’. She explained the different ways gravitational lenses show up in astronomical images, and that ‘strong gravitational lensing research is traditionally very labour-intensive’ – computer algorithms generate lots of false positives, so you need people to help. SpaceWarps includes some simulated lenses (i.e. images of the sky with lenses added), mostly as a teaching tool (to provide more examples and increase familiarity with what lenses can look like) but also to make it more interesting for participants. The SpaceWarps interface lets you know when you’ve missed a (simulated, presumably) lens as well as noting lenses you’ve marked. They had 2 million image classifications in the first week, and 8500 citizen scientists have participated so far, 40% of whom have participated in ‘Talk‘, the discussion feature. As discussed in their post ‘What happens to your markers? A look inside the Space Warps Analysis Pipeline‘, they’ve analysed the results so far on ranges between astute/obtuse and pessimistic/optimistic markers – it turns out most people are astute. Each image is reviewed by ten people, so they’ve got confidence in the results.
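Very loosely, a pipeline like this can estimate each volunteer’s skill from the known-truth (simulated) images and then use those estimates to weight their classifications of real images. The sketch below is my own simplified illustration of that general idea, not the actual Space Warps Analysis Pipeline, and the skill numbers are invented:

```python
# A simplified, hypothetical illustration of skill-weighted aggregation in
# the spirit of the analysis described above; not the project's real code.

def skill_from_training(responses):
    """Estimate a volunteer's skill from images with known answers, e.g.
    simulated lenses. Returns the rate of marking real lenses ('astute' vs
    'obtuse') and the rate of correctly passing over non-lenses
    ('pessimistic' vs 'optimistic'). Each response is (said_lens, is_lens)."""
    hits = sum(1 for said, truth in responses if said and truth)
    lenses = sum(1 for _, truth in responses if truth)
    rejections = sum(1 for said, truth in responses if not said and not truth)
    non_lenses = sum(1 for _, truth in responses if not truth)
    return hits / max(lenses, 1), rejections / max(non_lenses, 1)

def update(prior, said_lens, p_hit, p_reject):
    """One Bayesian update of an image's lens probability, given a single
    volunteer's classification and their estimated skill."""
    if said_lens:
        like_lens, like_not = p_hit, 1.0 - p_reject
    else:
        like_lens, like_not = 1.0 - p_hit, p_reject
    numerator = prior * like_lens
    return numerator / (numerator + (1.0 - prior) * like_not)

# Two simulated lenses (both marked) and two non-lenses (one wrongly
# marked) give this volunteer a skill estimate of (1.0, 0.5).
print(skill_from_training([(True, True), (True, True),
                           (True, False), (False, False)]))

# Lenses are rare, so start from a low prior; a few classifications from
# skilled volunteers (invented skills here) pull the probability up fast.
p = 0.01
for said_lens in [True, True, True]:
    p = update(p, said_lens, p_hit=0.9, p_reject=0.8)
print(round(p, 3))  # roughly 0.48 after three positive classifications
```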

Karen Masters talked about ‘Cosmic Evolution in the Galaxy Zoo’, taking us back to the first Galaxy Zoo project’s hopes to have 30,000 volunteers and contrasting that with subsequent peer-reviewed papers that thanked 85,000, or 160,000 or 200,000 volunteers. The project launched in 2007 (before the Zooniverse itself) to look at spiral vs elliptical galaxies and it’s all grown from there. The project has found rare objects, most famously the pea galaxies, and as further proof that the Zooniverse is doing ‘real science online’, the team have produced 36 peer-reviewed papers, some with 100+ citations. At least 50 more papers have been produced by others using their data.

Phil Brohan discussed ‘New Users for Old Weather’. The Old Weather project is using data from historic ships logs to help answer the question ‘is this climate change or just weather?’. Some data was already known but there’s a ‘metaphorical fog’ from missing observations from the past. Since the BBC won’t let him put a satellite in a Tardis, they’ve been creative about finding other sources to help lift ‘the fog of ignorance’. This project has long fascinated me because it started off all about science: in Phil’s words, ‘when we started all this, I was only thinking about the weather’, but ended up being about history as well: ‘these documents are intrinsically interesting’ – he learnt what else was interesting about the logs from project participants who discovered the stories of people, disasters and strange events that lay within them. The third thing the project has generated (after weather and history) is ‘a lot of experts’. One example he gave was evidence of the 1918-19 Spanish flu epidemic on board ship, which was investigated after forum posts about it. There’s still a lot to do – more logs, including possibly French and Dutch – to come, and things would ideally speed up ‘by a factor of ten’.

In Brooke Simmons’ talk on ‘Future plans for Galaxy Zoo’, she raised the eternal issue of what to call participants in crowdsourcing: ‘just call everyone collaborators’. ‘Citizen scientists’ makes a distinction between paid and unpaid scientists, as does ‘volunteers’. She wants to help people do their own science, and the team are working on making that easier than downloading data and learning to use more complicated analysis tools. As an example, she talked about people collecting ‘galaxies with small bulges’ and analysing the differences in bulges (like a souped-up Galaxy Zoo Navigator?). She also talked about Zoo Teach, with resources for learning at all ages.

After the break we learnt about ‘The Planet 4 Invasion’ and the climate and seasons of Mars from Meg Schwamb, and about Solar Stormwatch from Chris Davis in ‘Only you can save planet Earth!’; Chris was also presenting research by his student Kim Tucker-Wood (sp?). Who knew that solar winds could take the tail off a comet?!

Next up was Chris Lintott on ‘Planet Hunting with and without Kepler’. Science communication advice says ‘don’t show people graphs’, and since Planet Hunters asks people to look at graphs for fun, he thought no-one would want to take part. However, the response has surprised him. And ‘it turns out that stars are actually quite interesting as well’. In another example of participants going above and beyond the original scope of the project, participants watched a talk streamed online on ‘heartbeat binaries’, went and found 30 of them in the archives and their own records, and posted them on the forum. Now a bunch of Planet Hunters are working with the Kepler team to follow them up. (As an aside, he showed a screenshot of a future journal paper – the journal couldn’t accept the idea that you could be a Planet Hunter and not be part of an academic team, so the participants are listed as the Department of Astronomy at Yale.)

The final speaker was Rob Simpson on ‘The Future of the Zooniverse’. To put things in context, he said the human race spends 16 years cumulatively playing the game Angry Birds every day; people spend 2 months every day on the Zooniverse. In the past year, the human race spent 52 years on the Zooniverse’s 15 live projects (they’ve had 23 projects in total). The Andromeda Project went through all their data in 22 days – other projects take longer, but still attract dedicated people. In the Zooniverse’s immediate future are ‘tools for (citizen) scientists’ – adding the ability to do analysis in the browser, ‘because people have a habit of finding things, just by being given access to the data’. They’re also working on ‘Letters‘ – citable, public versions of what might otherwise be detailed forum posts; as a form of publication, it puts findings ‘in the domain’. They’re helping people communicate with each other, and embracing their ‘machine overlords’ by using Galaxy Zoo as a training tool for machine learning. As computers get more powerful, the division of work between machines and people will change, perhaps leaving the beautiful, tricky or complex bits for humans. [Update, June 29, 2013: Rob’s posted about his talk on the Zooniverse blog, ’52 Years of Human Effort’, and corrected his original figure of 35 years to 52 years of human effort.]

At one point a speaker asked who in the room was a moderator on a Zooniverse project, and nearly everyone put their hand up. I felt a bit like giving them a round of applause because their hard work is behind the success of many projects. They’re also a lovely, friendly bunch, as I discovered in the pub afterwards.

Conversations in the pub also reminded me of the flipside of people learning so much through these projects – sometimes people lose interest in the original task as their skills and knowledge grow, and it can be tricky to find time to contribute outside of moderating.  After a comment by Chris at another event I’ve been thinking about how you might match people to crowdsourcing projects or tasks – sometimes it might be about finding something that suits their love of the topic, or that matches the complexity or type of task they’ve previously enjoyed, or finding another unusual skill to learn, or perhaps building really solid stepping stones from their current tasks to more complex ones. But it’s tricky to know what someone likes – I quite like transcribing text on sites like Trove or Notes from Nature, but I didn’t like it much on Old Weather. And my own preferences change – I didn’t think much of Ancient Lives the first time I saw it, but on another occasion I ended up getting completely absorbed in the task. Helping people find the right task and project is also a design issue for projects that have built an ‘ecosystem’ of parts that contribute to a larger programme, as discussed in ‘Using crowdsourcing to manage crowdsourcing’ in Frequently Asked Questions about crowdsourcing in cultural heritage and ‘A suite of museum metadata games?’ in Playing with Difficult Objects – Game Designs to Improve Museum Collections.

An event like ZooCon showed how much citizen science is leading the way – there are lots of useful lessons for humanities and cultural heritage crowdsourcing. If you’ve read this thinking ‘I’d love to try it for my data, but x is a problem’, try talking to someone about it – often there are computational techniques for solving similar problems, and if it’s not already solved it might be interesting enough that people want to get involved and work with you on it.

On the trickiness of crowdsourcing competitions: some lessons from Sydney Design

I generally maintain a diplomatic silence about crowdsourcing competitions when I’m talking about crowdsourcing in cultural heritage as I believe spec work (or asking people to invest time in creating designs then paying just one ‘winner’) is unethical, and it’s really tricky for design competitions to avoid looking like ‘spec work’. I discovered this for myself when I ran the ‘Cosmic Collections’ mashup competition, so I have a lot of sympathy for museums who unknowingly get it wrong when experimenting with crowdsourcing. I also tend not to talk about poorly conceived or executed crowdsourcing projects as it doesn’t seem fair to single out cultural heritage institutions that were trying to do the right thing against odds that ended up beating them, but I think the lessons to be drawn from the Sydney Design festival’s competition are important enough to discuss here.

‘Is it a free poster yet?’

A crowdsourcing competition model that the museum had previously applied successfully (the Lace Award and Trainspotting, with prizes up to AU$20,000 and display in the exhibition for winning designs) had a very different reception when the context and rewards changed. When the Powerhouse Museum’s design competition to produce the visual identity for the Sydney Design festival was launched with a US$1,000 prize, the design community’s sensitivity to spec work and ‘free pitching’ was triggered, and they started throwing in sarcastic responses. The public feedback loop – entrants could see previous designs and realised their own would also be featured on the site – had the 4chan-ish feel of a fun new meme about it, and once the norm of satirical responses was set, it was only going to escalate.

More importantly, there was a sense that Sydney Design was pulling a swifty. As Kate Sweetapple puts it in How the Sydney Design festival poster competition went horribly wrong:

‘The fundamental difference [to the previous competitions], however, is that by running the competition, the Museum pulled a substantial job – worth tens of thousands of dollars – out of the professional marketplace. The submissions to Love Lace and Trainspotting did not have a commercial context one year, and none the next.’

If the previous reward was mostly monetary, replacing it with a lesser, intrinsic reward is unlikely to work. If there’s a bigger reward than the competition brief itself would suggest, one important lesson is to make it unavoidably obvious. In this case, the Sydney Design team’s response said ‘the Museum would have engaged the winning designer for further work and remuneration required to roll out the winning design into a more comprehensive marketing campaign’, but this wasn’t clear in the original brief. Many museum competitions display highly-ranked entries in their gallery spaces, and being exhibited in the museum or festival spaces might have been another valid form of reward – but only if it worked as an aspiration for the competition’s audience, who in this case might well have a breadth of experience and exposure that rendered it less valuable.

Finally, in working with museums online, I’ve noticed the harshness of criticism is often proportionate to how deeply people care about you or identify you with certain values they hold dear.  When you’re a beloved institution, people who care deeply about you feel betrayed when you get things wrong. As one commentator said in With friends like these, who needs enemies?, ‘Sydney Design are meant to be in our corner’. If you regard critics as ‘critical friends’ you can turn the relationship around (as Merel van der Vaart discusses in the ‘Opening up’ section of her post on lessons from the Science Museum’s Oramics exhibition) and build an even stronger relationship with them. Maybe Sydney Design can still turn this around…

Notes from ‘Crowdsourcing in the Arts and Humanities’

Last week I attended a one-day conference, ‘Digital Impacts: Crowdsourcing in the Arts and Humanities‘ (#oxcrowd), convened by Kathryn Eccles of Oxford’s Internet Institute, and I’m sharing my (sketchy, as always) notes in the hope that they’ll help people who couldn’t attend.

Stuart Dunn reported on the Humanities Crowdsourcing scoping report (PDF) he wrote with Mark Hedges, and noted that if we want humanities crowdsourcing to take off we should move beyond crowdsourcing as a business model and look to form, nurture and connect with communities.

Alice Warley and Andrew Greg presented a useful overview of the design decisions behind the Your Paintings Tagger, and sparked some discussion on how many people need to view a painting before it’s ‘completed’ and on the differences between structured and unstructured tagging. Interestingly, paintings can be ‘retired’ from the Tagger once enough data has been gathered – I personally think the inherent engagement in tagging is valuable enough to keep paintings taggable forever, even if they’re not prioritised in the tagging interface.

Kate Lindsay brought a depth of experience to her presentation on ‘The Oxford Community Collection Model’ (as seen in Europeana 1914-1918 and RunCoCo’s 2011 report on ‘How to run a community collection online‘ (PDF)). Some of the questions brought out the importance of planning for sustainability in technology, licences, etc., and the role of existing networks of volunteers with the expertise to help review objects on the community collection days.

The role of the community in ensuring the quality of crowdsourced contributions was also discussed in Kimberly Kowal’s presentation on the British Library’s Georeferencer project. She reflected on what she’d learnt after the first phase of the project, including that the inherent reward of participating in the activity was a bigger motivator than competitiveness, and on the impact on the British Library itself, which has opened up data for wider digital uses and has more crowdsourcing projects planned.

I gave a paper based on an earlier post, The gift that gives twice: crowdsourcing as productive engagement with cultural heritage, but pushed my thinking about crowdsourcing as a tool for deep engagement with museums and other memory organisations even further. I also succumbed to the temptation to play with my own definitions of crowdsourcing in cultural heritage: ‘a form of engagement that contributes towards a shared, significant goal or research question by asking the public to undertake tasks that cannot be done automatically’ or ‘productive public engagement with the mission and work of memory institutions’.

Chris Lintott of Galaxy Zoo fame shared his definition of success for a crowdsourcing/citizen science project: it has to produce results of value to the research community in less time than could have been achieved by other means (i.e. it must achieve something with the crowd that couldn’t have been achieved without them), and discussed how the Ancient Lives project challenged that at first by turning ‘a few thousand papyri they didn’t have time to transcribe into several thousand data points they didn’t have time to read’. While ‘serendipitous discovery is a natural consequence of exposing data to large numbers of users’ (in the words of the Citizen Science Alliance), they wanted a more sophisticated method for recording potential discoveries experts made while engaging with the material, so they built a focused ‘talk‘ tool which can programmatically filter the most interesting unanswered comments and email them to their 30 or 40 expert users. They also have Letters for more structured, journal-style reporting. (I hope I have that right.) He also discussed decisions around full-text transcriptions (difficult to reconcile automatically) vs ‘rich metadata’ – more structured indexes of the content of a page, which contain enough information to help historians decide which pages to transcribe in full for themselves.
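[The comment-filtering idea is simple enough to sketch. Purely as an illustration – this isn’t the Zooniverse’s code, and every field name below is invented – surfacing unanswered comments for a small pool of experts might look something like this:]

```python
# Illustrative only: pick out interesting comments nobody has answered yet
# and batch them into a digest for expert users. All field names are invented.

def unanswered(comments, min_votes=2):
    """Keep comments with no replies, most 'interesting' (highest-voted) first."""
    pending = [c for c in comments if c['replies'] == 0 and c['votes'] >= min_votes]
    return sorted(pending, key=lambda c: c['votes'], reverse=True)

def expert_digest(comments, experts, batch_size=10):
    """Pair each expert with the same small batch for a periodic email."""
    batch = unanswered(comments)[:batch_size]
    return [(expert, batch) for expert in experts]

comments = [
    {'id': 1, 'replies': 0, 'votes': 5, 'text': 'Unusual script on this fragment?'},
    {'id': 2, 'replies': 3, 'votes': 9, 'text': 'Already being discussed'},
]
print(expert_digest(comments, ['expert@example.org']))  # only comment 1 is queued
```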

Some other thoughts that struck me during the day… humanities crowdsourcing has a lot to learn from the application of maths and logic in citizen science – lots of problems (like validating data) that seem intractable can actually be solved algorithmically, and citizen science’s hypothesis-based approach to testing task and interface design would help humanities projects. Niche projects help solve the problem of putting the right obscure item in front of the right user (an issue I wrestled with during my short residency at the Powerhouse Museum last year – in hindsight, building niche projects could have meant a stronger call-to-action and no worries about getting people to navigate to the right range of objects). The variable role of forums, and participants’ relationships to the project owners and to each other, came up at various points – in some projects interactions with a central authority are more valued, in others community interactions are really important; I wonder how much it depends on the length and size of the project? The potential of ‘gamification’ and ‘badgeification’, and their possible negative impact on motivation, were also raised. I agree with Lintott that games require a level of polish that could mean you’d invest more in making them than you’d get back in value, but as a form of engagement that can create deeper relationships with cultural heritage and/or validate some procrastination over a cup of tea, I think they potentially have a wider value that balances that.

I was also asked to chair the panel discussion, which featured Kimberly Kowal, Andrew Greg, Alice Warley, Laura Carletti, Stuart Dunn and Tim Causer.  Questions during the panel discussion included:

  • ‘what happens if your super-user dies?’ (Super-users or super-contributors are the tiny percentage of people who do most of the work, as in this Old Weather post) – discussion included mass media as a numbers game, the idea that someone else will respond to the need/challenge, and asking your community how they’d reach someone like them. (This also helped answer the question ‘how do you find your crowd?’ that came in from Twitter.)
  • ‘have you ever paid anyone?’ Answer: no
  • ‘can you recruit participants through specialist societies?’ From memory, the answer was ‘yes but it does depend’.
  • something like ‘have you met participants in real life?’ – answer, yes, and it was an opportunity to learn from them, and to align the community, institution, subject and process.
  • ‘badgeification?’ Answer: the quality of the reward matters more than the levels (so badges are probably out).
  • ‘what happens if you force students to work on crowdsourcing projects?’ – one suggestion was to look for entries on Transcribe Bentham in a US English class blog
  • ‘what’s happened to tagging in art museums, where’s the new steve.museum or Brooklyn Museum?’ – is it normalised and not written about as much, or has it declined?
  • ‘how can you get funding for crowdsourcing projects?’. One answer – put a good application in to the Heritage Lottery Fund. Or start small, prove the value of the project and get a larger sum. Other advice was to be creative or use existing platforms. Speaking of which, last year the Citizen Science Alliance announced ‘the first open call for proposals by researchers who wish to develop citizen science projects which take advantage of the experience, tools and community of the Zooniverse. Successful proposals will receive donated effort of the Adler-based team to build and launch a new citizen science project’.
  • ‘can you tell in advance which communities will make use of a forum?’ – a great question that drew on various discussions of the role of communities of participants in supporting each other and devising new research questions
  • a question on ‘quality control’ provoked a range of responses, from the manual quality control in Transcribe Bentham to the high number of Taggers initially required for each painting in Your Paintings (which slowed things down), and led into a discussion of shallow vs deep interactions
  • the final questioner asked about documenting film with crowdsourcing and was answered by someone else in the audience, which seemed a very fitting way to close the day.
James Murray in his Scriptorium with thousands of word references sent in by members of the public for the first Oxford English Dictionary. Early crowdsourcing?

If you found this post useful, you might also like Frequently Asked Questions about crowdsourcing in cultural heritage or my earlier Museums and the Web paper on Playing with Difficult Objects – Game Designs to Improve Museum Collections.

The ever-morphing PhD

I wrote this for the NEH/Polis Summer Institute on deep mapping back in June but I’m repurposing it as a quick PhD update as I review my call for interview participants. I’m in the middle of interviews at the moment (and if you’re an academic historian working on British history 1600-1900 who might be willing to be interviewed I’d love to hear from you) and after that I’ll no doubt be taking stock of the research landscape, the findings from my interviews and project analyses, and updating the shape of my project as we go into the new year. So it doesn’t quite reflect where I’m at now, but at the very least it’s an insight into the difficulties of research into digital history methodologies when everything is changing so quickly:

“Originally I was going to build a tool to support something like crowdsourced deep mapping through a web application that would let people store and geolocate documents and images they were digitising. The questions that are particularly relevant for this workshop are: what happens when crowdsourcing or citizen history meet deep mapping? Can a deep map created by multiple people for their own research purposes support scholarly work? Can a synthetic, ad hoc collection of information be used to support an argument or would it be just for the discovery of spatio-temporally relevant material? How would a spatial narrative layer work?

I planned to test this by mapping the lives and intellectual networks of early scientific women. But after conducting a big review of related projects I eventually realised that there’s too much similar work going on in the field and that inevitably something similar would have been created by someone with more resources by the time I was writing up. So I had to rethink my question and my methods.

So now my PhD research seeks to answer ‘how do academic and family/local historians evaluate, use and contribute to crowdsourced resources, especially geo-located historical materials?’, with the goal of providing some insight into the impact of digitality on research practices and scholarship in the humanities. … How do trained and self-taught historians cope with changes in place names and boundaries over time, and with the many variations and similarities in place names? Does it matter if you’ve never been to a place and don’t know that it might be that messy and complex?

I’m interested in how living in a digital culture affects how researchers work. What does it mean to generate as well as consume digital data in the course of research? How does user-created content affect questions of authorship, authority and trust for amateur historians and scholarly practice? What are the characteristics of a well-designed digital resource, and how can resources and tools for researchers be improved? It’s a very Human-Computer Interaction/Informatics view of the digital humanities, but it addresses the issues around discoverability and usability that are so important for people building projects.

I’m currently interviewing academic, family and local historians, focusing on those researching people or places in early modern England – very loosely defined, as I’ll go 1600-1900. I’m asking them about the tools they currently use in their research; how they assess new resources; if or when they might use a resource created through crowdsourcing or user contributions (e.g. Wikipedia or ancestry.com); how they work out which online records to trust; and how they use place names or geographic locations in their research.

So far I’ve mostly analysed the interviews for how people think about crowdsourcing; I’ll focus on the responses to place when I get back.

More generally, I’m interested in the idea of ‘chorography 2.0’ – what would it look like now? The abundance of information is as much of a problem as an opportunity: how to manage that?”

Frequently Asked Questions about crowdsourcing in cultural heritage

Over time I’ve noticed the repetition of various misconceptions and apprehensions about crowdsourcing for cultural heritage and digital history, so since this is a large part of my PhD topic I thought I’d collect various resources together as I work to answer some FAQs. I’ll update this post over time in response to changes in the field, my research and comments from readers. While this is partly based on some writing for my PhD, I’ve tried not to be too academic and where possible I’ve gone for publicly accessible sources like blog posts rather than send you to a journal paywall.

If you’d rather watch a video than read, check out the Crowdsourcing Consortium for Libraries and Archives (CCLA)’s ‘Crowdsourcing 101: Fundamentals and Case Studies’ online seminar.

[Last updated: February 2016, to address ‘crowdsourcing steals jobs’. Previous updates added a link to CCLA events, crowdsourcing projects to explore and a post on machine learning+crowdsourcing.]

What is crowdsourcing?

Definitions are tricky. Even Jeff Howe, the author of ‘Crowdsourcing’, has two definitions:

The White Paper Version: Crowdsourcing is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.

The Soundbyte Version: The application of Open Source principles to fields outside of software.

For many reasons, the term ‘crowdsourcing’ isn’t appropriate for many cultural heritage projects but the term is such neat shorthand that it’ll stick until something better comes along. Trevor Owens (@tjowens) has neatly problematised this in The Crowd and The Library:

‘Many of the projects that end up falling under the heading of crowdsourcing in libraries, archives and museums have not involved large and massive crowds and they have very little to do with outsourcing labor. … They are about inviting participation from interested and engaged members of the public [and] continue a long standing tradition of volunteerism and involvement of citizens in the creation and continued development of public goods’

Defining crowdsourcing in cultural heritage

To summarise my own thinking and the related literature, I’d define crowdsourcing in cultural heritage as an emerging form of engagement with cultural heritage that contributes towards a shared, significant goal or research area by asking the public to undertake tasks that cannot be done automatically, in an environment where the tasks, goals (or both) provide inherent rewards for participation.

Screenshot from ‘Letters of 1916‘ project.

Who is ‘the crowd’?

Good question!  One tension underlying the ‘openness’ of the call to participate in cultural heritage is that there’s often a difference between the theoretical reach of a project (i.e. everybody) and its practical reach – the subset of ‘everybody’ with access to the materials needed (like a computer and an internet connection), the skills, experience and time…  While ‘the crowd’ may carry connotations of ‘the mob’, in ‘Digital Curiosities: Resource Creation Via Amateur Digitisation‘, Melissa Terras (@melissaterras) points out that many ‘amateur’ content creators are ‘extremely self motivated, enthusiastic, and dedicated’ and test the boundaries ‘between definitions of amateur and professional, work and hobby, independent and institutional’, and quotes Leadbeater and Miller’s ‘The Pro-Am Revolution‘ on people who pursue an activity ‘as an amateur, mainly for the love of it, but sets a professional standard’.

There’s more and more talk of ‘community-sourcing’ in cultural heritage, and it’s a useful distinction but it also masks the fact that nearly all crowdsourcing projects in cultural heritage involve a community rather than a crowd, whether they’re the traditional ‘enthusiasts’ or ‘volunteers’, citizen historians, engaged audiences, whatever.  That said, Amy Sample Ward has a diagram that’s quite useful for planning how to work with different groups. It puts the ‘crowd’ (people you don’t know), ‘network’ (the community of your community) and ‘community’ (people with a relationship to your organisation) in different rings based on their closeness to you.

‘The crowd’ is differentiated not just by their relationship to your organisation or by their skills and abilities, but also by their motivations for participating: some people participate in crowdsourcing projects for altruistic reasons, others because doing so furthers their own goals.

I’m worried about crowdsourcing because…

…isn’t letting the public in like that just asking for trouble?

@lottebelice said she’d heard people worry that ‘people are highly likely to troll and put in bad data/content/etc on purpose’ – but this rarely happens. People worried about this with user-generated content, too, and while kids in galleries delight in leaving rude messages about each other, it’s rare online.

It’s much more likely that people will mistakenly add bad data, but a good crowdsourcing project should build any necessary data validation into the project. Besides, there are generally much more interesting places to troll than a cultural heritage site.

And as Matt Popke pointed out in a comment, ‘When you have thousands of people contributing to an entry you have that many more pairs of eyes watching it. It’s like having several hundred editors and fact-checkers. Not all of them are experts, but not all of them have to be. The crowd is effectively self-policing because when someone trolls an entry, somebody else is sure to notice it, and they’re just as likely to fix it or report the issue’.  If you’re really worried about this, an earlier post on ‘Designing for participatory projects: emergent best practice‘ has some other tips.
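[If it helps to make ‘build validation in’ concrete: the most common pattern is simple redundancy, where a contribution only ‘sticks’ once enough independent contributors agree. A minimal sketch of my own – the thresholds are made up, and real projects tune them:]

```python
# A minimal sketch of redundancy-based validation: accept an answer for a
# task only once enough independent contributors agree. Thresholds invented.
from collections import Counter

def validate(responses, min_responses=5, agreement=0.6):
    """responses: raw answers for one task, e.g. ['oak', 'oak', 'elm', ...].
    Returns the accepted answer, or None if the task needs more eyes."""
    if len(responses) < min_responses:
        return None                      # not enough responses yet
    answer, count = Counter(responses).most_common(1)[0]
    return answer if count / len(responses) >= agreement else None

print(validate(['oak', 'oak', 'elm', 'oak', 'oak']))  # 'oak' (4/5 agree)
print(validate(['oak', 'elm']))                        # None: too few responses
```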

…doesn’t crowdsourcing take advantage of people?

XKCD on the ethics of commercial crowdsourcing

Sadly, yes, some of the activities that are labelled ‘crowdsourcing’ do. Design competitions that expect lots of people to produce full designs and pay a pittance (if anything) to the winner are rightly hated. (See antispec.com for more and a good list of links).

But in cultural heritage, no. Museums, galleries, libraries, archives and academic projects are in the fortunate position of having interesting work that involves an element of social good, and they also have hugely varied work, from microtasks to co-curated research projects. Crowdsourcing is part of a long tradition of volunteering and altruistic participation, and to quote Owens again, ‘Crowdsourcing is a concept that was invented and defined in the business world and it is important that we recast it and think through what changes when we bring it into cultural heritage.’

[Update, May 2013: it turns out museums aren’t immune from the dangers of design competitions and spec work: I’ve written On the trickiness of crowdsourcing competitions to draw some lessons from the Sydney Design competition kerfuffle.]

Anyway, crowdsourcing won’t usually work if it’s not done right. From A Crowd Without Community – Be Wary of the Mob:

“when you treat a crowd as disposable and anonymous, you prevent them from achieving their maximum ability. Disposable crowds create disposable output. Simply put: crowds need a sense of identity and community to achieve their potential.”

…crowdsourcing can’t be used for academic work

Reasons given include the belief that ‘humanists don’t like to share their knowledge’ with just anyone. And it’s possible that they don’t, but as projects like Transcribe Bentham and Trove show, academics and other researchers will share the work that helps produce that knowledge. (This is also something I’m examining in my PhD; I’ll post some early findings after the Digital Humanities 2012 conference in July.)

Looking beyond transcription and other forms of digitisation, it’s worth checking out Prism, ‘a digital tool for generating crowd-sourced interpretations of texts’.

…it steals jobs

Once upon a time, people starting a career in academia or cultural heritage could get jobs as digitisation assistants, or they could work on a scholarly edition. Sadly, that’s not the case now, but that’s probably more to do with year upon year of funding cuts. Blame the bankers, not the crowdsourcers.

The good news? Crowdsourcing projects can create jobs – participatory projects need someone to act as community liaison, to write the updates that demonstrate the impact of crowdsourced contributions, to explain the research value of the project, to help people integrate it into teaching, to organise challenges and editathons and more.

What isn’t crowdsourcing?

…’the wisdom of the crowds’?

Which is not just another way of saying ‘crowd psychology’, either (another common furphy). As Wikipedia puts it, ‘the wisdom of the crowds‘ is based on ‘diverse collections of independently-deciding individuals’. Handily, Trevor Owens has just written a post addressing the topic: Human Computation and Wisdom of Crowds in Cultural Heritage.

…user-generated content

So what’s the difference between crowdsourcing and user-generated content? The lines are blurry, but crowdsourcing is inherently productive – the point is to get a job done, whether that’s identifying people or things, creating content or digitising material.

Conversely, the value of user-generated content lies in the act of creating it rather than in the content itself – for example, museums might value the engagement in a visitor thinking about a subject or object and forming a response to it in order to comment on it. Once posted it might be displayed as a comment or counted as a statistic somewhere but usually that’s as far as it goes.

And as @sherah1918 pointed out, there’s a difference between asking for assistance with tasks and asking for feedback or comments: ‘A comment book or a blog w/comments isn’t crowdsourcing to me … nor is asking ppl to share a story on a web form. That is a diff appr to collecting & saving personal histories, oral histories’.

…other things that aren’t crowdsourcing:

[Heading inspired by Sheila Brennan @sherah1918]

  • Crowdfunding (it’s often just asking for micro-donations, though it seems that successful crowdfunding projects have a significant public engagement component, which brings them closer to the concerns of cultural heritage organisations. It’s also not that new. See Seventeenth-century crowd funding for one example.)
  • Data-mining social media and other content (though I’ve heard this called ‘passive’ or ‘implicit’ crowdsourcing)
  • Human computation (though it might be combined with crowdsourcing)
  • Collective intelligence (though it might also be combined with crowdsourcing)
  • General calls for content, help or participation (see ‘user-generated content’) or vaguely asking people what they think about an idea. Asking for feedback is not crowdsourcing. Asking for help with your homework isn’t crowdsourcing, as it only benefits you.
  • Buzzwords applied to marketing online. And as @emmclean said, “I think many (esp mkting) see “crowdsourcing” as they do “viral” – just happens if you throw money at it. NO!!! Must be great idea” – it must make sense as a crowdsourced task.

Ok, so what’s different about crowdsourcing in cultural heritage?

For a start, the process is as valuable as the result. Owens has a great post on this, Crowdsourcing Cultural Heritage: The Objectives Are Upside Down, where he says:

‘The process of crowdsourcing projects fulfills the mission of digital collections better than the resulting searches… Far better than being an instrument for generating data that we can use to get our collections more used it is actually the single greatest advancement in getting people using and interacting with our collections. … At its best, crowdsourcing is not about getting someone to do work for you, it is about offering your users the opportunity to participate in public memory … it is about providing meaningful ways for the public to enhance collections while more deeply engaging and exploring them’.

And as I’ve said elsewhere, ‘playing [crowdsourcing] games with museum objects can create deeper engagement with collections while providing fun experiences for a range of audiences’. (For definitions of ‘engagement’ see the Culture and Sport Evidence (CASE) programme’s 2011 report, Evidence of what works: evaluated projects to drive up engagement (PDF).)

What about cultural heritage and citizen science?

[This was written in 2012. I’ve kept it for historical reasons but think differently now.]

First, another definition. As Fiona Romeo writes, ‘Citizen science projects use the time, abilities and energies of a distributed community of amateurs to analyse scientific data. In doing so, such projects further both science itself and the public understanding of science’. As Romeo points out in a different post, ‘All citizen science projects start with well-defined tasks that answer a real research question’, while citizen history projects rarely if ever seem to be based around specific research questions but are aimed more generally at providing data for exploration. Process vs product?

I’m still thinking through the differences between citizen science and citizen history, particularly where they meet in historical projects like Old Weather. Both citizen science and citizen history achieve some sort of engagement with the mindset and work of the equivalent professional occupations, but are the traditional differences between scientific and humanistic enquiry apparent in crowdsourcing projects? Are tools developed for citizen science suitable for citizen history? Does it make a difference that it’s easier to take a new interest in history further without a big investment in learning and access to equipment?

I have a feeling that ‘citizen science’ projects are often more focused on producing data as accurately and efficiently as possible, while ‘citizen history’ projects end up being as much about engaging people with the content as they are about content production. But I’m very open to challenges on this…

What kind of cultural heritage stuff can be crowdsourced?

I wrote this list of ‘Activity types and data generated’ over a year ago for my Masters dissertation on crowdsourcing games for museums and a subsequent paper for Museums and the Web 2011, Playing with Difficult Objects – Game Designs to Improve Museum Collections (which also lists validation types and requirements).  This version should be read in the light of discussion about the difference between crowdsourcing and user-generated content and in the context of things people can do with museums and with games, but it’ll do for now:

Each activity type, with the data it generates:

  • Tagging (e.g. steve.museum, Brooklyn Museum Tag! You’re It; variations include two-player ‘tag agreement’ games like Waisda?, extensions such as guessing games e.g. GWAP ESP Game, Verbosity, Tiltfactor Guess What?; structured tagging/categorisation e.g. GWAP Verbosity, Tiltfactor Cattegory) – tags; folksonomies; multilingual term equivalents; structured tags (e.g. ‘looks like’, ‘is used for’, ‘is a type of’).
  • Debunking (e.g. flagging content for review and/or researching and providing corrections) – flagged dubious content; corrected data.
  • Recording a personal story – oral histories; contextualising detail; eyewitness accounts.
  • Linking (e.g. linking objects with other objects, objects to subject authorities, objects to related media or websites; e.g. MMG Donald) – relationship data; contextualising detail; information on history, workings and use of objects; illustrative examples.
  • Stating preferences (e.g. choosing between two objects e.g. GWAP Matchin; voting on or ‘liking’ content) – preference data; subsets of ‘highlight’ objects; ‘interestingness’ values for content or objects for different audiences; may also provide information on the reason for the choice.
  • Categorising (e.g. applying structured labels to a group of objects, collecting sets of objects or guessing the label for or relationship between a presented set of objects) – relationship data; preference data; insight into audience mental models; group labels.
  • Creative responses (e.g. writing an interesting fake history for a known object, or the purpose of a mystery object) – relevance; interestingness; ability to act as a social object; insight into common misconceptions.

You can also divide crowdsourcing projects into ‘macro’ and ‘micro’ tasks – giving people a goal and letting them solve it as they prefer, vs small, well-defined pieces of work – as in the ‘Umbrella of Crowdsourcing’ at The Daily Crowdsource; there’s also a fair bit of academic literature on other ways of categorising and describing crowdsourcing.

Using crowdsourcing to manage crowdsourcing

There’s also a growing body of literature on ecosystems of crowdsourcing activities, where different tasks and platforms target different stages of the process.  A great example is Brooklyn Museum’s ‘Freeze Tag!’, a game that cleans up data added in their tagging game. An ecosystem of linked activities (or games) can maximise the benefits of a diverse audience by providing a range of activities designed for different types of participant skills, knowledge, experience and motivations; and can encompass different levels of participation from liking, to tagging, finding facts and links.

A participatory ecosystem can also resolve some of the difficulties around validating specialist tags or long-form, more subjective content, by circulating content between activities for validation and ranking for correctness, ‘interestingness’ (etc.) by other players (see for example the ‘Contributed data lifecycle’ diagram in my MW2011 paper or the ‘Digital Content Life Cycle’ for crowdsourcing in Oomen and Aroyo’s paper below). As Nina Simon said in The Participatory Museum, ‘By making it easy to create content but impossible to sort or prioritize it, many cultural institutions end up with what they fear most: a jumbled mass of low-quality content’.  Crowdsourcing the improvement of cultural heritage data would also make possible non-crowdsourcing engagement projects that need better content to be viable.
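[To make the ecosystem idea concrete, here’s a deliberately simplified sketch – the stage names are invented, not from any real project – of content circulating between activities according to its validation state:]

```python
# A sketch of the 'ecosystem' idea, with invented stage names: content
# circulates between activities, each producing data the next can check.
PIPELINE = {
    'untagged':  'tagging_game',      # generate candidate tags
    'tagged':    'validation_game',   # other players confirm or reject tags
    'validated': 'linking_activity',  # enrich with relationships
    'flagged':   'curator_review',    # dubious content goes to staff
}

def next_activity(item):
    """Route an item to the activity matching its current state."""
    return PIPELINE.get(item['state'], 'done')

print(next_activity({'state': 'tagged'}))  # -> 'validation_game'
```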

See also Raddick, MJ, and Georgia Bracey. 2009. “Citizen Science: Status and Research Directions for the Coming Decade” on bridging between old and new citizen science projects to aid volunteer retention, and Nov, Oded, Ofer Arazy, and David Anderson. 2011. “Dusting for Science: Motivation and Participation of Digital Citizen Science Volunteers” on creating ‘dynamic contribution environments that allow volunteers to start contributing at lower-level granularity tasks, and gradually progress to more demanding tasks and responsibilities’.

What does the future of crowdsourcing hold?

Platforms aimed at bootstrapping projects – that is, getting new projects up and running as quickly and as painlessly as possible – seem to be the next big thing. Designing tasks and interfaces suitable for mobile and tablets will allow even more of us to help out while killing time. There’s also a lot of work on the integration of machine learning and human computation; my post ‘Helping us fly? Machine learning and crowdsourcing‘ has more on this.
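[One common pattern for that integration – again, a hedged sketch rather than any particular project’s pipeline, with stand-in data throughout – is to train a classifier on consensus crowd labels, let the machine decide the high-confidence cases, and queue the uncertain ones for human eyes:]

```python
# A sketch of one machine learning + crowdsourcing pattern: train on crowd
# labels, send only low-confidence items back to people. Illustrative only.
from sklearn.linear_model import LogisticRegression
import numpy as np

def split_work(model, features, threshold=0.9):
    """Return (machine_decided, needs_human) lists of item indexes."""
    confidence = model.predict_proba(features).max(axis=1)
    machine = np.where(confidence >= threshold)[0]
    human = np.where(confidence < threshold)[0]
    return machine.tolist(), human.tolist()

# Train on items the crowd has already classified...
X_labelled = np.random.rand(100, 5)       # stand-in feature vectors
y_crowd = np.random.randint(0, 2, 100)    # stand-in consensus crowd labels
model = LogisticRegression().fit(X_labelled, y_crowd)

# ...then let the machine handle the easy cases and queue the rest.
X_new = np.random.rand(20, 5)
machine_done, for_humans = split_work(model, X_new)
```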

Find out how crowdsourcing in cultural heritage works by exploring projects

Spend a few minutes with some of the projects listed in Looking for (crowdsourcing) love in all the right places to really understand how and why people participate in cultural heritage crowdsourcing.

Where can I find out more? (AKA, a reading list in disguise)

There’s a lot of academic literature on all kinds of aspects of crowdsourcing, but I’ve gone for sources that are accessible both intellectually and in terms of licensing. If a key reference isn’t there, it might be because I can’t find a pre-print or whatever outside a paywall – let me know if you know of one!

Liked this post? Buy the book! ‘Crowdsourcing Our Cultural Heritage‘ is available through Ashgate or your favourite bookseller…

Thanks, and over to you!

Thanks to everyone who responded to my call for their favourite ‘misconceptions and apprehensions about crowdsourcing (esp in history and cultural heritage)’, and to those who inspired this post in the first place by asking questions in various places about the negative side of crowdsourcing.  I’ll update the post as I hear of more, so let me know your favourites.  I’ll also keep adding links and resources as I hear of them.

You might also be interested in: Notes from ‘Crowdsourcing in the Arts and Humanities’ and various crowdsourcing classes and workshops I’ve run over the past few years.