From piles of material to patchwork: How do we embed the production of usable collections data into library work?

These notes were prepared for a panel discussion at the 'Always Already Computational: Collections as Data' (#AACdata) workshop, held in Santa Barbara in March 2017. While my latest thinking on the gap between the scale of collections and the quality of data about them is informed by my role in the Digital Scholarship team at the British Library, I've also drawn on work with catalogues and open cultural data at Melbourne Museum, the Museum of London, the Science Museum and various fellowships. My thanks to the organisers and the Institute of Museum and Library Services for the opportunity to attend. My position paper was called 'From libraries as patchwork to datasets as assemblages?' but in hindsight, piles and patchwork of material seemed a better analogy.

The invitation to this panel asked us to share our experience and perspective on various themes. I'm focusing on the challenges in making collections available as data, based on years of working towards open cultural data from within various museums and libraries. I've condensed my thoughts about the challenges down into the question on the slide: How do we embed the production of usable collections data into library work?

It has to be usable, because if it's not then why are we doing it? It has to be embedded because data in one-off projects gets isolated and stale. 'Production' is there because infrastructure and workflow is unsexy but necessary for access to the material that makes digital scholarship possible.

One of the biggest issues the British Library (BL) faces is scale. The BL's collections are vast – maybe 200 million items – and extremely varied. My experience shows that publishing datasets (or sharing them with aggregators) exposes the shortcomings of past cataloguing practices, making the size of the backlog all too apparent.

Good collections data (or metadata, depending on how you look at it) is necessary to avoid the overwhelmed, jumble sale feeling of using a huge aggregator like Europeana, Trove, or the DPLA, where you feel there's treasure within reach, if only you could find it. Publishing collections online often increases the number of enquiries about them – how can institutions deal with enquiries at scale when they already have a cataloguing backlog? Computational methods like entity identification and extraction could complement the 'gold standard' cataloguing already in progress. If they're made widely available, these other methods might help bridge the resourcing gaps that mean it's easier to find items from richer institutions and countries than from poorer ones.

Photo of piles of material

You probably already all know this, but it's worth remembering: our collections aren't even (yet) a patchwork of materials. The collections we hold, and the subset we can digitise and make available for re-use are only a tiny proportion of what once existed. Each piece was once part of something bigger, and what we have now has been shaped by cumulative practical and intellectual decisions made over decades or centuries. Digitisation projects range from tiny specialist databases to huge commercial genealogy deals, while some areas of the collections don't yet have digital catalogue records. Some items can't be digitised because they're too big, small or fragile for scanning or photography; others can't be shared because of copyright, data protection or cultural sensitivities. We need to be careful in how we label datasets so that the absences are evident.

(Here, 'data' may include various types of metadata, automatically generated OCR or handwritten text recognition transcripts, digital images, audio or video files, crowdsourced enhancements or any combination of these and more.)

Image credit: https://www.flickr.com/photos/teen_s/6251107713/

In addition to the incompleteness or fuzziness of catalogue data, when collections appear as data, it's often as great big lumps of things. It's hard for normal scholars to process (or just unzip) 4GB of data.

Currently, datasets are often created outside normal processes, and over time they become 'stale' as they're not updated when source collections records change. And when scholars do manage to unzip them, the records rely on internal references – name authorities for people, places, etc – that can only be seen as strings rather than things until extra work is undertaken.

The BL's metadata team have experimented with 'researcher format' CSV exports around specific themes (eg an exhibition), and CSV is undoubtedly the most accessible format – but what we really need is the ability for people to create their own queries across catalogues, and create their own datasets from the results. (And by queries I don't mean SPARQL but rather faceted browsing or structured search forms).
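As a rough illustration of what 'create their own datasets from the results' could look like, here is a minimal Python sketch of a faceted filter over catalogue records that writes the matches out as CSV. The records, the field names (`title`, `place`, `year`) and the `facet_filter` and `to_csv` helpers are all invented for this example; they are not the BL's 'researcher format' or any real catalogue API.

```python
import csv
import io

# Hypothetical catalogue records; the field names are assumptions for illustration.
RECORDS = [
    {"title": "Map of Dublin", "place": "Dublin", "year": 1840},
    {"title": "Letters from the front", "place": "London", "year": 1916},
    {"title": "Street directory", "place": "London", "year": 1840},
]


def facet_filter(records, **facets):
    """Return the records whose fields equal every requested facet value."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in facets.items())
    ]


def to_csv(records, fieldnames):
    """Serialise the selected records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# A reader picks facets (as they might in a search form) and downloads the result.
selection = facet_filter(RECORDS, place="London", year=1840)
csv_text = to_csv(selection, ["title", "place", "year"])
```

The point of the sketch is that the dataset is produced from a query at the moment it is needed, rather than exported once and left to go stale.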

Image credit: screenshot from http://data.bl.uk/

Collections are huge (and resources relatively small) so we need to supplement manual cataloguing with other methods. Sometimes the work of crafting links from catalogues to external authorities and identifiers will be a machine job, with pieces sewn together at industrial speed via entity recognition tools that can pull categories out of text and images. Sometimes it's operated by a technologist who runs records through OpenRefine to find links to name authorities or Wikidata records. Sometimes it's a labour of scholarly love, with links painstakingly researched, hand-tacked together to make sure they fit before they're finally recorded in a bespoke database.

This linking work often happens outside the institution, so how can we ingest and re-use it appropriately? And if we're to take advantage of computational methods and external enhancements, then we need ways to signal which categories were applied by cataloguers, which by software, by external groups, etc.
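One way to picture that signalling is to store the source alongside every category, so it can be shown or filtered on later. This is only a sketch under assumptions: the `Category` class, the three-value `source` vocabulary and the sample values are hypothetical, not a description of any existing catalogue system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """A category applied to an item, with a record of who or what applied it."""

    label: str
    source: str  # assumed vocabulary: "cataloguer", "software" or "external"
    agent: str   # free-text name of the person, tool or group responsible


def by_source(categories, source):
    """Return the labels that came from one kind of source."""
    return [c.label for c in categories if c.source == source]


# Invented sample data for a single item.
item_categories = [
    Category("maps", "cataloguer", "metadata team"),
    Category("Dublin", "software", "entity recognition tool"),
    Category("Ordnance Survey", "external", "volunteer project"),
]

cataloguer_labels = by_source(item_categories, "cataloguer")
machine_labels = by_source(item_categories, "software")
```

Keeping the source on the record itself means a dataset user can decide for themselves whether to include machine-generated or externally contributed categories.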

The workflow and interface adjustments required would be significant, but even more challenging would be the internal conversations and changes required before a consensus on the best way to combine the work of cataloguers and computers could emerge.

The trick is to move from a collection of pieces to pieces of a collection. Every collection item was created in and about places, and produced by and about people. They have creative, cultural, scientific and intellectual properties. There's a web of connections from each item that should be represented when they appear in datasets. These connections help make datasets more usable, turning strings of text into references to things and concepts to aid discoverability and the application of computational methods by scholars. This enables structured search across datasets – potentially linking an oral history interview with a scientist in the BL sound archive, their scientific publications in journals, annotated transcriptions of their field notebooks from a crowdsourcing project, and published biography in the legal deposit library.

A lot of this work has been done as authority files like AAT, ULAN etc are applied in cataloguing, so our attention should now shift to converting local references into URIs and making the most of that investment.
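A minimal sketch of that step, assuming a lookup table from local authority strings to URIs already exists: the `LOCAL_TO_URI` table, the `example.org` URIs and the `link_references` helper are placeholders invented for illustration, not real AAT or ULAN identifiers.

```python
# Hypothetical mapping from local authority strings to URIs.
# The example.org addresses are placeholders, not real authority identifiers.
LOCAL_TO_URI = {
    "Bentham, Jeremy": "https://example.org/authority/person/1",
    "London (England)": "https://example.org/authority/place/2",
}


def link_references(strings, lookup):
    """Split local reference strings into linked URIs and unresolved leftovers."""
    linked = {}
    unresolved = []
    for original in strings:
        key = " ".join(original.split())  # collapse stray whitespace before matching
        if key in lookup:
            linked[original] = lookup[key]
        else:
            unresolved.append(original)
    return linked, unresolved


linked, unresolved = link_references(
    ["Bentham,  Jeremy", "London (England)", "Unknown correspondent"],
    LOCAL_TO_URI,
)
```

Returning the unresolved strings separately keeps the gaps visible, so the remaining matching can be handed to a person or another tool rather than silently dropped.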

Applying identifiers is hard – it takes expert care to disambiguate personal names, places, concepts, even with all the hinting that context-aware systems might be able to provide as machine learning etc techniques get better. Catalogues can't easily record possible attributions, and there's understandable reluctance to publish an imperfect record, so progress on the backlog is slow. If we're not to be held back by the need for records to be perfectly complete before they're published, then we need to design systems capable of capturing the ambiguity, fuzziness and inherent messiness of historical collections and allowing qualified descriptors for possible links to people, places etc. Then we need to explain the difference to users, so that they don't overly rely on our descriptions, making assumptions about the presence or absence of information when it's not appropriate.
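To make 'qualified descriptors for possible links' a little more concrete, here is a small sketch of a record that carries its own degree of certainty. The `PossibleLink` class, the qualifier vocabulary and the sample values are assumptions made up for this example, not an existing cataloguing standard.

```python
from dataclasses import dataclass

# Assumed vocabulary of qualifiers, ordered from most to least certain.
QUALIFIERS = ("confirmed", "probably", "possibly")


@dataclass
class PossibleLink:
    """A link from an item to a person, place or concept, with a stated certainty."""

    target: str
    qualifier: str
    note: str = ""

    def __post_init__(self):
        if self.qualifier not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier}")


def describe(link):
    """Render the link for display so that uncertainty is visible to users."""
    if link.qualifier == "confirmed":
        return link.target
    return f"{link.qualifier} {link.target}"


# Invented sample attributions for one item.
links = [
    PossibleLink("London", "confirmed"),
    PossibleLink("Jeremy Bentham", "possibly", "attribution based on handwriting"),
]
display = [describe(link) for link in links]
```

A record like this could be published before the attribution is settled, while still telling users how much weight to put on it.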

Image credit: http://europeana.eu/portal/record/2021648/0180_N_31601.html

Photo of pipes over a building

A lot of what we need relies on more responsive infrastructure for workflows and cataloguing systems. For example, the BL's systems are designed around the 'deliverable unit' – the printed or bound volume, the archive box – because for centuries the reading room was where you accessed items. We now need infrastructure that makes items addressable at the manuscript, page and image level in order to make the most of the annotations and links created to shared identifiers.

(I'd love to see absorbent workflows, soaking up any related data or digital surrogates that pass through an organisation, no matter which system they reside in or originate from. We aren't yet making the most of OCRd text, let alone enhanced data from other processes, to aid discoverability or produce datasets from collections.)

Image credit: https://www.flickr.com/photos/snorski/34543357

My final thought – we can start small and iterate, which is just as well, because we need to work on understanding what users of collections data need and how they want to use them. We're making a start and there's a lot of thoughtful work behind the scenes, but maybe a bit more investment is needed from research libraries to become as comfortable with data users as they are with the readers who pass through their physical doors.

The rise of interpolated content?

One thing that might stand out when we look back at 2014 is the rise of interpolated content. We've become used to translating around auto-correct errors in texts and emails but we seem to be at a tipping point where software is going ahead and rewriting content rather than prompting you to notice and edit things yourself.

iOS doesn't just highlight or fix typos, it changes the words you've typed. To take one example, iOS users might use 'ill' more than they use 'ilk', but if I typed 'ilk' I'm not happy when it's replaced by an algorithmically-determined 'ill'. As a side note, understanding the effect of auto-correct on written messages will be a challenge for future historians (much as it is for us sometimes now).

And it's not only text. In 2014, Adobe previewed GapStop, 'a new video technology that eases transitions and removes pauses from video automatically'. It's not just editing out pauses, it's creating filler images from existing images to bridge the gaps so the image doesn't jump between cuts. It makes it a lot harder to tell when someone's words have been edited to say something different to what they actually said – again, editing audio and video isn't new, but making it so easy to remove the artefacts that previously provided clues to the edits is.

Photoshop has long let you edit the contrast and tone in images, but now their Content-Aware Move, Fill and Patch tools can seamlessly add, move or remove content from images, making it easy to create 'new' historical moments. The images on extrapolated-art.com, which uses '[n]ew techniques in machine learning and image processing […] to extrapolate the scene of a painting to see what the full scenery might have looked like' show the same techniques applied to classic paintings.

But photos have been manipulated since they were first used, so what's new? As one Google user reported in It’s Official: AIs are now re-writing history, 'Google’s algorithms took the two similar photos and created a moment in history that never existed, one where my wife and I smiled our best (or what the algorithm determined was our best) at the exact same microsecond, in a restaurant in Normandy.' The important difference here is that he did not create this new image himself: Google's scripts did, without asking or specifically notifying him. In twenty years' time, this fake image may become part of his 'memory' of the day. Automatically generated content like this also takes the question of intent entirely out of the process of determining 'real' from interpolated content. And if software starts retrospectively 'correcting' images, what does that mean for our personal digital archives, for collecting institutions and for future historians?

Interventions between the act of taking a photo and posting it on social media might be one of the trends of 2015. Facebook are about to start 'auto-enhancing' your photos, and apparently, Facebook Wants To Stop You From Uploading Drunk Pictures Of Yourself. Apparently this is to save your mum and boss seeing them; the alternative path of building a social network that doesn't show everything you do to your mum and boss was lost long ago. Would the world be a better place if Facebook or Twitter had a 'this looks like an ill-formed rant, are you sure you want to post it?' function?

So 2014 seems to have brought the removal of human agency from the process of enhancing, and even creating, text and images. Algorithms writing history? Where do we go from here? How will we deal with the increase of interpolated content when looking back at this time? I'd love to hear your thoughts.

Looking for (crowdsourcing) love in all the right places

One of the most important exercises in the crowdsourcing workshops I run is the 'speed dating' session. The idea is to spend some time looking at a bunch of crowdsourcing projects until you find a project you love. Finding a project you enjoy gives you a deeper insight into why other people participate in crowdsourcing, and will see you through the work required to get a crowdsourcing project going. I think making a personal connection like this helps reduce some of the cynicism I occasionally encounter about why people would volunteer their time to help cultural heritage collections. Trying lots of projects also gives you a much better sense of the types of barriers projects can accidentally put in the way of participation. It's also a good reminder that everyone is a nerd about something, and that there's a community of passion for every topic you can think of.

If you want to learn more about designing history or cultural heritage crowdsourcing projects, trying out lots of projects is a great place to start. The more time you can spend on this the better – an hour is ideal – but trying just one or two projects is better than nothing. In a workshop I get people to note how a project made them feel – what they liked most and least about a project, and who they'd recommend it to. You can also note the input and output types to help build your mental database of relevant crowdsourcing projects.

The list of projects I suggest varies according to the background of workshop participants, and I'll often throw in suggestions tailored to specific interests, but here's a generic list to get you started.

10 Most Wanted http://10most.org.uk/ Research object histories
Ancient Lives http://ancientlives.org/ Humanities, language, text transcription
British Library Georeferencer http://www.bl.uk/maps/ Locating and georeferencing maps (warning: if it's running, only hard maps may be left!)
Children of the Lodz Ghetto http://online.ushmm.org/lodzchildren/ Citizen history, research
Describe Me http://describeme.museumvictoria.com.au/ Describe objects
DIY History http://diyhistory.lib.uiowa.edu/ Transcribe historical letters, recipes, diaries
Family History Transcription Project http://www.flickr.com/photos/statelibrarync/collections/ Document transcription (Flickr/Yahoo login required to comment)
Herbaria@home http://herbariaunited.org/atHome/ (for bonus points, compare it with Notes from Nature https://www.zooniverse.org/project/notes_from_nature) Transcribing specimen sheets (or biographical research)
HistoryPin Year of the Bay 'Mysteries' https://www.historypin.org/attach/project/22-yearofthebay/mysteries/index/ Help find dates, locations, titles for historic photographs; overlay images on StreetView
iSpot http://www.ispotnature.org/ Help 'identify wildlife and share nature'
Letters of 1916 http://dh.tcd.ie/letters1916/ Transcribe letters and/or contribute letters
London Street Views 1840 http://crowd.museumoflondon.org.uk/lsv1840/ Help transcribe London business directories
Micropasts http://crowdsourced.micropasts.org/app/photomasking/newtask Photo-masking to help produce 3D objects; also structured transcription
Museum Metadata Games: Dora http://museumgam.es/dora/ Tagging game with cultural heritage objects (my prototype from 2010)
NYPL Building Inspector http://buildinginspector.nypl.org/ A range of tasks, including checking building footprints, entering addresses
Operation War Diary http://operationwardiary.org/ Structured transcription of WWI unit diaries
Papers of the War Department http://wardepartmentpapers.org/ Document transcription
Planet Hunters http://planethunters.org/ Citizen science; review visualised data
Powerhouse Museum Collection Search http://www.powerhousemuseum.com/collection/database/menu.php Tagging objects
Reading Experience Database http://www.open.ac.uk/Arts/RED/ Text selection, transcription, description.
Smithsonian Digital Volunteers: Transcription Center https://transcription.si.edu/ Text transcription
Tiltfactor Metadata Games http://www.metadatagames.org/ Games with cultural heritage images
Transcribe Bentham http://www.transcribe-bentham.da.ulcc.ac.uk/ History; text transcription
Trove http://trove.nla.gov.au/newspaper?q= Correct OCR errors, transcribe text, tag or describe documents
US National Archives http://www.amara.org/en/teams/national-archives/ Transcribing videos
What's the Score at the Bodleian http://www.whats-the-score.org/ Music and text transcription, description
What's on the menu http://menus.nypl.org/ Structured transcription of restaurant menus
What's on the menu? Geotagger http://menusgeo.herokuapp.com/ Geolocating historic restaurant menus
Wikisource – random item link http://en.wikisource.org/wiki/Special:Random/Index Transcribing texts
Worm Watch http://www.wormwatchlab.org Citizen science; video
Your Paintings Tagger http://tagger.thepcf.org.uk/ Paintings; free-text or structured tagging

NB: crowdsourcing is a dynamic field, so some sites may be temporarily out of content or have otherwise settled in transit. Some sites require registration, so you may need to find another site to explore while you're waiting for your registration email.

It's here! Crowdsourcing our Cultural Heritage is now available

My edited volume, Crowdsourcing our Cultural Heritage, is now available! My introduction (Crowdsourcing our cultural heritage: Introduction), which provides an overview of the field and outlines the contribution of the 12 chapters, is online at Ashgate's site, along with the table of contents and index. There's a 10% discount if you order online.

If you're in London on the evening of Thursday 20th November, we're celebrating with a book launch party at the UCL Centre for Digital Humanities. Register at http://crowdsourcingculturalheritage.eventbrite.co.uk.

Here's the back page blurb: "Crowdsourcing, or asking the general public to help contribute to shared goals, is increasingly popular in memory institutions as a tool for digitising or computing vast amounts of data. This book brings together for the first time the collected wisdom of international leaders in the theory and practice of crowdsourcing in cultural heritage. It features eight accessible case studies of groundbreaking projects from leading cultural heritage and academic institutions, and four thought-provoking essays that reflect on the wider implications of this engagement for participants and on the institutions themselves.

Crowdsourcing in cultural heritage is more than a framework for creating content: as a form of mutually beneficial engagement with the collections and research of museums, libraries, archives and academia, it benefits both audiences and institutions. However, successful crowdsourcing projects reflect a commitment to developing effective interface and technical designs. This book will help practitioners who wish to create their own crowdsourcing projects understand how other institutions devised the right combination of source material and the tasks for their ‘crowd’. The authors provide theoretically informed, actionable insights on crowdsourcing in cultural heritage, outlining the context in which their projects were created, the challenges and opportunities that informed decisions during implementation, and reflecting on the results.

This book will be essential reading for information and cultural management professionals, students and researchers in universities, corporate, public or academic libraries, museums and archives."

Massive thanks to the following authors of chapters for their intellectual generosity and their patience with up to five rounds of edits, plus proofing, indexing and more…

  1. Crowdsourcing in Brooklyn, Shelley Bernstein;
  2. Old Weather: approaching collections from a different angle, Lucinda Blaser;
  3. ‘Many hands make light work. Many hands together make merry work’: Transcribe Bentham and crowdsourcing manuscript collections, Tim Causer and Melissa Terras;
  4. Build, analyse and generalise: community transcription of the Papers of the War Department and the development of Scripto, Sharon M. Leon;
  5. What's on the menu?: crowdsourcing at the New York Public Library, Michael Lascarides and Ben Vershbow;
  6. What’s Welsh for ‘crowdsourcing’? Citizen science and community engagement at the National Library of Wales, Lyn Lewis Dafis, Lorna M. Hughes and Rhian James;
  7. Waisda?: making videos findable through crowdsourced annotations, Johan Oomen, Riste Gligorov and Michiel Hildebrand;
  8. Your Paintings Tagger: crowdsourcing descriptive metadata for a national virtual collection, Kathryn Eccles and Andrew Greg;
  9. Crowdsourcing: Crowding out the archivist? Locating crowdsourcing within the broader landscape of participatory archives, Alexandra Eveleigh;
  10. How the crowd can surprise us: humanities crowdsourcing and the creation of knowledge, Stuart Dunn and Mark Hedges;
  11. The role of open authority in a collaborative web, Lori Byrd Phillips;
  12. Making crowdsourcing compatible with the missions and values of cultural heritage organisations, Trevor Owens.

Who loves your stuff? How to collect links to your site

If you've ever wondered who's using content from your site or what people find interesting, here are some ways to find out, using the Design Museum's URL as an example.

'Links to your site' via Google Webmaster Tools https://support.google.com/webmasters/answer/55281

Reddit – plug your URL in after /domain/
http://www.reddit.com/domain/designmuseum.org

Wikipedia – plug your URL in after target=
http://en.wikipedia.org/w/index.php?title=Special%3ALinkSearch&target=*.designmuseum.org
Depending on your topic coverage you may want to look at other language Wikipedias.

Pinterest – plug your URL in after /source/
http://www.pinterest.com/source/designmuseum.org/

Twitter – search for the URL with quotes around it e.g. "designmuseum.org"

If you can see one particular page shooting up in your web stats, you could try a reverse image search on TinEye to see where it's being referenced.
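If you want to check several sites, the URL patterns above can be generated rather than typed by hand. This is a minimal Python sketch: the `link_report_urls` helper is my own invention, and it simply reproduces the Reddit, Wikipedia and Pinterest patterns listed above, plus the quoted string to paste into Twitter search.

```python
def link_report_urls(domain):
    """Build the 'who links to this site' URLs and search string for a domain."""
    return {
        "reddit": f"http://www.reddit.com/domain/{domain}",
        "wikipedia": (
            "http://en.wikipedia.org/w/index.php"
            f"?title=Special%3ALinkSearch&target=*.{domain}"
        ),
        "pinterest": f"http://www.pinterest.com/source/{domain}/",
        # Not a URL: paste this quoted string into Twitter's search box.
        "twitter_query": f'"{domain}"',
    }


# Print each generated link for the example domain used in this post.
for name, value in link_report_urls("designmuseum.org").items():
    print(f"{name}: {value}")
```

Swapping the language subdomain in the Wikipedia pattern (the `en` in `en.wikipedia.org`) would be the obvious next step for other language Wikipedias, though that is left out here to keep the sketch short.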

What am I missing? I'd love to hear about similar links and methods for other sites – tell me in the comments or on Twitter @mia_out.

Update: in a similar vein, Tim Sherratt @wragge launched a new experiment called Trove Traces the same day, to 'explore how Trove newspapers are used' by listing pages that link to articles: http://trovespace.webfactional.com/traces/

Update 2: Desi Gonzalez @desigonz tried out some of these techniques and put together a great post on 'Thoughts on what museums can learn from Reddit, Yelp, and what @briandroitcour calls vernacular criticism'
You might also be interested in: Can you capture visitors with a steampunk arm?

The sounds of silence

I've been reading World War One diaries and letters (getting distracted by sources is an occupational hazard in my research) as I look for sample primary sources for teaching crowdsourcing at the HILT summer school in Maryland next week and for my CENDARI fellowship later this year.

I noticed one line in the Diary of William Henry Winter WWI 1915 that manages to convey a lot without directly giving any information about his opinions or relationship with this person:

'Major Saunders is supposed to be on his way back here as well but I don't know as he is coming back to our Coy, I hope not any way. We have got a good man now.'

There's nothing in the rest of the entries online that provides any further background. It may be that sections of this correspondence either didn't survive, weren't held by the same person, or perhaps were edited before deposit with the library or during transcription (it's particularly hard to judge as the site doesn't have images of the original document), so this particular silence may not have been intentional.

Whatever the case, it's a good reminder that there are silences behind every piece of content. While it's an amazing time to research the lives of those caught up in WWI as more and more private and public material is digitised and shared, silences can be created in many ways – official archives privilege some voices over others, personal collections can be censored or remain tucked away in a shoebox, and large parts of people's experiences simply went unrecorded. Content hidden behind paywalls or inaccessible to search engines (whether inadvertently hidden behind a search box or through lack of text transcription or description) is effectively hushed, if not exactly silenced. Sources and information about WWI collected via community groups on Facebook may be lost the next time they change their terms and conditions, or only partially shared. Our challenge is to make the gaps and questions about what was collected visible (audible?) while also being careful not to render the undigitised or unsearchable invisible in our rush to privilege the easily-accessible.

[Update: I've just realised that Winter might not have needed to provide further context as it seems many men in his unit were from the same region as him, and therefore his relationship with the Major may have pre-dated the war. Tacit knowledge is of course another example of the unrecorded, and one perhaps more familiar to us now than the unsayable.]

Piloting a Participatory History Commons

I've been awarded a CENDARI Visiting Research Fellowship at Trinity College Dublin for a project called 'Bridging collections with a participatory Commons: a pilot with World War One archives'. I've posted my proposal at the link above, and when I start in September I'll post about my progress here. CENDARI have now published the list of all 2014 Fellows and a neat summary of the programme: 'The CENDARI Visiting Research Fellowships are intended to support and stimulate historical research in the two pilot areas of medieval European culture and the First World War, by facilitating access to key archives, specialist knowledge and collections in CENDARI host institutions'.

As I said in my post, 'it's an ambitious project which requires tackling community building, user experience design, historical materials and programming, and I'll be drawing on the expertise of many people'. I'll post as I go – but first, I'd best get back to finishing up my PhD thesis!

In the meantime, here's a small collection of things I've written as I think through what a participatory commons is and how it might work: my poster and talk notes for Herrenhausen conference and my keynote for Sharing is Caring, 'Enriching cultural heritage collections through a Participatory Commons platform: a provocation about collaborating with users'.

How can we connect museum technologists with their history?

A quick post triggered by an article on the role of domain knowledge (knowledge of a field) in critical thinking, Deep in thought:

Domain knowledge is so important because of the way our memories work. When we think, we use both working memory and long-term memory. Working memory is the space where we take in new information from our environment; everything we are consciously thinking about is held there. Long-term memory is the store of knowledge that we can call up into working memory when we need it. Working memory is limited, whereas long-term memory is vast. Sometimes we look as if we are using working memory to reason, when actually we are using long-term memory to recall. Even incredibly complex tasks that seem as if they must involve working memory can depend largely on long-term memory.

When we are using working memory to progress through a new problem, the knowledge stored in long-term memory will make that process far more efficient and successful. … The more parts of the problem that we can automate and store in long-term memory, the more space we will have available in working memory to deal with the new parts of the problem.

A few years ago I defined a 'museum technologist' as 'someone who can appropriately apply a range of digital solutions to help meet the goals of a particular museum project', and deep domain knowledge clearly has a role to play in this (also in the kinds of critical thinking that will save technologists from being unthinking cheerleaders for the newest buzzword or geek toy). 

There's a long history of hard-won wisdom, design patterns and knowledge (whether about ways not to tender for or specify software, reasons why proposed standards may or may not work, translating digital methods and timelines for departments raised on print, etc – I'm sure you all have examples) contained in the individual and collective memory of individual technologists and teams. Some of it is represented in museum technology mailing lists, blogs or conference proceedings, but the lessons learnt in the past aren't always easily discoverable by people encountering digital heritage issues for the first time. And then there's the issue of working out which knowledge relates to specific, outdated technologies and which still holds while not quashing the enthusiasm of new people with a curt 'we tried that before'…

Something in the juxtaposition of the 20th anniversary of BritPop and the annual wave of enthusiasm and discovery from the international Museums and the Web (#MW2014) conference prompted me to look at what the Museums Computer Group (MCG) and Museum Computer Network (MCN) lists were talking about in April five and ten years ago (i.e. in easily-accessible archives):

Five years ago in #musetech – open web, content distribution, virtualisation, wifi https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind0904&L=mcg&X=498A43516F310B2193 http://mcn.edu/pipermail/mcn-l/2009-April/date.html

Ten years ago in #musetech people were talking about knowledge organisation and video links with schools https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind04&L=mcg&F=&S=&X=498A43516F310B2193

Some of the conversations from that random sample are still highly relevant today, and more focused dives into various archives would probably find approaches and information that'd help people tackling current issues.

So how can we help people new to the sector find those previous conversations and get some of this long-term memory into their own working memory? Pointing people to search forms for the MCG and MCN lists is easy, but some of the conference proceedings are a bit trickier (e.g. search within museumsandtheweb.com) and there's no central list of museum technology blogs that I know of. Maybe people could nominate blog posts they think stand the test of time, mindful of the risk of it turning into a popularity/recency thing?

If you're new(ish) to digital heritage, how did you find your feet? Which sites or communities helped you, and how did you find them? Or if you have a new team member, how do you help them get up to speed with museum technology? Or looking further afield, which resources would you send to someone from academia or related heritage fields who wanted to learn about building heritage resources for or with specialists and the public?

'Go digital' at Museums Association 2012 Conference

Some people who couldn't make the Museums Association conference (or #museums2012) asked for more information on the session on digital strategies, so here are my introductory remarks and some scribbled highlights of the speakers' papers and discussion with the audience.

Update: a year later, I've thought of a 'too long, didn't read' version: digital strategies are like puberty. Everyone has to go through it, but life's better on the other side when you've figured things out. Digital should be incorporated into engagement, collections, venue etc strategies – it's not a thing on its own.

The speakers were Carolyn Royston (@caro_ft), Head of New Media at Imperial War Museum; Hugh Wallace (@tumshie), Head of Digital Media at National Museums Scotland; Michael Woodward (@michael1665), Commercial Director at York Museums Trust, and I chaired the session in my role as Chair of the Museums Computer Group. From the conference programme: 'This session explores the importance of developing a digital strategy. It will provide insight into how organisations can incorporate digital into a holistic approach that meets wider organisational and public engagement objectives and look at how to use digital engagement as a catalyst to drive organisational change.'

After various conversations about digital and museums with people who were interested in the session, I updated my introduction so that overall the challenge of embracing the impact of digital technologies, platforms and audiences on museums was put in a positive light.  The edited title that appeared in the programme had a different emphasis ('Go digital' rather than the 'Getting strategic about digital' we submitted) so I wanted it to be clear that we weren't pushing a digital agenda for the sake of technology itself. Or as I apparently said at the time, "it's not about making everything digital, it's about dealing with the fact that digital is everywhere".

I started by asking people to raise their hands if their museum had a digital strategy, and I'd say well over half the room responded, which surprised me. Perhaps a third were in the process of planning for a digital strategy and just a few were yet to start at all.

My notes were something like this: "we probably all know by now that digital technologies bring wonderful opportunities for museums and their audiences, but you might also be worried about the impact of technology on audiences and your museum. ‘Digital’ varies in organisations – it might encompass social media, collections, mobile, marketing, in-gallery interactives, broadcast and content production. It touches every public-facing output of the museum as well as back-office functions and infrastructure.

You can’t avoid the impact of digital on your organisation, so it’s about how you deal with it, how you integrate it into the fabric of your museum. As you’ll hear in the case studies, implementing digital strategy itself changes the organisation, so from the moment you start talking to people about devising a digital strategy, you'll be making progress. For some of our presenters, their digital strategy ultimately took the form of a digital vision document – the strategy itself is embedded in the process and in the resulting framework for working across the organisation. A digital strategy framework allows you to explore options in conversation with the whole organisation; it’s not about making everything digital.

Our case studies come from three very different organisations working with different collections in different contexts. Mike, Commercial Director at York Museums Trust, will talk about planning the journey, moving from ad hoc work to making digital integral to how the organisation works; Hugh, Head of Digital Media at National Museums Scotland, will discuss the process they went through to develop digital strategy, what’s worked and what hasn’t; Carolyn Royston, Head of Digital Media at Imperial War Museums, who comes from a learning background, will talk about IWM’s digital adventure, from where they started to where they are now. They’re each at different stages of the process of implementing and living with a digital strategy.

Based on our discussions as we planned this session, the life cycle of a digital strategy in a museum seems to be: aspiration, design, education and internal outreach, integration with other strategies (particularly public engagement) and sign off… then take a deep breath, look at what the ripple effect has been and start updating your strategies as everything will have changed since you started. And with that, over to Mike…"

Mike talked about working out when digital delivery really makes sense, whether for inaccessible objects (like a rock on Mars) or a delicate book; the major role that outreach and communication play in the process of creating a digital strategy; appointing the staff that would deliver it based on eagerness, enthusiasm and teamwork rather than pure tech skills; where digital teams should sit in the organisation; and about the possibility of using digital volunteers (or 'armchair experts') to get content online.

Hugh went for 'frameworks, not fireworks', pointing out that what happens after the strategy is written is important so you need to create a flexible framework to manage the inevitable change.  He discussed the importance of asking the right-sized question (as in one case, where 'we didn't know at the start that an app would be the answer') and working on getting digital into 'business as usual' rather than an add-on team with specialist skills.  Or as one tweeter summarised, 'work across depts, don't get hung up on the latest tech, define users realistically and keep it simple'.

Carolyn covered the different forms of digital engagement and social media the IWM have been trying and the role of creating their digital vision in helping overcome their fears; the benefits of partnerships with other organisations for piggybacking on their technology, networks and audiences, and the fact that their collections sales have gone up as a result of opening up their collections.  In the questions, someone described intellectual property restrictions to try to monetise collections as 'fool's gold' – great term!  I think we should have a whole conference session on this sometime soon.

When reviewing our discussions beforehand I'd found a note from a planning call which summed up how much the process should change the organisation: 'if you're not embarrassed by your digital strategy six months after sign-off you probably haven't done it right', and on the day the speakers reinforced my impression that ultimately, devising and implementing a digital strategy is (probably) a necessary process to go through but it's not a goal in its own right.  The IWM and NMS examples show that the internal education and conversations can both create a bigger appetite for digital engagement and change organisational expectations around digital to the point where it has to be more widely integrated.  The best place for a digital strategy is within a public engagement strategy that integrates the use of digital platforms and working methods into the overall public-facing work of the museum.

Listening to the speakers, a new metaphor occurred to me: is implementing a digital strategy like gardening? It needs constant care and feeding after the big job of sowing seeds is over. And much like gardening for pleasure (in the UK, anyway), the process may have more impact than the product.

And something I didn't articulate at the time – if the whole museum is going to be doing some digital work, we technologists are going to have to be patient and generous in sharing our knowledge and helping everyone learn how to make sensible decisions about digital content and experiences. If we don't, we risk being a bottleneck or forcing people to proceed based on guesswork, and neither is good for museums or their audiences.

So much awesomeness! #GODIGITAL #Museums2012 twitter.com/dannybirchall/…
— Danny Birchall (@dannybirchall) November 9, 2012

Huge thanks to Carolyn, Hugh and Michael for making the whole thing such a pleasure and to the Museums Association conference organisers for the opportunity to share our thoughts and experiences.

And finally, if you're interested in digital strategies in heritage organisations, the Museums Computer Group's annual Museums on the Web conference is all about being 'strategically digital' (which, as you might have guessed from the above, sometimes might mean not using technology at all), but UKMW12 tickets are selling out fast, so don't delay.

Designing for participatory projects: emergent best practice, getting discussion started

I was invited over to New Zealand (from Australia) recently to talk at Te Papa in Wellington and the Auckland Museum.  After the talks I was asked if I could share some of my notes on design for participatory projects and for planning for the impact of participatory projects on museums.  Each museum has a copy of my slides, but I thought I'd share the final points here rather than by email, and take the opportunity to share some possible workshop activities to help museums plan audience participation around their core goals.

Both talks started by problematising the definition of a 'museum website' – it doesn't work to think of your 'museum website' as purely stuff that lives under your domain name when it's now also the social media accounts under your brand, your games and mobile apps, and maybe also your objects and content on Google Art Project or even your content in a student’s Tumblr.  The talks were written to respond to the particular context of each museum so they varied from there, but each ended up with these points.  The sharp-eyed among you might notice that they're a continuation of ideas I first shared in my Europeana Tech keynote: Open for engagement: GLAM audiences and digital participation.  The second set are particularly aimed at helping museums think about how to market participatory projects and sustain them over the longer term by making them more visible in the museum as a whole.

Best practice in participatory project design

  • Have an answer to 'Why would someone spend precious time on your project?'
  • Be inspired by things people love
  • Design for the audience you want
  • Make it a joy to participate
  • Don't add unnecessary friction or barriers (e.g. don't add sign-up forms if you don’t really need them, or try using lazy registration if you really must make users create accounts)
  • Show how much you value contributions (don't just tell people you value their work)
  • Validate procrastination – offer the opportunity to make a difference by providing meaningful work
  • Provide an easy start and scaffolded tasks (see e.g. Nina Simon's Self-Expression is Overrated: Better Constraints Make Better Participatory Experiences)
  • Let audiences help manage problems – let them know which behaviours are acceptable and empower them to keep the place tidy
  • Test with users; iterate; polish

Best practice within your museum

  • Fish where the fish are – find the spaces where people are already engaging with similar content and see how you can slot in, don't expect people to find their way to you unless you have something they can’t find anywhere else
  • Allow for community management resources – you’ll need some outreach to existing online and offline communities to encourage participation, some moderation and just a general sense that the site hasn’t been abandoned. If you can’t provide this for the life of the project, you might need to question why you’re doing it.
  • Decide where it's ok to lose control. Try letting go… you may find audiences you didn't expect, or people may make use of your content in ways you never imagined. Watch and learn and tweak in response – this is a good reason to design in iterations, and to go into public or invited-beta earlier rather than later. 
  • Realistically assess fears, decide acceptable levels of risk. Usually fears can be turned into design requirements, they’re rarely show-stoppers.
  • Have a clear objective, ideally tied to your museum’s mission. Make sure the point of the project is also clear to your audience.
  • Put the audience needs first. You’re asking people to give up their time and life experience, so make sure the experience respects this. Think carefully before sacrificing engagement to gain efficiency.
  • Know how to measure success
  • Plan to make the online activity visible in the organisation and in the museum. Displaying online content in the museum is a great way to show how much you value it, as well as marketing the project to potential contributors.  Working out how you can share the results with the rest of the organisation helps everyone understand how much potential there is, and helps make online visitors ‘real’.
  • Have an exit strategy – staff leave, services fold or change their T&Cs

I'd love to know what you think – what have I missed?  [Update: for some useful background on the organisational challenges many museums face when engaging with technology, check out Collections Access and the use of Digital Technology (pdf).]

More on designing museum projects for audience participation

I prepared this activity for one of the museums, but on the day the discussion after my talk went on so long that we didn't need to use a formal structure to get people talking. In the spirit of openness, I thought I'd share it. If you try it in your organisation, let me know how it goes!

The structure – exploratory idea generation followed by convergence and verification – was loosely based on the 'creativity workshops' developed by City University's Centre for Creativity (e.g. the RESCUE creativity workshops discussed in Use and Influence of Creative Ideas and Requirements for a Work-Integrated Learning System).  It's designed to be a hackday-like creative activity for non-programmers.

In small groups…

  • Pick two strategic priorities or organisational goals…
  • In 5 minutes: generate as many ideas as possible
  • In 2 minutes: pick one idea to develop further

Ideas can include in-gallery and in-person activity; they must include at least two departments and some digital component.

Developing your idea…

  • You have x minutes to develop your idea
  • You have 2 minutes each to report back. Include: which previous museum projects provide relevant lessons? How will you market it? How will it change the lives of its target audience? How will it change the museum?
  • How will you alleviate potential risks?  How will you maximise potential benefits?
  • You have x minutes for general discussion. How can you build on the ideas you've heard?

For bonus points…

These discussion points were written for another museum, but they might be useful for other organisations thinking about audience participation and online collections:

What are the museum’s goals in engaging audiences with collections online?

  • What does success look like?
  • How will it change the museum?
  • Which past projects provide useful lessons?

How can the whole organisation be involved in supporting online conversations?

  • What are the barriers?
  • What small, sustainable steps can be taken?
  • Where are online contributions visible in the museum?