From piles of material to patchwork: How do we embed the production of usable collections data into library work?

These notes were prepared for a panel discussion at the ‘Always Already Computational: Collections as Data’ (#AACdata) workshop, held in Santa Barbara in March 2017. While my latest thinking on the gap between the scale of collections and the quality of data about them is informed by my role in the Digital Scholarship team at the British Library, I’ve also drawn on work with catalogues and open cultural data at Melbourne Museum, the Museum of London, the Science Museum and various fellowships. My thanks to the organisers and the Institute of Museum and Library Services for the opportunity to attend. My position paper was called ‘From libraries as patchwork to datasets as assemblages?’ but in hindsight, piles and patchwork of material seemed a better analogy.

The invitation to this panel asked us to share our experience and perspective on various themes. I’m focusing on the challenges in making collections available as data, based on years of working towards open cultural data from within various museums and libraries. I’ve condensed my thoughts about the challenges down into the question on the slide: How do we embed the production of usable collections data into library work?

It has to be usable, because if it’s not, why are we doing it? It has to be embedded, because data produced in one-off projects gets isolated and stale. ‘Production’ is there because infrastructure and workflows are unsexy but necessary for access to the material that makes digital scholarship possible.

One of the biggest issues the British Library (BL) faces is scale. The BL’s collections are vast – maybe 200 million items – and extremely varied. My experience shows that publishing datasets (or sharing them with aggregators) exposes the shortcomings of past cataloguing practices, making the size of the backlog all too apparent.

Good collections data (or metadata, depending on how you look at it) is necessary to avoid the overwhelmed, jumble sale feeling of using a huge aggregator like Europeana, Trove, or the DPLA, where you feel there’s treasure within reach, if only you could find it. Publishing collections online often increases the number of enquiries about them – how can institutions deal with enquiries at scale when they already have a cataloguing backlog? Computational methods like entity identification and extraction could complement the ‘gold standard’ cataloguing already in progress. If they’re made widely available, these other methods might help bridge the resourcing gaps that mean it’s easier to find items from richer institutions and countries than from poorer ones.

Photo of piles of material

You probably already all know this, but it’s worth remembering: our collections aren’t even (yet) a patchwork of materials. The collections we hold, and the subset we can digitise and make available for re-use, are only a tiny proportion of what once existed. Each piece was once part of something bigger, and what we have now has been shaped by cumulative practical and intellectual decisions made over decades or centuries. Digitisation projects range from tiny specialist databases to huge commercial genealogy deals, while some areas of the collections don’t yet have digital catalogue records. Some items can’t be digitised because they’re too big, small or fragile for scanning or photography; others can’t be shared because of copyright, data protection or cultural sensitivities. We need to be careful in how we label datasets so that the absences are evident.

(Here, ‘data’ may include various types of metadata, automatically generated OCR or handwritten text recognition transcripts, digital images, audio or video files, crowdsourced enhancements or any combination of these and more.)

Image credit: https://www.flickr.com/photos/teen_s/6251107713/

In addition to the incompleteness or fuzziness of catalogue data, when collections appear as data, it’s often as great big lumps of things. It’s hard for normal scholars to process (or just unzip) 4GB of data.

Currently, datasets are often created outside normal processes, and over time they become ‘stale’ as they’re not updated when source collections records change. And when scholars do manage to unzip them, the records rely on internal references – name authorities for people, places, etc – that can only be seen as strings rather than things until extra work is undertaken.

The BL’s metadata team have experimented with ‘researcher format’ CSV exports around specific themes (e.g. an exhibition), and CSV is undoubtedly the most accessible format – but what we really need is the ability for people to create their own queries across catalogues, and create their own datasets from the results. (And by queries I don’t mean SPARQL but rather faceted browsing or structured search forms.)

Image credit: screenshot from http://data.bl.uk/
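
To make the ‘create your own dataset’ idea concrete, here’s a minimal sketch of deriving a themed, researcher-format CSV from a larger catalogue export. It assumes pandas, and the file and column names (‘catalogue_export.csv’, ‘subject’, ‘date’, ‘shelfmark’) are hypothetical stand-ins rather than real BL fields:

```python
import pandas as pd

# Hypothetical catalogue export – file and column names are illustrative only
catalogue = pd.read_csv("catalogue_export.csv")

# Facet the records, e.g. for an exhibition about nineteenth-century maps.
# 'date' is assumed here to be a plain numeric year column.
subset = catalogue[
    catalogue["subject"].str.contains("maps", case=False, na=False)
    & catalogue["date"].between(1800, 1899)
]

# Keep only the columns a researcher asked for, and save a tidy themed dataset
subset[["title", "creator", "date", "shelfmark"]].to_csv(
    "researcher_format_maps_19c.csv", index=False
)
```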

Collections are huge (and resources relatively small) so we need to supplement manual cataloguing with other methods. Sometimes the work of crafting links from catalogues to external authorities and identifiers will be a machine job, with pieces sewn together at industrial speed via entity recognition tools that can pull categories out of text and images. Sometimes it’s done by a technologist who runs records through OpenRefine to find links to name authorities or Wikidata records. Sometimes it’s a labour of scholarly love, with links painstakingly researched, hand-tacked together to make sure they fit before they’re finally recorded in a bespoke database.
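
For a sense of what the ‘machine job’ or OpenRefine-style matching involves, here’s a rough sketch that queries Wikidata’s public wbsearchentities API for candidate matches to a name string. The helper function is my own illustration, not an existing tool:

```python
import requests

def wikidata_candidates(name, limit=3):
    """Return candidate Wikidata matches for a name string from a catalogue record."""
    response = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={
            "action": "wbsearchentities",  # Wikidata's public entity search
            "search": name,
            "language": "en",
            "format": "json",
            "limit": limit,
        },
    )
    return [
        (hit["id"], hit.get("description", ""))
        for hit in response.json().get("search", [])
    ]

# Candidates still need human review before being recorded against a record
print(wikidata_candidates("Ada Lovelace"))
```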

This linking work often happens outside the institution, so how can we ingest and re-use it appropriately? And if we’re to take advantage of computational methods and external enhancements, then we need ways to signal which categories were applied by cataloguers, which by software, which by external groups, and so on.

The workflow and interface adjustments required would be significant, but even more challenging would be the internal conversations and changes needed before a consensus on the best way to combine the work of cataloguers and computers could emerge.

The trick is to move from a collection of pieces to pieces of a collection. Every collection item was created in and about places, and produced by and about people. They have creative, cultural, scientific and intellectual properties. There’s a web of connections from each item that should be represented when they appear in datasets. These connections help make datasets more usable, turning strings of text into references to things and concepts to aid discoverability and the application of computational methods by scholars. This enables structured search across datasets – potentially linking an oral history interview with a scientist in the BL sound archive, their scientific publications in journals, annotated transcriptions of their field notebooks from a crowdsourcing project, and a published biography in the legal deposit library.

A lot of this work has already been done as authority files like AAT, ULAN etc are applied in cataloguing, so attention should now turn to converting local references into URIs and making the most of that investment.

Applying identifiers is hard – it takes expert care to disambiguate personal names, places and concepts, even with all the hinting that context-aware systems might provide as machine learning and related techniques improve. Catalogues can’t easily record possible attributions, and there’s understandable reluctance to publish an imperfect record, so progress on the backlog is slow. If we’re not to be held back by the need for records to be perfectly complete before they’re published, then we need to design systems capable of capturing the ambiguity, fuzziness and inherent messiness of historical collections, allowing qualified descriptors for possible links to people, places and so on. Then we need to explain the difference to users, so that they don’t rely too heavily on our descriptions, making assumptions about the presence or absence of information when it isn’t appropriate.

Image credit: http://europeana.eu/portal/record/2021648/0180_N_31601.html
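
One possible shape for such a system – purely a sketch, with illustrative field names rather than any existing BL schema – is to store candidate links alongside the transcribed string, each with a qualifier and a note of who or what asserted it:

```python
# Illustrative field names only – not an existing BL schema
record = {
    "title": "Field notebook, vol. 3",
    "creator": {
        "transcribed": "J. Smith",  # the string exactly as catalogued
        "candidates": [
            {
                "uri": "http://viaf.org/viaf/0000000",  # placeholder identifier
                "qualifier": "possibly",  # e.g. certain / probably / possibly
                "asserted_by": "software",  # cataloguer / software / external group
                "date_asserted": "2017-03-01",
            }
        ],
    },
}
```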

Photo of pipes over a building

A lot of what we need relies on more responsive infrastructure for workflows and cataloguing systems. For example, the BL’s systems are designed around the ‘deliverable unit’ – the printed or bound volume, the archive box – because for centuries the reading room was where you accessed items. We now need infrastructure that makes items addressable at the manuscript, page and image level in order to make the most of the annotations and links created to shared identifiers.

(I’d love to see absorbent workflows, soaking up any related data or digital surrogates that pass through an organisation, no matter which system they reside in or originate from. We aren’t yet making the most of OCRd text, let alone enhanced data from other processes, to aid discoverability or produce datasets from collections.)

Image credit: https://www.flickr.com/photos/snorski/34543357

My final thought – we can start small and iterate, which is just as well, because we need to work on understanding what users of collections data need and how they want to use them. We’re making a start and there’s a lot of thoughtful work behind the scenes, but perhaps a bit more investment is needed if research libraries are to become as comfortable with data users as they are with the readers who pass through their physical doors.

Keynote online: ‘Reaching out: museums, crowdsourcing and participatory heritage’

In September I was invited to give a keynote at the Museum Theme Days 2016 in Helsinki. I spoke on ‘Reaching out: museums, crowdsourcing and participatory heritage’. In lieu of my notes or slides, the video is below. (Great image, thanks YouTube!)

Network visualisations and the ‘so what?’ problem

This week I was in Luxembourg for a workshop on Network Visualisation in the Cultural Heritage Sector, organised by Marten Düring and held on the Belval campus of the University of Luxembourg.

In my presentation, I responded to some of the questions posed in the workshop outline:

In this workshop we want to explore how network visualisations and infrastructures will change the research and outreach activities of cultural heritage professionals and historians. Among the questions we seek to discuss during the workshop are for example: How do users benefit from graphs and their visualisation? Which skills do we expect from our users? What can we teach them? Are SNA [social network analysis] theories and methods relevant for public-facing applications? How do graph-based applications shape a user’s perception of the documents/objects which constitute the data? How can applications benefit from user engagement? How can applications expand and tap into other resources?

A rough version of my talk notes is below. The original slides are also online.

Network visualisations and the ‘so what?’ problem

Caveat

While I may show examples of individual network visualisations, this talk isn’t a critique of them in particular. There’s lots of good practice around, and these lessons probably aren’t needed for people in the room.

Fundamentally, I think network visualisations can be useful for research, but to make them more effective tools for outreach, some challenges should be addressed.

Context

I’m a Digital Curator at the British Library, mostly working with pre-1900 collections of manuscripts, printed material, maps, etc. Part of my job is to help people get access to our digital collections. Visualisations are a great way to firstly help people get a sense of what’s available, and then to understand the collections in more depth.

I’ve been teaching versions of an ‘information visualisation 101’ course at the BL and digital humanities workshops since 2013. Much of what I’m saying now is based on comments and feedback I get when presenting network visualisations to academics and cultural heritage staff (who should be a key audience for social network analyses).

Provocation: digital humanists love network visualisations, but ordinary people say, ‘so what’?

And this is a problem. We’re not conveying what we’re hoping to convey.

Network visualisation, via http://fredbenenson.com/

When teaching datavis, I give people time to explore examples like this, then ask questions like ‘Can you tell what is being measured or described? What do the relationships mean?’. After talking about the pros and cons of network visualisations, discussion often reaches a ‘yes, but so what?’ moment.

Here are some examples of problems ordinary people have with network visualisations…

Location matters

Spatial layout in network visualisations is usually pragmatic: nodes are placed by physics-style rules of attraction and repulsion so that everything fits on the screen. That doesn’t match what people expect to see. It’s really hard for some to let go of the idea that spatial layout has meaning – the idea that location on a page means something is very deeply linked to their sense of what a visualisation is.
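
If you want to demonstrate this in a workshop, a small sketch using the networkx library makes the point: lay out the same graph twice with different random seeds, and the coordinates change even though the network doesn’t.

```python
import networkx as nx

G = nx.karate_club_graph()  # a standard example network

# Two runs of the same force-directed layout with different random seeds
layout_a = nx.spring_layout(G, seed=1)
layout_b = nx.spring_layout(G, seed=2)

# Node 0's coordinates differ between runs, though the network is identical
print(layout_a[0], layout_b[0])
```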

Animated physics is … pointless?

People sometimes like the sproinginess when a network visualisation resettles after a node has been dragged, but waiting for the animation to finish can also be slow and irritating. Does it convey meaning? If not, why is it there?

Size, weight, colour = meaning?

The relationship between size, colour and weight isn’t always intuitive – people assume meaning where there might be none.

In general, network visualisations are more abstract than people expect a visualisation to be.

‘What does this tell me that I couldn’t learn as quickly from a sentence, list or table?’

Table of data, via http://fredbenenson.com/

Scroll down the page that contains the network graph above and you get other visualisations. Sometimes they’re much more positively received, particularly when people feel they learn more from them than from the network visualisation.

Onto other issues with ‘network visualisations as communication’…

Which algorithmic choices are significant?

Mike Bostock’s force-directed and curved line versions of character co-occurrence in Les Misérables

It’s hard for novices to know which algorithmic and data-cleaning choices are significant, and which have a more superficial impact.

Untethered images

Images travel extremely well on social media. When they do so, they often leave their context behind and end up floating in space. Who created this, and why? What world view does it represent? What source material underlies it, and how was it manipulated to produce the image? Can I trust it?

‘Can’t see the wood for the trees’

Screenshot: Viral Texts project visualisation

When I showed this to a class recently, one participant was frustrated that they couldn’t ‘see the wood for the trees’. The visualisation gives a general impression of density, but it’s not easy to dive deeper into detail.

Stories vs hairballs

But when I started to explain what was being represented – the ways in which stories were copied from one newspaper to another – they were fascinated. They might have found their way there if they’d read the text but again, the visualisation is so abstract that it didn’t hint at what lay underneath. (Also I have only very, very rarely seen someone stop to read the text before playing with a visualisation.)

No sense of change over time

This flattening of time into one simultaneous moment is a bigger problem for historical networks than for literary ones, but even so, you might want to compare relationships between sections of a literary work.

No sense of texture, detail of sources

All network visualisations look similar, whether they’re about historical texts or cans of baked beans. Dots and lines mask texture, and don’t always hint at the depth of information they represent.

Jargon

Node. Edge. Graph. Directed, undirected. Betweenness. Closeness. Eccentricity.

There’s a lot to take on to really understand what’s being expressed in a network graph.

There is some hope…

Onto the positive bit!

Interactivity is engaging

People find the interactive movement, the ability to zoom and highlight links, engaging – even if they have no idea what’s being expressed. In class, people started to come up with questions about the data as I told them more about what was represented. That moment of curiosity is an opportunity, if they can dive in and start to explore what’s going on and what the relationships mean.

…but different users have different interaction needs

For some, there’s the frustration expressed earlier: they ‘can’t get to see a particular tree’ in the dense woods of a network visualisation. People often want to get to the detail of an instance of a relationship – the lines of text, images of the original document – from a graph.

This mightn’t be how network visualisations are used in research, but it’s something to consider for public-facing visualisations. How can we connect abstract lines or dots to detail, provide more information about what a relationship means, or show the underlying quantification as people highlight or filter parts of a graph? A harder, but more interesting, task is hinting at the texture or detail of those relationships.

Proceed, with caution

One of the workshop questions was ‘Are social network analysis theories and methods relevant for public-facing applications?’ – and maybe the answer is a qualified yes. As a working tool, they’re great for generating hypotheses, but they need a lot more care before exposing them to the public.

[As an aside, I’d always taken the difference between visualisations as working tools for exploring data – part of the process of investigating a research question – and visualisation as an output – a product of the process, designed for explanation rather than exploration – as fundamental, but maybe we need to make that distinction more explicit.]

But first – who are your ‘users’?

During this workshop, at different points we may be talking about different ‘users’ – it’s useful to scope who we mean at any given point. In this presentation, I was talking about end users who encounter visualisations, not scholars who may be organising and visualising networks for analysis.

Sometimes a network visualisation isn’t the answer … even if it was part of the question.

As an outcome of an exploratory process, network visualisations are not necessarily the best way to present the final product. Be disciplined – make yourself justify the choice to use network visualisations.

No more untethered images

Include an extended caption – data source, tools and algorithms used. Provide a link to find out more – why this data, this form? What was interesting but not easily visualised? Let people download the dataset to explore themselves?

Present visualisations as the tip of the data iceberg

Visualisations are the tip of the iceberg

Lots of interesting data doesn’t make it into a visualisation. Talking about what isn’t included and why it was left out is important context.

Talk about data that couldn’t exist

Beyond the (fuzzy, incomplete, messy) data that’s left out because it’s hard to visualise, data that never existed in the first place is also important:

‘because we’re only looking on one axis (letters), we get an inflated sense of the importance of spatial distance in early modern intellectual networks. Best friends never wrote to each other; they lived in the same city and drank in the same pubs; they could just meet on a sunny afternoon if they had anything important to say. Distant letters were important, but our networks obscure the equally important local scholarly communities.’
Scott Weingart, ‘Networks Demystified 8: When Networks are Inappropriate’

Help users learn the skills and knowledge they need to interpret network visualisations in context.

How? Good question! This is the point at which I hand over to you…

The good, the bad, and the unstructured… Open data in cultural heritage

I was in London this week for the Linked Pasts event, where I presented on trends and practices for open data in cultural heritage. Linked Pasts was a colloquium put on by the Pelagios project (Leif Isaksen, Elton Barker and Rainer Simon with Pau de Soto). I really enjoyed the other papers, which included thoughtful, grounded approaches to structured data for historical periods, places and people, recognition of the importance of designing projects around audience needs (including user research), the relationship between digital tools and scholarly inquiry, visualisations as research tools, and the importance of good infrastructure for digital history.

My talk notes are below the embedded slides.


Warning: generalisations ahead.

My discussion points are based on years of conversations with other cultural heritage technologists in museums, libraries, and archives, but inevitably I’ll have blind spots. For example, I’m focusing on the English-speaking world, which means I’m not discussing the great work that Dutch and Japanese organisations are doing. I’ve undoubtedly left out brilliant specific examples in the interests of focusing on broader trends. The point is to start conversations, to bring issues out into the open so we can collectively decide how to move forward.

The good

The good news is that more and more open cultural data is being published. Organisations have figured out that a) nothing bad is likely to happen and that b) they might get some kudos for releasing open data.

Generally, organisations are publishing the data that they have to hand – this means it’s mostly collections data. This data is often as messy, incomplete and fuzzy as you’d expect from records created by many different people using many different systems over a hundred or more years.

…the bad…

Copyright restrictions mean that images mightn’t be included. Furthermore, because it’s often collections data, it’s not necessarily rich in interpretative information. It’s metadata rather than data. It doesn’t capture the scholarly debates, the uncertain attributions, the biases in collecting… It certainly doesn’t capture the experience of viewing the original object.

Licensing issues are still a concern. Until cultural organisations are rewarded by their funders for releasing open data, and funders free organisations from the expectation that they monetise their data, there will be damaging uncertainty about the opportunity cost of open data.

Non-commercial licences are also an issue – organisations and scholars might feel exploited if others who contributed nothing to the process of creating their data can publish it commercially. Finally, attribution is an important currency for organisations and scholars, but most open licences aren’t designed with that in mind.

…and the unstructured

The data that’s released is often pretty unstructured. CSV files are very easy to use, so they help more people get access to information (assuming they can figure out GitHub), but a giant dump like this doesn’t provide stable URIs for each object. Records in data dumps rarely link to external identifiers like the Getty’s Thesaurus of Geographic Names, Art & Architecture Thesaurus (AAT) or Union List of Artist Names, or vernacular sources for place and people names such as Geonames or DBPedia. And that’s fair enough, because people using a CSV file probably don’t want all the hassle of dereferencing each URI to grab the place name so they can visualise data on a map (or whatever they’re doing with the data). But it also means it’s hard for someone to reliably find matching artists in their own database, or to link these records with data from other organisations.
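
To show what ‘dereferencing each URI to grab the place name’ actually involves – one HTTP request per identifier – here’s a minimal sketch using Wikidata’s entity data endpoint (a Geonames version would need an API username):

```python
import requests

def wikidata_label(qid, lang="en"):
    """Fetch the human-readable label for a Wikidata identifier like 'Q84'."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    data = requests.get(url).json()
    return data["entities"][qid]["labels"][lang]["value"]

print(wikidata_label("Q84"))  # 'London' – and that's one request per record
```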

So it’s open, but it’s often not very linked. If we’re after a ‘digital ecosystem of online open materials’, this open data is only a baby step. But it’s often where cultural organisations finish their work.

Classics > Cultural Heritage?

But many others, particularly in the classical and ancient world, have managed to overcome these issues to publish and use linked open data. So why do museums, libraries and archives seem to struggle? I’ll suggest some possible reasons as conversation starters…

Not enough time

Organisations are often busy enough keeping their internal systems up and running, dealing with the needs of visitors in their physical venues, working on ecommerce and picture library systems…

Not enough skills

Cultural heritage technologists are often generalists, and apart from being too time-stretched to learn new technologies for the fun of it, they might not have the computational or information science skills necessary to implement the full linked data stack.

Some cultural heritage technologists argue that they don’t know of any developers who can negotiate the complexities of SPARQL endpoints, so why provide one? The complexity is multiplied when complex data models are used with complex (or at least unfamiliar) technologies. For some, SPARQL puts the ‘end’ in ‘endpoint’, and ‘RDF triples’ can seem like an abstraction too far. In these circumstances, the instruction to provide linked open data as RDF is a barrier they won’t cross.
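
For readers who haven’t met one: this is roughly what using a SPARQL endpoint looks like. The sketch below queries Wikidata’s public endpoint, since any institutional endpoint would have its own vocabulary, and even this minimal query is a different way of thinking from ‘download the CSV’:

```python
import requests

# Works created by Vincent van Gogh (wd:Q5582), via the 'creator' property (wdt:P170)
query = """
SELECT ?work ?workLabel WHERE {
  ?work wdt:P170 wd:Q5582 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
)
for row in response.json()["results"]["bindings"]:
    print(row["workLabel"]["value"])
```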

But sometimes it feels as if some heritage technologists are simply allergic to complexity. Avoiding unnecessary complexity is sensible, but progress can stall if they demand that everything remain simple enough for them to feel comfortable. Some technologists might benefit from working with people more used to thinking about structured data, such as cataloguers, registrars and so on. Unfortunately, linked open data falls into the gap between the technical and informatics silos that often exist in cultural organisations.

And organisations are also not yet using triples or structured data provided by other organisations. They’re publishing data in broadcast mode; it’s not yet a dialogue with other collections.

Not enough data

In a way, this is the collections documentation version of the technical barriers. If the data doesn’t already exist, it’s hard to publish. If it needs work to pull it out of different departments, or different individuals, who’s going to resource that work? Similarly, collections staff are unlikely to have time to map their data to CIDOC-CRM unless there’s a compelling reason to do so. (And some of the examples given might use cultural heritage collections but are a better fit for the work of researchers outside the institution than for the institution’s own work.)

It may be easier for some types of collections than others – art collections tend to be smaller and better described; natural history collections can link into international projects for structured data; and libraries can share cataloguing data. Classicists have also been able to get a critical mass of data together. Your local records office or small museum may have more heterogeneous collections, and there are fewer widely used ontologies or vocabularies for historical collections. The nature of historical collections means that ‘small ontologies, loosely joined’ may be more effective, but creating these, or mapping collections to them, is still a large piece of work. While there are tools for mapping to data structures like Europeana’s data model, it seems the reasons for doing so haven’t been convincing enough, so far. Which brings me to…

Not enough benefits

This is an important point, and an area the community hasn’t paid enough attention to in the past. Too many conversations have jumped straight to discussion about the specific standards to use, and not enough have been about the benefits for heritage audiences, scholars and organisations.

Many technologists – who are the ones making decisions about digital standards, alongside the collections people working on digitisation – are too far removed from the consumers of linked open data to see the benefits of it unless we show them real world needs.

There’s a cost in producing data for others, so it needs to be linked to the mission and goals of an organisation. Organisations are not generally able to prioritise the potential, future audiences who might benefit from tools someone else creates with linked open data when they have so many immediate problems to solve first.

While some cultural and historical organisations have done good work with linked open data, the purpose can sometimes seem rather academic. Linked data is not always explained in a way that convinces the average, over-worked collections or digital team that the benefits outweigh the financial and intellectual investment.

No-one’s drinking their own champagne

You don’t often hear of people beating on the door of a museum, library or archive asking for linked open data, and most organisations have yet to map their data to specific, widely used vocabularies because their own work requires it. If technologists in the cultural sector are isolated from people working with collections data and/or research questions, then it’s hard for them to appreciate the value of linked data for research projects.

The classical world has benefited from small communities of scholar-technologists – so they’re not only drinking their own champagne, they’re throwing parties. Smaller, more contained collections of sources and research questions help create stronger connections and give people a reason to link their sources. And as we’re learning throughout the day, community really helps motivate action.

(I know it’s normally called ‘eating your own dog food’ or ‘dogfooding’ but I’m vegetarian, so there.)

Linked open data isn’t built into collections management systems

Building linked open data into collections management systems would make publishing linked data an automatic part of sharing data online.

Chicken or the egg?

So it’s all a bit ‘chicken or the egg’ – will it stay that way? Until there’s a critical mass, probably. These conversations about linked open data in cultural heritage have been going around for years, but looking back also shows how far we’ve come.

[And if you’ve published open data from cultural heritage collections, linked open data on the classical or ancient world, or any other form of structured data about the past, please add it to the wiki page for museum, gallery, library and archive APIs and machine-readable data sources for open cultural data.]

Drink your own champagne! (Nasjonalbiblioteket image)

All the things I didn’t say in my welcome to UKMW14 ‘Museums beyond the web’…

Here are all the things I (probably) didn’t say in my Chair’s welcome for the Museums Computer Group annual conference… Other notes, images and tweets from the day are linked from ‘UKMW14 round-up: posts, tweets, slides and images’.

Welcome to MCG’s UKMW14: Museums beyond the web! We’ve got great speakers lined up, and we’ve built in lots of time to catch up and get to know your peers, so we hope you’ll enjoy the day.

It’s ten years since the MCG’s Museums on the Web became an annual event, and thirteen years since it was first run in 2001. It feels like a lot has changed since then, but, while the future is very definitely here, it’s also definitely not evenly distributed across the museum sector. It’s also an interesting moment for the conference, as ‘the web’ has broadened to include ‘digital’, which in turn spans giant distribution networks and tiny wearable devices. ‘The web’ has become a slightly out-dated shorthand term for ‘audience-facing technologies’.

When looking back over the last ten years of programmes, I found myself thinking about planetary orbits. Small planets closest to the sun whizz around quickly, while the big gas giants move incredibly slowly. If technology start-ups are like Mercury, completing a year in just 88 Earth days, and our audiences are firmly on Earth time, museum time might be a bit closer to Mars, taking two Earth years for each Mars year, or sometimes even Jupiter, completing a circuit once every twelve years or so.

But museums aren’t planets, so I can only push that metaphor so far. Different sections of a museum move at different speeds. While heroic front of house staff can observe changes in audience behaviours on a daily basis and social media platforms can be adopted overnight, websites might be redesigned only every few years, and galleries are updated every few decades (if you’re lucky). For a long time it felt like museums were using digital platforms to broadcast at audiences without really addressing the challenges of dialogue or collaborating with external experts.

But at this point, it seems that, finally, working on digital platforms like the web has pushed museums to change how they work. On a personal level, the need for specific technical skills hasn’t changed, but more content, education and design jobs work across platforms, are consciously ‘multi-channel’, and are audience-centred rather than platform-centred in focus. Web teams seem to be settling into public engagement, education, marketing and similar departments as the idea of a ‘digital’ department slowly becomes an oxymoron. Frameworks from software development are slowly permeating organisations that used to think in terms of print runs and physical gallery construction. Short rounds of agile development are replacing the ‘build and abandon after launch’ model, voices from a range of departments are replacing the disembodied expert voice, and catalogues are becoming publications that change over time.

While many of us here are comfortable with these webby methods, how will we manage the need to act as translators between digital and museums while understanding the impact of new technologies? And how can we help those who are struggling to keep up, particularly with the impact of the cuts?

Today is a chance to think about the technologies that will shape the museums of the future. What will audiences want from us? Where will they go looking for information and expertise, and how much of that information and expertise should be provided by museums? How can museums best provide access to their collections and knowledge over the next five, ten years?

We’re grateful to our sponsors, particularly as their support helps keep ticket prices affordable. Firstly I’d like to thank our venue sponsors, the Natural History Museum. Secondly, I’d like to thank Faversham & Moss for their sponsorship of this conference. Go chat to them and find out more about their work!

Opening notes for Museums on the Web 2013: ‘Power to the people’

It’ll take me a few days to digest the wonderfulness that was MCG’s UK Museums on the Web 2013: ‘Power to the people’, so in lieu of a summary, here are my opening notes for the conference… (With the caveat that I didn’t read this but still hopefully hit most of these points on the day).

Welcome to Museums on the Web 2013! I’m Mia Ridge, Chair of the Museums Computer Group.

Hopefully the game that began at registration has helped introduce you to some people you hadn’t met before… You can vote on the game in the auditorium over the lunch break, and the winning team will be announced before the afternoon tea break. Part of being a welcoming community is welcoming others, so we tried to make it easier to start conversations. If you see someone who maybe doesn’t know other people at the event, say hi. I know that many of you can feel like you’re working alone, even within a big organisation, so use this time to connect with your peers.

This week saw the launch of a report written for Nesta, the Arts Council, and the Arts and Humanities Research Council in relation to the Digital R&D Fund for the Arts, ‘Digital Culture: How arts and cultural organisations in England use technology’. One line in the report stood out: ‘Museums are less likely than the rest of the sector to report positive impacts from digital technologies’ – which seems counter-intuitive given what I know of museums making their websites and social media work for them, and the many exciting and effective projects we’ve heard about over the past twelve years of MCG’s UK Museums on the Web conferences (and on our active discussion list).

The key to that paradox may lie in another statement in the report: museums report ‘lower than average levels of digital expertise and empowerment from their senior management and a lower than average focus on digital experimentation, and research and development’.* (It may also be that a lot of museum work doesn’t fit into an arts model, but that’s a conversation for another day.) Today’s theme almost anticipates this – our call for papers around ‘Power to the people’ asked for responses around the rise of director-level digital posts and empowering museum staff to learn through play, as well as papers on grassroots projects and the power of embedding digital audience participation and engagement into the overall public engagement strategy for a museum.

Today we’ll be hearing about great projects from museums and a range of other organisations, but reports like this – and perhaps the wider issue of whether senior management and funders understand the potential of digital beyond new forms of broadcast and ticket sales – raise the question of whether we’re preaching to the converted. How can we help others in museums benefit from the hard-won wisdom and lessons you’ll hear today?

The Museums Computer Group has always been a platform for people working with museum technology who want to create positive change in the sector: our motto is ‘connect, support, inspire’, and we’re always keen to hear your ideas about how we can connect with, support and inspire you. But as a group we should also be asking: how can we share our knowledge and experience with others? It can be difficult to connect with and support others when you’re flat out with your own work, yet the need to scale up the kinds of education we might once have done with small groups working on digital projects is becoming more urgent as audience expectations change and resources need to be spent even more carefully. Ultimately we can help each other by helping the sector get better at technology and recognise the different types of expertise already available within the heritage sector. Groups like the MCG can help bridge the gap; we need your voices to reach senior management as well as practitioners and those who want to work with museums who’ll shape the sector in the future.

It’s rare to find a group so willing to share their failures alongside their successes, so willing to generously share their expertise and so keen to find lessons in other sectors. We appreciate the contributions of many of you who’ve spoken honestly about the successes and failures of your projects in the past, and applaud the spirit of constructive conversation that encourages your peers to share so openly and honestly with us. I’m looking forward to learning from you all today.

* Update to add a link to an interview with MTM’s Richard Ellis, co-author of the Nesta report, who says the ‘sheer extent of the divide between those in the know and those not’ was one of the biggest surprises of working in the culture sector.

‘Digital challenges, digital opportunities’ at MCGPlay, Belfast

These are my rough notes for my talk on ‘Digital challenges, digital opportunities’ at the Museums Computer Group’s Spring event, ‘Engaging Visitors Through Play’ (or #MCGPlay). My aim was to introduce the Museums Computer Group, discuss some of the challenges museums and their staff are facing, and think about how to create opportunities from those challenges. I’ve posted my notes about the other talks at MCGPlay at ‘Engaging Visitors Through Play’ – the Museums Computer Group in Belfast.

Play testing Alex’s game at #MCGPlay

I started with some information about the MCG – our mission to connect, support and inspire people working with museum technology (whether technologists, curators, academics, directors or documentation staff) and how that informs the events we run and platforms like our old-school but effective mailing list, whose members can between them answer almost any museumy question you can think of. As a practitioner-led group of volunteers, the MCG can best fulfil its mission by acting as a platform, and with over 1000 members on our mailing list and hundreds of attendees at events, we can help people in the sector help and inspire each other in a mutually supportive space. We’ve also been involved in projects like the Semantic Web Think Tank (2006-2007), Mashed Museum hack days (2007, 2008) and LIVE!Museum (2009-2010). Apparently list discussions even inspired Culture24’s Let’s Get Real analytics project! In response to surveys of our members we’re experimenting with more regional events, and with event formats like the ‘Failure Swapshop’ we trialled earlier this week and #drinkingaboutmuseums after the conference. (On a personal note, reviewing our history and past events was a lovely excuse to reflect on the projects and events the MCG community has been involved in, and also to marvel at how young familiar faces looked at past events.)

I’d reviewed the MCG list subject lines over the past few months to get a sense of the challenges or questions that digital museum people were facing:

  • Finding good web design/SEO/evaluation/etc agencies, finding good staff
  • The emergence of ‘head of digital’ roles
  • Online collections, managing digital assets; integration with Collections Management Systems and other systems
  • Integrating Collections Management Systems and 3rd party platforms like WordPress
  • Storytelling to engage the public
  • Museum informatics: CIDOC-CRM and other linked open data topics
  • ‘Create once, publish everywhere’ – can re-usable content really work?
  • Online analytics
  • Digital 3D objects – scanning, printing
  • Measuring the impact of social media
  • MOOCs (online courses)
  • Google Cultural Institute, Google Art Project, Artsy, etc
  • 3rd party tools – PayPal, Google Apps
  • Mobile – apps, well-designed experiences
  • Digital collections in physical exhibitions spaces
  • Touch tables/large-scale interactives
  • The user experience of user-generated content / co-produced exhibitions

Based on those, discussions at various meetings and reviews from other conferences, I pulled out a few themes in museum conversations:

  • ‘Strategically digital’ – the topic of many conversations over the past few years, including MCG’s Museums on the Web 2012, which was partly about saying that the best solution for a project might not involve technology. Being ‘strategically digital’ offers some solutions to the organisational change issues raised by the mismatch between web speed and museum speed, and it means technology decisions should always refer back to a museum’s public engagement strategy (or infrastructure plans for background ICT services).
  • Mobile – your museum’s website probably has over 20% mobile visitors, so if you’re not thinking about the quality of their experience, you may be driving away business.
  • Immersive, challenging experiences – the influence of site-specific theatre, alternative reality games and transmedia experiences, the ever-new value of storytelling…
  • High-quality services integrated across the whole museum – new terms like service design and design thinking are taking over from the old refrain of user-centred design, and going beyond it to test how the whole organisation appears to the customer – does it feel like a seamless, pleasurable (or at least not painful) experience? Museums are exploring new(ish) ways of thinking to solve old problems. As with mobile sites, you should be designing around your audiences’ needs, not your internal structures and complications.
  • Audience participation and engagement – we’ll hear about games over the day, but also think about crowdsourcing, asking the audience to help with tasks or share their knowledge with you.

And a few more challenges:

  • New models of authority and expertise – museum authority is challenged not only by audiences expecting to ‘curate’ their own experience but also by younger staff or people who’ve come from other sectors and have their own ideas about digital projects.
  • Constantly changing audience expectations – if you’ve ever seen kids smoosh their hands on a screen because they expect it to zoom in response to their touch, you’ll know how hard it is to keep up with consumer technologies. Expectations about the quality of the experience and the quality of the technology are always changing based on films, consumer products and non-museum experiences.
  • ‘Doing more with less’ (and then less again)
  • Figuring out where to ask for help – it can be hard to find your way through the jargon and figure out what language to use
  • Training and personal development – job swaps or mentoring might supplement traditional training

There’ll always be new things to learn, and new challenges, so find supportive peers to learn with. The MCG community is one of the ways that people can learn from each other, but the museum sector is full of smart people who are generous with their time and knowledge. Run a discussion group or seminar series over lunch or in the pub, even if you have to rope in other local organisations to make it happen, join mailing lists, find blogs to follow, look for bursaries to get to events. The international Museums and the Web past papers are an amazing resource, and Twitter hashtags can be another good place to ask for help (check out Dana Allen-Greil’s ‘Glossary of Museum-Related Hashtags’ for US-based pointers).

I finished by saying that despite all the frustrations, it’s an amazing time to work in or study the sector, so enjoy it! We shouldn’t limit ourselves to engaging audiences in play when we could be engaging ourselves in play.

Museums Computer Group: connect, support, inspire me