Happy developers + happy museums = happy punters (my JISC dev8D talk)

This is a rough transcript of my lightning talk 'Happy developers, happy museums' at JISC's dev8D 'developer happiness' days last week. The slides are downloadable or embedded below. I'm posting this because I'd still love to hear comments, ideas and suggestions, particularly from developers outside the museum sector – there's a contact form on my website, or leave a comment here.

"In this talk I want to show you where museums are in terms of data and hear from you on how we can be more useful.

If you're interested in updates I use my blog to [crap on a bit, ahem] talk about development at work, and also to call for comment on various ideas and prototypes. I'm interested in making the architecture and development process transparent, in being responsive to not only traditional museum visitors as end users, but also to developers. If you think of APIs as a UI for developers, we want ours to be both usable and useful.

I really like museums, I've worked in three museums (or families of museums) now over ten years. I think they can do really good things. Museums should be about delight, serendipity and answers that provoke more questions.

A recent book, 'How does one become a scientist? : survey on the birth of a Vocation' states that '60% of scientists over 30 and 40% of scientists under 30 claim, without prompting, that the Palais de la Découverte [a science museum in Paris] triggered their vocation'.

Museums can really have an impact on how people think about the world, how they think about the possibilities of their lives. I think museums also have a big responsibility – we should be curating collections for current and future audiences, but also trying to provide access to the collections that aren't on display. We should be committed to accessibility, transparency, curation, respecting and enabling expertise.

So today I'm here because we want to share our stuff – we are already – but we want to share better.

We do a lot of audience research and know a lot about some of our users, including our specialist users, but we don't know so much about how people might use our data, it's a relatively new thing for us. We're used to saying 'here are objects in a case, interpretation in label', we're not used to saying 'here's unmediated access, access through the back door'.

Some of the challenges for museums: technology isn't that much of a challenge for us on the whole, except that there are pockets of excellence, people doing amazing things on small budgets with limited resources, but there are also a lot of old-fashioned monolithic project designs with big overheads that take a long time to deliver. Lots of people mean well but don't know what's possible – I want to spread the news about lightweight, more manageable and responsive ways of developing things that make sense and deliver results.

We have a lot of data, but a lot of it's crap. Some of what we have is wrong. Some of it was written 100 years ago, so it doesn't match how we'd describe things now.

We face big institutional challenges. Some curators (though it does depend on the museum) fear loss of control and intellectual vandalism, and fear that mistakes in user-generated content published on museum sites will cause people to lose trust in museums. We have fears of getting the IT wrong (because for a while we did). Funding and metrics are a big issue – we are paid by how many people come through our door or come to our websites. If we're doing a mashup, how do we measure the usage of that? Are we going to cost our organisations money if we can't measure visits and charge back to the government? [This is particularly an issue for free museums in the UK, an interesting by-product of funding structures.]

Copyright is a huge issue. We might not even own an object that appears in our collections, we might not own the rights to the image of our object, or to the reproductions of an image. We might not have asked for copyright clearance at the time when an object was donated, and the cost of tracing it might be too high, so we can't use that object online. Until we come up with a reliable model that reduces the risk to an institution of saying 'copyright unknown', we're stuck.

The following are some ways I can think of for dealing with these challenges…
Limited resources – we can't build an interface to meet every need for every user, but we can provide the content that they'd use. Some of the semantic web talks here have discussed a 'thin layer' of application over data, and that's kind of where we want to go as well.

Real examples of working agile projects, to reduce institutional fear. [I didn't mean strictly 'agile' methodology but generally projects that deliver early and often and can respond to the changing technical and social environment]

Finding ways for the sector to reward intelligent failure. Some museums will never ever admit to making a mistake. I've heard over the past few days that universities can be the same. Projects that are hyped up suddenly aren't mentioned, and presumably they've failed, but no-one [from the project] ever talks about why, so we don't learn from those mistakes. 'Fail faster, succeed sooner'.
I'd like to hear suggestions from you on how we could deal with those challenges.

What are museums known for? Big buildings, full of stuff; experts; we make visitors come to us; we're known for being fun; or for being boring.

Museum websites traditionally appear to be about where we are, when we're open, what's on, is there a cafe on site. Which is useful, but we can do a lot more.

Traditionally we've done pretty exhibition microsites, which are nice – they provide an experience of the exhibition before or after your visit. They're quite marketing-led, they don't necessarily provide an equivalent experience and they don't really let you engage with the content beyond the fact that you're viewing it.

We're doing lots of collections online projects, some of these have ended up being silos – sometimes to the extent if we want to get data out of them, we have to screen-scrape our own data. These sites often aren't as pretty, they don't always have the same design and usability budgets (if any).

I think we should stick to what we're really good at – understanding the data (collections), understanding how to mediate it, how to interpret it, how to select things that are appropriate for publication, and maybe open it up to other people to do the shiny pretty things. [Sounds almost like I'm advocating doing myself out of a job!]

So we have lots of objects, images, lots of metadata; our collections databases also include people, events, dates, places, businesses and organisations, lots of qualified information around things like dates, they're not necessarily simple fields but that means they can convey a lot more meaning. I've included that because people don't always realise we have information beyond objects and object metadata. This slide [11 below] is an example of one of the challenges – this box of objects might not be catalogued as individual instruments, it might just be catalogued as a 'box of stuff', which doesn't help you find the interesting objects in the box. Lots of good stuff is hidden in this way.

We're slowly getting there. We're opening up access. We're using APIs internally to share data between gallery interactives and the web, we're releasing them as data points, we're using them to provide direct access to collections. At the moment it still tends to be quite mediated access, so you're getting a lot of interpretation and fewer objects because of the resources required to create really nice records and the information around them.

'Read access' is relatively easy, 'write access' is harder because that's when we hit those institutional issues around authority, authorship. Some curators are vaguely horrified that they might have to listen to what the public have to say and actually take some of it back into their collections databases. But they also have to understand that they can't know everything about their collections, and there are some specialist users who will know everything there is to know about a particular widget on a particular kind of train. We'd like to capture that knowledge. [London Transport Museum have had a good go at that.]

Some random URLs of cool stuff happening in museums [http://dashboard.imamuseum.org/, http://www.powerhousemuseum.com/collection/database/menu.php, http://www.brooklynmuseum.org/opencollection/collections/, http://objectwiki.sciencemuseum.org.uk/] – it's still very much in small pockets, it's still difficult for museum staff to convince people to take what seems like a leap of faith and try these non-traditional things out.

We're taking our content to where people hang out. We're exploring things like Flickr Commons, asking people to tag and comment. Some museums have been updating collections records with information added by the public as a result. People are geo-tagging photos for us, which means you can do 'then and now' mashups without a big metadata enhancement budget.

I'd like to see an end to silos. We are kinda getting there but there's not a serious commitment to the idea that we need to let things go, that we need to make sure that collections online are shareable, that they're interoperable, that they can mesh with other things.

Particularly for an education audience, we want to help researchers help themselves, to help developers help others. What else do we have that people might find useful?

What we can do depends on who you are. I could hope that things like enquiry-based learning, mashups, linked data, semantic web technologies, cross-collections searches and faceted browsing to make complex searches easy would be useful, and that the concept of museums as a place where information lives – a happy home for metadata mapped around objects and authority records – is useful for people here, but I wouldn't want to put words into your mouths.

There's a lot we can do with the technology, but if we're investing resources we need to make sure that they're useful. I can try things in my own time because it's fun, but if we're going to spend limited resources on interfaces for developers then we need to know that it's actually going to help some group of people out there.

The philosophy that I'm working with is 'we've got really cool things, but we can have even cooler things if we can share what we have with everyone else'. "The coolest thing to do with your data will be thought of by someone else". [This quote turns out to be on the event t-shirts, via CRIG!] So that said… any ideas, comments, suggestions?"

And that, thankfully, is where I stopped blathering on. I'll summarise the discussion and post back when I've checked that people are ok with me blogging their comments.

[If the slide show below has a brown face on a black background, it's the right one – slideshare's embed seems to have had a hiccup. If it's not that, try viewing it online directly.]

[My slide images include the Easter Egg museum in Kolomyya, Ukraine and 'Laughter in Odd Places' event at the Museum of London.]

This is a quick dump of some of the text from an interview I did at the event, cos I managed to cover some stuff I didn't quite articulate in my talk:

[On challenges for museums:] We need to change institutional priorities to acknowledge the size of the online audience and the different levels of engagement that are possible with the online experience. Having talked to people here, museums also need to do a bit of a sell job in letting people know that we've changed and we're not just great big imposing buildings full of stuff.

[What are the most exciting developments in the museum sector, online?] For digital collections, going outside the walls of the museum using geo-location to place objects in their original context is amazing. It means you can overlay the streets of the city with past events and lives. Outsourcing curation and negotiating new models of expertise is exciting. Overcoming the fear of the digital surrogate as a competitor for museum visits and understanding that everything we do builds audiences, whether digital or physical.

Are blog awards missing the point?

And not only that, but why did I feel disconcerted when this blog was nominated for an award back in April? (I didn't win, but that wasn't surprising.) I kept meaning to post back with the results, but I hadn't yet managed to articulate how I felt about it.

Today Paul Walk blogged 'I think I might be allergic to lists and awards', which sums up a lot of my inchoate thoughts. In posting a comment I realised that I found being reminded that I have an audience a bit disconcerting. I also realised that the value of this blog for me is the chance to learn more, not only during the process of writing a post but also during the online and offline discussions that follow.

Anyway, go read Paul's post. Tony Hirst also makes an interesting point in his comment – awards may act as 'crossover' that introduces non-blog-readers to the value of blogs.

Social Media Statistics

One of those totally brilliant and obvious-in-hindsight ideas. I'd like to see stronger guidelines on citing sources as it grows and clear differentiation by region/nation, because it's easy for vague figures and rumour to become universal 'fact', but it's a great idea and will hopefully grow: Social Media Statistics is:

A big home for all facts and figures around social media – because I'm fed up of trawling around for them and I'm also sure that I'm not the only one who gets asked 'how many users does Facebook have?' every hour of every day. … I'm hoping that this wiki will not only include usage stats, but also behaviour and attitude stats. It's a bit of a skeleton at the moment, with v few of my stats having stated sources, but be patient – and help where you can!

Please add in any juicy stats as you come across them, and do cite your references and link to them where possible.

I'll put my money where my mouth is and add information I find. I find wikis a really useful tool for lightweight documentation – it's really easy to add some information while it's in your brain, and the software doesn't get in the way of your flow.

For a while now I've wanted a repository of museum and cultural heritage audience evaluation – this could be a good model. Speaking of which, I really must write up my notes from the MCG Autumn meeting.

[Edit to add: Social Media Statistics also links to Measurementcamp, which might be of interest to cultural heritage organisations wondering how they can 'measure their social media communications online and offline' (and how they can work with project sponsors and funders to define suitable metrics for an APId, social media world).]

Next-generation approaches at 'UK Museums on the Web Conference 2008'

Session 3, 'Next-generation approaches', of the UK Museums on the Web Conference 2008 was introduced by Jon Pratty.

Jon asked, 'What is a virtual museum? It can be pretty much anything. Lots of valuable historical documents aren't in an 'online museum', they're just out there to be found by search. It raises the question – how much permanence should digital objects have?'

George Oates, 'Sharing museum collections through Flickr'
Introducing the Flickr Commons project and talking about some early results. Some practical information on what it means to join the program, and things that have come out of it.

Flickr 'swerved in from left field' and bumped into museum people and librarians and archivists.

It started with Library of Congress thinking about how to engage with Web 2.0. They were looking for a Web 2.0 partner. They have 14 million images, about a million digitised.

Flickr is designed specifically to search and browse photos. It has a big infrastructure and supports interfaces in 8 languages. It has lots of eyeballs – "it's made of people".

From the Commons point of view, it's simply a service, organisations can publish content into it.

They hit a hurdle: can a collecting institution publish content onto a site like Flickr? As a collecting institution, someone like the Library of Congress doesn't necessarily own the copyright or know who the copyright holder was. They devised a new statement – 'no known copyright restrictions' – which gives an institution a way to publish content once it has done as much work as it can to trace copyright, even if it hasn't been able to trace the copyright holders.

Might open up to other sorts of content.

What's it for? Increase access to public photography collections; gather context about them, [something else I missed].

Powerhouse – lots of the collection was geo-tagged. It means you can find photos from then and now, for example around the CBD of Sydney. [Cool! I love the way geo-tagging content lets you build up layers of history]
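A minimal sketch of how a 'then and now' pairing could work once photos carry geo-tags: compute the great-circle (haversine) distance between each historic photo and each recent one, and keep the recent photos that fall within a small radius. This is only an illustration of the idea, not how Flickr or the Powerhouse do it; the record layout, the ids, the coordinates and the 200 m radius are all made up for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def then_and_now(historic, recent, max_km=0.2):
    """Pair each historic photo with the recent photos taken within max_km of it."""
    pairs = []
    for old in historic:
        nearby = [
            new["id"]
            for new in recent
            if haversine_km(old["lat"], old["lon"], new["lat"], new["lon"]) <= max_km
        ]
        pairs.append((old["id"], nearby))
    return pairs


# Hypothetical records: ids and coordinates are invented for illustration.
historic = [{"id": "old-1", "lat": -33.8688, "lon": 151.2093}]
recent = [
    {"id": "new-1", "lat": -33.8690, "lon": 151.2095},  # a few dozen metres away
    {"id": "new-2", "lat": -33.9000, "lon": 151.2500},  # several kilometres away
]

print(then_and_now(historic, recent))
```

In practice you would probably also want to compare the direction the camera was facing, which geo-tags alone don't give you.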

Brooklyn – it made sense to use their existing established Flickr account, so Flickr created functionality to support that. The Smithsonian joined on Monday.

Soon they'll have content from other partners including a charming collection from a tiny local museum.

Results:
Last 28 days Library of Congress – 15,000 [or 50,000?] views per day, 8 million views over last six months, 72,000 tags.
Powerhouse – 77,000 views (more views of that collection in one month than in the whole previous year), 3500 tags.
Brooklyn – figures affected by merged account issue.
Smithsonian – 10,000 views in first day, 100 new contacts

The numbers are probably affected by how many photos each institution has uploaded, e.g. smaller numbers when an institution has put fewer photos online.

"But is it any good?"
Suddenly there are conversations between Flickr users and institutions, and between Flickr users, contributing information and identifications.

They contribute the identification of places and people, with information about the history behind photos.

Now and then – people are adding their recent photos of a location via comments on Flickr.

Library of Congress have made a list of types of interactions [slides], they include the transcription of text on signs, posters, etc in background, geo-tags, non-English tags.

Institutional context and Flickr – bind them together with hyperlink, but being on Flickr frees a program from institutional constraints.

Flickr has been designed as a vessel or platform where interactions and conversations can happen.

The information that the community provides is proving useful. The Library of Congress has updated 176 records in its catalogue, recording that the change is based on 'information provided by Flickr Commons Project 2008'.

The Smithsonian found it was an opportunity for collaboration between institutions/departments and staff.

How to join: the process is publish – interact – feedback.

What to think about: give a broad representation of what's in your collection. Think about placement of images in photostream and sets. Plan to attract special interest groups. Think about what is already digital, what is popular? It can direct your digitisation efforts with feedback from a live community. Or you could go into your stores or collections database and possibly digitise at random.

How much metadata to include? How many fields from database into description of photo; more or less?

When: can be a challenge for institutions.

How? You could use the normal Flickr uploadr if you don't have too many images; or you could use the API to write applications that will work with Collections Management Systems.
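To make the 'how much metadata?' question concrete, here is a rough sketch of the mapping step such an application might need: turning one collections record into the title, description and tags you would send with an upload. The record layout and field names (accession_number, date_made, credit_line, keywords) are hypothetical, not taken from any particular Collections Management System, and the sketch deliberately stops short of the upload itself, which needs an authenticated API call. Quoting multi-word tags is my understanding of how Flickr separates tags, so check the API documentation before relying on it.

```python
def to_flickr_metadata(record, description_fields=("description", "date_made", "credit_line")):
    """Map a (hypothetical) collections record to title/description/tags.

    description_fields controls how many database fields end up in the
    photo description: pass a shorter or longer tuple to include less or more.
    """
    # Fall back to the accession number when a record has no title.
    title = record.get("title") or record["accession_number"]

    # Only include fields that are present and non-empty.
    lines = [str(record[field]) for field in description_fields if record.get(field)]
    # Always keep an identifier so comments can be traced back to the catalogue.
    lines.append(f"Accession number: {record['accession_number']}")

    # Space-separated tags, with multi-word tags wrapped in double quotes.
    tags = " ".join(f'"{tag}"' if " " in tag else tag for tag in record.get("keywords", []))

    return {"title": title, "description": "\n".join(lines), "tags": tags}


# Hypothetical record, invented for illustration.
record = {
    "accession_number": "X.1234",
    "title": "Tram at a depot",
    "description": "Glass plate negative.",
    "date_made": "c. 1910",
    "keywords": ["tram", "glass plate"],
}

print(to_flickr_metadata(record))
```

Keeping the accession number in every description is a design choice rather than a requirement: it makes it easier to feed identifications from Flickr comments back into the right catalogue record.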

Who? Might be web technician and curator.

The catch? It costs $24.95 for a Pro account. But you get unlimited storage, and could conceivably put whole collection online.

The future:
It's a work in progress. Probably will end up developing tools like additional reporting
Grow gently (make sure institution can handle the changes and respond to interactions)
They will continue their focus on photographs, not photographs of objects "(sorry)". "Flickr is about … empathic photography"
"Go local" e.g. small archives in little towns – people can still participate even if they don't have a web team, or web site.
API methods, RSS
Searching, browsing, maps
Search across Commons coming soon. Maybe combine searches to see a map of photos taken in 1910.
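As a sketch of what the 'map of photos taken in 1910' idea might look like from a developer's side, the snippet below builds a request URL for the Flickr API's flickr.photos.search method, restricted to geo-tagged Commons photos taken in a given year. It only constructs the URL and doesn't call the service. The parameter names (is_commons, has_geo, min_taken_date, max_taken_date, extras) are my reading of the Flickr API documentation rather than anything covered in the talk, and the API key is a placeholder, so treat this as an assumption to check.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

FLICKR_REST = "https://api.flickr.com/services/rest/"


def commons_search_url(api_key, year, per_page=100):
    """Build a flickr.photos.search URL for geo-tagged Commons photos from one year."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "is_commons": "true",   # assumed flag: limit results to Commons photos
        "has_geo": 1,           # assumed flag: only photos that have been geo-tagged
        "min_taken_date": f"{year}-01-01 00:00:00",
        "max_taken_date": f"{year}-12-31 23:59:59",
        "extras": "geo,date_taken",  # ask for coordinates and dates in the response
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a real key.
url = commons_search_url("YOUR_API_KEY", 1910)
query = parse_qs(urlsplit(url).query)
print(query["min_taken_date"], query["max_taken_date"])
```

The latitude and longitude in each result could then be plotted on whatever map library you prefer.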

Metrics and ROI for social software

A useful post about Social Media Metrics/Return on Investment with some thoughts on "how to provide useful metrics and measurements on the effects of social media for a nonprofit organization" and lots of useful links. It suggests "audience, engagement, loyalty, influence, and action" can put metrics in the "more holistic" context of outcomes, measures, strategy.