Thumbs up to Migratr (and free and open goodness)

[Update: Migratr downloads all your files to the desktop, with your metadata in an XML file, so it's a great way to backup your content if you're feeling a bit nervous about the sustainability of the online services you use. If it's saved your bacon, consider making a donation.]

This is just a quick post to recommend a nice piece of software: "Migratr is a desktop application which moves photos between popular photo sharing services. Migratr will also migrate your metadata, including the titles, tags, descriptions and album organization."

I was using it to migrate stuff from random Flickr accounts people had created at work in bursts of enthusiasm to our main Museum of London Flickr account, but it also works for 23HQ, Picasa, SmugMug and several other photo sites.

The only hassles were that it concatenated the tags (e.g. "Museum of London" became "museumoflondon") and didn't get the set descriptions, but overall it's a nifty utility – and it's free (though you can make a donation). [Update: Alex, the developer, has pointed out that the API sends the tags space delimited, so his app can't tell the difference.]
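To make the tag problem concrete, here's a minimal Python sketch. It assumes only what Alex describes: tags travel as one space-delimited string. The function names and the normalisation step are my own illustration, not Migratr's or any photo service's actual code.

```python
def join_tags(tags):
    """Serialise tags into one space-delimited field (assumed wire format)."""
    return " ".join(tags)


def split_tags(field):
    """All a client can do with that field: split on whitespace."""
    return field.split()


def normalise(tag):
    """Hypothetical normalisation: lower-case and drop spaces so a tag stays one token."""
    return "".join(tag.lower().split())


original = ["Museum of London", "roman"]

# Sent as-is, the multi-word tag is indistinguishable from three separate tags.
print(split_tags(join_tags(original)))
# ['Museum', 'of', 'London', 'roman']

# Squashing each tag first keeps it whole, but the spaces are gone for good.
print(split_tags(join_tags(normalise(t) for t in original)))
# ['museumoflondon', 'roman']
```

Either way the receiving application has no way to rebuild "Museum of London" from the field alone, which fits what I saw after migrating.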

And as the developer says, the availability of free libraries (and the joys of APIs) cut down development time and made the whole thing much more possible. He quotes Newton's, "If I have seen further it is by standing on the shoulders of giants" and I think that's beautifully apt.

Notes from 'The API as Curator' and on why museums should hire programmers

These are my notes from the third paper 'The API as Curator' by Aaron Straup Cope in the Theoretical Frameworks session chaired by Darren Peacock at Museums and the Web 2008. The slides for The API as Curator are online.

I've also included below some further notes on why, how, whether museums should hire programmers, as this was a big meme at the conference and Aaron's paper made a compelling case for geeks in art, arty geeks and geeky artists.

You might have noticed it's taken me a while to catch up on some of my notes from this conference, and the longer I leave it the harder it gets. As always, any mistakes are mine, any comments or corrections are welcome, and the comments in [square brackets] below are mine.

The other session papers were Object-centred democracies: contradictions, challenges and opportunities by Fiona Cameron and Who has the responsibility for saying what we see? mashing up Museum and Visitor voices, on-site and online by Peter Samis; all the conference papers and notes I've blogged have been tagged with 'MW2008'.

Aaron Cope: The API as curator.

The paper started with some quotes as 'mood music' for the paper.

Institutions are opening up, giving back to the community and watching what people build.

It's about (computer stuff as) plumbing, about making plumbing not scary. If you're talking about the web, sooner or later you're going to need to talk about computer programming.

Programmers need to be more than just an accessory – they should be in-house and full-time and a priority. It boils down to money. You don't all need to be computer scientists, but it should be part of it so that you can build things.

Experts and consumers – there's a long tradition of collaboration in the art community, for example printmaking. Printers know about all the minutiae (the technical details) but/so the artists don't have to.

Teach computer stuff/programming so that people in the arts world are not simply consumers.

Threadless (the t-shirt site) as an example. Anyone can submit a design, they're voted on in forum, then the top designs are printed. It makes lots of money. It's printmaking by any other name. Is it art?

"Synthetic performances" Joseph Beuys in Second Life…

It's nice not to be beholden to nerds… [I guess a lot of people think that about their IT department. Poor us. We all come in peace!]

Pure programming and the "acid bath of the internet".

Interestingness on Flickr – a programmer works on it, but it's not a product – (it's an expression of their ideas). Programming is not a disposable thing, it's not as simple as a toaster. But is it art? [Yes! well, it can be sometimes, if a language spoken well and a concept executed elegantly can be art.]

API and Artspeak – Aaron's example (a bit on slide 15 and some general mappy goodness).

Build on top of APIs. Open up new ways to explore collection. Let users map their path around your museum to see the objects they want to see.

Their experience at Flickr is that people will build those things (if you make it possible). [Yay! So let's make it possible.]

There's always space for collaboration.

APIs as the nubby bits on Lego. [Lego is the metaphor of the conference!]

Flickr Places – gazetteer browsing.

[Good image on slide 22]: interpretation vs intent, awesome (x) vs time (y). You need programmers on staff, you need to pay them [please], you don't want them to be transient if you want to increase smoothness of graph between steps of awesomeness. Go for the smallest possible release cycles. Small steps towards awesome.

Questions for the Theoretical Frameworks session
Qu from the Science Museum of Minnesota: how do you hire programmers in museums, and how do you attract them when salaries are crap?
Aaron – teach it in schools and go to computer science departments. People do stuff for more than just money.

Qu on archiving UGC and other stuff generated in these web 2.0 projects… Peter Samis – WordPress archives things. [So just use the tools that already exist]

Aaron – build it and they will come. Also, redefine programming.

There's a good summary of this session by Nate at MW2008 – Theoretical Frameworks.

And here's a tragically excited dump from my mind written at the time: "Yes to all that! Now how do we fund it, and convince funders that big top-down projects are less likely to work than incremental and iterative builds? Further, what if programmers and curators and educators had time to explore, collaborate, push each other in a creative space? If you look at the total spend on agencies and external contractors, it must be possible to make a case for funding in-house programmers – but silos of project-based funding make it difficult to consolidate those costs, at least in the UK."

Continuing the discussion about the benefits of an in-house developer team, post-Museums and the Web, Bryan Kennedy wrote a guest post on Museum 2.0 about Museums and the Web in Montreal that touched on the issue:

More museums should be building these programming skills in internal teams that grow expertise from project to project. Far too many museums small and large rely on outside companies for almost all of their technical development on the web. By and large the most innovation at Museums and the Web came from teams of people who have built expertise into the core operations of their institution.

I fundamentally believe that at least in the museum world there isn't much danger of the technology folks unseating the curators of the world from their positions of power. I'm more interested in building skilled teams within museums so that the intelligent content people aren't beholden to external media companies but rather their internal programmers who feel like they are part of the team and understand the overall mission of the museum as well as how to pull UTF-8 data out of a MySQL database.

I left the following comment at the time, and I'm being lazy* and pasting here to save re-writing my thoughts:

Good round-up! The point about having permanent in-house developers is really important and I was glad to see it discussed so much at MW2008.

It's particularly on my mind at the moment because yesterday I gave a presentation (on publishing from collections databases and the possibilities of repositories or feeds of data) to a group mostly comprised of collections managers, and I was asked afterwards if this public accessibility meant "the death of the curator". I've gathered the impression that some curators think IT projects impose their grand visions of the new world, plunder their data, and leave the curators feeling slightly shell-shocked and unloved.

One way to engage with curatorial teams (and educators and marketers and whoever) and work around these fears and valuable critiques is to have permanent programmers on staff who demonstrably value and respect museum expertise and collections just as much as curators, and who are willing to respond to the concerns raised during digital projects.

There's a really good discussion in the comments on Bryan's post. I'm sure this is only a sample of the discussion, but it's a bit difficult to track down across the blogosphere/twitterverse/whatever and I want to get this posted some time this century.

* But good programmers are lazy, right?

Notes from Advanced Web Development: software strategies for online applications at MW2008

These are my notes from the Advanced Web Development: software strategies for online applications workshop with Rob Stein, Charles Moad and Edward Bachta from the Indianapolis Museum of Art at Museums and the Web 2008 (MW2008) in Montreal. I don't know if they'll be useful for anyone else, but if you have any questions about my notes, let me know.

They had their slides online before the presentation, which was really helpful. [More of this sort of thing, please! Though I wish there was a way to view thumbnails of slides on slideshare so you can skip to particular slides.]

The workshop covered a lot of ground, and they did a pretty good job of pitching it at different levels of geekdom. Some of my notes will seem self-evident to different types of geeks or non-geeks but I've tried to include most of what they covered. I've put some of my own comments in [square brackets].

They started with the difference between web pages and web applications, and pointed out that people have been building applications for 30 years so build on existing stuff.

Last year's talk was about 'web 2.0' and the foundations of building solid software applications but since then APIs/SDKs have taken off. Developers should pick pieces that already work rather than building from the bottom up. The craft lies in knowing how to choose the components and how to integrate them.

There are still reasons to consider building your own APIs e.g. if you have unique information others are unlikely to support adequately, if you care about security of data, if you want to control the distribution of information, or if a guarantee of service is important (e.g. if vendors disappear).

Building APIs
They're using model-driven development, with an XML schema or a database as the model.

Object relational mappers provide object-oriented access to a database. Data model changes are picked up automatically and they're generally database-agnostic so you can swap out the back end. Object relational mappers include ActiveRecord (Ruby on Rails), Hibernate (also available for .Net as NHibernate), Propel and SQLAlchemy.
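As a rough illustration of the 'picked up automatically' idea, here's a toy sketch in Python using only the standard library's sqlite3 module. It is not Hibernate, Propel or SQLAlchemy and it leaves out nearly everything a real ORM does; the `Record` class, the `fetch_all` helper and the `artwork` table are all made up for this example.

```python
import sqlite3


class Record:
    """A row exposed as an object; attributes come from the table's columns."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


def fetch_all(conn, table):
    """Return every row of `table` as a Record, reading column names at run time."""
    # Table names cannot be bound as SQL parameters, so only pass trusted names.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    return [Record(**dict(zip(columns, row))) for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artwork (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO artwork (title) VALUES ('Untitled')")
print(fetch_all(conn, "artwork")[0].title)

# A data model change: the new column shows up on the objects with no code change.
conn.execute("ALTER TABLE artwork ADD COLUMN artist TEXT")
conn.execute("UPDATE artwork SET artist = 'Unknown'")
print(fetch_all(conn, "artwork")[0].artist)
```

Because the column names are read from the database each time, the calling code never hard-codes the schema, which is the property the presenters were pointing at.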

IMA use Hibernate with EMu (their collections management system) and Propel. They've built an 'adaptive layer' for their collection that glues it all together.

Slide on Eclipse: 'rich client platform', not just an IDE. Supports nearly every language except .Net; is cross-platform.

Search
Use full-text indexes for good search functionality. They suggest Lucene (from apache.org) or Google gears. Lucene query types offer finer control than Google e.g. fielded searching [a huge draw for specialist collections searches], date range searching, sorting by any field, multiple index searching with merged results. Fast, low memory usage, extensible. Tools built on Lucene include Nutch (web crawler) and Solr – REST and SOAP API.
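For anyone wondering what fielded searching, date range searching and sorting by field look like in practice, here's a toy inverted index in Python. It's a sketch of the concepts only and has nothing to do with Lucene's real API or query syntax; the `TinyIndex` class and the sample records are invented for illustration.

```python
from collections import defaultdict
from datetime import date


class TinyIndex:
    """A toy full-text index keyed on (field, term) pairs."""

    def __init__(self):
        self.docs = {}
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids

    def add(self, doc_id, doc):
        """Store the document and index each word of its text fields."""
        self.docs[doc_id] = doc
        for field, value in doc.items():
            if isinstance(value, str):
                for term in value.lower().split():
                    self.postings[(field, term)].add(doc_id)

    def search(self, field, term, date_field=None, start=None, end=None, sort_by=None):
        """Fielded search with an optional inclusive date range and sort field."""
        ids = self.postings.get((field, term.lower()), set())
        hits = [self.docs[i] for i in ids]
        if date_field is not None:
            hits = [d for d in hits if start <= d[date_field] <= end]
        if sort_by is not None:
            hits.sort(key=lambda d: d[sort_by])
        return hits


index = TinyIndex()
index.add(1, {"title": "Roman oil lamp", "maker": "unknown", "acquired": date(1975, 3, 1)})
index.add(2, {"title": "Lamp post", "maker": "roman smith", "acquired": date(1990, 6, 15)})
index.add(3, {"title": "Roman coin", "maker": "unknown", "acquired": date(2001, 1, 9)})

# 'roman' in the title is a different question from 'roman' in the maker field.
print([d["title"] for d in index.search("title", "roman", sort_by="acquired")])
print([d["title"] for d in index.search("maker", "roman")])
```

The point for collections search is that the same word can be matched against one field at a time, then narrowed by date and ordered however the user needs.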

Bite size web components and suggestions for a web application toolkit
Harking back to the 'find good components' thing. Leverage someone else's work, and reduce dev/debugging costs – in their experience it produces fewer errors than writing their own stuff.

Storage – Amazon, Nirvanix, XDrive, Google, Box.net. Use Amazon S3 if accessed infrequently cos of fee structure.

Video – YouTube, Revver, blip.tv also have developer interfaces. The IMA don't host any video on their website, it's all on YouTube.

Images – Flickr, Picasa. [But the picasa UI sucks so please don't inflict that on your users!]. Flickr support for REST, SOAP, JSON.

Compute (EC2, Amazon Web Services) – Linux virtual machines. Custom disk images for specific requirements. Billable on use. See slides on costs for web hosting.

Authentication services – OpenID, OAuth.

Social computing
Consider social computing when developing your web applications – it's evolving rapidly and is uncertain. Facebook vs OpenSocial (might be the question today, but tomorrow?). Stick with the eyeballs and be ready to change. [Though the problem for museums thinking about social software applications remains – by the time most museums go through approval processes to get onto Facebook it'll be dead in the water. Another reason to have good programmers on staff and include content resources in online programs, so that teams can be more flexible while still working within the overall online strategy of their organisation.]

Developing on Facebook
Facebook API – REST-based API. Use their developer platform – simpler than original API calls. JSON simpler than XML responses. Facebook Query Language (FQL) reduces calls to API. Facebook Markup Language (FBML): HTML + Facebook specific features, inc security controls and interface features. [There's a pronoun tag with built-in 'they' if not sure of gender of person. Cute.] Lots more in their slides.

Widget frameworks
Widgets are the buzzword that hasn't quite taken off. The utility isn't quite there yet, so what are they used for? Players are Google, Netvibes (supports more platforms including Apple Mac dashboard, Yahoo, iGoogle, etc) but is Adobe AIR the widget killer? Flash-based runtime for desktop apps. e.g. twhirl. Run as background processes, and can access desktop files directly, clipboard, drag and drop. [I downloaded the AIR Google Analytics application during the session, it's a good example.]

Content management
The CMS is the container to put all the components together. A good CMS will let you integrate components into a new site with a minimum of effort. [Wouldn't that be nice?] Examples include Joomla, WordPress, Drupal, Plone.

There aren't slides for the next 'CMS tour' bit, but they gave some great examples.

Nature holds my camera: they tried visitor blogging with a terminal in gallery so people could ask questions.

They talked about the IMA dashboard. [I asked a vague question about whether there was a user-driven or organisational business case for it – turns out it was driven by their CEO's interest in transparency, e.g. in sharing how they invest monies, track stats and communicate with their visitors. It helps engender trust and loyalty e.g. for donors. Attendance drives corporate sponsorship so there was a business case. It's also good for tracking their performance against actual actions vs stated goals.]

The advantages of using a web application toolkit – theromansarecoming.com took $50,000 to build for a four month exhibition. It hit the goals but was expensive. [The demo looked really cool, it's a shame you don't seem to be able to access it online.]

Breaking the Mode was built using existing components on the technical side, but required the same content investment i.e. in-house resources as The Romans Are Coming. The communication issues were much better because it was built in-house – less of a requirement to explain to external developers, which had some effect on the cost [but the biggest saving was in re-usable components] – the site took 25 hours to build and IT staff costs were about $1000. [So, quite a saving there.]

They demonstrated 'athena', the IMA's intranet. It has file sharing and task management and is built on drupal, looks a bit like basecamp-lite but without licensing issues. "Everything you do in a museum is project-based" and their intranet is built to support that.

There was discussion about whether their intranet could be shared with other museums. Rob Stein is a firm believer in open source and thinks it's the best way to go for museum sector. They're willing to share the source code but don't have the facilities to support it. There's a possibility that they could partner with other institutions to combine to pay small vendors to support it.

[I could hear a sudden burst of keyboards clicking around me as the discussion went onto pooling resources to create and support open source applications for stuff museums need to do. Smaller museums (i.e. most of us, and most are much smaller than MoL) don't have the resources for bespoke software or support but if we all combined, we'd be a bigger market. Overall, it was a really good, grounded discussion about the realities and possibilities of open source development.]

Back to the slides…

Team Troubles
[It was absolutely brilliant to see a discussion of teamwork and collaboration issues in a technical session.]

Divide and conquer – allow team members to focus on area of expertise. Makes it easier to swap out content and themes.

They're using MVC – Model (data management), Controller (interaction logic), View (user interface). They had some good stuff on MVC and the web in their slides (around 77-79). They also discussed the role of non-technical team members.
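Here's a minimal Python sketch of that three-way split, to show why it makes swapping out content and themes easier. It isn't taken from the IMA's slides or code; the class and method names are placeholders I've made up.

```python
import html


class Model:
    """Data management: holds the titles and knows nothing about display."""

    def __init__(self):
        self._titles = []

    def add(self, title):
        self._titles.append(title)

    def all(self):
        return list(self._titles)


class View:
    """User interface: turns data into markup and knows nothing about storage."""

    def render(self, titles):
        items = "".join(f"<li>{html.escape(t)}</li>" for t in titles)
        return f"<ul>{items}</ul>"


class Controller:
    """Interaction logic: takes a request, updates the model, picks the view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def handle_add(self, title):
        self.model.add(title.strip())
        return self.view.render(self.model.all())


app = Controller(Model(), View())
print(app.handle_add("  Lamp "))  # <ul><li>Lamp</li></ul>
```

A designer could replace `View` with a different theme, or a developer could back `Model` with a database, without the other two parts changing.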

Drupal boot camp
[This was a pretty convincing demo of getting started with Drupal and using the Content Construction Kit (CCK) to create custom content types e.g. work of art to publish content quickly, though I did wonder about how it integrated with ORMs that would automatically pick up an underlying data structure. Slide 103 showed recommended Drupal modules. It's definitely worth checking out if you're looking for a CMS. If you're on Windows, check out bitnami for installation.]

Client side development
"The customer is always right"
They talked about the DOM (document object model) and javascript for Web 2.0 coolness.
They recommended using Javascript toolkits – more object-orientated, solve cross-browser issues, rapid development. Slide 109 listed some Javascript toolkits and they also recommended Firebug.

Interface components
They should be re-usable, just like the server-side stuff. They showed some suggestions like reCAPTCHA, image carousels and rating modules. Pick the tools with best community support and cross-platform support.

CSS boilerplates
Treat CSS like another software component of web design and standardise your CSS usage. Use structured naming for classes and divs in server-side content generation. Check out oswd.org for free templates.

XML in the real world
They demonstrated Global Origins (more on that and other goodness at www.ima-digital.org/special-projects) which uses XML driven content.

Questions and discussion
I asked about integration with legacy/existing systems. Their middleware component 'Mercury' binds their commercial packages and other applications together. e.g. collection management system extraction layer. [This could be a good formalised model for MoL, as we have to pull from a few different places and push out to lots more and it's all a bit ad hoc at the moment. I think we'll be having lots of good discussions about this very soon.]

Some discussion about putting pressure on vendors to open data models. It's a better economic model for them and for museums.

Their CEO is supportive of iteration (in the development process). The web team is cross-department, and they have new media content creators.

[I was curious about how iterative development and the possibility of making mistakes work with their brand but didn't want to ask too many questions]

They made the point that you have a bigger recruiting pool with open source software. [Recruiting geeks into museums has been a bit of a conference meme.]

They give away iPods for online surveys and get more responses that way, but you do have to be aware that people might only give polite answers to survey questions so pay close attention to any criticism.

The IMA say you should be able to justify the longevity of projects when experimenting. Measure your projects against your mission, and how they can implement your mission statement.

So, that's it! I hope I didn't misrepresent anything they said.

Open Source Jam (osjam) – designing stuff that gets used by people

On Thursday I went to Google's offices to check out the Open Source Jam. I'd meant to check them out before and since I was finally free on the right night and the topic was 'Designing stuff that gets used by people' it was perfect timing. A lot of people spoke about API design issues, which was useful in light of the discussions Jeremy started about the European Digital Library API on the Museums Computer Group email list (look for subject lines containing 'APIs and EDL' and 'API use-cases').

These notes are pretty much just as they were written on my phone, so they're more pointers to good stuff than a proper summary, and I apologise if I've got names or attributions wrong.

I made a note to go read more of Duncan Cragg on URIs.

Paul Mison spoke about API design antipatterns, using Flickr's API as an example. He raised interesting points about which end of the API provider-user relationship should have the expense and responsibility for intensive relational joins, and designing APIs around use cases.

Nat Pryce talked about APIs as UIs for programmers. His experience suggests you shouldn't do what programmers ask for but find out what they want to do in the end and work with that. Other points: avoid scope creep for your API based on feature lists. Naming decisions are important, and there can be multilingual and cultural issues with understanding names and functionality. Have an open dialogue with your community of users but don't be afraid to selectively respond to requests. [It sounds like you need to look for the most common requests as no one API can do everything. If the EDL API is extensible or plug-in-able, is the issue of the API as the only interface to that service or data more tenable?] Design so that code using your API can be readable. Your API should be extensible cos you won't get it right first time. (In discussion someone pointed out that this can mean you should provide points to plug in as well as designing so it's extensible.) Error messages are part of the API (yes!).
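To illustrate the 'error messages are part of the API' point, here's a small Python sketch of an error that tells the calling programmer what went wrong and what they could do instead. The `search` function, the field names and the exception class are hypothetical and aren't from Nat's talk or from any real collections API.

```python
class UnknownFieldError(ValueError):
    """Raised when a caller asks for a field the API does not offer."""

    def __init__(self, field, allowed):
        self.field = field
        self.allowed = sorted(allowed)
        super().__init__(
            f"Unknown field {field!r}. Available fields: {', '.join(self.allowed)}"
        )


# Hypothetical list of searchable fields for an imaginary collections API.
SEARCHABLE_FIELDS = {"title", "maker", "date"}


def search(field, term):
    """Placeholder search that validates its input before doing any work."""
    if field not in SEARCHABLE_FIELDS:
        raise UnknownFieldError(field, SEARCHABLE_FIELDS)
    return {"field": field, "term": term, "results": []}


try:
    search("colour", "red")
except UnknownFieldError as err:
    print(err)  # Unknown field 'colour'. Available fields: date, maker, title
```

The message is written for the programmer using the API, and the `field` and `allowed` attributes let their code react without parsing the text.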

Christian Heilmann spoke on accessibility and made some really good points about accessibility as a hardcore test and incubator for your application/API/service. Build it in from the start, and the benefits go right through to general usability. Also, provide RSS feeds etc as an alternative method for data access so that someone else can build an application/widget to meet accessibility needs. [It's the kind of common sense stuff you don't think someone has to say until you realise accessibility is still a dirty word to some people]
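Offering a feed alongside the main interface needn't be much work. Here's a sketch in Python using the standard library's ElementTree to produce a bare-bones RSS 2.0 document; the `build_rss` helper, the example.org addresses and the item titles are placeholders rather than anything Christian showed.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a minimal RSS 2.0 feed and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel elements.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "New objects online",
    "http://example.org/collections",
    "Recently published collection records",
    [
        {"title": "Oil lamp", "link": "http://example.org/collections/1"},
        {"title": "Coin", "link": "http://example.org/collections/2"},
    ],
)
print(feed)
```

Anyone can then build their own reader, widget or large-print view on top of the same data without waiting for the original site to change.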

Jonathan Chetwynd spoke on learning disabilities (making the point that it includes functional illiteracy) and GUI schemas that would allow users to edit the GUI to meet their accessibility needs. He also mentioned the possibility of wrapping microformats around navigation or other icons.

Dan North talked about how people learn and the Dreyfus model of skill acquisition, which was new to me but immediately seemed like something I need to follow up. [I wonder if anyone's done work on how that relates to models of museum audiences and how it relates to other models of learning styles.]

Someone whose name I didn't catch talked about Behaviour driven design which was also new to me and tied in with Dan's talk.

Exposing the layers of history in cityscapes

I really liked this talk on "Time, History and the Internet" because it touches on lots of things I'm interested in.

I have an ongoing fascination with the idea of exposing the layers of history present in any cityscape.

I'd like to see content linked to and through particular places, creating a sense of four dimensional space/time anchored specifically in a given location. Discovering and displaying historical content marked-up with the right context (see below) gives us a chance to 'move' through the fourth dimension while we move through the other three; the content of each layer of time changing as the landscape changes (and as information is available).

Context for content: when was it written? Was it written/created at the time we're viewing, or afterwards, or possibly even before it, about a future time? Who wrote/created it, and who were they writing/drawing/creating it for? If this context is machine-readable and content is linked to a geo-reference, can we generate a representation of these layers on-the-fly?
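As a thought experiment, here's a rough Python sketch of what generating a layer on the fly might involve once that context is machine-readable. Everything in it is an assumption for illustration: the field names, the search radius and time window, the approximate coordinates and the sample items are all invented, not real records.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Item:
    title: str
    lat: float
    lon: float
    depicts_year: int  # the time the content is about
    created_year: int  # when it was written or drawn
    creator: str


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the haversine formula (Earth radius ~6371 km)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))


def layer(items, lat, lon, year, radius_km=0.5, window=10):
    """Items near the point whose subject falls within `window` years of `year`."""
    nearby = (i for i in items if distance_km(lat, lon, i.lat, i.lon) <= radius_km)
    return sorted(
        (i for i in nearby if abs(i.depicts_year - year) <= window),
        key=lambda i: i.depicts_year,
    )


def relation(item):
    """Was the content made at the time it depicts, afterwards, or beforehand?"""
    if item.created_year > item.depicts_year:
        return "retrospective"
    if item.created_year < item.depicts_year:
        return "anticipatory"
    return "contemporary"


# Invented sample records placed roughly around Tottenham Court Road.
items = [
    Item("Street sketch", 51.5161, -0.1299, 1808, 1808, "unknown artist"),
    Item("Reconstruction drawing", 51.5159, -0.1301, 1800, 1995, "illustrator"),
    Item("Proposal for a tower", 51.5160, -0.1300, 1966, 1960, "architect"),
    Item("Riverside view", 51.5081, -0.0759, 1805, 1805, "unknown artist"),
]

for item in layer(items, 51.516, -0.130, 1805):
    print(item.depicts_year, item.title, relation(item))
```

Keeping the time depicted separate from the time of creation is what lets the same query tell a contemporary sketch apart from a later reconstruction of the same street.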

Imagine standing at the base of Centrepoint at London's Tottenham Court Road and being able to ask, what would I have seen here ten years ago? fifty? two hundred? two thousand? Or imagine sitting at home, navigating through layers of historic mapping and tilting down from a birds eye view to a view of a street-level reconstructed scene. It's a long way off, but as more resources are born or made discoverable and interoperable, it becomes more possible.