April 2008 – Page 3 – Open Objects

Calling geeks in the UK with an interest in cultural heritage content/audiences

You might be interested in BathCamp – a bar camp in Bath on a Saturday (with overnight stay) in late August. This is an initial open call so head along to the website (BathCamp) and check it out. Ideally you would have an interest in cultural heritage content, audiences or applications, but we love the idea of getting fresh perspectives from a wide range of people so we don't expect that you would have worked with the cultural heritage sector (museums, galleries, libraries, archives, archaeology) before.

Questions from 'Beyond Single Repositories' at MW2008

I'm still working on getting my notes from Museums and the Web in Montreal online.

These are notes from the questions at the 'Beyond Single Repositories' session. This session was led by Ross Parry, and included the papers Learning from the People: Traditional Knowledge and Educational Standards by Daniel Elias and James Forrest and The Commons on Flickr: A Primer by George Oates.

This clashed with the User-Generated Content session that I felt I should see for work, but I managed to sneak in at the end of Ross's session. I expected this room to be packed, but it wasn't. I guess the ripples of user-generated content and Web 2.0-ish stuff are still spreading beyond the geeks, and the pebbles of single repositories and the semantic web have barely dropped into the pond for most people. As usual, all mistakes are mine – if you asked a question and I haven't named you or got your question wrong, drop me a line.

Quite a lot of the questions related to 'The Commons'.

There was a question about the difference between users who download and retain context of images, versus those who just download the image and lose all context, attribution, etc. George: Flickr considered putting the metadata into EXIF but it was problematic and wasn't robust enough to be useful.

Another question: how to link back to institution from Flickr? George: 'there's this great invention called the hyperlink'. And links can also go to picture libraries to buy prints.

[I need to check this but it could really help make the case for Commons in museums if that's the case. We might also be able to target different audiences with different requirements – e.g. commercial publications vs school assignments. I also need to check if Flickr URLs are permanent and stable.]

Seb Chan asked: how does business model of having images on Flickr co-exist with existing practices?

Flickr are cool with museums putting in content at different resolutions – it's up to institution to decide.

"It's so easy to do things the correct way" so please teach everyone to use CC licence stuff appropriately.

Issues are starting to be raised about revenue sharing models.

[I wonder if we could put in FOI requests to find out exactly how much revenue UK museums make from selling images compared to the overhead in servicing commercial picture libraries, and whether it varies by type of image or use. It'd be great if we could put some Museum of London/MoLAS images on Commons, particularly if we could use tagging to generate multilingual labels and re-assess images in terms of diversity – such an important issue for our London audiences; or to get more images/objects geo-located. I also wonder if there are any resourcing issues for moderation requirements, or do we just cope with whatever tags are added?]

Update: following the conference, Frankie Roberto started a discussion on the Museums Computer Group list under the subject 'copyright licensing and museums'. You have to be a member to post but a range of perspectives and expertise would really help move this discussion on.

Some feedback to MW2008 and other conferences

There's a thread on the Museums and the Web conference site asking for suggestions for MW2009. I was a bit zombie-like by the time I filled out the feedback form, so I've added some more comments.

I'm posting them here because I think they apply to lots of conferences and these are things I'd like to see generally. It might look like a lot of comments but I'm probably inspired to write because overall the conference was so good.

There were suggestions to have Pecha Kucha style sessions for people to talk about their projects. I think that'd be really useful – people in the early stages of a project could get a range of feedback and suggestions from some of the best researchers and most experienced 'doers' around; and the vast majority of projects that will never be written up as big conference papers can still pass on a few valuable lessons in a few minutes. It'd also help build a pool of people who had some experience presenting.

I also suggested having afternoon versions of the Birds of a Feather breakfasts. I'm one of those people who's not at all sociable in the morning, but an afternoon session in a coffee shop or pub would be perfect. It'd also give you a way to meet people and maybe go on to dinner or drinks – it must be really difficult if you don't know anyone there and are a bit shy. I'd imagine you could find people who are interested in the same topics more easily this way because it offers a bit more structure than just drinks.

I don't know if there are any guidelines when writing papers but I'd like to suggest one – it's really useful when people talk about how their projects worked in their institutions/sector, as it helps everyone work out how to champion and implement similar ideas when they get back from the conference. Or maybe that's a thread for one of the museum geeks lists…

It would be really useful if each session listed the audience (managers, technologists, educators, etc) and the level of experience it was aimed at (e.g. absolute beginners, practitioners, people looking for a practical learning session) in the program. A lot of the papers did a really good job covering a range of potential audiences, but I might have skipped other sessions if I'd realised they were aimed at an introductory level.

Museums and the Web conferences are brilliant because they put the papers online, so this is a minor quibble, but it would be handy if the papers were available as pdf (or similar) downloads so I could load them onto my phone or laptop beforehand. That way I could follow them during the presentations if there isn't any network connectivity, or review them afterwards.

Finally, it would be so helpful if all presenters had to put their slides online somewhere, tagged with the conference tag and linked from the conference site. The one paper I've blogged about so far had their slides online, and it helped me immensely when writing up as I could check my notes against theirs. As more people blog about conferences, you might need tags for each session – a bit more overhead, but I'm sure you'd get great conversations between people who blogged about the same sessions and hopefully with presenters too.

A tweet projected on a slide in a 'fancy hotel'-style conference room. The text says: 'miaridge: if 2007 was about UI to differentiate UCG and 'expert' content, 2008 could add machine generated tags to the mix #mw2008'

How I do documentation: a column of bumph and a column of gold

All programmers hate documentation, right? But I've discovered a way to make it less painful and I'm posting in case it helps anyone else.

The first trick is to start documenting as soon as you start thinking about a project – well before you've written any code. I keep a running document of the work I've done, including the bits I'm about to try, information about links into other databases or applications, issues I need to think about or questions I need to ask someone, rude comments (I know, I look like such a nice girl), references, quick use cases, bits about functions, summary notes from meetings, etc.

Mostly I record by date, blog style. Doing it by date helps me link repository files, paper notes and emails with particular bits of work, which can otherwise be tricky if it's a while since you worked on a project or if you have lots of projects on the go. It's also handy if you need to record the time spent on different projects.

I just did it like this for a while, and it was ok, but I learnt the hard way that it takes a while to sort through it if I needed to send someone else some documentation. Then I made a conscious decision to separate the random musings from the decisions and notes on the productive bits of code.

So now my document has two columns. This first column is all the bumph described above – the stuff I'd need if I wanted to retrace my steps or remind myself why I ended up doing things a certain way. The second column records key decisions or final solutions. This is your column of gold.

This way I can quickly run down the items in the second column, organise it by area instead of by date and come up with some good documentation without much effort. And if I ever want to write up the whole project, I've got a record of the whole process in the column of bumph.

You could add a third column to record outstanding tasks or questions. I tend to mark these up with colour and un-colour them when they're done. It just depends how you like to work.

It's amazingly simple, but it works. I hope it might be useful for you too. Or if you have any better suggestions (or a better title for this post), I'd love to hear them.

What Does Openness Mean to The Museum Community?

There's an almost-live report from Mike Ellis and Brian Kelly's "What Does Openness Mean to The Museum Community?" forum at the Museums and the Web conference yesterday at http://mw2008.wetpaint.com/page/report

It's a really important discussion and as it's a wiki I assume you can add comments. I am running late for a session but will sort out my notes later.

Notes from Advanced Web Development: software strategies for online applications at MW2008

These are my notes from the Advanced Web Development: software strategies for online applications workshop with Rob Stein, Charles Moad and Edward Bachta from the Indianapolis Museum of Art at Museums and the Web 2008 (MW2008) in Montreal. I don't know if they'll be useful for anyone else, but if you have any questions about my notes, let me know.

They had their slides online before the presentation, which was really helpful. [More of this sort of thing, please! Though I wish there was a way to view thumbnails of slides on slideshare so you can skip to particular slides.]

The workshop covered a lot of ground, and they did a pretty good job of pitching it at different levels of geekdom. Some of my notes will seem self-evident to different types of geeks or non-geeks but I've tried to include most of what they covered. I've put some of my own comments in [square brackets].

They started with the difference between web pages and web applications, and pointed out that people have been building applications for 30 years so build on existing stuff.

Last year's talk was about 'web 2.0' and the foundations of building solid software applications but since then APIs/SDKs have taken off. Developers should pick pieces that already work rather than building from the bottom up. The craft lies in knowing how to choose the components and how to integrate them.

There are still reasons to consider building your own APIs e.g. if you have unique information others are unlikely to support adequately, if you care about security of data, if you want to control the distribution of information, or if a guarantee of service is important (e.g. if vendors disappear).

Building APIs
They're using model-driven development, with an XML Schema or a database as the model.

Object relational mappers provide object-oriented access to a database. Data model changes are picked up automatically and they're generally database-agnostic so you can swap out the back end. Object relational mappers include Ruby on Rails' ActiveRecord, Hibernate (NHibernate in .Net), Propel and SQLAlchemy.
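
To make the ORM idea concrete, here is a minimal sketch. It is not what the IMA showed, and it deliberately uses a hand-rolled mapper on Python's standard sqlite3 module rather than a real ORM such as SQLAlchemy. The Artwork class, the Mapper class and the artworks table are all hypothetical names made up for illustration. The point is that the column list is derived from the object definition, so adding a field to the class changes the mapping without touching the SQL.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Artwork:
    """Hypothetical object type; one instance corresponds to one table row."""
    id: int
    title: str
    artist: str


class Mapper:
    """Tiny stand-in for an ORM (an assumption for illustration, not a real library)."""

    def __init__(self, conn, cls, table):
        self.conn = conn
        self.cls = cls
        self.table = table
        # The data model is read from the class, not written out by hand.
        self.columns = [f.name for f in fields(cls)]

    def create_table(self):
        cols = ", ".join(self.columns)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} ({cols})")

    def add(self, obj):
        placeholders = ", ".join("?" for _ in self.columns)
        values = [getattr(obj, c) for c in self.columns]
        self.conn.execute(
            f"INSERT INTO {self.table} VALUES ({placeholders})", values
        )

    def find_by(self, column, value):
        # Only allow known columns, since the name is placed into the SQL text.
        if column not in self.columns:
            raise ValueError(f"unknown column: {column}")
        cols = ", ".join(self.columns)
        rows = self.conn.execute(
            f"SELECT {cols} FROM {self.table} WHERE {column} = ?", (value,)
        )
        return [self.cls(*row) for row in rows]


conn = sqlite3.connect(":memory:")
mapper = Mapper(conn, Artwork, "artworks")
mapper.create_table()
mapper.add(Artwork(1, "Untitled", "Unknown"))
mapper.add(Artwork(2, "Study in Blue", "Unknown"))
found = mapper.find_by("title", "Study in Blue")
```

The calling code only ever sees Artwork objects, which is what makes it possible to swap the back end: only the Mapper would need to change.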

IMA use Hibernate with EMu (their collections management system) and Propel. They've built an 'adaptive layer' for their collection that glues it all together.

Slide on Eclipse: 'rich client platform', not just an IDE. Supports nearly every language except .Net; is cross-platform.

Search
Use full-text indexes for good search functionality. They suggest Lucene (from apache.org) or Google gears. Lucene query types offer finer control than Google e.g. fielded searching [a huge draw for specialist collections searches], date range searching, sorting by any field, multiple index searching with merged results. Fast, low memory usage, extensible. Tools built on Lucene include Nutch (web crawler) and Solr – REST and SOAP API.
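
As a rough illustration of what fielded searching, date range searching and sorting by any field mean in practice, here is a toy inverted index in plain Python. It is a sketch of the general technique only, not Lucene's API or how Lucene is implemented, and the records and field names are invented for the example.

```python
from collections import defaultdict

# Hypothetical collection records; field names are made up for illustration.
RECORDS = [
    {"id": 1, "title": "Roman oil lamp", "maker": "unknown", "year": 150},
    {"id": 2, "title": "Victorian street lamp", "maker": "Smith", "year": 1880},
    {"id": 3, "title": "Roman coin hoard", "maker": "unknown", "year": 280},
]


def build_index(records, text_fields):
    """Build an inverted index mapping (field, term) to a set of record ids."""
    index = defaultdict(set)
    for rec in records:
        for field in text_fields:
            for term in rec[field].lower().split():
                index[(field, term)].add(rec["id"])
    return index


def search(index, records, field, term, year_from=None, year_to=None, sort_by="id"):
    """Fielded term search with an optional year range, sorted by any field."""
    ids = index.get((field, term.lower()), set())
    hits = [r for r in records if r["id"] in ids]
    if year_from is not None:
        hits = [r for r in hits if r["year"] >= year_from]
    if year_to is not None:
        hits = [r for r in hits if r["year"] <= year_to]
    return sorted(hits, key=lambda r: r[sort_by])


INDEX = build_index(RECORDS, ["title", "maker"])
roman = search(INDEX, RECORDS, "title", "roman", sort_by="year")
late_lamps = search(INDEX, RECORDS, "title", "lamp", year_from=1000)
```

Because terms are indexed per field, a search for "roman" in the title field does not match anything in the maker field, which is the kind of control that matters for specialist collections searches.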

Bite size web components and suggestions for a web application toolkit
Harking back to the 'find good components' thing. Leverage someone else's work, and reduce dev/debugging costs – in their experience it produces fewer errors than writing their own stuff.

Storage – Amazon, Nirvanix, XDrive, Google, Box.net. Use Amazon S3 if accessed infrequently cos of fee structure.

Video – YouTube, Revver, blip.tv also have developer interfaces. The IMA don't host any video on their website, it's all on YouTube.

Images – Flickr, Picasa. [But the picasa UI sucks so please don't inflict that on your users!]. Flickr support for REST, SOAP, JSON.

Compute (EC2, Amazon Web Services) – Linux virtual machines. Custom disk images for specific requirements. Billable on use. See slides on costs for web hosting.

Authentication services – OpenID, OAuth.

Social computing
Consider social computing when developing your web applications – it's evolving rapidly and is uncertain. Facebook vs OpenSocial (might be the question today, but tomorrow?). Stick with the eyeballs and be ready to change. [Though the problem for museums thinking about social software applications remains – by the time most museums go through approval processes to get onto Facebook it'll be dead in the water. Another reason to have good programmers on staff and include content resources in online programs, so that teams can be more flexible while still working within the overall online strategy of their organisation.]

Developing on Facebook
Facebook API – REST-based API. Use their developer platform – simpler than original API calls. JSON simpler than XML responses. Facebook Query Language (FQL) reduces calls to API. Facebook Markup Language (FBML). HTML + Facebook specific features, inc security controls and interfaces features. [There's a pronoun tag with built-in 'they' if not sure of gender of person. Cute.] Lots more in their slides.

Widget frameworks
Widgets are the buzzword that hasn't quite taken off. The utility isn't quite there yet, so what are they used for? Players are Google, Netvibes (supports more platforms including Apple Mac dashboard, Yahoo, iGoogle, etc) but is Adobe AIR the widget killer? Flash-based runtime for desktop apps. e.g. twhirl. Run as background processes, and can access desktop files directly, clipboard, drag and drop. [I downloaded the AIR Google Analytics application during the session, it's a good example.]

Content management
The CMS is the container to put all the components together. A good CMS will let you integrate components into a new site with a minimum of effort. [Wouldn't that be nice?] Examples include Joomla, WordPress, Drupal, Plone.

There aren't slides for the next 'CMS tour' bit, but they gave some great examples.

Nature holds my camera: they tried visitor blogging with a terminal in gallery so people could ask questions.

They talked about the IMA dashboard. [I asked a vague question about whether there was a user-driven or organisational business case for it – turns out it was driven by their CEO's interest in transparency, e.g. in sharing how they invest monies, track stats and communicate with their visitors. It helps engender trust and loyalty e.g. for donors. Attendance drives corporate sponsorship so there was a business case. It's also good for tracking their performance against actual actions vs stated goals.]

The advantages of using a web application toolkit – theromansarecoming.com took $50,000 to build for a four month exhibition. It hit the goals but was expensive. [The demo looked really cool, it's a shame you don't seem to be able to access it online.]

Breaking the Mode was built using existing components on the technical side, but required the same content investment i.e. in-house resources as The Romans Are Coming. The communication issues were much better because it was built in-house – less of a requirement to explain to external developers, which had some effect on the cost [but the biggest saving was in re-usable components] – the site took 25 hours to build and IT staff costs were about $1000. [So, quite a saving there.]

They demonstrated 'athena', the IMA's intranet. It has file sharing and task management and is built on drupal, looks a bit like basecamp-lite but without licensing issues. "Everything you do in a museum is project-based" and their intranet is built to support that.

There was discussion about whether their intranet could be shared with other museums. Rob Stein is a firm believer in open source and thinks it's the best way to go for museum sector. They're willing to share the source code but don't have the facilities to support it. There's a possibility that they could partner with other institutions to combine to pay small vendors to support it.

[I could hear a sudden burst of keyboards clicking around me as the discussion went onto pooling resources to create and support open source applications for stuff museums need to do. Smaller museums (i.e. most of us, and most are much smaller than MoL) don't have the resources for bespoke software or support but if we all combined, we'd be a bigger market. Overall, it was a really good, grounded discussion about the realities and possibilities of open source development.]

Back to the slides…

Team Troubles
[It was absolutely brilliant to see a discussion of teamwork and collaboration issues in a technical session.]

Divide and conquer – allow team members to focus on area of expertise. Makes it easier to swap out content and themes.

They're using MVC – Model (data management), Controller (interaction logic), View (user interface). They had some good stuff on MVC and the web in their slides (around 77-79). They also discussed the role of non-technical team members.
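
For anyone who hasn't met MVC before, a minimal sketch of the separation might look like the following. The class names are hypothetical and this isn't taken from the IMA slides; it only shows how the three roles stay apart so that a view (theme) can be swapped without touching the data or the interaction logic.

```python
import html


class ArtworkModel:
    """Model: owns the data and knows nothing about presentation."""

    def __init__(self):
        self._titles = {}

    def add(self, artwork_id, title):
        self._titles[artwork_id] = title

    def get(self, artwork_id):
        return self._titles.get(artwork_id)


class HtmlView:
    """View: turns data into markup; could be swapped for another theme."""

    def render(self, title):
        if title is None:
            return "<p>Not found</p>"
        return f"<h1>{html.escape(title)}</h1>"


class ArtworkController:
    """Controller: interprets a request and connects model and view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def show(self, artwork_id):
        return self.view.render(self.model.get(artwork_id))


model = ArtworkModel()
model.add(1, "Untitled")
controller = ArtworkController(model, HtmlView())
page = controller.show(1)
missing = controller.show(99)
```

A designer could replace HtmlView with a different class offering the same render method, which is the 'divide and conquer' benefit described above.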

Drupal boot camp
[This was a pretty convincing demo of getting started with Drupal and using the Content Construction Kit (CCK) to create custom content types e.g. work of art to publish content quickly, though I did wonder about how it integrated with ORMs that would automatically pick up an underlying data structure. Slide 103 showed recommended Drupal modules. It's definitely worth checking out if you're looking for a CMS. If you're on Windows, check out bitnami for installation.]

Client side development
"The customer is always right"
They talked about the DOM (document object model) and javascript for Web 2.0 coolness.
They recommended using Javascript toolkits – more object-orientated, solve cross-browser issues, rapid development. Slide 109 listed some Javascript toolkits and they also recommended Firebug.

Interface components
They should be re-usable, just like the server-side stuff. They showed some suggestions like reCAPTCHA, image carousels and rating modules. Pick the tools with best community support and cross-platform support.

CSS boilerplates
Treat CSS like another software component of web design and standardise your CSS usage. Use structured naming for classes and divs in server-side content generation. Check out oswd.org for free templates.

XML in the real world
They demonstrated Global Origins (more on that and other goodness at www.ima-digital.org/special-projects) which uses XML driven content.

Questions and discussion
I asked about integration with legacy/existing systems. Their middleware component 'Mercury' binds their commercial packages and other applications together. e.g. collection management system extraction layer. [This could be a good formalised model for MoL, as we have to pull from a few different places and push out to lots more and it's all a bit ad hoc at the moment. I think we'll be having lots of good discussions about this very soon.]
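
I don't know how 'Mercury' is actually built, so the following is only a guess at the general shape of an extraction layer: each source system gets a small adapter that returns data in one common format, and a thin middleware object merges the results. All class names, field names and sample data here are hypothetical.

```python
class CollectionsAdapter:
    """Hypothetical adapter over a collections management system export."""

    def __init__(self, rows):
        self._rows = rows

    def fetch(self, object_id):
        row = self._rows.get(object_id)
        if row is None:
            return None
        # Translate the source system's field name into the shared format.
        return {"id": object_id, "title": row["ObjTitle"]}


class ImageLibraryAdapter:
    """Hypothetical adapter over a separate image store."""

    def __init__(self, files):
        self._files = files

    def fetch(self, object_id):
        path = self._files.get(object_id)
        if path is None:
            return None
        return {"id": object_id, "image": path}


class Middleware:
    """Merges whatever each adapter knows about an object into one record."""

    def __init__(self, sources):
        self.sources = sources

    def record(self, object_id):
        merged = {}
        for source in self.sources:
            part = source.fetch(object_id)
            if part:
                merged.update(part)
        return merged or None


hub = Middleware([
    CollectionsAdapter({"A1": {"ObjTitle": "Oil lamp"}}),
    ImageLibraryAdapter({"A1": "img/a1.jpg"}),
])
lamp = hub.record("A1")
unknown = hub.record("Z9")
```

Anything that publishes content (website, kiosk, feed) would talk to the middleware only, so a source system could be replaced by writing one new adapter.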

Some discussion about putting pressure on vendors to open data models. It's a better economic model for them and for museums.

Their CEO is supportive of iteration (in the development process). The web team is cross-department, and they have new media content creators.

[I was curious about how iterative development and the possibility of making mistakes work with their brand but didn't want to ask too many questions]

They made the point that you have a bigger recruiting pool with open source software. [Recruiting geeks into museums has been a bit of a conference meme.]

They give away iPods for online surveys and get more responses that way, but you do have to be aware that people might only give polite answers to survey questions so pay close attention to any criticism.

The IMA say you should be able to justify the longevity of projects when experimenting. Measure your projects against your mission, and how they can implement your mission statement.

So, that's it! I hope I didn't misrepresent anything they said.

In Montreal for MW2008

I'm in Montreal until Sunday April 13 for Museums and the Web 2008. I ran into Brian Kelly on the flight over, and he convinced me to Twitter, so I guess I'll be at https://twitter.com/miaridge for the duration. I'm also uploading photos from Montreal/MW2008 to Flickr as I go. My cheap and cheerful hostel has free wifi, and I'm hoping that it'll generally be pretty easy to get online.

[Update: you can read other people's reactions to the conference at http://conference.archimuse.com/ including a feed from Mike Ellis' experimental onetag application.]

Introducing new technologies…

Shamefully, I can't remember which blog originally pointed me to this Mitchell and Webb clip on a Bronze Age Orientation day.

[Update: it was in the comments on Archaeoastronomy. I found this via Middle Savagery, which has two great posts with videos from 'Personal Histories in Archaeological Theory and Method' presentations, which I will watch when I get back from Montreal. Thanks Alun, Colleen and Mark.]

I'm in Montreal for Museums and the Web 2008 next week. I'll take notes as I go but I'm not sure whether I'll get a chance to blog. See some of you there!

The (UK) Museums Computer Group – can you tell us what you think of MCG?

The Museums Computer Group is interested in canvassing a wide range of views. If you're not a member but work in a relevant field, we'd be interested to hear about why you're not a member, and of course if you're a member we'd love to hear about how we can help you and your organisations.

Message from the MCG committee (of which I'm a member) below.

As our next 25 years of Museums Computer Group activity begins, we're looking to the future, and what sort of shape you – the membership – would like the MCG to take.

To get your views, we've set up an online survey on http://www.museumscomputergroup.org.uk.

As the sector opens up to new ways to meet audiences, new ways to allow access to searchers, and new horizons like the semantic web, the MCG committee think it's time to open up our plans to all our members for comment.

If you think about it, the way people work together, think together and spark ideas off each other is changing rapidly right now. Many of us are on Facebook or LinkedIn – what sort of opportunities might networks like these bring MCG? How can we truly make the group represent the excitement and feel of the 21st century digital world?

Do you think we should still be publishing printed newsletters, or is the web a better place to find the published expertise and thinking of a respected professional organisation? Should MCG continue to plan regular meetings for members around the UK? Does our conventional committee structure really represent the sector and serve the membership?

We'd like you to tell us what you think about these issues, and anything else you feel would be constructive and helpful. It would be great if you could visit the MCG website (http://www.museumscomputergroup.org.uk) and click the survey link on the homepage.

If you'd rather not fill in online forms, send an email directly to MCG Chair, Debbie Richards – DRichards@leics.gov.uk.

We'll be reporting survey results to the membership at the Spring 08 MCG meeting, which is on 23rd April at the National Waterfront Museum, Swansea (http://www.museumscomputergroup.org.uk/meetings/1-2008.shtml) and on the MCG website.

Bookings are now being taken for the meeting and please do book soon as spaces are filling up fast.

With best wishes,

The Committee of the Museums Computer Group