Open Objects – Page 31 – 'Every age has its orthodoxy and no orthodoxy is ever right.'

Quick and light solutions at 'UK Museums on the Web Conference 2008'

These are my notes from session 4, 'Quick and light solutions', of the UK Museums on the Web Conference 2008. In the interests of getting my notes up quickly I'm putting them up pretty much 'as is', so they're still rough around the edges. There are quite a few sections below which need to be updated when the presentations or photos of slides go online. [These notes would have been up a lot sooner if my laptop hadn't finally given up the ghost over the weekend.]

Frankie Roberto, 'The guerrilla approach to aggregating online collections'
He doesn't have slides, he's presenting using Firefox 3. [You can also read Frankie's post about his presentation on his blog.]

His projects came out of last year's mashed museum day, where the lack of re-usable cultural heritage data online was a real issue. Talk in the pub turned to 'the dark side' of obtaining data – screen scraping was one idea. Then the idea of FoI requests came up, and Frankie ended up sending Freedom of Information requests to national museums, asking for their collections data in any electronic format with some kind of structure.

He's not showing the site he presented at Montreal; it should be online soon and he'll release the code.

Frankie demonstrated the Science Museum object wiki.

[I found 'how it works' as the focus of the object text on the Science Museum wiki a really interesting way of writing object descriptions; it could work well for other projects.]

He has concerns about big top down projects so he's suggesting five small or niche projects. He asked himself, how do people relate to objects?
1. Lots of people say, "I've got one of these" so: ivegotoneofthose.com – put objects up, people can hit button to say 'I have one of those'. The raw numbers could be interesting.
[I suggested this for Exploring 20th Century London at one point, but with a bit more user-generated content so that people could upload photos of their object at home or stories about how they got it, etc. I suppose ivegotoneofthose.com could be built so that it also lets people add content about their particular thing, then ideally that could be pulled back into and displayed on a museum site like Exploring. Would ivegotoneofthose.com sit on top of a federated collections search or would it have its own object list?]
2. Looking at TheyWorkForYou.com, he suggests: TheyCollectForYou.com – scan acquisition forms, publish feeds of which curators have bought what objects. [Bringing transparency to the acquisition process?]
3. Looking at howstuffworks.com, what about howstuffworked.com?
4. 'what should we collect next?' – opening up discourse on purchasing. Frankie took the quote from Indiana Jones: thatbelongsinamuseum.com – people can nominate things that should be in a museum.
5. pricelessartefact.com – [crowdsourcing object evaluation?] – comparing objects to see which is the most valuable, however 'valuable' is defined.
[Except that possibly opens the museum to further risk of having stuff nicked to order]

Fiona Romeo, 'Different ways of seeing online collections'
I didn't take many detailed notes for this paper, but you can see my notes on a previous presentation at Notes from 'Maritime Memorials, visualised' at MCG's Spring Conference.

Mapping – objects don't make a lot of sense by themselves, but are compelling as part of information about an expedition, or failed expedition.

They'll have new map and timeline content launching next month.

Stamen can share information about how they did their geocoding and stuff.

Giving your data out for creative re-use can be as easy as giving out a CSV file.
You always want to have an API or feed when doing any website.
The National Maritime Museum take any data set they can find without licensing restrictions and put it online for creative re-use.
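
A minimal sketch of the 'as easy as giving out a CSV file' idea, using only Python's standard library. The records and field names are invented for illustration and are not taken from any museum's data; a real export would read from a collections database.

```python
import csv
import io

# Hypothetical object records; a real export would come from a collections database.
RECORDS = [
    {"id": "OBJ-1", "title": "Sextant", "date": "1790", "place": "London"},
    {"id": "OBJ-2", "title": "Ship's log, with notes", "date": "1845", "place": "Greenwich"},
]

def records_to_csv(records, fieldnames):
    """Serialise a list of dicts to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

csv_text = records_to_csv(RECORDS, ["id", "title", "date", "place"])
print(csv_text)
```

The csv module handles quoting (note the comma inside the second title), so the file can be opened in a spreadsheet or parsed by someone else's script without special handling.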

[Slide on approaches to data enhancement.]
Curation is the best approach but it's time-consuming.

Fiona spoke about her experiments at the mashed museum day – she cut and pasted transcript data into IBM's Many Eyes. It shows that really good tools are available, even if you don't have resources to work with a company like Stamen.

Mike Ellis presented a summary of the 'mashed museum' day held the day before.

Questions, wrap up session
Jon – always assume there (should be) an API

[A question I didn't ask but posted on twitter: who do we need to get in the room to make sure all these ideas for new approaches to data, to aggregation and federation, new types of experiences of cultural heritage data, etc, actually go somewhere?]

Paul on fears about putting content online: 'since the state of Florida put pictures of their beaches on their website, no-one goes to the beach anymore'.

Metrics:
Mike: need to go shout at DCMS about the metrics, need to use more meaningful metrics especially as thinking of something like APIs
Jon: watermark metadata… micro-marketing data.
Fiona: send it out with a wrapper. Make it embeddable.

Question from someone from Guernsey Museum about images online: once you've downloaded your nice image it's without metadata. George: Flickr likes as much data in EXIF as possible. EXIF data isn't permanent but is useful.

Angela Murphy: wrappers are important for curators, as they're more willing to let things go if people can get back to the original source.

Me, referring back to the first session of the day: what were Lee Iverson's issues with the keynote speech? Lee: partly about the role of an institution like the BBC in modern space. National broadcaster should set social common ground, be a fundamental part of democratic discussion. It's even more important now because of the variety of sources out there, people shutting off or being selective about information sources to cope with information overload. Disparate sources mean no middle ground or possibility of discussion. BBC should 'let it go' – send the data out. The metric becomes how widely does it spread, where does it show up? If restricted to non-commercial use then [strangling use/innovation].

The 'net recommender' thing is a flawed metric – you don't recommend something you disagree with, something that is new or difficult knowledge. What gets recommended is a video of a cute 8 year old playing Guitar Hero really well. People avoid things that challenge them.

Fiona – the advantage of the 'net recommender' is it's taking judgement of quality outside the originating institution.

Paul wondered why 7 – 8 on a scale of 10 is neutral for British people; he would have thought it's 5 – 6.

Angela: we should push data to DCMS instead of expecting them to know what they could ask for.

George: it's an opportunity to change the way success is measured. Anita Roddick says 'when the community gives you wealth, it's time to give it back'. [Show, don't tell] – what would happen if you were to send a video of people engaging instead of just sending a spreadsheet?

Final round comments
Fiona: personal measure of success – creating culture of innovation, engagement, creating vibrant environment.

Paul: success is getting other people to agree with what we've been talking about [at the mashed museum day and conference] the past two days. [yes yes yes!] A measure of success was how a CEO reacted to discovering videos about their institution on YouTube – he didn't try to shut it down, but asked, 'how we can engage with that'

Ross on 'take home' ideas for the conference
Collections – we conflate many definitions in our discussions – images, records, web pages about collections.

Our tone has changed. Delivery changed – realignment of axis of powers, MLA's Digital portfolio is disappearing, there's a vacuum. Who will fill it? The Collections Trust, National Museum Directors' Conference? Technology's not a problem, it's the cultural, human factors. We need to talk about where the tensions are, we've been papering over the cracks. Institutional relationships.

The language has changed – it was about digitisation, accessibility, funding. Three words today – beauty, poetry, life. We're entering an exciting moment.

What's the role of the Museums Computer Group – how and what can the MCG do?

The BBC, accessibility, the hCalendar microformat and RDFa

The BBC have announced (in 'Removing Microformats from bbc.co.uk/programmes') that they'll stop using the hCalendar microformat because of concerns about accessibility, specifically the use of the HTML abbreviation element (the abbr tag):

Our concerns were:

the effect on blind users using screen readers with abbreviation expansion turned on where abbreviations designed for machines would be read out

the effect on partially sighted users using screen readers where tool tips of abbreviations designed for machines would be read out

the effect of incomprehensible tooltips on users with cognitive disabilities

the potential fencing off of abbreviations to domains that need them

Until these issues are resolved the BBC semantic markup standards have been updated to prevent the use of non-human-readable text in abbreviations.

They're looking at using RDFa, which they describe as 'a slightly bigger S semantic web technology similar to microformats but without some of the more unexpected side-effects'.
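
To make the concern concrete, here is a small sketch of the abbr pattern that hCalendar uses for dates. The markup is illustrative only (it is not taken from bbc.co.uk), and the class names follow the hCalendar convention of a 'vevent' with a 'dtstart' abbreviation. The parser shows the two values that sit side by side: the machine-readable title and the text a sighted reader sees.

```python
from html.parser import HTMLParser

# Illustrative markup only: the hCalendar abbr pattern puts a
# machine-readable date in the title attribute of an abbr element.
SAMPLE = (
    '<div class="vevent">'
    '<span class="summary">Evening programme</span> at '
    '<abbr class="dtstart" title="2008-06-19T20:00:00+01:00">8pm</abbr>'
    '</div>'
)

class AbbrCollector(HTMLParser):
    """Collect (title attribute, visible text) pairs for every abbr element."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._title = None  # title of the abbr we are currently inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            self._title = dict(attrs).get("title", "")
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._title is not None:
            self.pairs.append((self._title, "".join(self._text)))
            self._title = None

collector = AbbrCollector()
collector.feed(SAMPLE)
collector.close()
for title, visible in collector.pairs:
    # A parser wants the title; a sighted reader sees the text; a screen reader
    # with abbreviation expansion turned on may speak the title instead.
    print(f"machine value: {title!r}, visible text: {visible!r}")
```

The timestamp in the title is exactly the 'non-human-readable text in abbreviations' that the updated BBC standards rule out.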

Their support for RDFa is timely in light of Lee Iverson's presentation at the UK Museums on the Web conference (my notes). It's also an interesting study of what can happen when geek enthusiasm meets existing real world users.

More generally, does the fact that an organisation as big as the BBC hasn't yet produced an API mean that creating an API is not a simple task, or that the organisational issues are bigger than the technical issues?

Next-generation approaches at 'UK Museums on the Web Conference 2008'

Session 3, 'Next-generation approaches', of the UK Museums on the Web Conference 2008 was introduced by Jon Pratty.

Jon questioned, 'what is a virtual museum?' It can be pretty much anything. Lots of valuable historical documents aren't in an 'online museum', they're just out there to be found by search. It raises the question – how much permanence should digital objects have?

George Oates, 'Sharing museum collections through Flickr'
Introducing the Flickr Commons project and talking about some early results. Some practical information on what it means to join the program, and things that have come out of it.

Flickr 'swerved in from left field' and bumped into museum people and librarians and archivists.

It started with Library of Congress thinking about how to engage with Web 2.0. They were looking for a Web 2.0 partner. They have 14 million images, about a million digitised.

Flickr is designed specifically to search and browse photos. It has a big infrastructure and supports interfaces in 8 languages. It has lots of eyeballs – "it's made of people".

From the Commons point of view, it's simply a service, organisations can publish content into it.

They hit a hurdle: can a collecting institution publish content onto a site like Flickr? As a collecting institution, someone like the Library of Congress doesn't necessarily own the copyright or know who the copyright holder was. They devised a new statement – 'no known copyright restrictions' – which gave institutions a way to publish content once they had done as much work as they could to trace the copyright holders, even if they weren't able to find them.

Might open up to other sorts of content.

What's it for? Increase access to public photography collections; gather context about them, [something else I missed].

Powerhouse – lots of the collection was geo-tagged. It means you can find photos from then and now, for example around the CBD of Sydney. [Cool! I love the way geo-tagging content lets you build up layers of history]

Brooklyn – it made sense to use their existing established Flickr account, so Flickr created functionality to support that. The Smithsonian joined on Monday.

Soon they'll have content from other partners including a charming collection from a tiny local museum.

Results:
Last 28 days Library of Congress – 15,000 [or 50,000?] views per day, 8 million views over last six months, 72,000 tags.
Powerhouse – 77,000 views (more views of that collection in one month than in the whole previous year), 3500 tags.
Brooklyn – figures affected by merged account issue.
Smithsonian – 10,000 views in first day, 100 new contacts

The numbers are probably affected by the ratio of photos e.g. smaller numbers when an institution has put fewer photos online.

"But, is it any good?"
Suddenly there are conversations between Flickr users and institutions, and between Flickr users, contributing information and identifications.

They contribute the identification of places and people, with information about the history behind photos.

Now and then – people are adding their recent photos of a location via comments on Flickr.

Library of Congress have made a list of types of interactions [slides], they include the transcription of text on signs, posters, etc in background, geo-tags, non-English tags.

Institutional context and Flickr – bind them together with hyperlink, but being on Flickr frees a program from institutional constraints.

Flickr has been designed as a vessel or platform where interactions and conversations can happen.

The information that the community provides is proving useful. The Library of Congress has updated 176 records in catalogue, recording that it's based on 'information provided by Flickr Commons Project 2008'.

The Smithsonian found it was opportunity for collaboration between institutions/departments and staff.

How to join: the process is publish – interact – feedback.

What to think about: give a broad representation of what's in your collection. Think about placement of images in photostream and sets. Plan to attract special interest groups. Think about what is already digital, what is popular? It can direct your digitisation efforts with feedback from a live community. Or you could go into your stores or collections database and possibly digitise randomly.

How much metadata to include? How many fields from database into description of photo; more or less?

When: can be a challenge for institutions.

How? You could use the normal Flickr Uploadr if you don't have too many images; or you could use the API to write applications that will work with Collections Management Systems.
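
As a sketch of what such an application has to decide (how many database fields go into the photo description), here is a mapping from one collections record to title, description and tags strings. Everything in it is an assumption for illustration: the record, its field names and the to_upload_fields helper are hypothetical, quoting multi-word tags is a guess at how a space-separated tag list would be read, and the authenticated upload call itself is not shown.

```python
# Hypothetical CMS record; the field names are invented for illustration.
RECORD = {
    "accession_number": "1985/123",
    "title": "Street scene, Sydney",
    "date_made": "c. 1900",
    "maker": "Unknown",
    "subjects": ["street scenes", "Sydney", "trams"],
    "record_url": "https://example.org/collection/1985-123",
}

def to_upload_fields(record, description_fields=("maker", "date_made", "accession_number")):
    """Build title, description and tags strings from one CMS record.

    description_fields controls how much of the record is copied into the
    photo description: pass fewer names for a lighter description.
    """
    lines = [
        f"{name.replace('_', ' ').capitalize()}: {record[name]}"
        for name in description_fields
        if record.get(name)
    ]
    # Always link back to the institution's own record.
    lines.append(f"Source: {record['record_url']}")
    # Quote multi-word subjects so each stays one tag in a space-separated list
    # (an assumption about the tag format, not a documented rule).
    tags = " ".join(f'"{s}"' if " " in s else s for s in record.get("subjects", []))
    return {"title": record["title"], "description": "\n".join(lines), "tags": tags}

fields = to_upload_fields(RECORD)
print(fields["description"])
print(fields["tags"])
```

Keeping the field list as a parameter makes the 'more or less?' question a configuration choice rather than something baked into the code.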

Who? Might be web technician and curator.

The catch? It costs $24.95 for a Pro account. But you get unlimited storage, and could conceivably put whole collection online.

The future:
It's a work in progress. Probably will end up developing tools like additional reporting
Grow gently (make sure institution can handle the changes and respond to interactions)
They will continue their focus on photographs, not photographs of objects "(sorry)". "Flickr is about … empathic photography"
"Go local" e.g. small archives in little towns – people can still participate even if they don't have a web team, or web site.
API methods, RSS
Searching, browsing, maps
Search across Commons coming soon. Maybe combine searches to see a map of photos taken in 1910.

'Orphan works' legislation – the artists' view

A perspective on the proposed US 'orphan works' legislation at the Art Newsletter: "The proposed new law is a nightmare for artists".

US Congress is currently debating legislation which will remove the penalty for copyright infringement if the creator of a work, after a diligent search, cannot be located. Libraries and archives are among the groups lobbying for the change to allow copying of so-called "orphan works".

"The proposal goes far beyond current concepts of fair use, and, as explicitly acknowledged by the Register of Copyrights in a recent congressional hearing, it is not designed to deal with the special situations of non-profit museums, libraries and archives. Rather, it would give carte blanche to infringers even if they wished to exploit an artistic work for commercial advantage.
…
The Copyright Office presumes that the infringers it would let off the hook would be those who had made a "good faith, reasonably diligent" search for the copyright holder. Unfortunately, it is totally up to the infringer to decide if he has made a good faith search.
…
And, the Copyright Office has made it clear that failure to register a work with these private companies would automatically render it an orphan, available to be copied by infringers with impunity."

While there are clear benefits to clarifying the situation with orphan works, and to protecting heritage organisations from the possible risks of publishing non-orphan works in good faith, it seems that, as Obi-Wan might say, this proposed legislation is not the solution we're looking for.

'Sector-wide initiatives' at 'UK Museums on the Web Conference 2008'

Session 2, 'Sector-wide initiatives', of the UK Museums on the Web Conference 2008 was chaired by Bridget McKenzie.

In the interests of getting my notes up quickly I'm putting them up pretty much 'as is', so they're still rough around the edges. There are quite a few sections below which need to be updated when the presentations or photos of slides go online. Updated posts should show in your RSS feed but you might need to check your settings.

[I hope Bridget puts some notes from her paper on her blog because I didn't get all of it down.]

The session was introduced as case studies on how cross institutional projects can be organised and delivered. She mentioned resistance to bottom-up or experimental approach, institutional constraints; and building on emerging frames of web.

Does the frame of 'the museum' make sense anymore, particularly on the web? What are our responsibilities when we collaborate? Contextual spaces – chance to share expertise in meaningful ways.

It's easy to revert to ways previous projects have been delivered. Funding plans don't allow for iterative, new and emergent technologies.

Carolyn Royston and Richard Morgan, V&A and NMOLP.
The project is funded by the 'invest to save' program, Treasury.

Aims:
Increase use of the digital collections of the 9 museums (no new website)
No new digitisation or curatorial content.
Encourage creative and critical use of online resources.
[missed one]
Sustainable high-quality online resource for partners.

The reality – it's like herding cats.

They had to address issue of partnership to avoid problems later in project.

Focussed on developing common vision, set of principles on working together, identify things uniquely achievable through partnership, barriers to success, what added value for users.

Three levels of barriers to success – one of working in an inter-museum collaborative way, which was a first for those nationals; organisational issues – working inter-departmentally (people are learning or web or whatever people and not used to working together); personal issues – people involved who may not think they are web or learning people.

These things aren't necessarily built into the project plan.

Deliverables: web quests, 'creative journeys', federated search, [something I missed], new ways of engaging with audiences.

Web Quests – online learning challenge, flexible learning tool mapped to curriculum. They developed a framework. It supports user research, analysis and synthesis of information. Users learn to use collections in research.

Challenges: creating meaningful collection links; sending people to collections sites knowing that content they'd find there wasn't written for those audiences; provide support for pupils when searching collections. Sustainable content authoring tool and process.

[I wondered if the Web Quest development tools are extendible, and had a chance to ask Carolyn in one of the breaks – she was able to confirm that they were.]

Framework stays on top to support and structure.

Creative journeys:
[see slide]

They're using Drupal. [Cool!]

[I also wondered about the user testing for creative journeys, whether there was evidence that people will do it there and not on their blogs, Zotero, in Word documents or hard drives – Carolyn also had some information on this.]

Museums can push relevant content.

What are the challenges?
How to build and sustain the Creative Journeys (user-generated content) communities, individually and as a partnership?
Challenge to curatorial authority and reputation
Work with messiness and complexity around new ways of communicating and using collections
Copyright and moderation issues

But partners are still having a go – shared risk, shared success.

Federated search
Wasn't part of original implementation plan
[slide on reasons for developing]
Project uses a cross collection search, not a cross collection search project. The distinction can be important.

The technical solution was driven by project objectives [choices were made in that context, not in a constraint-free environment.]

Richard, Technical Solution
The back-end is de-coupled from front end applications
A feed syndicates user actions.

Federated search – a system for creating machine readable search results and syndicating them out.
Real time search or harvester. [IMO, 'real time' should always be in scare quotes for federated searches – sometimes Google creates expectations of instantaneous results that other searches can't deliver, though the difference may only be a matter of seconds.]
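
The difference between the two modes can be sketched as follows. This is not the NMOLP implementation: the two 'museum' search functions are invented stand-ins for partner searches that would really be called over HTTP, and the record fields are made up.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for partner collection searches; names and records are invented.
def search_museum_a(term):
    data = [{"id": "a1", "title": "Roman buckle"}, {"id": "a2", "title": "Steam engine"}]
    return [r for r in data if term.lower() in r["title"].lower()]

def search_museum_b(term):
    data = [{"id": "b1", "title": "Bronze buckle"}, {"id": "b2", "title": "Tea set"}]
    return [r for r in data if term.lower() in r["title"].lower()]

SOURCES = {"museum-a": search_museum_a, "museum-b": search_museum_b}

def federated_search(term):
    """'Real time' mode: query every source when the user searches, then merge."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {name: pool.submit(fn, term) for name, fn in SOURCES.items()}
        results = []
        for name, future in futures.items():
            for record in future.result():
                results.append({"source": name, **record})
    return results

def harvest():
    """Harvester mode: copy everything into a local index ahead of time."""
    index = []
    for name, fn in SOURCES.items():
        # An empty term matches every title, so this fetches all records.
        index.extend({"source": name, **record} for record in fn(""))
    return index

def search_index(index, term):
    """Search the harvested copy without touching the partner sites."""
    return [r for r in index if term.lower() in r["title"].lower()]

live = federated_search("buckle")
harvested = search_index(harvest(), "buckle")
print(json.dumps(live, indent=2))  # machine-readable results, ready to syndicate
```

In the live mode the response is only as fast as the slowest partner, which is why 'real time' deserves its scare quotes; the harvested index answers quickly but is only as fresh as the last harvest.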

Data manipulation isn't the difficult bit

Creative Journeys – more machine readable data

Syndicated user interactions with collections.
Drupal [slide]

Human factor – how to sell to board
Deploy lightweight solutions. RAD. Develop in house, don't need to go to agency.

[I'd love it if the NMOLP had a blog, or a holding page, or something, where they could share the lessons they've learnt, the research they've done and generally engage with the digital museum community. Generally a lot of these big infrastructure projects would benefit from greater transparency, as scary as this is for traditional organisations like museums. The open source model shows that many eyeballs mean robust applications.]

Jeremy Ottevanger and Europeana/the European Digital Library
[I have to confess I was getting very hungry by this point so you might get more detailed information from Jeremy's blog when he adds his notes.]
Some background on his involvement in it, hopes and concerns.
"cross-domain access to Europe's cultural heritage"
Our content is more valuable together than scattered around.

Partnership, planning and prototyping
Not enough members from the UK, not very many museums.
Launch November this year
Won't build all of planned functionality – user-generated content and stuff planned but not for prototype.

Won't build an API or all levels of multilinguality (in first release). Interface layer may have 3 or 4 major languages; object metadata (maybe a bit) and original content of digitised documents.

Originals stay on the content contributors' sites, so traffic ends up there. That's not necessarily clear in the maquette (prototype). [But that knowledge might help address some concerns generally out there about off-site searches]

Search, various modes of browsing, timeline and stuff.

Jeremy wants to hear ideas, concerns, ambitions, etc to take to plenary meeting.

He'd always wanted personal place to play with stuff.

[Similarly to my question above, I've always wondered whether users would rely on a cultural heritage sector site to collate their data? What unique benefits might a user see in this functionality – authority by association? live updates of data? Would they think about data ownership issues or the longevity of their data and the reliability of the service?]

Why are there so few UK museums involved in this? [Based on comments I've heard, it's about no clear benefits, yet another project, no API, no clear user need] Jeremy had some ideas but getting in contact and telling him is the best way to sort it out.

Some benefits include common data standards, a big pool of content that search engines would pay attention to in a way they wouldn't on our individual sites. Sophisticated search. Will be open source. Multi-lingual technology.

Good news:
"API was always in plans".

EDLocal – PNDS. EU projects will be feeding in technologies.

Bad news: API won't be in website prototype. Is EDLocal enough? Sustainability problems.
'Wouldn't need website at all if had API'. Natural history collections are poorly represented.

Is OAI a barrier too far? You should be able to upload from spreadsheet. [You can! But I guess not many people know this – I'm going to talk to the people who coded the PNDS about writing up their 'upload' tool, which is a bit like Flickr's Uploadr but for collections data.]
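
A rough sketch of the 'upload from spreadsheet' idea: turning rows of a spreadsheet saved as CSV into simple Dublin Core XML. This is not the PNDS upload tool; the spreadsheet contents are invented, and the 'records'/'record' wrapper elements are placeholders rather than the envelope a real OAI repository would use. The column names are chosen to match Dublin Core element names.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC_NS)

# Stand-in for a spreadsheet saved as CSV; columns and rows are invented.
SPREADSHEET = """identifier,title,creator,date
M.1-2008,Roman buckle,,
M.2-2008,Tea set,Wedgwood,1790
"""

def rows_to_dc(csv_text):
    """Turn each spreadsheet row into a record of simple Dublin Core elements."""
    root = ET.Element("records")  # placeholder wrapper, not an OAI envelope
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, "record")
        for column, value in row.items():
            if value:  # skip empty cells rather than emit empty elements
                ET.SubElement(record, f"{{{DC_NS}}}{column}").text = value
    return root

root = rows_to_dc(SPREADSHEET)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The point of the sketch is that the contributor only has to maintain a spreadsheet; the conversion to a shared format can happen on the aggregator's side.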

Questions
Jim O'Donnell: regarding the issue of lack of participation. People often won't implement their own OAI repository so that requirement puts people off.

Dan Zambonini: aggregation fatigue. 'how many more of these things do we have to participate in'. His suggestion: it should be the other way around – tell museums to build APIs so that projects can use their data. Jeremy responded that that's difficult for smaller museums. [Really good point, and the PNDS/EDL probably has the most benefits for smaller museums; bigger museums have the infrastructure not to need the functionality of the PNDS though they might benefit from cross-sector searching and better data indexing.]

Gordon McKenna commented: EDLocal starts on Wednesday next week, for three years.

George Oates: what's been most surprising in collaboration process? Carolyn: that we've managed to work together. Knowledge sharing.

Notes from 'UK Museums on the Web Conference 2008'

I'm back in London after UK Museums on the Web Conference 2008 and the mashed museum day.

In the interests of getting my notes up quickly I'm putting them up pretty much 'as is', so they're still rough around the edges. I'll add links to the speaker slides when they are all online. Some photos from the two days are online – a general search for ukmw08 on Flickr will find some. I have some in a set online now, others are still to come, including some photos of slides so I'll update this as I check the text from the slides. These are my notes from the first session.

The keynote speech was given by Tom Loosemore of Ofcom on the Future of Public Service Content.

[For context, Ofcom is the 'independent regulator and competition authority for the UK communications industries' and their recent second review of public service broadcasting, 'The Digital Opportunity', caused a stir in the digital cultural heritage world for its assessment of the extent to which public sector websites delivered on 'public service purposes and characteristics'. You can read the summary or download the full report.]

'How many of you are on the main board of your institution?'

Leadership doesn't have the vision in place to take advantage of the internet.

Sees the internet as platform for public service, [most importantly] enlightenment. He's here today to enlist our help.

We view the internet through lens of expectations from the past, definitely in public service broadcasting – 'let's get our programs on the internet'.

What is value for money?

Would that other sectors did the same soul searching

[On the Ofcom review:] 'You can't really review the web, it's bonkers'

Public service characteristics to create a report card. Of the public service characteristics in the online market (high quality, original, innovative, challenging, engaging, discoverable and accessible), 'challenging' is the hardest.

Museums and cultural sector have amazing potential. What are the barriers between the people here who get it and being able to take that opportunity and redefine public service broadcasting?

It's not skills. Maybe ten years ago, not today. And it's not technology. The crucial missing link is leadership and vision, the lack of recognition by people who govern direction of institutions of the huge potential.

[Which does translate into 'more resources', eventually, but perhaps the missing gap right now is curatorial/interpretative resources? Every online project we do generates more enquiries, stretching these people further, and they don't have time to proactively create content for ad hoc projects as it is, especially as their time tends to be allocated a long time in advance.]

What's behind that reluctance, what can you do to help people on your board understand the opportunities? We can ask 'what business are we in? what's the purpose of our institution?'.

Tate recognise they're not just in the business of getting people to go to the Tate venues, they're in the business of informing people about art. Compare that to the Royal Shakespeare Company which is using its online site purely to get bums on seats.

Next opportunity… how do you take the opportunity to digitise your collections and reach a whole new audience? How can you make better use of cultural objects that were previously constrained by physicality?

What opportunities are native to the internet, can only happen there? How can it help your institution to deliver its purpose?

Recognise that you are in the (public service) media business.

How do you measure enlightenment? You could be changing the way people see the world, etc. but you need to measure it to make a case, to know whether you're succeeding. Metrics really really matter in public service arena.

BBC used to look at page views, but developers gamed the system. Then the metric was 'time online', but it stopped people thinking externally. Metric as proxy for quality.

Value = reach x quality. What kind of experience did they have?

Quality is the really hard part. As defined by BBC: quality is in the eye of the beholder. Did the user have an excellent experience?

BBC measure 'net promoter' – how likely are you to recommend this to a friend or colleague, on a scale of 1 – 10?

[But for our sector, what if you don't have any friends with the same interest in x? Would people extrapolate from their specific page on a Roman buckle to recommend the site generally?]

Throw away the 'soggy British middle' – the 7, 8s (out of ten).

Group them as Promoters (9-10/10), Passive (7-8/10), Detractors (0 – 6/10). The key measure is the difference between how many Promoters and how many Detractors. This was 'fabulously useful' at the BBC. 30% is good benchmark.
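
The arithmetic behind that measure, as a small sketch. The grouping follows the bands given above; the survey responses are invented.

```python
def net_promoter_score(ratings):
    """Percentage of Promoters (9-10) minus percentage of Detractors (0-6).

    Passives (7-8) count towards the total but towards neither group.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

# Invented survey responses: 5 promoters, 3 passives, 2 detractors.
sample = [10, 9, 9, 10, 9, 8, 7, 7, 6, 3]
score = net_promoter_score(sample)
print(score)  # 30.0
```

With ten responses, 50% promoters minus 20% detractors gives 30, which happens to match the benchmark mentioned above. Note that a site where everyone answers 7 or 8 scores exactly zero, however many people were asked.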

They mapped whole BBC portfolio against 'net promoters' % and reach, bubbles show cost.

It's not necessarily about reaching mass audiences. But when producing for niche audiences – they must love it, and it shouldn't cost that much.

He's telling us this because it's the language of funders, of KPIs, this is hard evidence with real people. You might use a different measure of quality but you can't talk about opportunities in abstract, must have numbers behind them.

Suggested the BBC's 15 Web Principles, including 'fall forward, fast'.

A measure of personal success for him would be that in x years, when he asked 'who here is on the board of your institution?', at least x should put hands up.

[I really liked this keynote speech as a kick up the arse in case we started to get too complacent about having figured out what matters to us, as museum geeks. It doesn't count unless we can get through our organisations and get that content out to audiences in ways they can use (and re-use).]

In linking the sessions, Ross Parry mused about the legacy of 18th, 19th century ideas of how to build a museum, how would they be different if museums were created today?

Lee Iverson, How does the web connect content? "Semantic Pragmatics"
'Profoundly disagreed' with some of the things Tom was talking about, wants to have a dialogue.
He asked how many know the background to semantic web stuff? Quite a few hands were raised.

Talking about how the web works now and where it's going. Museums have significant opportunity to push things forward, but must understand possibilities and limitations.

Changing classic relationship – museum websites as face of institution to users. Huge opportunity for federating and aggregating content (between museums) – an order of magnitude better.

He's working with 13 museums, with Northwest Native American artefacts. Communities are co-developers, virtually repatriating their (land).

Possibility to connect outside the museum. Powerhouse Museum as an excellent example of why (and how) you should connect.

Becoming connected:
Expose own data from behind presentation layers
Find other data
Integrate – creating a cohesive (situation)
Engage with users

Access to data is core business, curatorial stuff.

RDFa
Pragmatics of standards – get a sense of what it is you're doing [and start, don't try and create the system of everything first], it'll never work. Use existing standards if possible, grab chunks if you can. Never standardise more than you minimally need to get the utility you need at the moment. Then extend: layers, version 2. A standard is an agreement between a minimum of two people [and doesn't have to be more complicated than that].

"Just do it" – make agreements, get it to work, then engage in the standardisation process.

Relationship between this and semantic web? Semantic web as 'data web'. Competing definitions.

Slide on Tim Berners-Lee on the semantic web in 1999.

Why hasn't it appeared? It's vapourware, you can't make effective standards for it.

Syntax – capability of being interpreted. Semantic – ability to interpret, and to connect interpretations.

Finding data – how much easier would it be if we could just grab the data we want directly from where we want it?

Key is relating what you're doing to what they're doing.

XML vs RDF
Semantic web built on RDF, it's designed for representing metadata. It's substantially different to XML. Lots of reaction against RDF has been reaction against XML encoding, syntactic resistance.

RDF is designed to be manipulated as data, XML is about annotating text. In XML, syntax is the thing, with RDF the data is the thing.

With XML, you have to grab the entire doc before you can figure out how to smoosh them together. RDF works by reference, you can just build on it.
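
[A rough sketch of the 'works by reference' point, using plain Python tuples rather than a real RDF library. The URIs, property names and values are all invented for illustration. The idea is just that statements from two sources about the same identifier can be pooled without either source knowing the other's document structure.]

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical examples, not real museum URIs.
OBJECT = "http://example.org/object/123"
PLACE = "http://example.org/place/sussex"

museum_a = {
    (OBJECT, "title", "Roman buckle"),
    (OBJECT, "material", "bronze"),
}
museum_b = {
    (OBJECT, "foundAt", PLACE),
    (PLACE, "label", "Sussex"),
}

# Merging is just set union: statements that share a subject line up
# by reference, with no need to restructure either source first.
merged = museum_a | museum_b


def describe(graph, subject):
    """Collect every predicate/object pair known about one subject."""
    return {p: o for s, p, o in graph if s == subject}


print(describe(merged, OBJECT))
```

[The dictionary in `describe` keeps one value per predicate, which is fine for a toy example but would lose data if a subject had, say, two titles.]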

RDFa. A way of embedding RDF content directly in XHTML, relies on same strategies as microformats. Will be ignored by presentation oriented systems but readable by RDF parsers.
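
[To make the 'ignored by presentation, readable by parsers' idea concrete, here is a toy sketch using Python's standard `html.parser`. The `about` and `property` attributes are RDFa; the fragment, the URI and the `ToyRDFaReader` class are my own assumptions, and this is nowhere near a full RDFa parser (no prefix handling, no nesting rules). A browser would just show a heading and a paragraph.]

```python
from html.parser import HTMLParser

# Hypothetical XHTML fragment; prefix declarations are left out for brevity.
SNIPPET = """
<div about="http://example.org/object/123">
  <h2 property="dc:title">Roman buckle</h2>
  <p property="dc:description">Bronze buckle found in Sussex.</p>
</div>
"""


class ToyRDFaReader(HTMLParser):
    """Very simplified: picks up 'about' and 'property' attributes only."""

    def __init__(self):
        super().__init__()
        self.subject = None   # most recent 'about' value
        self.pending = None   # property waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            self.pending = attrs["property"]

    def handle_data(self, data):
        # The first non-blank text after a 'property' becomes its value.
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


reader = ToyRDFaReader()
reader.feed(SNIPPET)
for triple in reader.triples:
    print(triple)
```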

[RDF triples vs machine tags? RDF vs microformats? How RDF-like is OAI PMH?]

You can talk about things you don't have a representation for e.g. people.

Ignore the term 'ontology' – it's just a way of talking about a vocabulary.

Four steps for widespread adoption:
Promote practical applications
Develop applications now
[and the slide was gone and I missed the last two steps!]

There was also some stuff on limitations of lightweight approaches, and hermetically sealed museum data, user experiences. Also a bit on 'give away structured data' but with a good awareness of the need to keep some data private – object location and value, for example.

Ross – we've had the media context and technical context, now for the sector context.

Paul Marty, Engaging Audiences by connecting to collections online.
Vital connections…

What does it mean to say x% of your collection is online? For whom is it useful?

How to engage audiences around your collections? Not just presenting information.

Goes beyond providing access to data. Research shows audiences want engagement. Surveyed 1200 museum visitors about their requirements. [I would love to see the research] Virtuous circle between museum visits and website visits.

Build on interest, give experience that grabs people.

Romans in Sussex website – multiple museums offering collections for multiple audiences. Re-presenting same content in different ways on the fly.

Audiences
Don't just give general public a list of stuff. Give them a way to engage.

"Engaging a community around a collection is harder than providing access to data about a collection"

Photo of the week – says "What do you know about this photo? Please share your thoughts with us" But no link or instructions on how to do it. But at least they're trying…

Discussion – Tom, Lee and Paul.

"Why do you digitise collections before had need in mind?" [Because the driver is internal, not external, needs, would be the generous answer; because they could get funding to do it would be my ungenerous answer].

Tom on RDF – how seriously engaged with it to build audiences, tell stories.

BBC licence terms – couldn't re-use data for commercial purposes/at all.

Leadership need to understand opportunities because otherwise they won't support geek stuff.

Qu: terms of engagement – how is it defined?

Paul – US has made same mistakes re digitisation of collections and websites that don't have reusable data.

Participants must be involved in process from the beginning, need input at start from intended users on how it can engage them.

Fiona: why not use existing resources, go to existing sites with established audiences?

Lee: how did YouTube succeed – people were brought by embedded content. [This issue of using 'wrappers' around your content to help it go viral by being embeddable elsewhere was raised in another session too.]

Tom: letting go is how you win, but it's a profound challenge to institutions and their desire to maintain authority.

'The Machine That Changed the World' online

I think I'm posting this as much to tell you about it as to remind myself to watch it!

The Machine That Changed the World is the longest, most comprehensive documentary about the history of computing ever produced, but since its release in 1992, it's become virtually extinct. …

It's a whirlwind tour of computing before the Web, with brilliant archival footage and interviews with key players — several of whom passed away since the filming.

In other news, the mashed museum day is tomorrow, and the UK Museums on the Web conference is the day afterwards – see you there, maybe! I've been flat out so I've no idea what I'll work on tomorrow – I have lots of ideas but haven't had a chance to do any preparation.

Yahoo! SearchMonkey, the semantic web – an example from last.fm

I had meant to blog about SearchMonkey ages ago, but last.fm's post 'Searching with my co-monkey' about a live example they've created on the SearchMonkey platform has given me the kick I needed. They say:

The first version of our application deals with artist, album and track pages giving you a useful extract of the biography, links to listen to the artist if we have them available, tags, similar artists and the best picture we can muster for the page in question.

Some background on SearchMonkey from ReadWriteWeb:

At the same time, it was clear that enhancing search results and cross linking them to other pieces of information on the web is compelling and potentially disruptive. Yahoo! realized that in order to make this work, they need to incentivize and enable publishers to control search result presentation.

…

SearchMonkey is a system that motivates publishers to use semantic annotations, and is based on existing semantic standards and industry standard vocabularies. It provides tools for developers to create compelling applications that enhance search results. The main focus of these applications is on the end user experience – enhanced results contain what Yahoo! calls an "infobar" – a set of overlays to present additional information.

…

SearchMonkey's aim is to make information presentation more intelligent when it comes to search results by enabling the people who know each result best – the publishers – to define what should be presented and how.

(From Making the Web Searchable: The Story of SearchMonkey)

And from Yahoo!'s search blog:

This new developer platform, which we're calling SearchMonkey, uses data web standards and structured data to enhance the functionality, appearance and usefulness of search results. Specifically, with SearchMonkey:

Site owners can build enhanced search results that will provide searchers with a more useful experience by including links, images and name-value pairs in the search results for their pages (likely resulting in an increase in traffic quantity and quality)

Developers can build SearchMonkey apps that enhance search results, access Yahoo! Search's user base and help shape the next generation of search

Users can customize their search experience with apps built by or for their favorite sites

This could be an interesting new development – the question is, how well does the data we currently output play with it; could we easily adapt our pages so they're compatible with SearchMonkey; should we invest the time it might take? Would a simple increase in the visibility and usefulness of search results be enough? Could there be a greater benefit in working towards federated searches across the cultural heritage sector or would this require a coordinated effort and agreement on data standards and structure?

Update to link to the Yahoo! Search Blog post 'The Yahoo! Search Gallery is Open for Business' which has a few more examples.

Nice information design/visualisation pattern browser

infodesignpatterns.com is a Flash-based site that presents over 50 design patterns 'that describe the functional aspects of graphic components for the display, behaviour and user interaction of complex infographics'.

The development of a design pattern taxonomy for data visualisation and information design is a work in progress, but the site already has a useful pattern search, based on order principle, user goal, graphic class and number of dimensions.

Some ideas for location-linked cultural heritage projects

I loved the Fire Eagle presentation I saw at the WSG Findability event [my write-up] because it got me all excited again about ideas for projects that take cultural heritage outside the walls of the museum, and more importantly, it made some of those projects seem feasible.

There's also been a lot of talk about APIs into museum data recently and hopefully the time has come for this idea. It'd be ace if it was possible to bring museum data into the everyday experience of people who would be interested in the things we know about but would never think to have 'a museum experience'.

For example, you could be on your way to the pub in Stoke Newington, and your phone could let you know that you were passing one of Daniel Defoe's hang outs, or the school where Mary Wollstonecraft taught, or that you were passing a 'Neolithic working area for axe-making' and that you could see examples of the Neolithic axes in the Museum of London or Defoe's headstone in Hackney Museum.

That's a personal example, and those are some of my interests – Defoe wrote one of my favourite books (A Journal of the Plague Year), and I've been thinking about a project about 'modern bluestockings' that will collate information about early feminists like Wollstonecraft (contact me for more information) – but ideally you could tailor the information you receive to your interests, whether it's football, music, fashion, history, literature or soap stars in Melbourne, Mumbai or Malmo. If I can get some content sources with good geo-data I might play with this at the museum hack day.

I'm still thinking about functionality, but a notification might look something like "did you know that [person/event blah] [lived/did blah/happened] around here? Find out more now/later [email me a link]; add this to your map for sharing/viewing later".
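
A very rough sketch of how that might work, assuming each record has a latitude, a longitude and some interest tags. Everything here is hypothetical: the record structure, the function names, the 200 metre radius and especially the coordinates, which are placeholders and not real geo-data. Distance is the standard haversine (great-circle) formula.

```python
from math import asin, cos, radians, sin, sqrt


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * r * asin(sqrt(a))


# Hypothetical records with placeholder coordinates, invented for the example.
PLACES = [
    {"who": "an example writer", "what": "lived",
     "lat": 51.5600, "lon": -0.0800, "tags": {"literature"}},
    {"who": "an example teacher", "what": "taught",
     "lat": 51.5700, "lon": -0.0900, "tags": {"history"}},
]


def notifications(lat, lon, interests, places, radius_m=200):
    """Messages for nearby places that match at least one interest."""
    messages = []
    for place in places:
        nearby = distance_m(lat, lon, place["lat"], place["lon"]) <= radius_m
        if nearby and place["tags"] & interests:
            messages.append(
                f"Did you know that {place['who']} {place['what']} around here?"
            )
    return messages


print(notifications(51.5601, -0.0800, {"literature", "music"}, PLACES))
```

The 'find out more now/later' and 'add this to your map' options would hang off each message; the sketch only covers the matching step.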

I've always been fascinated with the idea of making the invisible and intangible layers of history linked to any one location visible again. Millions of lives, ordinary or notable, have been lived in London (and in your city); imagine waiting at your local bus stop and having access to the countless stories and events that happened around you over the centuries. Wikinear is a great example, but it's currently limited to content on Wikipedia, and this content has to pass a 'notability' test that doesn't reflect local concepts of notability or 'interestingness'. Wikipedia isn't interested in the finds associated with an archaeological dig that happened at the end of your road in the 1970s, but with a bit of tinkering (or a nudge to me to find the time to make a better programmatic interface) you could get that information from the LAARC catalogue.

The nice thing about local data is that there are lots of people making content; the not nice thing about local data is that it's scattered all over the web, in all kinds of formats with all kinds of 'trustability', from museums/libraries/archives, to local councils to local enthusiasts and the occasional raving lunatic. If an application developer or content editor can't find information from trusted sources that fits the format required for their application, they'll use whatever they can find on other encyclopaedic repositories, hack federated searches, or they'll screen-scrape our data and generate their own set of entities (authority records) and object records. But what happens if a museum corrects and republishes an incorrect record – will that change be reflected in various ad hoc data solutions? Surely it's better to acknowledge and play with this new information environment – better for our data and better for our audiences.

Preparing the data and/or the interface is not necessarily a project that should be specific to any one museum – it's the kind of project that would work well if it drew on resources from across the cultural heritage sector (assuming we all made our geo-located object data and authority records available and easily queryable; whether with a commonly agreed core schema or our own schemas that others could map between).
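
As a small sketch of the 'our own schemas that others could map between' option: each museum keeps its own field names and publishes a mapping to a shared core. The museum names, field names and the core schema below are all invented for illustration, not taken from any real system.

```python
# Hypothetical mappings from each museum's field names to a shared core schema.
MAPPINGS = {
    "museum_a": {"object_name": "title", "lat": "latitude", "lng": "longitude"},
    "museum_b": {"Title": "title", "Y": "latitude", "X": "longitude"},
}


def to_core(source, record):
    """Rename a record's fields to the core schema, dropping unmapped ones."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Made-up records in each museum's own format.
record_a = {"object_name": "Neolithic axe", "lat": 51.5, "lng": -0.1, "store": "B12"}
record_b = {"Title": "Headstone", "Y": 51.6, "X": -0.05}

print(to_core("museum_a", record_a))
print(to_core("museum_b", record_b))
```

Fields without a mapping (like the made-up `store` location above) simply stay private, which suits data we wouldn't want to publish anyway.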

Location-linked data isn't only about official cultural heritage data; it could be used to display, preserve and commemorate histories that aren't 'notable' or 'historic' enough for recording officially, whether that's grime pirate radio stations in East London high-rise roofs or the sites of Turkish social clubs that are now new apartment buildings. Museums might not generate that data, but we could look at how it fits with user-generated content and with our collecting policies.

Or getting away from traditional cultural heritage, I'd love to know when I'm passing over the site of one of London's lost rivers, or a location that's mentioned in a film, novel or song.

[Updated December 2008 to add – as QR tags get more mainstream, they could provide a versatile and cheap way to provide links to online content, or 250 characters of information. That's more information than the average Blue Plaque.]