One step closer to intelligent searching?

The BBC have a story on a new search engine, Search site aims to rival Google:

Called Cuil [pronounced ‘cool’], from the Gaelic for knowledge and hazel, its founders claim it does a better and more comprehensive job of indexing information online.

The technology it uses to index the web can understand the context surrounding each page and the concepts driving search requests, say the founders.

But analysts believe the new search engine, like many others, will struggle to match and defeat Google.

Instead of just looking at the number and quality of links to and from a webpage as Google’s technology does, Cuil attempts to understand more about the information on a page and the terms people use to search. Results are displayed in a magazine format rather than a list.

From the Cuil FAQ:

So Cuil searches the Web for pages with your keywords and then we analyze the rest of the text on those pages. This tells us that the same word has several different meanings in different contexts. Are you looking for jaguar the cat, the car or the operating system?

We sort out all those different contexts so that you don’t have to waste time rephrasing your query when you get the wrong result.

Different ideas are separated into tabs; we add images and roll-over definitions for each page and then make suggestions as to how you might refine your search. We use columns so you can see more results on one page.
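
The FAQ doesn't say how the contexts are actually worked out, so the following is only a toy sketch in Python of the general idea: look at the other words on a page to guess which sense of 'jaguar' it's about, then group pages by sense, much as the tabs do. The context word lists and function names are made up for the example, and this is not Cuil's method.

```python
import re
from collections import defaultdict

# Hypothetical context vocabularies for each sense of "jaguar" (illustrative only).
SENSE_CONTEXTS = {
    "cat": {"rainforest", "prey", "species", "habitat", "predator"},
    "car": {"engine", "sedan", "dealer", "luxury", "driving"},
    "operating system": {"apple", "mac", "install", "software", "release"},
}


def guess_sense(page_text):
    """Return the sense whose context words overlap most with the page, or None."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    scores = {sense: len(words & context) for sense, context in SENSE_CONTEXTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def group_by_sense(pages):
    """Group page texts into 'tabs' keyed by the guessed sense."""
    tabs = defaultdict(list)
    for page in pages:
        tabs[guess_sense(page)].append(page)
    return dict(tabs)
```

A real engine would presumably learn these contexts from the pages themselves rather than from a hand-written list.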

They also provide ‘drill-downs’ on the results page.

Cuil will direct you to this additional information. By looking at these suggestions, you may discover search data, concepts, or related areas of interest that you hadn’t expected. This is particularly useful when you are researching a subject you don’t know much about and aren’t sure how to compose the “right” query to find the information you need.

I haven’t used it enough to work out exactly how it differentiates concepts (tabs) and ‘additional information’ (drill-downs/categories).

It does a good job on something like the Cutty Sark. Under ‘Explore by Category’ it offered:

  • Buildings And Structures In Greenwich
  • Sailboat Names
  • Museums In London
  • Neighbourhoods Of Greenwich
  • School Ships

It picked up search results for Cutty Sark whisky and news of the Cutty Sark fire, but they weren’t reflected in the categories, and the search term didn’t trigger the tabs. The tabs kick in when you search for something like ‘orange’.

It didn’t do as well with ‘samian ware’ – the categories picked up all sorts of places and peoples (and randomly ‘American Films’), but while the search results all say that it’s ‘a kind of bright red Roman pottery’, that’s not reflected in the categories. Fair enough, there may not be enough information easily available online for ‘Types of Roman pottery’ to register as a category.

Incidentally, most of the results listed for ‘samian ware’ are just recycled entries from Wikipedia. It’s a shame the results aren’t filtered to remove entries that have just duplicated Wikipedia text. The FAQ says they don’t index duplicate content; I guess the overall site or page is just different enough to be retained.

It might take a while for museum content to appear in the most useful ways, but it looks like it might be a useful search engine for niche content. From the FAQ again:

We’ve found that a lot of Web pages have been designed with a small audience in mind—perhaps they are blogs or academic papers with specific interests or pages with family photos. We think that even though these pages aren’t necessarily for a wide audience, they contain content that one day you might need.

Our job is to index all these pages and examine their content for relevancy to your search. If they contain information you need, then they should be available to you.

It’s all sounding a bit semantic web-ish (and quite a bit ‘reacting to Google-ish’) and I’ll use it for a while to see how it compares to Google. The webmaster information doesn’t give any indication of how you could mark up content so the relationships between terms in different contexts are clear, but I guess nice semantic markup would help.

Refreshingly, it doesn’t retain search info – privacy is one of their big differentiators from Google.

The Future of the Web with Sir Tim Berners-Lee @ Nesta


My notes from the Nesta event, The Future of the Web with Sir Tim Berners-Lee, held in London on July 8, 2008.

Panel at ‘The Future of the Web’ with Sir Tim Berners-Lee, Nesta

As usual, let me know of any errors or corrections, comments are welcome, and comments in [square brackets] are mine. I wanted to get these notes up quickly so they’re pretty much ‘as is’, and they’re pretty much about the random points that interested me and aren’t necessarily representative. I’ve also written up notes from a previous talk by Tim Berners-Lee in March 2007, which go into more detail about web science.

[Update: the webcast is online at http://www.nesta.org.uk/future-of-web/ so you might as well go watch that instead.]

The event was introduced by NESTA’s CEO, Jonathan Kestenbaum. Explained that online contributions from the pre-event survey, and from the (twitter) backchannel would be fed into the event. Other panel members were Andy Duncan from Channel 4 and the author Charlie Leadbeater though they weren’t introduced until later.

Tim Berners-Lee’s slides are at http://www.w3.org/2008/Talks/0708-ws-30min-tbl/.

So, onto the talk:
He started designing the web/mesh, and his boss ‘didn’t say no’.

He didn’t want to build a big mega system with big requirements for protocols or standards, hierarchies. The web had to work across boundaries [slide 6?]. URIs are good.

The World Wide Web Consortium as the point where you have to jump on the bob sled and start steering before it gets out of control.

Producing standards for current ideas isn’t enough; web science research is looking further out. Slide 12 – Web Science Research Initiative (WSRI) – analysis and synthesis; promote research; new curriculum.

Web as blockage in sink – starts with a bone, stuff builds up around it, hair collects, slime – perfect for bugs, easy for them to get around – we are the bugs (that woke people up!). The web is a rich environment in which to exist.

Semantic web – what’s interesting isn’t the computers, or the documents on the computers, it’s the data in the documents on the computers. Go up layers of abstraction.

Slide on the Linked Open Data movement (dataset cloud) [Anra from Culture24 pointed out there’s no museum data in that cloud].

Paraphrase, about the web: ‘we built it, we have a duty to study it, to fix it; if it’s not going to lead to the kind of society we want, then tweak it, fix it’.

‘Someone out there will imagine things we can’t imagine; prepare for that innovation, let that innovation happen’. Prepare for a future we can’t imagine.

End of talk! Other panelists and questions followed.

Charles Leadbeater – talked about the English Civil War, recommends a book called ‘The World Turned Upside Down’. The bottom of society suddenly had the opportunity to be in charge. New ‘levellers’ movement via the web. Participate, collaborate, (etc) without the trappings of hierarchy. ‘Is this just a moment’ before the corporate/government Restoration? Iterative, distributed, engaged with practice.

Need new kinds of language – dichotomies like producer/consumer are disabling. Is the web – a mix of academic, geek, rebel, hippie and peasant village cultures – a fundamentally different way of organising, will it last? Are open, collaborative working models that deliver the goals possible? Can we prevent creeping re-regulation that imposes old economics on the new web? e.g. ISPs and filesharing. Media literacy will become increasingly important. His question to TBL – what would you have done differently to prevent spam while keeping the openness of the web? [Though isn’t spam more of a problem for email at the moment?]

Andy Duncan, CEO of Channel 4 – web as ‘tool of humanity’, ability for humans to interact. Practical challenges to be solved. £50million 4IP fund. How do we get, grow ideas and bring them to the wider public, and realise the positive potential of ideas. Battle between positive public benefit vs economic or political aspects.

The internet brings more/different perspectives, but people are less open to new ideas – they get cosy, only talk to like-minded people in communities who agree with each other. How do you get people engaged in radical and positive thinking? [This is a really good observation/question. Does it have to do with the discoverability of other views around a topic? Have we lost the serendipity of stumbling across random content?]

Open to questions. ‘Terms and conditions’ – all comments must have a question mark at the end of them. [I wish all lectures had this rule!]

Questions from the floor: 1. why is the semantic web taking so long; 2. 3D web; 3. kids.
TBL on semantic web – lots of exponential growth. SW is more complicated to build than HTML system. Now has standard query language (SPARQL). Didn’t realise at first that needed a generic browser and linked open data. (Moving towards real world).

[This is where I started to think about the question I asked, below – cultural heritage institutions have loads of data that could be open and linked, but it’s not as if institutions will just let geeks like me release it without knowing where and why and how it will be used – and fair enough, but then we need good demonstrators. The idea that the semantic web needs lots of acronyms (OWL, GRDDL, RDF, SPARQL) in place to actually happen is a perception I encounter a lot, and I wanted an answer I could pass on. If it’s ‘straight from the horse’s mouth’, then even better…]

Questions from twitter (though the guy’s laptop crashed): 4. will Google own the world? What would Channel 4 do about it?; 5. is there a contradiction between [collaborative?] open platform and spam?; 6. re: education, in era of mass collaboration, what’s the role of expertise in a new world order? [Ooh, excellent question for museums! But more from the point of view of them wondering what happens to their authority, especially if their collections/knowledge start to appear outside their walls.]

AD: Google ‘ferociously ambitious in terms of profit’, fiercely competitive. They should give more back to the UK considering how much they take out. Qu to TBL re Google, TBL did not bite but said, ‘tremendous success; Google used science, clustering algorithms, looked at the web as a system’.
CL re qu 5 – the web works best through norms and social interactions, not rules. Have to be careful with assumption that can regulate behaviour -> ‘norm based behaviour’. [But how does that work with anti-social individuals?]
TBL re qu 6: e.g. MIT Courseware – experts put their teaching materials on the web. Different people have different levels of expertise [but how are those experts recognised in their expert context? Technology, norms/links, a mixture?]. More choice in how you connect – doesn’t have to be local. Being an expert [sounds exhausting!] – connect, learn, disseminate – huge task.

Questions from the floor: 7. ISPs as villains, what can they do about it?; 9. why can’t the web be designed to use existing social groups? [I think, I was still recovering from asking a question] TBL re qu 7 and ISPs ‘give me non-discriminatory access and don’t sell my clickstream’. [Hoorah!]

So the middle question (Question 8) was me. It should have been something like ‘if there’s a tension between the top-down projects that don’t work, and simple protocols like HTML that do, and if the requirements of the ‘Semantic Web’ are top-down (and hard), how do we get away from the idea that the semantic web is difficult to just have the semantic web?’* but it came out much more messily than that as ‘the semantic web as proposed is a top-down system, but the reason the web worked was that it was simple, easy to participate, so how does that work, how do we get the semantic web?’ and his response started “Who told you SW is top down?”. It was a leading question so it’s my fault, but the answer was worth asking a possibly stupid/leading question.

His full answer [about two minutes, starting 20’20” into the Q&A video] was:

‘Who on earth told you the semantic web was a top-down designed system? It’s not. It is totally bottom-out. In fact the really magic thing about it is that it’s middle-out as well. If you imagine lots of different data systems which talk different languages, it’s a bit like imagine them as a quilt of those things sewn together at the edges. At the bottom level, you can design one afternoon a little data system which uses terms and particular concepts which only you use, and connect to nobody else. And then, in a very bottom-up way, start meeting more and more people who’ll start to use those terms, and start negotiating with people, going to, heaven forbid, standards bodies and committees to push, to try to get other people to use those terms. You can take an existing set of terms, like the concepts when you download a bank statement, you’ll find things like the financial institution and transaction and amount have pretty much been defined by the banks, you can take those and use those as semantic web terms on the net.

And if you want to, you can do that at the very top level because you might decide that it’s worth everybody having exactly the same URI for the concept of latitude, for the number you get out of the GPS, and you can join the W3C interest group which has gotten together people who believe in that, and you’ve got the URI, [people] went to a lot of trouble to make something which is global. The world works like that plug of stuff in the sink, it’s a way of putting together lots and lots of different communities at different levels, only some of them, a few of them are global. The global communities are hard work to make. Lots and lots and lots of them are local, those are very easy to make. Lots of important benefits are in the middle. The semantic web is the first technology that’s designed with an understanding of that’s how the world is, the world is a scale-free, fractal if you like, system. And that’s why it’s all going to work.’
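
To make the 'quilt sewn together at the edges' idea a bit more concrete, here is a minimal sketch in Python rather than real RDF tooling. Two made-up local datasets each use their own term for latitude, and a small mapping stitches both to a shared one. The shared URI is the latitude property from the W3C Basic Geo vocabulary; every example.org URI, the data values and the function names are invented for illustration.

```python
# Shared term: the latitude property from the W3C Basic Geo vocabulary.
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"

# Two hypothetical local datasets as (subject, predicate, object) triples.
# Each uses its own home-made term for latitude; all example.org URIs are invented.
MUSEUM_A = [
    ("http://example.org/a/object/1", "http://example.org/a/terms#lat", "51.4828"),
    ("http://example.org/a/object/1", "http://example.org/a/terms#nickname", "the ship"),
]
MUSEUM_B = [
    ("http://example.org/b/item/9", "http://example.org/b/vocab#latitude", "51.4769"),
]

# The seam of the quilt: a small mapping agreed between neighbours.
TERM_MAP = {
    "http://example.org/a/terms#lat": GEO_LAT,
    "http://example.org/b/vocab#latitude": GEO_LAT,
}


def align(triples, term_map):
    """Rewrite predicates to shared terms where a mapping exists; keep the rest as-is."""
    return [(s, term_map.get(p, p), o) for s, p, o in triples]


def latitudes(triples):
    """Collect latitude values from triples that use the shared term."""
    return {s: float(o) for s, p, o in triples if p == GEO_LAT}


merged = align(MUSEUM_A, TERM_MAP) + align(MUSEUM_B, TERM_MAP)
print(latitudes(merged))
```

Terms that only one dataset uses (the nickname above) pass through untouched, which is roughly the point: local terms stay local until someone finds it worth agreeing on them.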

[So I was asking ‘how do we get to the semantic web’ in the museum sector – we can do this. Put a dataset out there, make connections to the organisation next to you (or get your users to by gathering enough anonymised data on how they link items through searching and browsing). Then make another connection, and another. We could work at the sector (national or international) level too (stable permanent global identifiers would be a good start) but start with the connections. “Small pieces loosely joined” -> “small ontologies, loosely joined”. Can we make a manifesto from this?

There’s also a good answer in this article, Sir Tim Talks Up Linked Open Data Movement on internetnews.com.

“He urged attendees to look over their data, take inventory of it, and decide on which of the things you’d most likely get some use out of re-using it on the Web. Decide priorities, and benefits of that data reuse, and look for existing ontologies on the Web on how to use it, he continued, referring to the term that describes a common lexicon for describing and tagging data.”

Anyway, on with the show.]

[*Comment from 2015: in hindsight, my question speaks to the difficulties of getting involved in what appeared to be distant and top-down processes of ontology development, though it might not seem that distant to someone already working with W3C. And because museums are tricky, it turns out the first place to start is getting internal museum systems to talk to each other – if you can match people, places, objects and concepts across your archive, library and museum collections management systems, digital asset management system and web content management system, you’re in a much better position to match terms with other systems. That said, the Linking Museums meetups I organised in London and various other museum technology forums were really helpful.]

Questions from the floor: 10. do we have enough “bosses who don’t say no”?; 11. web to solve problems, social engineering [?]; 12. something on Rio meeting [didn’t get it all].

TBL re 10 – he can’t emulate other bosses but he tries to have very diverse teams, not clones of him/each other, committed, excited people and ‘give them spare time to do things they’re interested in’. So – give people spare time, and nurture the champions. They might be the people who seem a bit wacky [?] but nurture the ones who get it.

Qu 11 – conflicting demands and expectations of web. TBL – ‘try not to think of it as a thing’. It’s an infrastructure, connections between people, between us. So, are we asking too much of us, of humanity? Web is reflection of humanity, “don’t expect too little”.

TBL re qu 12 – internet governance is the Achilles heel of the web. No permission required except for domain name. A ‘good way to make things happen slowly is to get a bureaucracy to govern it’. Slowness, stability. Domain names should last for centuries – persistence is a really important part of the web.

CL re qu 11 – possibilities of self-governance, we ask too little of the web. Vision of open, collaborative web capable of being used by people to solve shared problems.

JK – (NESTA) don’t prescribe the outcome at the beginning, commitment to process of innovation.

Then Nesta hosted drinks, then we went to the pub and my lovely mate said “I can’t believe you trolled Tim Berners-Lee”. [I hope I didn’t really!]

The BBC, accessibility, the hCalendar microformat and RDFa

The BBC have announced (in ‘Removing Microformats from bbc.co.uk/programmes’) that they’ll stop using the hCalendar microformat because of concerns about accessibility, specifically the use of the HTML abbreviation element (the abbr tag):

Our concerns were:

  • the effect on blind users using screen readers with abbreviation expansion turned on where abbreviations designed for machines would be read out
  • the effect on partially sighted users using screen readers where tool tips of abbreviations designed for machines would be read out
  • the effect of incomprehensible tooltips on users with cognitive disabilities
  • the potential fencing off of abbreviations to domains that need them

Until these issues are resolved the BBC semantic markup standards have been updated to prevent the use of non-human-readable text in abbreviations.

They’re looking at using RDFa, which they describe as ‘a slightly bigger S semantic web technology similar to microformats but without some of the more unexpected side-effects’.
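
To show where the problem sits in the markup, here is a small Python sketch using only the standard library. The two fragments are illustrative, not taken from the BBC's pages: the first uses the hCalendar abbr pattern, where the machine-readable date lives in the `title` attribute (which is what a screen reader or tooltip may expose); the second is one way an RDFa-style `content` attribute could carry the same value without touching `title`. The date, the `dc:date` property choice and the helper names are assumptions for the example.

```python
from html.parser import HTMLParser

# Illustrative fragments only; the date and property name are made up.
HCALENDAR = '<abbr class="dtstart" title="2008-07-08T18:30:00+01:00">8 July, 6.30pm</abbr>'
RDFA = '<span property="dc:date" content="2008-07-08T18:30:00+01:00">8 July, 6.30pm</span>'


class MachineValues(HTMLParser):
    """Collect (element, attribute, value) for machine-readable dates."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "abbr" and "dtstart" in classes:
            # hCalendar abbr pattern: the machine value sits in title.
            self.found.append((tag, "title", attrs.get("title")))
        elif "property" in attrs and "content" in attrs:
            # RDFa-style: the machine value sits in content instead.
            self.found.append((tag, "content", attrs["content"]))


def machine_values(markup):
    """Return the machine-readable values found in a fragment of markup."""
    parser = MachineValues()
    parser.feed(markup)
    parser.close()
    return parser.found


print(machine_values(HCALENDAR))
print(machine_values(RDFA))
```

Both fragments show the same human-readable text; the difference is only in which attribute holds the ISO date, which is where the accessibility concern comes in.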

Their support for RDFa is timely in light of Lee Iverson’s presentation at the UK Museums on the Web conference (my notes). It’s also an interesting study of what can happen when geek enthusiasm meets existing real world users.

More generally, does the fact that an organisation as big as the BBC hasn’t yet produced an API mean that creating an API is not a simple task, or that the organisational issues are bigger than the technical issues?

Yahoo! SearchMonkey, the semantic web – an example from last.fm

I had meant to blog about SearchMonkey ages ago, but last.fm’s post ‘Searching with my co-monkey’ about a live example they’ve created on the SearchMonkey platform has given me the kick I needed. They say:

The first version of our application deals with artist, album and track pages giving you a useful extract of the biography, links to listen to the artist if we have them available, tags, similar artists and the best picture we can muster for the page in question.

Some background on SearchMonkey from ReadWriteWeb:

At the same time, it was clear that enhancing search results and cross linking them to other pieces of information on the web is compelling and potentially disruptive. Yahoo! realized that in order to make this work, they need to incentivize and enable publishers to control search result presentation.

SearchMonkey is a system that motivates publishers to use semantic annotations, and is based on existing semantic standards and industry standard vocabularies. It provides tools for developers to create compelling applications that enhance search results. The main focus of these applications is on the end user experience – enhanced results contain what Yahoo! calls an “infobar” – a set of overlays to present additional information.

SearchMonkey’s aim is to make information presentation more intelligent when it comes to search results by enabling the people who know each result best – the publishers – to define what should be presented and how.

(From Making the Web Searchable: The Story of SearchMonkey)

And from Yahoo!’s search blog:

This new developer platform, which we’re calling SearchMonkey, uses data web standards and structured data to enhance the functionality, appearance and usefulness of search results. Specifically, with SearchMonkey:

  • Site owners can build enhanced search results that will provide searchers with a more useful experience by including links, images and name-value pairs in the search results for their pages (likely resulting in an increase in traffic quantity and quality)
  • Developers can build SearchMonkey apps that enhance search results, access Yahoo! Search’s user base and help shape the next generation of search
  • Users can customize their search experience with apps built by or for their favorite sites

This could be an interesting new development – the question is, how well does the data we currently output play with it; could we easily adapt our pages so they’re compatible with SearchMonkey; should we invest the time it might take? Would a simple increase in the visibility and usefulness of search results be enough? Could there be a greater benefit in working towards federated searches across the cultural heritage sector or would this require a coordinated effort and agreement on data standards and structure?

Update to link to the Yahoo! Search Blog post ‘The Yahoo! Search Gallery is Open for Business’, which has a few more examples.

Notes from ‘How Can Culture Really Connect? Semantic Front Line Report’ at MW2008

These are my notes from the workshop on “How Can Culture Really Connect? Semantic Front Line Report” at Museums and the Web 2008. This session was expertly led by Ross Parry.

The paper, “Semantic Dissonance: Do We Need (And Do We Understand) The Semantic Web?” (written by Ross Parry, Jon Pratty and Nick Poole) and the slides are online. The blog from the original Semantic Web Think Tank (SWTT) sessions is also public.

These notes are pretty rough so apologies for any mistakes; I hope they’re a bit useful to people, even though it’s so late after the event. I’ve tried to include most of what was discussed but it’s taken me a while to catch up.

There’s so much to see at MW that I missed the start of this session; when we arrived Ross had the participants debating the meaning of terms like ‘Web 2.0’, ‘Web 3.0’, ‘semantic web’, ‘Semantic Web’.

So what is the semantic web (sw) about? It’s about intelligent and efficient searching; discovering resources (e.g. URIs of picture, news story, video, biographical detail, museum object) rather than pages; machine-to-machine linking and processing of data.

Discussion: how much/what level of discourse do we need to take to curators and other staff in museums?
me: we need to show people what it can do, not bother them with acronyms.
Libby Neville: believes in involving content/museum people, not sure about viewing it through the prism of technology.
[?]: decisions about where data lives have an effect.

Slide 39 shows various axes against which the Semantic Web (as formally defined) and the semantic web (the SW ‘lite’?) can be assessed.
Discussion: Aaron: it’s context-dependent.

‘expectations increase in proportion to the work that can be done’ so the work never decreases.

sw as ‘webby way to link data’; ‘machine processable web’ saves getting hung up on semantics [slide 40 quoting Emma Tonkin in BECTA research report, ‘If it quacks like a duck…’ Developments in search technologies].

What should/must/could we (however defined) do/agree/build/try next (when)?

Discussion: Aaron: tagging, clusters. Machine tags (namespace:predicate=value).
me: let’s build semantic webby things into what we’re doing now to help facilitate the conversations and agreements, provide real world examples – attack the problem from the bottom up and the top down.
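
For anyone who hasn't met machine tags: Flickr-style machine tags are written as namespace:predicate=value (for example geo:lat=51.4828). Here is a minimal Python sketch of pulling one apart. The regular expression is a simplification of the real rules, and the function name is made up.

```python
import re

# Simplified pattern: namespace and predicate start with a letter, then
# letters, digits or underscores; the value is whatever follows the "=".
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)"
    r":(?P<predicate>[A-Za-z][A-Za-z0-9_]*)"
    r"=(?P<value>.+)$"
)


def parse_machine_tag(tag):
    """Split a machine tag into its three parts, or return None for a plain tag."""
    match = MACHINE_TAG.match(tag)
    return match.groupdict() if match else None


print(parse_machine_tag("geo:lat=51.4828"))
print(parse_machine_tag("cuttysark"))
```

Because the parts are explicit, a machine can treat the tag as a tiny statement about the item rather than just a keyword.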

Slide 49 shows three possible modes: make collections machine-processable via the web; build ontologies and frameworks around added tags; develop more layered and localised meaning. [The data (the data around the data) gets smarter and richer as you move through those modes.]

I was reminded of this ‘mash it’ video during this session, because it does a good jargon-free job of explaining the benefits of semantic webby stuff. I also rather cynically tweeted that the semantic web will “probably happen out there while we talk about it”.

Explaining the semantic web: by analogy and by example

Explaining by analogy: Miko Coffey summarises the semantic web as:

  • Web 1.0 is like buying a can of Campbell’s Soup
  • Web 2.0 is like making homemade soup and inviting your soup-loving friends over
  • The semantic web is like having a dinner party, knowing that Tom is allergic to gluten, Sally is away til next Thursday and Bob is vegetarian.

And she’s got a great image in the same post to help explain it.

To extend the analogy, it’s also as if the semantic web could understand that when your American aunt’s soup recipe says ‘cilantro’, you’d look for ‘coriander’ in shops in Australia or the UK.

Explaining by doing: this review of Twine, ‘Why I Migrated Over to Twine (And Other Social Services Bit the Dust)’, gives lots of great examples of how semantic web stuff can help us:

So for example when Stanley Kubrick is mentioned in the bookmarklet fields, or in the document you upload, or in the email you send into Twine — the system will analyze and identify him as a person (not as a mere keyword). This is called entity extraction and is applied to all text on Twine.

Under the hood, a person is defined in a larger ontology in relation to other things. Here’s an example of a very small portion of my own graph within Twine:

Hrafn Th. Thorissons RDF graph in Twine

Some may not find the point of this clear. So to explain: Just as HTML enables computers to display data — this extra semantic information markup (RDF, OWL, etc.) enables computers to understand what the data is they’re displaying. And moreover, to understand what things are in relation to other things.

Example Search
For an example, when we search for “Stanley Kubrick” on regular search engines, the words “Stanley” and “Kubrick” are usually regarded as mere keywords: a series of letters that the search engine then tries to find pages with those series of letters. But in the world of semantic web, the engines know “Stanley Kubrick” is a person. This results in a lot less irrelevant items from the search’s results….

If you weren’t already aware, the systems I just described above are the basic semantic web concept: Encapsulating data in a new layer of machine processable information to help us search, find and organize the overwhelming and ever-growing sea of pictures, videos, text and whatever else we’re creating.

I think these are both useful when explaining the benefits of the semantic web to non-geeks and may help overcome some of the fear of the unknown (or fear of investment in the pointless buzzword) we might encounter. If we believe in the semantic web, it’s up to us to explain it properly to the other people it’s going to affect.

I also discovered a good post by Mike on the ‘Innovation Manifesto’.

It’s a wonderful, wonderful web

First, the news that Google are starting to crawl the deep or invisible web via html forms on a sample of ‘high quality’ sites (via The Walker Art Center’s New Media Initiatives blog):

This experiment is part of Google’s broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines. The terms Deep Web, Hidden Web, or Invisible Web have been used collectively to refer to such content that has so far been invisible to search engine users. By crawling using HTML forms (and abiding by robots.txt), we are able to lead search engine users to documents that would otherwise not be easily found in search engines, and provide webmasters and users alike with a better and more comprehensive search experience.

You’re probably already well indexed if you have a browsable interface that leads to every single one of your collection records and images and whatever; but if you’ve got any content that was hidden behind a search form (and I know we have some in older sites), this could give it much greater visibility.
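
If you'd rather a well-behaved crawler didn't submit a particular search form, robots.txt is the place to say so. As a rough check, Python's standard `urllib.robotparser` can tell you what a given set of rules allows. The rules, URLs and user agent name below are invented for the example, and how any specific crawler interprets them is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers off the search form results,
# but leave the browsable collection pages open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /collections/search
"""


def allowed(robots_txt, user_agent, url):
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/collections/search?q=samian"))
print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/collections/object/123"))
```

With rules like these, the form results are off limits while the individual record pages stay crawlable.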

Secondly, Mike Ellis has done a sterling job synthesising some of the official, backchannel and informal conversations about the semantic web at MW2008 and adding his own perspective on his blog.

Talking about Flickr’s 20 gazillion tags:

To take an example: at the individual tag level, the flaws of misspellings and inaccuracies are annoying and troublesome, but at a meta level these inaccuracies are ironed out; flattened by sheer mass: a kind of bell-curve peak of correctness. At the same time, inferences can be drawn from the connections and proximity of tags. If the word “cat” appears consistently – in millions and millions of data items – next to the word “kitten” then the system can start to make some assumptions about the related meaning of those words. Out of the apparent chaos of the folksonomy – the lack of formal vocabulary, the anti-taxonomy – comes a higher-level order. Seb put it the other way round by talking about the “shanty towns” of museum data: “examine order and you see chaos”.

The total “value” of the data, in other words, really is way, way greater than the sum of the parts.
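
The cat/kitten inference is easy to sketch. This is a minimal Python illustration of counting which tags turn up together on the same items, not how Flickr or anyone else actually does it; the sample tag sets (including the misspelt 'kiten') and the function names are made up.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tag sets, one per photo, including a misspelling.
ITEMS = [
    {"cat", "kitten", "cute"},
    {"cat", "kitten"},
    {"cat", "kiten"},
    {"dog", "puppy"},
    {"cat", "kitten", "garden"},
]


def cooccurrence(tag_sets):
    """Count how often each unordered pair of tags appears on the same item."""
    counts = Counter()
    for tags in tag_sets:
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


def related(tag, tag_sets, top=3):
    """Return the tags that most often appear alongside the given tag."""
    neighbours = Counter()
    for (a, b), n in cooccurrence(tag_sets).items():
        if a == tag:
            neighbours[b] += n
        elif b == tag:
            neighbours[a] += n
    return [t for t, _ in neighbours.most_common(top)]


print(related("cat", ITEMS, top=1))
```

Even in this tiny sample the misspelling is outvoted by the correct tag, which is the 'flattened by sheer mass' effect in miniature.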

So far, so ace. We’ve been excited about using the implicit links created between data as people consciously record information with tags, or unconsciously with their paths between data to create those ‘small ontologies, loosely joined’; the possibilities of multilingual tagging, etc, before. Tags are cool.

But the applications of this could go further:

I got thinking about how this can all be applied to the Semantic Web. It increasingly strikes me that the distributed nature of the machine processable, API-accessible web carries many similar hallmarks. Each of those distributed systems – the Yahoo! Content Analysis API, the Google postcode lookup, Open Calais – are essentially dumb systems. But hook them together; start to patch the entire thing into a distributed framework, and things take on an entirely different complexion.

Here’s what I’m starting to gnaw at: maybe it’s here. Maybe if it quacks like a duck, walks like a duck (as per the recent Becta report by Emma Tonkin at UKOLN) then it really is a duck. Maybe the machine-processable web that we see in mashups, API’s, RSS, microformats – the so-called “lightweight” stuff that I’m forever writing about – maybe that’s all we need. Like the widely accepted notion of scale and we-ness in the social and tagged web, perhaps these dumb synapses when put together are enough to give us the collective intelligence – the Semantic Web – that we have talked and written about for so long.

I’d say those capital letters in ‘Semantic Web’ might scare some of the hardcore SW crowd, but that’s ok, isn’t it? Semantics (sorry) aside, we’re all working towards the same goal – the machine-processable web.

And in the meantime, if we can put our data out there so others can tag it, and so that we’re exposing our internal ‘tags’ (even if they have fancier names in our collections management systems), we’re moving in the right direction.

(Now I’ve got Black’s “Wonderful Life” stuck in my head, doh. Luckily it’s the cover version without the cheesy synths).

Right, now I’m off to the Museum in Docklands to talk about MultiMimsy database extractions and repositories. Rock.