A recent Alertbox talked about Banner Blindness: Old and New Findings:

The most prominent result from the new eyetracking studies is not actually new. We simply confirmed for the umpteenth time that banner blindness is real. Users almost never look at anything that looks like an advertisement, whether or not it's actually an ad.

The heatmaps also show how users don't fixate within design elements that resemble ads, even if they aren't ads.

I guess the most interesting thing about the post is that it acknowledges that unethical methods attract the most eyeballs:

In addition to the three main design elements that occasionally attract fixations in online ads, we discovered a fourth approach that breaks one of publishing's main ethical principles by making the ad look like content:

  • The more an ad looks like a native site component, the more users will look at it.
  • Not only should the ad look like the site's other design elements, it should appear to be part of the specific page section in which it's displayed.

This overtly violates publishing's principle of separating "church and state" — that is, the distinction between editorial content and paid advertisements should always be clear. Reputable newspapers don't allow advertisers to mimic their branded typefaces or other layout elements.

I think it's particularly important that we don't allow commercial considerations to damage our users' trust in cultural heritage institutions as repositories of impartial* knowledge. We've developed models for differentiating user- and museum-generated content and hopefully quelled fears about user-generated content somehow damaging or diluting museum content; it would be a shame if we lost that trust over funding agreements.

* insert acknowledgement of the impossibility of truly impartial cultural content.

In a post titled, What is Web 3.0?, Nicholas Carr said:

"Web 3.0 involves the disintegration of digital data and software into modular components that, through the use of simple tools, can be reintegrated into new applications or functions on the fly by either machines or people."

And recently I went to a London Geek Girl Dinner, where Paul Amery from Skype (who hosted the event) said:
"the next big step forward in software is going to be providing the plumbing, to provide people what they want, where they want …start thinking about plumbing all this software together, joining solutions together… mashups are just the tip of the iceberg".

So why does that matter to us in the cultural heritage sector? Without stretching the analogy too far, we have two possible roles – one is to provide the content that flows through the pipes, ensuring we use plumbing-compatible tubes so that other people can plumb our content into new applications; the other is to build applications ourselves, using our data and others'. I think we're brilliant content producers, and we're getting better at providing re-usable data sources – but we often don't have the resources to do cool things with them ourselves.
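
To make the plumbing metaphor slightly more concrete, here's a minimal sketch (in Python) of what our content might look like from the consumer's side: fetch a machine-readable feed of collection records and reshape it for reuse in some new application. The endpoint URL and field names are entirely hypothetical placeholders, not a real museum API.

```python
import json
from urllib.request import urlopen

# Hypothetical feed URL and record structure - not a real museum API.
COLLECTION_FEED = "https://example-museum.org/api/objects?format=json"

def fetch_objects(url=COLLECTION_FEED):
    """Fetch collection records as a list of dictionaries."""
    with urlopen(url) as response:
        return json.load(response)

def to_simple_records(objects):
    """Keep only the handful of fields a mashup is likely to need."""
    return [
        {
            "title": obj.get("title", "Untitled"),
            "date": obj.get("date"),
            "image": obj.get("image_url"),
            "link": obj.get("record_url"),
        }
        for obj in objects
    ]

if __name__ == "__main__":
    for record in to_simple_records(fetch_objects()):
        print(record["title"], "-", record["link"])
```

The point isn't the code itself, which is trivial – it's that once the data is published in a predictable, machine-readable form, someone else can do the plumbing without asking us first.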

Maybe what I'm advocating is giving geeks in the cultural heritage sector the time to spend playing with technology and supplying the tools for agile development. Or maybe it's just the perennial cry of the backend geek who never gets to play with the shiny pretty things. I'm still thinking about this one.

This post on 'What comes after post-processualism' caught my eye, I guess because I have a fascination with the ways in which archaeological theory affects database design and digitisation strategies. I work both with contract archaeologists and on a post-processual site, and the structural requirements are quite different, though both fundamentally rely on single context recording.

We have to face the fact that archaeological theory is quite simply no longer at the heart of archaeology, as it perhaps was from the 1960s until the end of the 1980s.

Instead we have seen over the last few decades an enormous expansion of commercial archaeology, now controlling far more funding than the Universities and responsible for the lion's share of archaeological research. We may or may not like that fact and what it has led to in terms of research results, but commercial archaeology is undeniably today a far bigger player in the discipline than its poor sibling, University-based research.

I'm sure it'll be eons before it trickles down into the museum sector, but it's an interesting change:

Nielsen/NetRatings to use total time spent by users of a site as its primary measurement metric

In a nod to the success of emerging Web 2.0 technologies like AJAX and streaming media, one of the country's largest Internet benchmarking companies will no longer use page views as its primary metric for comparing sites.

Nielsen/NetRatings will announce Tuesday that it will immediately begin using total time spent by users of a site as its primary measurement metric.

Nielsen/NetRatings will still report page views as a secondary metric, and it will continue to re-evaluate its primary metric as technology continues to evolve, Ross added. "For the foreseeable future, we will champion minutes if you are comparing two sites. Going forward, we'll see what that equates to in terms of true advertising opportunity," he said.
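
As a back-of-the-envelope illustration of why the switch matters, here's a tiny sketch comparing the two metrics on invented log data: an AJAX-heavy visit generates few page views but lots of time on site, while a click-heavy visit does the reverse.

```python
# Rough, invented example of why the two metrics diverge: page views count
# requests, while "time spent" measures how long a visitor stays.
from datetime import datetime

def total_minutes(view_times):
    """Approximate time on site as the span between first and last page view."""
    times = sorted(datetime.fromisoformat(t) for t in view_times)
    return (times[-1] - times[0]).total_seconds() / 60

# An AJAX-heavy visit: few page loads, lots of time on the site.
ajax_session = ["2007-07-10T14:00:00", "2007-07-10T14:25:00"]
# A click-heavy visit: many page loads, very little time.
clicky_session = ["2007-07-10T14:00:00", "2007-07-10T14:01:00",
                  "2007-07-10T14:02:00", "2007-07-10T14:03:00"]

print(len(ajax_session), "page views,", total_minutes(ajax_session), "minutes")
print(len(clicky_session), "page views,", total_minutes(clicky_session), "minutes")
```

On page views alone the second visit looks twice as valuable; measured in minutes, the first one wins easily.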

Meta-social networking

I've been wondering how long it would take for a meta-social networking site to emerge (or whether I should create one, thereby making millions) – something that lets you maintain active accounts on Facebook, MySpace, etc., with a single interface for reading and posting messages and comments – but of course Wired got there first. Sorta.

And yes, I did mean to post that many months ago! But it's still relevant because interoperability is only going to become more important in the social networking world.
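
For what it's worth, the architecture I have in mind is roughly one adapter per network behind a single common interface – sketched below in Python with stubbed, hypothetical adapters rather than any site's real API (which, in practice, would be the hard part).

```python
# A sketch of the "one interface, many networks" idea: a common adapter
# interface, with one stubbed adapter per network. The classes and methods
# here are hypothetical illustrations, not real Facebook/MySpace APIs.
from abc import ABC, abstractmethod

class SocialNetwork(ABC):
    @abstractmethod
    def read_messages(self):
        ...

    @abstractmethod
    def post_message(self, text):
        ...

class FacebookAdapter(SocialNetwork):
    def read_messages(self):
        return ["(stub) latest Facebook messages"]

    def post_message(self, text):
        print(f"(stub) posting to Facebook: {text}")

class MySpaceAdapter(SocialNetwork):
    def read_messages(self):
        return ["(stub) latest MySpace comments"]

    def post_message(self, text):
        print(f"(stub) posting to MySpace: {text}")

class MetaNetwork:
    """Single interface that fans reads and writes out to every account."""
    def __init__(self, accounts):
        self.accounts = accounts

    def read_all(self):
        return [msg for acct in self.accounts for msg in acct.read_messages()]

    def post_everywhere(self, text):
        for acct in self.accounts:
            acct.post_message(text)

meta = MetaNetwork([FacebookAdapter(), MySpaceAdapter()])
meta.post_everywhere("One status update, every network.")
```

The design problem isn't the aggregation layer, it's interoperability: whether the networks expose anything an adapter can reliably hook into.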

This post on the Gartner "Hype Cycle for Emerging Technologies 2007" report includes the familiar Gartner Hype Cycle diagram, updated for 2007, which is more than you'll get from the Gartner site (for free, anyway).

Common Craft have produced videos on RSS in Plain English, Social Bookmarking in Plain English, Wikis in Plain English and Social Networking in Plain English (via Groundswell)

Also worth a look, Google Code for Educators "provides teaching materials created especially for CS educators looking to enhance their courses with some of the most current computing technologies and paradigms". They say, "[w]e know that between teaching, doing research and advising students, CS educators have little time to stay on top of the most recent trends. This website is meant to help you do just that" and it looks like it might also be useful for busy professionals who want to try new technologies they don't get time to play with in their day jobs (via A Consuming Experience).

Also from A Consuming Experience, a report on a talk on "5 secrets of successful Web 2.0 businesses" at the June London Geek Dinner.

On a random note, I noticed that the BBC have added social bookmarking to their news site.

I wonder if this marks the 'mainstreaming' of social bookmarking.

European search engine

EU OKs German Online Search-Engine Grant

The European Union on Thursday authorized Germany to give $165 million for research on Internet search-engine technologies that could someday challenge U.S. search giant Google Inc.

The Theseus research project — the German arm of what the French call Quaero — is aiming to develop the world's most advanced multimedia search engine for the next-generation Internet. It would translate, identify and index images, audio and text.

Fragmented European research efforts are one of the reasons blamed for the region lagging behind the United States in information technology. European companies in general spend far less on research than those based in other parts of the world, and the EU said the project should help change that.

I wonder how they'll identify and weight or rank European content. And will it be tied in with the European Digital Library?

I'm still catching up on news and various RSS feeds; here are just a few things that caught my eye.

These slides from a presentation on Open Source applications in archaeology are worth a look. They've included lots of screenshots, which is useful because it demonstrates that open source applications are becoming much more user-friendly.

Wired makes a compelling case for Twitter as a 'Social Sixth Sense':

Twitter and other constant-contact media create social proprioception. They give a group of people a sense of itself, making possible weird, fascinating feats of coordination.

In theory I just don't get Twitter, but in practice I do read some long-running threads on various forums where people can post a quick rant about work, about their love life, or just add a random disclosure. If I know the people posting then I find those threads interesting. And I also love Facebook status updates for the same reason – they don't require a response but sometimes it's nice when they trigger one.

Final diary entry from Catalhoyuk

I'm back in London now but here goes anyway:

August 1
My final entry of the season as I'm on the overnight train from Cumra to Istanbul tonight. After various conversations on the veranda I've been thinking about the intellectual accessibility of our Catalhoyuk data and how that relates to web publication, and this entry is just a good way to stop these thoughts running round my head like a rogue tune.

[This has turned into a long entry, and I don't say anything trivial about the weather or other random things, so you'd have to be pretty bored to read it all. Shouldn't you be working instead?]

Getting database records up on the website isn't hard – it's just a matter of resources. The tricky part is providing an interesting and engaging experience for the general visitor, or a reliable, useable and useful data set for the specialist visitor.

At the moment it feels like a lot of good content is hidden within the database section of the website. When you get down to browsing lists of features, there's often enough information in the first few lines to catch your interest. But when you get to lists of units, even the pages with some of the unit description presented alongside the list, you start to encounter the '800 lamps' problem.

[A digression/explanation – I'm working on a website at the Museum of London with a searchable/browsable catalogue of objects from Roman Londinium. One section has 800 Roman oil lamps – how on earth can you present that to the user so they can usefully distinguish between one lamp and another?]

Of course, it does depend on the kind of user and what they want to achieve on the Londinium site – for some, it's enough to read one nicely written piece on the use of lamps and maybe a bit about what different types meant, all illustrated with a few choice objects; specialist users may want to search for lamps with very particular characteristics. Here, our '800 lamps' are 11846 (and counting) units of archaeology. The average user isn't going to read every unit sheet, but how can they even choose one to start with? And how many will know how to interpret and create meaning from what they read about the varying shades of brown stuff?

Being able to download unit sheets that match a particular pattern – part of a certain building, ones that contain certain types of finds, units related to different kinds of features – is probably of real benefit to specialist visitors, but are we really giving those specialist visitors (professional or amateur) and our general visitors what they need? I'm not sure a huge raw list of units or flotation numbers is of much help to anyone – how do people distinguish between one thumbnail of a lamp or one unit number and another in a useful and meaningful way? I hope this doesn't sound like a criticism of the website – it's just the nature of the material being presented.
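
As a rough illustration of the kind of pattern-matching download I mean, here's a minimal Python sketch. The field names and sample records are entirely invented, not the real Catalhoyuk schema; the point is just that once the records are machine-readable, filtering on a few criteria is cheap compared with handing visitors the full list of 11846 units.

```python
# Invented sample records - field names are hypothetical, not the real schema.
sample_units = [
    {"unit": 10021, "building": 49, "finds": ["obsidian", "bone"], "feature_type": "burial"},
    {"unit": 10187, "building": 49, "finds": ["pottery"], "feature_type": "floor"},
    {"unit": 11502, "building": 33, "finds": ["obsidian"], "feature_type": "fill"},
]

def units_matching(units, building=None, contains_find=None, feature_type=None):
    """Return only the units that satisfy every supplied criterion."""
    results = []
    for u in units:
        if building is not None and u["building"] != building:
            continue
        if contains_find is not None and contains_find not in u["finds"]:
            continue
        if feature_type is not None and u["feature_type"] != feature_type:
            continue
        results.append(u)
    return results

# e.g. all units from Building 49 that contained obsidian
for u in units_matching(sample_units, building=49, contains_find="obsidian"):
    print(u["unit"], u["feature_type"])
```

The hard work isn't the filtering – it's deciding which criteria are meaningful to which visitors, and presenting them in a way that doesn't assume specialist knowledge.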

The variability of the data is another problem – it's not just about data cleaning (though the 'view features by type' page shows why data cleaning is useful) but about the difference between the beautiful page for Building 49 and the rather less interesting page for Building 33 (to pick one at random). If a user lands on one of the pages with minimal information they may never realise that some pages have detailed records with fantastic plans and photos.

So there are the barriers to entry that we might accidentally perpetuate by 'hiding' the content behind lists of numbers; and there is the general intellectual accessibility of the information to the general user. Given limited resources, where should our energies be concentrated? Who are our websites for?

It's also about matching the data and website functionality to the user and their goals – the excavation database might not be of interest to the general user in its most raw form, and that's ok because it will be of great interest to others. At a guess, the general public might be more interested in finds, and if that's the case we should find ways to present information about the objects with appropriate interpretation and contextualisation – not just to convey facts, but to help people have a more meaningful experience on the site.

I wonder if 'team favourite' finds or buildings/spaces/features could be a good way into the data – a solution that doesn't mean turning some kinds of finds or some buildings into 'treasure', more important than the others. Or perhaps specialists could talk about a unit or feature they find interesting – along the way they could explain how their specialism contributes to the archaeological record (written as if to an intelligent thirteen-year-old). For example, Flip could talk about phytoliths, or Stringy could talk about obsidian, and what their finds can tell us.

Proper user evaluation would be fabulous, but in the absence of resources, I really should look at the stats and see how the site is used. I wonder if I could do a surveymonkey thing to get general information from different types of users? I wonder what levels of knowledge our visitors have about the Neolithic, about Anatolian history, etc. What brings them to the website? And what makes them stick around?

Intellectual accessibility doesn't only apply to the general public – it also applies to the accessibility of other teams' or labs' content. There are so many tables hidden behind the excavation and specialist database interfaces – some are archived, some had a very particular usage, some are still actively used but retain the names of long-gone databases. It's all very well encouraging people to use the database to query across specialisms, but how will they know where to look for the data they need? [And if we make documentation, will anyone read it?]

It was quite cool this morning but now it's hot again. Ha, I lied about not saying anything trivial about the weather! Now go do some work.
(Started July 29, but finally posted August 1)

The BBC says "Photo tool could fix bad images", but I think it's far more likely to be used to create fake images. I guess I wouldn't have thought of an image that shows what was actually present as 'bad' – maybe it's not the best postcard if you want the recipient to think you were in an undeveloped paradise, but it's an accurate depiction of the scene.

I'm reminded of the way the Soviets would remove people from historic photos when they fell out of favour – now the ability to rewrite history is available for you at home!