Notes on usability testing

Further to my post about the downloadable guidelines, I’ve picked out the bits of the chapter on ‘Usability Testing’ that are relevant to my work, but it’s worth reading the whole chapter if you’re interested. My comments and headings are in square brackets below.

“Generally, the best method is to conduct a test where representative participants interact with representative scenarios.

The second major consideration is to ensure that an iterative approach is used.

Use an iterative design approach

The iterative design process helps to substantially improve the usability of Web sites. One recent study found that the improvements made between the original Web site and the redesigned Web site resulted in thirty percent more task completions, twenty-five percent less time to complete the tasks, and sixty-seven percent greater user satisfaction. A second study reported that eight of ten tasks were performed faster on the Web site that had been iteratively designed. Finally, a third study found that forty-six percent of the original set of issues were resolved by making design changes to the interface.

[Soliciting comments]

Participants tend not to voice negative reports. In one study, when using the ’think aloud’ [as opposed to retrospective] approach, users tended to read text on the screen and verbalize more of what they were doing rather than what they were thinking.

[How many user testers?]

Performance usability testing with users:
– Early in the design process, usability testing with a small number of users (approximately six) is sufficient to identify problems with the information architecture (navigation) and overall design issues. If the Web site has very different types of users (e.g., novices and experts), it is important to test with six or more of each type of user. Another critical factor in this preliminary testing is having trained usability specialists as the usability test facilitator and primary observers.
– Once the navigation, basic content, and display features are in place, quantitative performance testing … can be conducted

[What kinds of prototypes?]

Designers can use either paper-based or computer-based prototypes. Paper-based prototyping appears to be as effective as computer-based prototyping when trying to identify most usability issues.

Use inspection evaluation [and cognitive walkthroughs] results with caution.
Inspection evaluations include heuristic evaluations, expert reviews, and cognitive walkthroughs. It is a common practice to conduct an inspection evaluation to try to detect and resolve obvious problems before conducting usability tests. Inspection evaluations should be used cautiously because several studies have shown that they appear to detect far more potential problems than actually exist, and they also tend to miss some real problems.

Heuristic evaluations and expert reviews may best be used to identify potential usability issues to evaluate during usability testing. To improve somewhat on the performance of heuristic evaluations, evaluators can use the ’usability problem inspector’ (UPI) method or the ’Discovery and Analysis Resource’ (DARe) method.

Cognitive walkthroughs may best be used to identify potential usability issues to evaluate during usability testing.

Testers can use either laboratory or remote usability testing because they both elicit similar results.

[And finally]

Use severity ratings with caution.”

Useful background on usability testing

I came across this site while looking for some background information on usability testing to send to the colleagues I’m planning some user evaluation with. It looks like a really useful resource for all stages of a project from planning to deployment.

Their guidelines are available to download in PDF form, either as an entire book or as specific chapters.

“Encouraging a “There Are No Dumb Questions” culture is only part of the solution. What we really need is a “There are No Dumb Answers” policy.”

How to Build a User Community, Part 1 offers some good solutions to the kinds of issues I’ve worried about when thinking about our user communities. I think it’s a good basis for some guidelines but really we just need to get it up and running and see how our users respond.

Are small museums the long tail?

On the way home from the Semantic Web Think Tank last week (see previous post), I suddenly thought: are small or specialised museums the long tail?

Each museum by itself would represent a tiny proportion of the overall use of museum collections online, but if you put all that usage together, would their collections in fact have a higher rate of use than those of more ‘popular’ museums?

At the moment I don’t think there’s any way to find out, because so many small or specialised museums don’t have collections online, through a lack of expertise, digitisation resources or an easy-to-use publication infrastructure. Still, it’s an interesting question.

Semantic Web Think Tank

I went to the Semantic Web Think Tank meeting on “Social Software and the User Experience of the Semantic Web” in Brighton on Thursday. I’m still thinking about the discussions a lot, but here are some of my thoughts. This isn’t an official report of the day, and they’re in entirely random order and mixed up with other issues I’ve been thinking about lately.

We were asked to introduce ourselves and briefly describe our interest in the Semantic Web at the start of the session. I explained that I have a long-standing interest in user experiences online, and in the presentation of collections online. For a long time I’ve been interested in discovering whether we’re actually using the most effective schemas, formats, navigation and interfaces for our audiences, so sessions like this are a delight.

User-generated content
A lot of the conversation was about user-generated content rather than the users’ experience of the semantic web, possibly because museums are thinking hard about user-generated content at the moment.

We talked about models for the presentation of user-generated content, such as Amazon reviews, which suggest that users are comfortable distinguishing between content generated within an institution and content written by other users.

I didn’t raise this at the time, but while the overall quality of Amazon reviews and Wikipedia entries is encouraging, the Yahoo! Answers service makes me despair for humanity. Maybe it doesn’t have the snob value of other social software sites like Amazon or Wikipedia, but the answers tend to be pretty low quality and sometimes possibly even maliciously wrong. Importantly, stupid or bad answers don’t tend to be rated down the way a less insightful Amazon review would be.

However, overall it does show that there are existing models of user-generated content that we can follow – we don’t have to invent them to start publishing user-generated content on our museum websites.

As an aside, hopefully our users don’t discount ‘official’ museum content the way users tend to disregard the publisher blurbs on Amazon – we’re told that users regard museums as trustworthy and ‘objective’ and, I would hope, with some affection.

I’d never really thought about using folksonomies as a form of feedback that would inform the process of creating ontologies, but once Areti raised it I started thinking about it. I guess I’ve always seen them as serving slightly different purposes, and as I don’t think they compete in any sense, I hadn’t seen the need to change how ontologies are constructed. I guess it depends – while internal ontologies don’t need to be user-friendly, museums have a tendency to re-use them as navigation and information architecture on a website, where they do need to suit the audiences.

There was some discussion about the barriers to participation for museums and the possibility of resistance from curators and other museum staff. I’ve been lucky that so far I haven’t encountered any resistance, and I think that generally we can use the internal goodwill towards engaging new, non-traditional or disengaged audiences as a motivator. Our barriers to participation are those old favourites, time and money.

I think I must have been hungry, because I started thinking about collections delivered as RSS as ‘home delivery’ from a range of menus, and traditional online collections as going to a restaurant – the restaurant chooses the range of items you can order and the format they’ll be delivered in.

Not all users are equal!
User-generated content isn’t written by random voices from an undifferentiated mass of users. Reputation and trust are important, whether for ‘Real Name’ reviewers on Amazon, established authors on Wikipedia, or eBay sellers with good feedback. What impact might that have on museum content that’s ‘leaked out’ and lost its original context?

Knowing what our users want matters, and simultaneously doesn’t
Towards the end of the morning session I decided that we can’t predict how semantic web users will use unfettered access, so maybe we should just build it and see what happens, instead of trying to second-guess them. In a way, the Semantic Web is post-User Centred Design because we’re not designing the applications but providing repositories of data that can be used in what could be called User Created Applications. It’s not that users don’t matter, it’s that now isn’t the time to make assumptions about what they want – our dialogue with them should be very open-ended.

As for what ‘it’ is – maybe a repository of objects published in a sector-wide digital object model or schema? There was some discussion of whether objects could be published in microformats, but I think they’re too big for that. On the other hand, if we have a repository where each object has a permanent URI, we could put selected data into microformats that refer back to the URI.
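As a very rough sketch of that last idea: the snippet below takes a few selected fields from an object record and wraps them in class-annotated HTML that links back to the object’s permanent URI. Everything specific here is an assumption for illustration only: the record fields, the example.org URI, the sample object and the class names are made up and are not an established microformat or schema. The only standard piece is rel="bookmark", the HTML link type for a permalink.

```python
from html import escape


def object_to_html(record):
    """Render selected fields of an object record as class-annotated HTML.

    The class names below are hypothetical, not a published microformat.
    """
    uri = escape(record["uri"])      # permanent URI of the full record
    title = escape(record["title"])
    date = escape(record["date"])
    return (
        '<div class="museum-object">\n'
        f'  <a class="object-uri" rel="bookmark" href="{uri}">'
        f'<span class="object-title">{title}</span></a>\n'
        f'  <span class="object-date">{date}</span>\n'
        '</div>'
    )


# Made-up record for illustration; a real one would come from the repository.
example = {
    "uri": "https://collections.example.org/object/12345",
    "title": "Roman oil lamp",
    "date": "1st century AD",
}

print(object_to_html(example))
```

The full record stays in the repository. Only the small, selected subset travels with the snippet, and the link lets anyone who finds it elsewhere get back to the original context.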

We can better predict how user-generated content might relate to our existing infrastructure so we should try to cater to known models and requirements.

The semantic web can cause problems for museums funded according to the number of visitors through their door or to their website. We need to redefine the measures of success to incorporate content that’s used outside the infrastructure of the originating museum; or we can refer users on to commercial services such as picture libraries. In terms of development, we can aim to create re-usable and sustainable infrastructures in any new applications so they can be used to deliver content both to the target audience/application and beyond.

Other random thoughts
I’ve also realised that maybe we need to take a step back and ask “do we actually know who our users are?” before we can assess the effectiveness of online collections. There are generally accepted groupings of users that we talk about, but do they reflect reality? It may well be that we’re on the right track but it would be good to confirm this. Interestingly, since I started writing this post, I’ve noticed that my old workmates Jonny Brownbill and Darren Peacock are presenting a session titled Audiences, Visitors and Users: Reconceptualising users of museum online content and services at MW2007, so hopefully research in this area is moving forward.

I can relate this to the discussions on Thursday but actually it came out of a conversation I had with my workmate Jeremy beforehand: does Australia’s history with models of distance education like the “School of the Air” mean that Australian museums have a different understanding of how to present collections online? Australian museums have had extensive collections online for years, possibly a lot earlier than museums in Europe or North America.

Update: the workshop report is now online.

“Shoppers are likely to abandon a website if it takes longer than four seconds to load, a survey suggests.

It found 75% of the 1,058 people asked would not return to websites that took longer than four seconds to load.” Akamai study as reported on the BBC.

It’s a study of online shopping habits, but I wonder if the same holds for cultural sector sites. That I don’t know says something either about my knowledge of existing audience evaluation or about the paucity of existing information.

The article doesn’t report whether the study analysed the results by gender, but this article, Key Website Research Highlights Gender Bias, suggests that gender makes a big difference to the user experience:

“Despite the parity of target audience, the results found that 94% of the sites displayed a masculine orientation with just 2% displaying a typically female bias.”

Interesting use of location-aware devices at the Tower of London.

“The new game employs HP’s iPAQ handheld devices and location sensors to trigger the appropriate digital file, which includes voices, images, music and clues.

HP said that developing the new game has helped it to explore opportunities for new products and services that will emerge around the delivery of location and other context-based experiences.”

Via the BCS.

I’ve always wanted to do something like a ‘museum outside the walls’ where hand-held devices or mobile phones deliver content based on your location. They could be used in walking tours, or signs could let people know that content is available. London has so many layers of history, and the Museum has so much content about London’s histories.
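To make the idea a bit more concrete, here is a minimal sketch of the core lookup such a service would need: given a visitor’s position, find the pieces of content within a short walk. It uses the standard haversine formula for distance. The content points, their coordinates, their titles and the 50 metre radius are all invented for illustration, and a real service would also have to deal with positioning accuracy and with how the content is actually delivered.

```python
import math


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine formula)."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def content_near(lat, lon, points, radius_m=50.0):
    """Return titles of content points within radius_m of the visitor, nearest first."""
    hits = []
    for p in points:
        d = distance_m(lat, lon, p["lat"], p["lon"])
        if d <= radius_m:
            hits.append((d, p["title"]))
    return [title for _, title in sorted(hits)]


# Hypothetical content points with illustrative coordinates in London.
points = [
    {"title": "Point A", "lat": 51.5081, "lon": -0.0759},
    {"title": "Point B", "lat": 51.5194, "lon": -0.1270},
]

# A visitor standing a few metres from Point A.
print(content_near(51.5082, -0.0760, points))
```

With these made-up points, only Point A falls inside the radius. The same lookup would work for a walking tour: each stop becomes a point, and the radius decides how close you need to be before the content is offered.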