July 2008 – Open Objects

It's a good week for search engine gossip

Dare Obasanjo quotes Nick Carr as a lead in to a post on Google's Assault on Wikipedia:

Clearly Nick Carr wasn't the only one that realized that Google was slowly turning into a Wikipedia redirector. Google wants to be the #1 source for information or at least be serving ads on the #1 sites on the Internet in specific area. Wikipedia was slowly eroding the company's effectiveness at achieving both goals. So it is unsurprising that Google has launched Knol and is trying to entice authors away from Wikipedia by offering them a chance to get paid.

What is surprising is that Google is tipping its search results to favor Knol. Or at least that is the conclusion of several search engine optimization (SEO) experts and also jibes with my experiences.

After looking at some test cases he concludes:

Google is clearly favoring Knol content over content from older, more highly linked sites on the Web. I won't bother with the question of whether Google is doing this on purpose or whether this is some innocent mistake. The important question is "What are they going to do about it now that we've found out?"

It's early days for Knol so maybe the placement of Google search results will settle down over time.

Via other links I found confirmation that '[f]or years, Google's link: command (and see here) has deliberately failed to show all the links to a website.' It's old news that I missed at the time, but since I'd always wondered why the link: command never seemed to work properly, I thought it was worth mentioning.

One step closer to intelligent searching?

The BBC have a story on a new search engine, Search site aims to rival Google:

Called Cuil [pronounced 'cool'], from the Gaelic for knowledge and hazel, its founders claim it does a better and more comprehensive job of indexing information online.

The technology it uses to index the web can understand the context surrounding each page and the concepts driving search requests, say the founders.

But analysts believe the new search engine, like many others, will struggle to match and defeat Google.

…

Instead of just looking at the number and quality of links to and from a webpage as Google's technology does, Cuil attempts to understand more about the information on a page and the terms people use to search. Results are displayed in a magazine format rather than a list.

From the Cuil FAQ:

So Cuil searches the Web for pages with your keywords and then we analyze the rest of the text on those pages. This tells us that the same word has several different meanings in different contexts. Are you looking for jaguar the cat, the car or the operating system?

We sort out all those different contexts so that you don't have to waste time rephrasing your query when you get the wrong result.

Different ideas are separated into tabs; we add images and roll-over definitions for each page and then make suggestions as to how you might refine your search. We use columns so you can see more results on one page.

They also provide 'drill-downs' on the results page.

Cuil will direct you to this additional information. By looking at these suggestions, you may discover search data, concepts, or related areas of interest that you hadn’t expected. This is particularly useful when you are researching a subject you don't know much about and aren't sure how to compose the "right" query to find the information you need.

I haven't used it enough to work out exactly how it differentiates concepts (tabs) and 'additional information' (drill-downs/categories).

It does a good job on something like the Cutty Sark. Under 'Explore by Category' it offered:

Buildings And Structures In Greenwich
Sailboat Names
Museums In London
Neighbourhoods Of Greenwich
School Ships

It picked up search results for Cutty Sark whisky and news of the Cutty Sark fire but they weren't reflected in the categories, and the search term didn't trigger the tabs. The tabs kick in when you search for something like 'orange'.

It didn't do as well with 'samian ware' – the categories picked up all sorts of places and peoples (and, randomly, 'American Films'), but while the search results all say that it's 'a kind of bright red Roman pottery', that's not reflected in the categories. Fair enough, there may not be enough information easily available online for 'Types of Roman pottery' to register as a category.

Incidentally, most of the results listed for 'samian ware' are just recycled entries from Wikipedia. It's a shame the results aren't filtered to remove entries that have just duplicated Wikipedia text. The FAQ says they don't index duplicate content, so I guess the overall site or page is just different enough to be retained.

It might take a while for museum content to appear in the most useful ways, but it looks like it might be a useful search engine for niche content. From the FAQ again:

We've found that a lot of Web pages have been designed with a small audience in mind—perhaps they are blogs or academic papers with specific interests or pages with family photos. We think that even though these pages aren't necessarily for a wide audience, they contain content that one day you might need.

Our job is to index all these pages and examine their content for relevancy to your search. If they contain information you need, then they should be available to you.

It's all sounding a bit semantic web-ish (and quite a bit 'reacting to Google-ish') and I'll use it for a while to see how it compares to Google. The webmaster information doesn't give any indication of how you could mark up content so that the relationships between terms in different contexts are clear, but I guess nice semantic markup would help.

Refreshingly, it doesn't retain search info – privacy is one of their big differentiators from Google.

'Annoying adverts affect website traffic'

Via the BCS:

Nearly three quarters – 73 per cent – of internet users clicked away from a favourite website because of an annoying advert, according to research.

The survey, carried out by Opinion Matters for HowTo.tv, also revealed that 59 per cent no longer visited a particular website because of its advertising.

I use AdBlock for a serene and calm web experience, so when I use someone else's computer I'm always amazed at the sheer level of noise on the web and the crappiness of pages plastered with ads.

I was using Add-Art in conjunction with AdBlock, and will again when it supports Firefox 3 because it's a lovely idea. If you haven't heard of Add-Art before, check out this Webmonkey article until you can install it on Firefox 3.

Giant squid dissection via live video

I've been watching the recording of the live stream of the first ever public dissection by Museum scientists of a giant squid.

Congratulations to everyone involved at Museum Victoria, it's a great use of technology and a great approach to openness. The explanations were beautifully clear, and did a great job of contextualising the research, the process and the animal itself.

I love the paparazzi-style photo flashes as they rolled the trolley out onto the main floor.

Portable mapping applications make managers happy

This Webmonkey article, Multi-map with Mapstraction, about an 'open source abstracted JavaScript mapping library' called Mapstraction, is perfectly on target for organisations that worry about relying on one mapping provider.

How many of these have you heard as possible concerns about using a particular mapping service?

Current provider might change the terms of service
Your map could become too popular and use up too many map views
Current provider quality might get worse, or they might put ads on your map
New provider might have prettier maps
You might get bored of current provider, or come up with a reason that makes sense to you

They're all reasonable concerns. But look what the lovely geeks have made:

The promise of Mapstraction is to only have to change two lines of code. Imagine if you had a large map with many markers and other features. It could take a lot of work to manually convert the map code from one provider to another.

And functionality is being expanded. I liked this:

One of my favorite Mapstraction features is automatic centering and zooming. When called on a map with multiple markers, Mapstraction calculates the center point of all markers and the smallest zoom level that will contain all the markers.
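To make the idea of automatic centring and zooming concrete, here is a minimal sketch in Python. It is not Mapstraction's code (which is JavaScript); it just illustrates one common way to do it, assuming a Web Mercator map where the whole world is 256 pixels wide at zoom 0. The function names, the 512-pixel viewport and the maximum zoom of 18 are all assumptions made for the example. It also ignores marker sets that straddle the 180° meridian and latitudes at the poles.

```python
import math

TILE_SIZE = 256  # assumed pixel width of the whole world at zoom 0


def _mercator_y(lat_deg):
    """Project a latitude in degrees onto the Web Mercator vertical axis."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))


def _inverse_mercator_y(y):
    """Convert a Mercator vertical coordinate back to degrees of latitude."""
    return math.degrees(2 * math.atan(math.exp(y)) - math.pi / 2)


def center_and_zoom(markers, width_px=512, height_px=512, max_zoom=18):
    """Return ((lat, lon), zoom) so every (lat, lon) marker fits the viewport."""
    if not markers:
        raise ValueError("need at least one marker")

    lats = [lat for lat, _ in markers]
    lons = [lon for _, lon in markers]
    west, east = min(lons), max(lons)
    y_south, y_north = _mercator_y(min(lats)), _mercator_y(max(lats))

    # Centre of the bounding box, taken in projected space so the box
    # sits in the middle of the map as drawn.
    center = (_inverse_mercator_y((y_south + y_north) / 2), (west + east) / 2)

    # Fraction of the whole world that the bounding box covers on each axis.
    x_fraction = (east - west) / 360
    y_fraction = (y_north - y_south) / (2 * math.pi)

    def fit(viewport_px, fraction):
        # Closest zoom at which this span still fits inside the viewport.
        if fraction <= 0:
            return max_zoom
        return math.floor(math.log2(viewport_px / (TILE_SIZE * fraction)))

    zoom = min(fit(width_px, x_fraction), fit(height_px, y_fraction), max_zoom)
    return center, max(zoom, 0)
```

Calling center_and_zoom([(-10, -20), (10, 20)]) gives a centre at (or within rounding error of) the equator and the prime meridian, with a zoom level of 4 for the assumed 512-pixel viewport. A single marker has no span, so it falls back to the maximum zoom.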

Open source rocks! Not only can you grab the code and have someone maintain it for you if you ever need to, but it sounds like a labour of geek love:

Mapstraction is maintained by a group of geocode lovers who want to give developers options when creating maps.

Microupdates and you (a.k.a. 'twits in the museum')

I was trying to describe Twitter-esque applications for a presentation today, and I wasn't really happy with 'microblogging' so I described them as 'micro-updates'. Partly because I think of them as a bit like Facebook status updates for geeks, and partly because they're a lot more actively social than blog posts.

In case you haven't come across them, Twitter, Pownce, Jaiku, tumblr, etc, are services that let you broadcast short (140 characters) messages via a website or mobile device. I find them useful for finding like-minded people (or just those who also fancy a drink) at specific events (thanks to Brian Kelly for convincing me to try it).

You can promote a 'hash tag' for use at your event – yes, it's a tag with a # in front of it, low tech is cool. Ideally your tag should be short and snappy yet distinct, because it has to be typed in manually (mistakes happen easily, especially from a mobile device) and it's using up precious characters. You can use tools like Summize, hashtags, Quotably or Twemes to see if anyone else has used the same tag recently.
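If you want to check how much room a tag leaves in a message, or pull the tags out of messages you've collected, a few lines of Python will do. This is only a sketch: the 140-character limit comes from the description above, while the function names and the example tag 'mw2008' are made up for illustration, and real services may treat punctuation in tags differently.

```python
import re

MAX_LENGTH = 140  # message limit mentioned above

# A '#' followed by letters, digits or underscores (a simplifying assumption).
HASHTAG = re.compile(r"#\w+")


def extract_hashtags(message):
    """Return the hash tags found in a message, lower-cased for comparison."""
    return [tag.lower() for tag in HASHTAG.findall(message)]


def characters_left(message, event_tag):
    """Characters remaining once ' #event_tag' is appended to the message."""
    return MAX_LENGTH - len(message) - len(event_tag) - 2  # space plus '#'
```

For example, extract_hashtags("Great talk at #MW2008 #museums") returns ['#mw2008', '#museums'], and characters_left("Great talk", "mw2008") returns 122. The shorter the tag, the more room is left for the actual message.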

You can also ask people to use your event tag on blog posts, photos and videos to help bring together all the content about your event and create an ad hoc community of participants. Be aware that especially with Twitter-type services you may get fairly direct criticism as well as praise – incredibly useful, but it can seem harsh out of context (e.g. in a report to your boss).

More generally, you can use the same services above to search twitter conversations to find posts about your institution, events, venues or exhibitions. You can add in a search term and subscribe to an RSS feed to be notified when that term is used. For example, I tried http://summize.com/search?q="museum+of+london" and discovered a great review of the last 'Lates' event that described it as 'like a mini festival'. You should also search for common variations or misspellings, though they may return more false positives. When someone tweets (posts) using your search phrase it'll show up in your RSS reader and you can then reply to the poster or use the feedback to improve your projects.
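If you're tracking several variations of a name, it can help to generate the search links rather than typing them by hand. The sketch below, in Python, builds exact-phrase search URLs using the same pattern as the Summize link above. The function names and the list of variations are assumptions for the example, and the URL pattern may not carry over to other services.

```python
from urllib.parse import quote_plus

SEARCH_BASE = "http://summize.com/search?q="  # pattern taken from the link above


def phrase_search_url(phrase):
    """Build a search URL for an exact phrase (wrapped in double quotes)."""
    return SEARCH_BASE + quote_plus(f'"{phrase}"')


def urls_for_variations(phrases):
    """Map each spelling or variation to its own search URL."""
    return {phrase: phrase_search_url(phrase) for phrase in phrases}


if __name__ == "__main__":
    # Hypothetical variations, including a common misspelling.
    for phrase, url in urls_for_variations(
        ["museum of london", "london museum", "musuem of london"]
    ).items():
        print(phrase, "->", url)
```

The quotation marks are encoded as %22 and the spaces as +, so phrase_search_url("museum of london") gives http://summize.com/search?q=%22museum+of+london%22, which is the encoded form of the link I tried.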

This can be a powerful way to interact with your audience because you can respond directly and immediately to questions, complaints or praise. Of course you should also set up Google Alerts for blog posts and other websites, but micro-update services allow for an incredible immediacy and directness of response.

As an example, yesterday I tweeted (or twitted, if you prefer):

me: does anyone know how to stop firefox 3 resizing pages? it makes images look crappy

I did some searching [1] and found a solution, and posted again:

me: aha, it's browser.zoom.full or "View → Zoom → Zoom Text Only" on windows, my firefox is sorted now

Then, to my surprise, I got a message from someone involved with Firefox [2]:

firefox_answers: Command/Control+0 (zero, not oh) will restore the default size for a page that's been zoomed. Also View->Zoom->Reset

me: Impressed with @firefox_answers providing the answer I needed. I'd been looking in the options/preferences tabs for ages

firefox_answers: Also, for quick zooming in & out use control plus or control minus. in Firefox 3, the zoom sticks per site until you change it.

Not only have I learnt some useful tips through that exchange, I feel much more confident about using Firefox 3 now that I know authoritative help is so close to hand, and in a weird way I have established a relationship with them.

Finally, twitter et al have a social function – tonight I met someone who was at the same event I was last week who vaguely recognised me because of the profile pictures attached to Twitter profiles on tweets about the event. Incidentally, he's written a good explanation of twitter, so I needn't have written this!

[1] Folksonomies to the rescue! I'd been searching for variations on 'firefox shrink text', 'firefox fit screen', 'firefox screen resize' but since the article that eventually solved my problem called it 'zoom', it took me ages to find it. If the page was tagged with other terms that people might use to describe 'my page jumps, everything resizes and looks a bit crappy' in their own words, I'd have found the solution sooner.

[2] Anyone can create a username and post away, though I assume Downing Street is the real thing.

London Transport Museum's Flickr scavenger hunt

I haven't looked at the whole site yet but I loved the idea so I wanted to post it while you could still vote (until 20 July 2008):

London Transport Museum is hosting a Flickr scavenger hunt on Sunday 6th July in Covent Garden as part of the events for the London Festival of Architecture 2008. Focusing on the transport network's quirky design features, in a race against time teams of photographers will have to unlock a series of cryptic clues in order to snap roundels, station murals and much more. Have you got what it takes to get all the shots and make it back to the Museum? Prizes for the first team back (with the most correct answers), and – voted by the public – the best team and the best picture uploaded on Flickr.

We're all suckers for museums

A lovely post on 'Why Museums Are Important to Me' that also contains a reminder of the need to consciously communicate properly with those outside the sector:

When you work in a museum you sometimes forget what it's like NOT to work in a museum. Museums can be very absorbing little worlds, because they have such odd functions and corners and people in them.

Museums might not offer the best working conditions in the world (especially if you work in a profession that's usually better paid), but there are good reasons why most people who work in museums love their jobs.

20% time – an experiment (with some results)

A company called Atlassian have been experimenting with allowing their engineers 20% of their time to work on free or non-core projects (a la Google). They said:

You see, while everyone knows about Google's 20% time and we've heard about all the neat products born from it (Google News, GMail etc) – we've found it extremely difficult to get any hard facts about how it actually works in practice.

So they started with a list of questions they wanted to answer through their experiment, and they've been blogging about it at http://blogs.atlassian.com/developer/20_percent_time/. It makes for interesting reading, and it's great to see some real evidence starting to emerge.

Hat tip: Tech-Ed Collisions.

Learn web standards for free

So now you have no excuse – it's free, accessible, and "designed to give anyone a solid grounding in web design/development, no matter who they are" (and what they might/not already know):

Learning Web Standards just got easier. Opera's new Web Standards Curriculum is a complete course to teach you standards-based web development, including HTML, CSS, design principles and background theory, and JavaScript basics.

Interestingly, the introduction says, "I am mainly aiming this at universities, as I believe the standards of education in web standards to be somewhat lacking at many universities".

More at Learn to build a better Web with Opera.