Helping us fly? Machine learning and crowdsourcing

Image of a man in a flying contraption powered by birds
Moon Machine by Bernard Brussel-Smith via Serendip-o-matic

Over the past few years we’ve seen an increasing number of projects that take the phrase ‘human-computer interaction’ literally (perhaps turning ‘HCI’ into human-computer integration), organising tasks done by people and by computers into a unified system. One of the most obvious benefits of crowdsourcing on digital platforms has been the ability to coordinate the distribution and validation of tasks. Increasingly, data manually classified through crowdsourcing is being fed into computers to improve machine learning so that computers can learn to recognise images or words almost as well as we do. I’ve outlined a few projects putting this approach to work below.

This creates new challenges for the future: if fun, easy tasks like image tagging and text transcription can be done by computers, what are the implications for cultural heritage and digital humanities crowdsourcing projects that used simple tasks as the first step in public engagement? After all, Fast Company reported that ‘at least one Zooniverse project, Galaxy Zoo Supernova, has already automated itself out of existence’. What impact will this have on citizen science and history communities? How might machine learning free us to fly further, taking on more interesting tasks with cultural heritage collections?

The Public Catalogue Foundation has taken tags created through Your Paintings Tagger and achieved impressive results in the art of computer image recognition: ‘Using the 3.5 million or so tags provided by taggers, the research team at Oxford ‘educated’ image-recognition software to recognise the top tagged terms’. All paintings tagged with a particular subject (e.g. ‘horse’) were fed into feature extraction processes to build an ‘object model’ of a horse (a set of characteristics that would indicate that a horse is depicted), then tested to see whether the system could correctly tag horses.
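As a rough illustration of the first step in that kind of pipeline, the sketch below turns crowd tags into labelled example sets for a single term. It is not the Oxford team's method: the painting ids, the tags, the function names and the `min_votes` threshold are all invented for illustration, and the feature extraction and model training that would follow are left out entirely.

```python
from collections import Counter

# Hypothetical crowd tags: painting id -> tags supplied by different taggers.
crowd_tags = {
    "painting_1": ["horse", "field", "horse", "rider"],
    "painting_2": ["ship", "sea", "ship"],
    "painting_3": ["horse", "cart", "horse"],
    "painting_4": ["sea", "cliff"],
}


def top_terms(tags_by_item, n):
    """Return the n terms that have been applied to the most paintings."""
    counts = Counter()
    for tags in tags_by_item.values():
        counts.update(set(tags))  # count each term once per painting
    return [term for term, _ in counts.most_common(n)]


def training_sets(tags_by_item, term, min_votes=2):
    """Split paintings into positive and negative examples for one term.

    A painting is a positive example if at least min_votes taggers used the
    term, a negative example if nobody did, and is skipped otherwise.
    """
    positives, negatives = [], []
    for item, tags in tags_by_item.items():
        votes = tags.count(term)
        if votes >= min_votes:
            positives.append(item)
        elif votes == 0:
            negatives.append(item)
    return positives, negatives


if __name__ == "__main__":
    print(top_terms(crowd_tags, 2))
    print(training_sets(crowd_tags, "horse"))
```

Requiring more than one vote before treating a painting as a positive example is one simple way to keep a single stray tag out of the training data; paintings with only one vote are set aside rather than guessed at.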

The BBC World Service archive used an ‘open-source speech recognition toolkit to listen to every programme and convert it to text’ and keywords, then asked people to check the correctness of the data created (Algorithms and Crowd-Sourcing for Digital Archives, see also What we learnt by crowdsourcing the World Service archive).

The CUbRIK project combines ‘machine, human and social computation for multimedia search’ in their technical demonstrator, HistoGraph. The SOCIAM: The Theory and Practice of Social Machines project is looking at ‘a new kind of emergent, collective problem solving’, including ‘citizen science social machines’.

And of course the Zooniverse is working on this, most recently with Galaxy Zoo. A paper summarised on their Milky Way project blog outlines the powerful synergy between citizen scientists, professional scientists, and machine learning: ‘citizens can identify patterns that machines cannot detect without training, machine learning algorithms can use citizen science projects as input training sets, creating amazing new opportunities to speed-up the pace of discovery’, addressing the weaknesses of each approach if deployed alone.

Further reading: an early discussion of human input into machine learning is in Quinn and Bederson’s 2011 Human Computation: A Survey and Taxonomy of a Growing Field. You can get a sense of the state of the field from various conference papers, including ICML ’13 Workshop: Machine Learning Meets Crowdsourcing and ICML ’14 Workshop: Crowdsourcing and Human Computing. There’s also a mega-list of academic crowdsourcing conferences and workshops, though it doesn’t include much on the tiny corner of the world that is crowdsourcing in cultural heritage.

Last update: March 2015. This post collects my thoughts on machine learning and human-computer integration as I finish my thesis. Do you know of examples I’ve missed, or implications we should consider?

8 thoughts on “Helping us fly? Machine learning and crowdsourcing”

  1. Thanks, great post. There’s one other nice thing I’ve noticed about using crowdsourcing to produce training data for machine learning: it gives you a meaningful benchmark for accuracy.

    In other words, if you arrange things so crowdsourcers are tagging some of the same items, you can derive a measure of inter-rater agreement for your human readers. Then you can compare this to the accuracy of the algorithm.

    This is pretty illuminating in some cases. You might feel bad that your algorithm is “only” 92% accurate, until you discover that human readers agree with each other about these categories only 93% of the time.

  2. Thanks Ted! I think your comment also serves as a reminder that people might be remembering the data quality of pre-crowdsourcing projects through rose-tinted glasses – I’ve certainly seen lots of obviously hastily-catalogued records produced by internal digitisation projects.

    And from the humanities to humanitarianism, ‘Aerial Imagery Analysis: Combining Crowdsourcing and Artificial Intelligence‘ is another take on integrating human and machine computation.

  3. Fascinating stuff. I’m particularly interested to know that tagging is something where a significant level of ‘accuracy’ is sought after: it’s always seemed to me that tagging, and categorisation in general, is a highly subjective matter (witness my inability to find any of my carefully-archived emails, due to each one having several possible homes – and of course the more tags you use for each individual item, the more noise you create when inspecting a particular tag – but I guess advanced search options help to reduce that problem).

    1. Thanks! The accuracy needed depends on the goal – if you’re trying to bridge the semantic gap between descriptions of objects in museum collections and how normal people think of them, then the more varied terms you can collect the better (and if you get enough for a folksonomy you can derive structure from the ‘noise’); but if you’re tagging entities like people, places, events, etc, then you don’t want Winston Churchill mis-tagged as Clement Attlee.

      And of course if you’re trying to categorise galaxies or accurately transcribe text then you want the contributions to match what’s on the screen as closely as possible (because no-one cares about your subjective opinion unless you’ve noticed something interesting about the image).
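The benchmark described in the first comment (comparing an algorithm's accuracy with how often human readers agree with each other) can be sketched in a few lines. This is a minimal example under stated assumptions: the image ids, labels and function names are invented, every item is assumed to have been labelled by an odd number of volunteers so that a majority always exists, and agreement is measured as raw pairwise agreement rather than a chance-corrected statistic such as Cohen's or Fleiss' kappa.

```python
from itertools import combinations

# Hypothetical data: item -> label chosen by each of three volunteers.
human_labels = {
    "img_1": ["spiral", "spiral", "spiral"],
    "img_2": ["spiral", "elliptical", "spiral"],
    "img_3": ["elliptical", "elliptical", "elliptical"],
    "img_4": ["merger", "merger", "elliptical"],
}

# Hypothetical output of a classifier for the same items.
algorithm_labels = {
    "img_1": "spiral",
    "img_2": "spiral",
    "img_3": "elliptical",
    "img_4": "elliptical",
}


def pairwise_agreement(labels_by_item):
    """Fraction of volunteer pairs that chose the same label for an item."""
    agree = total = 0
    for labels in labels_by_item.values():
        for a, b in combinations(labels, 2):
            agree += a == b
            total += 1
    return agree / total


def majority_label(labels):
    """Most common label for one item (assumes there is no tie)."""
    return max(set(labels), key=labels.count)


def algorithm_accuracy(predictions, labels_by_item):
    """Fraction of items where the prediction matches the human majority."""
    correct = sum(
        predictions[item] == majority_label(labels)
        for item, labels in labels_by_item.items()
    )
    return correct / len(labels_by_item)


if __name__ == "__main__":
    print(f"human agreement:    {pairwise_agreement(human_labels):.2f}")
    print(f"algorithm accuracy: {algorithm_accuracy(algorithm_labels, human_labels):.2f}")
```

If the two numbers come out close to each other, the algorithm is disagreeing with people about as often as people disagree among themselves, which is the point of the comparison.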
