I’ve developed this exercise on computational data generation and entity extraction for various information/data visualisation workshops I’ve been teaching lately. As these methods have become more accessible, my dataviz workshops have included more discussion of computational methods for generating data to be visualised. I used to do a text-based version of this, but found that using services that describe images was more accessible and generated richer discussion in class. If you try something like this in your classes I’d love to hear from you.
It’s also a chance to talk about the uses of these technologies in categorising and labelling our posts on social media. We can tell people that their social media posts are analysed for personality traits and mentions of brands, but seeing it in action is much more powerful.
Exercise: trying computational data generation and entity extraction
Time: c. 10 minutes with discussion.
Goal: explore methods for extracting information from text or an image, and reflect on what the results tell you about the algorithms.
1 Find a sample image
Find an image (e.g. from a news site or digitised text) that you can download and drag into the browser window for each of the sites below. It may be most convenient to save a copy to your desktop. Many sites let you load images from a URL, so right- or control-clicking to copy an image location for pasting into the site can be useful.
2 Work in your browser
It’s probably easiest to open each of these links in a new browser window. It’s best to use Firefox or Chrome, if you can. Safari and Internet Explorer may behave slightly differently on some sites. You should not need to register to use these sites – please read the tips below or ask for help if you get stuck.
- Clarifai https://www.clarifai.com/demo – you can drag and drop, open the file explorer to find an image, or load one from a URL via the large ‘+’ in the bottom right-hand corner. You can adjust settings via the ‘Configure’ tab.
- Google Cloud Vision API https://cloud.google.com/vision/ – don’t sign up, scroll down to the ‘Try the API’ box. Drag and drop your image on the box or click the box to open the file finder. You may need to go through the ‘I am not a robot’ process.
- Microsoft Computer Vision API https://www.microsoft.com/cognitive-services/en-us/computer-vision-api – scroll to ‘Analyze an image’. You can use one of their sample images, paste a URL and hit ‘Submit’, or click on the green folder icon to upload your own image.
- IBM Watson Visual Recognition https://visual-recognition-demo.mybluemix.net/ – scroll to ‘Try the service’. Drag an image onto the grey box or click in the grey box to open the file finder. You can also load an image directly from a URL.
- Blippar https://developer.blippar.com/portal/vs-api/index/#demoSection – scroll to the ‘Analyze any image’ section. The upload and URL options are below the sample images and tags.
- Caffe http://demo.caffe.berkeleyvision.org/
3 Review the outputs
Make notes, or discuss with your neighbour. Be prepared to report back to the group.
- What attributes does each tool report on?
- Which attributes, if any, were unique to a service?
- Based on this, what do Clarifai, Google, IBM and Microsoft seem to think is important to them (or to their users)?
- How many of the possible entities (concepts, people, places, events, references to time or dates, etc.) did it pick up?
- Is any of the information presented useful?
- Did it label anything incorrectly?
- What options for exporting or saving the results did the demo offer? What about the underlying service or software?
- For tools with configuration options – what could you configure? What difference did changing classifiers or other parameters make?
- If you tried it with a few images, did it do better with some than others? Why might that be?
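If a group wants to compare results more systematically, the questions about shared and unique attributes can be tallied with a few lines of Python. This is only a minimal sketch: the labels below are made-up placeholders to be replaced with whatever you copy by hand from each demo, not real output from any of the services, and the names `labels_by_service`, `unique_labels` and `shared_labels` are hypothetical.

```python
from collections import Counter

# Hypothetical labels, typed in by hand from each demo's output for one image.
# Replace these placeholder values with what the services actually returned.
labels_by_service = {
    "Clarifai": {"people", "street", "city", "portrait"},
    "Google": {"street", "city", "pedestrian", "urban area"},
    "Microsoft": {"outdoor", "people", "street"},
    "IBM": {"street", "city", "alley"},
}


def unique_labels(labels_by_service):
    """Return, for each service, the labels that no other service reported."""
    counts = Counter(
        label for labels in labels_by_service.values() for label in labels
    )
    return {
        service: sorted(label for label in labels if counts[label] == 1)
        for service, labels in labels_by_service.items()
    }


def shared_labels(labels_by_service):
    """Return the labels that every service reported."""
    return sorted(set.intersection(*labels_by_service.values()))


if __name__ == "__main__":
    for service, labels in unique_labels(labels_by_service).items():
        print(f"{service} only: {', '.join(labels) or '(none)'}")
    print(f"Reported by all: {', '.join(shared_labels(labels_by_service)) or '(none)'}")
```

Exact string matching is a deliberate simplification: services often use different words for the same thing (‘road’ versus ‘street’), which is itself a useful discussion point.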
This exercise focuses on images, but you can try a similar exercise with text-based tools like Stanford’s Natural Language Processing (NLP) demo http://corenlp.run/, DBpedia Spotlight https://dbpedia-spotlight.github.io/demo/ and Ontotext http://tag.ontotext.com/.