A quick update from my CENDARI fellowship, where I'm working on a project that's becoming 'In their own words: linking lived experiences of the First World War'. I've spent the week reading (again a mixture of original diaries and letters, technical material like ontology documentation, and WWI history forums and 'amateur' sites) and writing. I put together a document outlining a range of possible goals and some very sketchy tech specs, and opened it up for feedback. The goals I set out are copied below for those who don't want to delve into detail. The commentable document, 'Linking lived experiences of the First World War': possible goals and a bunch of technical questions, goes into more detail.
However, the main point of this post is to publicly thank those who've helped by commenting and sharing on the doc, on Twitter or via email. Hopefully I'm not forgetting anyone, as I've been blown away by, and am incredibly grateful for, the generosity of those who've taken the time to at least skim 1600 words (!). It's all helped me clarify my ideas and find solutions I can start implementing next week. In no order at all – at CENDARI, Jennifer Edmond, Alex O'Connor, David Stuart, Benjamin Štular, Francesca Morselli, Deirdre Byrne; online Andrew Gray @generalising; Alex Stinson @DHKState; jason webber @jasonmarkwebber; Alastair Dunning @alastairdunning; Ben Brumfield @benwbrum; Christine Pittsley; Owen Stephens @ostephens; David Haskiya @DavidHaskiya; Jeremy Ottevanger @jottevanger; Monika Lechner @lemondesign; Gavin Robinson @merozcursed; Tom Pert @trompet2 – thank you all!
Worthy goals (i.e. things I'm hoping to accomplish, with the help of historians and the public; only some of which I'll manage in the time)
Hopefully these personal accounts will provide some context, in their own words, for the lived experiences of WWI. Some of the goals listed are behind-the-scenes work that should just invisibly make personal diaries, letters and memoirs more easily discoverable. The project needs three things: datasets that provide structures to support relationships between people and documents; participatory interfaces for creating or enhancing information about contemporary materials (which feed into those supporting structures); and interfaces that use the data created.
More specifically, my goals include:
- A personal account by someone in each unit linked to that unit's record, so that anyone researching a WWI name would have at least one account to read. To populate this dataset, personal accounts (diaries, letters, etc) would need to be linked to specific soldiers, who can then be linked to specific units. Linking published accounts such as official unit histories would be a bonus. [Semantic MediaWiki]
- Researched links between individual men and the units they served in, to allow their personal accounts to be linked to the relevant military unit. I'm hoping I can find historians willing to help with the process of finding and confirming the military unit the writer was in. [Semantic MediaWiki]
- A platform for crowdsourcing the transcription and annotation of digitised documents. The catch is that the documents for transcription would be held remotely on a range of large and small sites, from Europeana's collection to library sites that contain just one or two digitised diaries. Documents could be tagged/annotated with the names of people, places, events, or concepts represented in them. [Semantic MediaWiki??]
- A structured dataset populated with the military hierarchy (probably based on The British order of battle of 1914-1918) that records the start and end dates of each parent-child relationship (an example of how much units moved within the hierarchy).
- A published webpage for each unit, to hold those links to official and personal documents about that unit in WWI. In future this page could include maps, timelines and other visualisations tailored to the attributes of a unit, possibly including theatres of war, events, campaigns, battles, number of privates and officers, etc. (Possibly related to CENDARI Work Package 9?) [Semantic MediaWiki]
- A better understanding of what people want to know at different stages of researching WWI histories. This might include formal data gathering, possibly a combination of interviews, forum discussions or surveys.
- Trained 'named entity recognition' and 'natural language processing' tools that could be run over transcribed text to suggest possible people, places, events, concepts, etc [this might drop off the list as the CENDARI project is working on a tool called Pineapple (PDF poster). That said, I'll probably still experiment with the Stanford NER tool to see what the results are like]
- A way of presenting possible matches from the text tools above for verification or correction by researchers. Ideally, this would be tied in with the ability to annotate documents.
- The ability to search across different repositories for a particular soldier, to help with the above.
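To make the military hierarchy goal a bit more concrete, here is a minimal Python sketch of how dated parent-child relationships between units might be recorded and queried. It's only an illustration under my own assumptions, not a spec: the `UnitLink` class, the `parent_on` function, and the unit names and dates are all hypothetical placeholders rather than real order of battle data, and in practice this would more likely live as properties in something like Semantic MediaWiki.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class UnitLink:
    """One dated parent-child relationship in the hierarchy (hypothetical model)."""
    child: str
    parent: str
    start: date
    end: Optional[date] = None  # None means the relationship has no recorded end


def parent_on(links: Iterable[UnitLink], child: str, when: date) -> Optional[str]:
    """Return the parent of `child` on the date `when`, or None if unknown."""
    for link in links:
        if link.child != child:
            continue
        # The start date is inclusive and the end date exclusive, so a unit
        # transferred on a given day belongs to its new parent from that day.
        if link.start <= when and (link.end is None or when < link.end):
            return link.parent
    return None


# Placeholder data: invented names and dates, purely for illustration.
links = [
    UnitLink("Battalion X", "Brigade A", date(1914, 8, 1), date(1916, 3, 1)),
    UnitLink("Battalion X", "Brigade B", date(1916, 3, 1)),
]

print(parent_on(links, "Battalion X", date(1915, 1, 1)))
print(parent_on(links, "Battalion X", date(1917, 6, 1)))
```

Keeping the dates on the relationship rather than on the unit means a diary entry written on a known date could, in principle, be matched to whichever parent formation the writer's unit belonged to at that time.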