Another update from my CENDARI Fellowship at Trinity College Dublin, looking at 'In their own words: linking lived experiences of the First World War', which is a small-scale, short-term pilot based on WWI collections. My first post is Defining the scope: week one as a CENDARI Fellow. Over the past two weeks I've done a lot of reading – more WWI diaries and letters; WWI histories and historiography; specialist information like military structures (orders of battle, etc). I've also sketched out lots of snippets of possible functions, data, relationships and other outcomes.
I've narrowed the key goal (or minimum viable product, if you prefer) of my project to linking personal accounts of the war – letters, diaries, memoirs, photographs, etc – to battalions, by creating links from the individual who wrote them to their military unit. Once these personal accounts are linked to particular military units, they can be linked to higher units – from the battalion, ship or regiment to brigade, corps, etc – and to particular places, activities, events and campaigns. The idea behind this is to provide context for an individual's experience of WWI by linking to narratives written by people in the same situation. I'm still working out how to organise the research process of matching the right soldier to the right battalion/regiment/ship so that relevant personal stories are discoverable. I'm also still working out which attributes of a battalion are relevant, how granular the data will be, and how to design for the inevitable variation in data quality (for example, the availability of records for different armies varies hugely). Finally, I’m still working out which bits need computer science tools and which need the help of other historians.
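As a rough illustration of the linking described above, here is a minimal sketch of how accounts, their authors and military units might be connected. The class names, field names and sample records (`Unit`, `Person`, `Account`, `accounts_for_unit`, 'Example Soldier' and so on) are all hypothetical and invented for illustration; they are not part of any existing CENDARI schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: str
    name: str


@dataclass(frozen=True)
class Person:
    person_id: str
    name: str
    unit_id: str  # the unit this person served in (simplified to one unit)


@dataclass(frozen=True)
class Account:
    account_id: str
    kind: str       # e.g. "diary", "letter", "memoir", "photograph"
    author_id: str  # link from the account to the person who created it


def accounts_for_unit(unit_id, people, accounts):
    """Return accounts whose author served in the given unit."""
    author_ids = {p.person_id for p in people if p.unit_id == unit_id}
    return [a for a in accounts if a.author_id in author_ids]


# Hypothetical sample records, for illustration only.
UNITS = [Unit("u1", "Example Battalion"), Unit("u2", "Example Regiment")]
PEOPLE = [
    Person("p1", "Example Soldier", "u1"),
    Person("p2", "Another Soldier", "u2"),
]
ACCOUNTS = [
    Account("a1", "diary", "p1"),
    Account("a2", "letter", "p2"),
    Account("a3", "photograph", "p1"),
]

for account in accounts_for_unit("u1", PEOPLE, ACCOUNTS):
    print(account.account_id, account.kind)
```

The sketch assumes one unit per person, which real service records will often contradict (transfers, attachments, promotions), so a fuller model would probably need dated links between people and units.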
Given the number of centenary projects, I was hoping to find more structured data about WWI entities. Trenches to Triples would be a useful source of permanent URLs, and of terms to train named entity recognition, but am I missing other sources?
There's a lot of content, and so much activity around WWI records, but it's spread out across the internet. Individual people and small organisations are digitising and transcribing diaries and letters. Big collecting projects like Europeana have lots of personal accounts, but they're often not transcribed and they don't seem to be linked to structured data about the item itself. Some people have painstakingly transcribed unit diaries, but they're not linked from the official site, so others wouldn't know there's a more easily read version of the diary available. I've been wondering if you could crowdsource the process of transcribing records held elsewhere, and offer the transcripts back to sites. Using dedicated transcription software would let others suggest corrections, and might also make it possible to link sections of the text to external 'entities' like names, places, events and concepts.
Albert Henry Bailey. Image: Sir George Grey Special Collections, Auckland Libraries, AWNS-19150909-39-5
To help figure out the issues researchers face and the variations in available resources, I'm researching randomly selected soldiers from different Allied forces. I've posted my notes on Private Albert Henry Bailey, service number 13/970a. You'll see that they're in prose form, and don't contain any structured data. Most of my research used digitised-but-not-transcribed images of documents, with some transcribed accounts. It would definitely benefit from deeper knowledge of military history – for a start, which battalions were in the same place as his unit at the same time?
This account of the arrival and first weeks of the Auckland Mounted Rifles at Gallipoli from the official unit history gives a sense of the density and specificity of local place names, as does the official unit diary, and I assume many personal accounts. I'm not sure how named entity recognition tools will cope, and ideally I'd like to find lists of places to 'train' the tools (including possibly some from the 'Trenches to Triples' project).
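One simple way to use a list of places, short of training a full named entity recognition model, is gazetteer matching: scanning text for any name on a known list. The sketch below is only an assumption about how that might look. The function names (`build_matcher`, `find_places`) are hypothetical, and the sample place names and sentence were chosen for illustration rather than taken from Trenches to Triples or the unit history.

```python
import re


def build_matcher(place_names):
    """Compile one regex that matches any name in the gazetteer."""
    if not place_names:
        raise ValueError("gazetteer must contain at least one place name")
    # Try longer names first so "Walker's Ridge" wins over plain "Ridge".
    ordered = sorted(set(place_names), key=len, reverse=True)
    pattern = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + pattern + r")\b", re.IGNORECASE)


def find_places(text, matcher):
    """Return (character offset, matched text) for each gazetteer hit."""
    return [(m.start(), m.group(0)) for m in matcher.finditer(text)]


# Illustrative gazetteer and sentence, not drawn from a real source.
GAZETTEER = ["Anzac Cove", "Walker's Ridge", "Ridge"]
SAMPLE = "The regiment landed at Anzac Cove and moved up to Walker's Ridge."

for offset, name in find_places(SAMPLE, build_matcher(GAZETTEER)):
    print(offset, name)
```

Exact matching like this will miss variant spellings and informal trench names, which is where a trained tool or crowdsourced corrections might still be needed.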
If there aren't already any structured data sources for military hierarchies in WWI, do I have to make one? And if so, how? The idea would be to turn prose descriptions like this Australian War Memorial history of the 27th AIF Battalion, this order of battle of the 2nd Australian Division and any other suitable sources into structured data. I can see some ways it might be possible to crowdsource the task, but it's a big task. Still, it would be worth it – providing a service that lets people look up which higher military units, places, activities and campaigns a particular battalion/regiment/ship was linked to at a given time would be a good legacy for my research.
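To make the idea of a lookup 'at a given time' more concrete, here is a minimal sketch of a date-bounded military hierarchy. Everything in it is an assumption made for illustration: the `Attachment` record, the `parent_on` and `chain_of_command` functions, and the unit names and dates, which are placeholders rather than historical facts.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Attachment:
    child: str            # lower unit, e.g. a battalion
    parent: str           # higher unit, e.g. a brigade
    start: date
    end: Optional[date]   # None means the attachment is open-ended


def parent_on(unit, when, attachments):
    """Return the unit's parent on the given date, or None if unknown."""
    for a in attachments:
        if a.child == unit and a.start <= when and (a.end is None or when <= a.end):
            return a.parent
    return None


def chain_of_command(unit, when, attachments):
    """Walk upwards from a unit, collecting its higher units on a date."""
    chain = []
    seen = {unit}  # guards against accidental cycles in the data
    current = parent_on(unit, when, attachments)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_on(current, when, attachments)
    return chain


# Placeholder data: names and dates are invented for illustration.
ATTACHMENTS = [
    Attachment("Example Battalion", "Example Brigade A", date(1915, 1, 1), date(1916, 6, 30)),
    Attachment("Example Battalion", "Example Brigade B", date(1916, 7, 1), None),
    Attachment("Example Brigade A", "Example Division 1", date(1915, 1, 1), None),
    Attachment("Example Brigade B", "Example Division 2", date(1916, 1, 1), None),
]

print(chain_of_command("Example Battalion", date(1915, 8, 1), ATTACHMENTS))
print(chain_of_command("Example Battalion", date(1917, 3, 1), ATTACHMENTS))
```

Storing each link with its own date range means a battalion that moved between brigades can be represented without rewriting its earlier history, and the same pattern could plausibly be extended to link units to places and campaigns.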
I'm sure I'm forgetting lots of things, and my list of questions is longer than my list of answers, but I should end here. To close, I want to share a quote from the official history of the Auckland Mounted Rifles. The author said he 'would like to speak of the splendid men of the rank and file who died during this three months' struggle. Many names rush to the memory, but it is not possible to mention some without doing an injustice to the memory of others'. I guess my project is driven by a vision of doing justice to the memory of every soldier, particularly those ordinary men who aren't as easily found in the records. I'm hoping that drawing on the work of other historians and re-linking disparate sources will help provide as much context as possible for their experiences of the First World War.
Update, 15 October 2014: if you've made it this far, you might also be interested in chipping in at 'Linking lived experiences of the First World War': possible goals and a bunch of technical questions.