‘An (even briefer) history of open cultural data’ at GLAM-Wiki 2013

These are some of my notes for my invited plenary talk at GLAM-Wiki 2013 (Galleries, Libraries, Archives, Museums & Wikimedia, #GLAMWiki), held at the British Library on April 12-13, 2013.

A (now very) brief history of open cultural data

Firstly, thank you for the invitation to speak… This morning I want to highlight some key moments of change in the history of open cultural data – a history not only of licenses and data, but also of conversations, standards, and collaborations, of moments where things changed… I’ve included key moments from funders, legislative influences and the commercial sector too, as they create the context in which change happens and often have an effect on what’s considered possible. I’ll close by considering some of the lessons learnt.

[Please help improve this talk]

A caveat – there may well be a bias towards the English-speaking world (and to museums, because of my background). If you know of an open GLAM (gallery, library, archive, museum) data source I’ve missed, you can add it to the open cultural data/GLAM API wiki… or Lotte Belice’s timeline of open culture milestones.


‘Open cultural data’ is data from cultural institutions that is made available for use in a machine-readable format under an open licence. But each of the words ‘open’, ‘cultural’ and ‘data’ is slightly more complicated than it looks, so I’ll unpack them a little…


While the degree of openness required to be ‘open’ data can be contentious, at its simplest, ‘open’ refers to content that is available for use outside the institution that created it, whether for school homework projects, academic monographs or mobile phone apps. ‘Open’ may refer to licences that clarify the permissions and restrictions placed on data, or to the use of non-proprietary digital technologies, or ideally, to a combination of both open licences and technologies.

Ideally, open data is freely available for use and redistribution by anyone for any purpose, but in reality there are often restrictions. GLAMs may limit commercial use by licensing content for ‘non-commercial use only’, but as there is no clear definition of ‘non-commercial use’ in Creative Commons licences, some developers may choose not to risk using a dataset with an unclear licence. GLAMs may also release data for commercial use but still require attribution, either to help retain the provenance of the content, to help people find their way to related content or just because they’d like some credit for their work. GLAMs might also release data under custom licences that deal with their specific circumstances, but they are then difficult to integrate with content from other openly-licensed datasets.

Hybrid licensing models are a pragmatic solution for the current environment. They at least allow some use and may contribute to greater use of open cultural data while other issues are being worked out. For example, some institutions in the UK are making lower-resolution images available for re-use under an open licence while reserving high-resolution versions for commercial sales and licensing. Or they may differentiate between scholarly and commercial use, or use more restrictive licences for commercially valuable images and release everything else openly.

I think this type of access is better than nothing, particularly if organisations can learn from the experience and release more data next time. Because these hybrid models are often experimental, their reception is important, and it’s helpful for GLAMs to be able to show they’ve had a positive impact and hopefully helped create relationships with groups like Wikipedia.


Cultural data is data about objects, publications (such as books, pamphlets, posters or musical scores), archival material, etc, created and distributed by museums, libraries, archives and other organisations.


It’s a useful distinction to discuss early with other cultural heritage staff as it’s easy to be talking at cross-purposes: data can refer to different types of content, from metadata or tombstone records (the basic titles, names, dates, places, materials, etc of a catalogue record), to entire collection records (including data such as researched and interpretive descriptions of objects, bibliographic data, related themes and narratives) to full digital surrogates of an object, document or book as images or transcribed text. Some organisations release open metadata, others release all their data including their images. If you can’t do open data (full content or ‘digital surrogates’ like photographs or texts) then at least open up the metadata (data about the content) as e.g. CC0 and the rest with another licence. Releasing data may involve licensing images, offering downloads from catalogue sites; ‘content donations’, APIs and machine-facing interfaces; term lists, etc. Much of the data that isn’t images isn’t immediately interesting, and may be designed for inter-collections interoperability or mashups rather than media commons.

Why is open cultural data important?

Before I go on, why do we care? Open cultural data is the foundation on which many projects can be built. It helps achieve organisational goals and mission; can help increase engagement with content; can create a ‘network effect’ with related institutions; and can be re-used by people who share your goals around access to knowledge and information – people like Wikipedians.

Some key moments in open cultural data

Events I discussed included the founding of Wikimedia, Europeana and Flickr Commons, previous GLAM-Wiki conferences, changes in licences for art images, library catalogue records and museum content, GLAM APIs and linked data services and the launch of the Digital Public Library of America next week.

Lessons learnt

Many of the changes are the results of years of conversation and collaboration – change is slow but it does happen. GLAMs work through slow iterations – try something, and if no-one dies, they’ll try something else. We are all ambassadors, and we are all translators, helping each domain understand the other.

Contradictory things GLAMs are told they must do

  • Give content away for the benefit of all
  • Monetise assets; protect against loss of potential income; protect against mis-use of collections; conserve collections in perpetuity; protect the IP of artists; demonstrate ROI on digitisation

It’s not easy for GLAMs to release all their data under an entirely open licence, but they don’t do it just to be annoying – it’s important to understand some of the pressures they’re under.  For example, GLAMs usually need to be able to track uses of their data and content to show the impact of digitising and publishing content, so they prefer attribution licences.

The issue of potential lost income – imaginary money that could be made one day if circumstances change, or profit that someone else makes off their opened data – is particularly hard to deal with [and here I ad-libbed, saying that it was like worrying about failing to meet the love of your life because you got on a different tube carriage – you can’t live your life chasing ghosts]. Ideally, open data needs to be understood as an input to the creative economy rather than an item on the balance sheet of an individual GLAM.

GLAMs worry about reputational damage, whether appearing on the front page of a tabloid newspaper for the ‘wrong’ reasons, questions being asked in Parliament, or critique from Wikipedians.  Over time, their mindset is changing from keeping ‘our data’ to being holders, custodians of our shared heritage.

Conversations, communities, collaborations

Conversations matter… we’re all working towards the same goal, but we have different types of anxieties and different problems we have to address.

GLAMs are about collections, knowledge, and audiences. Unlike those who work mostly online, they are used to seeing the excitement people experience walking through their door – help GLAMs understand what Wikipedians can do for different audiences by making those audiences real to them. GLAMs are also used to being wined and dined before you lay the hard word on them. Just because you don’t need to ask for permission to use content doesn’t mean you shouldn’t start a conversation with an organisation. There are lots of people with similar goals inside organisations, so try to find them and work with them. Trust is a currency, don’t blow it!

Being truly collaborative sometimes means compromising (or picking your battles) and it definitely means practising empathy. Open data people could stop talking about open data as something you *do* to GLAMs, and GLAMs could stop thinking open data people just want to make their lives difficult.

The role of higher powers

Government attitudes to open data make a big difference and they can also change the risks associated with publishing orphan works.  Governments can also help GLAMs open up their content by indemnifying them against the chance that someone else will monetise their data – consider it not a failure of the GLAM but a contribution to the creative and digital economy.

Things that are better than a poke in the eye with a sharp stick

  1. Kittens (and puppies)
  2. Cultural data that’s available online but isn’t (yet) openly licensed
  3. Cultural data online that is licensed for non-commercial use

Yes, the last two aren’t ideal, but they are a great deal better than nothing.

Into the future…

GLAMs and Wikipedians may move at different paces, and may have different priorities and different ways of viewing the world, but we’re all working towards the same goals. Not everything is as open as it could be, but a lot more is open than it used to be. I sensed yesterday [the first day of the conference] that there are still some tensions between Wikimedians and GLAMers, moments when we need to take a deep breath and put empathy before a pithy put-down, but I loved that Kat Walsh’s welcome yesterday described how Wikipedia used to focus on how it was different from others but now focuses on reaching out to others and figuring out how we’re the same.

GLAMs and Wikipedians have already used open cultural data to make the world a better place. Let’s celebrate the progress we’ve made and keep working on that…

Congratulations to everyone who helped make it a great event, but particularly to Daria Cybulska and Andrew Gray (@generalising) for making everything work so smoothly, and Liam Wyatt (@wittylama) for the original invitation to speak.

Notes from THATCamp Feminisms West #tcfw

I’m just back from ten days in the US where I attended two events, both closely related to digital history, feminist digital humanities and women’s history (whether intellectual, science, education, etc related). I’m posting to mark the moment and to collect some links – I think I’m still digesting the many conversations and moments of insight.

THATCamp Feminisms West #tcfw

A THATCamp is a technology+humanities unconference, a format much loved in the digital humanities world. This one was conceived from a Twitter conversation and organised by the wonderful Jacque Wernimont of Scripps College in Claremont, California for March 14-15. Two other THATCamp Feminisms were held simultaneously in the south and east. I was invited over to do a workshop, and thought ‘data visualisation as a gateway to programming’ would be useful – I prepared two exercises, one of which involved thinking about how to match visualisation types to the structure of the selected content in ManyEyes, while the other was more about learning about how code works by playing with a pre-coded (and heavily, chattily commented) working visualisation that used SIMILE’s JavaScript libraries – ‘view source’ and save the file to your hard drive to get started. It was a good chance to talk about the issues that messy humanities data create for generic visualisation tools, the risks in the ‘truthiness’ of visualisations, the importance of thinking critically about algorithms and issues around primary sources and women’s history, etc, with people who’d thought deeply about some of these issues and could make their own contributions to the workshop.
The day started with the #tooFEW Wikipedia editathon (storify of results), which gave everyone a chance to learn and try out something new before the THATCamp had even officially started. It was a nice way to ease into things and achieve something together before working out the THATCamp programme as a group.
Over the day and a half I went to sessions including Feminist digital pedagogy and Feminist Collaboration. After a week of further travel across the US and another conference, the sessions are blurring into one, but overall they were a great chance to think about what a feminist digital humanities might be like (see for example Transformative Digital Humanities projects or read Toward an Open Digital Humanities from an earlier THATCamp for things to move towards or be careful of), to ask questions like ‘what would a feminist Digging into Data look like?’, to ask ‘does it matter if feminist projects are made with people who don’t share their politics?’ (Probably not, though academic work might be attractive to people who value work/life balance.) What’s the right mix of openness and shared authority, how collaborative can a class be, and how can we help students fail safely in the cause of experimenting (especially when using public technologies like YouTube or Twitter)? It’s important to remember that, as Alex Juhasz said, feminism is about process (or praxis), doing and making things, which in turn made me realise that one reason I value teaching coding is that it gives people DIY tools to make things that suit their own research needs and styles (see ‘Why learning to code isn’t as important as learning to build something’ but please also read Code: Craft and Culture and the comments below it). I also loved Alex’s statement that she’s ‘less interested in feminism that starts from danger than feminism that starts from agency’ and being fearless about taking up space.

The value of meeting in person was an underlying theme of the event, and eventually a conversation about Building a DH Regional Hub, and the difficulties in collaborating between institutions and organising in-person meetings with the huge geographic coverage of the Los Angeles area, led to the invention of Mindr: ‘Grindr for travelling DHers – who’s nearby and what do they want to chat about?’, or as @laurenfklein described it, a ‘geo-aware interface to DH Answers’, an app that lets you know when someone with similar scholarly interests is nearby and might be up for a chat. I would *love* this to actually happen, and who knows, if someone is able to shepherd the enthusiasm for it, it might.

Beyond the value in the discussion, just being surrounded by people who were digitally savvy and were also aware of the effects of implicit biases and tech-as-a-meritocracy, the role of disciplinary gatekeepers, assumptions about gendered work, emotional labour and the pressure to be ‘nice’ as well as the peculiarities of academia was brilliant. It was also a bit intimidating at first as I don’t feel hugely qualified to comment on feminist issues (it’s a long time since I’ve been caught up on theory and ‘feminism’ online has probably made it sound scarier than it really is) unless conversation moved to ‘women in tech’ issues or I could contribute observations on my experience of academia and workplaces in the UK. Perhaps that’s one reason I was encouraged by discussion about possible models of feminist scholarship and mentoring (including asking male allies for help) – I don’t have to figure this out on my own. That said, as Anne Cong-Huyen said:

‘At an event like this one, where we come together to address or at least share about gender and sexual equality in dh and the academy it leaves us to ask: Where does the burden of addressing that inequity fall? […] And how about those of us who are junior faculty, adjuncts, or graduate students (like myself) who have even less power within the academy?’

Or in Amanda Phillips’ words:

In this way, THATCamp Feminisms felt a bit different than other THATCamps I’ve attended. The infectious enthusiasm of DH was tempered here by the political, professional, and market realities that disproportionately affect marginalized communities.

I think it’s important that those realities are widely understood and shared, or some of the promise of the digital humanities will have failed to blossom. Creating space for those hard questions perhaps highlights how positive, supportive and constructive the environment at THATCamp Feminisms West was.  I don’t have a witty or concise conclusion, except to say that I met a bunch of amazing women and came away encouraged and inspired, and you should definitely go to a THATCamp Feminisms if you ever get a chance. Or run one yourself and see what happens. To quote Alex Juhasz again:

To me it is was less the DH, or even the digital, that made this conversation matter, but the feminist: because we shared values, the will and capacity to be critical as well as intellectual while being supportive and trying to distribute authority and voice around the room all the while working, quick.

(For clarity, my personal definition of feminism is something like ‘working to create a world in which the choices available in your life aren’t determined by your gender’ – of course, ideally the same would be true for ethnicity, nationality or class, and they’re all inter-related, and they all work to create a better life for all genders. I shouldn’t have to offer a definition of feminism as ‘equality of opportunity’ but somehow the term has been twisted to mean all sorts of other things, so there you go.)

From Claremont I made my way back to LA, then over to DC, then Philly, catching up with or meeting various ace people before heading to Bryn Mawr for Women’s History in the Digital World, but I’ve run out of time and space so I’ll have to post about that later.

New challenges in digital history: sharing women’s history on Wikipedia – my talk notes

I’m at The Albert M. Greenfield Digital Center for the History of Women’s Education at Bryn Mawr College for the inaugural Women’s History in the Digital World Conference. Since I’m about to speak and ask historians to share their research and write history in public, I thought I should also be brave and share my draft talk notes (which I’ve now updated with formatted references, though Blogger is still re-formatting things slightly oddly).

Introduction: New challenges in digital history: sharing women’s history on Wikipedia

[slide – title, my details]
Hi, I’m Mia. I’m actually doing a PhD on scholarly crowdsourcing, or collaboratively creating online resources, and thinking about the impact of digitality on the practices of historians, so this paper is indirectly related to my research but isn’t core to it.
I proposed this paper as a deliberate provocation: ‘if we believe the subjects of our research are important, then we should ensure they are represented on freely available encyclopedic sites like Wikipedia’. Just in case you’re not familiar with it, Wikipedia is a free online encyclopedia ‘that anyone can edit.’ It contains 25 million articles, over 4 million of them in English, but also in 285 other languages, and has 100,000 active contributors[1].

‘Brilliant Women’ at the National Portrait Gallery

The genesis of this paper was two-fold. The 2008 exhibition ‘Brilliant Women: 18th Century Bluestockings’ at the UK National Portrait Gallery made the point that ‘Despite the fact that ‘bluestockings’ made a substantial contribution to the creation and definition of national culture their intellectual participation and artistic interventions have largely been forgotten’. As a computer programmer, reinventing the wheel and other inefficient processes drive me crazy, and I began to think about how digital publishing could intervene in the cycle of remembering and forgetting that seemed to be the fate of brilliant women throughout history. How could historians use digital platforms to stop those histories being lost and to make them easy for others to find?

[Screenshot – Caitlin Moran quote from How to be a woman: ‘Even the ardent feminist historian, male or female – citing Amazons and tribal matriarchies and Cleopatra – can’t conceal that women have done basically f*ck-all for the last 100,000 years’]
A few years later, by then a brand-new PhD student, I attended the Women’s History Network conference in London in 2011 and learnt of so many interesting lives that challenged conventional mainstream historical narratives of gender. I wished that others could hear those stories too. But when I asked if any of these histories were available outside academia on sites like Wikipedia, there was a strong sense that editing Wikipedia was something that other people did. But who better to make a case for better representation of women’s histories than the people in that room? Who else has the skills, knowledge and the passion? Some academic battles may have been won regarding the importance of women’s histories, but representing women’s histories on the sites where ordinary people start their queries is hugely important. The quote on this slide illustrates why – even if it was meant in jest, it represents a certain world view.

WikiWomen’s Collaborative

[slide – logos from http://en.wikipedia.org/wiki/Wikipedia:WikiWomen%27s_History_Month http://meta.wikimedia.org/wiki/WikiWomen%27s_Collaborative ]
Of course, I’m not the first, and definitely not the most qualified to make this point. I would also like to acknowledge the work of many groups and individuals, particularly within Wikipedia, that’s preceded this.[2]

[slide – Scripps editathon, #tooFEW]
Things move fast in the digital world and we’re at a different moment than the one when I proposed this paper. Gender issues on Wikipedia had been discussed for a number of years but there’s been a recent burst of activity, including the #tooFEW (‘Feminists Engage Wikipedia’) editathons – ‘a scheduled time where people edit Wikipedia together, whether offline, online, or a mix of both’ [3] – held online and in person across four physical sites.[4] [5] I was going to be provocative and ask you to create Wikipedia entries about the histories you’ve invested so much in researching, but some of that is happening already. As a result, this is version 2 of this paper, but my starting question remains the same – assuming we believe that women’s history is important, what’s wrong with our current methods of research dissemination and dialogue?

The case of the Invisible Scholarship

[slide – outline of section]
Cumulative centuries of archival and theoretical work have been spent recovering women’s histories, yet much of this inspiring scholarship might as well not exist when so few people have access to it. Sadly, it’s currently the case that scholarship that isn’t deliberately made public is invisible outside academia. The open access movement, with all its thorny complications, is one potential solution. Engaging in new forms of open scholarship and disseminating research on sites where the public already goes to learn about history is another.

If it’s not Googleable, it doesn’t exist.

[slide – screenshot of unsuccessful search for Ina von Grumbkow]
Most content searches start and end online. The content and links available to search engines inform their assumptions about the world, and they in turn shape the world view presented on the results screen. If the name of a historical figure doesn’t show up in Google, how else would someone find out about them? While college students might be heavy users of Google’s specialist Google Scholar search, it’s unlikely that people would come across it accidentally, not least because there’s a ‘semantic gap’ between the language used in academia and the language used in everyday speech. Writing for Wikipedia means writing in everyday language, and the site is heavily indexed by search engines – it doesn’t take long for content created on Wikipedia – even on a user’s talk page and not the main site – to show up in Google results. So one reason to take history on Wikipedia seriously is that it affects what search engines know about the world.

‘Did you mean… hegemony?’

Search for ‘Viscountess Ranelagh’, Google says ‘Did you mean Viscount’. No. 

[slide – screenshot  of search for ‘Viscountess Ranelagh and the Authorisation of Women’s Knowledge in the Hartlib Circle’, Google says ‘Did you mean Viscount’. No.]
Scholarship and sources contained in specialist online archives and repositories are often off-limits to the Google bots that crawl the web looking for content to index. Because search engines normalise certain assumptions about the world, getting more content about women’s histories in publicly accessible spaces will eventually have an effect on the algorithms that determine suggestions for ‘did you mean’ etc. Contributions to sites like Wikipedia can eventually become contributions to the ‘knowledge graphs’ that determine the answers to questions we ask online.

If it’s behind a paywall, it only exists for a privileged few

[Slide – Screenshot of blocked attempt to access ‘Wives and daughters of early Berlin geoscientists and their work behind the scenes’]
Specialist users will be able to find academic research via Google Scholar, but any independent scholars in attendance will be able to speak to the difficulties in gaining access to journal articles without membership of an institutional library. Journal articles obviously have a lot of value within academic communities, but the research they represent is only available to a privileged few.

Why does Wikipedia matter?

[slide: For some, Wikipedia is the font of all wisdom]
Wikipedia is one of the most visited websites in the world. As one commentator said, ‘people turn to Wikipedia as an objective resource’ but ‘it’s not so objective in many ways.’[6]

However, as the free online encyclopedia ‘that anyone can edit’, it also provides the ability to take direct action to fix the under-representation of women’s history. William Cronon, President of the AHA, said, ‘Wikipedia provides an online home for people interested in histories long marginalized by the traditional academy’[7] – this may not be entirely true yet, but we can hope.

Wikipedia is not yet encyclopedic

[Slide – Ina screenshot]
The English version of Wikipedia has over 4 million articles but it still has some way to go to become truly encyclopedic. Martha Saxton has noted the absence of women’s history content on Wikipedia and was distressed by ‘its superficiality and inaccuracies when present’ [8]. Just as female assistants, secretaries, collectors, illustrators, correspondents, translators, salonists, cataloguers, text book writers, popularisers, explorers, pioneers and colleagues have been left out of traditional academic histories and gradually reclaimed by historians, they are often still invisible on Wikipedia. This may be partly because not enough women edit Wikipedia – as Wikipedia User Gobonobo says, ‘editors often contribute to topics they are familiar with and that concern them […] This systemic bias has the potential to exacerbate an historical record that already gives undue emphasis to men.’ [9]

The under-representation of women’s history undermines Wikipedia’s claim to be encyclopedic. Issues include missing entries or omissions in coverage for existing topics, entries with inaccurate content, a failure to represent a truly ‘neutral point of view’, and a representation of ‘male’ as the default gender.

Many notable women have been buried in pages titled for their husbands, brothers, tutors, etc. In 1908 Ina von Grumbkow undertook an expedition to Iceland. She later made significant contributions to the field of natural history and wrote several books, but other than passing references online and a mention on her husband’s Wikipedia page, her story is only available to those with access to sources like the ‘Earth Sciences History’ journal[10][11].

[Slide: ‘Main articles: List of Fellows of the Royal Society and List of female Fellows of the Royal Society’.]
Some of the categories used in Wikipedia posit the default gender as male. For example, there’s a ‘List of Fellows of the Royal Society’ and a ‘List of female Fellows of the Royal Society’.

Wikipedia and the challenges of digital history

Writing for Wikipedia encapsulates many, but not all, of the challenges of digital history.

New forms of writing

Writing for Wikipedia calls upon historians to write engaging, intellectually accessible, succinct text that still accurately represents its subject. It not only means valuing the work and skills in writing public history, it requires the ability to write history in public.

Writing for a ‘neutral point of view’ – one of the key values of Wikipedia – is challenging for historians. Many may find it difficult to believe that it’s even possible, and it’s difficult to achieve [12].

Unlike traditional historical scholarship, characterised by ‘possessive individualism’ [13] and honed to perfection before publication, Wikipedia entries are considered a work in progress, and anyone who spots an issue is asked to fix it themselves or flag it for others to review.

It won’t advance your career

While it might have a large public impact, editing Wikipedia is work that isn’t credited in academia, and it takes time that could be used for projects that would count for career advancement. More importantly from Wikipedia’s point of view, you can’t promote your own work on the site, so writing about your own research interests is not straightforward if not many people have published in your area of expertise.

“On the internet, nobody knows you’re a professor”

In a comment with ‘pointers for academics who would like to contribute to Wikipedia’ on a Chronicle article, commentator ‘operalala’ said, ‘”On the internet nobody knows you’re a professor.” If you’re used to deferential treatment at your home institution, you’ll be treated like everybody else in the Wide Open Internet.’[14] Or in William Cronon’s words, you must ‘give up the comfort of credentialed expertise’.[15] Anyone can edit, re-shape or even delete your work.

Just like academia, Wikipedia has ways of establishing the credibility and reputation of a contributor, and just like any other community, there are etiquettes and conventions to observe. Claire Potter warns that, as newcomers to the community, academics should not think of Wikipedia as ‘another realm for intellectuals to colonize and professionalize’.[16]

The opportunities and challenges of women’s history as public history on Wikipedia


#WomenSciWP editathon at the Royal Society

Wikipedia uses red links to represent entries that could be created but don’t yet exist. Women’s history editathons often create lists of red-linked names as suggested topics that could be created [17]. Projects on and outside Wikipedia, and events at institutions like the Smithsonian and Royal Society, and just last weekend at three THATCamps across the United States, might be part of a critical mass of people learning how to edit Wikipedia to better include women’s history.

Compared to the lengthy process of writing for academic publication, a new Wikipedia entry can be created in a few hours, allowing for time to structure the content and format the references as necessary to pass the first quality bar. An existing entry can be corrected in minutes. Each editathon or personal edit improves the representation of women’s history, and there’s something very satisfying about turning red links blue.

Ina von Grumbkow’s name red-linked on her husband’s Wikipedia page

Adding the brackets that turn a piece of text into a red link, suggesting the possibility of an entry to be created, is a small but potentially powerful intervention. Red links can render the gaps and silences visible.


Creating or editing entries on women’s history may be relatively easy, but making sure they stay there is less so. There are countless examples of women having to fight to keep changes in as other editors revert them, argue about their choice of sources, or question the significance or notability of their topic. Wikipedians are zealous in preventing spammers and crackpots polluting the quality of the site, which explains some of the rapid ‘nominations for deletion’, but some pockets of the site are also hostile to women’s history or to women themselves.

Saxton said editing Wikipedia is ‘not for the faint of heart’ and ‘a lesson in how little women’s history has penetrated mainstream culture’. There’s work to be done in sharing and normalising an understanding of the historical circumstances and cultural contexts that created difficulties for women. We might know that, as Janet Abbate said, ‘The laws and social conventions of a given time and place strongly shape the kinds of technical training available to women and men, the career options open to them, their opportunities for advancement and recognition’ [18] but until other Wikipedians understand that, there will continue to be issues around ‘notability’. Having those conversations as many times as necessary might be tiring and uncomfortable or even controversial, but it’s part of the work of representing women’s history on Wikipedia.


‘Reliable sources’

Wikipedians may have different definitions of ‘reliable sources’ than scholarly researchers. As one academic discovered:
“Wikipedia is not ‘truth,’ Wikipedia is ‘verifiability’ of reliable sources. Hence, if most secondary sources which are taken as reliable happen to repeat a flawed account or description of something, Wikipedia will echo that.” [19]

The same gatekeepers matter

As some academics have found, ‘Wikipedia differs from primary-source research, from scholarly writing, and how it privileges existing rather than new knowledge’ [20] [21]. Wikipedia is not the place to redress fundamental issues with silences in the archives or in the profession overall, not least because on Wikipedia, primary research is bad and secondary sources are good [22]. This puts the onus back on to traditional academic publishing in peer-reviewed journals and books that can be cited in Wikipedia articles, though other published works such as ‘credible and authoritative books’ and ‘reputable media sources’ can also be cited.


‘A person is presumed to be notable if he or she has received significant coverage in reliable secondary sources that are independent of the subject. […] the person who is the topic of a biographical article should be “worthy of notice” – that is, “significant, interesting, or unusual enough to deserve attention or to be recorded” within Wikipedia as a written account of that person’s life.’ [23] ‘The common theme in the notability guidelines is that there must be verifiable, objective evidence that the subject has received significant attention from independent sources to support a claim of notability.’ [24] This creates obvious difficulties for some women’s histories.

It’s also difficult to judge where ‘notability’ should end. When does focusing on exceptional women become counter-productive? When do we risk creating a new canon? When does it stop being remarkable that a woman became prominent in a field and start being more accepted, if still not expected? [25] At what point should writing shift from individual entries to integration into more general topics?


Sometimes it’s hard to tell whether Wikipedia lags behind academia’s acceptance and general integration of women’s history into mainstream history or whether it is representative of the field’s more conservative corners. Recent digital history projects are doing a good job in explaining some of the issues with key sources for Wikipedia like the Oxford Dictionary of National Biography [26], and I’d hope that this continues. As Martha Saxton said, ‘integrating women’s experience into broad subjects’ is ‘both more challenging intellectually and ultimately, more to the point of the overall project of bringing women into our acknowledged history’. [27]

But it’s also clearly up to us to make a difference. If it’s worth researching the life and achievements of a notable woman, it’s worth making sure their contribution to history is available to the world while improving the quality of the world’s biggest encyclopaedia. And it doesn’t mean going it alone. It’s still just Women’s History Month so it’s not too late to sign up and join one of the women’s history projects, or to plan something with your students. [28] [29] [30]

I’d like to close with quotes from two different women. Executive Director of the Wikimedia Foundation, Sue Gardner: ‘Wikipedia will only contain ‘the sum of all human knowledge’ if its editors are as diverse as the population itself: you can help make that happen. And I can’t think of anything more important to do, than that.’ [31]
And to quote Laura Mandell’s keynote yesterday: ‘Let’s write and publish about each other’s projects so that future historians will have those sources to write about. … Nothing changes through thinking alone, only through massive amounts of re-iteration’. [32]

[Update: based on questions afterwards, you may want to get started with Wikipedia:How to run an edit-a-thon, or sign up and say hello at Wikipedia:WikiProject Women’s History. You could also join in  the Global Women Wikipedia Write-In #GWWI on April 26 (1-3pm, US EST), and they have a handy page on How to Create Wikipedia Entries that Will Stick.

And update April 30, 2013: check out ‘Learning to work with Wikipedia – New Pages Patrol and how to create new Wikipedia articles that will stick‘ by the excellent Adrianne Wadewitz.

Update, June 9: if you’re thinking of setting a class assignment involving editing Wikipedia, check out their ‘For educators‘ and ‘Assignment Design‘ pages for tips and contact points.  June 18: see also Nicole Beale’s ‘Wikipedia for Regional Museums‘.

Update, August 21, 2013: content on Wikipedia appears to have had an additional boost in Google’s search results, making it even more important in shaping the world’s knowledge. More at ‘The Day the Knowledge Graph Exploded‘.

New link, February 2014: Jacqueline Wernimont’s Notes for #tooFEW Edit a thon based on a training session by Adrianne Wadewitz are a useful basic introduction to editing.]


Join in the conversation about Wikimedia @ MW2010

Wikimedia@MW2010 is a workshop to be held in Denver in April, just before the Museums and the Web 2010 conference.  The goal is to develop ‘policies that will enable museums to better contribute to and use Wikipedia or Wikimedia Commons, and for the Wikimedia community to benefit from the expertise in museums’.

If you’ve got stuff you want to say, you can dive right into the conversation – there’s a whole bunch of conversations at http://conference.archimuse.com/forums/wikimediamw2010, including ‘Legal and Business Model Barriers to Collaboration’, ‘Notability Criteria’ and ‘Metrics for Museums on Wikipedia’.

I’m going to be at the workshop and will do my best to represent any issues raised at the meeting.  I think it’s particularly important that we avoid ‘Feeling glum after GLAM-WIKI‘ if we possibly can, so I’d like to go there with a really good understanding of the possible points of resistance, clashes in organisational culture or world view, incompatible requirements or wishlists so that they can be raised and hopefully dealt with during the in-person workshop.  I’d love to hear from you if there are messages you want to pass on.

I’m also thinking about an informal meetup in London to help cultural heritage people articulate some of the issues that might help or hinder collaboration so they can be represented at the workshop – if you’re a museum, gallery, archive, library or general cultural heritage bod, would that be useful for you?

It’s a good week for search engine gossip

Dare Obasanjo quotes Nick Carr as a lead-in to a post on Google’s Assault on Wikipedia:

Clearly Nick Carr wasn’t the only one that realized that Google was slowly turning into a Wikipedia redirector. Google wants to be the #1 source for information or at least be serving ads on the #1 sites on the Internet in specific area. Wikipedia was slowly eroding the company’s effectivenes at achieving both goals. So it is unsurprising that Google has launched Knol and is trying to entice authors away from Wikipedia by offering them a chance to get paid.

What is surprising is that Google is tipping it’s search results to favor Knol. Or at least that is the conclusion of several search engine optimization (SEO) experts and also jibes with my experiences.

After looking at some test cases he concludes:

Google is clearly favoring Knol content over content from older, more highly linked sites on the Web. I won’t bother with the question of whether Google is doing this on purpose or whether this is some innocent mistake. The important question is “What are they going to do about it now that we’ve found out?”

It’s early days for Knol so maybe the placement of Google search results will settle down over time.

Via other links I found confirmation that ‘[f]or years, Google’s link: command (and see here) has deliberately failed to show all the links to a website.’ This is old news, but I missed it at the time, and since I’d always wondered why the link: command never seemed to work properly, I thought it was worth mentioning.

Introducing modern bluestocking

[Update, May 2012: I’ve tweaked this entry so it makes a little more sense. These other posts from around the same time help put it in context: ‘Some ideas for location-linked cultural heritage projects’, ‘Exposing the layers of history in cityscapes’, and a more recent approach, ‘…and they all turn on their computers and say ‘yay!’’ (aka, ‘mapping for humanists’). I’m also including below some content rescued from the ning site, written by Joanna:

What do historian Catharine Macauley, scientist Ada Lovelace, and photographer Julia Margaret Cameron have in common? All excelled in fields where women’s contributions were thought to be irrelevant. And they did so in ways that pushed the boundaries of those disciplines and created space for other women to succeed. And, sadly, much of their intellectual contribution and artistic intervention has been forgotten.

Inspired by the achievements and exploits of the original bluestockings, Modern Bluestockings aims to celebrate and record the accomplishments not just of women like Macauley, Lovelace and Cameron, but also of women today whose actions within their intellectual or professional fields are inspiring other women. We want to build up an interactive online resource that records these women’s stories. We want to create a feminist space where we can share, discuss, commemorate, and learn.

So if there is a woman whose writing has inspired your own, whose art has challenged the way you think about the world, or whose intellectual contribution you feel has gone unacknowledged for too long, do join us at http://modernbluestocking.ning.com/, and make sure that her story is recorded. You’ll find lots of suggestions and ideas there for sharing content, and plenty of willing participants ready to join the discussion about your favourite bluestocking.

And more explanation from modernbluestocking on freebase:

Celebrating the lives of intellectual women from history…

Wikipedia lists bluestocking as ‘an obsolete and disparaging term for an educated, intellectual woman’.  We’d prefer to celebrate intellectual women, often feminist in intent or action, who have pushed the boundaries in their discipline or field in a way that has created space for other women to succeed within those fields.

The original impetus was a discussion at the National Portrait Gallery in London held during the exhibition ‘Brilliant Women, 18th Century Bluestockings’ (http://www.npg.org.uk/live/wobrilliantwomen1.asp) where it was embarrassingly obvious that people couldn’t name young(ish) intellectual women they admired.  We need to find and celebrate the modern bluestockings.  Recording and celebrating the lives of women who’ve gone before us is another way of doing this.

However, at least one of the morals of this story is ‘don’t get excited about a project, then change jobs and start a part-time Masters degree’. On the other hand, my PhD proposal was shaped by the ideas expressed here, particularly the idea of mapping as a tool for public history by, for example, using geo-located stories to place links to content in the physical location.

While my PhD has drifted away from early scientific women, I still read around the subject and occasionally add names to modernbluestocking.freebase.com. If someone’s not listed in Wikipedia it’s a lot harder to add them, but I’ve realised that if you want to make a difference to the representation of intellectual women, you need to put content where people look for information – i.e. Wikipedia.

And with the launch of Google’s Knowledge Graph, getting history articles into Wikipedia then into Freebase is even more important for the visibility of women’s history: “The Knowledge Graph is built using facts and schema from Freebase so everyone who has contributed to Freebase had a part in making this possible.” (Source: this post to the Freebase list). I’d go so far as to say that if it’s worth writing a scholarly article on an intellectual woman, it’s worth re-using your references to create or improve their Wikipedia entry.]

Anyway. On with the original post…]

I keep meaning to find the time to write a proper post explaining one of the projects I’m working on, but in the absence of time a copy and paste job and a link will have to do…

I’ve started a project called ‘modern bluestocking’ that’s about celebrating and commemorating intellectual women activists from the past and present while reclaiming and redefining the term ‘bluestocking’.  It was inspired by the National Portrait Gallery’s exhibition, ‘Brilliant Women: 18th-Century Bluestockings’.  (See also the review, Not just a pretty face).

It will be a website of some sort, with a community of contributors and it’ll also incorporate links to other resources.

We’ve started talking about what it might contain and how it might work at modernbluestocking.ning.com (ning died, so it’s at modernbluestocking.freebase.com…)

Museum application (something to make for mashed museum day?): collect feminist histories, stories, artefacts, images, locations, etc; support the creation of new or synthesised content with content embedded and referenced from a variety of sources. Grab something, tag it, display them, share them; comment, integrate, annotate others. Create a collection to inspire, record, commemorate, and build on.
What, who, how should this website look? Join and help us figure it out.

Why modernbluestocking? Because knowing where you’ve come from helps you know where you’re going.

Sources could include online exhibition materials from the NPG (tricky interface to pull records from). How can this be a geek/socially friendly project and still get stuff done? Run a Modernbluestocking, community and museum hack day app to get stuff built and data collated? Have list of names, portraits, objects for query. Build a collection of links to existing content on other sites? Role models and heroes from current life or history. Where is relatedness stored? ‘Significance’ - thorny issue? Personal stories cf other more mainstream content? Is it like a museum made up of loan objects with new interpretation? How much is attribution of the person who added the link required? Login v not? Vandalism? How to deal with changing location or format of resources? Local copies or links? Eg images. Local don’t impact bandwidth, but don’t count as visits on originating site. Remote resources might disappear – moved, permissions changed, format change, taken offline, etc, or be replaced with different content. Examine the sources, look at their format, how they could be linked to, how stable they appear to be, whether it’s possible to contact the publisher…

Could also be interesting to make explicit, transparent, the processes of validation and canonisation.

Move your FAQ to Wikipedia?

Mal Booth from the Australian War Memorial (AWM) makes the fascinating suggestion: they should move their entire Encyclopaedia to Wikipedia. Their encyclopaedia seems to function as a fully researched and referenced FAQ with content creation driven by public enquiries, and would probably sit well in Wikipedia.

In Wikipedia and “produsers”, Mal says:

“Putting the content up on Wikipedia.org gives it MUCH wider exposure than our website ever can and it therefore has the potential to bring new users to our website that may not even know we exist (via links in to our own web content). With a wikipedia.org user account, we can maintain an appropriate amount of control over the content (more than we have at present over wikipedia content that started as ours, already put up there by others).

Another point is that putting it up on Wikipedia allows us to engage the assistance of various volunteers who’d like to help us, but don’t live locally.”

He also presents some good suggestions from their web developer, Adam: they should understand and participate in the Wikipedia community, and identify themselves as AWM professionals before importing content. I think they’ve taken the first step by assessing the suitability of their content for Wikipedia.

It’s also an interesting example of an organisation that is willing to ‘let go’ of their content and allow it to be used and edited outside their institution. Mal’s blog is a real find (and I’m not just saying that because it has ‘Melbin’ (Melbourne) in the title), and I’ll be following the progress of their project with interest.

I wonder how issues of trust and authority will play out on their entries: by linking to the relevant Wikipedia entries, the AWM is giving those entries a level of authority they might not otherwise have. They’re also placing a great deal of trust in Wikipedia authors.

Mal links to a post by Axel Bruns, Beyond Public Service Broadcasting: Produsage at the ABC, and summarises the four preconditions for good user-generated content:

  • the replacement of a hierarchy with a more open participatory structure;
  • recognising the power of the COMMUNITY to distinguish between constructive and destructive contributions;
  • allowing for random (granular, simple) acts of participation (like ratings); and
  • the development of shared rather than owned content that is able to be re-used, re-mixed or mashed up.

Adam’s post lists key principles that anyone “looking to develop successful and sustainable participatory media environments” should take into account. These points are defined and expanded on in the original post, which is well worth reading:

  1. Open Participation, Communal Evaluation
  2. Fluid Heterarchy, Ad Hoc Meritocracy
  3. Unfinished Artefacts, Continuing Process
  4. Common Property, Individual Rewards

User-generated content and the general public vs invited experts

We’ve been having discussions at work about the promises and challenges of user-generated content. In that light, this article is quite timely:

“The estranged founder of Wikipedia, the online encyclopaedia written entirely by members of the public, is to launch a rival that he says is less likely to be riddled with errors.
Larry Sanger says that vast swaths of the anarchic encyclopaedia he helped create in 2001 are in desperate need of an editor – and that is what he is promising for his new project.

Mr Sanger has begun signing up academics furious at the mistakes and generalisations they find on Wikipedia’s articles on their specialist subjects, and vowed to give these experts a special role to shape articles on Citizendium.org.”