Wikipedia and Open Access: making research as useful as it can be

This post was written by Martin Poulter, Wikimedian in Residence at Bodleian Libraries, and was first published on Open Access Oxford.

The declaration of the Budapest Open Access Initiative is one of the defining documents of the Open Access movement. It says that free, unrestricted access to peer-reviewed scholarly literature will “accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.”

To bring about this optimistic vision, there needs to be some way to deliver this knowledge directly to everyone on the planet. Rather than broadcasting to passive recipients, that delivery needs to encourage repurposing and remixing of research outputs, so people can adapt them into educational materials, translate them into other languages, extract data from them, and find other uses.

Fifteen years after its creation in January 2001, Wikipedia is emerging as that creative space. Wikipedia is not a competitor to normal scholarly publication, but a bridge connecting it to the world of informal learning and discussion. Wikipedia is only meant to be a starting point: its credibility does not come from its contributors, who are often anonymous, but from checkable citations to reputable sources.

Being “the free encyclopedia” reflects not just that Wikipedia is available without charge, but that it is free for use by anyone for any purpose, subject to the requirements of the most liberal Creative Commons licences. These freedoms are a part of its success. On the article about your favourite topic, click “View history”, then “Page view statistics”: it is not uncommon to see an article on a scientific topic getting thousands of hits per day.
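
For readers who would rather script this check than click through, here is a minimal Python sketch. It assumes the Wikimedia REST API's per-article page-view endpoint and a JSON response containing an "items" list with a "views" count per day; the function names are my own, and the code only builds the URL and averages a response you have already downloaded.

```python
from urllib.parse import quote

# Assumed base URL of the Wikimedia per-article page-view endpoint.
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, start, end, project="en.wikipedia"):
    """Build a daily page-view URL; start and end are YYYYMMDD strings."""
    # Article titles use underscores instead of spaces and must be URL-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return f"{PAGEVIEWS_BASE}/{project}/all-access/user/{title}/daily/{start}/{end}"


def mean_daily_views(payload):
    """Average the 'views' field over the items of a decoded JSON response."""
    items = payload.get("items", [])
    if not items:
        return 0.0
    return sum(item["views"] for item in items) / len(items)
```

Fetching the URL and decoding the JSON is left to whatever HTTP client you prefer; the averaging step works on any dictionary of the assumed shape.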

When a team announced the discovery of a new hominid, Homo naledi, in 2015, the extensive diagrams, fossil photos and other supplements they produced exceeded the size limit set by their first choice of journal, Nature. So they went to the open-access journal eLife. As well as publishing the peer reviews along with the paper, eLife uses a very liberal licence, so figures from the paper could be reused to create a comprehensive Wikipedia article for Homo naledi, and to improve related articles.

There are many more cases where a research paper is adapted into a Wikipedia article which acts as a lay summary. For example, the article on Major Urinary Proteins was written by scientists at the Wellcome Trust Sanger Institute based on, and using figures from, papers they had published in PLOS open-access journals.

Editing Wikipedia used to involve learning a form of markup called “wiki code”. Thanks to some software development, this is no longer necessary. When you register an account, each article presents two tabs: “Edit” and “Edit source”. “Edit source” gives you the old wiki code interface, but “Edit” gives a much more straightforward interface that works like a word processor. Especially handy is the “Cite” button, which can convert a DOI (Digital Object Identifier) into a full citation.
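
The same DOI-to-citation idea can be tried outside Wikipedia. The Python sketch below is not how the “Cite” button itself works; it assumes the content negotiation offered by the doi.org resolver, where an Accept header of text/x-bibliography asks for a formatted reference. The function names are my own, and the DOI in the usage note is a made-up placeholder.

```python
import urllib.request


def citation_request(doi, style="apa"):
    """Prepare (but do not send) a request for a formatted citation."""
    # Assumption: the DOI resolver honours this Accept header and style name.
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": f"text/x-bibliography; style={style}"},
    )


def fetch_citation(doi, style="apa", timeout=10):
    """Send the request and return the citation text (needs network access)."""
    request = citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling fetch_citation("10.1000/example") with a real DOI in place of the placeholder should return a one-line reference, provided the registry behind that DOI supports this kind of request.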

Still, much about Wikipedia is poorly designed and depends on insider knowledge. Luckily there are insiders who are keen to share, and training is available. The Royal Society of Chemistry, Cancer Research UK and the Royal Society are amongst the scientific bodies which have employed Wikimedians in Residence. As Wikimedian in Residence at the Bodleian Libraries, I have run events to improve articles on women in science, and I am celebrating Wikipedia’s 15th birthday by working with researchers and students from the Oxford Internet Institute to improve articles about the “social internet”.

Wikimedia encompasses more than just Wikipedia: it is an ecosystem of different projects handling and repurposing text, data and digital media. There are many sites that you can use without charge to share or build materials, but Wikimedia is distinctive in being a charitable project existing purely to share knowledge, with no adverts or other commercial influences.

Wikimedia Commons is the media archive, hosting photographs, diagrams, video clips and other digital media, along with author and source credits and other metadata. It currently offers just under 30 million files, of which tens of thousands are extracted from figures or supplementary materials from Open Access papers. It’s a massively multilingual site, where each file can have descriptions in many languages, and one of the repurposing activities going on is creating alternative language versions of labelled diagrams.

Wikidata describes itself as “a free linked database that can be read and edited by both humans and machines”. It holds secondary data: not raw measurements, but key facts and figures concluded from them. Looking up Platinum, for example, gives the element’s periodic table placement, various official identifiers and physical properties. Wikidata currently holds knowledge about fifteen million entities, including species, molecules, astronomical bodies and diseases, and that number is still growing rapidly.
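
Because Wikidata is meant to be read by machines as well as humans, the Platinum lookup can also be done in code. This Python sketch assumes the wbsearchentities action of the Wikidata web API and a JSON response with a "search" list; the function names are my own, and the code stops short of making the network call itself.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def search_url(term, language="en"):
    """Build a URL that searches Wikidata entities by label."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def first_match(payload):
    """Return (id, label, description) of the top result, or None if empty."""
    # Assumption: results arrive under the "search" key, best match first.
    results = payload.get("search", [])
    if not results:
        return None
    top = results[0]
    return top.get("id"), top.get("label"), top.get("description")
```

Opening search_url("Platinum") in a browser shows the raw response; the identifier of the top result can then be used to retrieve the full set of statements about the element.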

What’s exciting about Wikidata is the uses it can be put to. Making data about many millions of things freely available enables a new generation of applications for education and reference. Reasonator gives a visually pleasing overview of what Wikidata knows about an entity. Histropedia (histropedia.com) is a tool for building timelines (try “Discoverers of chemical elements”, then zoom in).

There are eleven Wikimedia projects in total, each with its own strengths and flaws. My personal favourites include Wikisource, a library of open access and out-of-copyright text (including, for example, Michael Faraday’s Royal Institution lectures), and Wikibooks, which aims to create textbooks for every level and topic, from ABCs to genome sequencing.

As open access becomes more mainstream, technical and legal barriers around research outputs will diminish, so more research will become “as useful as it can be” through the Wikimedia projects. That benefits the research in terms of impact and public awareness, but it also benefits the end users who, in a connected world, are everybody.
