Using Wikipedia to open up science

The image is a series of drawings showing various parts of a newly discovered animal species
A description of a new species of Brazilian Paraportanus, uploaded by Open Access Media Importer

This post was written by Dr Martin Poulter, Wikimedia UK volunteer and Wikipedian

As part of Open Access Week, I’d like to explore some overlaps between Open Access and what we do in Wikimedia, and end with an announcement that I’m very excited about.

We who write Wikipedia do not expect readers to believe something just because Wikipedia says so. We cite our sources and hope that readers will follow the links and check for themselves. This is a kind of continuous quality control: if readers verify Wikipedia’s sources, then bias and misrepresentation will be winnowed out. However, we do not yet live in that ideal world. A huge amount of research is still hidden behind “paywalls” that charge startlingly high amounts per paper.

Here in the UK, a lot of progress is being made in opening up research, thanks to the policies of major funding bodies including Research Councils UK and the Higher Education Funding Council for England. This is a difficult cultural change for many researchers, but Wikipedia and its sister sites show that a totally open-access publishing system can work. These sites also provide platforms that give the greatest exposure and reuse for open access materials.

Open Access in the Broadest Sense

There is much more to open access than being able to read papers without paying. The OA agenda is about getting the full benefits of research, removing technical or legal barriers that restrict progress. You may sometimes hear about “Budapest” OA, referring to the 2002 declaration of the Budapest Open Access Initiative which said that open access would “accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.”

Open Access is ideally about unrestricted access to all the outputs of research, not just the finished research paper. Can the expert community get hold of the data and run their own analysis to check the conclusions? Can a lecturer use a paper’s figures to make educational materials? If not, it is arguably not open.

Openness is not just about whether you can access research outputs, but whether you can repurpose and reuse them. On Wikipedia, we want to use diagrams with text labels and translate those labels into other languages for our global audience. Some image formats make this easy while others make it difficult. Likewise, researchers do not just want to look at data tables: they want them in a format that can be copied into their software for analysis.

We can also ask for open access to information about the review process: what faults did reviewers identify in the submitted paper, and what editorial changes were made as a result? We could also include open access to measures of impact: the metrics that help to show if a new finding is significant for its field or for public debate.

Metascience, the study of the scientific process, is all but impossible without open access. If we want to test whether different funders of research get different results, we need to mine large amounts of data about research studies. This requires not just the research outputs themselves but data about how, when, and by whom the studies were funded. To study biases in publication, you need to know not only what was published but also what trials have been conducted.

Wikipedia and the Open Agenda

Wikipedia and its sister projects embrace all aspects of “open” in the Budapest sense, not just that readers do not pay. The articles themselves can be copied, analysed, and reused by anyone, for any purpose. An article’s evolution, including any reviews it has gone through, is publicly examinable. Many kinds of data are available: about users’ contributions, about the number of edits, and about the readership of articles. These data give us ways to assess the reach and significance of experts’ contributions to Wikipedia.

For scientists, improving Wikipedia is not just a way to feed public curiosity about their work: it could improve science itself. A team at the Wellcome Trust Sanger Institute in Cambridge have for years been sharing their database of proteins on Wikipedia. Not only does this combine their data with other knowledge about the proteins, but it allows a new audience to improve the database.

Wikimedia sites offer new models for academic publishing. A few weeks ago saw the first peer-reviewed paper to be authored on Wikipedia: a clinical review paper about dengue fever. Among the new challenges for the journal was how to credit the authors for this paper with 1,373 contributors. Alongside this “Wiki-to-Journal publication” there is “Journal-to-Wiki”, exemplified by several articles published on Wikipedia by the journal PLoS Computational Biology.

A software “robot” called the Open Access Media Importer takes photos, diagrams, and video clips from suitable research papers and uploads them to Wikimedia Commons, with full attribution to the original authors and paper. From Commons they can be used to illustrate Wikipedia articles or materials on any other site.

Wikidata, the newest Wikimedia project, has many millions of facts and figures about everything from Ebola virus disease to the Hubble Space Telescope. At Wikimania this summer, Peter Murray-Rust, the University of Cambridge chemist who coined the term “Open Data”, said “Wikidata is the future of science data. […] We [Wikimedians] are going to change the world.”

So there is a rapidly expanding overlap between science and Wikimedia. How will the scientific community – including researchers, educators, publishers, funders, and scholarly societies – keep up? A vital next step is to get people together in the same room: professionals and volunteers; bold innovators and curious newcomers.

This is why Wikimedia UK is working towards holding the first ever Wikipedia Science Conference. This will take place in London in September 2015. It is a chance to explore how all aspects of openness – including open access, open data, open scholarship, and open source software – can transform the world’s understanding of, engagement with, and even practice of science. Details are still being worked out, but we have a long time to prepare and to make this a landmark event. We hope to see you there.