Is data sharing the right step towards open science?
Since its inception, the objective of science has been human progress through knowledge sharing. The default direction of science has been towards making research freely available to all, irrespective of their scientific and academic background, unless there are any legal or ethical ramifications. Open access is modeled in line with this ideology wherein both researchers and the general population can access scientific knowledge without any restrictions.
A previous post discussed what academicians think of open access. For years, some have argued that a published paper without the underlying data is of limited use. Keeping this consideration central, PLOS recently announced that it is “releasing a revised Data Policy that will come into effect on March 1, 2014, in which authors will be required to include a data availability statement in all research articles published by PLOS journals.” What are the implications of this policy?
Let’s take a look at some important aspects of this announcement:
Reason: PLOS believes that “Data availability allows replication, reanalysis, new analysis, interpretation, or inclusion into meta-analyses, and facilitates reproducibility of research, all providing a better ‘bang for the buck’ out of scientific research.” Transparency and easy availability of scientific knowledge are being touted as the main reasons behind this new policy.
Requirements: Researchers would need to submit a “minimal dataset” at the time of manuscript submission. This includes any metadata or additional data that would be needed to replicate the reported findings. PLOS clarifies that researchers do not need to submit all of the raw data collected as part of the research, but only the underlying data that is vital and relevant to the paper. The data can be included in the manuscript or provided as supporting information; if the dataset is stored in a public repository, the authors must provide the details needed to access it.
Exceptions: Some data, such as private patient data or specific information relating to any fossil deposits or endangered species, cannot be shared due to ethical and legal restrictions. Additionally, data obtained from third parties cannot be shared freely. In such cases, the authors need to indicate that the data would be made available upon request.
This announcement has sparked many debates and discussions among academicians. While some side with the policy on account of the transparency it will lend to science, others are staunchly against it. Here are some pros of this policy:
1. Reproducibility of research: Scientific progress is possible only through knowledge sharing and building on that knowledge. Hence, if data is freely accessible along with the findings and observations, researchers can re-use it in novel ways, attest to its quality, replicate the study, etc.
2. Rapid scientific progress: The ready availability of data might help scientists working on similar projects. If data sharing fills any gaps in their ongoing research, they will be able to progress faster, which in turn would hasten scientific progress.
3. Data management: Authors would need to either submit their data with the paper as downloadable supporting information or store it in a public repository and provide an accession number or digital object identifier (DOI) for each dataset. Either way, the data would be easier for authors to manage and retrieve for future reference. Large datasets can also become easier to use if they are stored appropriately.
4. Greater opportunities: When data is stored and shared through public data archiving, it can lead to more citations and thus improve a researcher’s performance on altmetrics (alternative metrics). Moreover, data sharing can increase the chances of co-authorship.
Roli Roberts, an associate editor at PLOS, mentions in a blog post that the underlying data is central to a paper, as papers mainly analyze the data (only a limited amount of which is presented) and discuss conclusions. According to him, scientific papers contain very little raw data because they follow the format of the past, when papers had space constraints. Now, however, papers are mostly published and read online, so data can be provided by directing readers to public repositories and other online resources. Roberts thus feels that data sharing should be the future of science.
However, some academicians are not in favor of the data sharing policy, for the following reasons:
1. Fear of being “scooped”: The competition to win grants and academic positions and to be the first to publish research is very intense. Hence, many are skeptical about making their data available to all, as others may use it as they please. Losing priority and ownership of their entire dataset is an uncomfortable idea for many researchers.
2. Repercussions for researchers in developing countries: In countries that do not invest as much in research, “data acquired are like gold.” Researchers work with limited funds, and thus, they want to extract as many publications as possible from their data. Making it public would leave them with few options for reusing it.
3. Effects on the peer review process: A number of researchers have questioned whether peer reviewers would actually look through the dataset at the time of review. They have also raised concerns about how this can affect the timeline of the peer review process and whether it would really help to scrutinize the data.
4. Costs of data sharing: Storing data in public repositories can be expensive. In addition, in cases where data cannot be shared directly for ethical reasons, researchers might need to modify the data, which can cost a lot of time and money.
Interestingly, PLOS’s policy seems to uphold Barack Obama’s declaration last year that federally funded research, along with the supporting data, should be made available to all. In the last few decades, the general public has become increasingly curious about science and is keen on following its progress. In keeping with this, PLOS’s policy intends to lend more transparency to science. This, again, is a matter of concern for those researchers who believe that data should be shared universally only once it’s an accomplished fact.
It remains to be seen whether authors will be willing to part with their data along with their submissions to PLOS, particularly authors from countries where research funding is not a priority. However, most academicians agree that if this policy comes into effect, it will be a huge step towards open science.
It would be interesting to know your views on this issue. Do you think researchers will embrace this change? Would this policy affect PLOS’s output and the diversity of PLOS’s authorship?