Data sharing and sound data management are essential to making research data verifiable. Although several government and funding agencies are imposing stricter mandates on the inclusion of data sharing plans in grant applications, access to research data remains a challenge for the scholarly community. The publishing community is trying to reduce the practical hurdles to data sharing by raising awareness and supporting infrastructure development. With these efforts in place, it is crucial to understand how researchers view data sharing.
To understand how researchers perceive data sharing globally, and to form a clear view of data sharing practices across disciplines, Wiley conducted a survey earlier this year. The survey was sent to around 90,000 researchers across the world in the health sciences, life sciences, physical sciences, social sciences, and humanities. However, only 2,886 researchers (3.2%) responded, of whom only 52% reported sharing data.
What data do researchers share?
- 82% of the respondents produce data in spreadsheets and CSV files
- Only 12% reported creating relational databases
- Other kinds of data that are shared include 2D and 3D images, executable code/models, transcripts from interviews, video/audio recordings, etc.
- The sizes of the shared files are relatively small – over 60% are less than 10GB.
How do researchers share data?
- 67% of those who have reported sharing data do so as supplementary material in journals.
- 28% store their data in an institutional repository.
- Only 19% use a discipline-specific data repository.
- A meagre 6% use a general-purpose data repository, such as Dryad or figshare.
- Many researchers report sharing data in informal, often impermanent ways that would not meet formal requirements, such as sharing at a conference (57%) and sharing on request via email, direct contact, etc. (42%).
- A staggering 37% say they are using a personal, institutional, or project website to share data – again, unlikely to meet any data sharing mandates, and certainly not the best way of ensuring any kind of long-term preservation of the data.
The main reasons for not sharing data:
- Not a funder requirement
- Concerns about being scooped
- Possible misuse or misinterpretation of data
- IP/confidentiality issues, especially in health sciences
Motivations to share data in different disciplines:
- In the physical sciences, social sciences, and humanities, the main motivation for data sharing is to increase the visibility and impact of their work.
- In the life sciences, researchers would feel motivated to share data if they get some kind of credit or attribution for it.
- In the health sciences, concerns about privacy and ethical issues around data sharing adversely affect motivation levels.
So what steps should the scholarly community adopt to improve data sharing practices?
The survey reveals that publishers receive a regular inflow of research data in the form of supplementary material. Therefore, making data archiving a publisher requirement would improve data sharing practices. Second, since the lack of funder requirements was cited as a major reason for not sharing data, governments and other funding bodies should more actively require researchers to submit data management plans. Additionally, researchers need to be reassured about the safety and confidentiality of data storage mechanisms. Continued support for and investment in infrastructure such as discipline-specific and general repositories will provide them with safe and permanent ways to store their data. Funders and institutions should also look for ways to give researchers credit or attribution for sharing data.
What is required at this juncture is organized action by the wider scholarly community. Funders, institutions, publishers, and learned societies should come together to formulate data-citation standards and best practices. They should also encourage researchers to share data by creating a clear system of incentives and by being transparent about the different forms of data sharing and their benefits. This will, hopefully, help develop a culture of data sharing among researchers.