What are research data?
Researchers often ask what constitutes their data. Johns Hopkins University defines research data as “records that would be used for the reconstruction and evaluation of reported or otherwise published results” in its policy on access and retention of research data and materials. Examples include laboratory notebooks, raw numerical experimental results, and instrument outputs.
Storing data in use, archiving completed projects
Storing and backing up research data is, of course, critical during research. However, these actions are not sufficient to ensure the data’s future usability for you and your research community. When ending a research project or project phase such as data collection, consider taking time to prepare an archived copy of your research data. Archiving research data is not simply taking stored data out of active use; it requires a few additional steps:
- protecting data: requiring safeguards and periodic checks of file integrity on storage media
Archiving research data builds upon the storage process, providing for long-term access to the data and preparing the data for deposit into a data repository if desired (see the figure at right).
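The periodic file integrity checks mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed JHU procedure: the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are hypothetical, and the choice of SHA-256 is an assumption. The idea is to record a checksum for every file when the archive copy is made, then recompute and compare the checksums later.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    current = build_manifest(root)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

In practice the manifest would be saved alongside the archive (for example as a JSON file) when the project is archived, and `verify_manifest` would be rerun on a schedule. An empty result means every recorded file is still present and unchanged.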
Advantages to sharing data
Some funding agencies, such as NSF, expect researchers to share their data, but sharing also has many advantages for the scientists themselves. In a 2010 UK study on open data, researchers identified the following benefits to themselves:
- Enhancing visibility of research
- Increasing the efficiency of research due to reusability and exposure
- Enabling researchers to ask new research questions and potentially further science
- Promoting scientific integrity and replication
- Enhancing collaboration and community-building
Restrictions to sharing research data
Not all research data can or should be shared, whether for legal, ethical or practical reasons. Your data management plan should address any restrictions on sharing your research data with others. The table below outlines some of the restrictions to consider. Information on Johns Hopkins University policies, including IRB requirements and intellectual property definitions, can be found on the JHU Policies page.
|Privacy|Information that identifies an individual (e.g., subject to HIPAA or IRB requirements)|
|Confidentiality|Information that should not be shared (e.g., embargo period, trade secret)|
|Security|Threats to something or someone through release of data|
|Intellectual Property|New, intangible creations (e.g., patents, copyright)|
Ways to share research data
Scientists can disseminate their data through various solutions, each with pros and cons to consider. As shown in the figure at right, access to and use of your research data will be facilitated by file sharing services or the use of a data archive. However, these solutions may require more effort than sharing the data upon request. A JHU data management consultant can help you assess your options for a sharing solution.
Data repositories for sharing data
Archived data collections can be shared more easily, whether by direct request or via websites. Alternatively, consider archiving at a data repository to expand the access, discoverability and active management of your data collections. A data repository is a digital system and actively managed service for providing access to data. Repositories vary in their capabilities, but most include the following to varying degrees:
- Providing a web-accessible interface for discovering and downloading research data collections
- Managing preservation of digital objects through measures such as file integrity checking and redundant offsite backups
- Using identifiers, such as DOIs (digital object identifiers), to give datasets persistent location links and citations similar to journal articles
- Describing projects and files, and including documentation sufficient for using the collection without contacting the researcher
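To illustrate how a DOI acts as a persistent link, the sketch below turns a DOI as it is typically written in a data citation (with a `doi:` prefix or an older `http://dx.doi.org/` address) into a link through the `https://doi.org/` resolver. The function name `doi_to_url` is hypothetical, and the sketch assumes a trailing period is citation punctuation rather than part of the DOI. It does not percent-encode unusual characters.

```python
import re

# Prefixes that commonly appear in front of a DOI in citations.
_DOI_PREFIX = re.compile(
    r"^(?:doi:\s*|https?://(?:dx\.)?doi\.org/)", re.IGNORECASE
)


def doi_to_url(text):
    """Return an https://doi.org/ link for a DOI written in a citation."""
    # Assumption: a trailing "." ends the citation and is not part of the DOI.
    doi = _DOI_PREFIX.sub("", text.strip()).rstrip(".")
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {text!r}")
    return "https://doi.org/" + doi


print(doi_to_url("doi:10.3886/ICPSR22480."))
print(doi_to_url("http://dx.doi.org/10.7281/T18G8HM0."))
```

Because the link points at the resolver rather than at a specific web server, it keeps working when the repository moves the dataset's landing page, provided the repository updates the DOI record.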
We have developed guidance for researchers on Selecting a Repository for Data Deposit. You can also search for repositories for your field on the re3data.org website and contact us for assistance in locating a suitable data repository.
- ICPSR: Johnston, Lloyd D., Jerald G. Bachman, Patrick M. O’Malley, and John E. Schulenberg. Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2007 [Computer File]. ICPSR22480-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-10-29. doi:10.3886/ICPSR22480.
- ESIP Federation: Ishikawa, M. 2002. Inventory of Rock Glaciers along the Ghunsa Valley, Kanchanjunga Himal, Eastern Nepal. Boulder, CO: National Snow and Ice Data Center. Digital Media.
- JHU Data Archive: Zhang, Q., Harman, C. J., and Ball, W.P., 2016. Data associated with An Improved Method for Interpretation of Riverine Concentration-Discharge Relationships Indicates Long-Term Shifts in Reservoir Sediment Trapping. Version 1. Johns Hopkins University Data Archive. http://dx.doi.org/10.7281/T18G8HM0.