Data Management

This guide presents policies and resources for managing and sharing research data.

What are research data?

Researchers often ask what constitutes their data. In its policy on access and retention of research data and materials, Johns Hopkins University defines research data as “records that would be used for the reconstruction and evaluation of reported or otherwise published results.” Examples include laboratory notebooks, raw numerical experimental results and instrument outputs.

Storing data in use, archiving completed projects

Storing and backing up research data is, of course, critical during research. However, these actions are not sufficient to ensure the data’s future usability for you and your research community. When ending a research project or project phase such as data collection, consider taking time to prepare an archived copy of your research data. Archiving research data is not simply taking stored data out of active use; it requires a few additional steps:

  • protecting data, which requires safeguards and periodic checks of file integrity on the storage media
  • documenting data so that they can be used and interpreted in the future, especially by others. This includes organizing the data as an identifiable collection with a stable reference.

Archiving research data builds upon the storage process, providing for long-term access to the data and preparing the data for deposit into a data repository if desired (see the figure at right).
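
The periodic file integrity check mentioned above can be sketched in Python as follows. This is a minimal illustration, not a prescribed JHU procedure: it assumes that recording a SHA-256 checksum for each file at archiving time and recomputing it later is an acceptable way to detect corruption. The function names (sha256_of, build_manifest, changed_files) are hypothetical and chosen only for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    problems = []
    for rel_path, expected in manifest.items():
        target = data_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In practice the manifest would be saved (for example as a JSON file) somewhere other than the archived directory itself, and changed_files would be rerun on a schedule; an empty list means every recorded file is still present and unaltered.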

Advantages to sharing data

Some funding agencies, such as NSF, expect research data to be shared, but sharing also has many advantages for the researchers themselves. Some benefits are listed below:

  • Enhancing visibility of research
  • Increasing the efficiency of research due to reusability and exposure
  • Enabling researchers to ask new research questions and potentially further science
  • Promoting scientific integrity and replication
  • Enhancing collaboration and community-building

Restrictions to sharing research data

Not all research data can or should be shared, for legal, ethical or practical reasons. Your data management plan should address any restrictions on sharing your research data with others. The table below outlines some restrictions to consider. Information on Johns Hopkins University policies, including IRB requirements and intellectual property definitions, can be found on the JHU Policies page.

Term                    Definition
Privacy                 Information that identifies an individual (e.g., data covered by HIPAA or IRB requirements)
Confidentiality         Information that should not be shared (e.g., embargo period, trade secret)
Security                Threats to people or property that could result from release of the data
Intellectual Property   New, intangible creations (e.g., patents, copyright)


Ways to share research data

Scientists can disseminate their data through various solutions, each with pros and cons to consider. As shown in the figure at right, access to and use of your research data will be facilitated by file sharing services or the use of a data archive. However, these solutions may require more effort than sharing the data upon request. A JHU data management consultant can help you assess your options for a sharing solution.

Data repositories for sharing data

Archived data collections can be more easily shared, whether by direct request or via websites. Alternatively, consider depositing them in a data repository to expand the access, discoverability and active management of your data collections. A data repository is a digital system and actively managed service for providing access to data. Repositories vary in their capabilities, but most include the following to varying degrees:

  • Providing a web-accessible interface for discovering and downloading research data collections
  • Managing preservation of digital objects, for example through file integrity checking and redundant offsite backups
  • Assigning identifiers, such as DOIs (digital object identifiers), to give datasets persistent links and citations similar to those of journal articles
  • Describing projects and files, and supporting documentation sufficient for using the collection without contacting the researcher

We have developed guidance for researchers on Selecting a Repository for Data Deposit. You can also search for repositories for your field on the website and contact us for assistance in locating a suitable data repository.

Data citation

Citations for research data are important both for giving researchers proper credit for shared research data and for facilitating references to datasets in publications. One advantage of depositing your data into a data repository or archive is that the data often receive a unique identifier (e.g., a DOI, like those for journal articles) that is permanently associated with the data to facilitate proper citation. These data repositories also often create and display a proper data citation so users know exactly how to cite the downloaded data. Although formal data citation formats are still emerging, a number of groups have established guidelines. In general, a data citation contains a title, author, date, distributor, version and locator/identifier; other elements, such as release date and resource type, are also possible. Below are examples of data citations from three different data archives:
  • ICPSR: Johnston, Lloyd D., Jerald G. Bachman, Patrick M. O’Malley, and John E. Schulenberg. Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2007 [Computer File]. ICPSR22480-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-10-29. doi:10.3886/ICPSR22480.
  • ESIP Federation: Ishikawa, M. 2002. Inventory of Rock Glaciers along the Ghunsa Valley, Kanchanjunga Himal, Eastern Nepal. Boulder, CO: National Snow and Ice Data Center. Digital Media.
  • JHU Data Archive: Zhang, Q., Harman, C. J., and Ball, W.P., 2016. Data associated with An Improved Method for Interpretation of Riverine Concentration-Discharge Relationships Indicates Long-Term Shifts in Reservoir Sediment Trapping. Version 1. Johns Hopkins University Data Archive.
For more information on data citation, please see the project website for DataCite and the Digital Curation Centre’s guide on “How to Cite Datasets and Link to Publications”.
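
The citation elements described above (author, date, title, version, distributor and identifier) can be assembled programmatically. The sketch below is illustrative only: the DatasetCitation class and its element ordering are assumptions made for this example, not a format endorsed by DataCite, ICPSR or JHU. It relies on one general fact, that a DOI can be turned into a resolvable link by prefixing it with https://doi.org/. The sample values are taken from the JHU Data Archive example above.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Hypothetical container for the common data citation elements."""
    authors: str
    year: str
    title: str
    distributor: str
    version: str = ""
    doi: str = ""

    def format(self) -> str:
        """Join the elements into one citation string (illustrative layout)."""
        parts = [f"{self.authors} ({self.year}).", f"{self.title}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(f"{self.distributor}.")
        if self.doi:
            # A DOI becomes a persistent link via the doi.org resolver.
            parts.append(f"https://doi.org/{self.doi}")
        return " ".join(parts)


citation = DatasetCitation(
    authors="Zhang, Q., Harman, C. J., and Ball, W.P.",
    year="2016",
    title=(
        "Data associated with An Improved Method for Interpretation of "
        "Riverine Concentration-Discharge Relationships Indicates "
        "Long-Term Shifts in Reservoir Sediment Trapping"
    ),
    distributor="Johns Hopkins University Data Archive",
    version="1",
)
print(citation.format())
```

When a repository displays its own recommended citation, use that text rather than a generated one.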