Data Management and Sharing

This guide gathers overviews and resources for data management and sharing, organized by the research data workflow: preparing data management and sharing plans for grant proposals, conducting research, and sharing research data.

Best Practices for Data Management

Workshops offered by JHU Data Services

 
Best Practices for Research Data Management and Sharing (live webinar)

Effective data management can increase the pace of the research process, contribute to the soundness of research results, and meet funding agency requirements by making research data easy to share. Join us for an overview of best practices including backup procedures, tips on effective file names, data security and access controls, and data documentation/metadata. This seminar is for faculty, postdoctoral researchers, and graduate students from all disciplines. This course does not focus on creating or using any particular data collection or analysis tool (e.g. REDCap, SPSS), but discusses data management at a general level.

Documenting Your Research Data (self-paced online modules)

Documenting your research data is a prerequisite for data sharing and your own use of your data in the future. Good documentation helps your data be discoverable, understood, and trusted by others. Please view our individual modules for the training “Documenting Your Research Data” to learn documentation best practices by subtopic (e.g., code, tabular data, using documentation standards). Resources in these modules are also available in the Guide for Documenting Research Data.

Online Training offered by other organizations

 

COURSERA Research Data Management and Sharing Online Course

A 5-week online data management course on COURSERA, offered by the University of North Carolina at Chapel Hill and the University of Edinburgh. It is a free course, but a fee is attached if you require a certificate of completion. This course will cover the following topics: Understanding research data, Data management planning, Working with data, Sharing data, and Archiving data.

COURSERA Data Management for Clinical Research Online Course

A 6-week online data management course for people who conduct clinical research on COURSERA, offered by Vanderbilt University. It is a free course, but a fee is attached if you require a certificate of completion. The first part of the course covers best practices for clinical data management, followed by a demonstration of using REDCap to design an Electronic Data Capture (EDC) system. The course ends with data management practices in different fields, such as neuroimaging data management, mHealth in developing countries, and data management for multi-center studies.

ESIP Data Management Short Course for Scientists

A data management short course developed by the Federation of Earth Science Information Partners (ESIP), in cooperation with NOAA and the Data Conservancy. The following topics are included in this course: Data stewardship, Data management plans, Local data management, and Responsible data use.

Data Management and Visualization Training by the Stanford University Center for Ocean Solutions

Online training focused on data management and visualization. Topics include best practices for data management, data cleaning, data visualization with Tableau, and storytelling with datasets.

Data Management and Preservation Education and Training List by the Geospatial Data Preservation Resource Center

The Geospatial Data Preservation Resource Center compiled a list of educational resources for managing and preserving geospatial data.

Data Carpentry

Data Carpentry develops and offers workshops that teach researchers the fundamental data skills needed to conduct research. Workshops are domain-specific, and current workshops cover the following disciplines: Ecology, Genomics, Geospatial data, Social Science, and Biology. You can request a workshop for your institution, or attend an upcoming one at your institution. Also, see Software Carpentry’s workshops for teaching basic software skills to researchers.

New England Collaborative Data Management Curriculum (NECDMC)

An instructional tool for instructors to develop a research data management course for researchers. A detailed lesson plan and materials for seven data management modules are included in this curriculum. Sample class activities, research cases, and data management plans are also available.

Glossary of Data Management Terms

Definitions of common terms found on our website and guides, with links to other data science term lists compiled by Cornell University Library.

Best Practices for Data Management

 
Best Practices for Research Data Management by Johns Hopkins Institute for Clinical and Translational Research (ICTR)

A best practices guide that provides resources for data management and sharing at Johns Hopkins University. Topics include data management planning, compliance, data sharing, data quality, security, and backup and archiving.

Data Management Tools and Software

Services and Applications that provide support for geospatial data management 

A list of geospatial data management tools and software resources created by the Geospatial Data Preservation Resource Center.

RDMkit

The RDMkit is a research data management toolkit for the life sciences, including best practices and guidelines to make researchers' data FAIR (Findable, Accessible, Interoperable, and Reusable). You can browse these resources by research life cycle stage, your role, task, domain, tool assembly, or national resources.

File Naming

Managing Filename Metadata for Sharing Next-Generation Sequencing (NGS) Data

A descriptive filename can serve as metadata for research data: it tells readers what a file contains, so naming research files properly matters. This lesson teaches researchers how to give their genomic data good file names as part of the documentation.
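To illustrate how a filename can carry metadata, here is a minimal Python sketch that reads fields back out of a descriptive name. The convention it assumes (`<project>_<sample>_<assay>_<YYYYMMDD>_v<NN>.<extension>`), the function name `parse_filename`, and the example filename are all hypothetical and are not taken from the lesson above; adapt the pattern to whatever convention your group agrees on.

```python
import re
from datetime import date

# Hypothetical convention (an assumption, not taken from the lesson):
#   <project>_<sample>_<assay>_<YYYYMMDD>_v<NN>.<extension>
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_"
    r"(?P<sample>[A-Za-z0-9]+)_"
    r"(?P<assay>[A-Za-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})"
    r"\.(?P<extension>[A-Za-z0-9.]+)$"
)


def parse_filename(name: str) -> dict:
    """Return the metadata fields encoded in a filename that follows the assumed convention."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the assumed convention")
    fields = match.groupdict()
    raw = fields["date"]
    # date() raises ValueError for impossible dates such as month 13.
    fields["date"] = date(int(raw[:4]), int(raw[4:6]), int(raw[6:]))
    fields["version"] = int(fields["version"])
    return fields


if __name__ == "__main__":
    print(parse_filename("mouseliver_S012_rnaseq_20240115_v02.fastq.gz"))
```

Because every field sits in a fixed position, the same pattern can be used to build a file inventory or to flag files that drift from the convention.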

Batch Renaming Software

Both Windows and Mac offer built-in options to batch rename files. There are also software applications, such as Rename-It! and Bulk Rename Utility, that rename multiple files at once.
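If you prefer a script to a desktop application, batch renaming can also be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not a replacement for the tools named above: the function names `plan_renames` and `apply_renames`, the `prefix_001.ext` numbering scheme, and the folder name in the usage note are assumptions made for the example. It previews changes by default so nothing is renamed until you ask for it.

```python
from pathlib import Path


def plan_renames(folder: Path, prefix: str, pattern: str = "*") -> list[tuple[Path, Path]]:
    """Pair each visible file in the folder with a numbered new name such as prefix_001.csv."""
    files = sorted(
        p for p in folder.glob(pattern)
        if p.is_file() and not p.name.startswith(".")
    )
    # Pad the counter with leading zeros so names sort correctly.
    width = max(3, len(str(len(files))))
    plan = []
    for index, old in enumerate(files, start=1):
        new = old.with_name(f"{prefix}_{index:0{width}d}{old.suffix.lower()}")
        plan.append((old, new))
    return plan


def apply_renames(plan: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print the planned renames, or perform them when dry_run is False."""
    # Refuse to overwrite anything: stop before renaming if a target name is taken.
    for old, new in plan:
        if new != old and new.exists():
            raise FileExistsError(f"{new} already exists")
    for old, new in plan:
        if dry_run:
            print(f"{old.name} -> {new.name}")
        else:
            old.rename(new)
```

A typical session would call `apply_renames(plan_renames(Path("raw_data"), "survey"))` to review the preview, then repeat the call with `dry_run=False`. Work on a copy of the folder first, since renames are not easily undone.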

Best Practices for File Naming
  • Using File Naming to Organize Research Files: An interactive online training module about naming research files properly. This module was created by JHU Data Services.
  • Data Best Practices and Case Studies by Stanford University: This Library Guide offers best practices for naming research files and some case studies with good and bad file names. 
  • File Naming Convention Worksheet by Caltech Library: This worksheet helps researchers come up with their own file naming convention as part of the documentation.
  • File Names Checklist by Harvard University: A checklist to help researchers come up with good file names and to examine if they have good file names. 
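A file name checklist can also be applied automatically. The sketch below checks names against a few commonly recommended rules (no spaces, no special characters, a reasonable length, and a sortable date). These rules, the 64-character limit, and the function name `check_filename` are assumptions chosen for illustration; they are not copied from any of the checklists listed above, so substitute the rules your own convention defines.

```python
import re

MAX_LENGTH = 64  # Assumed limit for this example; choose your own.


def check_filename(name: str) -> list[str]:
    """Return a list of problems found in a file name; an empty list means it passed."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    # Anything other than letters, digits, dot, underscore, hyphen, or space.
    if re.search(r"[^A-Za-z0-9._\- ]", name):
        problems.append("contains special characters")
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    # Look for an eight-digit date, with or without hyphens, not embedded in a longer number.
    if not re.search(r"(?<!\d)\d{4}-?\d{2}-?\d{2}(?!\d)", name):
        problems.append("no YYYYMMDD or YYYY-MM-DD date")
    return problems


if __name__ == "__main__":
    for candidate in ["final data (new).xlsx", "survey_wave2_20240115_v01.csv"]:
        print(candidate, check_filename(candidate))
```

Running a check like this over a project folder before sharing data gives a quick list of files that need attention.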

Collaboration Tools

Secure Analytic Framework Environment (SAFE) Desktop

A virtual desktop that provides a secure environment for researchers to work remotely with their clinical data and use software applications. Researchers can also use SAFE to share data securely with their team members and/or collaborators. The Basic SAFE is free and includes access to a virtual desktop, 100 GB of storage, and licensing for SAS and Stata. An additional fee applies if you require more storage or licenses for other software. To request access to an existing SAFE folder, please submit the request using the SAFE Desktop Request Form. The SAFE Desktop is managed and supported by IT@JH. Please contact the IT Help Desk at 410-955-HELP with any technical issues.

Microsoft Teams at Johns Hopkins

Microsoft Teams is a collaboration tool available to all JHU users with a valid JHED account. It is fully integrated with Microsoft Office 365 and supports group chats, file sharing, and online meetings. 

Open Science Framework (OSF)

The Open Science Framework (OSF) is a free and open source web application built by the Center for Open Science to aid researchers in managing the entire scientific research workflow. The user interface is designed to allow you to manage multiple projects from a single dashboard.

Data Security

IT Services at JHU

Services from IT@JH

Information Technology @ Johns Hopkins (IT@JH) offers a variety of virtual computing, storage, back-up and web hosting services for JHU researchers. Contact them directly or through the departmental IT representatives.

JHU Information Security Institute (JHUISI)

JHUISI is JHU’s focal point for research and education in information security, assurance, and privacy. It offers graduate degree programs and online programs for students who are interested in information security. Faculty research focuses on two major areas: cryptography and privacy, and health and medical security.

Johns Hopkins Secure File Transfer (SFTP)

A secure data transfer option operated by the Systems and Application Support Services team at JHU. It can be used to transfer data securely to the JHU network, both internally and externally.

Privacy and Security for PHI Data

JHU Privacy Office

The JHU Privacy Office helps JHU researchers who work with PHI data comply with state and federal healthcare privacy laws, including HIPAA. Its website provides forms, policies, procedures, guidance, and other information related to protecting patients’ privacy. In addition, any data breach should be reported to the JHU Privacy Office for review. The policy on Privacy and Protection of Sensitive Information sets the standards for protecting sensitive information, including Personally Identifiable Information (PII) and Protected Health Information (PHI).

HIPAA Security Rule

The HIPAA Security Rule establishes national standards to protect individuals’ electronic personal health information that is created, received, used, or maintained by a covered entity. A summary of the HIPAA Security Rule is available here and combined text rules can be found here.

NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy

A best practice guideline for information technology professionals whose institutions have researchers with access to controlled-access human genomic and phenotypic data under the NIH Genomic Data Sharing policy. It is intended for Chief Information Officers (CIOs), Information Systems Security Officers (ISSOs), and operating staff in a research group.

Clinical Data Management

Online Training

COURSERA Data Management for Clinical Research Online Course

A 6-week online data management course for people who conduct clinical research on COURSERA, offered by Vanderbilt University. It is a free course, but a fee is attached if you require a certificate of completion. The first part of the course covers best practices for clinical data management, followed by a demonstration of using REDCap to design an Electronic Data Capture (EDC) system. The course ends with data management practices in different fields, such as neuroimaging data management, mHealth in developing countries, and data management for multi-center studies.

Data Science at NIH

Resources on NIH data science-related events and news, including information about NIH’s Big Data to Knowledge (BD2K) initiative and NIH Commons. The BD2K Training Coordinating Center offers resources and tools to help biomedical researchers navigate the data science field. The BD2K Guide to the Fundamentals of Data Science series provides a basic understanding of data science for biomedical researchers.

The Foundations of Biomedical Data Science by the University of Virginia

This seminar series covers the basics of data management, representation, computation, statistical inference, data modeling, and other topics related to biomedical big data.  

JHU Organizations

Institute for Clinical and Translational Research (ICTR)

The ICTR provides clinical resources, consulting services, funding, and training for JHU clinical researchers. Submit a request here if you need ICTR services, or visit their training and resources pages to find out more about how ICTR can help with your clinical research. Here is a list of JHU resources that ICTR provides to help researchers apply best practices for clinical data management.

The Johns Hopkins Biostatistics Center

The JHU Biostatistics Center provides consulting on biostatistical issues related to the effective collection and interpretation of health information, including research design, professional and scientific report writing, and statistical analysis.

The Biostatistics, Epidemiology and Data Management (BEAD) Core

The BEAD Core at the School of Medicine provides a range of consulting and support services covering study design and analysis, database development, and survey design review. Researchers benefit most when they consult BEAD before data collection begins. Check out their past seminars to learn more about the BEAD Core.

Research Electronic Data Capture (REDCap) 

REDCap is a mature, secure web application for building and managing online surveys and databases. Reach out to redcap@jhu.edu for more information or sign up for one of their Zoom-in clinic sessions if you need help with REDCap. REDCap Training Central is for people who are interested in using REDCap and want to learn more about it. 

Clinical Data Security

JHM Data Risk Tiers

The JHU Data Trust Research Data Subcouncil provides resources to help researchers determine the risk level of their JHM data.

Secure Analytic Framework Environment (SAFE) Desktop

A virtual desktop that provides a secure environment for researchers to work remotely with their clinical data and use software applications. Researchers can also use SAFE to share data securely with their team members and/or collaborators. The Basic SAFE is free and includes access to a virtual desktop, 100 GB of storage, and licensing for SAS and Stata. An additional fee applies if you require more storage or licenses for other software. To request access to an existing SAFE folder, please submit the request using the SAFE Desktop Request Form. The SAFE Desktop is managed and supported by IT@JH. Please contact the IT Help Desk at 410-955-HELP with any technical issues.

HIPAA Security Rule

The HIPAA Security Rule establishes national standards to protect individuals’ electronic personal health information that is created, received, used, or maintained by a covered entity. A summary of the HIPAA Security Rule is available here and combined text rules can be found here.