Documenting Research Data

This guide is a collection of resources shared in Data Services' "Documenting Your Research Data" online modules.

JHU Data Services

We are here to help you find, use, manage, visualize and share your data. Contact us to schedule a consultation. View and register for upcoming workshops. Visit our website to learn more about our services.

Overview of medical research documentation

Medical and health research has distinct requirements for documentation, ranging from standardized metadata for device interoperability to protocols and standard operating procedures for clinical trials. Good documentation facilitates collaboration among a broad range of specialized roles, translational research that applies results to treatments, and compliance with regulations that protect patients. These resources are for anyone conducting or supporting clinical, biomedical, or public health research, with an emphasis on planning to share data with collaborators and biomedical communities. They accompany our online module on documenting medical research data.

Medical metadata standards

Metadata is information associated with datasets and data files that provides context. See our Metadata and Metadata Standards module.

Metadata standards examples and resources:

CDE Repository (Common Data Elements)

Observational Health Data Sciences and Informatics (OHDSI) OMOP Common Data Model

  • The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is an open community data standard for standardizing the structure and content of observational data for health and other research to enhance analysis and outcomes. The OMOP CDM uses the OHDSI standardized vocabularies, which organize and standardize medical terms across the clinical domains of the OMOP Common Data Model and enable standardized analytics and information databases. Johns Hopkins Medicine's Precision Medicine Analytics Platform (PMAP) is implementing the OMOP model for its billions of medical records to facilitate evidence-based research. Read more about the OMOP Common Data Model. Read more about OHDSI's standardized vocabularies.
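As a rough illustration of how standardized vocabularies work, the sketch below maps a local source code to a standard concept by following a "Maps to" relationship. The table and column names follow the OMOP vocabulary tables (concept, concept_relationship), but the rows, concept IDs, vocabulary names, and the helper `to_standard_concept` are made up for this example and are not real OHDSI vocabulary content.

```python
from typing import Optional

# Hypothetical rows shaped like the OMOP "concept" table, keyed by concept_id.
# All IDs, codes, and vocabulary names here are invented for illustration.
CONCEPT = {
    9000001: {
        "concept_name": "Type 2 diabetes (local code)",
        "vocabulary_id": "LocalEHR",
        "concept_code": "DM2",
        "standard_concept": None,  # a source (non-standard) concept
    },
    9000002: {
        "concept_name": "Type 2 diabetes mellitus",
        "vocabulary_id": "ExampleStandard",
        "concept_code": "EX-0001",
        "standard_concept": "S",  # "S" marks a standard concept
    },
}

# Hypothetical rows shaped like the OMOP "concept_relationship" table.
CONCEPT_RELATIONSHIP = [
    {"concept_id_1": 9000001, "relationship_id": "Maps to", "concept_id_2": 9000002},
]


def to_standard_concept(vocabulary_id: str, concept_code: str) -> Optional[int]:
    """Return the standard concept_id for a source code, or None if unmapped."""
    source_id = next(
        (
            cid
            for cid, row in CONCEPT.items()
            if row["vocabulary_id"] == vocabulary_id
            and row["concept_code"] == concept_code
        ),
        None,
    )
    if source_id is None:
        return None
    # A code that is already standard maps to itself.
    if CONCEPT[source_id]["standard_concept"] == "S":
        return source_id
    # Otherwise follow the "Maps to" relationship to the standard concept.
    for rel in CONCEPT_RELATIONSHIP:
        if rel["concept_id_1"] == source_id and rel["relationship_id"] == "Maps to":
            return rel["concept_id_2"]
    return None


print(to_standard_concept("LocalEHR", "DM2"))
```

In a real project the lookup would normally be a query against the vocabulary tables in a database rather than in-memory dictionaries.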

Clinical Data Interchange Standards Consortium (CDISC)

  • A global nonprofit developing data collection standards and protocols for clinical research. Standards areas include protocol representation, clinical data acquisition, and pharmacogenomics/genetics testing and description. You must create a free account to access the standards.

Human Pathogen and Vector Sequencing Metadata Standards (GSCID/BRC)

  • An NIAID project developing standardized human pathogen and vector sequencing metadata to support epidemiologic and genotype-phenotype association studies for human infectious diseases.

Neuroimaging Informatics Technology Initiative (NIfTI)

  • Specialized standards and tools for functional MRI (fMRI) supported by NIMH.

Health Level Seven International (HL7)

  • An international nonprofit that develops ANSI-accredited standards for electronic health systems.

SNOMED International

  • Focuses on international common standards for medical terms, including human- and machine-readable concepts, descriptions, and relationships.

Integrating the Healthcare Enterprise (IHE)

  • An initiative by healthcare professionals and industry to improve how healthcare computer systems share information, promoting the coordinated use of established standards such as DICOM and HL7.

Brain Imaging Data Structure (BIDS)

  • BIDS is a standardized way of organizing neuroimaging and behavioral data that can help with overall documentation. BIDS is compatible with the OpenNeuro data repository.
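To give a sense of what "a standardized way of organizing" means in practice, here is a minimal sketch that builds a BIDS-style relative path for a functional MRI run. The helper name `bids_func_path` is hypothetical, and the sketch covers only the common sub/ses/task/run entities; consult the BIDS specification for the full naming rules.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Labels in this sketch are restricted to letters and digits.
_LABEL = re.compile(r"^[A-Za-z0-9]+$")


def bids_func_path(
    subject: str,
    task: str,
    session: Optional[str] = None,
    run: Optional[int] = None,
) -> PurePosixPath:
    """Build a relative BIDS-style path for a functional (BOLD) image."""
    for name, label in (("subject", subject), ("task", task), ("session", session)):
        if label is not None and not _LABEL.match(label):
            raise ValueError(f"{name} label must be alphanumeric: {label!r}")

    # Entities appear in a fixed order: sub, then ses, then task, then run.
    entities = [f"sub-{subject}"]
    folder = PurePosixPath(f"sub-{subject}")
    if session is not None:
        entities.append(f"ses-{session}")
        folder = folder / f"ses-{session}"
    entities.append(f"task-{task}")
    if run is not None:
        entities.append(f"run-{run}")

    filename = "_".join(entities) + "_bold.nii.gz"
    return folder / "func" / filename


print(bids_func_path("01", "rest"))
print(bids_func_path("01", "rest", session="pre", run=1))
```

Because the subject, session, and task are encoded in both the folder and the file name, each file remains self-describing even when it is copied out of its directory.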

Center for Expanded Data Annotation and Retrieval (CEDAR)

  • CEDAR provides metadata templates, which define the data elements needed to describe particular types of biomedical experiments. The templates include controlled terms and synonyms for specific data elements.  CEDAR uses a library of such templates to help scientists submit annotated datasets to appropriate online data repositories.
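The idea of a metadata template with controlled terms and synonyms can be sketched in a few lines. The example below is only an illustration of the concept: the `TEMPLATE` structure, the field names, the terms, and the `validate` helper are all hypothetical and do not reflect CEDAR's actual template format or API.

```python
# A hypothetical template: each field lists its controlled terms, and each
# term lists accepted synonyms. A field with terms=None accepts free text.
TEMPLATE = {
    "specimen_type": {
        "required": True,
        "terms": {"blood": {"whole blood"}, "saliva": {"spit"}},
    },
    "sex": {
        "required": True,
        "terms": {"female": {"f"}, "male": {"m"}},
    },
    "notes": {"required": False, "terms": None},
}


def validate(record, template=TEMPLATE):
    """Return (cleaned_record, errors) after checking a record against a template."""
    cleaned, errors = {}, []
    for field, rule in template.items():
        value = record.get(field)
        if value is None or str(value).strip() == "":
            if rule["required"]:
                errors.append(f"{field}: missing required value")
            continue
        if rule["terms"] is None:
            cleaned[field] = value  # free-text field, kept as entered
            continue
        key = str(value).strip().lower()
        # Accept either the controlled term itself or one of its synonyms,
        # and always store the controlled term.
        match = next(
            (term for term, syns in rule["terms"].items() if key == term or key in syns),
            None,
        )
        if match is None:
            errors.append(f"{field}: '{value}' is not a controlled term")
        else:
            cleaned[field] = match
    return cleaned, errors


cleaned, errors = validate({"specimen_type": "Whole Blood", "sex": "F"})
print(cleaned, errors)
```

Storing the controlled term rather than the synonym the researcher typed is what makes records from different labs comparable once they reach a repository.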

Types of medical research documentation

Controlled vocabularies: lists of predefined terms, some authorized and maintained by a community (see metadata standards examples) and others developed internally by a research group. Medical research and biomedical professional communities may employ controlled vocabulary standards such as:

Ontologies: a variety of controlled vocabulary that defines components and describes relationships among components. Most are used for interoperability among databases, and some are expressed in the Web Ontology Language (OWL) or the Resource Description Framework (RDF). Here are some examples of ontologies used in biomedical research: