Entity Mapping from OpenMRS to OMOP
This document provides a detailed reference for the entity-level and field-level mappings used to transform data from the OpenMRS data model into the OMOP Common Data Model (CDM). The transformation was implemented using SQLMesh, and follows OMOP's standardization principles to ensure interoperability across healthcare systems.
Each section describes the mapping for a specific OMOP entity, including:
- Source fields from OpenMRS
- Transformations or logic applied
- Hardcoded or default values (where applicable)
- Notes on vocabulary alignment and filtering rules
A special focus has been placed on concept mapping, which aligns OpenMRS concepts with standard OMOP vocabulary concepts using the CIEL SAME-AS relationship, to ensure the resulting data can be meaningfully analyzed across OMOP-compliant platforms.
This reference is intended to support implementers, data engineers, and analysts in understanding how OpenMRS data has been normalized and structured for OMOP-based analytical environments.
Feedback
Feedback and suggestions are welcome! If you notice any issues, inconsistencies, or areas for improvement, please reach out or open a discussion thread. Your input helps improve this mapping for everyone.
CARE SITE
| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| `care_site_id` | | Direct mapping | Uses the unique identifier of the OpenMRS location |
| `care_site_name` | | Direct mapping | Name of the location, used as the care site name |
| `place_of_service_concept_id` | — | Left null | Not mapped; can be populated if standard place-of-service exists |
| `location_id` | | Direct mapping | Links to `LOCATION` |
| `care_site_source_value` | | Direct mapping | Original source value from OpenMRS |
| `place_of_service_source_value` | — | Left null | Optional field; not mapped |
Filtering Notes:
- Only non-retired locations are included (`l.retired = 0`).
Notes:
- Each OpenMRS location is treated as a care site.
- The `place_of_service_concept_id` and `place_of_service_source_value` fields are left null for now, but may be populated in the future using additional metadata or vocabulary mappings.
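The filtering and null-placeholder rules above can be sketched in a few lines. This is a minimal illustration in Python rather than the SQLMesh model itself; the function name `to_care_sites` and the sample rows are hypothetical, and the input keys assume the standard OpenMRS `location` columns `location_id`, `name`, and `retired`.

```python
def to_care_sites(locations: list[dict]) -> list[dict]:
    """Map non-retired OpenMRS location rows to CARE_SITE-shaped dicts."""
    care_sites = []
    for loc in locations:
        # Mirrors the l.retired = 0 filter: skip retired locations.
        if loc["retired"]:
            continue
        care_sites.append({
            "care_site_id": loc["location_id"],
            "care_site_name": loc["name"],
            # Left null for now, as described in the notes.
            "place_of_service_concept_id": None,
            "place_of_service_source_value": None,
        })
    return care_sites


# Hypothetical sample rows for illustration only.
locations = [
    {"location_id": 1, "name": "Outpatient Clinic", "retired": 0},
    {"location_id": 2, "name": "Old Ward", "retired": 1},
]
care_sites = to_care_sites(locations)
```

With the sample rows, only the non-retired location is carried over as a care site.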
PERSON
| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| `person_id` | | Direct mapping | Unique identifier for each person |
| `gender_concept_id` | | 'M' → 8507, 'F' → 8532, else 0 | OMOP standard concept IDs for gender |
| `year_of_birth` | | | Extracts the year from birthdate |
| `month_of_birth` | | | Extracts the month from birthdate |
| `day_of_birth` | | | Extracts the day from birthdate |
| `birth_datetime` | | Direct mapping | Full birth date and time |
| `race_concept_id` | — | Hardcoded to `0` | Race not recorded in OpenMRS |
| `ethnicity_concept_id` | — | Hardcoded to `0` | Ethnicity not recorded in OpenMRS |
| `location_id` | — | Hardcoded to `1` | Default location assumed (can be updated later) |
| `provider_id` | | Mapped from the record creator | Person who created the patient record |
| `care_site_id` | — | Hardcoded to `1` | Default care site (can be refined later) |
| `person_source_value` | — | Hardcoded to empty string | Optional, not currently mapped |
| `gender_source_value` | | Direct mapping | Retains raw gender string from OpenMRS |
| `gender_source_concept_id` | — | Hardcoded | Can be populated later |
| `race_source_value` | — | Hardcoded to empty string | Not available in OpenMRS |
| `race_source_concept_id` | — | Hardcoded to `0` | — |
| `ethnicity_source_value` | — | Hardcoded to empty string | — |
| `ethnicity_source_concept_id` | — | Hardcoded to `0` | — |
Notes:
- Gender is standardized using OMOP concept IDs.
- Race and ethnicity fields are left empty or set to `0`, as OpenMRS does not capture this data by default.
- `location_id` and `care_site_id` are hardcoded to `1` as placeholders; these can be refined based on actual mappings to `LOCATION` and `CARE_SITE`.
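As an illustration of the gender and birthdate rules in the table, here is a minimal Python sketch. The concept IDs 8507 and 8532 and the fallback of 0 come from the mapping above; the helper names `gender_concept_id` and `birth_fields` are hypothetical and are not part of the SQLMesh models.

```python
from datetime import datetime

# 'M' -> 8507, 'F' -> 8532, anything else -> 0 (as in the PERSON table).
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def gender_concept_id(gender):
    """Return the OMOP gender concept ID for a raw OpenMRS gender string."""
    return GENDER_CONCEPTS.get(gender, 0)


def birth_fields(birthdate: datetime) -> dict:
    """Split a birthdate into the separate OMOP birth fields."""
    return {
        "year_of_birth": birthdate.year,
        "month_of_birth": birthdate.month,
        "day_of_birth": birthdate.day,
        "birth_datetime": birthdate,
    }


# Hypothetical example values.
example = birth_fields(datetime(1985, 7, 14, 9, 30))
```

Unknown or missing gender values fall through to 0, so the raw string kept in `gender_source_value` remains the only record of what OpenMRS stored.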
OBSERVATION_PERIOD
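| OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
|---|---|---|---|
| `observation_period_id` | — | | Sequential ID assigned per row |
| `person_id` | | Direct mapping | Links to `PERSON` |
| `observation_period_start_date` | `visit.date_started` | `MIN(visit.date_started)` | First known date of healthcare activity |
| `observation_period_end_date` | `visit.date_stopped`, `encounter.encounter_datetime` | `GREATEST(MAX(visit.date_stopped), MAX(encounter.encounter_datetime))` | Latest known visit or encounter date |
| `period_type_concept_id` | — | Hardcoded | OMOP standard concept for EHR record |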
Why visit.date_started for observation period start?
- The `visit` table in OpenMRS represents high-level patient interactions with the healthcare system.
- `date_started` gives the earliest recorded point of care, often more reliable than scattered observations.
- Using the earliest `date_started` per patient ensures we mark the beginning of valid, intentional medical data.
Why GREATEST(visit.date_stopped, encounter.encounter_datetime) for observation period end?
- Some visits may not have a `date_stopped`, especially if not explicitly closed in OpenMRS.
- Encounters within a visit may occur after the visit's `date_stopped` (due to data entry lag or configuration).
- To ensure we capture the latest available data, we take the greater of:
  - `MAX(visit.date_stopped)`: expected end of clinical engagement
  - `MAX(encounter.encounter_datetime)`: actual last recorded interaction
Additional Considerations:
Patients may have long gaps between visits, but for simplicity and consistency, only a single continuous observation period is generated per patient in this model.
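The start and end rules can be sketched for a single patient as follows. This is a Python illustration under stated assumptions, not the actual SQLMesh model: the function name `observation_period` and the sample data are hypothetical. Missing values are skipped here; depending on the SQL engine, `GREATEST` may instead return NULL when one of its arguments is NULL, so the SQL version may need explicit null handling to behave the same way.

```python
from datetime import datetime


def observation_period(visits, encounter_datetimes):
    """Return (start, end) of the single observation period for one patient.

    visits: list of dicts with 'date_started' and 'date_stopped' (may be None).
    encounter_datetimes: list of datetimes (may contain None).
    """
    starts = [v["date_started"] for v in visits if v["date_started"] is not None]
    stops = [v["date_stopped"] for v in visits if v["date_stopped"] is not None]
    encounters = [e for e in encounter_datetimes if e is not None]

    # Start: earliest visit.date_started for the patient.
    start = min(starts) if starts else None

    # End: the greater of MAX(date_stopped) and MAX(encounter_datetime),
    # ignoring whichever side has no data (an assumption of this sketch).
    candidates = []
    if stops:
        candidates.append(max(stops))
    if encounters:
        candidates.append(max(encounters))
    end = max(candidates) if candidates else None
    return start, end


# Hypothetical data: the second visit was never closed in OpenMRS.
visits = [
    {"date_started": datetime(2020, 1, 5), "date_stopped": datetime(2020, 1, 5)},
    {"date_started": datetime(2021, 3, 1), "date_stopped": None},
]
encounters = [datetime(2021, 3, 2)]
start, end = observation_period(visits, encounters)
```

In this example the open visit contributes no `date_stopped`, so the later encounter determines the end of the period.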
Concept Mapping Strategy
One of the key challenges in translating entities from OpenMRS to OMOP was handling concepts effectively.
Initial Approach
The initial approach involved converting existing OpenMRS concepts into the OMOP CONCEPT table and using those concept IDs during the observation transformation. However, this method resulted in OpenMRS-specific data, which reduced interoperability and contradicted OMOP’s goal of standardized, system-agnostic data representation.
Improved Approach
To ensure universal compatibility, standardized OMOP concept IDs were used during the transformation process. This allows the converted data to remain consistent and interoperable across different health systems, not just OpenMRS.
- The official OMOP `CONCEPT.csv` file was downloaded from the OMOP vocabulary repository.
- This file includes over 2 million standard concepts (2,059,343 entries).
- A custom mapping table was created using:
  - OpenMRS `concept` table
  - OpenMRS `concept_reference_map` table
  - The downloaded OMOP concept definitions
Mapping Table Structure
| OpenMRS Concept ID | OMOP Concept ID | Relationship | Vocabulary | Concept Name |
|---|---|---|---|---|
| 5 | 45919733 | SAME-AS | CIEL | Tetra brand of tetracycline |
| 5 | 1836948 | SAME-AS | RxNorm | tetracycline |
| 5 | 4187581 | NARROWER-THAN | SNOMED | Tetracycline |
| 5 | 4281393 | SAME-AS | SNOMED | Tetracycline-containing product |
| 6 | 45919413 | SAME-AS | CIEL | ETH-Oxydose |
| 6 | 1124957 | SAME-AS | RxNorm | oxycodone |
| … | … | … | … | … |
Current Mapping Strategy
The current conversion uses CIEL vocabulary concepts with a SAME-AS relationship to map OpenMRS concepts to OMOP. However, the structure supports alternative vocabularies such as RxNorm or SNOMED, which can be utilized in future iterations or based on specific use cases.
This mapping strategy ensures that clinical data remains aligned with OMOP standards, enabling better data analysis, interoperability, and integration with other OMOP-compliant systems.
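To make the selection rule concrete, the sketch below builds a lookup from the sample rows of the mapping table, keeping only CIEL rows with a SAME-AS relationship. It is a Python illustration, not the project's SQL: the function name `build_concept_lookup` is hypothetical, and it assumes the first two columns are the OpenMRS concept ID and the OMOP concept ID, in that order.

```python
# Sample rows copied from the mapping table:
# (OpenMRS concept ID, OMOP concept ID, relationship, vocabulary, name)
MAPPING_ROWS = [
    (5, 45919733, "SAME-AS", "CIEL", "Tetra brand of tetracycline"),
    (5, 1836948, "SAME-AS", "RxNorm", "tetracycline"),
    (5, 4187581, "NARROWER-THAN", "SNOMED", "Tetracycline"),
    (5, 4281393, "SAME-AS", "SNOMED", "Tetracycline-containing product"),
    (6, 45919413, "SAME-AS", "CIEL", "ETH-Oxydose"),
    (6, 1124957, "SAME-AS", "RxNorm", "oxycodone"),
]


def build_concept_lookup(rows, vocabulary="CIEL", relationship="SAME-AS"):
    """Return {openmrs_concept_id: omop_concept_id} for one vocabulary."""
    lookup = {}
    for openmrs_id, omop_id, rel, vocab, _name in rows:
        if vocab == vocabulary and rel == relationship:
            # Keep the first match if a concept has several candidates.
            lookup.setdefault(openmrs_id, omop_id)
    return lookup


ciel_lookup = build_concept_lookup(MAPPING_ROWS)
rxnorm_lookup = build_concept_lookup(MAPPING_ROWS, vocabulary="RxNorm")
```

Switching the `vocabulary` argument is how an alternative vocabulary such as RxNorm or SNOMED could be selected in this sketch without changing the structure of the mapping table.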
VISIT_OCCURRENCE
| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| `visit_occurrence_id` | | Direct mapping | Primary key in OpenMRS |
| `person_id` | | Direct mapping | |
| `visit_concept_id` | — | Hardcoded | Needs concept mapping if available |
| `visit_start_date` | | | Standard OMOP format |
| `visit_start_datetime` | | Direct mapping | — |
| `visit_end_date` | | | |
| `visit_end_datetime` | | Direct mapping | |
| `visit_type_concept_id` | | Direct mapping (needs lookup) | Might require mapping to OMOP concepts |
| `provider_id` | — | Hardcoded | Optional in OMOP |