Entity Mapping from OpenMRS to OMOP
This document provides a detailed reference for the entity-level and field-level mappings used to transform data from the OpenMRS data model into the OMOP Common Data Model (CDM). The transformation was implemented using SQLMesh, and follows OMOP's standardization principles to ensure interoperability across healthcare systems.
Each section describes the mapping for a specific OMOP entity, including:
Source fields from OpenMRS
Transformations or logic applied
Hardcoded or default values (where applicable)
Notes on vocabulary alignment and filtering rules
A special focus has been placed on concept mapping, which aligns OpenMRS concepts with standard OMOP vocabulary concepts using the CIEL SAME-AS
relationship, to ensure the resulting data can be meaningfully analyzed across OMOP-compliant platforms.
This reference is intended to support implementers, data engineers, and analysts in understanding how OpenMRS data has been normalized and structured for OMOP-based analytical environments.
Feedback
Feedback and suggestions are welcome! If you notice any issues, inconsistencies, or areas for improvement, please reach out or open a discussion thread. Your input helps improve this mapping for everyone.
PERSON
OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Primary key from OpenMRS patient table |
|
| 'M' → 8507, 'F' → 8532, else 0 | Uses standard OMOP gender concept IDs |
|
| Extract | — |
|
| Extract | — |
|
| Extract | — |
|
| Direct mapping | — |
| — | Hardcoded to | Race not tracked in OpenMRS by default |
| — | Hardcoded to | Ethnicity not tracked in OpenMRS |
| — | Hardcoded to | Needs lookup |
| — | Hardcoded to |
|
| — | Hardcoded to |
|
| — | Hardcoded to empty string | Can be used to store original person reference |
|
| Direct mapping | Keeps original source value of gender |
| — | Hardcoded to | Optional |
| — | Hardcoded to empty string | Not tracked in OpenMRS |
| — | Hardcoded to | — |
| — | Hardcoded to empty string | — |
| — | Hardcoded to | — |
OBSERVATION_PERIOD
OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
---|---|---|---|
| — |
| Synthetic primary key per patient |
|
| Direct mapping | Links to the |
|
|
| First known activity date per patient |
|
|
| Last known activity, taking both visit and encounter into account |
| — | Hardcoded to | OMOP concept ID for EHR record |
Why visit.date_started
for observation period start?
The
visit
table in OpenMRS represents high-level patient interactions with the healthcare system.date_started
gives the earliest recorded point of care, often more reliable than scattered observations.Using the earliest
date_started
per patient ensures we mark the beginning of valid, intentional medical data.
Why GREATEST(visit.date_stopped, encounter.encounter_datetime)
for observation period end?
Some visits may not have a
date_stopped
, especially if not explicitly closed in OpenMRS.Encounters within a visit may occur after the visit’s
date_stopped
(due to data entry lag or configuration).To ensure we capture the latest available data, we take the greater of:
MAX(visit.date_stopped)
: expected end of clinical engagementMAX(encounter.encounter_datetime)
: actual last recorded interaction
Additional Considerations:
Patients may have long gaps between visits, but for simplicity and consistency, only a single continuous observation period is generated per patient in this model.
Concept Mapping Strategy
One of the key challenges in translating entities from OpenMRS to OMOP was handling concepts effectively.
Initial Approach
The initial approach involved converting existing OpenMRS concepts into the OMOP CONCEPT
table and using those concept IDs during the observation transformation. However, this method resulted in OpenMRS-specific data, which reduced interoperability and contradicted OMOP’s goal of standardized, system-agnostic data representation.
Improved Approach
To ensure universal compatibility, standardized OMOP concept IDs were used during the transformation process. This allows the converted data to remain consistent and interoperable across different health systems, not just OpenMRS.
The official OMOP
CONCEPT.csv
file was downloaded from the OMOP vocabulary repository.This file includes over 2 million standard concepts (2,059,343 entries).
A custom mapping table was created using:
OpenMRS
concept
tableOpenMRS
concept_reference_map
tableThe downloaded OMOP concept definitions
Mapping Table Structure
|
|
|
|
|
---|---|---|---|---|
5 | 45919733 | SAME-AS | CIEL | Tetra brand of tetracycline |
5 | 1836948 | SAME-AS | RxNorm | tetracycline |
5 | 4187581 | NARROWER-THAN | SNOMED | Tetracycline |
5 | 4281393 | SAME-AS | SNOMED | Tetracycline-containing product |
6 | 45919413 | SAME-AS | CIEL | ETH-Oxydose |
6 | 1124957 | SAME-AS | RxNorm | oxycodone |
… | … | … | … | … |
Current Mapping Strategy
The current conversion uses CIEL vocabulary concepts with a SAME-AS
relationship to map OpenMRS concepts to OMOP. However, the structure supports alternative vocabularies such as RxNorm or SNOMED, which can be utilized in future iterations or based on specific use cases.
This mapping strategy ensures that clinical data remains aligned with OMOP standards, enabling better data analysis, interoperability, and integration with other OMOP-compliant systems.
VISIT_OCCURRENCE
OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Primary key in OpenMRS |
|
| Direct mapping |
|
| — | Hardcoded to | Needs concept mapping if available |
|
|
| Standard OMOP format |
|
| Direct mapping | — |
|
|
|
|
|
| Direct mapping |
|
|
| Direct mapping (needs lookup) | Might require mapping to OMOP concepts |
| — | Hardcoded to | Optional in OMOP |
|
| Direct mapping | Assumes |
| — | Hardcoded to empty string | Optional; can populate with source value |
| — | Hardcoded to | Optional |
| — | Hardcoded to | Optional; not tracked in OpenMRS by default |
| — | Hardcoded to empty string | — |
| — | Hardcoded to | — |
| — | Hardcoded to empty string | — |
| — | Hardcoded to | Could be populated if visit chains exist |
OBSERVATION
OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Unique identifier from the OpenMRS |
|
| Direct mapping | Links to |
|
| Mapped using CIEL | Concept mapping to standard OMOP concepts |
|
|
| Extracts just the date portion |
|
| Direct mapping | — |
| — | Hardcoded to | OMOP concept for EHR encounter record |
|
| Direct mapping | Numeric value of the observation, if present |
|
|
| Truncated to 60 characters as there’s a character limit |
|
| Direct mapping | Used if the observation has a coded value |
| — |
| Not used in current mapping |
| — |
| Not used in current mapping |
| — |
| Optional; not mapped |
|
| Direct mapping | Assumes encounter maps to a visit occurrence |
| — |
| Not used |
| — | Hardcoded to empty string | Optional; can be used to track original values |
|
| Direct mapping | Keeps original OpenMRS concept ID |
| — | Hardcoded to empty string | Optional |
| — | Hardcoded to empty string | Optional |
| — | Hardcoded to empty string | Optional |
| — |
| Not used |
| — |
| Not used |
Filtering Notes:
Only observations that are not voided are included (
o.voided = 0
).Observations from encounter types 5 (vitals), 8 (notes), and 11 (lab results) are excluded.
Concept mapping uses the
OMRS_TO_OMOP_CONCEPT
table whererelationship_id = 'SAME-AS'
andvocabulary_id = 'CIEL'
.
MEASUREMENT
OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Unique identifier for each measurement |
|
| Direct mapping | Links to |
|
| Mapped using CIEL | Uses standardized OMOP concept ID |
|
|
| — |
|
| Direct mapping | — |
|
|
| Extracts time component in |
| — | Hardcoded to | OMOP standard for EHR-derived measurement |
| — |
| Not used |
|
| Direct mapping | Numeric measurement value |
|
| Direct mapping | Used if the measurement has a coded value |
| — |
| Optional |
|
| Direct mapping | Lower normal range if available |
|
| Direct mapping | Upper normal range if available |
| — |
| Not used |
|
| Direct mapping | Links measurement to associated visit |
| — |
| Optional |
| — | Hardcoded to empty string | Optional |
|
| Direct mapping | Keeps original OpenMRS concept ID |
|
| Direct mapping | Source unit text (e.g., 'mmHg') |
| — |
| Optional |
|
| Direct mapping | Same as |
| — |
| Not used |
| — |
| Not used |
Filtering Notes:
Only non-voided observations are included (
o.voided = 0
).Only vitals (encounter_type_id = 5) and lab results (encounter_type_id = 11) are included.
Concept mapping uses the
OMRS_TO_OMOP_CONCEPT
table where:relationship_id = 'SAME-AS'
vocabulary_id = 'CIEL'
NOTE
OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Unique identifier for each note |
|
| Direct mapping | Links to |
|
|
| Date of the observation |
|
| Direct mapping | Full timestamp |
| — | Hardcoded to | OMOP concept ID for Note |
| — | Hardcoded to | Also set to Note |
| — | Hardcoded to empty string | Title not available in OpenMRS |
|
| Direct mapping | The actual note content |
| — | Hardcoded to | OMOP concept ID for UTF-8 |
| — | Hardcoded to | Language unknown or not specified |
| — |
| Not mapped |
|
| Direct mapping | Links note to associated visit |
| — |
| Optional |
| — | Hardcoded to empty string | Optional |
| — |
| Not used |
| — |
| Not used |
Filtering Notes:
Only non-voided observations are included (
o.voided = 0
).Filters for encounter type ID = 8, which corresponds to Visit Notes in OpenMRS.
Only observations with non-null
value_text
are included.
LOCATION
OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Unique identifier for the location |
|
| Direct mapping | Primary address line |
|
| Direct mapping | Secondary address line |
|
| Direct mapping | Name of the city or village |
|
|
| Truncated to first 2 characters |
|
|
| Truncated to first 9 characters |
|
|
| Truncated to first 20 characters |
|
| Direct mapping | Original name of the location |
| — |
| Not mapped; can be added if a country concept exists |
|
| Direct mapping | Country name as stored in OpenMRS |
|
|
| Avoids storing empty string as a valid value |
|
|
| Same as above |
Filtering Notes:
Only non-retired locations are included (
l.retired = 0
).
CONDITION_OCCURRENCE
OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
---|---|---|---|
|
| Direct mapping | Unique identifier for the condition record |
|
| Direct mapping | Links to the patient in the OMOP |
|
| Mapped using | Uses standardized OMOP concept ID |
|
|
| — |
|
| Direct mapping | — |
|
|
| May be |
|
| Direct mapping | — |
| — | Hardcoded to | Optional; can be mapped to OMOP concept types in future |
| — | Hardcoded to | Optional |
|
|
| Reason for stopping the condition |
| — |
| Not mapped currently |
| — |
| Could be linked if encounter/visit info is available |
| — |
| Not used |
| — | Hardcoded to empty string | Optional; can include original value from source if needed |
|
| Direct mapping | Keeps original OpenMRS concept ID |
|
|
| Captures status like "confirmed", "presumed", etc. if available |
Filtering Notes:
Only non-voided conditions are included (
c.voided = 0
).Concept mapping uses the
OMRS_TO_OMOP_CONCEPT
table where:relationship_id = 'SAME-AS'
vocabulary_id = 'CIEL'