Entity Mapping from OpenMRS to OMOP

This document is a reference for the entity-level and field-level mappings used to transform data from the OpenMRS data model into the OMOP Common Data Model (CDM). The transformation is implemented with SQLMesh and follows OMOP's standardization principles to support interoperability across healthcare systems.

Each section describes the mapping for a specific OMOP entity, including:

  • Source fields from OpenMRS

  • Transformations or logic applied

  • Hardcoded or default values (where applicable)

  • Notes on vocabulary alignment and filtering rules

Particular attention is given to concept mapping, which aligns OpenMRS concepts with standard OMOP vocabulary concepts through the CIEL SAME-AS relationship, so that the resulting data can be analyzed meaningfully across OMOP-compliant platforms.

This reference is intended to support implementers, data engineers, and analysts in understanding how OpenMRS data has been normalized and structured for OMOP-based analytical environments.

Feedback

Feedback and suggestions are welcome! If you notice any issues, inconsistencies, or areas for improvement, please reach out or open a discussion thread. Your input helps improve this mapping for everyone.

CARE_SITE

| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| care_site_id | location.location_id | Direct mapping | Uses the unique identifier of the OpenMRS location |
| care_site_name | location.name | Direct mapping | Name of the location, used as the care site name |
| place_of_service_concept_id | | NULL | Not mapped; can be populated if standard place-of-service exists |
| location_id | location.location_id | Direct mapping | Links to LOCATION table |
| care_site_source_value | location.name | Direct mapping | Original source value from OpenMRS |
| place_of_service_source_value | | NULL | Optional field; not mapped |

Filtering Notes:

  • Only non-retired locations are included (location.retired = 0).

Notes:

  • Each OpenMRS location is treated as a care site.

  • The place_of_service_concept_id and place_of_service_source_value are left null for now, but may be populated in the future using additional metadata or vocabulary mappings.
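The CARE_SITE rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's SQLMesh model: the function name `to_care_sites` and the sample rows are hypothetical, and input rows are assumed to be dictionaries keyed by the OpenMRS `location` column names.

```python
from typing import Optional, TypedDict


class CareSite(TypedDict):
    care_site_id: int
    care_site_name: str
    place_of_service_concept_id: Optional[int]
    location_id: int
    care_site_source_value: str
    place_of_service_source_value: Optional[str]


def to_care_sites(locations: list) -> list:
    """Map non-retired OpenMRS location rows to OMOP CARE_SITE rows."""
    return [
        CareSite(
            care_site_id=loc["location_id"],        # direct mapping
            care_site_name=loc["name"],             # direct mapping
            place_of_service_concept_id=None,       # not mapped yet
            location_id=loc["location_id"],         # link to LOCATION
            care_site_source_value=loc["name"],     # original source value
            place_of_service_source_value=None,     # not mapped yet
        )
        for loc in locations
        if not loc["retired"]                       # mirrors retired = 0
    ]


# Hypothetical sample data: one active and one retired location.
sample_locations = [
    {"location_id": 1, "name": "Outpatient Clinic", "retired": 0},
    {"location_id": 2, "name": "Old Ward", "retired": 1},
]
care_sites = to_care_sites(sample_locations)
```

With the sample data, only the non-retired location produces a care site row.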

PERSON

| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| person_id | patient.patient_id | Direct mapping | Unique identifier for each person |
| gender_concept_id | person.gender | 'M' → 8507, 'F' → 8532, else 0 | OMOP standard concept IDs for gender |
| year_of_birth | person.birthdate | YEAR(birthdate) | Extracts the year from birthdate |
| month_of_birth | person.birthdate | MONTH(birthdate) | Extracts the month from birthdate |
| day_of_birth | person.birthdate | DAY(birthdate) | Extracts the day from birthdate |
| birth_datetime | person.birthdate | Direct mapping | Full birth date and time |
| race_concept_id | | Hardcoded to 0 | Race not recorded in OpenMRS |
| ethnicity_concept_id | | Hardcoded to 0 | Ethnicity not recorded in OpenMRS |
| location_id | | Hardcoded to 1 | Default location assumed (can be updated later) |
| provider_id | users.person_id | Mapped from creator.user_id | Person who created the patient record |
| care_site_id | | Hardcoded to 1 | Default care site (can be refined later) |
| person_source_value | | Hardcoded to empty string | Optional, not currently mapped |
| gender_source_value | person.gender | Direct mapping | Retains raw gender string from OpenMRS |
| gender_source_concept_id | | Hardcoded to 0 | Can be populated later |
| race_source_value | | Hardcoded to empty string | Not available in OpenMRS |
| race_source_concept_id | | Hardcoded to 0 | |
| ethnicity_source_value | | Hardcoded to empty string | |
| ethnicity_source_concept_id | | Hardcoded to 0 | |

Notes:

  • Gender is standardized using OMOP concept IDs.

  • Race and ethnicity fields are left empty or set to 0 as OpenMRS does not capture this data by default.

  • location_id and care_site_id are hardcoded to 1 as placeholders; they can be refined once patients are linked to actual LOCATION and CARE_SITE rows.
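As a rough illustration of the PERSON rules above, the following Python sketch applies the gender lookup, the birthdate decomposition, and the hardcoded defaults. The function name `to_person` is hypothetical, and the sketch assumes the gender comparison is case-sensitive, as the literal 'M' and 'F' rule in the table suggests. The provider_id lookup through the record creator is omitted for brevity.

```python
from datetime import date
from typing import Optional

# Gender codes from the mapping table; anything else falls back to 0.
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def to_person(patient_id: int, gender: Optional[str],
              birthdate: Optional[date]) -> dict:
    """Build an OMOP PERSON row from OpenMRS patient and person fields."""
    return {
        "person_id": patient_id,
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": birthdate.year if birthdate else None,
        "month_of_birth": birthdate.month if birthdate else None,
        "day_of_birth": birthdate.day if birthdate else None,
        "birth_datetime": birthdate,          # direct mapping
        "race_concept_id": 0,                 # not recorded in OpenMRS
        "ethnicity_concept_id": 0,            # not recorded in OpenMRS
        "location_id": 1,                     # placeholder
        "care_site_id": 1,                    # placeholder
        "person_source_value": "",
        "gender_source_value": gender,        # raw OpenMRS value
        "gender_source_concept_id": 0,
        "race_source_value": "",
        "race_source_concept_id": 0,
        "ethnicity_source_value": "",
        "ethnicity_source_concept_id": 0,
    }


# Hypothetical sample patients.
person_f = to_person(7, "F", date(1990, 5, 17))
person_unknown = to_person(8, "U", None)
```

Unrecognized or missing gender values map to concept 0 while the raw value is still kept in gender_source_value.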

OBSERVATION_PERIOD

| OMOP Field Name | OpenMRS Source Field(s) | Transformation / Logic | Notes |
|---|---|---|---|
| observation_period_id | | ROW_NUMBER() OVER (ORDER BY MIN(v.date_started)) | Sequential ID assigned per row |
| person_id | visit.patient_id | Direct mapping | Links to PERSON table |
| observation_period_start_date | visit.date_started | DATE(MIN(v.date_started)) | First known date of healthcare activity |
| observation_period_end_date | visit.date_stopped, encounter.encounter_datetime | DATE(GREATEST(MAX(v.date_stopped), MAX(e.encounter_datetime))) | Latest known visit or encounter date |
| period_type_concept_id | | Hardcoded to 44814724 | OMOP standard concept for EHR record |

Why visit.date_started for observation period start?

  • The visit table in OpenMRS represents high-level patient interactions with the healthcare system.

  • date_started gives the earliest recorded point of care, often more reliable than scattered observations.

  • Using the earliest date_started per patient anchors the observation period to the first deliberately recorded clinical interaction.

Why GREATEST(visit.date_stopped, encounter.encounter_datetime) for observation period end?

  • Some visits may not have a date_stopped, especially if not explicitly closed in OpenMRS.

  • Encounters within a visit may occur after the visit’s date_stopped (due to data entry lag or configuration).

  • To ensure we capture the latest available data, we take the greater of:

    • MAX(visit.date_stopped): expected end of clinical engagement

    • MAX(encounter.encounter_datetime): actual last recorded interaction

Additional Considerations:

  • Patients may have long gaps between visits, but for simplicity and consistency, only a single continuous observation period is generated per patient in this model.
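The start and end logic described above can be sketched in Python as follows. This is a hedged illustration rather than the actual SQLMesh model: the function name `build_observation_periods` and the input row shapes are assumptions. Missing date_stopped values are dropped explicitly, because in some SQL engines (MySQL, for example) GREATEST returns NULL when any argument is NULL. Skipping patients for whom no end date can be derived is also an assumption of this sketch.

```python
from collections import defaultdict
from datetime import datetime

PERIOD_TYPE_CONCEPT_ID = 44814724  # value hardcoded in the mapping table


def build_observation_periods(visits: list, encounters: list) -> list:
    """Derive one observation period per patient from visits and encounters."""
    starts = defaultdict(list)
    ends = defaultdict(list)
    for v in visits:
        starts[v["patient_id"]].append(v["date_started"])
        if v["date_stopped"] is not None:       # open visits have no stop date
            ends[v["patient_id"]].append(v["date_stopped"])
    for e in encounters:
        ends[e["patient_id"]].append(e["encounter_datetime"])

    rows = []
    for patient_id, patient_starts in starts.items():
        patient_ends = ends.get(patient_id)
        if not patient_ends:
            continue                            # assumption: skip if no end date
        rows.append((min(patient_starts), patient_id, max(patient_ends)))

    rows.sort(key=lambda r: r[0])               # ORDER BY MIN(date_started)
    return [
        {
            "observation_period_id": i,
            "person_id": pid,
            "observation_period_start_date": first_start.date(),
            "observation_period_end_date": last_seen.date(),
            "period_type_concept_id": PERIOD_TYPE_CONCEPT_ID,
        }
        for i, (first_start, pid, last_seen) in enumerate(rows, start=1)
    ]


# Hypothetical sample data: patient 1 has an open second visit.
sample_visits = [
    {"patient_id": 1, "date_started": datetime(2020, 1, 10, 9, 0),
     "date_stopped": datetime(2020, 1, 10, 17, 0)},
    {"patient_id": 1, "date_started": datetime(2021, 3, 1, 8, 0),
     "date_stopped": None},
    {"patient_id": 2, "date_started": datetime(2019, 6, 5, 10, 0),
     "date_stopped": datetime(2019, 6, 6, 12, 0)},
]
sample_encounters = [
    {"patient_id": 1, "encounter_datetime": datetime(2021, 3, 2, 11, 30)},
    {"patient_id": 2, "encounter_datetime": datetime(2019, 6, 5, 10, 30)},
]
periods = build_observation_periods(sample_visits, sample_encounters)
```

For patient 1 in the sample data, the end date comes from the encounter on 2021-03-02 because the second visit was never closed.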

Concept Mapping Strategy

One of the key challenges in translating entities from OpenMRS to OMOP was handling concepts effectively.

Initial Approach

The initial approach involved converting existing OpenMRS concepts into the OMOP CONCEPT table and using those concept IDs during the observation transformation. However, this method resulted in OpenMRS-specific data, which reduced interoperability and contradicted OMOP’s goal of standardized, system-agnostic data representation.

Improved Approach

To ensure universal compatibility, standardized OMOP concept IDs were used during the transformation process. This allows the converted data to remain consistent and interoperable across different health systems, not just OpenMRS.

  • The official OMOP CONCEPT.csv file was downloaded from the OMOP vocabulary repository.

  • This file includes over 2 million standard concepts (2,059,343 entries).

  • A custom mapping table was created using:

    • OpenMRS concept table

    • OpenMRS concept_reference_map table

    • The downloaded OMOP concept definitions

Mapping Table Structure

| omrs_concept_id | omop_concept_id | relationship_id | source_vocabulary | description |
|---|---|---|---|---|
| 5 | 45919733 | SAME-AS | CIEL | Tetra brand of tetracycline |
| 5 | 1836948 | SAME-AS | RxNorm | tetracycline |
| 5 | 4187581 | NARROWER-THAN | SNOMED | Tetracycline |
| 5 | 4281393 | SAME-AS | SNOMED | Tetracycline-containing product |
| 6 | 45919413 | SAME-AS | CIEL | ETH-Oxydose |
| 6 | 1124957 | SAME-AS | RxNorm | oxycodone |

Current Mapping Strategy

The current conversion uses CIEL vocabulary concepts with a SAME-AS relationship to map OpenMRS concepts to OMOP. However, the structure supports alternative vocabularies such as RxNorm or SNOMED, which can be utilized in future iterations or based on specific use cases.

This mapping strategy ensures that clinical data remains aligned with OMOP standards, enabling better data analysis, interoperability, and integration with other OMOP-compliant systems.
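To make the selection rule concrete, the sketch below filters the sample rows from the mapping table by vocabulary and relationship and builds a lookup from OpenMRS concept IDs to OMOP concept IDs. The function names `build_concept_lookup` and `to_omop_concept` are hypothetical. Falling back to concept ID 0 for unmapped concepts follows the general OMOP convention for "no matching concept" and is an assumption about how the pipeline treats gaps.

```python
# Sample rows copied from the mapping table:
# (omrs_concept_id, omop_concept_id, relationship_id, source_vocabulary)
MAPPING_ROWS = [
    (5, 45919733, "SAME-AS", "CIEL"),
    (5, 1836948, "SAME-AS", "RxNorm"),
    (5, 4187581, "NARROWER-THAN", "SNOMED"),
    (5, 4281393, "SAME-AS", "SNOMED"),
    (6, 45919413, "SAME-AS", "CIEL"),
    (6, 1124957, "SAME-AS", "RxNorm"),
]


def build_concept_lookup(rows, vocabulary="CIEL", relationship="SAME-AS"):
    """Return {omrs_concept_id: omop_concept_id} for one vocabulary."""
    lookup = {}
    for omrs_id, omop_id, rel, vocab in rows:
        if rel == relationship and vocab == vocabulary:
            # Keep the first match if a concept has several candidates.
            lookup.setdefault(omrs_id, omop_id)
    return lookup


def to_omop_concept(lookup, omrs_concept_id):
    """Resolve an OpenMRS concept ID, using 0 when no mapping exists."""
    return lookup.get(omrs_concept_id, 0)


ciel_lookup = build_concept_lookup(MAPPING_ROWS)
rxnorm_lookup = build_concept_lookup(MAPPING_ROWS, vocabulary="RxNorm")
```

Switching the `vocabulary` argument is enough to try an alternative target vocabulary such as RxNorm without changing the table structure.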

VISIT_OCCURRENCE

| OMOP Field Name | OpenMRS Source Field | Transformation / Logic | Notes |
|---|---|---|---|
| visit_occurrence_id | visit.visit_id | Direct mapping | Primary key in OpenMRS |
| person_id | visit.patient_id | Direct mapping | patient_id in OpenMRS maps to person_id in OMOP |
| visit_concept_id | | Hardcoded to 0 | Needs concept mapping if available |
| visit_start_date | visit.date_started | DATE() extraction | Standard OMOP format |
| visit_start_datetime | visit.date_started | Direct mapping | |
| visit_end_date | visit.date_stopped | DATE() extraction | NULL for ongoing visits |
| visit_end_datetime | visit.date_stopped | Direct mapping | NULL for ongoing visits |
| visit_type_concept_id | visit.visit_type_id | Direct mapping (needs lookup) | Might require mapping to OMOP concepts |
| provider_id | | Hardcoded to 0 | Optional in OMOP |
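The VISIT_OCCURRENCE mapping above can be sketched as a single row-level function. This is an illustrative Python sketch under stated assumptions, not the SQLMesh model itself: the function name `to_visit_occurrence` is hypothetical, and input rows are assumed to be dictionaries keyed by the OpenMRS `visit` column names. The visit_type_id is passed through unchanged, which matches the table but still needs a lookup to OMOP concepts.

```python
from datetime import datetime


def to_visit_occurrence(visit: dict) -> dict:
    """Map one OpenMRS visit row to an OMOP VISIT_OCCURRENCE row."""
    started = visit["date_started"]
    stopped = visit["date_stopped"]   # None while the visit is still open
    return {
        "visit_occurrence_id": visit["visit_id"],
        "person_id": visit["patient_id"],
        "visit_concept_id": 0,                       # not mapped yet
        "visit_start_date": started.date(),          # DATE() extraction
        "visit_start_datetime": started,
        "visit_end_date": stopped.date() if stopped else None,
        "visit_end_datetime": stopped,
        "visit_type_concept_id": visit["visit_type_id"],  # raw ID, needs lookup
        "provider_id": 0,
    }


# Hypothetical sample rows: one closed visit and one ongoing visit.
closed_visit = to_visit_occurrence({
    "visit_id": 101, "patient_id": 7, "visit_type_id": 1,
    "date_started": datetime(2022, 4, 1, 9, 15),
    "date_stopped": datetime(2022, 4, 1, 16, 45),
})
ongoing_visit = to_visit_occurrence({
    "visit_id": 102, "patient_id": 7, "visit_type_id": 1,
    "date_started": datetime(2022, 5, 2, 10, 0),
    "date_stopped": None,
})
```

The ongoing visit keeps both end fields empty, matching the "NULL for ongoing visits" notes in the table.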