⚠️ Prevent Duplicates at Registration
Summary:
OpenMRS needs a feature to prevent and address duplicate patient records. The feature will:
Automatically search for existing records during patient registration and alert users to potential matches based on name, date of birth/age, sex, and address details.
Display potential duplicate matches clearly, prioritizing the closest matches.
Provide an easy-to-use tool to compare and merge duplicate records into a single, complete patient record.
Rather than reinventing the wheel, OpenMRS will adapt and refine existing solutions such as the PIH, AMRS POC (AMPATH), and KenyaEMR duplicate detection and merging functionality.
Standardizing this feature within OpenMRS RefApp ensures a unified approach to handling duplicate patient records across different OpenMRS implementations.
1. Problem Statement: Duplicate Patient Records in Registration
Duplicate medical records arise when users unintentionally create multiple entries for the same patient. This often happens due to insufficient searches before selecting Create New, inconsistent name spellings, name variations, data entry errors, or demographic inconsistencies. Such duplicates fragment patient information across multiple records, making it challenging for healthcare providers to access a complete and accurate medical history.
Issues of Duplicate Patient Records:
Medical errors, such as administering the wrong treatment or medication.
Missed critical health information, affecting patient safety.
Inefficient workflows, as staff spend time reconciling or merging records.
Cluttered EMR systems, with unnecessary duplicate charts.
Inaccurate reporting numbers (since the same patient is counted twice).
Common causes of duplicate records include:
Insufficient searches before new records are created.
Patients being known by different names, nicknames, or aliases. Example: Some patients may not know how to spell their name, and may give a different guessed spelling at each visit.
Misspellings, incorrect demographic information, and data entry errors. Example: In some countries, it is normal to spell a single name in three or more different ways. In Rwanda, r's and l's are often swapped, and PIH had to create a special module to handle this in patient searches for their Rwanda sites. In Sri Lanka, there are 3-4 acceptable ways to spell the name Jayasanka for the same person.
Lack of advanced identity management technologies such as biometrics, which are not universally adopted. Biometrics also have their own issues, such as privacy concerns, laborers with worn-out fingerprints, and children under about 5 years, whose fingerprints are difficult to capture reliably.
The Problem:
Currently, OpenMRS O3 RefApp does not have a built-in alert system to notify users when a newly registered patient closely matches an already-existing record. Without real-time duplicate detection, users cannot identify and review potential duplicate entries before they are created. For example, the two records below appear identical except for the system-generated OpenMRS ID.
The RefApp also lacks a feature to merge duplicate patient records into a single comprehensive record once such duplicates are identified.
Our Goal: The OpenMRS O3 RefApp should include a feature that actively detects potential duplicate records during patient registration.
This functionality would enable the system to analyze entered patient details—such as name, date of birth, and other demographic information—when creating new records.
As users input data, the system should conduct searches against existing records to identify similarities.
If possible matches are detected, a notification should alert the user, prompting them to review the suggested duplicates before finalizing the registration.
2. User Stories: Preventing Duplicate Medical Records
As a Receptionist, I want the system to suggest possible duplicate records when I register a new patient so that I can verify whether the patient already exists and avoid accidentally creating a duplicate record.
As a Clinician, I want to access a single, complete patient record so that I can make informed treatment decisions without missing critical health information.
As a Health Information Officer, I want a feature to help detect and merge duplicate records so that patient data remains accurate and complete, and I can reassure government and funders that our reported numbers are correct and free of duplicates.
As a System Administrator, I want to configure duplicate-checking rules based on multiple criteria (name, date of birth, ID number, address details) so that the system accurately detects potential duplicates.
3. Market Analysis
🟢 Duplicates Prevention
Duplicate Detection & Alerts: System prompts an alert when a new entry closely matches an existing record.
Purpose: To identify potential duplicate patient records during the registration process by analyzing demographic information such as name, date of birth/age, sex, and address.
When a new registration closely resembles an existing patient, the system triggers an alert, allowing healthcare staff to review and determine whether the new patient is indeed a duplicate. This feature helps prevent duplicate records, reduce administrative burden, and improve patient data accuracy. Healthcare staff can then verify the duplicate and merge or create a new record accordingly, ensuring that patient information is up-to-date and accurate.
Feature Functionality:
Real Time Alerts:
When a user attempts to register a new patient, the system automatically checks for existing records with similar details (e.g., name, date of birth/age, sex, address).
If a match is found, a warning alert is displayed, prompting the user to review the potential duplicate before proceeding.
Configurable Matching Criteria:
Allows administrators to define which fields should be considered for duplicate detection (e.g., demographic details).
Review & Confirmation Process:
Provides a side-by-side comparison of the new and existing records.
Users can confirm if the new entry is a duplicate or proceed with registration if it is a different patient.
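The configurable matching criteria described above could be modeled roughly as follows. This is a minimal sketch, not OpenMRS code: `MatchConfig`, its field names, and `is_potential_duplicate` are hypothetical names chosen for illustration, and patients are represented as plain dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchConfig:
    """Hypothetical administrator-configurable settings (names are illustrative)."""
    fields: tuple = ("given_name", "family_name", "sex", "birthdate")


def _normalize(value):
    # Treat None and blank strings as missing; compare everything else
    # case-insensitively with surrounding whitespace removed.
    if value is None:
        return None
    text = str(value).strip().lower()
    return text or None


def is_potential_duplicate(new_patient: dict, existing: dict, config: MatchConfig) -> bool:
    """Return True when every configured field is present and equal in both records."""
    for name in config.fields:
        new_value = _normalize(new_patient.get(name))
        old_value = _normalize(existing.get(name))
        if new_value is None or old_value is None or new_value != old_value:
            return False
    return True
```

An administrator who only wants to match on family name and sex would pass `MatchConfig(fields=("family_name", "sex"))`; the comparison logic itself does not change.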
🟠 Duplicates Merging
Duplicates Merging: System facilitates merging of duplicate patient records when identified after registration.
Purpose: The merging feature ensures that authorized users can efficiently consolidate duplicate patient records when identified after registration. The goal is to maintain a single, accurate patient record while preserving critical data.
Feature Functionality
Merging Feature:
This would allow authorized users to merge patient records if a duplicate is detected after registration.
While merging, the system should ensure that historical data is preserved, i.e., clinical data, visit history, and identifiers are retained in the merged record.
Audit Trail & Reporting:
Logs all duplicate detection and resolution actions for accountability.
Generates reports on detected duplicates, merged records, and unresolved potential duplicates for system monitoring.
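The merge and audit requirements above could be sketched as follows. This is an illustration under stated assumptions, not the OpenMRS merge implementation: `PatientRecord`, `merge_records`, and the audit entry layout are hypothetical, and the audit log is a plain list standing in for persistent storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PatientRecord:
    """Hypothetical, simplified patient record."""
    patient_id: str
    identifiers: list = field(default_factory=list)
    encounters: list = field(default_factory=list)
    voided: bool = False


def merge_records(preferred: PatientRecord, duplicate: PatientRecord, user: str, audit_log: list) -> PatientRecord:
    """Fold `duplicate` into `preferred` without discarding clinical history."""
    # Keep every identifier from both records, without repeating any.
    for identifier in duplicate.identifiers:
        if identifier not in preferred.identifiers:
            preferred.identifiers.append(identifier)
    # Move the full encounter history onto the surviving record.
    preferred.encounters.extend(duplicate.encounters)
    duplicate.encounters = []
    # Retire the duplicate rather than deleting it, so the merge stays auditable.
    duplicate.voided = True
    audit_log.append({
        "action": "merge",
        "user": user,
        "kept": preferred.patient_id,
        "retired": duplicate.patient_id,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return preferred
```

Retiring the duplicate instead of deleting it is one way to keep both records traceable in later reports.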
🔵 Existing Solutions
O2 RefApp + PIH
AMRS POC (AMPATH)
KenyaEMR
Accuro EMR
Bahmni (WIP)
AI/ML Approach
⑴ PIH Duplicate Detection & Alert:
When the user inputs the patient's registration details, the system detects and alerts the user about potential duplicates based on matching name, sex, and date of birth (DOB)/age.
If the user continues with patient creation, duplicate records are generated.
When a user searches for a duplicate record by ID or patient name, the system identifies the possible duplicates and alerts the user to the records they may want to merge.
Before merging, the user must select the preferred record.
Notably, merging retains both EMR IDs assigned to the patient.
⑵ AMRS POC (AMPATH) Duplicates Detection and Alert:
When the user inputs the patient's registration details, the system identifies and flags potential duplicates based on matching name, gender, and age.
The user is prompted to review and confirm the details before proceeding with patient creation.
If the patients differ, the user can return to the registration process and continue entering the new patient’s details.
⑶ KenyaEMR Patient Duplicates Merge Feature:
If duplicate patient records are created, the KenyaEMR Record Merge tool enables consolidation into a single record while preserving all historical data.
This feature allows users to identify potential duplicates based on matching name, gender, and date of birth—either by manually selecting patients or letting the system search for matches.
Once duplicates are identified, the user can review and select records for merging.
⑷ Accuro EMR Duplicates Identification and Merging
When a user identifies duplicate medical records, they can select the "Merge" action. This step is necessary to eliminate redundant records.
After selecting "Merge," the system provides a search function to locate the records to be merged. This ensures that users can accurately identify and select the correct duplicates.
Users must specify which record should be retained as the primary record and which one will be merged. This step is essential to preserve the most accurate and complete patient information while consolidating duplicates.
Before proceeding, the system prompts the user to confirm whether they want to continue with the merge. This prevents accidental merging and gives users a chance to review their selection.
If any conflicting information exists between the records (such as different addresses or contact details), a pop-up message appears, requiring the user to verify and resolve the discrepancies. This ensures that only accurate and validated data is retained in the final merged record.
⑸ Bahmni (WIP)
→ https://bahmni.atlassian.net/browse/BAH-460
⑹ AI/ML Approach for Identifying Duplicate Patients in an EMR System (Future Iteration)
AI/ML Promises:
Machine learning (ML) models help identify duplicate patient records by analyzing data patterns and correlations.
Real-Time Prediction & Matching – AI scans records dynamically as new data is entered, flagging possible duplicates before they are created.
Data Correlation & Pattern Recognition – Identifies similarities across multiple fields (e.g., names, phone numbers, addresses) to detect duplicates with high accuracy.
Probabilistic Matching & Fuzzy Logic – Uses advanced matching techniques to detect duplicates even with minor variations in names, dates of birth, or other identifiers.
Machine Learning Model Training – Continuously improves accuracy by learning from past duplicate resolutions and user feedback.
No Need for Rule-Based Maintenance – Unlike traditional systems, AI models adapt and improve over time without requiring manual rule creation/maintenance.
User Confirmation & Control – Users can confirm, reject, or mark potential duplicates as uncertain, ensuring human oversight in decision-making.
Scalability & Efficiency – Works effectively with large healthcare datasets, ensuring duplicate detection even in high-volume environments.
Minimal Data Normalization Requirements – Can process raw data without extensive pre-processing, making implementation easier.
Flexible Application – Supports any data object within the EMR system.
4. Technical Considerations & Dependencies
Here are the technical considerations and dependencies for preventing and addressing duplicate patient registrations in OpenMRS-based systems:
Real-Time Duplicate Detection: Use a rule-based deterministic approach to compare new entries with existing patient records based on:
Name
Sex
Date of Birth/Age
Address
System Alerts and Warnings: Display real-time warnings or prompts when a potential duplicate is detected.
Require user confirmation before proceeding with registration if a possible match is found.
Patient Merge Tool: Provide a merge feature/action that allows authorized users to search for and combine duplicate records while:
Preserving historical data (e.g., clinical encounters, lab results).
Maintaining both EMR IDs for reference if needed.
Allow side-by-side comparison of potential duplicates.
Audit Logs and Version Control: Maintain detailed audit logs of patient merges.
Preserve the audit trail of both records.
Apply role-based access control to prevent unauthorized merging.
Feature to detect and flag existing duplicate records: Generate reports for data managers to review and correct patient records.
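The report of existing duplicates mentioned in the last point could start from a simple grouping pass like the one below. It is a sketch under assumptions: the dictionary keys (`id`, `name`, `sex`, `birthdate`) and both function names are hypothetical, and only exact matches after basic normalization are grouped.

```python
from collections import defaultdict


def duplicate_key(patient: dict) -> tuple:
    """Build a comparison key from name, sex, and date of birth (illustrative fields)."""
    return (
        " ".join(patient["name"].lower().split()),
        patient["sex"].strip().upper(),
        patient["birthdate"],
    )


def find_existing_duplicates(patients: list) -> list:
    """Return groups of patient IDs that share a key, for a data manager to review."""
    groups = defaultdict(list)
    for patient in patients:
        groups[duplicate_key(patient)].append(patient["id"])
    # Only keys shared by two or more records are worth reporting.
    return [ids for ids in groups.values() if len(ids) > 1]
```

Each returned group is a candidate set for review, not a confirmed duplicate; the decision to merge stays with an authorized user.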
5. Sketches / Design Ideas
→ https://docs.google.com/presentation/d/1duFitDOpXFwE91NjvCD7pyZ4iXlUmt1ANDPPx6a1xps/edit?usp=sharing
⑴ First Iteration Proposal: Deterministic Matching
For the initial implementation, duplicate detection will be based on a rule-based deterministic approach using the following key identifiers:
Name – Matching will consider exact or near-exact spelling (e.g., ignoring case differences and minor typos).
Date of Birth/Age – Matches will be checked for exact DOB where available, or an age range where precise DOB is missing.
Sex – Used as an additional matching criterion to refine potential duplicates.
Address Details – Matching will use normalized address fields (e.g., city/village, district, or region) to reduce inconsistencies due to formatting variations.
This approach is relatively simple to implement because it relies on structured fields commonly and consistently captured in patient records.
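The four rules above could be combined as in the following sketch. The thresholds are assumptions chosen for illustration (at most one character edit in the name, ages within one year when a DOB is missing), as are the dictionary keys and function names; a real implementation would query the patient database rather than compare dictionaries.

```python
from datetime import date


def normalize(text: str) -> str:
    # Ignore case, surrounding whitespace, and repeated internal spaces.
    return " ".join(text.lower().split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def age_on(birthdate: date, today: date) -> int:
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def birth_matches(a: dict, b: dict, today: date, age_tolerance: int = 1) -> bool:
    # Exact DOB comparison when both records have one.
    if a.get("birthdate") and b.get("birthdate"):
        return a["birthdate"] == b["birthdate"]
    # Otherwise fall back to comparing ages within a tolerance.
    ages = []
    for record in (a, b):
        if record.get("birthdate"):
            ages.append(age_on(record["birthdate"], today))
        elif record.get("age") is not None:
            ages.append(record["age"])
        else:
            return False
    return abs(ages[0] - ages[1]) <= age_tolerance


def is_deterministic_match(a: dict, b: dict, today: date, max_typos: int = 1) -> bool:
    return (
        edit_distance(normalize(a["name"]), normalize(b["name"])) <= max_typos
        and a["sex"].upper() == b["sex"].upper()
        and birth_matches(a, b, today)
        and normalize(a["village"]) == normalize(b["village"])
    )
```

Requiring all four rules to pass keeps false alerts low but will miss records with larger spelling differences, which is the gap the later probabilistic iteration is meant to close.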
⑵ Future Iterations: Probabilistic Matching
Once the deterministic approach is validated and operational, a more advanced probabilistic matching model can be considered. This may include:
Applying machine learning or fuzzy matching algorithms to improve accuracy.
Incorporating additional data points such as phone numbers, national IDs, or historical visit records to enhance match confidence.
Using weighted scoring methods to account for slight variations in names, addresses, and typos.
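A weighted scoring method of the kind listed above might look like the sketch below. The weights and field names are assumptions for illustration only, and string similarity comes from Python's standard `difflib.SequenceMatcher` rather than a trained model.

```python
from difflib import SequenceMatcher

# Assumed weights; a real deployment would tune these against known duplicates.
WEIGHTS = {"name": 0.5, "birthdate": 0.3, "sex": 0.1, "phone": 0.1}


def similarity(a, b) -> float:
    """Similarity in [0, 1]; a missing value on either side contributes nothing."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def match_score(new: dict, existing: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of per-field similarities; 1.0 means every field is identical."""
    return sum(
        weight * similarity(new.get(name), existing.get(name))
        for name, weight in weights.items()
    )
```

Unlike the deterministic rules, a small spelling difference lowers the score slightly instead of rejecting the match outright, so candidates can be ranked and compared against a review threshold.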
⑶ Duplicate Detection Mockups
When a user navigates to Registration → Create New and begins entering patient details, the system automatically searches for similar existing records.
If a potential match is found, a UI notification appears displaying the message: "2 similar patient(s) found. Please review before you create a new patient."
The review action should present a list of potential matches, prioritizing the closest matches and displaying up to five suggestions.
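The ranking and notification behavior described above could be sketched as follows. The function names, the `(patient_id, score)` input format, and the score threshold are assumptions; the message wording follows the mockup text.

```python
import heapq


def top_matches(scored: list, limit: int = 5, threshold: float = 0.0) -> list:
    """Return up to `limit` (patient_id, score) pairs, closest match first."""
    eligible = [item for item in scored if item[1] >= threshold]
    return heapq.nlargest(limit, eligible, key=lambda item: item[1])


def notification_text(count: int):
    """Build the alert message, or None when there is nothing to review."""
    if count == 0:
        return None
    return f"{count} similar patient(s) found. Please review before you create a new patient."
```

For example, `notification_text(len(top_matches(scored, threshold=0.5)))` would produce the alert for the suggestions actually shown to the user.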
⑷ Patient Merge Interface Mockups
When duplicate patient records are detected, a Patient Merge Interface is necessary to facilitate review and merging. This interface should provide healthcare providers or data managers with an intuitive way to compare records and make informed decisions on merging duplicate entries.
The interface displays two or more patient records side by side for easy comparison.
The interface provides two action buttons:
A red "Cancel" button to abort the merge process.
A green "Merge" button to combine the selected patient records.
6. Existing Technical Assets
Legacy Admin module that detects similar patients for merging (can likely re-use same logic): ______
PIH’s Name Phonetics OMOD: https://github.com/openmrs/openmrs-module-namephonetics
AMPATH's code for the Registration Module (can review to see how they are seeking matches): _______
7. QA Plan
Test Scenarios & Cases:
⑴ Duplicate Detection During Registration
Test Case | Expected Outcome |
---|---|
User enters a new patient’s details | No duplicate message appears, allowing registration to proceed. |
User enters a patient’s details that partially match an existing record | A UI notification appears: "x similar patient(s) found. Please review before you create a new patient." |
User clicks on the review button after receiving the duplicate notification | A list of up to five potential duplicate matches is displayed. |
User clicks on a suggested duplicate record | The system navigates to the existing patient profile instead of creating a new record. |
User ignores the warning and proceeds to create a new patient | The system allows the action but logs it in the audit trail. |
⑵ Patient Merge Interface
Test Case | Expected Outcome |
---|---|
User selects two duplicate records for merging | Both records appear side by side for easy comparison. |
User clicks the "Cancel" button | The merge process is aborted, and both records remain unchanged. |
User clicks the "Merge" button without resolving conflicts | System prompts user to confirm which conflicting fields should be retained. |
User merges records successfully | |
System merges two records with different unique identifiers | System retains one identifier and logs the previous ones in merge history. |
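
The conflict-resolution case in the table above (clicking "Merge" without resolving conflicts) could be exercised against a helper like this one. It is a hypothetical sketch: `find_conflicts`, `resolve_merge`, and the dictionary-based records are illustrative, not part of any existing module.

```python
def find_conflicts(record_a: dict, record_b: dict, fields: list) -> dict:
    """Return fields where both records have a value and the values differ."""
    conflicts = {}
    for name in fields:
        value_a, value_b = record_a.get(name), record_b.get(name)
        if value_a and value_b and value_a != value_b:
            conflicts[name] = (value_a, value_b)
    return conflicts


def resolve_merge(record_a: dict, record_b: dict, fields: list, choices: dict) -> dict:
    """Build the merged field values; refuse to merge while conflicts are unresolved."""
    conflicts = find_conflicts(record_a, record_b, fields)
    unresolved = sorted(set(conflicts) - set(choices))
    if unresolved:
        raise ValueError(f"Unresolved conflicting fields: {unresolved}")
    merged = {}
    for name in fields:
        if name in conflicts:
            # The user's choice must be one of the two conflicting values.
            if choices[name] not in conflicts[name]:
                raise ValueError(f"Choice for {name!r} must come from one of the two records")
            merged[name] = choices[name]
        else:
            # No conflict: take whichever record has a value.
            merged[name] = record_a.get(name) or record_b.get(name)
    return merged
```

Raising an error for unresolved fields corresponds to the expected outcome where the system prompts the user to choose which conflicting values to retain.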
8. Safety & Tech Risks
Loss of Critical Patient Data
If merging is not handled correctly, essential clinical history, visit data, or identifiers may be overwritten or lost.
Risk Mitigation: Require user confirmation for field selection before merging.
Unauthorized Merging
If the merge functionality is not properly restricted, unauthorized users might merge records incorrectly.
Risk Mitigation: Restrict merging actions to authorized users and enforce role-based access control (RBAC).
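A role-based check of this kind could be as small as the sketch below. The role names, the privilege name, and the role-to-privilege mapping are all hypothetical placeholders, not the actual OpenMRS privilege configuration.

```python
# Hypothetical privilege and role names, for illustration only.
MERGE_PRIVILEGE = "Merge Patients"

ROLE_PRIVILEGES = {
    "Data Manager": {"Merge Patients", "View Patients"},
    "Receptionist": {"View Patients", "Register Patients"},
}


def can_merge(user_roles: list) -> bool:
    """True when at least one of the user's roles grants the merge privilege."""
    return any(MERGE_PRIVILEGE in ROLE_PRIVILEGES.get(role, set()) for role in user_roles)


def require_merge_privilege(user_roles: list) -> None:
    """Call before starting a merge; raises if the user is not authorized."""
    if not can_merge(user_roles):
        raise PermissionError("User is not authorized to merge patient records")
```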
Performance Bottlenecks
Real-time searches could slow down the system if not optimized, especially in large datasets.
Risk Mitigation: Use indexed searches, caching, and asynchronous processing to minimize lag.
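One way to keep real-time searches cheap is to narrow the candidate set with an index before running any detailed comparison. The sketch below uses an in-memory dictionary as a stand-in for a database index; the class name, the choice of sex and birth year as the index key, and the one-year window are assumptions for illustration.

```python
from collections import defaultdict


class CandidateIndex:
    """In-memory stand-in for a database index keyed on sex and birth year."""

    def __init__(self):
        self._buckets = defaultdict(list)

    @staticmethod
    def _key(patient: dict) -> tuple:
        return (patient["sex"].upper(), patient["birth_year"])

    def add(self, patient: dict) -> None:
        self._buckets[self._key(patient)].append(patient)

    def candidates(self, patient: dict, year_window: int = 1) -> list:
        """Return only records of the same sex born within `year_window` years."""
        sex = patient["sex"].upper()
        first = patient["birth_year"] - year_window
        last = patient["birth_year"] + year_window
        found = []
        for year in range(first, last + 1):
            # .get avoids creating empty buckets for years with no patients.
            found.extend(self._buckets.get((sex, year), []))
        return found
```

Detailed name and address comparison then runs only on the returned candidates instead of the whole patient table.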
Misidentification of Patients
If the system falsely detects duplicates or misses true duplicates, it could lead to incorrect patient records being merged or duplicated patients being registered.
Risk Mitigation: Implement probabilistic matching and strict validation rules to improve accuracy.
Algorithm Limitations
The matching algorithm might not accurately identify duplicates due to misspellings, name variations, or data entry errors.
Risk Mitigation: Utilize fuzzy matching and machine learning-based improvements.