⚠️ Prevent Duplicates at Registration

4 Must-Do’s

1. Problem Description: Have you clearly defined the user problem(s) you intend to solve, and what value this creates? Write down a story, user insight, or quote about this problem (this is important because (1) this will motivate your team, and (2) without this your problem might not actually be a big problem for the users themselves).
2. User Stories: Have you clearly written at least 3 user stories and use cases?
3. Market Analysis: Have you surveyed what the market is doing here (e.g. comparison to other EMRs, or paper approaches; and don’t forget about learning from historic/existing OMRS instances)? Have you written down any possible gaps in your understanding of your users or their workflows? Have you reviewed the topic in FHIR to see what requirements or fields the global community references? (E.g., if working on insurance, review the relevant FHIR resources.)
4. Technical Considerations & Dependencies: Have you outlined what you need from cross-functional areas for success of the feature? E.g. do you need the platform to support a new API call? Have you explained how you’ve addressed dev concerns, such as designs that may not be feasible, or will be extra time-intensive to implement? 

Optional/Encouraged

Sketches: Have you added a drawing or description of how the feature could work to solve the problem at hand? (Pictures of sketches are ok!) 
Project Management: Have you created the Epic and JIRA tasks so you can share work clearly?
Roll-out plan: Do you have an idea whether this will be an experiment, gradual roll out, and when? Have you added this to the timeline view? Have you planned how you will promote and/or work with communications folks in order to help this feature reach the widest audience and have the biggest impact it can?

Later but should do

QA Plan: Have you mentioned the plan for QA, such as how you will discover and address edge cases? Does your team/squad have a plan for automated tests to be added to new components (unit tests) or workflows (e2e tests)?
Safety & Tech Risks: Is there any reason you could regret rolling out this feature? (e.g. possible patient harm, heavy tech debt like introducing an unsupported library) Have you thought through the risks for this particular solution? And, how to reduce/address those? 


Summary:

  • OpenMRS needs a feature to prevent and address duplicate patient records. The feature will:

    1. Automatically search for existing records during patient registration and alert users of potential matches → based on name, date of birth/age, sex, and address details.

    2. Display potential duplicate matches clearly, prioritizing the closest matches.

    3. Provide an easy-to-use tool to compare and merge duplicate records into a single, complete patient record.

  • Rather than reinventing the wheel, OpenMRS will adapt and refine existing solutions such as PIH, AMRS POC (AMPATH), and KenyaEMR duplicate detection and merging functionalities.

  • Standardizing this feature within OpenMRS RefApp ensures a unified approach to handling duplicate patient records across different OpenMRS implementations.

1. Problem Statement: Duplicate Patient Records in Registration

Duplicate medical records arise when users unintentionally create multiple entries for the same patient. This often happens due to insufficient searches before selecting Create New, inconsistent name spellings, name variations, data entry errors, or demographic inconsistencies. Such duplicates fragment patient information across multiple records, making it challenging for healthcare providers to access a complete and accurate medical history.

Issues of Duplicate Patient Records:

  • Medical errors, such as administering the wrong treatment or medication.

  • Missed critical health information, affecting patient safety.

  • Inefficient workflows, as staff spend time reconciling or merging records.

  • Cluttered EMR systems, with unnecessary duplicate charts.

  • Inaccurate reporting numbers, since the same patient is counted more than once.

Common causes of duplicate records include:

  • Insufficient searches before new records are created.

  • Patients being known by different names, nicknames, or aliases. Example: Some patients may not know how to spell their name, and may give a different guessed spelling at each visit.

  • Misspellings, incorrect demographic information, and data entry errors. Example: In some countries, it is normal to spell a single name in 3+ different ways. E.g., in Rwanda, r’s and l’s are often swapped; PIH had to create a special module to handle this in patient searches for their Rwanda sites. In Sri Lanka, there are 3-4 acceptable ways to spell the name Jayasanka for the same person.

  • Lack of advanced identity management technologies like biometrics, which are not universally adopted (and biometrics have issues of their own, such as privacy concerns, laborers with worn-out fingerprints, and young children’s prints being hard to capture reliably, e.g. under about 5 years).

The Problem:

  • Currently, OpenMRS O3 RefApp does not have a built-in alert system to notify users when a newly registered patient closely matches an already-existing record. Without real-time duplicate detection, users cannot identify and review potential duplicate entries before they are created. For example, the two records below appear identical except for the system-generated OpenMRS ID.

  • The RefApp also lacks a feature to merge duplicate patient records into a single comprehensive record once such duplicates are identified.

image-20250312-121310.png
image-20250312-121231.png

Our Goal: The OpenMRS O3 RefApp should include a feature that actively detects potential duplicate records during patient registration.

  • This functionality would enable the system to analyze entered patient details—such as name, date of birth, and other demographic information—when creating new records.

  • As users input data, the system should conduct searches against existing records to identify similarities.

  • If possible matches are detected, a notification should alert the user, prompting them to review the suggested duplicates before finalizing the registration.

2. User Stories: Preventing Duplicate Medical Records

  • As a Receptionist, I want the system to suggest possible duplicate records when I register a new patient so that I can verify whether the patient already exists and avoid accidentally creating a duplicate record.

  • As a Clinician, I want to access a single, complete patient record so that I can make informed treatment decisions without missing critical health information.

  • As a Health Information Officer, I want a feature to help detect and merge duplicate records so that patient data remains accurate and complete, and I can reassure Government and Funders that our reported numbers are correct and do not contain duplicates.

  • As a System Administrator, I want to configure duplicate-checking rules based on multiple criteria (name, date of birth, ID number, address details) so that the system accurately detects potential duplicates.

3. Market Analysis

🟢 Duplicates Prevention

  1. Duplicate Detection & Alerts: System prompts an alert when a new entry closely matches an existing record.

  2. Purpose: To identify potential duplicate patient records during the registration process by analyzing demographic information such as name, date of birth/age, sex, and address.

    • When a new registration closely resembles an existing patient, the system triggers an alert so that healthcare staff can review the match and determine whether the new entry is a duplicate. Staff can then merge or create a new record accordingly. This helps prevent duplicate records, reduces administrative burden, and keeps patient information accurate and up to date.

  3. Feature Functionality:

    • Real Time Alerts:

      • When a user attempts to register a new patient, the system automatically checks for existing records with similar details (e.g., name, date of birth/age, sex, address).

      • If a match is found, a warning alert is displayed, prompting the user to review the potential duplicate before proceeding.

    • Configurable Matching Criteria:

      • Allows administrators to define which fields should be considered for duplicate detection (i.e., demographic details).

    • Review & Confirmation Process:

      • Provides a side-by-side comparison of the new and existing records.

      • Users can confirm if the new entry is a duplicate or proceed with registration if it is a different patient.
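The configurable matching criteria described above could be sketched roughly as follows. This is a minimal illustration, not an OpenMRS API: the `MatchConfig` class, the field names, and the dictionary record shape are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchConfig:
    """Hypothetical admin-configurable setting: which fields must agree."""
    fields: tuple = ("given_name", "family_name", "sex", "birthdate")


def normalize(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return str(value or "").strip().casefold()


def agrees(new_patient, record, field):
    """A field agrees only when it is filled in and equal after normalization."""
    a = normalize(new_patient.get(field))
    b = normalize(record.get(field))
    return a != "" and a == b


def find_candidates(new_patient, existing_patients, config):
    """Return existing records that agree with the new entry on every configured field."""
    return [
        record
        for record in existing_patients
        if all(agrees(new_patient, record, field) for field in config.fields)
    ]
```

Calling `find_candidates(new_patient, existing_patients, MatchConfig())` with the default fields gives the strictest check; an administrator who passes a shorter `fields` tuple would get more (and looser) alerts.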

🟠 Duplicates Merging

  1. Duplicates Merging: System facilitates merging of duplicate patient records when identified after registration.

  2. Purpose: The merging feature ensures that authorized users can efficiently consolidate duplicate patient records when identified after registration. The goal is to maintain a single, accurate patient record while preserving critical data.

  3. Feature Functionality

  • Merging Feature:

    • This would allow authorized users to merge patient records if a duplicate is detected after registration.

    • While merging, the system should ensure that historical data is preserved, i.e., clinical data, visit history, and identifiers are retained in the merged record.

  • Audit Trail & Reporting:

    • Logs all duplicate detection and resolution actions for accountability.

    • Generates reports on detected duplicates, merged records, and unresolved potential duplicates for system monitoring.
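To make the merge and audit requirements above concrete, here is a minimal sketch of a merge that keeps identifiers and visit history from both records and writes an audit entry. The record shape (`id`, `identifiers`, `visits`) and the `merge_patients` helper are hypothetical and do not reflect the OpenMRS data model.

```python
from datetime import datetime, timezone


def merge_patients(preferred, duplicate, audit_log, user):
    """Merge `duplicate` into `preferred` without discarding history."""
    merged = dict(preferred)
    # Keep every identifier from both records, preferred first, without repeats.
    merged["identifiers"] = list(
        dict.fromkeys(preferred["identifiers"] + duplicate["identifiers"])
    )
    # Combine visit history from both records, ordered by visit date.
    merged["visits"] = sorted(
        preferred["visits"] + duplicate["visits"], key=lambda visit: visit["date"]
    )
    # Record who merged what, and when, for accountability.
    audit_log.append(
        {
            "action": "merge",
            "user": user,
            "kept": preferred["id"],
            "retired": duplicate["id"],
            "at": datetime.now(timezone.utc).isoformat(),
        }
    )
    return merged
```

The sketch returns a new record and leaves both inputs untouched, which makes it easier to review or undo a merge; a real implementation would also need to move encounters, observations, and other clinical data.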

🔵 Existing Solutions

  1. O2 RefApp + PIH

  2. AMRS POC (AMPATH)

  3. KenyaEMR

  4. Accuro EMR

  5. Bahmni (WIP)

  6. AI/ML Approach

PIH Duplicate Detection & Alert:

  1. When the user inputs the patient's registration details, the system detects and alerts the user about potential duplicates based on matching name, sex, and date of birth (DOB)/age.

  2. If the user continues with patient creation, duplicate records are generated.

  3. When a user searches for a record by ID or patient name, possible duplicates are identified and the user is alerted so they can choose the records to merge.

  4. Before merging, the user must select the preferred record.

  5. Notably, merging retains both EMR IDs assigned to the patient.

image-20250313-130327.png
❶ Warning Alert
image-20250313-130436.png
❷ Merge Patient Feature
image-20250313-130524.png
❸ Duplicate Records
image-20250313-130642.png
❹ Merging Feature
image-20250313-131058.png
❺ Merging
image-20250313-130915.png
❻ Merged Records

AMRS POC (AMPATH) Duplicates Detection and Alert:

  1. When the user inputs the patient's registration details, the system identifies and flags potential duplicates based on matching name, gender, and age.

  2. The user is prompted to review and confirm the details before proceeding with patient creation.

  3. If the patients differ, the user can return to the registration process and continue entering the new patient’s details.

image-20250314-070925.png
❶ Patient Registration
image-20250314-071044.png
❷ Flagging of Patients with similar details (name, gender, age)
image-20250314-071226.png
❸ Verification/Review of Patient

KenyaEMR Patient Duplicates Merge Feature:

  1. If duplicate patient records are created, the KenyaEMR Record Merge tool enables consolidation into a single record while preserving all historical data.

  2. This feature allows users to identify potential duplicates based on matching name, gender, and date of birth—either by manually selecting patients or letting the system search for matches.

  3. Once duplicates are identified, the user can review and select records for merging.

image-20250312-093029.png
❶ Matching Fields
❷ Flagging of Duplicates
image-20250312-092038.png
❸ Merging of Duplicates

 

Accuro EMR Duplicates Identification and Merging:

  1. When a user identifies duplicate medical records, they can select the "Merge" action → This step is necessary to eliminate redundant records.

  2. After selecting "Merge," the system provides a search function to locate the records to be merged → This ensures that users can accurately identify and select the correct duplicates.

  3. Users must specify which record should be retained as the primary record and which one will be merged → This step is essential to preserve the most accurate and complete patient information while consolidating duplicates.

  4. Before proceeding, the system prompts the user to confirm whether they want to continue with the merge → This prevents accidental merging and gives users a chance to review their selection.

  5. If any conflicting information exists between the records (such as different addresses or contact details), a pop-up message appears, requiring the user to verify and resolve the discrepancies. This ensures that only accurate and validated data is retained in the final merged record.

image-20250324-053027.png
❶ Duplicates Identification
image-20250324-055036.png
❷ Merge Patient Action
image-20250324-053245.png
❸ Selecting the record to keep before Merging
image-20250324-053319.png
❹ Merging Duplicate Patients

Bahmni (WIP)

https://bahmni.atlassian.net/browse/BAH-460

 

AI/ML Approach for Identifying Duplicate Patients in an EMR System (Future Iteration)

AI/ML Promises:

  1. Machine learning (ML) models help identify duplicate patient records by analyzing data patterns and correlations.

  2. Real-Time Prediction & Matching – AI scans records dynamically as new data is entered, flagging possible duplicates before they are created.

  3. Data Correlation & Pattern Recognition – Identifies similarities across multiple fields (e.g., names, phone numbers, addresses) to detect duplicates with high accuracy.

  4. Probabilistic Matching & Fuzzy Logic – Uses advanced matching techniques to detect duplicates even with minor variations in names, dates of birth, or other identifiers.

  5. Machine Learning Model Training – Continuously improves accuracy by learning from past duplicate resolutions and user feedback.

  6. No Need for Rule-Based Maintenance – Unlike traditional systems, AI models adapt and improve over time without requiring manual rule creation/maintenance.

  7. User Confirmation & Control – Users can confirm, reject, or mark potential duplicates as uncertain, ensuring human oversight in decision-making.

  8. Scalability & Efficiency – Works effectively with large healthcare datasets, ensuring duplicate detection even in high-volume environments.

  9. Minimal Data Normalization Requirements – Can process raw data without extensive pre-processing, making implementation easier.

  10. Flexible Application – Supports any data object within the EMR system.

 

image-20250324-073344.png
❶ Example of Rule-based Approach ~ Rules, Constraints, Patterns
image-20250324-075015.png
❷ ML Training/Creating the Model

 

image-20250324-074704.png
❸ How do machines know if names are similar?
image-20250324-080110.png
❹ Manual Merge

4. Technical Considerations & Dependencies

Here are the technical considerations and dependencies for preventing and addressing duplicate patient registrations in OpenMRS-based systems:

  • Real-Time Duplicate Detection: Use the rule-based deterministic approach to compare new entries with existing patient records based on:

    • Name

    • Sex

    • Date of Birth/Age

    • Address

  • System Alerts and Warnings: Display real-time warnings or prompts when a potential duplicate is detected.

    • Require user confirmation before proceeding with registration if a possible match is found.

  • Patient Merge Tool: Provide a merge feature/action that allows authorized users to search for and combine duplicate records while:

    • Preserving historical data (e.g., clinical encounters, lab results).

    • Maintaining both EMR IDs for reference if needed.

  • Allow side-by-side comparison of potential duplicates.

  • Audit Logs and Version Control: Maintain detailed audit logs of patient merges.

    • Preserve the audit trail of both records.

  • User role-based access control to prevent unauthorized merging.

  • Feature to detect and flag existing duplicate records: Generate reports for data managers to review and correct patient records.

5. Sketches / Design Ideas

https://docs.google.com/presentation/d/1duFitDOpXFwE91NjvCD7pyZ4iXlUmt1ANDPPx6a1xps/edit?usp=sharing

⑴ First Iteration Proposal: Deterministic Matching

For the initial implementation, duplicate detection will be based on a rule-based deterministic approach using the following key identifiers:

  1. Name – Matching will consider exact or near-exact spelling (e.g., ignoring case differences and minor typos).

  2. Date of Birth/Age – Matches will be checked for exact DOB where available, or an age range where precise DOB is missing.

  3. Sex – Used as an additional matching criterion to refine potential duplicates.

  4. Address Details – Matching will use normalized address fields (e.g., city/village, district, or region) to reduce inconsistencies due to formatting variations.

This approach is relatively simple to implement because it relies on structured fields commonly and consistently captured in patient records.
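A minimal sketch of the four deterministic rules above, under stated assumptions: records are plain dictionaries with hypothetical keys (`name`, `birthdate`, `age`, `sex`, `village`), "minor typos" is read as an edit distance of at most 1, and the age tolerance of 2 years is an illustrative value, not a decided one.

```python
from datetime import date


def normalize_text(value):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(str(value or "").casefold().split())


def edit_distance(a, b):
    """Levenshtein distance: single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def names_match(a, b, max_typos=1):
    """Exact or near-exact spelling, ignoring case."""
    return edit_distance(normalize_text(a), normalize_text(b)) <= max_typos


def birth_year(record, today):
    """Birth year from the DOB if known, otherwise estimated from the age."""
    if record.get("birthdate"):
        return record["birthdate"].year
    if record.get("age") is not None:
        return today.year - record["age"]
    return None


def birth_matches(a, b, today, tolerance_years=2):
    """Exact DOB when both are known, otherwise an age range."""
    if a.get("birthdate") and b.get("birthdate"):
        return a["birthdate"] == b["birthdate"]
    year_a, year_b = birth_year(a, today), birth_year(b, today)
    if year_a is None or year_b is None:
        return False
    return abs(year_a - year_b) <= tolerance_years


def is_potential_duplicate(a, b, today):
    """All four rules must hold for the pair to be flagged."""
    return (
        names_match(a.get("name"), b.get("name"))
        and birth_matches(a, b, today)
        and normalize_text(a.get("sex")) == normalize_text(b.get("sex"))
        and normalize_text(a.get("village")) == normalize_text(b.get("village"))
    )
```

Requiring all four rules keeps false alerts low but will miss duplicates where, for example, the patient has moved village; whether address should be mandatory or only a tie-breaker is a design choice left open here.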

⑵ Future Iterations: Probabilistic Matching

Once the deterministic approach is validated and operational, a more advanced probabilistic matching model can be considered. This may include:

  • Applying machine learning or fuzzy matching algorithms to improve accuracy.

  • Incorporating additional data points such as phone numbers, national IDs, or historical visit records to enhance match confidence.

  • Using weighted scoring methods to account for slight variations in names, addresses, and typos.
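As an illustration of the weighted scoring idea above, the sketch below combines per-field string similarity into a single score between 0 and 1. The weights and field names are placeholders chosen for the example, and Python's standard `difflib.SequenceMatcher` stands in for whatever fuzzy matching algorithm is eventually selected.

```python
from difflib import SequenceMatcher

# Hypothetical weights; real values would need tuning against known duplicates.
WEIGHTS = {"name": 0.5, "birthdate": 0.3, "sex": 0.1, "phone": 0.1}


def similarity(a, b):
    """String similarity in [0, 1]; a missing value contributes nothing."""
    a = str(a or "").casefold().strip()
    b = str(b or "").casefold().strip()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_score(new_patient, existing, weights=WEIGHTS):
    """Weighted sum of per-field similarities."""
    return sum(
        weight * similarity(new_patient.get(field), existing.get(field))
        for field, weight in weights.items()
    )
```

A threshold on `match_score` (for example, alert above some cut-off) would replace the all-or-nothing rules of the first iteration; choosing that cut-off is exactly the part that needs validation with real data.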

⑶ Duplicate Detection Mockups

  1. When a user navigates to Registration → Create New and begins entering patient details, the system automatically searches for similar existing records.

  2. If a potential match is found, a UI notification appears on the screen displaying the message: "2 similar patient(s) found. Please review before you create a new patient."

  3. The review action should present a list of potential matches, prioritizing the closest matches and displaying up to five suggestions.

image-20250326-125649.png
❶ Duplicate Detection and Alert!
image-20250326-125747.png
❷ Duplicates Review Action
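The notification and review list in the mockups could be driven by logic along these lines. This is a sketch only: it assumes candidates arrive as `(score, record)` pairs from some matching step, and the function names are made up for the example.

```python
def top_matches(scored_candidates, limit=5):
    """Return the closest matches first, capped at `limit` suggestions."""
    ranked = sorted(scored_candidates, key=lambda pair: pair[0], reverse=True)
    return [record for _, record in ranked[:limit]]


def notification_text(count):
    """Message for the alert banner, or None when nothing similar was found."""
    if count == 0:
        return None
    return (
        f"{count} similar patient(s) found. "
        "Please review before you create a new patient."
    )
```

Whether the banner count should reflect all matches or only the five shown is an open UX question; the sketch keeps the two functions separate so either choice works.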

⑷ Patient Merge Interface Mockups

When duplicate patient records are detected, a Patient Merge Interface is necessary to facilitate review and merging. This interface should provide healthcare providers or data managers with an intuitive way to compare records and make informed decisions on merging duplicate entries.

  1. The interface displays two or more patient records side by side for easy comparison.

  2. The interface provides two action buttons:

    • A red "Cancel" button to abort the merge process.

    • A green "Merge" button to combine the selected patient records.

image-20250326-124952.png
❶ Manage Duplicates ~ Patient Merge Feature
image-20250326-124820.png
❷ Patient Merge Interface
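For the side-by-side comparison, the interface needs to know which fields disagree between the two records so the user can choose a value before merging. A minimal sketch, assuming dictionary records and hypothetical helper names:

```python
def conflicting_fields(record_a, record_b, fields):
    """Return {field: (value_a, value_b)} where both records have a value and they differ."""
    conflicts = {}
    for field in fields:
        a, b = record_a.get(field), record_b.get(field)
        if a and b and str(a).strip().casefold() != str(b).strip().casefold():
            conflicts[field] = (a, b)
    return conflicts


def can_merge(conflicts, resolutions):
    """Allow the merge only once the user has picked a value for every conflict."""
    return all(field in resolutions for field in conflicts)
```

Fields that are empty on one side are not treated as conflicts here, on the assumption that the filled-in value can simply be kept.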

6. Existing Technical Assets

  • Legacy Admin module that detects similar patients for merging (can likely re-use same logic): ______

  • PIH’s Name Phonetics OMOD: https://github.com/openmrs/openmrs-module-namephonetics

  • Ampath’s code for the Registration Module: (can review to see how they are seeking matches) _______
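To illustrate the general idea behind phonetic name matching (this is not the algorithm used by the Name Phonetics module, whose internals are not described in this document), a toy key function can fold letters that are commonly interchanged, such as r and l, so that spelling variants compare equal. The `phonetic_key` function is hypothetical.

```python
def phonetic_key(name):
    """Toy key: lowercase letters only, 'l' folded into 'r', repeated letters collapsed."""
    key = []
    for ch in name.casefold():
        if not ch.isalpha():
            continue
        if ch == "l":
            ch = "r"
        if not key or key[-1] != ch:
            key.append(ch)
    return "".join(key)


def sounds_alike(name_a, name_b):
    """Two names are treated as variants when their keys are identical."""
    return phonetic_key(name_a) == phonetic_key(name_b)
```

With this key, "Karekezi" and "Kalekezi" are treated as the same name. The folding rules are locale-specific, so any real rule set would need to be configurable per implementation.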

7. QA Plan

Test Scenarios & Cases:

⑴ Duplicate Detection During Registration

Test Case | Expected Outcome
--- | ---
User enters a new patient’s details | No duplicate message appears, allowing registration to proceed.
User enters a patient’s details that partially match an existing record | A UI notification appears: "x similar patient(s) found. Please review before you create a new patient."
User clicks on the review button after receiving the duplicate notification | A list of up to five potential duplicate matches is displayed.
User clicks on a suggested duplicate record | The system navigates to the existing patient profile instead of creating a new record.
User ignores the warning and proceeds to create a new patient | The system allows the action but logs it in the audit trail.
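Some of the registration test cases above could be automated as unit tests. The sketch below uses Python's `unittest` against a hypothetical `register` stand-in (exact name and birthdate match only); in practice the tests would target the real registration component and its chosen matching rules.

```python
import unittest


def register(new_patient, existing, audit_log, confirmed=False):
    """Hypothetical stand-in for the registration flow under test."""
    matches = [
        p for p in existing
        if p["name"].casefold() == new_patient["name"].casefold()
        and p["birthdate"] == new_patient["birthdate"]
    ]
    if matches and not confirmed:
        # Block creation until the user has reviewed the suggestions.
        return {"created": False, "matches": matches}
    if matches:
        # The user chose to proceed anyway, so the override is logged.
        audit_log.append({"action": "duplicate_warning_overridden",
                          "name": new_patient["name"]})
    return {"created": True, "matches": matches}


class DuplicateDetectionTests(unittest.TestCase):
    def setUp(self):
        self.existing = [{"name": "Amina Otieno", "birthdate": "1990-04-12"}]
        self.audit_log = []

    def test_new_patient_registers_without_warning(self):
        result = register({"name": "Grace Mwangi", "birthdate": "2001-07-30"},
                          self.existing, self.audit_log)
        self.assertTrue(result["created"])
        self.assertEqual(result["matches"], [])

    def test_matching_details_trigger_review(self):
        result = register({"name": "amina otieno", "birthdate": "1990-04-12"},
                          self.existing, self.audit_log)
        self.assertFalse(result["created"])
        self.assertEqual(len(result["matches"]), 1)

    def test_override_is_logged(self):
        result = register({"name": "Amina Otieno", "birthdate": "1990-04-12"},
                          self.existing, self.audit_log, confirmed=True)
        self.assertTrue(result["created"])
        self.assertEqual(len(self.audit_log), 1)
```

The navigation and list-display cases depend on the UI and are better covered by e2e tests.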

⑵ Patient Merge Interface

Test Case | Expected Outcome
--- | ---
User selects two duplicate records for merging | Both records appear side by side for easy comparison.
User clicks the "Cancel" button | The merge process is aborted, and both records remain unchanged.
User clicks the "Merge" button without resolving conflicts | System prompts user to confirm which conflicting fields should be retained.
User merges records successfully | One consolidated record is created; visit history, clinical data, and identifiers are preserved; the action is logged in the audit trail.
System merges two records with different unique identifiers | System retains one identifier and logs the previous ones in merge history.

8. Safety & Tech Risks

  • Loss of Critical Patient Data

    • If merging is not handled correctly, essential clinical history, visit data, or identifiers may be overwritten or lost.

    • Risk Mitigation: Require user confirmation for field selection before merging.

  • Unauthorized Merging

    • If the merge functionality is not properly restricted, unauthorized users might merge records incorrectly.

    • Risk Mitigation: Restrict merging actions to authorized users and enforce role-based access control (RBAC).

  • Performance Bottlenecks

    • Real-time searches could slow down the system if not optimized, especially in large datasets.

    • Risk Mitigation: Use indexed searches, caching, and asynchronous processing to minimize lag.

  • Misidentification of Patients

    • If the system falsely detects duplicates or misses true duplicates, it could lead to incorrect patient records being merged or duplicated patients being registered.

    • Risk Mitigation: Implement probabilistic matching and strict validation rules to improve accuracy.

  • Algorithm Limitations

    • The matching algorithm might not accurately identify duplicates due to misspellings, name variations, or data entry errors.

    • Risk Mitigation: Utilize fuzzy matching and machine learning-based improvements.
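One way to address the Performance Bottlenecks risk listed above is to avoid comparing a new entry against every patient. The sketch below groups records under a coarse "blocking" key so that only one small bucket is compared in detail. The key choice (first letter of family name, birth year, sex), the field names, and the `CandidateIndex` class are assumptions for illustration; a production system would more likely rely on database indexes.

```python
from collections import defaultdict


def blocking_key(record):
    """Coarse key: records with different keys are never compared in detail."""
    return (
        record["family_name"].strip().casefold()[:1],
        record["birth_year"],
        record["sex"].strip().casefold(),
    )


class CandidateIndex:
    """In-memory index from blocking key to the records that share it."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def add(self, record):
        self._buckets[blocking_key(record)].append(record)

    def candidates(self, record):
        """Records worth a detailed comparison against `record`."""
        return list(self._buckets.get(blocking_key(record), []))
```

The trade-off is recall: a typo in the first letter of the family name or a wrong birth year puts a true duplicate in a different bucket, so the key should be built from the fields least likely to vary.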

 
