Enhance Patient Matching Module Project

Enhance Patient Matching Module

Primary mentor

@Shaun Grannis

Backup mentor

@Burke Mamlin

Assigned to

@Former user (Deleted), @Willa Mhawila

Introduction

OpenMRS collects patient identifying data when registering a patient. Over time, errors creep into the records (e.g., a patient's name may be misspelled or a birth date is entered incorrectly), and these errors can result in duplicate patient records being created within the same OpenMRS installation.

Record Linkage, Patient Matching, and De-duplication

Record linkage is the task of identifying scattered pieces of information that refer to the same entity. Patient matching is the process of identifying records that belong to the same patient, either across different data sources or as duplicates within a single data source. De-duplication is the specific patient matching process that focuses on identifying links among patients within the same installation.

OpenMRS De-duplication Module

The current OpenMRS patient de-duplication module, which identifies potentially duplicate patients using a sophisticated probabilistic matching algorithm, requires multiple enhancements to ensure effective and efficient use in real-world installations. Consequently, we seek to enhance the current patient de-duplication module as described below.

Objectives

We have opportunities for innovation in a number of areas. The Summer of Code candidate will work with Shaun Grannis and James Egg to identify and prioritize a subset of the following innovations:

User Interface Innovations.
  • Add Merge Confirmation: Because a patient merge cannot be reversed programmatically, we should display a warning when users click the "patient merge" button. (PTM-37)

  • Configure an Analysis Server to Point to a Production Server: Add a URL property to the patient de-duplication module that defines which OpenMRS instance the links in the de-duplication report point to, instead of always defaulting to the local URL. (PTM-28)

  • Improved Manage Strategies Interface: (1) Highlight (either using bold or colored text) the fields used for matching so these "active" fields will be more readily recognized. (2) Change the list of available fields from three to two columns to avoid horizontal scrolling. (PTM-29, PTM-30)
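
The URL property described under "Configure an Analysis Server to Point to a Production Server" could behave roughly as sketched below. This is only an illustration in Python, not the module's code: the function name `patient_link`, the default local URL, and the `patientDashboard.form?patientId=` path are all assumptions made for the example.

```python
from urllib.parse import urlencode, urljoin

# Assumed default for illustration only; a real installation's local URL may differ.
LOCAL_BASE_URL = "http://localhost:8080/openmrs/"


def patient_link(patient_id, configured_base_url=None, local_base_url=LOCAL_BASE_URL):
    """Build a report link to a patient record.

    Uses the configured URL property when it is set, and falls back to
    the local instance when the property is missing or blank.
    """
    base = (configured_base_url or "").strip() or local_base_url
    # urljoin drops the last path segment unless the base ends with "/".
    if not base.endswith("/"):
        base += "/"
    # Hypothetical page path; the actual target page is an assumption.
    page = urljoin(base, "patientDashboard.form")
    return page + "?" + urlencode({"patientId": patient_id})
```

With the property left unset, `patient_link(42)` points at the local instance. With the property set to a production server, the same report links to that server instead.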

Matching Algorithm Enhancements.
  • Transposed Fields: Patients occasionally interchange or transpose their three names (Given Name, Middle Name, and Family Name). We seek to create a process in which transposed fields can be compared and used for matching. This will require (1) creating a single concatenated field containing the values from all three name fields, (2) using the Longest Common Subsequence field comparator to analyze the transposed fields, and (3) creating the ability to select and configure these fields on the "Manage Strategies" screen. (PTM-23)
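
Steps (1) and (2) of the Transposed Fields item can be sketched as follows. This is a minimal Python illustration, not the module's implementation: the helper names, the uppercase and space-separator normalization, and the score formula (twice the subsequence length divided by the combined length) are assumptions chosen for the example. The comparator shown is the textbook dynamic-programming longest common subsequence, which may differ from the module's own LCS comparator.

```python
def concatenate_names(given, middle, family):
    """Step (1): build a single field from the three name fields.

    Assumed normalization: trim, uppercase, skip empty parts, join with a space.
    """
    parts = [(p or "").strip().upper() for p in (given, middle, family)]
    return " ".join(p for p in parts if p)


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Step (2): score two concatenated fields between 0.0 and 1.0.

    Assumed formula: 2 * LCS length / (len(a) + len(b)).
    """
    if not a or not b:
        return 0.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


# Two records for the same person with the name fields transposed.
record_a = concatenate_names("John", "Michael", "Smith")
record_b = concatenate_names("Smith", "John", "Michael")
score = lcs_similarity(record_a, record_b)
```

Because a subsequence must preserve character order, a transposed record still scores well below 1.0 under this formula, so any match threshold for the concatenated field would need to be chosen with that in mind.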

Persisting Meta-Data.
  • Persist configuration data for each de-duplication report: Information such as the user that ran the report, a timestamp for the report, elapsed time for each sub-step of the analysis, and which strategies were used should be associated with the de-duplication report. (