Data Quality
Facilitator: Evan Waters
Notetaker: Christian Neumann
What makes "good" data quality? What are we trying to achieve and capture?
What actually happened
Accurate data
Timely
Complete
Consistent
Legible
Used (usable) -> for decision making
Reusable
Relevant
Necessary (only)
What are the challenges? What leads to poor data quality? What can get in the way?
Too much data... (impacts the quality)
The inverse of every good data quality attribute listed above
Lack of understanding, training
Process, capacity
Lack of standard definitions
Illegible
Machine errors
Incomplete and lost data
Transcription errors
Redundancies
What are the problems and possible solutions?
Sensitize on data quality
Overcome transcription errors with education
Measure impact
Overcome Transit errors (time, distance) by reducing steps in process, being more timely, mobile platform
Decide who is responsible for checking data. Everybody can check it, not just the data manager
Define categories of data quality areas and use different interventions to overcome the errors by different people
Paper-based forms can get lost; having 2 forms filled out by the responsible persons provides a fallback in case the initial form is lost or entered differently
Gap of time and distance
Community data (more mobile) vs. clinical data (more static)
Complexity in forms
Depending on environment patients might reuse Identifiers (to save costs, black market, privacy, ...)
Problem of uniquely identifying persons, e.g. with a check digit in the identifier, barcodes
Name spelling, Soundex, birthdate, biometrics (fingerprint)
Additional Identification by Secret questions & answers
Tradeoff between Privacy and scaled-up National unique patient identification
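The check digit idea above can be sketched with the Luhn algorithm, one common scheme that catches single-digit typos and most swaps of adjacent digits. This is only an illustration: it assumes purely numeric identifiers, and the function names are hypothetical, not taken from any OpenMRS module.

```python
def luhn_check_digit(base: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost base digit is doubled first.
    for i, ch in enumerate(reversed(base)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return (10 - total % 10) % 10


def is_valid(identifier: str) -> bool:
    """True if the last digit of identifier is the Luhn check digit of the rest."""
    if len(identifier) < 2 or not identifier.isdigit():
        return False
    base, check = identifier[:-1], identifier[-1]
    return luhn_check_digit(base) == int(check)


# Hypothetical identifier: base "7992739871" plus its computed check digit.
full_id = "7992739871" + str(luhn_check_digit("7992739871"))
print(full_id, is_valid(full_id))
```

A check digit only detects entry errors. It does not address reused identifiers or the privacy tradeoff noted above.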
What is "necessary data"?
1. Key logistics: like drug box (stocks): How much medication has been used, how much is left?
2. M&E data, data for funders
Depends on the audience, e.g. Government, Facilities, ..., Task: Identify customers
Minimum data set, task: Who defines minimum data set?
Relationship information crucial for preventions, e.g. kids of an HIV mother
Coordinating at the national level, harmonizing data sets
How do we measure "data quality"?
Quarterly assessments, but who is actually doing these formal measurements?
Validation against forms, but this needs access to the paper-based forms
Look at completeness of data in the forms
Random samples by data entry, e.g. check 20% of the data 3x a week
Keeping log of errors for each data assistant
Incentive for staff if error rate is low
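The random sampling and per-assistant error log above could look roughly like the following. This is a minimal sketch under stated assumptions: the 20% fraction comes from the notes, while the function names and the shape of the review log (pairs of assistant name and whether an error was found) are hypothetical.

```python
import random
from collections import defaultdict


def sample_for_review(record_ids, fraction=0.2, seed=None):
    """Pick a random subset of entered records to re-check against the paper forms."""
    ids = list(record_ids)
    if not ids:
        return []
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def error_rates(review_log):
    """review_log: iterable of (assistant, had_error) pairs from past reviews.

    Returns the fraction of checked records with errors, per data assistant.
    """
    checked = defaultdict(int)
    errors = defaultdict(int)
    for assistant, had_error in review_log:
        checked[assistant] += 1
        if had_error:
            errors[assistant] += 1
    return {a: errors[a] / checked[a] for a in checked}


to_check = sample_for_review(range(1, 51), fraction=0.2, seed=1)
print(len(to_check), "records selected for review")
print(error_rates([("assistant_a", True), ("assistant_a", False), ("assistant_b", False)]))
```

The resulting rates could feed the staff incentive mentioned above, although small samples give noisy rates and should be read with care.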
What is already available in OpenMRS?
DoubleEntryModule for InfoPath
Patient Flags Module
Reporting tools, which can be used for data quality
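As an illustration of the double-entry idea in general (not of how the DoubleEntryModule is implemented), two independent entries of the same form can be compared field by field and any disagreement sent back for checking against the paper form. The field names below are hypothetical.

```python
def compare_entries(first: dict, second: dict) -> dict:
    """Return {field: (first_value, second_value)} for every field that differs.

    A field missing from one entry is reported with None on that side.
    """
    mismatches = {}
    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a != b:
            mismatches[field] = (a, b)
    return mismatches


entry_1 = {"weight_kg": 62, "visit_date": "2010-09-14"}
entry_2 = {"weight_kg": 26, "visit_date": "2010-09-14"}
print(compare_entries(entry_1, entry_2))
```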
Things that we would like to see?
Data Statistics module
Data Integrity module
"Pre-canned" rules for data quality
Audit trail
Double-entry for HTML & XForms
Soundex module for fuzzy search in non-English languages
Idgen
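For reference, the Soundex approach named in the wish list can be sketched as below. This is the standard American Soundex, whose letter-to-digit table is tuned to English pronunciation; a module for other languages would presumably need a different table or algorithm. The function name is hypothetical.

```python
# Consonant groups of American Soundex; vowels, y, h and w carry no code.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of name, or "" if it has no letters."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters that share a code
        code = _CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeat across a vowel is kept
    return (result + "000")[:4]


# Names that sound alike map to the same code, which enables fuzzy search.
print(soundex("Robert"), soundex("Rupert"))
```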
How do we continue?
Make some noise on
OpenMRS Groups
Tickets
Wiki