Anonymizing Data
This page is outdated and no longer receives updates!
Obscuring Names for Demonstration
At times you will want to demonstrate the OpenMRS web application to people outside of your organization. Displaying real patient names in that setting would be inappropriate.
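OpenMRS has a built-in way to obscure all patients' names in the system using your runtime properties. Set obscure_patients to true to turn it on; the remaining properties give the name shown in place of every real one:
obscure_patients=true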
obscure_patients.given_name=John
obscure_patients.middle_name=P.
obscure_patients.family_name=Smith
Complete Database Anonymizer
When you need (nearly) complete, HIPAA-compliant anonymity for patients and their data, the underlying database needs extensive scrubbing.
This anonymizing SQL script will scramble your patient names, patient addresses, locations, and any dates stored against patients. User names and passwords will be reset to username-"id" / test.
Do not run this on a live database. Only run it on a copy. The original data cannot be recovered once the script has run.
This anonymization may not be fully HIPAA compliant. It is very difficult to truly anonymize (de-identify) data such that someone could not re-identify a patient. People (and a growing number of tools) can come up with clever ways of figuring out identities from supposedly anonymous data. Any free text (comments on observations, observations with text values, or any user-entered text anywhere else in the database) could accidentally reveal a patient's identity. Whenever possible, avoid sharing patient data publicly (even if you think you have anonymized it). If you want/need to share data publicly, be extremely vigilant in ensuring that the data are truly anonymized.
Ideally, all protected health information should be anonymized, including:
Names
All geographic subdivisions containing fewer than 20,000 people – e.g., address, city, county, precinct, exact zip code, and equivalent geocodes.
All months and days of dates directly related to the individual. This includes birth date, admission dates, discharge dates, dates of death, encounter dates, and observation dates.
Telephone numbers
Fax numbers
E-mail addresses
National numbers (e.g., social security number, national ID)
Medical record numbers
Health plan beneficiary numbers
Account numbers
Certificate/license numbers
Vehicle identifiers
Device identifiers
URLs
IP addresses
Biometric identifiers
Full face photographic images
Any other unique, identifying number, characteristic, or code
In OpenMRS, this means that scrubbing names and birth dates is not enough: all dates (including dates on encounters and observations) should be stripped of month and day, and any internal identifiers (e.g., patient_id, person_id, etc.) must be randomly changed. Identifiers and dates should not be adjusted relative to their original value (e.g., by adding or subtracting a specific number), since patterns (like differences between dates or identifiers) could be used to re-identify patients.
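The two rules above can be illustrated with a minimal Python sketch. It is not the anonymizing script referred to on this page. The function names strip_month_and_day and random_id_mapping are hypothetical, and the sketch assumes the dates and identifiers have already been exported from a copy of the database. It only shows the idea: dates keep nothing but their year, and new identifiers are assigned by a random shuffle rather than by a fixed offset.

```python
import random
from datetime import date


def strip_month_and_day(d: date) -> date:
    """Keep only the year; month and day collapse to January 1."""
    return date(d.year, 1, 1)


def random_id_mapping(old_ids, rng=None):
    """Return a dict mapping each distinct old ID to a randomly assigned new ID.

    New IDs are a shuffled run of 1..n, so a new ID has no arithmetic
    relation (such as a constant offset) to the ID it replaces.
    """
    # SystemRandom draws from the operating system's entropy source,
    # so the mapping cannot be reproduced from a known seed.
    rng = rng or random.SystemRandom()
    distinct = sorted(set(old_ids))
    new_ids = list(range(1, len(distinct) + 1))
    rng.shuffle(new_ids)
    return dict(zip(distinct, new_ids))


if __name__ == "__main__":
    # Hypothetical sample values, not taken from any real database.
    print(strip_month_and_day(date(1984, 7, 23)))
    print(random_id_mapping([1042, 1043, 1050]))
```

The same mapping has to be applied to every table that references the identifier, otherwise encounters and observations would no longer point at the right patient. The mapping itself should be discarded afterward, since anyone holding it can reverse the change.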