OpenMRS De-Identification Rules
Below is the set of rules (in plain-English pseudocode) that we are applying to OpenMRS patients and their encounters to de-identify them. This is a work in progress.
For each org.openmrs.Patient and org.openmrs.Person object, we need to remove the 18 PHI identifiers:
Names
Remove all names (a patient can have multiple names in OpenMRS with a preferred name) and replace with a fake name (e.g., name(s) selected from a pool of fake names).
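The name rule could be sketched as follows. This is a minimal standalone Python illustration, not the OpenMRS API: the patient is modeled as a plain dict, and the name pools, field names, and the function replace_names are all hypothetical.

```python
import random

# Hypothetical pools of fake names; a real pool would be much larger.
FAKE_GIVEN_NAMES = ["Alex", "Jordan", "Sam", "Taylor", "Morgan"]
FAKE_FAMILY_NAMES = ["Smith", "Jones", "Brown", "Miller", "Davis"]


def replace_names(patient, rng):
    """Drop every existing name and attach a single fake preferred name."""
    fake = {
        "given": rng.choice(FAKE_GIVEN_NAMES),
        "family": rng.choice(FAKE_FAMILY_NAMES),
        "preferred": True,
    }
    # Return a copy so the original record is left untouched.
    return {**patient, "names": [fake]}


# Stand-in for a patient with a preferred name and one alias.
patient = {
    "names": [
        {"given": "Real", "family": "Person", "preferred": True},
        {"given": "Alias", "family": "Person", "preferred": False},
    ]
}
deidentified = replace_names(patient, random.Random(0))
```

All original names are discarded, including non-preferred ones, so no alias survives next to the fake name.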
Geographic data
Remove all addresses (a patient can have multiple addresses) and generate a fake address.
Remove all GPS data.
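A rough sketch of the two geographic rules, again using a plain dict rather than the OpenMRS API. The field names (address1, city_village, latitude, longitude), the city pool, and replace_addresses are assumptions made for illustration.

```python
import random

# Hypothetical pool of fake place names.
FAKE_CITIES = ["Springfield", "Riverton", "Lakeside"]


def replace_addresses(patient, rng):
    """Discard all addresses, generate one fake address, and blank GPS data."""
    fake_address = {
        "address1": f"{rng.randint(1, 999)} Example Street",
        "city_village": rng.choice(FAKE_CITIES),
        # GPS coordinates are never carried over to the fake address.
        "latitude": None,
        "longitude": None,
    }
    return {**patient, "addresses": [fake_address]}


patient = {
    "addresses": [
        {"address1": "1 Real Road", "city_village": "Realtown",
         "latitude": "1.2345", "longitude": "36.7890"},
        {"address1": "2 Other Lane", "city_village": "Elsewhere",
         "latitude": None, "longitude": None},
    ]
}
deidentified = replace_addresses(patient, random.Random(0))
```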
Dates
For birthdate, replace month & day with random values for patients under 60 years of age. For patients 60+ years of age, adjust year randomly by ±5 years.
For all dated data (observations, encounters, etc.), replace month & day with random values, keeping the data in the same sequence (the intervals between items will change randomly).
For all other dates, randomly replace month & day.
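The birthdate rule might look like the sketch below. It assumes that month and day are also randomized for patients aged 60 and over (the rule above does not say either way), that "±5 years" means a uniform shift between -5 and +5 inclusive, and that age is measured against a caller-supplied reference date. The function names are hypothetical.

```python
import calendar
import random
from datetime import date, timedelta


def random_day_in_year(year, rng):
    """Pick a uniformly random calendar day within the given year."""
    days = 366 if calendar.isleap(year) else 365
    return date(year, 1, 1) + timedelta(days=rng.randrange(days))


def age_in_years(birthdate, today):
    """Completed years between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def deidentify_birthdate(birthdate, today, rng):
    year = birthdate.year
    if age_in_years(birthdate, today) >= 60:
        # Assumption: shift the year by a random amount in [-5, +5].
        year += rng.randint(-5, 5)
    # Month and day are replaced with random values in every case.
    return random_day_in_year(year, rng)


rng = random.Random(1)
young = deidentify_birthdate(date(1990, 6, 15), date(2020, 1, 1), rng)
old = deidentify_birthdate(date(1950, 6, 15), date(2020, 1, 1), rng)
```

One gap in this sketch: for a patient born in the current year, the random day may fall after the reference date, so a real implementation would need to cap it.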
Person attributes
Remove all person attributes, which could include telephone data, fax numbers, or other identifiable data.
Telephone numbers
These are often included in a Person's extra attribute data.
FAX numbers
These are often included in a Person's extra attribute data.
Email addresses
These are often included in a Person's extra attribute data.
National identifiers
Remove all patient identifiers, replacing with a fake (randomly generated & unique) identifier.
Medical record numbers
Remove all patient identifiers, replacing with a fake (randomly generated & unique) identifier.
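One way to generate identifiers that are both random and unique is to track the ones already issued, as in this sketch. The "FAKE-" prefix, the 8-digit length, and the function names are assumptions. Identifier types that enforce a format or check digit would need fake values that satisfy it, which this sketch does not attempt.

```python
import secrets


def fake_identifier(used, length=8):
    """Return a random identifier not yet in `used`, and record it there."""
    digits = "0123456789"
    while True:
        candidate = "FAKE-" + "".join(secrets.choice(digits) for _ in range(length))
        if candidate not in used:
            used.add(candidate)
            return candidate


def replace_identifiers(patient, used):
    """Drop all existing identifiers and attach one fake identifier."""
    return {**patient, "identifiers": [fake_identifier(used)]}


used_identifiers = set()
patient = {"identifiers": ["NAT-123456", "MRN-98765"]}
deidentified = replace_identifiers(patient, used_identifiers)
```

The same `used` set has to be shared across all patients in a run. Otherwise uniqueness is only probabilistic.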
Health plan beneficiary numbers
Account numbers
Certificate/license numbers
Vehicle identifiers and serial numbers including license plates
Device identifiers and serial numbers
Web URLs
Internet protocol addresses
Biometric identifiers (e.g., retinal scans, fingerprints)
Full face photos and comparable images
Any unique identifying number, characteristic or code
For each patient's org.openmrs.Encounter object, we will need to do the following:
Randomize the month & day of Encounter Datetime, keeping encounters in the same sequence without maintaining intervals between them.
Assign a random location to the encounter.
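The two encounter rules can be sketched as below. To keep the sequence while discarding intervals, the sketch groups a patient's encounters by year, draws as many random datetimes in that year as there are encounters, sorts them, and hands them out in the original chronological order. The dict fields, the location pool, and the function names are hypothetical, and this is only one possible way to satisfy the rule.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical pool of locations to assign at random.
LOCATIONS = ["Clinic A", "Clinic B", "Clinic C"]


def random_datetime_in_year(year, rng):
    """Pick a random second within the given year."""
    start = datetime(year, 1, 1)
    seconds = int((datetime(year + 1, 1, 1) - start).total_seconds())
    return start + timedelta(seconds=rng.randrange(seconds))


def deidentify_encounters(encounters, rng):
    """Randomize month/day per encounter while preserving chronological order."""
    ordered = sorted(encounters, key=lambda e: e["datetime"])
    by_year = defaultdict(list)
    for enc in ordered:
        by_year[enc["datetime"].year].append(enc)

    result = []
    for year in sorted(by_year):
        group = by_year[year]
        # Sorted random times are assigned in the original order,
        # so the sequence survives but the intervals do not.
        new_times = sorted(random_datetime_in_year(year, rng) for _ in group)
        for enc, new_time in zip(group, new_times):
            result.append({**enc, "datetime": new_time,
                           "location": rng.choice(LOCATIONS)})
    return result


encounters = [
    {"id": 0, "datetime": datetime(2010, 3, 1, 9), "location": "Real"},
    {"id": 1, "datetime": datetime(2010, 3, 5, 10), "location": "Real"},
    {"id": 2, "datetime": datetime(2010, 12, 1, 8), "location": "Real"},
    {"id": 3, "datetime": datetime(2011, 1, 2, 8), "location": "Real"},
    {"id": 4, "datetime": datetime(2012, 7, 7, 7), "location": "Real"},
]
shuffled = deidentify_encounters(encounters, random.Random(42))
```

Because the year is kept and dates are only re-drawn within it, encounters from different years can never swap places. Two encounters in the same year may land on the same day.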
We may also need some sort of 'obs filter': a list of obs, with rules specific to the concept dictionary, that must be removed from encounters.
Family Data:
Remove all relationships between persons.