OpenMRS De-Identification Rules

Below is the set of rules (in plain-English pseudocode) that we are applying to OpenMRS patients and their encounters to de-identify them.  This is a work in progress.

For each org.openmrs.Patient and org.openmrs.Person object, we need to remove the 18 PHI identifiers:

  • Names

    • Remove all names (a patient can have multiple names in OpenMRS, one of which is marked as preferred) and replace them with fake names (e.g., names selected from a pool of fake names).

  • Geographic data

    • Remove all addresses (a patient can have multiple addresses) and generate a fake address.

    • Remove all GPS data.

  • Dates

    • For birthdate, replace month & day with random values for patients under 60 years of age.  For patients 60+ years of age, adjust year randomly by ±5 years.

    • For all dated data (observations, encounters, etc.), replace month & day with random values while keeping the data in its original chronological sequence (the intervals between items will change randomly).

    • For all other dates, randomly replace month & day.

  • Person attributes

    • Remove all person attributes, which could include telephone data, fax numbers, or other identifiable data.

  • Telephone numbers

    • These are often included in a Person's extra attribute data.

  • FAX numbers

    • These are often included in a Person's extra attribute data.

  • Email addresses

    • These are often included in a Person's extra attribute data.

  • National identifiers

    • Remove all patient identifiers, replacing each with a fake (randomly generated & unique) identifier.

  • Medical record numbers

    • Remove all patient identifiers, replacing each with a fake (randomly generated & unique) identifier.

  • Health plan beneficiary numbers

  • Account numbers

  • Certificate/license numbers

  • Vehicle identifiers and serial numbers including license plates

  • Device identifiers and serial numbers

  • Web URLs

  • Internet protocol addresses

  • Biometric identifiers (e.g., retinal scans, fingerprints)

  • Full face photos and comparable images

  • Any unique identifying number, characteristic or code
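The birthdate rule under "Dates" above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the ±5-year offset is drawn uniformly (and may be 0), and month & day are also randomized for patients aged 60+ (the rule above does not say either way).

```python
import calendar
import random
from datetime import date


def random_day_in_year(year, rng):
    """Return a uniformly random date within the given year."""
    days_in_year = 366 if calendar.isleap(year) else 365
    first_day = date(year, 1, 1).toordinal()
    return date.fromordinal(first_day + rng.randrange(days_in_year))


def age_in_years(birthdate, today):
    """Completed years between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def deidentify_birthdate(birthdate, today, rng):
    """Hypothetical helper: apply the birthdate rule to a single date."""
    year = birthdate.year
    if age_in_years(birthdate, today) >= 60:
        # 60+ rule: shift the year by a random offset in [-5, +5].
        # Assumption: the offset is uniform and 0 is allowed.
        year += rng.randint(-5, 5)
    # Assumption: month & day are randomized for every patient, including 60+.
    return random_day_in_year(year, rng)


rng = random.Random()  # pass a seeded Random for reproducible runs
fake_birthdate = deidentify_birthdate(date(1950, 6, 15), date.today(), rng)
```

Note that this sketch can return a date later than today for a patient born in the current year; a real implementation would need to decide how to handle that case.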

For each patient's org.openmrs.Encounter object, we will need to do the following:

  • Randomize the month & day of Encounter Datetime, keeping encounters in the same sequence without maintaining intervals between them.

  • Assign a random location to the encounter.

  • We may also need some sort of 'obs filter': a list of obs, with rules specific to the concept dictionary, that must be removed from encounters.

Family Data:

  • Remove all relationships between persons.
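One possible way to randomize month & day of encounter dates while keeping encounters in the same sequence (the encounter rule above) is to draw one random day per encounter within each year, sort the draws, and hand them out in the original chronological order. The sketch below works on plain dates rather than OpenMRS objects; the function name is hypothetical, each date is assumed to keep its original year, and two encounters may land on the same day.

```python
import calendar
import random
from collections import defaultdict
from datetime import date


def randomize_preserving_order(dates, rng):
    """Hypothetical helper: randomize month & day, keeping year and order.

    Returns a list aligned with the input. Ties are possible: two
    encounters in the same year may be assigned the same day.
    """
    # Input positions in chronological order, grouped by year.
    by_year = defaultdict(list)
    for index in sorted(range(len(dates)), key=lambda i: dates[i]):
        by_year[dates[index].year].append(index)

    result = [None] * len(dates)
    for year, indices in by_year.items():
        days_in_year = 366 if calendar.isleap(year) else 365
        first_day = date(year, 1, 1).toordinal()
        # Draw one random day per encounter, then assign earliest-first,
        # so the sequence survives while the intervals change.
        offsets = sorted(rng.randrange(days_in_year) for _ in indices)
        for index, offset in zip(indices, offsets):
            result[index] = date.fromordinal(first_day + offset)
    return result


encounter_dates = [date(2011, 3, 2), date(2011, 3, 9), date(2012, 1, 20)]
shifted = randomize_preserving_order(encounter_dates, random.Random())
```

Because the year is kept, ordering across years is preserved automatically; only the ordering within a year needs the sort step.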