Migrating to OCL: PIH Use Case

 

This is a work in progress. We are using this wiki page to capture notes as we work on getting PIH concepts into OCL. Tracked in ocl_issues #45.

@Ellen Ball created a SQL dump of PIH concept tables (similar to what @Andrew Kanter uses for sharing CIEL) using a command like this:

mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --opt \ openmrs \ concept concept_answer concept_class concept_complex \ concept_datatype concept_description concept_map_type \ concept_name concept_name_tag concept_name_tag_map concept_numeric \ concept_reference_map concept_reference_source concept_reference_term \ concept_reference_term_map concept_set \ concept_stop_word > pih-concepts-db-20210503.sql

@Burke Mamlin adapted the tooling used to convert CIEL SQL to JSON (OpenConceptLab/ocl_omrs) to do the same with PIH's SQL dump 

./sql-to-json.sh local/pih-concepts-db-20210503.sql PIH PIH staging

Note: This runs the script in a docker container. You will need docker installed, as the script starts up a docker container.

The JSON was bulk imported into the PIH source on OCL using the OCL app's bulk import feature.

We discovered some issues during the PIH import process, which are documented in OCL issues:

We also discovered that, since they manage dictionaries across multiple sites and many different servers, PIH does not expect concept IDs (concept.concept_id in the OpenMRS database) to be predictable, so they rely on mappings and UUIDs instead of concept IDs. They sometimes create a mapping (e.g., PIH:3) to serve as a organizational concept code, but they don't have these for all concepts. Discussion on how to assign official PIH codes for concepts within OCL took place in this OpenMRS Talk thread.

Once we got a close proximation of the PIH dictionary into a temporary source in OCL, we created a matching collection and the added references to all concepts & mapping in the source with a REST API call like this:

PUT https://api.staging.openconceptlab.org/orgs/PIH/collections/PIH/ { "data": { "uri": "/orgs/PIH/sources/PIH/", "concepts": "*", "mappings": "*" } }

One PIH began testing by downloading their dictionary back into OpenMRS, we quickly realized the sort weight for answers & set members (which allows answers & members of a set to be sorted) were not being imported into OCL.

  • The ocl_omrs conversion script was adapted to include sort weight for answers & set members (here).

  • OCLOMRS-1067 was created to make sure the sort weights are populated when concepts & mappings are imported from OCL.

 

 

Recipe for importing the PIH dictionary into OCL

Prerequisites

1. Clone the OCL import script:

2. Copy SQL dump file (can be .sql or zipped sql file as .zip) into local/ subfolder (Docker containers can't see folders outside of the local copy of ocl-omrs, so you must copy your SQL under this local folder and reference it through a relative path)

3. Run the sql-to-json.sh script.

For PIH, since the base OpenMRS instance used to create the dictionary is OpenMRS 2.3+, the Python script needs to be modified to refer to the concept_numeric.allow_decimal column rather than concept_numeric.precise.

In ocl_omrs/omrs/models.ph:

Then execute the script as follows:

This will unzip the sql file if needed, import the sql into a temporary db, check the target OCL environment to make sure all referenced sources already exist in that environment, and then write an export file (same name with a .json extension) into the local/ folder alongside the sql file.

  • USE_GOLD_MAPPINGS=1  sets an environment variable that will make the import script require a gold mapping (same-as PIH with numeric code) for every concept. If any concept is missing or has more than exactly one gold mapping, the import will fail.

  • ./sql-to-json.sh local/pih-concepts-db-20211213.zip  runs the import script on the specified file in the local/ folder.

  • PIH PIH specifies the organization (first "PIH") and the target source (second "PIH") to be used in the import file. If you wanted to create an import file for PIH's foobar source, then PIH PIH would be replaced with PIH foobar 

  • staging specifies the target environment (as of Jan 2022, this can be "staging" or "production"). This is used when ensuring the sources referenced within the import exist on the target environment.

4. Delete the existing PIH content (while you can import without deleting first – i.e., just update existing content in OCL like we do with CIEL updates – while things are changing more between PIH imports, I've been deleting the PIH dictionary (source & collection) on each iteration to ensure we don't have extraneous resources in OCL (e.g., a concept or mapping that was removed between imports not getting deleted from OCL).

I use httpie, but you can use curl or postman to do the same thing:

Verify that the PIH source has been deleted before proceeding. I usually give this 15-30 minutes to be sure it's gone before recreating it just to be safe.

5. Create the PIH source using make-pih.sh  and pih.json files (attached to this page).

6. Bulk import the PIH source by logging into app.staging.openconceptlab.org, and selecting the Bulk Importer app.

  • Check both "Update existing" (checked by default) and "Parallel" options

  • Upload the import file that was generated in step #3 (e.g., local/pih-concepts-db-20211213.json )

7. Create a PIH collection using make-pih-collection.sh and pih-collection.json files (attached to this page).

8. Add references for all concepts & mappings of the PIH source into the PIH collection using the make-pih-references.sh and pih-references.json files (attached to this page). This approach generates individual references for each concept & mapping. While it's possible to do this with two references (one to all concepts & one to all mappings), until the Dictionary Manager is refactored to use dynamic references or merged into OCL's term browser, you'll still want to work one reference per concept/mapping.

Â