Sync Module Overview

session description

  • session was led by Dave Thomas (dthomas@pih.org)

  • the sync module is in use by PIH Rwanda

  • more documentation can be found here: Sync Module

setup / overview

  • sync operates on a hub-and-spoke model (parent server with children)

    • each child needs to be initialized with a copy of the parent server's DB

      • this was done through database dumping

      • an alternative to consider is copying the MySQL data files instead of exporting and importing the DB, which saves the export/import time

    • is there any way to avoid starting out with clones and do some kind of merge instead?

      • short answer is no since the records would need to be reconciled

      • the key to doing that successfully is to be able to match up UUIDs, which is a non-trivial task

    • registration is done in both directions: child to parent and parent to children

      • the parent server and the child servers each need a static IP

        • it may be possible to use a dynamic DNS (DDNS) service (e.g. no-ip.org, dyndns.com) to get around this

    • sync process always runs on the child; configured to contact the parent server on a regular basis

      • can we use the parent server as the operational server?

        • yes, this is how it is currently used in PIH Rwanda

        • is there a performance impact?

          • none observed so far

  • configurable settings

    • the interval at which the child contacts the parent

    • timeout for when the child should stop contacting the parent

    • maximum number of tries for the child to reach the parent

    • whether sync'ed data is compressed or not (this is controlled through the OpenMRS configuration, which can optionally use gzip)

    • the class of data to be sync'ed

    • at the technical level, it is sync'ing at the level of db tables

      • not able to sync subsets of patients, but this is a feature we'd like to implement in the future
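The settings listed above can be pictured with a minimal sketch. This is not the sync module's actual code or configuration: the names SyncSettings, prepare_payload, run_sync_cycle and the send callback are hypothetical, the default values are made up, and the timeout is assumed here to apply to each individual attempt.

```python
import gzip
from dataclasses import dataclass
from typing import Callable


@dataclass
class SyncSettings:
    # All field names and defaults are illustrative assumptions.
    interval_seconds: int = 300   # how often the child contacts the parent
    timeout_seconds: int = 60     # assumed: give up on one attempt after this long
    max_retries: int = 3          # maximum tries per sync cycle
    compress: bool = True         # gzip the payload before sending


def prepare_payload(xml_text: str, settings: SyncSettings) -> bytes:
    """Encode the payload and optionally gzip it."""
    raw = xml_text.encode("utf-8")
    return gzip.compress(raw) if settings.compress else raw


def run_sync_cycle(payload: bytes,
                   send: Callable[[bytes, int], bool],
                   settings: SyncSettings) -> int:
    """Try to send the payload; return the attempt number that succeeded, or 0."""
    for attempt in range(1, settings.max_retries + 1):
        try:
            if send(payload, settings.timeout_seconds):
                return attempt
        except OSError:
            # Network errors count as a failed attempt; retry on the next pass.
            pass
    return 0
```

In this sketch the child would call run_sync_cycle once every interval_seconds; the send callback stands in for the HTTP POST to the parent.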

  • prerequisites

    • did we need to beef up our servers for the sync?

      • no, we have our servers running on off-the-shelf laptops with 3GB of RAM

    • had issues with the servers not being on a proper UPS (we used laptops)

      • MySQL databases can get corrupted if power is lost in the middle of a transaction

    • what about network firewall considerations?

      • this is done via HTTP post, so either port 80 (HTTP) or 443 (HTTPS)

    • bandwidth

      • our sites are on VSAT and performance has been fine even with all of the outages that we have; sometimes the bandwidth is 1 kbps or less

        • how long do we need to have something up and running in order for the sync'ing to work?

          • dependent on bandwidth, but from our experience, sending 200 records at a time, it takes about 5 minutes to sync

        • this can even be sync'ed via USB key

          • and there won't be duplicates even if it is manually sync'ed by USB and afterwards, the network is restored

        • we will be piloting this on GPRS modems; the issue is not bandwidth, but rather having fixed IP addresses on the children (that are using the modem)

          • however, this may be overcome by using a DDNS service (see above)
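One way to picture why a manual USB sync followed by a network sync does not create duplicates is to treat each sync'ed record as carrying a UUID and to skip any UUID the receiver has already applied. The sketch below only illustrates that idea and the 200-records-at-a-time batching mentioned above; the record layout and the function names batches and apply_batch are assumptions, not the module's real data model.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]  # hypothetical shape: {"uuid": ..., "payload": ...}


def batches(records: List[Record], size: int = 200) -> Iterator[List[Record]]:
    """Split the pending records into batches of at most `size`."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def apply_batch(store: Dict[str, Record], batch: Iterable[Record]) -> int:
    """Apply records whose UUID has not been seen; return how many were new."""
    applied = 0
    for record in batch:
        uuid = record["uuid"]
        if uuid in store:
            # Already received (for example via USB key), so skip it.
            continue
        store[uuid] = record
        applied += 1
    return applied
```

With this approach, re-sending a batch that was already delivered another way leaves the receiving store unchanged.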

functional questions

  • are updates still possible on both the parent and the child?

    • yes

  • what is the format of what is being sent?

    • xml (serialized java objects)

      • can the XML be consumed by other services?

        • probably possible

  • is it possible to have grandparents? i.e. multiple layers of syncing between children and the parent?

    • theoretically it could be possible

    • sync was built to be a child to parent relationship, where each node will communicate to a parent (if it exists) and its children (if they exist); children don't know about each other.

    • is there any thought about doing a peer-to-peer relationship rather than a hierarchical relationship?

      • this might be something to consider in the future

      • (a concern was raised by an audience member): not sure if this should be seen as a solution for a national reporting system built from (child) districts and (grandchild) clinics, because of data access/privacy concerns; DHIS should be used instead

      • filtering might help to alleviate this challenge

  • can complex obs files be sync'ed?

    • may not be possible at the current time because of the bandwidth it would require

  • 2 vs. 1 way sync

    • it is possible to use the sync module as an InfoPath alternative (in a one-directional way)
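The parent/child relationship discussed above (each node talks to its parent, if any, and to its own children, and children do not know about each other) can be sketched as a small tree structure. SyncNode and its methods are hypothetical names used only for illustration; the multi-layer "grandparent" arrangement is shown as the theoretical possibility it was described as, not as a supported feature.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent/children
class SyncNode:
    name: str
    parent: Optional["SyncNode"] = None
    children: List["SyncNode"] = field(default_factory=list)

    def add_child(self, child: "SyncNode") -> "SyncNode":
        """Register a child under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def neighbours(self) -> List[str]:
        """Names of the only nodes this node communicates with."""
        names = [self.parent.name] if self.parent else []
        return names + [c.name for c in self.children]


grandparent = SyncNode("grandparent")
parent = grandparent.add_child(SyncNode("parent"))
child_a = parent.add_child(SyncNode("child_a"))
child_b = parent.add_child(SyncNode("child_b"))
```

Here child_a.neighbours() contains only "parent", which reflects that siblings are never contacted directly.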

similar/related efforts

  • run a sync on an android device

    • the android device has a full DB copy

    • it is not made to handle conflicts

    • questions to think about:

      • is there any opportunity to combine the efforts through refactoring?

      • because there has been trouble getting Hibernate to run on Android, is there any way to remove that dependency from the existing sync module?