Info: 1.4.0
This page is based on 'Conflict detection and resolution - research'.
Introduction
Conflict scenarios:
- Update-update: occurs when the same object was updated at more than one node.
- Update-delete: occurs if a row was updated at one node, but the same row was deleted at another node.
- Delete-delete: occurs when a row was deleted from more than one node.
Conflict Detection
We need a new table in the database of each instance to store the hash codes of objects pulled from that instance's Parent.
PULL:
- At first, when we pull an object from the Parent to a Child instance (where it does not exist yet):
  - calculate the hash code of the object pulled from the Parent,
  - look for an object on the Child instance with the UUID of the pulled object,
  - no such object exists on the Child instance, so the object is created there,
  - the object's hash code is saved in the Child's database as the latest version of this object on the Parent instance,
  - there is NO conflict.
- The object is NOT modified on the Parent instance.
- When we pull the same object from the Parent a second time (the object already exists on the Child instance):
  - calculate the hash code of the object pulled from the Parent,
  - look for an object on the Child instance with the UUID of the pulled object,
  - such an object exists on the Child instance, so compare the calculated hash code with the hash code saved in the Child's database,
  - they are equal, which means that the object on the Child instance is up to date with the corresponding object on the Parent instance,
  - there is NO conflict.
- The object is modified on the Parent and/or the Child instance.
- When we pull the same object from the Parent a third time (the object already exists on the Child instance):
  - calculate the hash code of the object pulled from the Parent,
  - look for an object on the Child instance with the UUID of the pulled object,
  - such an object exists on the Child instance, so compare the calculated hash code with the hash code saved in the Child's database,
  - they are NOT equal, which means that the object on the Child instance is NOT up to date with the corresponding object on the Parent instance,
  - there is a conflict, which is resolved using RULE 1.
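The three pull scenarios above can be sketched roughly as follows. This is only an illustration, not the Sync 2 code: the class `PullConflictCheck`, the enum `PullOutcome`, the two in-memory maps (standing in for the Child's object store and the new hash code table) and the placeholder `hashOf` function are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the pull-side check; not the actual Sync 2 code. */
final class PullConflictCheck {

    enum PullOutcome { CREATED, UP_TO_DATE, CONFLICT }

    // Stand-in for the Child's object store: UUID -> object content.
    private final Map<String, String> childObjects = new HashMap<>();
    // Stand-in for the new hash code table: UUID -> hash of the last pulled Parent version.
    private final Map<String, String> lastParentHashes = new HashMap<>();

    /** Placeholder hash; the real hash code calculation is not specified here. */
    static String hashOf(String content) {
        return Integer.toHexString(content.hashCode());
    }

    PullOutcome pull(String uuid, String parentContent) {
        String pulledHash = hashOf(parentContent);
        if (!childObjects.containsKey(uuid)) {
            // First pull: create the object and remember the Parent's hash code.
            childObjects.put(uuid, parentContent);
            lastParentHashes.put(uuid, pulledHash);
            return PullOutcome.CREATED;
        }
        if (pulledHash.equals(lastParentHashes.get(uuid))) {
            // The saved hash code matches: the Child is up to date with the Parent.
            return PullOutcome.UP_TO_DATE;
        }
        // The hash codes differ: conflict, to be resolved using RULE 1.
        return PullOutcome.CONFLICT;
    }
}
```

Note that the sketch compares only the pulled hash code with the saved one, exactly as the steps describe, and leaves the resolution itself out.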
PUSH:
- At first, when we push an object from a Child to the Parent instance (where it does not exist yet):
  - in the SyncPushServiceImpl class, the readAndPushObjectToParent method calls the shouldPushObject method, which pulls the corresponding object from the Parent instance,
  - the object does not exist on the Parent instance yet, so shouldPushObject returns true,
  - the object is pushed to the Parent,
  - the object's hash code is saved in the Child's database as the latest version of this object on the Parent instance.
- The object is modified on the Child instance.
- When we push this object from the Child to the Parent instance (where it already exists):
  - in the SyncPushServiceImpl class, the readAndPushObjectToParent method calls the shouldPushObject method, which pulls the corresponding object from the Parent instance,
  - the object already exists on the Parent instance,
  - calculate the hash code of the object pulled from the Parent,
  - compare the calculated hash code with the hash code saved in the Child's database,
  - they are equal, which means that the Child modified the latest version of the object on the Parent instance,
  - the object is pushed to the Parent,
  - the object's hash code is updated in the Child's database as the last version of this object.
- The object is modified on the Parent and/or the Child instance.
- When we push this object from the Child to the Parent instance (where it already exists):
  - in the SyncPushServiceImpl class, the readAndPushObjectToParent method calls the shouldPushObject method, which pulls the corresponding object from the Parent instance,
  - the object already exists on the Parent instance,
  - calculate the hash code of the object pulled from the Parent,
  - compare the calculated hash code with the hash code saved in the Child's database,
  - they are NOT equal, which means that the object on the Child instance is NOT up to date with the corresponding object on the Parent instance (someone modified this object on the Parent instance),
  - there is a conflict, which is resolved using RULE 2.
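A minimal sketch of the push-side check, under the same kind of assumptions: `PushConflictCheck`, `PushOutcome`, `shouldPush`, `modifyOnParent` and the in-memory maps are hypothetical stand-ins, and the code does not reproduce the real `shouldPushObject` of `SyncPushServiceImpl`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the push-side check; not the actual SyncPushServiceImpl. */
final class PushConflictCheck {

    enum PushOutcome { PUSHED, CONFLICT }

    // Stand-in for the Parent instance: UUID -> object content.
    private final Map<String, String> parentObjects = new HashMap<>();
    // Stand-in for the Child's hash code table: UUID -> last known Parent hash.
    private final Map<String, String> lastParentHashes = new HashMap<>();

    /** Placeholder hash; the real hash code calculation is not specified here. */
    static String hashOf(String content) {
        return Integer.toHexString(content.hashCode());
    }

    /** Assumed analogue of shouldPushObject: pull from the Parent and compare hash codes. */
    boolean shouldPush(String uuid) {
        String parentContent = parentObjects.get(uuid);
        if (parentContent == null) {
            return true; // the object does not exist on the Parent yet
        }
        return hashOf(parentContent).equals(lastParentHashes.get(uuid));
    }

    PushOutcome push(String uuid, String childContent) {
        if (!shouldPush(uuid)) {
            return PushOutcome.CONFLICT; // to be resolved using RULE 2
        }
        parentObjects.put(uuid, childContent);
        // After a successful push the Parent holds exactly the pushed version.
        lastParentHashes.put(uuid, hashOf(childContent));
        return PushOutcome.PUSHED;
    }

    /** Simulates someone else modifying the object directly on the Parent. */
    void modifyOnParent(String uuid, String content) {
        parentObjects.put(uuid, content);
    }
}
```

For example, two consecutive pushes of the same object succeed, while a push that follows `modifyOnParent` is reported as a conflict.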
Conflict Resolution
PULL:
...
- these two objects have to be merged**, then
- saved on Child and Parent instance;
- hash codes on both Child and Parent instance have to be updated.
* REMEMBER: This is the hash code of the object that was calculated and saved on the last pull of this object from the Parent. This hash code is stored in a different table from the pulled object.
** See Merge
PUSH:
...
- these two objects have to be merged**, then
- saved on Child and Parent instance;
- hash codes on both Child and Parent instance have to be updated.
* REMEMBER: This is the hash code of the object that was calculated and saved on the last pull of this object from the Parent. This hash code is stored in a different table from the pulled object.
** See Merge
Merge
Once a conflict is detected, the two objects have to be merged. To do this, a new database table "Conflicts" has to be created.
This table will have the columns openmrs_class, first_object and second_object, and more if needed.
Any type of object can be inserted into child_object and parent_object because of the column type: it will be JSON, (TINY)BLOB, TEXT or similar.
All conflicted objects will be put into this table and queued in the same way as AuditLog does.
However, only the most recent merge conflict should be kept for any object. The old ones should be voided.
Then either the chosen strategy or the UI will be used to:
- determine whether child_object, parent_object or a mixed object will be preserved,
- determine which operation should be performed after merging (push / pull / ...).
When the conflict is resolved, we should save the Parent's object as voided in the Child's database.
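The rule that only the most recent conflict per object stays active could look roughly like this. The sketch is an assumption throughout: `ConflictQueueSketch`, its `enqueue` and `pending` methods, and the `objectUuid` field (one of the "more if needed" columns, used here to recognise the same object) are hypothetical, and an in-memory list stands in for the "Conflicts" table.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the "Conflicts" table. */
final class ConflictQueueSketch {

    static final class Conflict {
        final String openmrsClass;
        final String objectUuid;   // assumed extra column identifying the object
        final String firstObject;  // serialized object, e.g. JSON
        final String secondObject; // serialized object, e.g. JSON
        boolean voided;

        Conflict(String openmrsClass, String objectUuid, String firstObject, String secondObject) {
            this.openmrsClass = openmrsClass;
            this.objectUuid = objectUuid;
            this.firstObject = firstObject;
            this.secondObject = secondObject;
        }
    }

    private final List<Conflict> rows = new ArrayList<>();

    /** Adds a conflict and voids any older conflict recorded for the same object. */
    Conflict enqueue(String openmrsClass, String objectUuid, String firstObject, String secondObject) {
        for (Conflict old : rows) {
            if (old.objectUuid.equals(objectUuid)) {
                old.voided = true;
            }
        }
        Conflict conflict = new Conflict(openmrsClass, objectUuid, firstObject, secondObject);
        rows.add(conflict);
        return conflict;
    }

    /** Returns the conflicts that still wait for a strategy or a user, in insertion order. */
    List<Conflict> pending() {
        List<Conflict> result = new ArrayList<>();
        for (Conflict conflict : rows) {
            if (!conflict.voided) {
                result.add(conflict);
            }
        }
        return result;
    }
}
```

Voiding instead of deleting keeps the older rows available for inspection, which matches the wording "the old ones should be voided".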
...
Conflict Detection
Conflict detection is based on hash codes. Every object has its own Sync hash code, which is generated from that object. During a pull or push operation, a child instance compares the hash codes of the local object, the parent's object and the most recently saved parent's object. The hash code of the most recently saved parent's object is persisted in the database. We need a few abbreviations before we explain when a conflict is detected:
- #C : the hash code of the child instance object
- #P : the hash code of the parent instance object
- #lastP : the last saved hash code of the parent instance object.
The conflict is detected when #lastP != NULL && #lastP != #P && #lastP != #C.
Note! We add a new row to the sync_parent_object_hashcode database table for each object pulled from its Parent instance. We keep only the most current hash code of each object.
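The detection condition translates directly into a small predicate. In this sketch the class and method names are hypothetical and hash codes are assumed to be plain strings; only the boolean expression itself comes from the rule above.

```java
/** Hypothetical helper mirroring the condition #lastP != NULL && #lastP != #P && #lastP != #C. */
final class ConflictPredicate {

    private ConflictPredicate() {
    }

    static boolean isConflict(String childHash, String parentHash, String lastParentHash) {
        // String.equals(null) is false, so a missing #P or #C counts as "different".
        return lastParentHash != null
                && !lastParentHash.equals(parentHash)
                && !lastParentHash.equals(childHash);
    }
}
```

Read this way, a conflict requires that both sides differ from the last saved parent version. If only the parent changed (#C still equals #lastP) or only the child changed (#P still equals #lastP), the predicate is false.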
...
Conflict Resolution
Conflict resolution can be handled automatically or manually. After a conflict is detected, it is passed to an object/service that implements the MergeBehaviour interface. A MergeBehaviour implementation defines a strategy for dealing with conflicts automatically. (In fact, you can add your own implementation. All hints can be found here.) All unresolved conflicts are saved so that they can be solved by the users.
If the same object causes another conflict, the old conflict is replaced with the new one. So there is only one conflict, the most recent, for any object.
The MergeBehaviour in use is determined by the global property "sync2.mergeBehavior". For instance, when it is set to "sync2.newIsTheBestMergeBehaviour", NewIsTheBestMergeBehaviour will be used.
Sync 2 version 1.4.0 provides two example MergeBehaviour implementations:
1) RestrictConflictMergeBehaviourImpl - "sync2.restrictConflictMergeBehaviour"
This is the default strategy. It uses hash codes to check whether both objects were changed; if so, it creates an unresolved conflict to be solved by the users. If only one object changed, that object is chosen as the newest version.
2) NewIsTheBestMergeBehaviourImpl - "sync2.newIsTheBestMergeBehaviour"
This strategy uses the date of change to determine the newest version. If the object has no date of change, the strategy chooses the source object.
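The two strategies and their selection by property value can be illustrated as follows. This is a sketch under stated assumptions, not the real MergeBehaviour API: the `Behaviour` interface, the `Version` and `Result` types and the `forProperty` lookup are hypothetical. Falling back to the restrict strategy for unknown values, preferring the source object on equal dates or when neither side changed, and treating a missing date on either side as "no date of change" are choices made only for this example.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the two example strategies; not the real MergeBehaviour API. */
final class MergeBehaviourSketch {

    enum Result { USE_SOURCE, USE_TARGET, UNRESOLVED }

    /** One side of a conflict: its hash code and, if known, its date of change. */
    static final class Version {
        final String hash;
        final Date dateChanged; // may be null

        Version(String hash, Date dateChanged) {
            this.hash = hash;
            this.dateChanged = dateChanged;
        }
    }

    interface Behaviour {
        Result resolve(Version source, Version target, String lastParentHash);
    }

    /** Both sides changed: leave the conflict to the users; otherwise pick the changed side. */
    static Result restrictConflict(Version source, Version target, String lastParentHash) {
        boolean sourceChanged = !source.hash.equals(lastParentHash);
        boolean targetChanged = !target.hash.equals(lastParentHash);
        if (sourceChanged && targetChanged) {
            return Result.UNRESOLVED;
        }
        return targetChanged ? Result.USE_TARGET : Result.USE_SOURCE;
    }

    /** The more recently changed side wins; without a date the source object is chosen. */
    static Result newIsTheBest(Version source, Version target, String lastParentHash) {
        if (source.dateChanged == null || target.dateChanged == null) {
            return Result.USE_SOURCE;
        }
        return target.dateChanged.after(source.dateChanged) ? Result.USE_TARGET : Result.USE_SOURCE;
    }

    /** Maps a value of the "sync2.mergeBehavior" global property to a strategy. */
    static Behaviour forProperty(String propertyValue) {
        Map<String, Behaviour> known = new HashMap<>();
        known.put("sync2.restrictConflictMergeBehaviour", MergeBehaviourSketch::restrictConflict);
        known.put("sync2.newIsTheBestMergeBehaviour", MergeBehaviourSketch::newIsTheBest);
        // Assumption: unknown or missing values fall back to the default strategy.
        return known.getOrDefault(propertyValue, MergeBehaviourSketch::restrictConflict);
    }
}
```

With a source version changed at an earlier date than the target version, `forProperty("sync2.newIsTheBestMergeBehaviour")` picks the target, while the restrict strategy reports the same pair as unresolved when both hash codes differ from the last saved parent hash code.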