This article is a proposal of conflict detection and resolution, which is a result of research.
Index:
- Introduction
- Research
- Conclusions
- Resources
1. Introduction
Conflict scenarios:
- Insert-insert
All objects in each table participating in peer-to-peer replication are uniquely identified by using primary key values. An insert-insert conflict occurs when an object with the same key value was inserted at more than one node. - Update-update
Occurs when the same object was updated at more than one node. - Insert-update
Occurs if an object was updated at one node, but the same object was deleted and then reinserted at another node. - Insert-delete
Occurs if a row was deleted at one node, but the same row was deleted and then reinserted at another node. - Update-delete
Occurs if a row was updated at one node, but the same row was deleted at another node. - Delete-delete
Occurs when a row was deleted from more than one node.
Greyed-out scenarios do not apply to our system, because:
- we use UUID, so ther are no PK conflicts
- we do not delete object, we void them if needed, right?
2. Research
2.1. Oracle
Conflict detection
Main conception:
To detect and resolve an update conflict for a row, the propagating site must send a certain amount of data about the new and old versions of the row to the receiving site.
More info here!
How it is done:
The receiving site detects an update conflict if there is any difference between the old values of the replicated row (the values before the modification) and the current values of the same row at the receiving site.
Let's take an example:
- New Patient with name "A" is created on Parent instance
- Both Child 1 and Child 2 pull new Patient with name "A"
- Child 1 updates name of Patient to "B" and pushes to Parent
- Conflict detection checks if name saved on Parent is equal to old (before updating) name saved on Child 1
- if YES, then it means that Child 1 is updating original and up-to-date name on Parent = no conflict, update
- in NO, then it means that either Parent itself or Child 2 updated name = conflict, resolve somehow
- if YES, then it means that Child 1 is updating original and up-to-date name on Parent = no conflict, update
- Name saved on Parent is "A", old name saved on Child 1 is "A", updated name on Child 1 is "B"
- "A" == "A", conflict detection answered YES, name value was updated
- Conflict detection checks if name saved on Parent is equal to old (before updating) name saved on Child 1
- Child 2 updates name of Patient to "C" and pushes to Parent
- Conflict detection checks if name saved on Parent is equal to old (before updating) name saved on Child 2
- if YES, then it means that Child 2 is updating original and up-to-date name on Parent = no conflict, update
- in NO, then it means that either Parent itself or Child 1 updated name = conflict, resolve somehow
- if YES, then it means that Child 2 is updating original and up-to-date name on Parent = no conflict, update
- Name saved on Parent is now "B", old name saved on Child 2 is "A", updated name on Child 2 is "C"
- "B" !== "A", conflict detection answers NO, there is a conflict
- Conflict detection checks if name saved on Parent is equal to old (before updating) name saved on Child 2
Conflict resolution
If it is possible to apply, we should check out Oracle Database Advanced Replication Management API Reference.
Common Update Conflict Resolution Methods
- Latest Timestamp
- Overwrite
But out of those, only Latest Timestamp is useful with multiple master sites.
Latest Timestamp = most recent update wins
Additional Update Conflicts Resolution Methods
(those than offer convergence with multiple master sites)
- additive - for single numeric values
- maximum - takes the biggest value
- minimum - takes the smallest value
- priority group - enable you to assign a priority level to each possible value of a particular column
2.2. GitHub
Conflict detection
We can easily apply Oracle example to GitHub.
Example 1:
Let's imagine a master branch (Parent) and two branches (Child 1 and Child 2) with Parent as upstream.
- Both branches pull master
- Child 1 pushes its changes to master
- Parent code is equal to Child 1 code before making any modifications (update) = no conflict
- Child 2 pushes its changes to master
- Parent code was modified by Child 1's push, so it is NOT equal to Child 2 code before making modifications = conflict
Conflict resolution
Resolving Example 1 conflict:
Git will not let Child 2 push to master, but will notify:
Updates were rejected because the remote contains work that you do not have locally.
You will be asked to pull updated master and resolve conflicts manually.
Create UI interface for choosing which changes to accept?
2.3. Microsoft
Conflict detection
Microsoft's note:
To avoid potential data inconsistency, make sure that you avoid conflicts in a peer-to-peer topology, even with conflict detection enabled. To ensure that write operations for a particular row are performed at only one node, applications that access and change data must partition insert, update, and delete operations. This partitioning ensures that modifications to a given row that is originating at one node are synchronized with all other nodes in the topology before the row is modified by a different node.
So we just sync change in one node with ALL the rest?
How SQL Server 2017 handles it:
In systems such as peer-to-peer replication, conflicts are not detected when changes are committed at individual peers. Instead, they are detected when those changes are replicated and applied at other peers. In peer-to-peer replication, conflicts are detected by the stored procedures that apply changes to each node, based on a hidden column in each published table. This hidden column stores an ID that combines an originator ID that you specify for each node and the version of the row.
empty
2.4. Brent Ozar's website
empty
Conflict resolution
Strategies
- Manual intervention
- Logging conflicts
- Master write server
- Last write wins
- Write partitioning
a. Manual intervention
Letting users/admin choose which change to accept.
b. Logging conflicts
As conflicts are logged, they can be written to a queue where conflict resolution software and kick into action and attempt to resolve the conflicts automatically.
c. Master write server
One instance writes to all child instances - rather impossible solution
d. Last write wins
Oracle's Last Timestamp
e. Write partitioning
Ensure that all writes for a single row will always happen on the same server.
- useless in our system
3. Conclusions
Conflict detection
The most reasonable solution for conflict detection could be:
- On every manual push of Child to Parent - first download Parent's version of object.
- Compare values of Parent's object to object stored on Child (before modifications).
- If Parent's object is equal to Child's object than it means that Child is operating on up-to-date object and there is no conflict.
- However, if they differ, it means that Child is operating on outdated object and conflict is detected.
Note: Some OpenMRS classes offer voiding, that saves outdated values as voided and keep date of change - but not all classes!
However with this solution we will have problems detecting conflicts when object are synced automatically, we should figure out if there is a solution for this case.
Different approach is better because it does not need to detect conflicts, because they can be prevented. However it requires all nodes to be updated after a single change of object in one node - that for sure is hard to accomplish due to: amount of operations that would have to be executed, constant internet connection (node-node connection).
Conflict resolution
The most reasonable solution for conflict resolution could be:
After a conflict was detected:
- Show "conflict resolution" page to the user to let him choose between conflicted values.
or - Use Latest Timestamp approach.
or - Action taken by Parent instance.
4. Resources
Conflict detection and resolution by Oracle: https://docs.oracle.com/cd/E11882_01/server.112/e10706/repconflicts.htm
Microsoft SQL Server Conflict Detection in Peer-to-Peer Replication: https://docs.microsoft.com/en-us/sql/relational-databases/replication/transactional/peer-to-peer-conflict-detection-in-peer-to-peer-replication?view=sql-server-2017
Brent Ozar article: https://www.brentozar.com/archive/2011/06/resolving-conflicts-database/