Collaboration is Conflict. IRL and online.

All collaborative applications need to devise a way to handle conflicts between their clients. A large number of creative endeavours start and end in an online collaborative environment, whether that be a Notion doc, a Figma file, a Google Sheet or a Git repository. The appeal lies in seamless collaboration, and this collaboration is THE monetizable component for many SaaS applications.

How do applications achieve this coveted experience of seamless collaboration?

Defining our Wants

Defining Conflict Resolution Rules

Defining a CRDT

Defining Granularity

While discussing each of these, we’ll assume we’re working with a collaborative text editor.

Defining our Wants

We want N people to be able to write to a document at the same time while sending and receiving each other’s updates.

We want users to keep their own changes wherever possible, rather than having them overwritten, and to have a ‘clean’ writing experience.

We want to establish eventual consistency such that all peers should EVENTUALLY see the same document.

Defining Conflict Resolution Rules

Conflict resolution strategies depend on one’s expectations from a system.
Something like Git leaves all merging and conflict resolution up to the user.
On the other hand, Google Docs does this automagically. This is a classic distributed systems problem. One of the solutions is to employ CRDTs - Conflict-free Replicated Data Types.

Defining a CRDT

To stay conflict-free, peers adopt one of two conflict resolution paradigms. Each is stated below from a peer’s point of view.

We’ll always push ordered changes (King, e4) which can be applied one after the other to result in a final consistent state. This is an Operational CRDT (more commonly called an operation-based CRDT).

We’ll always push our entire state (snapshot the entire chessboard) and the newest state (decided via a version) should overwrite the older one to result in a final consistent state. This is a State CRDT (also called a state-based CRDT).

Operational CRDTs are frugal with network bandwidth because they send only the changes, but those changes NEED to be applied in order. Preserving order across peers is itself a notoriously hard problem in distributed systems, and pursuing this is like digging ourselves a nice comfortable hole.
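To see why ordering matters, here is a minimal sketch of replaying operations on a text document. The `Insert` type and `replay` helper are hypothetical names made up for this illustration, not part of any library. The same two operations produce different documents depending on the order in which they arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical operation: insert `text` at character offset `index`."""
    index: int
    text: str


def replay(ops: list[Insert]) -> str:
    """Apply operations one after the other, starting from an empty document."""
    doc = ""
    for op in ops:
        doc = doc[:op.index] + op.text + doc[op.index:]
    return doc


ops = [Insert(0, "world"), Insert(0, "hello ")]

in_order = replay(ops)                      # "hello world"
out_of_order = replay(list(reversed(ops)))  # "worldhello "
```

A peer that receives the second operation first ends up with a different document, which is exactly the divergence we are trying to avoid.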

On the other hand, State CRDTs push their entire changed state each time and any of the peer states can ‘win’ and be declared the final state. All the other peers lose their ‘progress’ in this case.
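The ‘winning’ rule can be sketched as a last-write-wins merge. This assumes each state carries a timestamp plus a peer ID that is used only to break ties; `State` and `merge` are illustrative names, and trusting plain timestamps is a simplification.

```python
from typing import NamedTuple


class State(NamedTuple):
    """Hypothetical full-state message sent by a peer."""
    timestamp: int
    peer_id: str
    text: str


def merge(local: State, remote: State) -> State:
    """Keep whichever state is newer; peer_id breaks timestamp ties."""
    if (remote.timestamp, remote.peer_id) > (local.timestamp, local.peer_id):
        return remote
    return local


alice = State(timestamp=10, peer_id="alice", text="Hello from Alice")
bob = State(timestamp=12, peer_id="bob", text="Hello from Bob")

# The result is the same whichever side merges first, so peers converge.
winner = merge(alice, bob)
assert merge(bob, alice) == winner
# winner is bob's state; alice's text is discarded entirely.
```

Because the merge does not care about arrival order, no ordering guarantee is needed from the network, at the cost of dropping the losing peer's edits.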

A State CRDT is a simple thing. It’s any class that holds a value and state (peer ID, timestamp) with a mechanism to merge two states.

Value - this is what all the peers are trying to agree upon.

State - this is a serializable representation of the value along with the timestamp of the last change and the peer ID.
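As a rough illustration of what ‘serializable’ could mean here, the state might be a small record that round-trips through JSON. The field names and the `to_wire` / `from_wire` helpers are assumptions made for this sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RegisterState:
    """Hypothetical state: the value plus the metadata needed to merge it."""
    peer_id: str
    timestamp: int
    value: str


def to_wire(state: RegisterState) -> str:
    """Serialize the state so it can be sent to other peers."""
    return json.dumps(asdict(state))


def from_wire(payload: str) -> RegisterState:
    """Rebuild the state on the receiving peer."""
    return RegisterState(**json.loads(payload))


sent = RegisterState(peer_id="alice", timestamp=10, value="Hello")
received = from_wire(to_wire(sent))  # equal to `sent` after the round trip
```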

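```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")  # the value peers are trying to agree upon
S = TypeVar("S")  # serializable state: value + timestamp + peer ID


class CRDT(ABC, Generic[T, S]):
    value: T
    state: S

    @abstractmethod
    def merge(self, state: S) -> None:
        """Merge a remote peer's state into this one."""
```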