This document defines Convergent Replicated Text (CR-Text), a delta CRDT for replicated text value state. CR-Text represents the visible string as an ordered sequence of grapheme-cluster string values backed by CR-List [[CRLIST]].

The specification defines the text projection, local insert and remove operations, merge and compaction behavior inherited from CR-List, event dispatching, and one conforming JavaScript binding.

This specification is maintained alongside a reference implementation.

You can check the reference implementation docs:

See it in action:

Or use the reference implementation:

Inspired by [[MARTIN_KLEPPMANN]].

Unless otherwise stated, data-structure and algorithm terms such as list, map, record, sequence, parameters, variables, and iterate are used with the meanings defined by Infra [[INFRA]].

A conforming implementation MUST preserve the visible text convergence, grapheme-cluster editing, event ordering, and tombstone-safety constraints defined by this specification and by its CR-List substrate [[CRLIST]].

Terminology

A CR-Text replica is one independently mutable instance of the data structure.

A grapheme cluster is the user-perceived text segment produced by ECMAScript internationalization segmentation with granularity set to "grapheme" [[ECMA402]].

A text value is one string value stored as a CR-List entry. Each text value represents one grapheme cluster.

A text projection is the consumer-facing string formed by concatenating the visible text value(s) in CR-List order.

A text index is a zero-based position in the current text projection, counted in visible text value(s).

A snapshot is the detached structured-clone-compatible serialized state of a CR-Text replica. It has the same shape as a CR-List snapshot whose value payloads are strings [[CRLIST]].

A delta is a partial snapshot containing only the values or tombstones a replica wishes to share.

An acknowledgement frontier is the CR-List acknowledgement frontier produced for the current replica state [[CRLIST]].

A change patch is a sparse record whose keys are decimal string forms of text index(s). Inserted indexes map to string values and deleted indexes map to undefined.

Core Model

A CR-Text replica is a text-specialized projection over CR-List. The underlying CR-List stores strings, and each visible list value is interpreted as one text value.

Local insertions split the supplied string into grapheme cluster values and append or prepend those values into the underlying CR-List. Local removals delete a contiguous range of visible text values from the underlying CR-List.

Merge, snapshot, acknowledgement, and garbage-collection behavior are inherited from CR-List. The convergence target is the text projection, not identical internal tombstone storage at every intermediate step.

Snapshot and delta ingress is intentionally tolerant through the CR-List substrate. Malformed entries ignored by CR-List do not become visible text.

Data Structure

The same logical replica can be viewed as the CR-List-compatible snapshot used for transport, the runtime projection held by the binding, and the final consumer-facing string.

Snapshot

{
  "values": [
    {
      "uuidv7": "018f6a2b-7c8d-7000-8000-000000000001",
      "value": "H",
      "predecessor": "\\u0000"
    },
    {
      "uuidv7": "018f6a2b-7c8d-7000-8000-000000000002",
      "value": "i",
      "predecessor": "018f6a2b-7c8d-7000-8000-000000000001"
    }
  ],
  "tombstones": []
}
        

Runtime

{
  size: number,
  cursor: TextStateEntry | undefined,
  values: Map<UUIDv7Identifier, TextStateEntry>,
  tombstones: Set<UUIDv7Identifier>
}

type TextStateEntry = {
  uuidv7: UUIDv7Identifier,
  value: GraphemeCluster,
  predecessor: UUIDv7Identifier | RootPredecessor,
  index: number,
  prev: TextStateEntry | undefined,
  next: TextStateEntry | undefined
}
        

UUIDv7Identifier denotes the UUID version 7 identifiers used by CR-List [[RFC9562]]. RootPredecessor denotes the CR-List root predecessor sentinel '\0' [[CRLIST]].

Consumer

"Hi"
        

Core Algorithms

Text Projection

Segment Characters

  1. Let input be the supplied string.
  2. Create an ECMAScript Intl.Segmenter with granularity set to "grapheme" [[ECMA402]].
  3. Segment input with that segmenter in source order.
  4. Return the list of segment strings. Each returned string is a grapheme cluster.

Materialize Text Projection

  1. Let output be the empty string.
  2. For each visible entry in ascending text index order:
    1. If the entry value is not a string, continue to the next visible entry.
    2. Append the entry value to output.
  3. Return output as the text projection.

CRUD

Create

  1. Let replica be a new underlying CR-List replica hydrated from the supplied optional snapshot [[CRLIST]].
  2. Ignore malformed snapshot entries rejected by the underlying replica.
  3. Treat visible string payloads as text value(s). Non-string payloads MUST NOT contribute to the text projection.
  4. Return replica as a CR-Text replica.

Read

  1. To read one text value, inspect the visible entry at the supplied text index.
  2. If the index is out of bounds or the visible entry value is not a string, return undefined.
  3. Return a detached structured clone of that visible string value.
  4. To read the full string, run materialize text projection.

Insert

  1. If index is not a number or characters is not a string, then fail with BAD_PARAMS.
  2. Let values be the result of running segment characters with characters.
  3. If values is empty, return no change.
  4. Let targetIndex be index and let mode be after.
  5. If index is -1, then:
    1. Set targetIndex to 0.
    2. If the replica has at least one visible entry, set mode to before.
  6. Insert values into the underlying ordered replica at targetIndex using mode [[CRLIST]].
  7. If the underlying ordered replica reports no mutation, return no change.
  8. Return the produced delta and change patch.

Remove

  1. If index is not a number or removeCount is not a number, then fail with BAD_PARAMS.
  2. Let endIndex be index plus removeCount.
  3. Delete the visible range [index, endIndex) from the underlying ordered replica [[CRLIST]].
  4. If the effective deleted range is empty, return no change.
  5. Return the produced delta and change patch. Deleted indexes in the patch map to undefined.

MAGS

Merge

  1. If the supplied candidate is not a record, return no change.
  2. Merge the candidate into the underlying ordered replica [[CRLIST]].
  3. If no visible state was adopted or removed, return no change.
  4. Return the minimal change patch for the affected visible text indexes.

Acknowledge

  1. If the local replica retains no tombstones, return no acknowledgement frontier.
  2. Iterate the retained tombstones and keep the lexicographically greatest valid UUID version 7 identifier [[RFC9562]].
  3. Return that identifier as the replica's acknowledgement frontier.

Garbage Collect

  1. If the supplied frontier input is not a list, return.
  2. Sort the supplied frontiers lexicographically.
  3. Let boundary be the smallest supplied frontier that is a valid UUID version 7 identifier [[RFC9562]].
  4. If no such frontier exists, return.
  5. Remove every retained tombstone whose identifier is less than or equal to boundary.

This algorithm is only safe when the caller supplies acknowledgement frontiers covering every replica whose future convergence must still be preserved.

Snapshot

  1. For each current live entry, create a detached snapshot entry containing its UUID version 7 identifier, value, and predecessor.
  2. Assemble those entries into the snapshot's values list.
  3. Assemble all retained tombstones into the snapshot's tombstones list.
  4. Return the resulting snapshot.

Reference JavaScript Binding

This section defines one conforming JavaScript binding for the model above.

API Summary

[Exposed=*]
interface CRText {
  constructor(optional object snapshot = {});
  readonly attribute unsigned long size;
  undefined insertAfter(long index, DOMString characters);
  undefined removeAfter(long index, long removeCount);
  undefined merge(optional object delta = {});
  undefined acknowledge();
  undefined garbageCollect(sequence<DOMString> frontiers);
  undefined snapshot();
  undefined addEventListener(
    DOMString type,
    EventListener? listener,
    optional (AddEventListenerOptions or boolean) options = {}
  );
  undefined removeEventListener(
    DOMString type,
    EventListener? listener,
    optional (EventListenerOptions or boolean) options = {}
  );
  object toJSON();
  stringifier DOMString ();
  iterable<DOMString>;
};

dictionary CRTextSnapshotValueEntry {
  required DOMString uuidv7;
  required DOMString value;
  required DOMString predecessor;
};

dictionary CRTextSnapshot {
  required sequence<CRTextSnapshotValueEntry> values;
  required sequence<DOMString> tombstones;
};

dictionary CRTextDelta {
  sequence<CRTextSnapshotValueEntry> values;
  sequence<DOMString> tombstones;
};

typedef record<DOMString, any> CRTextChange;
typedef DOMString CRTextAck;
        

The constructor and merge() method are typed as taking generic objects in the coarse Web IDL summary above so that the IDL matches the permissive JavaScript ingress behavior. The expected object shapes are CRTextSnapshot and CRTextDelta.

size returns the current number of visible text value(s). valueOf(), Symbol.toPrimitive, runtime inspect hooks, and primitive coercion expose the current text projection.

When the current snapshot is JSON-compatible, JSON.stringify(text) serializes the binding through that snapshot. toString() returns the JSON string form of the current snapshot, while valueOf() returns the visible text string.

Errors

The JavaScript binding throws CRTextError with a string code attribute for local API misuse.

BAD_PARAMS
Thrown when insertAfter() or removeAfter() receives arguments of the wrong runtime types.

Events

The JavaScript binding maintains an internal EventTarget object as defined by DOM [[DOM]].

The binding forwards addEventListener() and removeEventListener() to that internal event target. Local mutations, merges, acknowledgement emission, and snapshot emission dispatch synthetic events through the same target.

The JavaScript binding defines four synthetic event types. Each is a CustomEvent object whose detail attribute is initialized as described below [[DOM]].

delta event
A CustomEvent whose detail is a delta. The binding dispatches this event for local insertions and removals that produce gossip material.
change event
A CustomEvent whose detail is a change patch.
ack event
A CustomEvent whose detail is the current acknowledgement frontier produced when acknowledge() is invoked and a frontier exists.
snapshot event
A CustomEvent whose detail is the full current snapshot produced when snapshot() is invoked.

insertAfter() and removeAfter() MUST dispatch delta event before change event when the operation changes visible text. merge() MUST dispatch only a change event when the visible text projection changes. acknowledge() MUST dispatch an ack event when a frontier exists, and snapshot() MUST dispatch a snapshot event.

Methods

Constructor

Run Create with the optional snapshot.

size

Return the current number of visible text value(s).

insertAfter()

Run Insert with the supplied index and characters. Dispatch the resulting delta event and change event. Passing -1 inserts at the beginning of the document.

removeAfter()

Run Remove with the supplied index and remove count. Dispatch the resulting delta event and change event when the visible text changes.

merge()

Run Merge with the supplied delta. If the visible text changes, dispatch a change event whose detail is the minimal visible patch.

acknowledge()

Run Acknowledge and dispatch an ack event when a frontier exists.

garbageCollect()

Run Garbage Collect with the supplied list of acknowledgement frontiers.

snapshot()

Run Snapshot and dispatch a snapshot event with the resulting value. The method returns undefined.

@@iterator

Iterate the current visible text value(s) in ascending text index order.

toJSON()

Return the same detached structured-clone-compatible shape as snapshot. This is the value used by JSON.stringify.

toString()

Return the JSON string form of the current snapshot.

addEventListener() and removeEventListener()

Forward listener registration and listener removal to the binding's internal EventTarget object using the provided type, listener, and options values.