This document specifies a UUIDv7-based optimization strategy for an observed-remove set (OR-Set). It defines how additions are identified, how removals are represented as compact tombstones, how replicas merge, and how time-ordered UUIDv7 identifiers can support practical garbage collection schemes driven by acknowledged frontiers.

This is an unofficial specification maintained alongside a reference implementation.

The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are to be interpreted as described in BCP 14 [[RFC2119]] [[RFC8174]] when, and only when, they appear in all capitals.

Unless otherwise stated, neutral data-structure and algorithm terms such as list, set, contains, append, and remove are used with the meanings defined by Infra [[INFRA]].

A conforming implementation MUST preserve the replica convergence and removal semantics defined by the algorithms in this specification.

Introduction

An OR-Set gives each value a stable identity that can later be marked as tombstoned. In the classic OR-Set, adds are distinguished by fresh unique tags and removes record the observed tagged values [[ORSET]]. Once a removed identity has been observed, the removed value no longer needs to remain present for remove semantics.

UUIDv7 is a particularly useful identifier format for this purpose [[RFC9562]]. It is fixed-width, designed for time-ordering, and includes enough non-time entropy to keep collision risk negligible for normal distributed application use.

This combination gives a practical OR-Set profile:

Terminology

An OR-Set replica is one independently mutable copy of the data type.

An identifier is the per-add token that distinguishes one observed add from another.

A UUIDv7 identifier is an identifier that is a valid UUID version 7 as defined by [[RFC9562]].

A stored value is a payload value associated with an identifier.

A live value is a stored value whose identifier is present in a replica's live item store and absent from that replica's tombstone set.

A tombstone is an identifier recorded as removed.

A snapshot is a transportable state containing a collection of live values and a collection of tombstones.

A well-formed snapshot is a value that is an object and has a values member whose value is a list and a tombstones member whose value is a list.

An acknowledgement frontier is deployment metadata that states, for a given actor, the greatest identifier below or equal to which that actor will not later surface previously unseen operations.

Core Model

Each OR-Set replica maintains two logical collections:

Remove semantics depend on identifiers, not on payload equality. A conforming implementation MUST therefore record at least the removed identifier and MAY discard the removed value itself. It MUST NOT require the removed payload to remain available in order to preserve remove semantics.

Each live identifier MUST correspond to at most one accepted add observation in a replica's live item store. If multiple candidate live values with the same identifier are received, the first accepted live value MAY be retained and later duplicates MAY be ignored.

UUIDv7 Identifier Requirements

A conforming UUIDv7-based OR-Set implementation MUST assign one UUIDv7 identifier to each add observation.

Implementations SHOULD preserve a caller supplied UUIDv7 identifier only when that identifier is valid, not currently live, and not already tombstoned.

If an implementation needs a total order over identifiers, it MUST compare them by their UUID value. A host MAY compare canonical textual UUID representations directly because UUIDv7 values are intended to be lexicographically sortable in text as well as in raw bytes [[RFC9562]].

Replica Semantics

Add

An add operation MUST produce or adopt one UUIDv7 identifier and associate that identifier with exactly one payload value.

If the chosen identifier is already live in the target replica, the add operation MAY be treated as a no-op.

If the chosen identifier is tombstoned in the target replica, the add operation MUST NOT resurrect that tombstoned observation. A conforming implementation SHOULD instead mint a fresh UUIDv7 identifier for the new add observation.

Remove

A remove operation in the reference JavaScript binding operates over locally visible live state.

If a live value with that identifier exists, it MUST be removed from the live value store and the identifier MUST be recorded in the tombstone set.

If no live value with that identifier exists in the local replica, the reference JavaScript binding MUST treat the remove operation as a no-op.

Merge

Merge operates over a snapshot. A conforming implementation MUST process tombstones before live values from the same incoming snapshot.

When merging:

  1. If the incoming value is not a well-formed snapshot, then the implementation MUST reject it.
  2. For each incoming tombstone:
    1. If the tombstone is not a valid UUIDv7 identifier, continue.
    2. Add the tombstone to the local tombstone set.
    3. Remove any live value with the same identifier from the local live value store.
  3. For each incoming live value:
    1. Let id be the value's identifier.
    2. If id is not a valid UUIDv7 identifier, continue.
    3. If the local tombstone set contains id, continue.
    4. If the local live value store already contains id, continue.
    5. Insert the live value under id.

This ordering guarantees that an incoming item whose identifier is tombstoned by the same merge will not be reintroduced as live state.

Snapshot

A snapshot MUST contain enough information to recreate the live value store and the tombstone set.

A snapshot MAY ignore malformed individual values and malformed individual tombstones so long as malformed top-level input is still rejected.

Tombstone Garbage Collection

Tombstone garbage collection is deployment specific. This specification does not require any particular garbage collection strategy.

A conforming implementation MAY expose tombstones to application logic so that an external replication layer can decide when collection is safe.

A deployment MAY maintain one acknowledgement frontier per actor and update that frontier with the greatest acknowledged UUIDv7 identifier for that actor.

When acknowledgements are exchanged across trust boundaries, they SHOULD be authenticated, for example by signatures.

If a deployment gives each actor frontier the meaning that the actor will not later surface previously unseen operations at identifiers less than or equal to that frontier, then the deployment MAY compute a garbage collection floor as the minimum acknowledged frontier across the relevant actors.

Under that stronger acknowledgement model, tombstones whose identifiers are strictly less than the garbage collection floor MAY be deleted.

Payload Guidance

This technique is best suited to membership state and static metadata. It is not, by itself, a field-level CRDT for mutating nested object graphs.

An implementation MAY shallow-freeze stored values as a low-cost hint that item payloads are not intended to be mutated in place. Such a hint does not create deep replication semantics for nested values.

Reference JavaScript Binding

This section defines one conforming JavaScript binding for the model above.

API Summary

[Exposed=*]
interface ORSet {
  constructor();
  constructor(ORSetSnapshot snapshot);
  readonly attribute unsigned long size;
  boolean has(any value);
  undefined append(any value);
  undefined clear();
  undefined remove(any value);
  sequence<any> values();
  any tombstones();
  undefined merge(ORSetSnapshot ingress);
  undefined snapshot();
  undefined addEventListener(DOMString type, any listener, optional any options);
  undefined removeEventListener(DOMString type, any listener, optional any options);
};

dictionary ORSetSnapshot {
  required sequence<any> values;
  required sequence<DOMString> tombstones;
};

dictionary ORSetMergeResult {
  sequence<DOMString> removals = [];
  sequence<any> additions = [];
};
        

Events

The JavaScript binding maintains an internal EventTarget object as defined by DOM [[DOM]].

This binding is not specified as inheriting from EventTarget. Instead, its listener methods forward to that internal event target, and its local mutations and merges dispatch synthetic events through it.

The JavaScript binding defines three synthetic event types. Each is a CustomEvent object whose detail attribute is initialized as described below [[DOM]].

delta event
A CustomEvent whose detail contains only the values and tombstones produced by one local mutation.
snapshot event
A CustomEvent whose detail is the full current snapshot produced when snapshot() is invoked.
merge event
A CustomEvent whose detail contains the added values and removals accepted from an incoming merge.

A no-op mutation MUST NOT dispatch any event.

Errors

If the JavaScript binding rejects a malformed top-level snapshot, it MUST throw an Error-derived object whose name is ORSetError and whose code is BAD_SNAPSHOT.

Algorithms

Constructor

  1. Set size to 0.
  2. Initialize the live value store to an empty object and the tombstone set to an empty set.
  3. If snapshot is undefined, then return.
  4. If snapshot is not a well-formed snapshot, then throw an ORSetError with code BAD_SNAPSHOT.
  5. For each tomb in snapshot's tombstones:
    1. If tomb is not a valid UUIDv7 identifier, continue.
    2. Add tomb to the tombstone set.
  6. For each value in snapshot's values:
    1. Let id be value's __uuidv7 member.
    2. If id is not a valid UUIDv7 identifier, continue.
    3. If the tombstone set contains id, continue.
    4. If the live value store already has its own property named id, continue.
    5. Store Object.freeze(value) in the live value store at id.
    6. Increase size by 1.

has()

  1. Let id be value if value is a string; otherwise let id be value.__uuidv7.
  2. Return true if the live value store has its own property named id; otherwise return false.

append()

  1. Let id be value.__uuidv7.
  2. If id is a valid UUIDv7 identifier and the live value store already has its own property named id, then return.
  3. If id is a valid UUIDv7 identifier and the tombstone set does not contain id, then let stored value be Object.freeze(value).
  4. Otherwise, let stored value be Object.freeze({...value, __uuidv7: new UUIDv7}), where new UUIDv7 is freshly generated.
  5. Let next id be stored value's __uuidv7.
  6. Store stored value in the live value store at next id.
  7. Increase size by 1.
  8. Dispatch a delta event whose detail contains values: [stored value] and tombstones: [].

clear()

  1. If size is 0, then return.
  2. Let egress tombstones be a new empty list.
  3. For each own property name id of the live value store:
    1. Add id to the tombstone set.
    2. Delete the live value store entry at id.
    3. Append id to egress tombstones.
  4. Set size to 0.
  5. Dispatch a delta event whose detail contains values: [] and tombstones: egress tombstones.

remove()

  1. Let id be value if value is a string; otherwise let id be value.__uuidv7.
  2. Let had item be whether the live value store has its own property named id.
  3. If had item is false, then return.
  4. Add id to the tombstone set.
  5. Delete the live value store entry at id.
  6. Decrease size by 1.
  7. Dispatch a delta event whose detail contains values: [] and tombstones: [id].

values()

Return a new list containing the live values in the live value store's enumeration order.

tombstones()

Return the live tombstone set itself so that application level replication logic can inspect or compact it directly.

merge()

  1. If ingress is not a well-formed snapshot, then throw an ORSetError with code BAD_SNAPSHOT.
  2. Let additions be a new empty list.
  3. Let removals be a new empty list.
  4. For each tomb in ingress's tombstones:
    1. If the tombstone set already contains tomb, continue.
    2. If tomb is not a valid UUIDv7 identifier, continue.
    3. Let had item be whether the live value store has its own property named tomb.
    4. Add tomb to the tombstone set.
    5. Delete the live value store entry at tomb.
    6. If had item is true, decrease size by 1.
    7. Append tomb to removals.
  5. For each value in ingress's values:
    1. Let id be value's __uuidv7 member.
    2. If id is not a valid UUIDv7 identifier, continue.
    3. If the tombstone set contains id, continue.
    4. If the live value store already has its own property named id, continue.
    5. Store Object.freeze(value) in the live value store at id.
    6. Increase size by 1.
    7. Append value to additions.
  6. If additions is empty and removals is empty, then return.
  7. Dispatch a merge event whose detail contains additions and removals.

snapshot()

  1. Let snapshot be a new object whose values member is a new list containing the current live values and whose tombstones member is a new list containing the current tombstones.
  2. Dispatch a snapshot event whose detail is snapshot.
  3. Return undefined.

addEventListener() and removeEventListener()

Forward listener registration and listener removal to the binding's internal EventTarget object using the provided type, listener, and options values.

Acknowledgements

This specification builds on the OR-Set literature [[ORSET]], the UUID design in [[RFC9562]], and practical discussions of optimized observed-remove sets such as [[NIK96A]].