This document specifies a UUIDv7-based optimization strategy for an observed-remove set (OR-Set). It defines how additions are identified, how removals are represented as compact tombstones, how replicas merge, and how time-ordered UUIDv7 identifiers can support practical garbage collection schemes driven by acknowledged frontiers.
This is an unofficial specification maintained alongside a reference implementation.
The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are to be interpreted as described in BCP 14 [[RFC2119]] [[RFC8174]] when, and only when, they appear in all capitals.
Unless otherwise stated, neutral data-structure and algorithm terms such as list, set, contains, append, and remove are used with the meanings defined by Infra [[INFRA]].
A conforming implementation MUST preserve the replica convergence and removal semantics defined by the algorithms in this specification.
An OR-Set gives each value a stable identity that can later be marked as tombstoned. In the classic OR-Set, adds are distinguished by fresh unique tags and removes record the observed tagged values [[ORSET]]. Once a removed identity has been observed, the removed value no longer needs to remain present for remove semantics.
UUIDv7 is a particularly useful identifier format for this purpose [[RFC9562]]. It is fixed-width, designed for time-ordering, and includes enough non-time entropy to keep collision risk negligible for normal distributed application use.
This combination gives a practical OR-Set profile:
An OR-Set replica is one independently mutable copy of the data type.
An identifier is the per-add token that distinguishes one observed add from another.
A UUIDv7 identifier is an identifier that is a valid UUID version 7 as defined by [[RFC9562]].
A stored value is a payload value associated with an identifier.
A live value is a stored value whose identifier is present in a replica's live item store and absent from that replica's tombstone set.
A tombstone is an identifier recorded as removed.
A snapshot is a transportable state containing a collection of live values and a collection of tombstones.
A well-formed snapshot is a value that is an object and has a
values member whose value is a list and a
tombstones member whose value is a list.
An acknowledgement frontier is deployment metadata that states, for a given actor, the greatest identifier below or equal to which that actor will not later surface previously unseen operations.
Each OR-Set replica maintains two logical collections:
Remove semantics depend on identifiers, not on payload equality. A conforming implementation MUST therefore record at least the removed identifier and MAY discard the removed value itself. It MUST NOT require the removed payload to remain available in order to preserve remove semantics.
Each live identifier MUST correspond to at most one accepted add observation in a replica's live item store. If multiple candidate live values with the same identifier are received, the first accepted live value MAY be retained and later duplicates MAY be ignored.
A conforming UUIDv7-based OR-Set implementation MUST assign one UUIDv7 identifier to each add observation.
Implementations SHOULD preserve a caller supplied UUIDv7 identifier only when that identifier is valid, not currently live, and not already tombstoned.
If an implementation needs a total order over identifiers, it MUST compare them by their UUID value. A host MAY compare canonical textual UUID representations directly because UUIDv7 values are intended to be lexicographically sortable in text as well as in raw bytes [[RFC9562]].
An add operation MUST produce or adopt one UUIDv7 identifier and associate that identifier with exactly one payload value.
If the chosen identifier is already live in the target replica, the add operation MAY be treated as a no-op.
If the chosen identifier is tombstoned in the target replica, the add operation MUST NOT resurrect that tombstoned observation. A conforming implementation SHOULD instead mint a fresh UUIDv7 identifier for the new add observation.
A remove operation in the reference JavaScript binding operates over locally visible live state.
If a live value with that identifier exists, it MUST be removed from the live value store and the identifier MUST be recorded in the tombstone set.
If no live value with that identifier exists in the local replica, the reference JavaScript binding MUST treat the remove operation as a no-op.
Merge operates over a snapshot. A conforming implementation MUST process tombstones before live values from the same incoming snapshot.
When merging:
This ordering guarantees that an incoming item whose identifier is tombstoned by the same merge will not be reintroduced as live state.
A snapshot MUST contain enough information to recreate the live value store and the tombstone set.
A snapshot MAY ignore malformed individual values and malformed individual tombstones so long as malformed top-level input is still rejected.
Tombstone garbage collection is deployment specific. This specification does not require any particular garbage collection strategy.
A conforming implementation MAY expose tombstones to application logic so that an external replication layer can decide when collection is safe.
A deployment MAY maintain one acknowledgement frontier per actor and update that frontier with the greatest acknowledged UUIDv7 identifier for that actor.
When acknowledgements are exchanged across trust boundaries, they SHOULD be authenticated, for example by signatures.
If a deployment gives each actor frontier the meaning that the actor will not later surface previously unseen operations at identifiers less than or equal to that frontier, then the deployment MAY compute a garbage collection floor as the minimum acknowledged frontier across the relevant actors.
Under that stronger acknowledgement model, tombstones whose identifiers are strictly less than the garbage collection floor MAY be deleted.
This technique is best suited to membership state and static metadata. It is not, by itself, a field-level CRDT for mutating nested object graphs.
An implementation MAY shallow-freeze stored values as a low-cost hint that item payloads are not intended to be mutated in place. Such a hint does not create deep replication semantics for nested values.
This section defines one conforming JavaScript binding for the model above.
[Exposed=*]
interface ORSet {
constructor();
constructor(ORSetSnapshot snapshot);
readonly attribute unsigned long size;
boolean has(any value);
undefined append(any value);
undefined clear();
undefined remove(any value);
sequence<any> values();
any tombstones();
undefined merge(ORSetSnapshot ingress);
undefined snapshot();
undefined addEventListener(DOMString type, any listener, optional any options);
undefined removeEventListener(DOMString type, any listener, optional any options);
};
dictionary ORSetSnapshot {
required sequence<any> values;
required sequence<DOMString> tombstones;
};
dictionary ORSetMergeResult {
sequence<DOMString> removals = [];
sequence<any> additions = [];
};
The JavaScript binding maintains an internal
EventTarget object as defined by DOM [[DOM]].
This binding is not specified as inheriting from
EventTarget. Instead, its listener methods forward to
that internal event target, and its local mutations and merges
dispatch synthetic events through it.
The JavaScript binding defines three synthetic event types. Each is a
CustomEvent object whose detail attribute is
initialized as described below [[DOM]].
CustomEvent whose detail contains only
the values and tombstones produced by one local mutation.
CustomEvent whose detail is the full
current snapshot produced when
snapshot()
is invoked.
CustomEvent whose detail contains the
added values and removals accepted from an incoming merge.
A no-op mutation MUST NOT dispatch any event.
If the JavaScript binding rejects a malformed top-level snapshot, it
MUST throw an Error-derived object whose
name is ORSetError and whose
code is BAD_SNAPSHOT.
0.undefined, then return.
ORSetError with code
BAD_SNAPSHOT.
tombstones:
values:
__uuidv7 member.
Object.freeze(value) in the live value
store at id.
1.value.__uuidv7.
value.__uuidv7.Object.freeze(value).
Object.freeze({...value, __uuidv7: new UUIDv7}),
where new UUIDv7 is freshly generated.
__uuidv7.
1.values: [stored value] and
tombstones: [].
0, then return.0.values: [] and
tombstones: egress tombstones.
value.__uuidv7.
1.values: [] and tombstones: [id].
Return a new list containing the live values in the live value store's enumeration order.
Return the live tombstone set itself so that application level replication logic can inspect or compact it directly.
ORSetError with code
BAD_SNAPSHOT.
tombstones:
1.
values:
__uuidv7 member.
Object.freeze(value) in the live value
store at id.
1.additions and removals.
values member is a new list containing the current
live values and whose tombstones member is a new list
containing the current tombstones.
undefined.
Forward listener registration and listener removal to the binding's
internal EventTarget object using the provided type,
listener, and options values.
This specification builds on the OR-Set literature [[ORSET]], the UUID design in [[RFC9562]], and practical discussions of optimized observed-remove sets such as [[NIK96A]].