Changeset JSON schema
A changeset JSON document is a tree of DiffNode values wrapped
in a Changeset envelope. The shape is deliberately open:
action, item_type, and tags are unbounded strings that plugins extend.
Consumers should treat unknown values as opaque and fall through to generic
handling.
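The fall-through rule above can be sketched as a dispatch table with a generic default. This is a minimal illustration, not part of any shipped API: the handler names and the `"acme.rekey"` action in the usage note are hypothetical.

```python
from typing import Callable, Dict


def describe_add(node: dict) -> str:
    return f"added {node['path']}"


def describe_remove(node: dict) -> str:
    return f"removed {node['path']}"


def describe_generic(node: dict) -> str:
    # Fallback for any action this consumer does not recognize:
    # the value is treated as an opaque label and echoed back.
    return f"{node['action']} {node['path']}"


# Only the actions this (hypothetical) consumer understands are listed.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "add": describe_add,
    "remove": describe_remove,
}


def describe(node: dict) -> str:
    """Dispatch on `action`, falling through to generic handling."""
    handler = HANDLERS.get(node["action"], describe_generic)
    return handler(node)
```

A plugin-defined action such as `"acme.rekey"` reaches `describe_generic` instead of raising, so the consumer keeps working when plugins extend the vocabulary.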
The machine-readable schema (JSON Schema draft 2020-12) lives alongside this
page at changeset-schema.json and is generated
from the Rust IR types. The tables below are a rendering of that schema.
What is not in the changeset
- Significance classification. Clerical vs. substantive is a renderer concern, applied at render time from a tag-to-category mapping in config. The IR is judgment-free. See Significance classification.
- Transient session data. source_items and artifacts are wire-visible because the plugin ABI carries them across (potentially process-isolated) boundaries, but they are stripped at the output boundary via DiffNode::strip_transient before changeset JSON is written for users. They appear in the schema below, but callers writing changeset files should not expect to see populated values. See the Transient fields on wire ADR.
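For a consumer that handles raw wire JSON rather than the Rust types, the stripping step might look like the sketch below. It is a Python approximation written for this page, not the actual `DiffNode::strip_transient` implementation; in particular, it assumes that omitting the keys entirely is acceptable, whereas the real method may clear the values instead.

```python
# Fields described above as transient session data.
TRANSIENT_FIELDS = ("source_items", "artifacts")


def strip_transient(node: dict) -> dict:
    """Return a copy of a DiffNode dict without transient fields.

    The input is not mutated; children are processed recursively.
    """
    cleaned = {k: v for k, v in node.items() if k not in TRANSIENT_FIELDS}
    children = cleaned.get("children")
    if children:
        cleaned["children"] = [strip_transient(child) for child in children]
    return cleaned
```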
Stability
The IR is still evolving. Once a first stable version is cut, the schema will be versioned and this page will document compatibility guarantees. Until then, treat the shape as informative and pin your downstream pipeline to known plugin versions.
Where to go next
- IR and changesets — the conceptual model behind the shape documented here.
- Save and render changesets — producing and combining changeset JSON from the CLI.
- Extract changed data — using the provenance fields to pull actual changed content out of a changeset.
Types
Changeset
A structured description of how to get from one snapshot to the next.
| Field | Type | Required | Description |
|---|---|---|---|
| from_snapshot | string | yes | |
| metadata | object (map of string → string) | no | |
| root | DiffNode \| null | no | |
| to_snapshot | string | yes | |
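As a rough illustration of the envelope, the sketch below parses a changeset document and checks the two required fields. The snapshot identifiers `"v1"` and `"v2"` and the `metadata` entry are invented for the example; the loader name is hypothetical, and filling defaults for the optional fields is a convenience choice of this sketch, not documented behavior.

```python
import json

REQUIRED_CHANGESET_FIELDS = ("from_snapshot", "to_snapshot")

# Invented sample values; real snapshot identifiers depend on the producer.
EXAMPLE = """
{
  "from_snapshot": "v1",
  "to_snapshot": "v2",
  "metadata": {"note": "example"},
  "root": null
}
"""


def load_changeset(text: str) -> dict:
    """Parse changeset JSON and verify the required envelope fields."""
    doc = json.loads(text)
    missing = [f for f in REQUIRED_CHANGESET_FIELDS if f not in doc]
    if missing:
        raise ValueError(f"changeset is missing required fields: {missing}")
    # Optional fields: supply neutral defaults so callers can index freely.
    doc.setdefault("metadata", {})
    doc.setdefault("root", None)
    return doc
```

Since the schema is not yet versioned, a stricter pipeline would validate against the generated changeset-schema.json rather than a hand-written field list.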
DiffNode
A node in the diff tree — the central data structure of the system. Every comparator emits it, every transformer rewrites it, and serializers or bindings read it.
| Field | Type | Required | Description |
|---|---|---|---|
| action | string | yes | Open enum: "add", "remove", "modify", "move", "reorder", "schema_change", etc. Plugins may define new actions. |
| annotations | object (free-form) | no | Transformer-added metadata. |
| artifacts | array of ArtifactDescriptor | no | Published artifacts for this node. Session-scoped working data: carried across the plugin ABI wire as descriptors (the bytes live in the shared data_root cache), but not meaningful outside a session. Callers writing changeset output must strip this via DiffNode::strip_transient before serializing. |
| children | array of DiffNode | no | Child diff nodes forming the tree structure. |
| comparator | string \| null | no | Which comparator produced this node (provenance for extract chain). |
| details | object (free-form) | no | Comparator-specific payload, schema determined by item_type convention. |
| item_type | string | yes | Open string: "directory", "file", "tabular", "zip_archive", etc. No built-in types: conventions, not enforcement. |
| path | string | yes | Location within snapshot (logical path, including interior paths like "archive.zip/data/file.csv"). |
| source_items | ItemPair \| null | no | The original item pair that produced this node. Session-scoped working data: available during a live diff/transform session for transformers and extractors that need to re-read source data, and carried across the plugin ABI wire so separately-compiled plugins can access it. Callers writing changeset output must strip this via DiffNode::strip_transient before serializing. |
| source_path | string \| null | no | For moves/renames: the original path. |
| summary | string \| null | no | Optional human-readable one-liner describing the change. Set by comparator or transformer; used by renderers for narrative rendering. |
| tags | array of string | no | Open bag of semantic tags, namespaced by convention, e.g. "binoc.column-reorder", "biobinoc.gap-change". |
| transformed_by | array of string | no | Transformers that modified this node, in order (provenance for extract chain). |
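Because `children` nests DiffNode values recursively, most consumers start with a tree walk. The following is a minimal sketch over the parsed JSON, assuming only the fields in the table above; the helper names `walk` and `paths_by_action` are hypothetical.

```python
from typing import Dict, Iterator, List, Optional


def walk(node: dict) -> Iterator[dict]:
    """Yield a node and all of its descendants, depth-first, parents first."""
    yield node
    # `children` is optional, so treat a missing or null value as empty.
    for child in node.get("children") or []:
        yield from walk(child)


def paths_by_action(root: Optional[dict]) -> Dict[str, List[str]]:
    """Group node paths by their `action` string, in traversal order."""
    grouped: Dict[str, List[str]] = {}
    if root is None:  # Changeset.root may be null
        return grouped
    for node in walk(root):
        grouped.setdefault(node["action"], []).append(node["path"])
    return grouped
```

Grouping on the raw `action` string, rather than on a fixed enum, keeps plugin-defined actions visible in the result.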
ItemPair
A pair of items to compare. Either side may be null (None in the Rust types), as happens for an add or a remove.
| Field | Type | Required | Description |
|---|---|---|---|
| left | ItemRef \| null | no | |
| right | ItemRef \| null | no | |
ItemRef
Metadata-only view of one side of a comparison. Carries logical identity and content metadata but NOT a filesystem path; data access goes through DataAccess.

Metadata invariants: content_hash, size, and media_type are opportunistic hints. Producers (expanding comparators like directory/zip, or data backends) populate them when doing so is cheap, typically as a byproduct of work they were already performing. Consumers must not assume presence, but may trust presence: when a field is set, the value accurately reflects the current bytes. Use ItemRef::resolve_hash / ItemRef::resolve_size to obtain a value with a transparent fall-back read.

This keeps fast paths (directory-only listings, short-circuit identical detection) cheap while letting consumers that need a value hydrate on demand. The most notable such consumer is the move detector, which correlates leaves across container boundaries.
| Field | Type | Required | Description |
|---|---|---|---|
| content_hash | string \| null | no | |
| handle | string | no | Opaque identifier used by DataAccess implementations to locate data. Plugin authors should not create or interpret this value directly. |
| is_dir | boolean | yes | |
| logical_path | string | yes | |
| media_type | string \| null | no | |
| size | integer \| null | no | |
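The "must not assume presence, but may trust presence" rule can be illustrated with a size lookup that reads the data only when the hint is absent. This Python sketch is an analogy to `ItemRef::resolve_size`, not its implementation: the `read_bytes` callback is a hypothetical stand-in for a DataAccess read, and the real signature is not documented on this page.

```python
from typing import Callable, Optional


def resolve_size(item: dict, read_bytes: Callable[[dict], bytes]) -> int:
    """Return the item's size, reading its bytes only if no hint is set.

    `read_bytes` is an assumed stand-in for a DataAccess read.
    """
    size: Optional[int] = item.get("size")
    if size is not None:
        # Trust presence: a populated value reflects the current bytes.
        return size
    # Fall-back read: hydrate on demand when the producer left it unset.
    return len(read_bytes(item))
```

The hash case follows the same pattern, but the fall-back must use the same hash algorithm as the producers, which this page does not specify.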
ArtifactDescriptor
Descriptor for a published artifact attached to a node. Artifacts are the unified mechanism for both private reuse and cross-plugin composition. A comparator or transformer publishes zero or more artifacts; downstream plugins consume them by format.
| Field | Type | Required | Description |
|---|---|---|---|
| format | ArtifactFormat | yes | |
| handle | string | yes | Opaque handle managed by the SDK's DataAccess implementation. Plugins should not create or interpret this value directly. |
| producer | string | yes | |
| subject | ArtifactSubject | yes | |
ArtifactFormat
Identifies an artifact's data format as a structured tuple of (package, name, version).
- package: the package that owns and defines this format, resolvable through the language's normal package system (e.g. "binoc", "binoc-csv", "acme-parquet").
- name: the format name within that package (e.g. "tabular", "relational-schema").
- version: a single integer. Bump only for breaking schema changes. Adding optional fields to an existing version is fine and does not require a bump (JSON/serde naturally ignore unknown fields and default missing ones).
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | |
| package | string | yes | |
| version | integer (uint32) | yes | |
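Since downstream plugins consume artifacts by format, a lookup on the (package, name, version) tuple is a natural first step. The sketch below works on descriptor dicts as they would appear on the wire during a session; the helper names are hypothetical. It matches the version exactly, on the reasoning that a version bump signals a breaking schema change.

```python
from typing import List, Tuple


def format_key(fmt: dict) -> Tuple[str, str, int]:
    """Collapse an ArtifactFormat dict into a comparable tuple."""
    return (fmt["package"], fmt["name"], fmt["version"])


def find_artifacts(node: dict, package: str, name: str, version: int) -> List[dict]:
    """Return the node's artifact descriptors whose format matches exactly."""
    wanted = (package, name, version)
    # `artifacts` is optional and is stripped from written changesets,
    # so a missing or null value is treated as an empty list.
    return [
        artifact
        for artifact in node.get("artifacts") or []
        if format_key(artifact["format"]) == wanted
    ]
```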
ArtifactSubject
Which side of a comparison an artifact describes.
String enum. One of:
- left
- right
- pair