Examples gallery

These are runnable examples from binoc's test suite. Each example links to its source folder on GitHub, tells you whether it needs any extra setup, gives you the exact command to run, and shows the Markdown changelog binoc is expected to print.

Binoc currently ships 62 shared examples in this gallery.

One-time setup

Clone the repository and materialize the archive-based fixtures once:

git clone https://github.com/harvard-lil/binoc
cd binoc
just materialize

At a glance

Example	What it shows	Example output	Setup
`binary-fallback-diagnostic`	Unknown file type compared by the binary fallback emits a suggestion	data.parquet: Binary content changed; 1 extracted string added, 1 extracted string removed	Default pipeline
`binary-strings-fallback`	Two opaque binary blobs with differing hashes. The change is hash-driven (binoc.content-changed), and an additive extra…	firmware.bin: Binary content changed; 2 extracted strings added, 2 extracted strings removed	Default pipeline
`csv-cell-changes`	Individual cell values changed	data.csv: 2 cells changed	Default pipeline
`csv-column-addition`	New column added	data.csv: Column added: 'email'	Default pipeline
`csv-column-removal`	Column removed	data.csv: Column removed: 'city'	Default pipeline
`csv-column-reorder`	Columns shuffled, content identical	data.csv: Columns reordered	Default pipeline
`csv-distribution-shift`	Numeric column distribution shifts with keyed row matching	data.csv: 4 rows modified by key	Custom config
`csv-keyed-null-duplicate`	Configured CSV row keys surface null and duplicate key diagnostics	data.csv: 14 cells changed	Custom config
`csv-keyed-row-diff`	Configured CSV row keys match reordered rows and report keyed row/cell changes	data.csv: 1 row added; 1 row removed; 1 row modified by key	Custom config
`csv-mid-row-insertion`	A mid-table row insertion compacts while column reorder/addition rules remain independent	data.csv: Column added: 'email'; Columns reordered; 1 row added	Default pipeline
`csv-mixed-changes`	Multiple change types	data.csv: Column added: 'email'; Columns reordered; 1 row added	Default pipeline
`csv-rename-modify`	CSV renamed and modified: detected as a single move by fuzzy correlation	data_v2.csv:	Default pipeline
`csv-row-addition`	New rows appended	data.csv: 2 rows added	Default pipeline
`csv-row-removal`	Rows removed from CSV	data.csv: 2 rows removed	Default pipeline
`csv-stacked-tables`	Detects two logical tables stacked in one messy CSV	data.csv/>table_2: 1 row added	Default pipeline
`csv-to-tsv-reformat`	Table reformatted from CSV to TSV with row edits: detected as one reformatted-and-modified table, not remove + add	data.tsv:	Default pipeline
`csv-verbosity-full`	Markdown full verbosity renders every captured changed-cell example.	data.csv: 5 cells changed	Custom config
`directory-file-copy`	New file with same content as an existing unchanged file detected as a copy	duplicate.txt: Copied from original.txt	Default pipeline
`directory-nested`	Subdirectories with mixed changes	data/records.csv: 1 row added	Default pipeline
`directory-nested-with-tar`	Shows binoc diffing a tar archive and a plain directory that contain overlapping internal paths.	data.tar.gz/>records.csv: 1 cell changed	Default pipeline
`enforcement-actions-merge-years`	Per-year CSVs merged row-wise into one file; detected as a clean partition merge (CFM-72)	actions_2023.csv, actions_2024.csv merged into actions.csv	Default pipeline
`file-correspondence-container`	Config declares a correspondence between renamed zip containers	archive.zip: Moved from data.zip	Custom config
`file-correspondence-scheme`	Config declares that a state CSV moved into a new directory scheme is the same logical file	by-state: Added	Custom config
`file-correspondence-token`	Config declares that year-stamped CSV filenames are the same logical file	running_list_as_of_2023.csv:	Custom config
`folder-move-nested`	Detects a whole-folder rename and rolls many file moves up into one folder-move entry.	documentation: Moved from docs	Default pipeline
`folder-move-partial`	Detects a mostly-moved folder rename and preserves only the added/removed/modified remainder entries beneath it.	FoodData_Central_csv_2026-04-30: Added	Default pipeline
`geojson-feature-cell-change`	A GeoJSON FeatureCollection where one feature's property changes; transcoded to a tabular artifact with the geometry as…	places.geojson: 1 cell changed	Default pipeline
`gzip-inner-dispatch`	Gzipped CSV and text are decompressed and redispatched under their inner names	census.txt.gz/>census.txt: 1 line added; 1 line removed	Default pipeline
`ini-value-change`	An INI value changes; transcoded to a structured_document and reported as a value change	config.ini: Document values changed	Default pipeline
`json-array-order-significant`	JSON array order changes are semantic content changes in stage 1	metadata.json: Document values changed	Default pipeline
`json-key-order-reexport`	JSON object key order and pretty-printing changed without semantic value changes	metadata.json: Document serialization changed	Default pipeline
`json-records-cell-change`	JSON array of like-shaped objects parsed as a typed table; numeric cell values change	data.json: 2 cells changed	Default pipeline
`json-records-nested-value`	JSON records with a nested object cell; the nested value changes and is reported as a single equality-based cell edit (…	people.json: 1 cell changed	Default pipeline
`jsonl-row-addition`	JSONL stream of like-shaped objects parsed as a table; a record is appended	events.jsonl: 1 row added	Default pipeline
`jsonld-value-change`	A .jsonld file with no declared media type parses as a structured document tagged format=jsonld; a value change is repo…	person.jsonld: Document values changed	Default pipeline
`kitchen-sink`	Runs text, CSV, archive, move, and copy detection together in one end-to-end example.	archive.tar.gz/>inventory.csv: 1 row added	Default pipeline
`observations-repartition-equal-arity`	Equal-arity N→M repartition: 2 tables grouped by region become 2 tables grouped by year, every row preserved exactly bu…	observations_2024.csv:	Default pipeline
`observations-split-by-year`	One CSV split row-wise into per-year files; detected as a clean partition split (CFM-72)	observations.csv split into observations_2024.csv, observations_2025.csv	Default pipeline
`observations-split-residual`	A would-be split missing one row: partition declines (not complete), emits binoc.possible_split, and degrades to honest…	observations_2024.csv:	Default pipeline
`single-file-add`	File present in B but not A	new_file.txt: Added	Default pipeline
`single-file-modify-binary`	Binary file, different hash	data.bin: 1 edit	Default pipeline
`single-file-modify-csv`	CSV file compared directly (file-to-file, not via directory)	data.csv: 1 row added	Default pipeline
`single-file-modify-text`	Text file with line-level changes	story.txt: 2 lines added; 1 line removed	Default pipeline
`single-file-modify-text-root`	Text file compared directly (file-to-file, not via directory)	story.txt: 2 lines added; 1 line removed	Default pipeline
`single-file-remove`	File present in A but not B	removed_file.txt: Removed	Default pipeline
`stacked-csv-broken-out`	Stacked-CSV tables broken out into one file per table; whole-table rehoming (reshape + 1:1), NOT a partition split (CFM…	changes.csv: Moved from report.csv/>table_1	Default pipeline
`tar-nested`	Nested tar.gz containing CSV	outer.tar.gz/>inner.tar.gz/>data.csv: 1 row added	Default pipeline
`tar-simple`	Tar.gz archive with changes inside	archive.tar.gz/>data.csv: 1 row added	Default pipeline
`text-rename-modify`	Text file renamed and modified: detected as a single move by fuzzy correlation	meeting-notes-v2.txt:	Default pipeline
`toml-value-change`	A TOML value changes; transcoded to a structured_document and reported as a value change	config.toml: Document values changed	Default pipeline
`tree-wide-correlation`	Shows tree-wide move and copy detection across nested zip boundaries, including one-to-many copies and many-to-one moves.	gamma-renamed.txt: Moved from outer.zip/>inner.zip/>gamma.txt	Default pipeline
`trivial-identical`	Two identical directories → empty changeset	# Changelog: snapshot-a → snapshot-b	Default pipeline
`trivial-identical-csv`	Two identical CSV files → no changes reported	# Changelog: snapshot-a → snapshot-b	Default pipeline
`tsv-cell-changes`	Tab-delimited file parses into real columns and reports cell changes	data.tsv: 2 cells changed	Default pipeline
`yaml-value-change`	A YAML scalar value changes; transcoded to a structured_document and reported as a value change	config.yaml: Document values changed	Default pipeline
`zip-declared-container`	Config declares a correspondence between nested zip containers and preserves inner CSV content detail	outer.zip/>records.zip:	Custom config
`zip-json-key-order-reexport`	JSON files inside zip expansion get parsed and rendered as serialization-only changes	archive.zip/>metadata.json: Document serialization changed	Default pipeline
`zip-nested`	Nested zip containing CSV	outer.zip/>inner.zip/>data.csv: 1 row added	Default pipeline
`zip-rename-contents-rewritten`	Documents a known gap — a renamed zip whose children were all renamed AND rewritten (no content similarity) yields unpa…	data.zip: Removed	Default pipeline
`zip-rename-identical`	Zip archive renamed with identical contents; bottom-up roll-up of the inner clean file moves compacts the pair into a s…	archive.zip: Moved from data.zip	Default pipeline
`zip-rename-inner-rename-edit`	Zip archive renamed while its only child was renamed and had one cell edited; the modified move counts as roll-up evide…	archive.zip: Moved from data.zip	Default pipeline
`zip-simple`	Zipped files with changes inside	archive.zip/>data.txt: 1 line added; 1 line removed	Default pipeline

binary-fallback-diagnostic

Unknown file type compared by the binary fallback emits a suggestion

Browse source: binary-fallback-diagnostic
Tags: modify, binary, diagnostics
Snapshots: snapshot-a has 1 file — data.parquet; snapshot-b has 1 file — data.parquet

Run it:

binoc diff \
  ./test-vectors-materialized/binary-fallback-diagnostic/snapshot-a \
  ./test-vectors-materialized/binary-fallback-diagnostic/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.parquet**: Binary content changed; 1 extracted string added, 1 extracted string removed
  - Extracted strings added
    - 'after!\n'
  - Extracted strings removed
    - 'before\n'

binary-strings-fallback

Two opaque binary blobs with differing hashes. The change is hash-driven (binoc.content-changed), and an additive extra…

Browse source: binary-strings-fallback
Tags: modify, binary, strings
Snapshots: snapshot-a has 1 file — firmware.bin; snapshot-b has 1 file — firmware.bin

Run it:

binoc diff \
  ./test-vectors-materialized/binary-strings-fallback/snapshot-a \
  ./test-vectors-materialized/binary-strings-fallback/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **firmware.bin**: Binary content changed; 2 extracted strings added, 2 extracted strings removed
  - Extracted strings added
    - 'build-beta'
    - 'version=2.0.0'
  - Extracted strings removed
    - 'build-alpha'
    - 'version=1.0.0'
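
The "extracted strings" detail above can be pictured as a `strings`-style pass over both blobs followed by a set difference. The following is a minimal sketch under stated assumptions, not binoc's implementation: the function names, the 4-byte minimum run length, and the sample bytes are all hypothetical.

```python
import re

def extract_strings(blob: bytes, min_len: int = 4) -> set:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {match.decode("ascii") for match in re.findall(pattern, blob)}

def diff_strings(old: bytes, new: bytes):
    """Return (added, removed) extracted strings, each sorted for stable output."""
    before, after = extract_strings(old), extract_strings(new)
    return sorted(after - before), sorted(before - after)

# Made-up blobs that mimic the firmware.bin example.
old_blob = b"\x00\x01version=1.0.0\x00\xffbuild-alpha\x00"
new_blob = b"\x00\x01version=2.0.0\x00\xffbuild-beta\x00"

added, removed = diff_strings(old_blob, new_blob)
print(added)    # ['build-beta', 'version=2.0.0']
print(removed)  # ['build-alpha', 'version=1.0.0']
```

Comparing sets rather than positions keeps the report readable when the surrounding binary layout shifts.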

csv-cell-changes

Individual cell values changed

Browse source: csv-cell-changes
Tags: csv, cell-change
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-cell-changes/snapshot-a \
  ./test-vectors-materialized/csv-cell-changes/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 2 cells changed
  - Changed cells
    - row 1, column 'score': '85' -> '92'
    - row 2, column 'score': '90' -> '88'

csv-column-addition

New column added

Browse source: csv-column-addition
Tags: csv, column-addition, schema
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-addition/snapshot-a \
  ./test-vectors-materialized/csv-column-addition/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: Column added: 'email'
  - Set Headers: from: ["name","age"]; to: ["name","age","email"]
  - Add Column: name: 'email'; values: {"total_values":2,"truncated":false,"values":["alice@test.com","bob@test.com"]}

csv-column-removal

Column removed

Browse source: csv-column-removal
Tags: csv, column-removal, schema
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-removal/snapshot-a \
  ./test-vectors-materialized/csv-column-removal/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: Column removed: 'city'
  - Set Headers: from: ["name","age","city"]; to: ["name","age"]
  - Remove Column: name: 'city'; values: {"total_values":2,"truncated":false,"values":["NYC","LA"]}

csv-column-reorder

Columns shuffled, content identical

Browse source: csv-column-reorder
Tags: csv, column-reorder, clerical
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-reorder/snapshot-a \
  ./test-vectors-materialized/csv-column-reorder/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: Columns reordered
  - Reorder Columns: order: ["city","name","age"]
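
One way to see why this counts as a pure reorder: if both files are read into header-keyed records, the records compare equal and only the header order differs. The sketch below assumes that reading; `detect_reorder` and the sample rows are hypothetical and are not taken from binoc or from the fixture files.

```python
import csv
import io

def detect_reorder(old_csv: str, new_csv: str):
    """Return the new column order if only the column order changed, else None."""
    old_reader = csv.DictReader(io.StringIO(old_csv))
    new_reader = csv.DictReader(io.StringIO(new_csv))
    old_rows = [dict(row) for row in old_reader]
    new_rows = [dict(row) for row in new_reader]
    old_cols = list(old_reader.fieldnames or [])
    new_cols = list(new_reader.fieldnames or [])
    same_columns = sorted(old_cols) == sorted(new_cols)
    # Plain dicts compare equal regardless of key order, so equal rows
    # mean the cell content is unchanged.
    if same_columns and old_cols != new_cols and old_rows == new_rows:
        return new_cols
    return None

# Illustrative data only.
OLD = "name,age,city\nAlice,30,NYC\nBob,25,LA\n"
NEW = "city,name,age\nNYC,Alice,30\nLA,Bob,25\n"

print(detect_reorder(OLD, NEW))  # ['city', 'name', 'age']
```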

csv-distribution-shift

Numeric column distribution shifts with keyed row matching

Browse source: csv-distribution-shift
Tags: csv, statistics, row-identity
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv
Setup: This example uses a custom dataset config that declares id as the row key, so rows are matched by key rather than by position. Save this dataset config as /tmp/csv-distribution-shift.yaml:

dataset:
  tables:
    defaults:
      row_identity:
        columns:
          - id

Run it:

binoc diff \
  ./test-vectors-materialized/csv-distribution-shift/snapshot-a \
  ./test-vectors-materialized/csv-distribution-shift/snapshot-b \
  --config /tmp/csv-distribution-shift.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 4 rows modified by key
  - Changed cells (showing 3 of 5)
    - key id '1', column 'score': '10' -> '12'
    - key id '2', column 'score': '20' -> '35'
    - key id '2', column 'label': 'beta' -> 'beta2'

csv-keyed-null-duplicate

Configured CSV row keys surface null and duplicate key diagnostics

Browse source: csv-keyed-null-duplicate
Tags: csv, keyed, null-key, duplicate-key
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv
Setup: This example uses a custom dataset config that keys data.csv rows on id and reports null or duplicate keys as diagnostics. Save this dataset config as /tmp/csv-keyed-null-duplicate.yaml:

dataset:
  tables:
    defaults:
      row_identity:
        on_null_key: diagnostic
        on_duplicate_key: diagnostic
    entries:
      - path_regex: ^data\.csv$
        columns:
          - id

Run it:

binoc diff \
  ./test-vectors-materialized/csv-keyed-null-duplicate/snapshot-a \
  ./test-vectors-materialized/csv-keyed-null-duplicate/snapshot-b \
  --config /tmp/csv-keyed-null-duplicate.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 14 cells changed
  - Changed cells (showing 3 of 14)
    - row 1, column 'id': 'a' -> 'b'
    - row 1, column 'name': 'Alice' -> 'Bob'
    - row 1, column 'score': '10' -> '21'

## Warnings

- configured row keys had null values; fell back to positional row comparison (`binoc.write.tabular`) [binoc.keyed_row_identity_degraded]

csv-keyed-row-diff

Configured CSV row keys match reordered rows and report keyed row/cell changes

Browse source: csv-keyed-row-diff
Tags: csv, keyed, row-addition, row-removal, cell-change
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv
Setup: This example uses a custom dataset config that keys data.csv rows on id, so reordered rows are matched by key. Save this dataset config as /tmp/csv-keyed-row-diff.yaml:

dataset:
  tables:
    - path_regex: ^data\.csv$
      columns:
        - id

Run it:

binoc diff \
  ./test-vectors-materialized/csv-keyed-row-diff/snapshot-a \
  ./test-vectors-materialized/csv-keyed-row-diff/snapshot-b \
  --config /tmp/csv-keyed-row-diff.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 1 row added; 1 row removed; 1 row modified by key
  - Changed cells
    - key id 'p2', column 'price': '20' -> '25'
  - Rows added
    - key id 'p4': 'p4', 'Delta', '40'
  - Rows removed
    - key id 'p3': 'p3', 'Gamma', '30'
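
The keyed result above can be reproduced conceptually by indexing each snapshot on the key column and comparing the two indexes. This is a minimal sketch assuming unique, non-null keys; `keyed_diff`, the `name` column, and the sample rows are hypothetical and do not reflect how binoc implements row identity.

```python
import csv
import io

def keyed_diff(old_csv: str, new_csv: str, key: str):
    """Return (added keys, removed keys, changed cells) for rows matched by key."""
    old = {row[key]: row for row in csv.DictReader(io.StringIO(old_csv))}
    new = {row[key]: row for row in csv.DictReader(io.StringIO(new_csv))}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = []
    for k in sorted(old.keys() & new.keys()):
        for column, before in old[k].items():
            after = new[k].get(column)
            if before != after:
                changed.append((k, column, before, after))
    return added, removed, changed

# Illustrative data: snapshot B is reordered, drops p3, adds p4, edits p2.
OLD = "id,name,price\np1,Alpha,10\np2,Beta,20\np3,Gamma,30\n"
NEW = "id,name,price\np4,Delta,40\np2,Beta,25\np1,Alpha,10\n"

added, removed, changed = keyed_diff(OLD, NEW, "id")
print(added)    # ['p4']
print(removed)  # ['p3']
print(changed)  # [('p2', 'price', '20', '25')]
```

Because rows are looked up by key, the reordering of p1 and p2 produces no noise of its own.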

csv-mid-row-insertion

A mid-table row insertion compacts while column reorder/addition rules remain independent

Browse source: csv-mid-row-insertion
Tags: csv, row-addition, column-reorder, column-addition, lcs, compaction
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-mid-row-insertion/snapshot-a \
  ./test-vectors-materialized/csv-mid-row-insertion/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: Column added: 'email'; Columns reordered; 1 row added
  - Rows added
    - row 2: 'LA', 'Bob', '25'
  - Reorder Columns: order: ["city","name","age"]
  - Add Column: name: 'email'; values: {"total_values":3,"truncated":false,"values":["alice@example.test","bob@example.test","charlie@example.test"]}
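
To illustrate why a mid-table insertion can be reported as one added row instead of a cascade of changed cells, the sketch below aligns rows as whole units before comparing them. It uses Python's `difflib.SequenceMatcher` as a stand-in alignment (it finds longest matching blocks, which is related to but not the same as a strict LCS), and it is not binoc's algorithm. The function name and sample rows are hypothetical.

```python
from difflib import SequenceMatcher

def row_ops(old_rows, new_rows):
    """Align rows as opaque tuples and list added/removed rows with 1-based positions."""
    matcher = SequenceMatcher(a=old_rows, b=new_rows, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            ops += [("removed", i + 1, old_rows[i]) for i in range(i1, i2)]
        if tag in ("insert", "replace"):
            ops += [("added", j + 1, new_rows[j]) for j in range(j1, j2)]
    return ops

# Illustrative data rows (header excluded); Bob is inserted mid-table.
old_rows = [("NYC", "Alice", "30"), ("SF", "Charlie", "35")]
new_rows = [("NYC", "Alice", "30"), ("LA", "Bob", "25"), ("SF", "Charlie", "35")]

print(row_ops(old_rows, new_rows))  # [('added', 2, ('LA', 'Bob', '25'))]

# A purely positional comparison would instead flag row 2 as fully rewritten.
positional_mismatches = sum(a != b for a, b in zip(old_rows, new_rows))
print(positional_mismatches)  # 1
```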

csv-mixed-changes

Multiple change types

Browse source: csv-mixed-changes
Tags: csv, column-reorder, column-addition, row-addition
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-mixed-changes/snapshot-a \
  ./test-vectors-materialized/csv-mixed-changes/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: Column added: 'email'; Columns reordered; 1 row added
  - Rows added
    - row 3: 'SF', 'Charlie', '35'
  - Reorder Columns: order: ["city","name","age"]
  - Add Column: name: 'email'; values: {"total_values":3,"truncated":false,"values":["a@test.com","b@test.com","c@test.com"]}

csv-rename-modify

CSV renamed and modified: detected as a single move by fuzzy correlation

Browse source: csv-rename-modify
Tags: csv, fuzzy-move, rename-modify
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data_v2.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-rename-modify/snapshot-a \
  ./test-vectors-materialized/csv-rename-modify/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data_v2.csv**:
  - Moved from data.csv
  - Column added: 'email'
  - Set Headers: from: ["name","age","city"]; to: ["name","age","city","email"]
  - Add Column: name: 'email'; values: {"total_values":3,"truncated":false,"values":["alice@test.com","bob@test.com","carol@test.com"]}

csv-row-addition

New rows appended

Browse source: csv-row-addition
Tags: csv, row-addition
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-row-addition/snapshot-a \
  ./test-vectors-materialized/csv-row-addition/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 2 rows added
  - Rows added
    - row 2: 'Bob', '25'
    - row 3: 'Charlie', '35'

csv-row-removal

Rows removed from CSV

Browse source: csv-row-removal
Tags: csv, row-removal
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-row-removal/snapshot-a \
  ./test-vectors-materialized/csv-row-removal/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 2 rows removed
  - Rows removed
    - row 2: 'Bob', '25'
    - row 3: 'Charlie', '35'

csv-stacked-tables

Detects two logical tables stacked in one messy CSV

Browse source: csv-stacked-tables
Tags: csv, stacked-tables, row-addition
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-stacked-tables/snapshot-a \
  ./test-vectors-materialized/csv-stacked-tables/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv/>table_2**: 1 row added
  - Rows added
    - row 12: '761012', 'Mu', 'Mu Pharma'

csv-to-tsv-reformat

Table reformatted from CSV to TSV with row edits: detected as one reformatted-and-modified table, not remove + add

Browse source: csv-to-tsv-reformat
Tags: csv, tsv, reformat, serialization-change, tabular-pair
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.tsv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-to-tsv-reformat/snapshot-a \
  ./test-vectors-materialized/csv-to-tsv-reformat/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.tsv**:
  - Moved from data.csv
  - 1 row added; 1 cell changed
  - Changed cells
    - row 2, column 'age': '25' -> '26'
  - Rows added
    - row 4: 'Dave', '41', 'Austin'

csv-verbosity-full

Markdown full verbosity renders every captured changed-cell example.

Browse source: csv-verbosity-full
Tags: csv, cell-change, verbosity
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv
Setup: This example sets output.markdown.verbosity: full so the changelog prints every captured changed-cell example instead of the default capped sample. Save this config as /tmp/csv-verbosity-full.yaml:

output:
  markdown:
    verbosity: full

Run it:

binoc diff \
  ./test-vectors-materialized/csv-verbosity-full/snapshot-a \
  ./test-vectors-materialized/csv-verbosity-full/snapshot-b \
  --config /tmp/csv-verbosity-full.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 5 cells changed
  - Sources
    - data.csv (from, modify, binoc.pair.name)
  - Changed cells
    - row 1, column 'score': '10' -> '11'
    - row 2, column 'score': '20' -> '21'
    - row 3, column 'score': '30' -> '31'
    - row 4, column 'score': '40' -> '41'
    - row 5, column 'score': '50' -> '51'

directory-file-copy

New file with same content as an existing unchanged file detected as a copy

Browse source: directory-file-copy
Tags: copy, directory, content-hash
Snapshots: snapshot-a has 1 file — original.txt; snapshot-b has 2 files — duplicate.txt, original.txt

Run it:

binoc diff \
  ./test-vectors-materialized/directory-file-copy/snapshot-a \
  ./test-vectors-materialized/directory-file-copy/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **duplicate.txt**: Copied from original.txt
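
The content-hash idea behind this result can be sketched as follows: hash every file that is unchanged between snapshots, then check whether any newly added file has the same digest. The choice of SHA-256, the function names, and the in-memory file maps are assumptions for illustration only; the example's tags mention a content hash but not which one binoc uses.

```python
import hashlib

def digest(data: bytes) -> str:
    """Hex digest used as the content identity (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()

def find_copies(old_files: dict, new_files: dict) -> dict:
    """Map each added path to an unchanged path that has identical content."""
    unchanged_by_hash = {}
    for path, data in old_files.items():
        if new_files.get(path) == data:
            unchanged_by_hash.setdefault(digest(data), path)
    copies = {}
    for path, data in new_files.items():
        if path not in old_files:
            source = unchanged_by_hash.get(digest(data))
            if source is not None:
                copies[path] = source
    return copies

# Illustrative snapshots as path -> bytes maps.
snapshot_a = {"original.txt": b"hello\n"}
snapshot_b = {"original.txt": b"hello\n", "duplicate.txt": b"hello\n"}

print(find_copies(snapshot_a, snapshot_b))  # {'duplicate.txt': 'original.txt'}
```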

directory-nested

Subdirectories with mixed changes

Browse source: directory-nested
Tags: directory, nested, mixed
Snapshots: snapshot-a has 2 files — data/records.csv, docs/readme.txt; snapshot-b has 3 files — data/extra.csv, data/records.csv, docs/readme.txt

Run it:

binoc diff \
  ./test-vectors-materialized/directory-nested/snapshot-a \
  ./test-vectors-materialized/directory-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data/records.csv**: 1 row added
  - Rows added
    - row 3: '3', 'Charlie'
- **data/extra.csv**: Added
- **docs/readme.txt**: 2 lines added; 1 line removed
  - Line changes
    - line 1: 'Version 1 readme' -> 'Version 2 readme'

directory-nested-with-tar

Shows binoc diffing a tar archive and a plain directory that contain overlapping internal paths.

Browse source: directory-nested-with-tar
Tags: directory, tar, overlap, artifact-collision
Snapshots: snapshot-a has 2 files — data.tar.gz.d/records.csv, data/records.csv; snapshot-b has 2 files — data.tar.gz.d/records.csv, data/records.csv

Run it:

binoc diff \
  ./test-vectors-materialized/directory-nested-with-tar/snapshot-a \
  ./test-vectors-materialized/directory-nested-with-tar/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.tar.gz/>records.csv**: 1 cell changed
  - Changed cells
    - row 2, column 'count': '20' -> '25'
- **data/records.csv**: 1 row added
  - Rows added
    - row 3: '3', 'Charlie'

enforcement-actions-merge-years

Per-year CSVs merged row-wise into one file; detected as a clean partition merge (CFM-72)

Browse source: enforcement-actions-merge-years
Tags: csv, partition, merge
Snapshots: snapshot-a has 2 files — actions_2023.csv, actions_2024.csv; snapshot-b has 1 file — actions.csv

Run it:

binoc diff \
  ./test-vectors-materialized/enforcement-actions-merge-years/snapshot-a \
  ./test-vectors-materialized/enforcement-actions-merge-years/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

Claims

- actions_2023.csv, actions_2024.csv merged into actions.csv

- **actions.csv**: Merged from actions_2023.csv, actions_2024.csv

file-correspondence-container

Config declares a correspondence between renamed zip containers

Browse source: file-correspondence-container
Tags: zip, file-correspondence, declared-correspondence, container
Snapshots: snapshot-a has 1 file — data.zip.d/file.csv; snapshot-b has 1 file — archive.zip.d/file.csv
Setup: This example uses a custom dataset config to make the relevant correspondence behavior obvious. Save this dataset config as /tmp/file-correspondence-container.yaml:

dataset:
  files:
    correspondences:
      - name: archive-pair
        key: archive
        left:
          path_regex: ^data\.zip$
        right:
          path_regex: ^archive\.zip$

Run it:

binoc diff \
  ./test-vectors-materialized/file-correspondence-container/snapshot-a \
  ./test-vectors-materialized/file-correspondence-container/snapshot-b \
  --config /tmp/file-correspondence-container.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.zip**: Moved from data.zip

file-correspondence-scheme

Config declares that a state CSV moved into a new directory scheme is the same logical file

Browse source: file-correspondence-scheme
Tags: csv, file-correspondence, scheme-change
Snapshots: snapshot-a has 1 file — data/state_AL.csv; snapshot-b has 1 file — by-state/AL/records.csv
Setup: This example uses a custom dataset config to make the relevant correspondence behavior obvious. Save this dataset config as /tmp/file-correspondence-scheme.yaml:

dataset:
  files:
    correspondences:
      - name: state-records
        key: "${state}"
        logical_path: "states/${state}.csv"
        on_null_key: diagnostic
        on_duplicate_key: diagnostic
        left:
          path_regex: "^data/state_(?P<state>[A-Z]{2})\\.csv$"
        right:
          path_regex: "^by-state/(?P<state>[A-Z]{2})/records\\.csv$"

Run it:

binoc diff \
  ./test-vectors-materialized/file-correspondence-scheme/snapshot-a \
  ./test-vectors-materialized/file-correspondence-scheme/snapshot-b \
  --config /tmp/file-correspondence-scheme.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **by-state**: Added
- **by-state/AL**: Moved from data
- **by-state/AL/records.csv**:
  - Moved from data/state_AL.csv
  - 1 row added
  - Rows added
    - row 2: '2', 'Birmingham'
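
The two `path_regex` patterns in the config use named capture groups, and the `key` and `logical_path` templates refer to those groups. The sketch below replays that mapping with Python's `re` and `string.Template`, which happen to share the `(?P<name>...)` and `${name}` syntax. It only illustrates the idea and makes no claim about binoc's regex engine or template handling; `logical_identity` is a hypothetical helper. Note that the doubled backslashes in the YAML are double-quoted-string escapes, so the pattern the engine sees contains a single `\.`.

```python
import re
from string import Template

# Patterns and templates copied from the config above, with YAML escaping undone.
LEFT = re.compile(r"^data/state_(?P<state>[A-Z]{2})\.csv$")
RIGHT = re.compile(r"^by-state/(?P<state>[A-Z]{2})/records\.csv$")
KEY = Template("${state}")
LOGICAL_PATH = Template("states/${state}.csv")

def logical_identity(pattern, path):
    """Return (key, logical path) for a matching path, or None if it does not match."""
    match = pattern.match(path)
    if match is None:
        return None
    groups = match.groupdict()
    return KEY.substitute(groups), LOGICAL_PATH.substitute(groups)

left = logical_identity(LEFT, "data/state_AL.csv")
right = logical_identity(RIGHT, "by-state/AL/records.csv")

print(left)           # ('AL', 'states/AL.csv')
print(left == right)  # True
```

Both physical paths resolve to the same key and logical path, which is what lets the two files be treated as one logical file that moved.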

file-correspondence-token

Config declares that year-stamped CSV filenames are the same logical file

Browse source: file-correspondence-token
Tags: csv, file-correspondence, declared-correspondence
Snapshots: snapshot-a has 1 file — running_list_as_of_2022.csv; snapshot-b has 1 file — running_list_as_of_2023.csv
Setup: This example uses a custom dataset config to make the relevant correspondence behavior obvious. Save this dataset config as /tmp/file-correspondence-token.yaml:

dataset:
  files:
    correspondences:
      - name: running-list
        key: "${list}"
        logical_path: "${list}.csv"
        on_null_key: diagnostic
        on_duplicate_key: diagnostic
        left:
          path_regex: "^(?P<list>running_list)_as_of_[0-9]{4}\\.csv$"
        right:
          path_regex: "^(?P<list>running_list)_as_of_[0-9]{4}\\.csv$"

Run it:

binoc diff \
  ./test-vectors-materialized/file-correspondence-token/snapshot-a \
  ./test-vectors-materialized/file-correspondence-token/snapshot-b \
  --config /tmp/file-correspondence-token.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **running_list_as_of_2023.csv**:
  - Moved from running_list_as_of_2022.csv
  - 1 row added
  - Rows added
    - row 3: '3', 'Cy'

folder-move-nested

Detects a whole-folder rename and rolls many file moves up into one folder-move entry.

Browse source: folder-move-nested
Tags: folder-move, rollup, nested, directory
Snapshots: snapshot-a has 4 files — docs/readme.txt, docs/reports/annual.txt, docs/reports/quarterly/q1.txt, docs/reports/quarterly/q2.txt; snapshot-b has 4 files — documentation/readme.txt, documentation/reports/annual.txt, documentation/reports/quarterly/q1.txt, documentation/reports/quarterly/q2.txt

Run it:

binoc diff \
  ./test-vectors-materialized/folder-move-nested/snapshot-a \
  ./test-vectors-materialized/folder-move-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **documentation**: Moved from docs

folder-move-partial

Detects a mostly-moved folder rename and preserves only the added/removed/modified remainder entries beneath it.

Browse source: folder-move-partial
Tags: folder-move, partial, rollup, directory
Snapshots: snapshot-a has 10 files — FoodData_Central_csv_2025-12-18/README.txt, FoodData_Central_csv_2025-12-18/data/categories.csv, FoodData_Central_csv_2025-12-18/data/food.csv, FoodData_Central_csv_2025-12-18/data/nutrients.csv, +6 more; snapshot-b has 10 files — FoodData_Central_csv_2026-04-30/README.txt, FoodData_Central_csv_2026-04-30/data/categories.csv, FoodData_Central_csv_2026-04-30/data/food.csv, FoodData_Central_csv_2026-04-30/data/new-table.csv, +6 more

Run it:

binoc diff \
  ./test-vectors-materialized/folder-move-partial/snapshot-a \
  ./test-vectors-materialized/folder-move-partial/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **FoodData_Central_csv_2026-04-30**: Added
- **FoodData_Central_csv_2026-04-30/README.txt**: Moved from FoodData_Central_csv_2025-12-18/README.txt
- **FoodData_Central_csv_2026-04-30/data**: Moved from FoodData_Central_csv_2025-12-18/data
- **FoodData_Central_csv_2026-04-30/data/new-table.csv**: Added
- **FoodData_Central_csv_2026-04-30/docs**: Added
- **FoodData_Central_csv_2026-04-30/docs/changelog-note.txt**: Moved from FoodData_Central_csv_2025-12-18/docs/changelog-note.txt
- **FoodData_Central_csv_2026-04-30/docs/license.txt**: Moved from FoodData_Central_csv_2025-12-18/docs/license.txt
- **FoodData_Central_csv_2026-04-30/docs/schema.txt**: Moved from FoodData_Central_csv_2025-12-18/docs/schema.txt
- **FoodData_Central_csv_2026-04-30/docs/modified.txt**: Added
- **FoodData_Central_csv_2025-12-18**: Removed
- **FoodData_Central_csv_2025-12-18/docs**: Removed
- **FoodData_Central_csv_2025-12-18/docs/modified.txt**: Removed
- **FoodData_Central_csv_2025-12-18/docs/old-table.txt**: Removed

geojson-feature-cell-change

A GeoJSON FeatureCollection where one feature's property changes; transcoded to a tabular artifact with the geometry as…

Browse source: geojson-feature-cell-change
Tags: geojson, tabular, nested, cell-change
Snapshots: snapshot-a has 1 file — places.geojson; snapshot-b has 1 file — places.geojson

Run it:

binoc diff \
  ./test-vectors-materialized/geojson-feature-cell-change/snapshot-a \
  ./test-vectors-materialized/geojson-feature-cell-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **places.geojson**: 1 cell changed
  - Changed cells
    - row 1, column 'properties': {"name":"Boston","population":650000} -> {"name":"Boston","population":675000}

gzip-inner-dispatch

Gzipped CSV and text are decompressed and redispatched under their inner names

Browse source: gzip-inner-dispatch
Tags: gzip, csv, text, cell-change, row-addition, line-change
Snapshots: snapshot-a has 2 files — census.txt.gz.d/census.txt, data.csv.gz.d/data.csv; snapshot-b has 2 files — census.txt.gz.d/census.txt, data.csv.gz.d/data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/gzip-inner-dispatch/snapshot-a \
  ./test-vectors-materialized/gzip-inner-dispatch/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **census.txt.gz/>census.txt**: 1 line added; 1 line removed
  - Line changes
    - line 2: '1|Aroostook|120' -> '1|Aroostook|121'
- **data.csv.gz/>data.csv**: 1 row added; 1 cell changed
  - Changed cells
    - row 2, column 'name': 'Bob' -> 'Robert'
  - Rows added
    - row 3: '3', 'Carla'
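
The decompress-and-re-dispatch idea can be sketched in Python as follows. This is an illustration only, not binoc's implementation: `inner_name`, `dispatch`, and the `HANDLERS` table are hypothetical names, and the suffix-to-handler mapping is an assumption.

```python
import gzip
from pathlib import PurePosixPath

# Hypothetical handler table: which comparer a file name would be routed to.
HANDLERS = {".csv": "csv", ".txt": "text"}


def inner_name(path: str) -> str:
    """Strip a trailing .gz to recover the name of the compressed payload."""
    p = PurePosixPath(path)
    return p.stem if p.suffix == ".gz" else p.name


def dispatch(path: str, raw: bytes) -> tuple[str, str, bytes]:
    """Return (logical path, handler, payload), decompressing gzip first."""
    if path.endswith(".gz"):
        inner = inner_name(path)
        # Route the decompressed bytes again, this time under the inner name.
        inner_logical, handler, payload = dispatch(inner, gzip.decompress(raw))
        return f"{path}/>{inner_logical}", handler, payload
    handler = HANDLERS.get(PurePosixPath(path).suffix, "binary")
    return path, handler, raw


blob = gzip.compress(b"id,name\n1,Alice\n")
logical, handler, payload = dispatch("data.csv.gz", blob)
print(logical, handler)
```

The point of the sketch is that the handler is chosen from `data.csv`, not from `data.csv.gz`, which is why the result above reports rows and cells rather than a binary change.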

ini-value-change¶

An INI value changes; transcoded to a structured_document and reported as a value change

Browse source: ini-value-change
Tags: ini, structured-document, value-change
Snapshots: snapshot-a has 1 file — config.ini; snapshot-b has 1 file — config.ini

Run it:

binoc diff \
  ./test-vectors-materialized/ini-value-change/snapshot-a \
  ./test-vectors-materialized/ini-value-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **config.ini**: Document values changed
  - Value Change: changes: [{"from":"\"3\"","kind":"replace","path":"$.replicas","to":"\"5\""}]; examples_truncated: false

json-array-order-significant¶

JSON array order changes are semantic content changes in stage 1

Browse source: json-array-order-significant
Tags: json, array-order, content-change
Snapshots: snapshot-a has 1 file — metadata.json; snapshot-b has 1 file — metadata.json

Run it:

binoc diff \
  ./test-vectors-materialized/json-array-order-significant/snapshot-a \
  ./test-vectors-materialized/json-array-order-significant/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **metadata.json**: Document values changed
  - Value Change: changes: [{"from":"2","kind":"replace","path":"$.ids[1]","to":"3"},{"from":"3","kind":"replace","path":"$.ids[2]","to":"2"}]; examples_truncated: false
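
A minimal sketch of a position-sensitive comparison, assuming lists are compared index by index. The function `diff_values` and the input documents are hypothetical; the inputs are chosen only to mirror the result above, and the sketch ignores added or removed keys.

```python
import json


def diff_values(left, right, path="$"):
    """Yield replace entries for differing leaves, comparing lists by position."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Only keys present on both sides are compared in this sketch.
        for key in sorted(left.keys() & right.keys()):
            yield from diff_values(left[key], right[key], f"{path}.{key}")
    elif isinstance(left, list) and isinstance(right, list) and len(left) == len(right):
        for i, (a, b) in enumerate(zip(left, right)):
            yield from diff_values(a, b, f"{path}[{i}]")
    elif left != right:
        yield {
            "from": json.dumps(left),
            "kind": "replace",
            "path": path,
            "to": json.dumps(right),
        }


# Hypothetical inputs: the same ids, with the last two swapped.
a = json.loads('{"ids": [1, 2, 3]}')
b = json.loads('{"ids": [1, 3, 2]}')
changes = list(diff_values(a, b))
print(json.dumps(changes, separators=(",", ":")))
```

Because the arrays are walked by index, swapping two elements shows up as two replacements rather than as a reorder.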

json-key-order-reexport¶

JSON object key order and pretty-printing changed without semantic value changes

Browse source: json-key-order-reexport
Tags: json, serialization, key-order
Snapshots: snapshot-a has 1 file — metadata.json; snapshot-b has 1 file — metadata.json

Run it:

binoc diff \
  ./test-vectors-materialized/json-key-order-reexport/snapshot-a \
  ./test-vectors-materialized/json-key-order-reexport/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **metadata.json**: Document serialization changed
  - Serialization Change: kinds: ["object_key_order","formatting"]; left: {"byte_len":70,"line_ending":"lf","object_key_orders":[{"keys":["id","name"],"path":"$.fields"},{"keys":["name","version","fields"],"path":"$"}],"trailing_newli...; right: {"byte_len":98,"indentation":"2 spaces","line_ending":"lf","object_key_orders":[{"keys":["name","id"],"path":"$.fields"},{"keys":["fields","version","name"],"pa...
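
The distinction between a value change and a serialization change can be illustrated with Python's `json` module. The documents below are hypothetical and `key_orders` is a helper written for this sketch; it is not part of binoc.

```python
import json


def key_orders(text: str) -> list[list[str]]:
    """Collect the key order of every object, innermost objects first."""
    seen = []

    def hook(pairs):
        seen.append([key for key, _ in pairs])
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return seen


# Hypothetical documents: same values, different key order and indentation.
left = '{"name":"demo","version":1,"fields":{"id":1,"name":"x"}}'
right = (
    '{\n  "fields": {\n    "name": "x",\n    "id": 1\n  },\n'
    '  "version": 1,\n  "name": "demo"\n}\n'
)

same_values = json.loads(left) == json.loads(right)
same_key_order = key_orders(left) == key_orders(right)
print(same_values, same_key_order)
```

Parsed objects compare equal regardless of key order, so the only differences left to report are the key order and the formatting of the bytes on disk.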

json-records-cell-change¶

JSON array of like-shaped objects parsed as a typed table; numeric cell values change

Browse source: json-records-cell-change
Tags: json, records, tabular, cell-change
Snapshots: snapshot-a has 1 file — data.json; snapshot-b has 1 file — data.json

Run it:

binoc diff \
  ./test-vectors-materialized/json-records-cell-change/snapshot-a \
  ./test-vectors-materialized/json-records-cell-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.json**: 2 cells changed
  - Changed cells
    - row 1, column 'score': 85 -> 92
    - row 2, column 'score': 90 -> 88

json-records-nested-value¶

JSON records with a nested object cell; the nested value changes and is reported as a single equality-based cell edit (…

Browse source: json-records-nested-value
Tags: json, records, tabular, nested, cell-change
Snapshots: snapshot-a has 1 file — people.json; snapshot-b has 1 file — people.json

Run it:

binoc diff \
  ./test-vectors-materialized/json-records-nested-value/snapshot-a \
  ./test-vectors-materialized/json-records-nested-value/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **people.json**: 1 cell changed
  - Changed cells
    - row 2, column 'meta': {"role":"user","tags":["z"]} -> {"role":"editor","tags":["z"]}
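
A sketch of equality-based cell comparison, assuming rows are matched by position and numbered from 1. The records and the function `changed_cells` are hypothetical and only mirror the shape of the result above.

```python
def changed_cells(left_rows, right_rows):
    """Compare like-shaped records cell by cell using plain equality."""
    edits = []
    for row_no, (a, b) in enumerate(zip(left_rows, right_rows), start=1):
        for column in a:
            if a[column] != b.get(column):
                # A nested object is one cell: any inner difference is one edit.
                edits.append((row_no, column, a[column], b.get(column)))
    return edits


# Hypothetical records; only the second row's nested 'meta' cell differs.
left = [
    {"name": "ann", "meta": {"role": "admin", "tags": ["x"]}},
    {"name": "bob", "meta": {"role": "user", "tags": ["z"]}},
]
right = [
    {"name": "ann", "meta": {"role": "admin", "tags": ["x"]}},
    {"name": "bob", "meta": {"role": "editor", "tags": ["z"]}},
]
edits = changed_cells(left, right)
print(edits)
```

The nested object is never descended into, so a change to `role` inside `meta` is counted as one changed cell showing the whole old and new object.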

jsonl-row-addition¶

JSONL stream of like-shaped objects parsed as a table; a record is appended

Browse source: jsonl-row-addition
Tags: jsonl, records, tabular, row-addition
Snapshots: snapshot-a has 1 file — events.jsonl; snapshot-b has 1 file — events.jsonl

Run it:

binoc diff \
  ./test-vectors-materialized/jsonl-row-addition/snapshot-a \
  ./test-vectors-materialized/jsonl-row-addition/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **events.jsonl**: 1 row added
  - Rows added
    - row 3: 'carol', 'delete', 3
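
Reading a JSONL stream as a table can be sketched as one JSON object per line, one row per object. The column names and the first two records below are hypothetical (the result above shows only the values of the appended row), and `read_jsonl` is a helper written for this sketch.

```python
import json


def read_jsonl(text: str) -> list[dict]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical stream; the new file appends one record to the old one.
old = (
    '{"user":"alice","action":"create","seq":1}\n'
    '{"user":"bob","action":"update","seq":2}\n'
)
new = old + '{"user":"carol","action":"delete","seq":3}\n'

old_rows, new_rows = read_jsonl(old), read_jsonl(new)
# If the old rows are an unchanged prefix, everything after them was appended.
if new_rows[: len(old_rows)] == old_rows:
    added = new_rows[len(old_rows):]
else:
    added = None
print(added)
```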

jsonld-value-change¶

A .jsonld file with no declared media type parses as a structured document tagged format=jsonld; a value change is repo…

Browse source: jsonld-value-change
Tags: json, jsonld, structured-document
Snapshots: snapshot-a has 1 file — person.jsonld; snapshot-b has 1 file — person.jsonld

Run it:

binoc diff \
  ./test-vectors-materialized/jsonld-value-change/snapshot-a \
  ./test-vectors-materialized/jsonld-value-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **person.jsonld**: Document values changed
  - Value Change: changes: [{"from":"\"Mathematician\"","kind":"replace","path":"$.jobTitle","to":"\"Computer Scientist\""}]; examples_truncated: false

kitchen-sink¶

Runs text, CSV, archive, move, and copy detection together in one end-to-end example.

Browse source: kitchen-sink
Tags: csv, text, binary, tar, zip, directory, move, copy, column-reorder, integration
Snapshots: snapshot-a has 9 files — archive.tar.gz.d/inventory.csv, bundle.zip.d/notes.txt, data.csv, docs/old-notes.txt, +5 more; snapshot-b has 10 files — archive.tar.gz.d/inventory.csv, bundle.zip.d/notes.txt, data.csv, docs/new-file.txt, +6 more

Run it:

binoc diff \
  ./test-vectors-materialized/kitchen-sink/snapshot-a \
  ./test-vectors-materialized/kitchen-sink/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.tar.gz/>inventory.csv**: 1 row added
  - Rows added
    - row 3: 'sprockets', '20'
- **bundle.zip/>notes.txt**: 2 lines added; 1 line removed
  - Line changes
    - line 1: 'Version 1 notes.' -> 'Version 2 notes.'
- **data.csv**: 2 cells changed
  - Changed cells
    - row 1, column 'age': '30' -> '31'
    - row 3, column 'city': 'Seattle' -> 'Portland'
- **docs/readme.txt**: 2 lines added; 2 lines removed
  - Line changes
    - line 2: 'This is the original readme.' -> 'This is the updated readme.'
    - line 4: 'Some will change.' -> 'New content added here.'
- **docs/old-notes.txt**: Removed
- **docs/new-file.txt**: Added
- **icon.bin**: Binary content changed; 1 extracted string added, 1 extracted string removed
  - Extracted strings added
    - '\nFAKEICONv2'
  - Extracted strings removed
    - '\nFAKEICONv1'
- **license-copy.txt**: Copied from license.txt
- **metrics.csv**: Columns reordered
  - Reorder Columns: order: ["category","year","value"]
- **summary.txt**: Moved from report.txt
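
The difference between the "Moved from" and "Copied from" entries above can be sketched with exact content hashes. This is a simplified illustration, not binoc's correlation logic: `correlate` is a hypothetical function, the file contents are invented, and only the paths come from the example.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def correlate(a: dict[str, bytes], b: dict[str, bytes]) -> dict[str, str]:
    """Label paths that exist only in b as moved, copied, or added."""
    by_hash: dict[str, str] = {}
    for path, data in sorted(a.items()):
        by_hash.setdefault(digest(data), path)
    labels = {}
    for path, data in sorted(b.items()):
        if path in a:
            continue  # same path on both sides: not a move or a copy
        source = by_hash.get(digest(data))
        if source is None:
            labels[path] = "Added"
        elif source in b:
            labels[path] = f"Copied from {source}"  # source still present
        else:
            labels[path] = f"Moved from {source}"  # source is gone
    return labels


# Hypothetical contents for three of the kitchen-sink paths.
a = {"license.txt": b"MIT\n", "report.txt": b"Q1 summary\n"}
b = {
    "license.txt": b"MIT\n",
    "license-copy.txt": b"MIT\n",
    "summary.txt": b"Q1 summary\n",
}
labels = correlate(a, b)
print(labels)
```

Exact hashing only finds unmodified moves and copies; a file that is renamed and edited needs a similarity measure instead.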

observations-repartition-equal-arity¶

Equal-arity N→M repartition: 2 tables grouped by region become 2 tables grouped by year, every row preserved exactly bu…

Browse source: observations-repartition-equal-arity
Tags: csv, partition, possible-split, equal-arity
Snapshots: snapshot-a has 2 files — observations_north.csv, observations_south.csv; snapshot-b has 2 files — observations_2024.csv, observations_2025.csv

Run it:

binoc diff \
  ./test-vectors-materialized/observations-repartition-equal-arity/snapshot-a \
  ./test-vectors-materialized/observations-repartition-equal-arity/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **observations_2024.csv**:
  - Moved from observations_north.csv
  - 3 cells changed
  - Changed cells
    - row 2, column 'year': '2025' -> '2024'
    - row 2, column 'region': 'north' -> 'south'
    - row 2, column 'count': '15' -> '12'
- **observations_2025.csv**:
  - Moved from observations_south.csv
  - 3 cells changed
  - Changed cells
    - row 1, column 'year': '2024' -> '2025'
    - row 1, column 'region': 'south' -> 'north'
    - row 1, column 'count': '12' -> '15'

## Suggestions

- 'observations_north.csv' shares rows with other unmatched tables but the relationship is not a clean partition (residual, shared, or extra rows); left as add/remove (`binoc.pair.partition`) [binoc.possible_split]

observations-split-by-year¶

One CSV split row-wise into per-year files; detected as a clean partition split (CFM-72)

Browse source: observations-split-by-year
Tags: csv, partition, split
Snapshots: snapshot-a has 1 file — observations.csv; snapshot-b has 2 files — observations_2024.csv, observations_2025.csv

Run it:

binoc diff \
  ./test-vectors-materialized/observations-split-by-year/snapshot-a \
  ./test-vectors-materialized/observations-split-by-year/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

Claims

- observations.csv split into observations_2024.csv, observations_2025.csv

- **observations_2024.csv**: Split from observations.csv
- **observations_2025.csv**: Split from observations.csv
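
One way to state the "clean partition" condition is that the output files, taken together, contain exactly the rows of the source file. The sketch below checks that with multisets; `is_clean_partition` is a hypothetical name, the rows are invented to resemble the example, and binoc's own rule may be stricter or looser.

```python
from collections import Counter

Row = tuple[str, ...]


def is_clean_partition(source: list[Row], parts: list[list[Row]]) -> bool:
    """True when the non-empty parts hold exactly the source rows, no more, no fewer."""
    combined = Counter()
    for rows in parts:
        combined.update(rows)
    return all(parts) and combined == Counter(source)


# Hypothetical rows: (year, region, count).
source = [
    ("2024", "north", "10"),
    ("2024", "south", "12"),
    ("2025", "north", "15"),
    ("2025", "south", "9"),
]
clean = is_clean_partition(source, [source[:2], source[2:]])
residual = is_clean_partition(source, [source[:2], source[2:3]])  # one row missing
print(clean, residual)
```

Under this definition a split that drops or duplicates even one row is not a partition, which is the situation the residual example in this gallery documents.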

observations-split-residual¶

A would-be split missing one row: partition declines (not complete), emits binoc.possible_split, and degrades to honest…

Browse source: observations-split-residual
Tags: csv, partition, possible-split
Snapshots: snapshot-a has 1 file — observations.csv; snapshot-b has 2 files — observations_2024.csv, observations_2025.csv

Run it:

binoc diff \
  ./test-vectors-materialized/observations-split-residual/snapshot-a \
  ./test-vectors-materialized/observations-split-residual/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **observations_2024.csv**:
  - Moved from observations.csv
  - 2 rows removed
  - Rows removed
    - row 3: '2025', 'north', '15'
    - row 4: '2025', 'south', '9'
- **observations_2025.csv**: Added

## Suggestions

- 'observations.csv' shares rows with other unmatched tables but the relationship is not a clean partition (residual, shared, or extra rows); left as add/remove (`binoc.pair.partition`) [binoc.possible_split]

single-file-add¶

File present in B but not A

Browse source: single-file-add
Tags: add, file
Snapshots: snapshot-a has 0 files (empty snapshot); snapshot-b has 1 file — new_file.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-add/snapshot-a \
  ./test-vectors-materialized/single-file-add/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **new_file.txt**: Added

single-file-modify-binary¶

Binary file, different hash

Browse source: single-file-modify-binary
Tags: modify, binary
Snapshots: snapshot-a has 1 file — data.bin; snapshot-b has 1 file — data.bin

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-binary/snapshot-a \
  ./test-vectors-materialized/single-file-modify-binary/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.bin**: 1 edit

single-file-modify-csv¶

CSV file compared directly (file-to-file, not via directory)

Browse source: single-file-modify-csv
Tags: csv, single-file, modify
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-csv/snapshot-a/data.csv \
  ./test-vectors-materialized/single-file-modify-csv/snapshot-b/data.csv

Result:

# Changelog: snapshot-a → snapshot-b

- **data.csv**: 1 row added
  - Rows added
    - row 3: 'Charlie', '35'

single-file-modify-text¶

Text file with line-level changes

Browse source: single-file-modify-text
Tags: modify, text, lines
Snapshots: snapshot-a has 1 file — story.txt; snapshot-b has 1 file — story.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-text/snapshot-a \
  ./test-vectors-materialized/single-file-modify-text/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **story.txt**: 2 lines added; 1 line removed
  - Line changes
    - line 2: 'Line 2' -> 'Line 2 revised'

single-file-modify-text-root¶

Text file compared directly (file-to-file, not via directory)

Browse source: single-file-modify-text-root
Tags: text, single-file, modify
Snapshots: snapshot-a has 1 file — story.txt; snapshot-b has 1 file — story.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-text-root/snapshot-a/story.txt \
  ./test-vectors-materialized/single-file-modify-text-root/snapshot-b/story.txt

Result:

# Changelog: snapshot-a → snapshot-b

- **story.txt**: 2 lines added; 1 line removed
  - Line changes
    - line 2: 'Line 2' -> 'Line 2 revised'

single-file-remove¶

File present in A but not B

Browse source: single-file-remove
Tags: remove, file
Snapshots: snapshot-a has 1 file — removed_file.txt; snapshot-b has 0 files (empty snapshot)

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-remove/snapshot-a \
  ./test-vectors-materialized/single-file-remove/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **removed_file.txt**: Removed

stacked-csv-broken-out¶

Stacked-CSV tables broken out into one file per table; whole-table rehoming (reshape + 1:1), NOT a partition split (CFM…

Browse source: stacked-csv-broken-out
Tags: csv, stacked-tables, reshape
Snapshots: snapshot-a has 1 file — report.csv; snapshot-b has 2 files — changes.csv, products.csv

Run it:

binoc diff \
  ./test-vectors-materialized/stacked-csv-broken-out/snapshot-a \
  ./test-vectors-materialized/stacked-csv-broken-out/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **changes.csv**: Moved from report.csv/>table_1
- **products.csv**: Reshaped from report.csv (stacked tables → tabular)
- **report.csv/>table_2**: Removed

tar-nested¶

Nested tar.gz containing CSV

Browse source: tar-nested
Tags: tar, nested, csv
Snapshots: snapshot-a has 1 file — outer.tar.gz.d/inner.tar.gz.d/data.csv; snapshot-b has 1 file — outer.tar.gz.d/inner.tar.gz.d/data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/tar-nested/snapshot-a \
  ./test-vectors-materialized/tar-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **outer.tar.gz/>inner.tar.gz/>data.csv**: 1 row added
  - Rows added
    - row 2: 'Bob', '25'
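
The Snapshots line lists materialized paths such as `outer.tar.gz.d/inner.tar.gz.d/data.csv`, while the changelog prints `outer.tar.gz/>inner.tar.gz/>data.csv`. The mapping between the two notations can be sketched as below, assuming every directory whose name ends in `.d` stands for an expanded archive. `logical_path` is a hypothetical helper, not a binoc function.

```python
def logical_path(materialized: str) -> str:
    """Rewrite '<archive>.d/' directory segments as '<archive>/>' boundaries."""
    parts = materialized.split("/")
    out = ""
    for i, part in enumerate(parts):
        is_last = i == len(parts) - 1
        if part.endswith(".d") and not is_last:
            out += part[:-2] + "/>"  # archive boundary
        else:
            out += part + ("" if is_last else "/")  # ordinary directory or file
    return out


nested = logical_path("outer.tar.gz.d/inner.tar.gz.d/data.csv")
plain = logical_path("docs/readme.txt")
print(nested, plain)
```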

tar-simple¶

Tar.gz archive with changes inside

Browse source: tar-simple
Tags: tar, archive
Snapshots: snapshot-a has 2 files — archive.tar.gz.d/data.csv, archive.tar.gz.d/hello.txt; snapshot-b has 2 files — archive.tar.gz.d/data.csv, archive.tar.gz.d/hello.txt

Run it:

binoc diff \
  ./test-vectors-materialized/tar-simple/snapshot-a \
  ./test-vectors-materialized/tar-simple/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.tar.gz/>data.csv**: 1 row added
  - Rows added
    - row 3: 'gamma', '3'
- **archive.tar.gz/>hello.txt**: 1 line added

text-rename-modify¶

Text file renamed and modified: detected as a single move by fuzzy correlation

Browse source: text-rename-modify
Tags: text, fuzzy-move, rename-modify
Snapshots: snapshot-a has 1 file — notes.txt; snapshot-b has 1 file — meeting-notes-v2.txt

Run it:

binoc diff \
  ./test-vectors-materialized/text-rename-modify/snapshot-a \
  ./test-vectors-materialized/text-rename-modify/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **meeting-notes-v2.txt**:
  - Moved from notes.txt
  - 2 lines added
  - Line changes (showing 3 of 4)
    - line 10: '' -> '- Marketing strategy update'
    - line 11: 'Action Items:' -> ''
    - line 12: '- Alice to finalize budget by Friday' -> 'Action Items:'
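
Fuzzy correlation can be illustrated with a line-based similarity ratio from Python's `difflib`. This is a stand-in technique: the source does not say which measure binoc uses, the `THRESHOLD` value is a hypothetical cutoff, and the texts are invented.

```python
import difflib


def similarity(left: str, right: str) -> float:
    """Ratio in [0, 1] of matching lines between two texts."""
    matcher = difflib.SequenceMatcher(None, left.splitlines(), right.splitlines())
    return matcher.ratio()


THRESHOLD = 0.6  # hypothetical cutoff, not a binoc setting

old = (
    "Agenda:\n- Budget review\n- Hiring plan\n\n"
    "Action Items:\n- Alice to finalize budget by Friday\n"
)
new = (
    "Agenda:\n- Budget review\n- Hiring plan\n- Marketing strategy update\n\n"
    "Action Items:\n- Alice to finalize budget by Friday\n"
)
unrelated = "id,name\n1,Alice\n"

is_move = similarity(old, new) >= THRESHOLD
is_unrelated_move = similarity(old, unrelated) >= THRESHOLD
print(is_move, is_unrelated_move)
```

A renamed file that shares most of its lines with a removed file clears the cutoff and can be paired as a single move with edits, while an unrelated file does not.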

toml-value-change¶

A TOML value changes; transcoded to a structured_document and reported as a value change

Browse source: toml-value-change
Tags: toml, structured-document, value-change
Snapshots: snapshot-a has 1 file — config.toml; snapshot-b has 1 file — config.toml

Run it:

binoc diff \
  ./test-vectors-materialized/toml-value-change/snapshot-a \
  ./test-vectors-materialized/toml-value-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **config.toml**: Document values changed
  - Value Change: changes: [{"from":"3","kind":"replace","path":"$.replicas","to":"5"}]; examples_truncated: false

tree-wide-correlation¶

Shows tree-wide move and copy detection across nested zip boundaries, including one-to-many copies and many-to-one moves.

Browse source: tree-wide-correlation
Tags: move, copy, aggregation, zip, nested, archive, tree-wide
Snapshots: snapshot-a has 6 files — alpha.txt, dup.bin, kept.txt, outer.zip.d/beta.txt, +2 more; snapshot-b has 7 files — gamma-renamed.txt, kept-copy.txt, kept.txt, merged.bin, +3 more

Run it:

binoc diff \
  ./test-vectors-materialized/tree-wide-correlation/snapshot-a \
  ./test-vectors-materialized/tree-wide-correlation/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **gamma-renamed.txt**: Moved from outer.zip/>inner.zip/>gamma.txt
- **kept-copy.txt**: Copied from kept.txt
- **merged.bin**: Moved from dup.bin
- **outer.zip/>alpha-renamed.txt**: Moved from alpha.txt
- **outer.zip/>inner.zip/>beta-renamed.txt**: Moved from outer.zip/>beta.txt
- **outer.zip/>kept-copy.txt**: Copied from kept.txt
- **outer.zip/>dup-b.bin**: Removed

trivial-identical¶

Two identical directories → empty changeset

Browse source: trivial-identical
Tags: identical, baseline
Snapshots: snapshot-a has 1 file — data.txt; snapshot-b has 1 file — data.txt

Run it:

binoc diff \
  ./test-vectors-materialized/trivial-identical/snapshot-a \
  ./test-vectors-materialized/trivial-identical/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

trivial-identical-csv¶

Two identical CSV files → no changes reported

Browse source: trivial-identical-csv
Tags: csv, identical, baseline
Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/trivial-identical-csv/snapshot-a \
  ./test-vectors-materialized/trivial-identical-csv/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

tsv-cell-changes¶

Tab-delimited file parses into real columns and reports cell changes

Browse source: tsv-cell-changes
Tags: tsv, cell-change
Snapshots: snapshot-a has 1 file — data.tsv; snapshot-b has 1 file — data.tsv

Run it:

binoc diff \
  ./test-vectors-materialized/tsv-cell-changes/snapshot-a \
  ./test-vectors-materialized/tsv-cell-changes/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.tsv**: 2 cells changed
  - Changed cells
    - row 1, column 'age': '30' -> '31'
    - row 2, column 'city': 'Boston' -> 'Cambridge'
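
Why the delimiter matters can be shown with Python's `csv` module: the same text yields one column when read as comma-separated and three when read as tab-separated. The data below is hypothetical.

```python
import csv
import io

text = "name\tage\tcity\nAlice\t30\tBoston\n"

as_csv = list(csv.reader(io.StringIO(text)))                  # comma delimiter
as_tsv = list(csv.reader(io.StringIO(text), delimiter="\t"))  # tab delimiter

print(len(as_csv[0]), len(as_tsv[0]))
```

Only the tab-delimited reading produces named columns such as 'age' and 'city', which is what makes per-cell reporting possible.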

yaml-value-change¶

A YAML scalar value changes; transcoded to a structured_document and reported as a value change

Browse source: yaml-value-change
Tags: yaml, structured-document, value-change
Snapshots: snapshot-a has 1 file — config.yaml; snapshot-b has 1 file — config.yaml

Run it:

binoc diff \
  ./test-vectors-materialized/yaml-value-change/snapshot-a \
  ./test-vectors-materialized/yaml-value-change/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **config.yaml**: Document values changed
  - Value Change: changes: [{"from":"3","kind":"replace","path":"$.replicas","to":"5"}]; examples_truncated: false

zip-declared-container¶

Config declares a correspondence between nested zip containers and preserves inner CSV content detail

Browse source: zip-declared-container
Tags: zip, file-correspondence, declared-correspondence, container
Snapshots: snapshot-a has 1 file — outer.zip.d/records-old.zip.d/data.csv; snapshot-b has 1 file — outer.zip.d/records.zip.d/data.csv
Setup: This example uses a custom dataset config to make the relevant correspondence behavior obvious. Save this dataset config as /tmp/zip-declared-container.yaml:

dataset:
  files:
    correspondences:
      - name: inner-archive-pair
        key: records
        logical_path: outer.zip/>records.zip
        on_null_key: diagnostic
        on_duplicate_key: diagnostic
        left:
          path_regex: ^outer\.zip/>records-old\.zip$
        right:
          path_regex: ^outer\.zip/>records\.zip$

Run it:

binoc diff \
  ./test-vectors-materialized/zip-declared-container/snapshot-a \
  ./test-vectors-materialized/zip-declared-container/snapshot-b \
  --config /tmp/zip-declared-container.yaml

Result:

# Changelog: snapshot-a → snapshot-b

- **outer.zip/>records.zip**:
  - Moved from outer.zip/>records-old.zip
  - 1 cell changed
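
The two `path_regex` patterns in the config above can be checked against candidate logical paths with Python's `re` module. This assumes the patterns behave the same way under Python as under binoc's regex engine, which holds for the basic anchors and escapes used here; the candidate path lists are hypothetical.

```python
import re

# Patterns copied from the dataset config above.
LEFT = re.compile(r"^outer\.zip/>records-old\.zip$")
RIGHT = re.compile(r"^outer\.zip/>records\.zip$")

# Hypothetical logical paths on each side.
left_paths = [
    "outer.zip",
    "outer.zip/>records-old.zip",
    "outer.zip/>records-old.zip/>data.csv",
]
right_paths = [
    "outer.zip",
    "outer.zip/>records.zip",
    "outer.zip/>records.zip/>data.csv",
]

left_hits = [p for p in left_paths if LEFT.search(p)]
right_hits = [p for p in right_paths if RIGHT.search(p)]
print(left_hits, right_hits)
```

Because both patterns are anchored with `^` and `$`, each selects only the inner archive itself, not the outer zip or the CSV inside it.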

zip-json-key-order-reexport¶

JSON files inside an expanded zip are parsed, and their differences are reported as serialization-only changes

Browse source: zip-json-key-order-reexport
Tags: zip, json, serialization, key-order
Snapshots: snapshot-a has 1 file — archive.zip.d/metadata.json; snapshot-b has 1 file — archive.zip.d/metadata.json

Run it:

binoc diff \
  ./test-vectors-materialized/zip-json-key-order-reexport/snapshot-a \
  ./test-vectors-materialized/zip-json-key-order-reexport/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.zip/>metadata.json**: Document serialization changed
  - Serialization Change: kinds: ["object_key_order","formatting"]; left: {"byte_len":82,"line_ending":"lf","object_key_orders":[{"keys":["id","name"],"path":"$.schema"},{"keys":["dataset","issued","schema"],"path":"$"}],"trailing_new...; right: {"byte_len":110,"indentation":"2 spaces","line_ending":"lf","object_key_orders":[{"keys":["name","id"],"path":"$.schema"},{"keys":["schema","issued","dataset"],...

zip-nested¶

Nested zip containing CSV

Browse source: zip-nested
Tags: zip, nested, csv
Snapshots: snapshot-a has 1 file — outer.zip.d/inner.zip.d/data.csv; snapshot-b has 1 file — outer.zip.d/inner.zip.d/data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/zip-nested/snapshot-a \
  ./test-vectors-materialized/zip-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **outer.zip/>inner.zip/>data.csv**: 1 row added
  - Rows added
    - row 2: 'Bob', '25'

zip-rename-contents-rewritten¶

Documents a known gap — a renamed zip whose children were all renamed AND rewritten (no content similarity) yields unpa…

Browse source: zip-rename-contents-rewritten
Tags: zip, archive, known-gap
Snapshots: snapshot-a has 3 files — data.zip.d/x.csv, data.zip.d/y.csv, data.zip.d/z.csv; snapshot-b has 3 files — archive.zip.d/p.csv, archive.zip.d/q.csv, archive.zip.d/r.csv

Run it:

binoc diff \
  ./test-vectors-materialized/zip-rename-contents-rewritten/snapshot-a \
  ./test-vectors-materialized/zip-rename-contents-rewritten/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **data.zip**: Removed
- **data.zip/>x.csv**: Removed
- **data.zip/>y.csv**: Removed
- **data.zip/>z.csv**: Removed
- **archive.zip**: Added
- **archive.zip/>p.csv**: Added
- **archive.zip/>q.csv**: Added
- **archive.zip/>r.csv**: Added

zip-rename-identical¶

Zip archive renamed with identical contents; bottom-up roll-up of the inner clean file moves compacts the pair into a s…

Browse source: zip-rename-identical
Tags: zip, archive, folder-move
Snapshots: snapshot-a has 3 files — data.zip.d/x.csv, data.zip.d/y.csv, data.zip.d/z.csv; snapshot-b has 3 files — archive.zip.d/x.csv, archive.zip.d/y.csv, archive.zip.d/z.csv

Run it:

binoc diff \
  ./test-vectors-materialized/zip-rename-identical/snapshot-a \
  ./test-vectors-materialized/zip-rename-identical/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.zip**: Moved from data.zip
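
The roll-up described above can be sketched as follows: if every child of the old container moved, unchanged and under the same name, into the new container, the child moves collapse into one container move. `roll_up` is a hypothetical function, and it assumes `child_moves` holds only unmodified moves.

```python
def roll_up(child_moves: dict[str, str], old: str, new: str) -> dict[str, str]:
    """Collapse child moves into one container move when every child kept its name."""
    # Where each child would land if the container were simply renamed.
    expected = {
        src: new + src[len(old):]
        for src in child_moves
        if src.startswith(old + "/>")
    }
    if expected and expected == child_moves:
        return {new: f"Moved from {old}"}
    # Otherwise keep one entry per child.
    return {dst: f"Moved from {src}" for src, dst in child_moves.items()}


moves = {
    "data.zip/>x.csv": "archive.zip/>x.csv",
    "data.zip/>y.csv": "archive.zip/>y.csv",
    "data.zip/>z.csv": "archive.zip/>z.csv",
}
rolled = roll_up(moves, "data.zip", "archive.zip")
print(rolled)
```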

zip-rename-inner-rename-edit¶

Zip archive renamed while its only child was renamed and had one cell edited; the modified move counts as roll-up evide…

Browse source: zip-rename-inner-rename-edit
Tags: zip, archive, folder-move, fuzzy-correlation
Snapshots: snapshot-a has 1 file — data.zip.d/old.csv; snapshot-b has 1 file — archive.zip.d/new.csv

Run it:

binoc diff \
  ./test-vectors-materialized/zip-rename-inner-rename-edit/snapshot-a \
  ./test-vectors-materialized/zip-rename-inner-rename-edit/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.zip**: Moved from data.zip
- **archive.zip/>new.csv**:
  - Moved from data.zip/>old.csv
  - 1 cell changed
  - Changed cells
    - row 5, column 'score': '60' -> '61'

zip-simple¶

Zipped files with changes inside

Browse source: zip-simple
Tags: zip, archive
Snapshots: snapshot-a has 1 file — archive.zip.d/data.txt; snapshot-b has 2 files — archive.zip.d/data.txt, archive.zip.d/extra.txt

Run it:

binoc diff \
  ./test-vectors-materialized/zip-simple/snapshot-a \
  ./test-vectors-materialized/zip-simple/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

- **archive.zip/>data.txt**: 1 line added; 1 line removed
  - Line changes
    - line 1: 'hello from zip A' -> 'hello from zip B'
- **archive.zip/>extra.txt**: Added