
Examples gallery

These are runnable examples from binoc's test suite. Each example links to its source folder on GitHub, tells you whether it needs any extra setup, gives you the exact command to run, and shows the Markdown changelog binoc is expected to print.

Binoc currently ships 25 shared examples in this gallery.

One-time setup

Clone the repository and materialize the archive-based fixtures once:

git clone https://github.com/harvard-lil/binoc
cd binoc
just materialize

At a glance

| Example | What it shows | Example output | Setup |
| --- | --- | --- | --- |
| csv-cell-changes | Individual cell values changed | data.csv: 2 cells changed | Default pipeline |
| csv-column-addition | New column added | data.csv: Column added: 'email' | Default pipeline |
| csv-column-removal | Column removed | data.csv: Column removed: 'city' | Default pipeline |
| csv-column-reorder | Columns shuffled, content identical | data.csv: Columns reordered (content unchanged) | Custom config |
| csv-mixed-changes | Multiple change types | data.csv: Column added: 'email'; columns reordered; 1 row added | Default pipeline |
| csv-row-addition | New rows appended | data.csv: 2 rows added | Default pipeline |
| csv-row-removal | Rows removed from CSV | data.csv: 2 rows removed | Default pipeline |
| directory-file-copy | New file with same content as an existing unchanged file detected as a copy | duplicate.txt: Copied from original.txt | Default pipeline |
| directory-nested | Subdirectories with mixed changes | data/extra.csv: New table (2 columns, 1 rows) | Default pipeline |
| directory-nested-with-tar | Shows binoc diffing a tar archive and a plain directory that contain overlapping internal paths. | data/records.csv: 1 row added | Default pipeline |
| folder-move-nested | Detects a whole-folder rename and rolls many file moves up into one folder-move entry. | documentation: Folder moved from docs | Default pipeline |
| kitchen-sink | Runs text, CSV, archive, move, and copy detection together in one end-to-end example. | metrics.csv: Columns reordered (content unchanged) | Default pipeline |
| single-file-add | File present in B but not A | new_file.txt: New file (1 line) | Default pipeline |
| single-file-modify-binary | Binary file, different hash | data.bin: Content changed (4 bytes → 4 bytes) | Default pipeline |
| single-file-modify-csv | CSV file compared directly (file-to-file, not via directory) | data.csv: 1 row added | Default pipeline |
| single-file-modify-text | Text file with line-level changes | story.txt: 2 lines added, 1 removed | Default pipeline |
| single-file-modify-text-root | Text file compared directly (file-to-file, not via directory) | story.txt: 2 lines added, 1 removed | Default pipeline |
| single-file-remove | File present in A but not B | removed_file.txt: File removed (1 line) | Default pipeline |
| tar-nested | Nested tar.gz containing CSV | outer.tar.gz/inner.tar.gz/data.csv: 1 row added | Default pipeline |
| tar-simple | Tar.gz archive with changes inside | archive.tar.gz/data.csv: 1 row added | Default pipeline |
| tree-wide-correlation | Shows tree-wide move and copy detection across nested zip boundaries, including one-to-many copies and many-to-one moves. | gamma-renamed.txt: Moved from gamma.txt | Default pipeline |
| trivial-identical | Two identical directories → empty changeset | No changes detected. | Default pipeline |
| trivial-identical-csv | Two identical CSV files → no changes reported | No changes detected. | Default pipeline |
| zip-nested | Nested zip containing CSV | outer.zip/inner.zip/data.csv: 1 row added | Default pipeline |
| zip-simple | Zipped files with changes inside | archive.zip/data.txt: 1 line added, 1 removed | Default pipeline |
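Most entries in this gallery follow the same command shape, so a small loop can generate the commands instead of copying them one by one. The sketch below only prints the commands; `diff_cmd` and the `VECTORS` variable are hypothetical names introduced here, and the example list is a hand-picked subset of the directory-based, default-pipeline entries.

```shell
#!/bin/sh
# Root of the materialized fixtures, as used by the commands in this gallery.
# VECTORS is an illustrative variable name, not something binoc reads.
VECTORS="${VECTORS:-./test-vectors-materialized}"

# Hypothetical helper: print the diff command for a directory-based example.
diff_cmd() {
  printf 'binoc diff %s/%s/snapshot-a %s/%s/snapshot-b\n' \
    "$VECTORS" "$1" "$VECTORS" "$1"
}

# A few default-pipeline examples that compare whole snapshot directories.
for name in csv-cell-changes csv-row-addition tar-simple zip-simple; do
  diff_cmd "$name"
done
```

Piping the output to `sh` from the repository root would run the printed commands. csv-column-reorder (which needs `--config`) and the two file-to-file examples (which pass file paths rather than snapshot directories) do not fit this shape and are left out.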

csv-cell-changes

Individual cell values changed

  • Browse source: csv-cell-changes
  • Tags: csv, cell-change
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-cell-changes/snapshot-a \
  ./test-vectors-materialized/csv-cell-changes/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Other Changes

- **data.csv**: 2 cells changed
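When one of these examples doubles as a regression check, it can help to assert on a single changelog line rather than read the output by eye. The sketch below relies only on the Markdown shown above; `has_entry` is a hypothetical helper name, and the canned input stands in for a real `binoc diff` run.

```shell
#!/bin/sh
# Hypothetical helper: succeed if stdin contains the given changelog line exactly.
has_entry() {
  # -F: treat the pattern as a literal string (the ** markers are not a regex)
  # -x: the whole line must match; -q: no output, exit status only
  # -e: needed because the pattern itself starts with "-"
  grep -Fxq -e "$1"
}

# Canned changelog text; in practice, pipe `binoc diff ...` into has_entry.
expected='- **data.csv**: 2 cells changed'
if printf '%s\n' '# Changelog: snapshot-a → snapshot-b' '' \
     '## Other Changes' '' "$expected" | has_entry "$expected"; then
  echo "ok: entry found"
fi
```

Matching the full line keeps the check strict: a run that reported a different cell count would fail it.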

csv-column-addition

New column added

  • Browse source: csv-column-addition
  • Tags: csv, column-addition, schema
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-addition/snapshot-a \
  ./test-vectors-materialized/csv-column-addition/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: Column added: 'email'

csv-column-removal

Column removed

  • Browse source: csv-column-removal
  • Tags: csv, column-removal, schema
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-removal/snapshot-a \
  ./test-vectors-materialized/csv-column-removal/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: Column removed: 'city'

csv-column-reorder

Columns shuffled, content identical

  • Browse source: csv-column-reorder
  • Tags: csv, column-reorder, clerical
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv
  • Setup: This example uses a custom dataset config to narrow the pipeline to the comparators and transformers that make the behavior obvious. Save this dataset config as /tmp/csv-column-reorder.yaml:

comparators:
  - binoc.directory
  - binoc.csv
transformers:
  - binoc.tabular_analyzer
  - binoc.column_reorder_detector
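One way to save that config without opening an editor is a shell here-document. This is a minimal sketch that writes the config exactly as listed above to the path the run command expects; the `config` variable name is only for illustration.

```shell
#!/bin/sh
# Write the dataset config listed above to the path used by the run command.
config=/tmp/csv-column-reorder.yaml

# The quoted 'EOF' delimiter keeps the shell from expanding anything in the body.
cat > "$config" <<'EOF'
comparators:
  - binoc.directory
  - binoc.csv
transformers:
  - binoc.tabular_analyzer
  - binoc.column_reorder_detector
EOF

echo "wrote $config"
```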

Run it:

binoc diff \
  ./test-vectors-materialized/csv-column-reorder/snapshot-a \
  ./test-vectors-materialized/csv-column-reorder/snapshot-b \
  --config /tmp/csv-column-reorder.yaml

Result:

# Changelog: snapshot-a → snapshot-b

## Clerical Changes

- **data.csv**: Columns reordered (content unchanged)

csv-mixed-changes

Multiple change types

  • Browse source: csv-mixed-changes
  • Tags: csv, column-reorder, column-addition, row-addition
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-mixed-changes/snapshot-a \
  ./test-vectors-materialized/csv-mixed-changes/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: Column added: 'email'; columns reordered; 1 row added

csv-row-addition

New rows appended

  • Browse source: csv-row-addition
  • Tags: csv, row-addition
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-row-addition/snapshot-a \
  ./test-vectors-materialized/csv-row-addition/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: 2 rows added

csv-row-removal

Rows removed from CSV

  • Browse source: csv-row-removal
  • Tags: csv, row-removal
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/csv-row-removal/snapshot-a \
  ./test-vectors-materialized/csv-row-removal/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: 2 rows removed

directory-file-copy

New file with same content as an existing unchanged file detected as a copy

  • Browse source: directory-file-copy
  • Tags: copy, directory, content-hash
  • Snapshots: snapshot-a has 1 file — original.txt; snapshot-b has 2 files — duplicate.txt, original.txt

Run it:

binoc diff \
  ./test-vectors-materialized/directory-file-copy/snapshot-a \
  ./test-vectors-materialized/directory-file-copy/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Other Changes

- **duplicate.txt**: Copied from original.txt
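To confirm independently that two files really are identical before expecting a "Copied from" entry, a byte-for-byte comparison is enough. The sketch below is not binoc's detection logic (the content-hash tag suggests hashing, but the mechanism is not shown here); `same_content` is a hypothetical helper, and the temporary files stand in for the fixture.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two files are byte-for-byte identical.
same_content() {
  # -s: silent; only the exit status reports the result
  cmp -s "$1" "$2"
}

# Self-contained demo with throwaway files standing in for the fixture.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/original.txt"
cp "$tmp/original.txt" "$tmp/duplicate.txt"

if same_content "$tmp/original.txt" "$tmp/duplicate.txt"; then
  echo "duplicate.txt has the same content as original.txt"
fi
rm -rf "$tmp"
```

Against the real fixture, the two arguments would be original.txt and duplicate.txt inside snapshot-b.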

directory-nested

Subdirectories with mixed changes

  • Browse source: directory-nested
  • Tags: directory, nested, mixed
  • Snapshots: snapshot-a has 2 files — data/records.csv, docs/readme.txt; snapshot-b has 3 files — data/extra.csv, data/records.csv, docs/readme.txt

Run it:

binoc diff \
  ./test-vectors-materialized/directory-nested/snapshot-a \
  ./test-vectors-materialized/directory-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data/extra.csv**: New table (2 columns, 1 rows)
- **data/records.csv**: 1 row added
- **docs/readme.txt**: 2 lines added, 1 removed

directory-nested-with-tar

Shows binoc diffing a tar archive and a plain directory that contain overlapping internal paths.

  • Browse source: directory-nested-with-tar
  • Tags: directory, tar, overlap, artifact-collision
  • Snapshots: snapshot-a has 2 files — data.tar.gz.d/records.csv, data/records.csv; snapshot-b has 2 files — data.tar.gz.d/records.csv, data/records.csv

Run it:

binoc diff \
  ./test-vectors-materialized/directory-nested-with-tar/snapshot-a \
  ./test-vectors-materialized/directory-nested-with-tar/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data/records.csv**: 1 row added

## Other Changes

- **data.tar.gz/records.csv**: 1 cell changed

folder-move-nested

Detects a whole-folder rename and rolls many file moves up into one folder-move entry.

  • Browse source: folder-move-nested
  • Tags: folder-move, rollup, nested, directory
  • Snapshots: snapshot-a has 4 files — docs/readme.txt, docs/reports/annual.txt, docs/reports/quarterly/q1.txt, docs/reports/quarterly/q2.txt; snapshot-b has 4 files — documentation/readme.txt, documentation/reports/annual.txt, documentation/reports/quarterly/q1.txt, documentation/reports/quarterly/q2.txt

Run it:

binoc diff \
  ./test-vectors-materialized/folder-move-nested/snapshot-a \
  ./test-vectors-materialized/folder-move-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Other Changes

- **documentation**: Folder moved from docs

kitchen-sink

Runs text, CSV, archive, move, and copy detection together in one end-to-end example.

  • Browse source: kitchen-sink
  • Tags: csv, text, binary, tar, zip, directory, move, copy, column-reorder, integration
  • Snapshots: snapshot-a has 9 files — archive.tar.gz.d/inventory.csv, bundle.zip.d/notes.txt, data.csv, docs/old-notes.txt, +5 more; snapshot-b has 10 files — archive.tar.gz.d/inventory.csv, bundle.zip.d/notes.txt, data.csv, docs/new-file.txt, +6 more

Run it:

binoc diff \
  ./test-vectors-materialized/kitchen-sink/snapshot-a \
  ./test-vectors-materialized/kitchen-sink/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Clerical Changes

- **metrics.csv**: Columns reordered (content unchanged)

## Substantive Changes

- **archive.tar.gz/inventory.csv**: 1 row added
- **bundle.zip/notes.txt**: 2 lines added, 1 removed
- **docs/new-file.txt**: New file (1 line)
- **docs/old-notes.txt**: File removed (1 line)
- **docs/readme.txt**: 2 lines added, 2 removed
- **icon.bin**: Content changed (19 bytes → 19 bytes)

## Other Changes

- **data.csv**: 2 cells changed
- **license-copy.txt**: Copied from license.txt
- **summary.txt**: Moved from report.txt

single-file-add

File present in B but not A

  • Browse source: single-file-add
  • Tags: add, file
  • Snapshots: snapshot-a has 0 files (empty snapshot); snapshot-b has 1 file — new_file.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-add/snapshot-a \
  ./test-vectors-materialized/single-file-add/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **new_file.txt**: New file (1 line)

single-file-modify-binary

Binary file, different hash

  • Browse source: single-file-modify-binary
  • Tags: modify, binary
  • Snapshots: snapshot-a has 1 file — data.bin; snapshot-b has 1 file — data.bin

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-binary/snapshot-a \
  ./test-vectors-materialized/single-file-modify-binary/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.bin**: Content changed (4 bytes → 4 bytes)

single-file-modify-csv

CSV file compared directly (file-to-file, not via directory)

  • Browse source: single-file-modify-csv
  • Tags: csv, single-file, modify
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-csv/snapshot-a/data.csv \
  ./test-vectors-materialized/single-file-modify-csv/snapshot-b/data.csv

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **data.csv**: 1 row added

single-file-modify-text

Text file with line-level changes

  • Browse source: single-file-modify-text
  • Tags: modify, text, lines
  • Snapshots: snapshot-a has 1 file — story.txt; snapshot-b has 1 file — story.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-text/snapshot-a \
  ./test-vectors-materialized/single-file-modify-text/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **story.txt**: 2 lines added, 1 removed

single-file-modify-text-root

Text file compared directly (file-to-file, not via directory)

  • Browse source: single-file-modify-text-root
  • Tags: text, single-file, modify
  • Snapshots: snapshot-a has 1 file — story.txt; snapshot-b has 1 file — story.txt

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-modify-text-root/snapshot-a/story.txt \
  ./test-vectors-materialized/single-file-modify-text-root/snapshot-b/story.txt

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **story.txt**: 2 lines added, 1 removed

single-file-remove

File present in A but not B

  • Browse source: single-file-remove
  • Tags: remove, file
  • Snapshots: snapshot-a has 1 file — removed_file.txt; snapshot-b has 0 files (empty snapshot)

Run it:

binoc diff \
  ./test-vectors-materialized/single-file-remove/snapshot-a \
  ./test-vectors-materialized/single-file-remove/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **removed_file.txt**: File removed (1 line)

tar-nested

Nested tar.gz containing CSV

  • Browse source: tar-nested
  • Tags: tar, nested, csv
  • Snapshots: snapshot-a has 1 file — outer.tar.gz.d/inner.tar.gz.d/data.csv; snapshot-b has 1 file — outer.tar.gz.d/inner.tar.gz.d/data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/tar-nested/snapshot-a \
  ./test-vectors-materialized/tar-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **outer.tar.gz/inner.tar.gz/data.csv**: 1 row added
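The Snapshots listings in this gallery name archive contents with a `.d` suffix on each archive level (for example outer.tar.gz.d/inner.tar.gz.d/data.csv), while the changelog reports the same file without it. The sketch below maps one form to the other. It is inferred from the listings on this page rather than from a documented rule, `reported_path` is a hypothetical helper, and it covers only the `.tar.gz` and `.zip` suffixes that appear here.

```shell
#!/bin/sh
# Hypothetical helper: turn a path as listed under "Snapshots" into the path
# shown in the changelog by dropping ".d" from archive directory names.
# Assumption: only .tar.gz and .zip archives occur, as in this gallery.
reported_path() {
  printf '%s\n' "$1" | sed 's#\.tar\.gz\.d/#.tar.gz/#g; s#\.zip\.d/#.zip/#g'
}

reported_path outer.tar.gz.d/inner.tar.gz.d/data.csv
reported_path archive.zip.d/data.txt
```

Paths without an archive component pass through unchanged, so the helper can be applied to every listed file.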

tar-simple

Tar.gz archive with changes inside

  • Browse source: tar-simple
  • Tags: tar, archive
  • Snapshots: snapshot-a has 2 files — archive.tar.gz.d/data.csv, archive.tar.gz.d/hello.txt; snapshot-b has 2 files — archive.tar.gz.d/data.csv, archive.tar.gz.d/hello.txt

Run it:

binoc diff \
  ./test-vectors-materialized/tar-simple/snapshot-a \
  ./test-vectors-materialized/tar-simple/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **archive.tar.gz/data.csv**: 1 row added
- **archive.tar.gz/hello.txt**: 1 line added

tree-wide-correlation

Shows tree-wide move and copy detection across nested zip boundaries, including one-to-many copies and many-to-one moves.

  • Browse source: tree-wide-correlation
  • Tags: move, copy, aggregation, zip, nested, archive, tree-wide
  • Snapshots: snapshot-a has 6 files — alpha.txt, dup.bin, kept.txt, outer.zip.d/beta.txt, +2 more; snapshot-b has 7 files — gamma-renamed.txt, kept-copy.txt, kept.txt, merged.bin, +3 more

Run it:

binoc diff \
  ./test-vectors-materialized/tree-wide-correlation/snapshot-a \
  ./test-vectors-materialized/tree-wide-correlation/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Other Changes

- **gamma-renamed.txt**: Moved from gamma.txt
- **kept-copy.txt**: Copied from kept.txt to kept-copy.txt and outer.zip/kept-copy.txt
- **merged.bin**: Moved from dup.bin and dup-b.bin
- **outer.zip/alpha-renamed.txt**: Moved from alpha.txt
- **outer.zip/inner.zip/beta-renamed.txt**: Moved from beta.txt

trivial-identical

Two identical directories → empty changeset

  • Browse source: trivial-identical
  • Tags: identical, baseline
  • Snapshots: snapshot-a has 1 file — data.txt; snapshot-b has 1 file — data.txt

Run it:

binoc diff \
  ./test-vectors-materialized/trivial-identical/snapshot-a \
  ./test-vectors-materialized/trivial-identical/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

No changes detected.

trivial-identical-csv

Two identical CSV files → no changes reported

  • Browse source: trivial-identical-csv
  • Tags: csv, identical, baseline
  • Snapshots: snapshot-a has 1 file — data.csv; snapshot-b has 1 file — data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/trivial-identical-csv/snapshot-a \
  ./test-vectors-materialized/trivial-identical-csv/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

No changes detected.

zip-nested

Nested zip containing CSV

  • Browse source: zip-nested
  • Tags: zip, nested, csv
  • Snapshots: snapshot-a has 1 file — outer.zip.d/inner.zip.d/data.csv; snapshot-b has 1 file — outer.zip.d/inner.zip.d/data.csv

Run it:

binoc diff \
  ./test-vectors-materialized/zip-nested/snapshot-a \
  ./test-vectors-materialized/zip-nested/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **outer.zip/inner.zip/data.csv**: 1 row added

zip-simple

Zipped files with changes inside

  • Browse source: zip-simple
  • Tags: zip, archive
  • Snapshots: snapshot-a has 1 file — archive.zip.d/data.txt; snapshot-b has 2 files — archive.zip.d/data.txt, archive.zip.d/extra.txt

Run it:

binoc diff \
  ./test-vectors-materialized/zip-simple/snapshot-a \
  ./test-vectors-materialized/zip-simple/snapshot-b

Result:

# Changelog: snapshot-a → snapshot-b

## Substantive Changes

- **archive.zip/data.txt**: 1 line added, 1 removed
- **archive.zip/extra.txt**: New file (1 line)