Write a Rust transformer
Goal. Build a transformer in Rust that rewrites nodes in the completed IR tree, ending with a working plugin packaged through maturin.
Prerequisites.
- Rust toolchain and maturin.
- Familiarity with Write a Rust comparator (the project layout, Cargo.toml, and export_plugin! macro are identical).
- Conceptual background in Plugin model and Artifacts and composition.
For a reference implementation, read model-plugins/binoc-row-reorder.
The minimal shape
use binoc_sdk::*;

#[derive(Default)]
pub struct SequenceNormalizer;

impl Transformer for SequenceNormalizer {
    // Declare which nodes this transformer is dispatched on.
    fn descriptor(&self) -> TransformerDescriptor {
        TransformerDescriptor::new("biobinoc.sequence_normalizer")
            .with_match_types(vec!["fasta".into()])
    }

    fn transform(
        &self,
        node: DiffNode,
        _data: &dyn DataAccess,
    ) -> BinocResult<TransformResult> {
        let left = node.details.get("sequences_left");
        let right = node.details.get("sequences_right");
        // A "modify" node whose sequence details compare equal gets tagged.
        if node.action == "modify" && left == right {
            return Ok(TransformResult::Replace(
                node.with_tag("biobinoc.whitespace-only"),
            ));
        }
        // Nothing to rewrite: take the zero-cost path.
        Ok(TransformResult::Unchanged)
    }
}
Key points:
- Dispatch is declarative. TransformerDescriptor declares which nodes you match on: any combination of match_types, match_tags, match_actions, match_artifacts, and node_shape. All non-empty fields must match (AND-of-ORs). See Dispatch model.
- Return types. TransformResult::Unchanged, Replace(node), ReplaceMany(nodes), or Remove. Unchanged is the zero-cost path; return it whenever you don't need to rewrite.
- Ordering matters. Transformers run in the order declared in the dataset config. Later transformers see the output of earlier ones. The tree walk is bottom-up, so when your transformer sees a container, its children are already in their final form.
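The AND-of-ORs rule can be illustrated with a small standalone model. This is a sketch, not binoc's dispatch code: `Descriptor`, `Node`, and `descriptor_matches` are hypothetical stand-ins, only three of the five match fields are modeled, and the treatment of tags (any one of the node's tags may satisfy the field) is an assumption.

```rust
/// Hypothetical stand-in for a transformer descriptor (not the binoc_sdk type).
#[derive(Default)]
struct Descriptor {
    match_types: Vec<String>,
    match_tags: Vec<String>,
    match_actions: Vec<String>,
}

/// Hypothetical stand-in for a node in the IR tree.
struct Node {
    item_type: String,
    action: String,
    tags: Vec<String>,
}

/// OR within one field: an empty list imposes no constraint,
/// otherwise the value must equal at least one listed alternative.
fn field_matches(allowed: &[String], value: &str) -> bool {
    allowed.is_empty() || allowed.iter().any(|a| a == value)
}

/// AND across fields: every non-empty field must be satisfied.
fn descriptor_matches(desc: &Descriptor, node: &Node) -> bool {
    // Assumption: a node carries several tags, and any one of them may match.
    let tags_ok = desc.match_tags.is_empty()
        || node.tags.iter().any(|t| desc.match_tags.contains(t));
    field_matches(&desc.match_types, &node.item_type)
        && field_matches(&desc.match_actions, &node.action)
        && tags_ok
}
```

In this model, a descriptor listing the types `fasta` and `fastq` plus the action `modify` matches a modified FASTA node but not an added one, and a descriptor with every field empty matches any node.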
Register the transformer via the same export_plugin! macro used
for comparators, listing it alongside your comparators:
binoc_sdk::export_plugin! {
    module: biobinoc,
    comparators: [FastaComparator],
    transformers: [SequenceNormalizer],
}
Consume a comparator's artifact
The cleanest way for a transformer to get structured data is to
consume an artifact published by a comparator. This decouples the
transformer from any specific comparator — any comparator publishing
tabular_v1 can feed your transformer:
impl Transformer for TabularEnricher {
    fn descriptor(&self) -> TransformerDescriptor {
        TransformerDescriptor::new("biobinoc.tabular_enricher")
            .with_match_artifacts(vec![tabular_v1()])
    }

    fn transform(
        &self,
        node: DiffNode,
        data: &dyn DataAccess,
    ) -> BinocResult<TransformResult> {
        // Find the left-side artifact published in the tabular_v1 format.
        let fmt = tabular_v1();
        let Some(desc) = node.artifacts.iter().find(|a| {
            a.format == fmt && a.subject == ArtifactSubject::Left
        }) else {
            return Ok(TransformResult::Unchanged);
        };
        let Some(bytes) = data.get_artifact(desc)? else {
            return Ok(TransformResult::Unchanged);
        };
        // unwrap() panics on malformed JSON; production code should surface
        // the failure as an error instead.
        let tabular: TabularData = serde_json::from_slice(&bytes).unwrap();
        let new_node = enrich(node, &tabular);
        Ok(TransformResult::Replace(new_node))
    }
}
Artifacts are the public composition contract. See Artifacts and composition for the full story and the thin-comparator pattern.
When to reach for source_items
The controller sets DiffNode::source_items on every node during the
diff. If no artifact carries what you need, transformers can re-parse
source data:
fn transform(
    &self,
    node: DiffNode,
    data: &dyn DataAccess,
) -> BinocResult<TransformResult> {
    let Some(pair) = &node.source_items else {
        return Ok(TransformResult::Unchanged);
    };
    // re-parse pair.left / pair.right via data.read_bytes(...) etc.
    ...
}
Prefer artifacts. They avoid redundant re-parsing across
transformers and enable cross-plugin composition. Use source_items
only when you genuinely need raw byte access (for example, hashing
for move detection) or the comparator you want to cooperate with does
not publish a suitable artifact.
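As an illustration of the raw-bytes case, move detection by content hash might look like the sketch below. It is standalone and assumes the bytes for each side have already been read (for instance through `data.read_bytes(...)`). The `detect_moves` helper and its `(path, bytes)` inputs are hypothetical, and std's `DefaultHasher` stands in for whatever hash a real plugin would choose.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash raw bytes. DefaultHasher's output is not guaranteed stable across
/// Rust releases, so compare these values only within a single run.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper: pair up items whose content hash is identical but
/// whose path differs between the left and right snapshots.
fn detect_moves<'a>(
    left: &[(&'a str, &[u8])],
    right: &[(&'a str, &[u8])],
) -> Vec<(&'a str, &'a str)> {
    // Index left-side paths by content hash. If two left items share a hash,
    // the later one wins; a real implementation would need a policy here.
    let mut by_hash: HashMap<u64, &'a str> = HashMap::new();
    for (path, bytes) in left {
        by_hash.insert(content_hash(bytes), *path);
    }

    let mut moves = Vec::new();
    for (path, bytes) in right {
        if let Some(old_path) = by_hash.get(&content_hash(bytes)) {
            // Same content at the same path is not a move.
            if old_path != path {
                moves.push((*old_path, *path));
            }
        }
    }
    moves
}
```

A 64-bit hash can collide, so a production version would likely confirm a candidate move by comparing the bytes themselves before reporting it.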
Testing
See Test a plugin with vectors for
the shared harness. For unit tests, construct DiffNode values and
call transform() directly.
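A unit test in that style might look like the sketch below. To stay runnable on its own, it defines minimal stand-ins for `DiffNode` and `TransformResult`, plus a free function `tag_whitespace_only` that mirrors the `SequenceNormalizer` logic from "The minimal shape". All of these are hypothetical, and the field types are assumed. A real test would import the types from `binoc_sdk` and call `transform()` on the transformer with a `DataAccess` implementation suitable for tests.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the SDK's DiffNode (hypothetical; field types assumed).
#[derive(Debug, Clone, PartialEq, Default)]
struct DiffNode {
    action: String,
    details: HashMap<String, String>,
    tags: Vec<String>,
}

impl DiffNode {
    fn with_tag(mut self, tag: &str) -> Self {
        self.tags.push(tag.to_string());
        self
    }
}

/// Minimal stand-in for the SDK's TransformResult (two variants only).
#[derive(Debug, PartialEq)]
enum TransformResult {
    Unchanged,
    Replace(DiffNode),
}

/// Mirrors the SequenceNormalizer logic without the SDK plumbing.
fn tag_whitespace_only(node: DiffNode) -> TransformResult {
    let left = node.details.get("sequences_left");
    let right = node.details.get("sequences_right");
    if node.action == "modify" && left == right {
        return TransformResult::Replace(node.with_tag("biobinoc.whitespace-only"));
    }
    TransformResult::Unchanged
}

/// Test fixture: a "modify" node carrying the two sequence details.
fn modify_node(left: &str, right: &str) -> DiffNode {
    let mut details = HashMap::new();
    details.insert("sequences_left".to_string(), left.to_string());
    details.insert("sequences_right".to_string(), right.to_string());
    DiffNode { action: "modify".to_string(), details, tags: Vec::new() }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn equal_sequences_get_tagged() {
        match tag_whitespace_only(modify_node("ACGT", "ACGT")) {
            TransformResult::Replace(n) => {
                assert_eq!(n.tags, vec!["biobinoc.whitespace-only".to_string()])
            }
            TransformResult::Unchanged => panic!("expected Replace"),
        }
    }

    #[test]
    fn differing_sequences_are_unchanged() {
        let result = tag_whitespace_only(modify_node("ACGT", "ACGA"));
        assert_eq!(result, TransformResult::Unchanged);
    }
}
```

Covering both the rewrite path and the `Unchanged` path keeps the zero-cost return from regressing unnoticed.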
Where to go next
- Publish a plugin — packaging, entry points, versioning.
- Write a Rust comparator — emit the nodes (or artifacts) your transformer consumes.
- Artifacts and composition — the deeper story on publishing and consuming typed data across plugin boundaries.