CLI Reference

All commands are Hydra-configured. Key-value overrides (key=value) are passed directly on the CLI.


saffron <command> [overrides...]
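
As a rough illustration of the key=value convention, the sketch below splits override tokens into a dictionary. It is not Hydra's parser: Hydra's override grammar also handles nested keys, lists, and the `+` / `~` prefixes. `parse_overrides` is a hypothetical helper and the example values are placeholders.

```python
def parse_overrides(tokens):
    """Split simple key=value tokens into a dict (hypothetical helper, not Hydra's parser)."""
    overrides = {}
    for token in tokens:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        overrides[key] = value  # values stay strings; Hydra would also infer types
    return overrides


# Placeholder values for illustration only.
argv = ["model=rf3", "inputs=inputs.json", "out_dir=outputs/collect"]
parsed = parse_overrides(argv)
print(parsed)
```

Splitting on the first `=` only keeps values such as paths or expressions intact even when they contain `=` themselves.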

collect

Runs model inference with forward hooks attached. Writes activations to <out_dir>/activations/activations.h5.


saffron collect model=<rf3|rfd3> \
  hooks=<group> \
  inputs=<path/to/inputs.json> \
  out_dir=<output_dir>

Override	Required	Description
`model`	✓	`rf3` or `rfd3`
`hooks`	—	Hook config group: `rf3_default`, `rfd3_partial`
`steering`	—	Steering config group: `none`, `sae_block12_f639`, …
`inputs`	✓	Path to inputs JSON (per-design list)
`out_dir`	✓	Output directory
`inference_engine`	—	Override the Hydra inference engine config

The hooks= and steering= overrides resolve to config groups under sae/src/sae/configs/{hooks,steering}/, using the same mechanism as inference_engine=.
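
To make the lookup concrete, the sketch below maps a group override to a file under the config root named above. The `.yaml` extension follows Hydra's usual config-group layout and is an assumption about this repository; `group_config_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath

CONFIG_ROOT = PurePosixPath("sae/src/sae/configs")  # config root quoted in the text


def group_config_path(group, option):
    """Return the file a `group=option` override would select.

    Assumes the conventional Hydra layout <root>/<group>/<option>.yaml.
    """
    return CONFIG_ROOT / group / f"{option}.yaml"


print(group_config_path("hooks", "rf3_default"))
print(group_config_path("steering", "none"))
```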

train

Trains a Matryoshka BatchTopK SAE on collected activations.


saffron train \
  activations_path=<path/to/activations.h5> \
  hook_name=<block6|block8|block12|…> \
  [steps=20000] [batch_size=4096]

Override	Default (✓ = required)	Description
`activations_path`	✓	HDF5 file from `saffron collect`
`hook_name`	✓	Which hook’s activations to train on
`architecture`	`matryoshka_batch_top_k`	SAE architecture
`steps`	`20000`	Training steps
`batch_size`	`4096`	Batch size
`device`	`cuda:0`	Torch device
`use_wandb`	`false`	Enable W&B logging
`out_dir`	auto-timestamped	Checkpoint output directory

eval

Scores SAE features: activation frequency, top residues, per-feature AUROC.


saffron eval \
  checkpoint_path=<path/to/final.pt> \
  activations_path=<path/to/activations.h5> \
  hook_name=<block12> \
  [metadata_dir=<dir>]

Override	Default (✓ = required)	Description
`checkpoint_path`	✓	SAE checkpoint `.pt`
`activations_path`	✓	HDF5 activations file
`hook_name`	✓	Hook to evaluate
`metadata_dir`	—	Dir with per-design labels (for AUROC)
`num_features`	`12`	Features to display per analysis
`top_residues`	`5`	Top residues per feature
`eval_batch_size`	`4096`	Eval batch size
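
The two scalar scores can be illustrated on toy data. This sketch assumes that activation frequency means the fraction of samples on which a feature is nonzero, and it computes AUROC with the rank-based (Mann-Whitney) formula. The function names and numbers are made up for illustration and are not taken from the saffron code.

```python
def activation_frequency(acts):
    """Fraction of samples on which the feature fires (activation > 0)."""
    return sum(a > 0 for a in acts) / len(acts)


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy activations of one feature across six samples, with binary labels.
feature_acts = [0.0, 0.9, 0.0, 0.4, 0.7, 0.0]
labels = [0, 1, 0, 0, 1, 1]

print(activation_frequency(feature_acts))  # 3 of 6 samples are nonzero
print(auroc(feature_acts, labels))
```

An AUROC near 0.5 means the feature carries no information about the label, while values near 1.0 or 0.0 indicate a feature that separates the two classes.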

screen

Runs a trained detector bundle on new sequences or structures. This command is an alias for detect screen.


saffron screen model=<rf3|rfd3> \
  inputs=<path/to/inputs.json> \
  detector=<path/to/detector_bundle.pt> \
  out_dir=<output_dir>

steer

Dispatches identically to collect, and additionally adds or ablates directions in activation space at each diffusion step.


saffron steer model=<rf3|rfd3> \
  hooks=<group> \
  steering=<group> \
  inputs=<path/to/inputs.json> \
  out_dir=<output_dir>

See Steering for the steering config group reference.
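
The two operations named above, adding and ablating a direction, are commonly implemented as shown below. This is a generic sketch on plain Python lists, not the saffron hook code; `add_direction`, `ablate_direction`, and the `scale` parameter are illustrative names.

```python
import math


def _unit(v):
    """Scale v to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction must be nonzero")
    return [x / norm for x in v]


def add_direction(h, v, scale=1.0):
    """Shift activation h along direction v: h + scale * v."""
    return [hi + scale * vi for hi, vi in zip(h, v)]


def ablate_direction(h, v):
    """Remove the component of h along v: h - (h . v_hat) * v_hat."""
    v_hat = _unit(v)
    proj = sum(hi * vi for hi, vi in zip(h, v_hat))
    return [hi - proj * vi for hi, vi in zip(h, v_hat)]


h = [3.0, 4.0]
v = [1.0, 0.0]
print(add_direction(h, v, scale=2.0))  # [5.0, 4.0]
print(ablate_direction(h, v))          # [0.0, 4.0]
```

Ablation leaves the activation orthogonal to the direction, whereas addition moves it along the direction by an amount set by the scale.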

compute_steering_vector

Computes mean(positives) − mean(negatives) per hook from two activation sets. Outputs one .pt file per hook.


saffron compute_steering_vector \
  inputs=<path/to/diff_config.yaml>

diff_config.yaml schema:


activations_h5: outputs/collect/train/activations/activations.h5
labels_csv: data_pipelines/vaxijen/labels.tsv
positive_label: 1
negative_label: 0
hooks: [block6, block8, block12]
normalize: unit          # unit | none
out_dir: outputs/steering/vectors/haz_minus_ben
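
The difference-of-means computation can be sketched as follows, on in-memory lists rather than the HDF5 and label files. The helper names are hypothetical, and reading `normalize: unit` as scaling the difference to unit L2 norm is an assumption.

```python
import math


def mean_vector(rows):
    """Element-wise mean of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_vector(acts, labels, positive_label=1, negative_label=0, normalize="unit"):
    """Return mean(positives) - mean(negatives), optionally scaled to unit L2 norm."""
    pos = [a for a, y in zip(acts, labels) if y == positive_label]
    neg = [a for a, y in zip(acts, labels) if y == negative_label]
    if not pos or not neg:
        raise ValueError("both label groups must be non-empty")
    diff = [p - q for p, q in zip(mean_vector(pos), mean_vector(neg))]
    if normalize == "unit":
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0:  # leave an all-zero difference untouched
            diff = [d / norm for d in diff]
    elif normalize != "none":
        raise ValueError(f"unknown normalize mode: {normalize!r}")
    return diff


# Toy activations for a single hook: two positives, two negatives.
acts = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = [1, 1, 0, 0]
print(steering_vector(acts, labels, normalize="none"))  # [2.0, -3.0]
print(steering_vector(acts, labels))                    # same direction, unit length
```

In the command itself this computation would run once per entry in `hooks`, producing one vector per hook.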