Saffron
Sparse Autoencoders for RoseTTAFold and RFDiffusiON (and other interpretability tooling)
Saffron is an API for reading and writing activations for protein folding and design models.
Built on foundry — the RoseTTACommons inference infrastructure — with minimal, non-invasive engine modifications to expose activation hooks.
Prerequisite: uv
git clone https://github.com/RoseTTACommons/foundry && cd foundry
uv sync --extra saePipeline
collect → train → eval → screen → steer

- collect: Run model inference with forward hooks. Writes activations to HDF5.
- train: Train a Matryoshka BatchTopK sparse autoencoder on collected activations.
- eval: Score SAE features: activation frequency, top residues, AUROC per label.
- screen: Run a trained detector bundle on new sequences or structures.
- steer: Add or ablate directions in activation space during inference.

Models
| Model | Input | Use case |
|---|---|---|
| RF3 | sequence | Folding-time activation collection |
| RFD3 | PDB structure | Partial-diffusion activation collection + steering |
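
The steer stage adds or ablates a direction in activation space. The sketch below illustrates that arithmetic on a single activation vector in plain Python. It is not Saffron's implementation: the function names (`unit`, `add_direction`, `ablate_direction`), the `scale` parameter, and the toy vectors are all hypothetical, and it assumes that adding means shifting along the unit-normalized direction while ablating means projecting that direction out. In practice this would presumably run on tensors inside a forward hook.

```python
import math


def unit(direction):
    """Return the direction scaled to unit length."""
    norm = math.sqrt(sum(v * v for v in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [v / norm for v in direction]


def add_direction(activation, direction, scale):
    """Steer by adding scale * unit(direction) to one activation vector."""
    d = unit(direction)
    return [a + scale * v for a, v in zip(activation, d)]


def ablate_direction(activation, direction):
    """Remove the component of the activation that lies along the direction."""
    d = unit(direction)
    coeff = sum(a * v for a, v in zip(activation, d))  # scalar projection
    return [a - coeff * v for a, v in zip(activation, d)]


# Hypothetical 3-dimensional activation and feature direction.
activation = [1.0, 2.0, 3.0]
direction = [0.0, 0.0, 2.0]

steered = add_direction(activation, direction, scale=0.5)
ablated = ablate_direction(activation, direction)
```

With these toy inputs, `steered` shifts only the third coordinate and `ablated` zeroes it, leaving the components orthogonal to the direction unchanged.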
Paper
Saffron: Sparse Autoencoders for RoseTTAFold and RFDiffusiON — NeurIPS 2026 submission.