Saffron

Sparse Autoencoders for RoseTTAFold and RFdiffusion (and other interpretability tooling)

Saffron is an API for reading and writing activations in protein folding and design models.

Saffron is built on foundry, the RoseTTACommons inference infrastructure, and makes only minimal, non-invasive engine modifications to expose activation hooks.

Prerequisite: uv


git clone https://github.com/RoseTTACommons/foundry && cd foundry
uv sync --extra sae

Pipeline


collect → train → eval → screen → steer

collect: Run model inference with forward hooks and write the captured activations to HDF5.
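As an illustration of the collection step, here is a minimal sketch of capturing activations with PyTorch forward hooks and saving them with h5py. It is not Saffron's own code: the function name collect_activations, the toy model, and the one-dataset-per-layer HDF5 layout are assumptions made for the example.

```python
import os
import tempfile

import h5py
import torch
from torch import nn


def collect_activations(model, layer_names, inputs, out_path):
    """Run one forward pass and save the outputs of the named submodules.

    Hypothetical helper for illustration; not part of Saffron or foundry.
    """
    captured = {}
    modules = dict(model.named_modules())

    def make_hook(name):
        def hook(module, args, output):
            # Detach so the stored array carries no autograd graph.
            captured[name] = output.detach().cpu().numpy()
        return hook

    handles = [modules[n].register_forward_hook(make_hook(n)) for n in layer_names]
    try:
        with torch.no_grad():
            model(inputs)
    finally:
        for handle in handles:
            handle.remove()  # always unhook, even if inference fails

    # Assumed layout: one HDF5 dataset per hooked layer.
    with h5py.File(out_path, "w") as f:
        for name, array in captured.items():
            f.create_dataset(name, data=array)
    return captured


# Toy stand-in for a real network: 8 "residues" with 16 channels each.
toy_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
out_path = os.path.join(tempfile.mkdtemp(), "activations.h5")
saved = collect_activations(toy_model, ["0", "1"], torch.randn(8, 16), out_path)
```

Removing the hook handles in a finally block leaves the model unmodified after collection, which is one way to keep the instrumentation non-invasive.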

train: Train a Matryoshka BatchTopK sparse autoencoder on the collected activations.
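For readers unfamiliar with the two ideas in that name, the following NumPy sketch shows a forward pass and loss under common assumptions. BatchTopK keeps the k * batch_size largest activations across the whole batch rather than k per sample, and the Matryoshka objective sums reconstruction errors over nested prefixes of the dictionary. The function names, the prefix sizes, the pre-encoder bias subtraction, and the tied initialisation are illustrative choices, not Saffron's actual configuration.

```python
import numpy as np


def batch_topk(acts, k):
    """Keep the k * batch_size largest activations across the whole batch."""
    n_keep = k * acts.shape[0]
    flat = acts.ravel()
    if n_keep >= flat.size:
        return acts.copy()
    # Indices of the n_keep largest entries (their relative order is irrelevant).
    keep = np.argpartition(flat, -n_keep)[-n_keep:]
    sparse = np.zeros_like(flat)
    sparse[keep] = flat[keep]
    return sparse.reshape(acts.shape)


def matryoshka_loss(x, w_enc, b_enc, w_dec, b_dec, k, prefix_sizes):
    """Sum of reconstruction errors over nested prefixes of the dictionary."""
    # ReLU encoder; subtracting b_dec first is an assumed convention.
    pre = np.maximum((x - b_dec) @ w_enc + b_enc, 0.0)
    codes = batch_topk(pre, k)
    losses = []
    for m in prefix_sizes:
        # Decode using only the first m latents and their decoder rows.
        recon = codes[:, :m] @ w_dec[:m] + b_dec
        losses.append(float(np.mean(np.sum((x - recon) ** 2, axis=1))))
    return codes, float(np.sum(losses)), losses


rng = np.random.default_rng(0)
batch, d_model, d_dict, k = 32, 16, 64, 4
x = rng.normal(size=(batch, d_model))
w_enc = rng.normal(size=(d_model, d_dict)) / np.sqrt(d_model)
w_dec = w_enc.T.copy()  # tied initialisation, chosen here for simplicity
b_enc = np.zeros(d_dict)
b_dec = np.zeros(d_model)
codes, total_loss, per_prefix = matryoshka_loss(
    x, w_enc, b_enc, w_dec, b_dec, k, prefix_sizes=[16, 32, 64]
)
```

Because every prefix must reconstruct the input on its own, the earliest latents are pushed toward general features and later ones toward finer detail. A real training loop would minimise total_loss with a gradient-based optimiser.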

eval: Score SAE features by activation frequency, top-activating residues, and per-label AUROC.
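The three scores can be sketched as follows. This is an illustrative NumPy version that assumes a (residues x features) matrix of SAE codes and one binary label per residue; the function names and toy data are made up for the example. AUROC is computed with the Mann-Whitney U statistic, counting ties as one half.

```python
import numpy as np


def activation_frequency(codes):
    """Fraction of residues on which each feature is non-zero."""
    return np.mean(codes > 0, axis=0)


def top_residues(codes, feature, n=3):
    """Indices of the n residues with the largest activation for one feature."""
    order = np.argsort(-codes[:, feature], kind="stable")
    return order[:n].tolist()


def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic, with ties counted as 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # undefined without both classes
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Toy data: 6 residues, 2 features, one binary label per residue.
codes = np.array([
    [0.9, 0.0],
    [0.7, 0.0],
    [0.0, 0.2],
    [0.1, 0.0],
    [0.0, 0.0],
    [0.0, 0.5],
])
labels = np.array([1, 1, 0, 0, 0, 0])
freq = activation_frequency(codes)
top = top_residues(codes, feature=0)
per_feature_auroc = [auroc(codes[:, j], labels) for j in range(codes.shape[1])]
```

In this toy data, feature 0 fires more strongly on both labelled residues than on any unlabelled one, so its AUROC is 1.0. Feature 1 is silent on the labelled residues and scores 0.25.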

screen: Run a trained detector bundle on new sequences or structures.

steer: Add or ablate directions in activation space during inference.
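Both operations reduce to simple linear algebra on the activation vectors. The sketch below assumes a (residues x channels) activation matrix and a single steering direction, for instance one SAE decoder row. The function names and the scale value are illustrative, not Saffron's API.

```python
import numpy as np


def add_direction(acts, direction, scale):
    """Shift every activation vector by `scale` along the unit direction."""
    unit = direction / np.linalg.norm(direction)
    return acts + scale * unit


def ablate_direction(acts, direction):
    """Remove the component of every activation vector along the direction."""
    unit = direction / np.linalg.norm(direction)
    return acts - np.outer(acts @ unit, unit)


rng = np.random.default_rng(0)
acts = rng.normal(size=(5, 8))   # 5 residues, 8 channels
direction = rng.normal(size=8)   # stand-in for a learned feature direction
unit = direction / np.linalg.norm(direction)

steered = add_direction(acts, direction, scale=2.0)
ablated = ablate_direction(acts, direction)
```

Ablation is an orthogonal projection, so applying it twice gives the same result as applying it once. During inference, an edit like this would typically run inside a forward hook that returns the modified tensor in place of the layer's original output.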

Model	Input	Use case
RF3	sequence	Folding-time activation collection
RFD3	PDB structure	Partial-diffusion activation collection + steering