Welcome to Inseq! πŸ›

Inseq is a PyTorch-based hackable toolkit to democratize the study of interpretability for sequence generation models. At the moment, Inseq supports a wide set of models from the πŸ€— Transformers library and an ever-growing set of feature attribution methods, leveraging in part the widely-used Captum library. For a quick introduction to common use cases, see the Getting started with Inseq page.
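For instance, the identifiers of the attribution methods available in your installed version can be listed programmatically. The snippet below is a minimal sketch assuming the list_feature_attribution_methods helper exposed by Inseq:

import inseq

# Print the identifiers of all registered feature attribution methods
# (e.g. "integrated_gradients", "saliency", "attention", ...)
print(inseq.list_feature_attribution_methods())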

Using Inseq, feature attribution maps can be generated, saved, reloaded, aggregated and visualized either as HTML (with Jupyter notebook support) or directly in the console using rich. Besides simple attribution, Inseq also supports features like step score extraction, attribution aggregation and customization of attributed functions for more advanced use cases. Refer to the guides in the πŸ› Using Inseq section for more details and examples on specific features.

To give a taste of what Inseq can do in a couple of lines of code, here’s a snippet doing source-side attribution of an English-to-French translation produced by the model Helsinki-NLP/opus-mt-en-fr from πŸ€— Transformers using the IntegratedGradients method with 300 integral approximation steps, and returning the attribution convergence delta and token-level prediction probabilities.

import inseq

# Load the translation model from πŸ€— Transformers together with the attribution method
model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")
# Attribute the generated translation to the source tokens, also returning the
# convergence delta of the integral approximation and token-level prediction probabilities
out = model.attribute(
    "The developer argued with the designer because she did not like the design.",
    n_steps=300,
    return_convergence_delta=True,
    step_scores=["probability"],
)
# Visualize the attribution map (HTML in notebooks, rich-formatted text in the console)
out.show()
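Beyond visualization, the returned attribution object can be inspected and persisted. The lines below are a rough sketch, assuming the sequence_attributions/step_scores fields of the output object and the FeatureAttributionOutput save/load helpers; check the Inseq API reference for the exact interface:

# Access the token-level prediction probabilities computed during attribution
probs = out.sequence_attributions[0].step_scores["probability"]

# Persist the attribution map to disk and reload it later for visualization
out.save("attribution.json")
reloaded = inseq.FeatureAttributionOutput.load("attribution.json")
reloaded.show()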

Inseq is still in early development and is currently maintained by a small team of graduate students working on interpretability for NLP/NLG, led by Gabriele Sarti. We are working hard to add more features and models. If you have any suggestions or feedback, please open an issue on our GitHub repository. Happy hacking! πŸ›

Main Classes