Step Functions

The following functions can be used as attribution targets or step functions in the inseq.models.AttributionModel.attribute() function call.

inseq.attr.step_functions.logit_fn(attribution_model: AttributionModel, forward_output: ModelOutput, target_ids: Tensor[Tensor], **kwargs) → Tensor[Tensor]: Compute the logit of the target_ids from the model’s output logits.

inseq.attr.step_functions.probability_fn(attribution_model: AttributionModel, forward_output: ModelOutput, target_ids: Tensor[Tensor], **kwargs) → Tensor[Tensor]: Compute the probability of the target_ids from the model’s output logits.
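As a rough illustration of the two quantities above, the following pure-Python sketch picks the logit of a target token and converts the logits to a probability with a softmax. It is not the library's implementation: the helper names (`softmax`, `target_logit`, `target_probability`) and the toy three-token vocabulary are assumptions made for the example, and the real functions operate on batched tensors.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def target_logit(logits, target_id):
    """Hypothetical helper: raw score the model assigns to the target token."""
    return logits[target_id]

def target_probability(logits, target_id):
    """Hypothetical helper: softmax probability of the target token."""
    return softmax(logits)[target_id]

# Toy logits for one generation step over a 3-token vocabulary (illustrative values).
step_logits = [2.0, 1.0, 0.1]
target_id = 0

print("logit:", target_logit(step_logits, target_id))
print("probability:", target_probability(step_logits, target_id))
```

The logit is unnormalized, so it is only comparable within a single step, whereas the probability is normalized over the vocabulary.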

inseq.attr.step_functions.entropy_fn(attribution_model: AttributionModel, forward_output: ModelOutput, **kwargs) → Tensor[Tensor]: Compute the entropy of the model’s output distribution.
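The entropy of an output distribution can be sketched as follows. This is a minimal example assuming natural logarithms and a single, unbatched vector of logits; the function names are hypothetical and do not come from the library.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over the vocabulary."""
    probs = softmax(logits)
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

uniform_logits = [0.0, 0.0, 0.0, 0.0]  # model is maximally uncertain
peaked_logits = [10.0, 0.0, 0.0, 0.0]  # model strongly prefers token 0

print("uniform:", entropy(uniform_logits))
print("peaked:", entropy(peaked_logits))
```

Unlike the logit and probability functions, entropy does not depend on target_ids: it summarizes how spread out the whole distribution is at a given step.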

inseq.attr.step_functions.crossentropy_fn(attribution_model: AttributionModel, forward_output: ModelOutput, target_ids: Tensor[Tensor], **kwargs) → Tensor[Tensor]: Compute the cross entropy between the target_ids and the logits. See: https://github.com/ZurichNLP/nmtscore/blob/master/src/nmtscore/models/m2m100.py#L99

inseq.attr.step_functions.perplexity_fn(attribution_model: AttributionModel, forward_output: ModelOutput, target_ids: Tensor[Tensor], **kwargs) → Tensor[Tensor]: Compute the perplexity of the target_ids from the logits. Perplexity is the weighted branching factor: a perplexity of 100 means that, whenever the model tries to guess the next word, it is as confused as if it had to pick between 100 equally likely words. Reference: https://chiaracampagnola.io/2020/05/17/perplexity-in-language-models/
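The relationship between cross entropy and perplexity can be illustrated with a small sketch. It assumes natural logarithms, so perplexity is taken as the exponential of the cross entropy; the base used by the library may differ, and the helper names are hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]

def cross_entropy(logits, target_id):
    """Negative log-probability (in nats) of the target token."""
    return -log_softmax(logits)[target_id]

def perplexity(logits, target_id):
    """Assumption: perplexity as exp(cross entropy), matching the natural log above."""
    return math.exp(cross_entropy(logits, target_id))

# A model that spreads its probability evenly over 100 tokens.
uniform_logits = [0.0] * 100
target_id = 7

print("cross entropy:", cross_entropy(uniform_logits, target_id))
print("perplexity:", perplexity(uniform_logits, target_id))
```

With a uniform distribution over 100 tokens, the perplexity comes out at approximately 100, which matches the "picking between 100 words" intuition given above.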

inseq.attr.step_functions.contrast_prob_diff_fn(attribution_model: AttributionModel, forward_output: ModelOutput, encoder_input_embeds: Tensor[Tensor], encoder_attention_mask: Tensor[Tensor], decoder_input_ids: Tensor[Tensor], decoder_attention_mask: Tensor[Tensor], target_ids: Tensor[Tensor], contrast_ids: Tensor[Tensor], contrast_attention_mask: Tensor[Tensor], **kwargs)

Returns the difference between the next-step probability of a candidate generation target and that of a contrastive alternative. Can be used as an attribution target to answer the question: “Which features were salient in the choice of picking the selected token rather than its contrastive alternative?”

Parameters:

contrast_ids (torch.Tensor) – Tensor of shape [batch_size, seq_len] containing the ids of the contrastive input to be compared to the candidate.
contrast_attention_mask (torch.Tensor) – Tensor of shape [batch_size, seq_len] containing the attention mask for the contrastive input.
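The contrastive score described above can be sketched in pure Python as the gap between two next-step probabilities. This example assumes that the logits for the candidate and contrastive contexts have already been obtained from two separate forward passes; the function `contrast_prob_diff` and the toy logits are illustrative and not part of the library.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrast_prob_diff(target_logits, target_id, contrast_logits, contrast_id):
    """Hypothetical helper: P(target | candidate context) - P(contrast | contrastive context)."""
    target_prob = softmax(target_logits)[target_id]
    contrast_prob = softmax(contrast_logits)[contrast_id]
    return target_prob - contrast_prob

# Illustrative next-step logits from the candidate and the contrastive forward pass.
target_logits = [3.0, 1.0, 0.5]
contrast_logits = [1.0, 1.2, 0.5]

diff = contrast_prob_diff(target_logits, 0, contrast_logits, 1)
print("probability difference:", diff)
```

A positive value means the model assigns more probability to the selected token than to its contrastive alternative; the result always lies between -1 and 1.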

inseq.attr.step_functions.mc_dropout_prob_avg_fn(attribution_model: AttributionModel, forward_output, encoder_input_embeds: Tensor[Tensor], encoder_attention_mask: Tensor[Tensor], decoder_input_ids: Tensor[Tensor], decoder_input_embeds: Tensor[Tensor], decoder_attention_mask: Tensor[Tensor], target_ids: Tensor[Tensor], aux_model: Union[AutoModelForSeq2SeqLM, AutoModelForCausalLM], n_mcd_steps: int = 10, **kwargs)

Returns the average of the probability scores obtained from a pool of noisy predictions computed with MC Dropout. Can be used as an attribution target to compute more robust attribution scores.

Parameters:

aux_model (transformers.AutoModelForSeq2SeqLM or transformers.AutoModelForCausalLM) – Model used to produce noisy probability predictions for target ids. Requirements:
- Must be a model of the same category as the attribution model (e.g. encoder-decoder or decoder-only).
- Must have the same vocabulary as the attribution model to ensure correct probability scores are computed.
- Must contain dropout layers to enable MC Dropout.
n_mcd_steps (int) – The number of prediction steps that should be used to normalize the original output.
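The idea of averaging probabilities over several dropout-perturbed forward passes can be sketched with a toy model. Everything here is an assumption made for illustration: the single linear layer stands in for aux_model, and `toy_forward`, `dropout_p`, and `seed` are hypothetical names that do not appear in the library.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_forward(features, weights, dropout_p, rng):
    """Toy stand-in for a model: inverted dropout on the inputs, then a linear layer."""
    kept = [0.0 if rng.random() < dropout_p else f / (1.0 - dropout_p) for f in features]
    return [sum(w * x for w, x in zip(row, kept)) for row in weights]

def mc_dropout_prob_avg(features, weights, target_id, n_mcd_steps=10, dropout_p=0.1, seed=0):
    """Average the target probability over n_mcd_steps noisy forward passes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_mcd_steps):
        logits = toy_forward(features, weights, dropout_p, rng)
        total += softmax(logits)[target_id]
    return total / n_mcd_steps

features = [1.0, 0.5, -0.3]
weights = [[0.8, 0.1, 0.0], [0.2, 0.9, 0.4], [-0.5, 0.3, 0.7]]  # one row per vocabulary token

print("averaged probability:", mc_dropout_prob_avg(features, weights, target_id=0))
```

Dropout stays active during every pass, so each pass yields a slightly different probability; dividing the sum by n_mcd_steps gives the averaged score.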