bio_embeddings.mutagenesis
BETA: in-silico mutagenesis using the substitution probabilities from ProtTrans-Bert-BFD
-
class bio_embeddings.mutagenesis.ProtTransBertBFDMutagenesis(device: Union[None, str, torch.device] = None, model_directory: Optional[str] = None, half_precision_model: bool = False)
BETA: in-silico mutagenesis using BertForMaskedLM
-
__init__(device: Union[None, str, torch.device] = None, model_directory: Optional[str] = None, half_precision_model: bool = False)
Loads the Bert Model for Masked LM
-
device: torch.device
-
get_sequence_probabilities(sequence: str, temperature: float = 1, start: Optional[int] = None, stop: Optional[int] = None, progress_bar: Optional[tqdm.std.tqdm] = None) → List[Dict[str, float]]
Returns the likelihood of each of the 20 natural amino acids occurring at each residue position between start and stop, given the context of the remainder of the sequence (i.e. by masking the residue with BERT’s mask token and reconstructing the corrupted sequence). Probabilities may be adjusted by a temperature factor. With the default of 1, no adjustment is made.
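The effect of a temperature factor can be illustrated with a small, library-independent sketch. It assumes the common convention of dividing the raw scores (logits) by the temperature before applying the softmax; this page does not state how bio_embeddings applies the factor internally, so treat this as an illustration of the general idea rather than the library's implementation. The function name and the example logits are hypothetical.

```python
import math
from typing import Dict


def softmax_with_temperature(
    logits: Dict[str, float], temperature: float = 1.0
) -> Dict[str, float]:
    """Turn per-residue scores into probabilities, scaled by a temperature.

    Assumption: logits are divided by the temperature before the softmax.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {aa: value / temperature for aa, value in logits.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    max_scaled = max(scaled.values())
    exps = {aa: math.exp(value - max_scaled) for aa, value in scaled.items()}
    total = sum(exps.values())
    return {aa: value / total for aa, value in exps.items()}


# Made-up scores for three residues, for illustration only.
example_logits = {"A": 2.0, "G": 1.0, "W": 0.0}
unadjusted = softmax_with_temperature(example_logits, temperature=1.0)
flattened = softmax_with_temperature(example_logits, temperature=2.0)
```

Under this convention, a temperature of 1 leaves the distribution unchanged, values above 1 flatten it, and values below 1 sharpen it.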
- Parameters
sequence – The amino acid sequence. Please pass whole sequences, not regions
start – the start index (inclusive) of the region for which to compute residue probabilities (starting with 0)
stop – the end (exclusive) of the region for which to compute residue probabilities
temperature – temperature for the softmax computation
progress_bar – optional tqdm progress bar
- Returns
An ordered list with one entry per position in the region, each giving the probability of each of the 20 natural amino acids occurring at that position.
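As a rough sketch of how the returned list might be consumed, the helper below picks the most probable residue at each position. The helper name is hypothetical, and the code assumes that each dictionary is keyed by single-letter amino acid codes; any other keys are ignored. The commented-out lines show how a real instance might be constructed from the signature above, which presumably requires the model weights to be available.

```python
from typing import List, Optional

# The 20 natural amino acids as single-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def most_likely_residues(
    mutagenesis,
    sequence: str,
    start: Optional[int] = None,
    stop: Optional[int] = None,
) -> List[str]:
    """Return the highest-probability amino acid for each position in the region.

    `mutagenesis` is any object exposing get_sequence_probabilities with the
    signature documented above.
    """
    probabilities = mutagenesis.get_sequence_probabilities(
        sequence, start=start, stop=stop
    )
    best = []
    for position_probabilities in probabilities:
        # Assumption: keys are single-letter codes; skip anything else.
        candidates = {
            aa: p for aa, p in position_probabilities.items() if aa in AMINO_ACIDS
        }
        best.append(max(candidates, key=candidates.get))
    return best


# Possible usage with the real class (not executed here):
# from bio_embeddings.mutagenesis import ProtTransBertBFDMutagenesis
# mutagenesis = ProtTransBertBFDMutagenesis(device="cpu")
# most_likely_residues(mutagenesis, "SEQWENCE", start=2, stop=5)
```

Note that the whole sequence is passed and the region is selected with start and stop, in line with the parameter description above.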
-
model: transformers.models.bert.modeling_bert.BertForMaskedLM
-
tokenizer: transformers.models.bert.tokenization_bert.BertTokenizer
-