bio_embeddings.mutagenesis¶
BETA: in-silico mutagenesis using the substitution probabilities from ProtTrans-Bert-BFD
- 
class bio_embeddings.mutagenesis.ProtTransBertBFDMutagenesis(device: Union[None, str, torch.device] = None, model_directory: Optional[str] = None, half_precision_model: bool = False)[source]¶
- BETA: in-silico mutagenesis using BertForMaskedLM
__init__(device: Union[None, str, torch.device] = None, model_directory: Optional[str] = None, half_precision_model: bool = False)[source]¶
- Loads the BERT model for masked language modeling (BertForMaskedLM)
 - 
device: torch.device¶
 - 
get_sequence_probabilities(sequence: str, temperature: float = 1, start: Optional[int] = None, stop: Optional[int] = None, progress_bar: Optional[tqdm.std.tqdm] = None) → List[Dict[str, float]][source]¶
- Returns the likelihood of each of the 20 natural amino acids occurring at each residue position between start (inclusive) and stop (exclusive), given the context of the remainder of the sequence (i.e. by using BERT’s mask token and reconstructing the corrupted sequence). Probabilities may be adjusted by a temperature factor; if it is set to 1 (the default), no adjustment is made.
- Parameters
- sequence – The amino acid sequence. Please pass whole sequences, not regions 
- start – the start index (inclusive) of the region for which to compute residue probabilities (starting with 0) 
- stop – the end (exclusive) of the region for which to compute residue probabilities 
- temperature – temperature for the softmax computation 
- progress_bar – optional tqdm progress bar 
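
The effect of the temperature parameter can be sketched with a plain softmax. This is a minimal illustration that assumes the common convention of dividing the logits by the temperature before the softmax; the library's internal computation may differ. The function name `softmax_with_temperature` and the toy logits are hypothetical and not part of bio_embeddings.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Turn per-residue logits into probabilities, scaled by a temperature.

    Assumption: logits are divided by the temperature before the softmax.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {aa: value / temperature for aa, value in logits.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled.values())
    exps = {aa: math.exp(value - peak) for aa, value in scaled.items()}
    total = sum(exps.values())
    return {aa: value / total for aa, value in exps.items()}


# Toy logits for three residues (made up for illustration only).
toy_logits = {"A": 2.0, "G": 1.0, "L": 0.0}
default = softmax_with_temperature(toy_logits)       # temperature 1: plain softmax
flatter = softmax_with_temperature(toy_logits, 2.0)  # higher temperature: flatter distribution
```

Under this convention a temperature of 1 leaves the distribution unchanged, values above 1 flatten it, and values below 1 sharpen it.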
 
- Returns
- An ordered list with one entry per position in the region, each giving the probabilities of the 20 natural amino acids at that position.
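
As a sketch of how the returned `List[Dict[str, float]]` might be consumed, the helper below picks the most probable residue at each position. It assumes each dictionary is keyed by one-letter amino acid codes and ignores any other keys; the exact key layout is not stated above, so treat that as an assumption. The helper `most_likely_residues` and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

# The 20 natural amino acids as one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def most_likely_residues(
    probabilities: List[Dict[str, float]], start: int = 0
) -> List[Tuple[int, str, float]]:
    """Return (position, residue, probability) for the top residue at each position."""
    results = []
    for offset, position_probs in enumerate(probabilities):
        # Keep only amino-acid keys in case an entry carries extra fields.
        candidates = {aa: p for aa, p in position_probs.items() if aa in AMINO_ACIDS}
        best = max(candidates, key=candidates.get)
        results.append((start + offset, best, candidates[best]))
    return results


# Toy stand-in for the method's return value (made up, truncated to three residues).
toy = [
    {"A": 0.7, "G": 0.2, "L": 0.1},
    {"A": 0.1, "G": 0.3, "L": 0.6},
]
top = most_likely_residues(toy, start=10)

# With the real model (requires the model weights; not executed here):
# from bio_embeddings.mutagenesis import ProtTransBertBFDMutagenesis
# mutagenesis = ProtTransBertBFDMutagenesis(device="cpu")
# probabilities = mutagenesis.get_sequence_probabilities("SEQWENCE", start=2, stop=4)
# top = most_likely_residues(probabilities, start=2)
```

Passing the same `start` used in the call keeps the reported positions aligned with the 0-based indices of the full sequence.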
 - 
model: transformers.models.bert.modeling_bert.BertForMaskedLM¶
 - 
tokenizer: transformers.models.bert.tokenization_bert.BertTokenizer¶
 