bio_embeddings uses poetry to manage dependencies.
Run poetry config virtualenvs.in-project true. This makes sure all Python dependencies are installed in a folder called
.venv (unless you’re using conda).
Clone the repository (git clone https://github.com/sacdallago/bio_embeddings).
Run poetry install -E all. This will create a new virtualenv, which you can then activate (use
deactivate to get back to your normal environment). If you’re already in a conda environment, poetry will use that environment instead.
To check that the environment is active, open a Python console and check that bio_embeddings can be imported.
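As a rough sketch of such a check, the snippet below reports which interpreter prefix is in use and whether the package can be found. The helper names in_virtualenv and is_installed are made up for this illustration and are not part of bio_embeddings. The prefix comparison only detects virtualenv/venv style environments; for conda, look at the printed prefix instead.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True if the interpreter runs inside a virtualenv/venv."""
    # Inside a virtualenv, sys.prefix points at the environment while
    # sys.base_prefix points at the base Python installation.
    # Conda environments are NOT detected by this comparison.
    return sys.prefix != sys.base_prefix


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("interpreter prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
    print("bio_embeddings found:", is_installed("bio_embeddings"))
```

If the prefix points at the .venv folder (or your conda environment) and the package is found, the environment is active.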
We use pytest to check our code, so you can run the tests with
pytest. However, running them all is slow and takes a lot of disk space, so you can use
SKIP_SLOW_TESTS=1 pytest to only run a few fast tests.
Some tests need
RUN_VERY_SLOW_TESTS=1 to be run because they can take a couple of minutes each. For example, you need
RUN_VERY_SLOW_TESTS=1 pytest tests/conservation.py to run the test of the conservation predictor because it uses the large T5 language model.
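For illustration, environment-variable gating like this can be built on pytest.mark.skipif. The following is a minimal sketch under assumptions, not necessarily how bio_embeddings implements it: the helper env_flag, the two marker names and the test functions are hypothetical, and only the variable names SKIP_SLOW_TESTS and RUN_VERY_SLOW_TESTS come from the text above.

```python
import os

import pytest


def env_flag(name: str) -> bool:
    """Return True if the environment variable is set to a non-empty value."""
    return bool(os.environ.get(name))


# Hypothetical markers. The conditions are evaluated once, when the
# module is imported, so the variables must be set before pytest starts.
skip_if_fast_run = pytest.mark.skipif(
    env_flag("SKIP_SLOW_TESTS"), reason="SKIP_SLOW_TESTS is set"
)
needs_very_slow = pytest.mark.skipif(
    not env_flag("RUN_VERY_SLOW_TESTS"),
    reason="Set RUN_VERY_SLOW_TESTS=1 to run this test",
)


@skip_if_fast_run
def test_something_slow():
    # Placeholder body for a test that is skipped in fast runs
    ...


@needs_very_slow
def test_something_very_slow():
    # Placeholder body for a test that only runs on request
    ...
```

With this pattern, a skipped test shows up as skipped (with its reason when run with pytest -rs) rather than silently disappearing from the report.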
To create a new test, either add a new function in an existing file under
tests/, or create a new file starting with
test_ in that folder. All functions whose names start with
test_ inside a
test_*.py file are collected and run by pytest.
To get the project root as pathlib.Path, use
pytestconfig.rootpath. pytest passes the
pytestconfig fixture to your test function when you add it as an argument. Here, we just check the number of entries in test-data/mapping_file.csv:
    from bio_embeddings.utilities import read_mapping_file


    def test_mapping_file_length(pytestconfig):
        mapping_file_path = str(
            pytestconfig.rootpath.joinpath("test-data/mapping_file.csv")
        )
        mapping_file = read_mapping_file(mapping_file_path)
        # Check that the mapping file actually has two rows with data
        assert len(mapping_file) == 2
Note that our CI machine doesn’t have a GPU, so the tests still need to pass without a GPU. For tests that need a GPU you can use the following:
    import pytest
    import torch


    @pytest.mark.skipif(
        not torch.cuda.is_available(), reason="Can't test the GPU if there isn't any"
    )
    def test_my_feature():
        ...
Note that in CI, we skip some embedder tests marked
SKIP_NEGLEGTED_EMBEDDER_TESTS, which cover stale and barely used embedders.