DOCKSTRING: a Python package for easy molecular docking in one line of code.

University of Cambridge

One-sentence summary: our package automates and standardizes ligand preparation for AutoDock Vina to make molecular docking easy, accessible, and reproducible.

schematic of dockstring

Abstract (for chemists)

Molecular docking is widely used by chemists, but much less so in the machine learning community. This is partly because performing docking requires substantial domain knowledge in chemistry (and potentially access to proprietary software). Our work, dockstring, aims to simplify this process and make docking accessible to non-experts. We have created a Python package that automatically prepares ligands, docks them using AutoDock Vina, and parses the results. This allows docking to be performed from a SMILES string with just one line of code. We use this package to create a dataset of docking scores and to define several benchmarks for machine learning algorithms.

Abstract (for ML/CS)

Molecular docking is a widely successful method in drug discovery that estimates how strongly a molecule binds to a protein, the mechanism that underlies the activity of most drugs. Docking normally requires significant domain knowledge to perform, which is why we made a Python package that performs docking automatically. The result is a program that can reliably and reproducibly dock a molecule from just its SMILES string in a single line of Python code. We have used dockstring to create a dataset and to define several benchmarks that are more challenging and realistic than many currently popular benchmarks (e.g. maximizing penalized logP or QED).

Using dockstring

Python package:

See our package's GitHub page for installation instructions and tutorials.
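
As a rough sketch of the one-line workflow, the snippet below assumes the `load_target` and `dock` interface shown in the package tutorials and uses `DRD2` as an example target name. The wrapper `dock_with_dockstring` and the helper `rank_by_docking_score` (with its `dock_fn` parameter) are hypothetical names introduced here for illustration, not part of the package.

```python
from typing import Callable, Iterable, List, Tuple


def dock_with_dockstring(smiles: str, target_name: str = "DRD2") -> float:
    """Dock one SMILES string and return its docking score.

    Assumes dockstring is installed and `target_name` is one of its targets.
    """
    from dockstring import load_target  # imported lazily; needs the package

    target = load_target(target_name)
    score, _aux = target.dock(smiles)  # the "one line" docking call
    return score


def rank_by_docking_score(
    smiles_list: Iterable[str],
    dock_fn: Callable[[str], float] = dock_with_dockstring,
) -> List[Tuple[str, float]]:
    """Hypothetical helper: score each molecule, then sort ascending.

    Vina scores are in kcal/mol, so more negative means stronger predicted binding.
    """
    scored = [(smiles, dock_fn(smiles)) for smiles in smiles_list]
    return sorted(scored, key=lambda pair: pair[1])
```

With dockstring installed, `rank_by_docking_score(["CCO", "c1ccccc1"])` would dock both molecules against the example target. Passing a different `dock_fn` makes it possible to try the ranking logic without running any docking.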

Dataset:

Our dataset is hosted on Figshare. The main dataset is here. Example scripts to download, open, and view the data can be found on GitHub here.
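
For orientation, here is a small sketch of reading docking scores from a local copy of the data using only the standard library. It assumes a tab-separated file with a header row, a `smiles` column, and one column per target. These layout details and the function name `read_target_scores` are assumptions made for illustration, so check them against the example scripts before relying on them.

```python
import csv
import io
from typing import Dict, TextIO


def read_target_scores(
    handle: TextIO, target: str, smiles_column: str = "smiles"
) -> Dict[str, float]:
    """Map each SMILES string to its docking score for one target column."""
    reader = csv.DictReader(handle, delimiter="\t")
    scores: Dict[str, float] = {}
    for row in reader:
        raw = (row.get(target) or "").strip()
        if not raw:
            continue  # skip rows with an empty cell for this target
        scores[row[smiles_column]] = float(raw)
    return scores


# Tiny in-memory stand-in for the real file (made-up values, assumed layout).
sample = io.StringIO("inchikey\tsmiles\tDRD2\nKEY1\tCCO\t-3.1\nKEY2\tC\t\n")
demo_scores = read_target_scores(sample, "DRD2")
```

To read a downloaded file instead, open it with `open(path, newline="")` and pass the handle in place of `sample`.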

Benchmarks:

Tutorials for our benchmarks are located on GitHub. Code for the baseline methods reported in the paper is located here.

BibTeX

Our paper has been published in JCIM! If you use dockstring in any way, please use the following citation:

@article{garciaortegon2022dockstring,
    author = {García-Ortegón, Miguel and Simm, Gregor N. C. and Tripp, Austin J. and Hernández-Lobato, José Miguel and Bender, Andreas and Bacallado, Sergio},
    title = {DOCKSTRING: Easy Molecular Docking Yields Better Benchmarks for Ligand Design},
    journal = {Journal of Chemical Information and Modeling},
    volume = {62},
    number = {15},
    pages = {3486-3502},
    year = {2022},
    doi = {10.1021/acs.jcim.1c01334},
    URL = {https://doi.org/10.1021/acs.jcim.1c01334}
}