Molecular docking is a key computational method in drug discovery, used to model interactions between small molecules and biological targets. Despite its broad application in hit identification, large-scale docking campaigns remain resource-intensive and time-consuming, especially when applied to ultra-large chemical libraries.
Our AI-driven approach combines docking with Active Learning (AL). It addresses these challenges by guiding the search toward promising compounds, so that only a fraction of the library has to be docked.
The active learning process runs for up to 10 iterations and consists of the following steps:
1. Random selection of 0.1% of the compounds (say, 100k) from the chemical space subset
2. Docking of the selected 100k compounds
3. Training of the ML model on the obtained docking results
4. Prediction of docking scores by the ML model for the remaining 99.9% of the subset
5. Selection of the top-scoring 0.1% of compounds according to the model
6. Repeat the cycle: dock the 100k compounds selected by the model in step 5 and retrain the ML model on all docking data collected so far (200k compounds after the second iteration, +100k with each further iteration)
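The steps above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not our production pipeline: `mock_dock` is a hypothetical stand-in for a real docking engine, the surrogate model is a simple ridge regression rather than the actual ML model, compounds are represented by random feature vectors, and lower scores are assumed to be better.

```python
import numpy as np

def mock_dock(features, rng):
    """Hypothetical stand-in for a docking engine (lower score = better)."""
    weights = np.linspace(-1.0, 1.0, features.shape[1])
    return features @ weights + 0.1 * rng.standard_normal(len(features))

def fit_linear(X, y, ridge=1e-3):
    """Toy surrogate model: ridge regression solved in closed form."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # add a bias column
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict_linear(coef, X):
    """Predict docking scores with the fitted surrogate."""
    return np.hstack([X, np.ones((len(X), 1))]) @ coef

def active_learning_docking(features, batch_fraction=0.001,
                            max_iterations=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(features)
    batch = max(1, round(n * batch_fraction))
    docked = np.zeros(n, dtype=bool)   # which compounds were docked
    scores = np.full(n, np.nan)        # docking scores, NaN if never docked

    # Step 1: random 0.1% of the subset
    picked = rng.choice(n, size=batch, replace=False)
    for _ in range(max_iterations):
        # Steps 2 and 6: dock the currently selected batch
        scores[picked] = mock_dock(features[picked], rng)
        docked[picked] = True
        # Step 3: train on all docking results collected so far
        coef = fit_linear(features[docked], scores[docked])
        # Step 4: predict scores for everything not yet docked
        rest = np.flatnonzero(~docked)
        if rest.size == 0:
            break
        predicted = predict_linear(coef, features[rest])
        # Step 5: pick the best predicted 0.1% for the next round
        k = min(batch, rest.size)
        picked = rest[np.argsort(predicted)[:k]]
    return docked, scores

# Toy usage: 20,000 "compounds" described by 8 random features each
features = np.random.default_rng(42).standard_normal((20_000, 8))
docked, scores = active_learning_docking(features)
print(f"docked {docked.sum()} of {len(features)} compounds")
```

In this toy run each batch holds 20 compounds, so 10 iterations dock 200 of the 20,000 compounds. In a real campaign the surrogate would typically be a model trained on molecular fingerprints or graphs, which is an assumption here, not a description of our setup.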
In this way, the active learning model acts as a navigator: over 10 iterations of 0.1% each, docking is run on only about 1% of the chemical space subset, focused on the compounds most likely to score well.
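The docking budget follows from simple arithmetic. The short sketch below assumes a subset of 100 million compounds, which is inferred from the "0.1%, say 100k" example above; `docking_budget` is a hypothetical helper name used for illustration only.

```python
def docking_budget(subset_size, batch_fraction=0.001, iterations=10):
    """Return (compounds per iteration, total docked, share of the subset)."""
    batch = round(subset_size * batch_fraction)
    total = batch * iterations
    return batch, total, total / subset_size

batch, total, share = docking_budget(100_000_000)
print(batch, total, f"{share:.1%}")  # prints: 100000 1000000 1.0%
```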
Example datasets (up to 1 billion compounds):
- Subset of Freedom space 4.0
- Subset of Enamine REAL
- Subset of Freedom space 3.0
- Custom compound set
More information: https://lnkd.in/evwKQsNQ
Contact us: [email protected]