The Discovery Diversity Set (DDS-50K) is a curated collection of 50,240 pre-plated compounds designed to maximize chemical diversity through a smart clustering methodology. This set is exclusively composed of lead-like compounds, representing Enamine’s three primary screening collections: HTS (High-Throughput Screening), Advanced, and Premium.
To further expand chemical diversity and explore potential analogs of the DDS-50K molecules, a set of analogs from the REAL Space was created. This was achieved by breaking down the DDS-50K compounds into their core synthons and matching each synthon with analog fragments from the vast Enamine REAL Space.
Analog identification was conducted using Tanimoto similarity search, a widely used method in cheminformatics to evaluate structural similarity between chemical compounds, combined with scaffold comparison. This approach enabled the selection of closely related analogs, facilitating the creation of a chemically diverse set while preserving the structural relevance of the original molecules.
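As an illustration of the similarity measure mentioned above, here is a minimal sketch of the Tanimoto coefficient, computed as the number of shared fingerprint features divided by the number of features present in either molecule. The fingerprint type and similarity cutoff used for the actual set are not stated here, so the feature sets, the `rank_analogs` helper, and the 0.5 threshold below are hypothetical, and the scaffold comparison step is left out.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two sets of fingerprint on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_analogs(query: set, candidates: dict, threshold: float = 0.5) -> list:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints (hypothetical on-bit indices, not real chemistry).
query_fp = {1, 2, 3, 4}
candidate_fps = {
    "analog_a": {1, 2, 3, 5},
    "analog_b": {1, 2, 3, 4, 6},
    "analog_c": {7, 8},
}

ranked = rank_analogs(query_fp, candidate_fps)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the ranking logic stays the same.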
The resulting compounds were generated through cross-enumeration, combining both analog synthons and exact match synthons. This method ensured a comprehensive exploration of potential molecular structures, leveraging the diversity of analogs while maintaining the integrity of the exact match synthons.
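The cross-enumeration idea can be sketched as a Cartesian product over synthon positions, where each position offers the exact-match synthon plus its analog synthons. This is only a rough illustration under stated assumptions: the synthon labels are hypothetical, excluding the parent combination is our assumption, the cap of 300 simply mirrors the "up to 300 analogs" figure, and the reaction compatibility checks that real enumeration requires are omitted.

```python
from itertools import islice, product


def cross_enumerate(exact_synthons: list, analog_synthons: dict, limit: int = 300) -> list:
    """Combine exact-match and analog synthons position by position.

    Returns at most `limit` combinations, skipping the parent compound
    (the combination made only of exact-match synthons).
    """
    # Each position offers its exact-match synthon followed by its analogs.
    options = [[exact] + list(analog_synthons.get(exact, []))
               for exact in exact_synthons]
    parent = tuple(exact_synthons)
    combos = (combo for combo in product(*options) if combo != parent)
    return list(islice(combos, limit))


# Hypothetical two-synthon compound with analog synthons per position.
parent_synthons = ["A0", "B0"]
analogs = {"A0": ["A1", "A2"], "B0": ["B1"]}

enumerated = cross_enumerate(parent_synthons, analogs)
```

With three options at the first position and two at the second, the product has six combinations, and five remain once the parent is excluded.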
In summary, we have created a set of analogs that includes over 6.2 million compounds. For each compound in the DDS-50K set, up to 300 analogs were identified from the Enamine REAL Space. Each file corresponds to a specific compound and includes its identified analogs, ensuring the data is well-structured and easily accessible for further analysis.
Because it contains diverse analogs for each compound, the resulting set can be used to explore the Structure-Activity Relationships (SAR) of hits identified from screening the DDS-50K set.
We can apply the same algorithm to generate analogs for your own hit molecules.
Contact us at sales@chem-space.com to explore how we can assist you with compound optimization and accelerate your drug discovery process.