The RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. The core algorithms and data structures are written in C++. Wrappers are provided to use the toolkit from Python, Java, or C#. Additionally, the RDKit distribution includes a PostgreSQL-based cartridge that allows molecules to be stored in a relational database and retrieved via substructure and similarity searches.
Please see the RDKit Documentation for more information on installation, usage, cookbooks, and lots more.
Similarity Maps Example
To demonstrate one use of the RDKit, the example below shows how to create a similarity map.
The available types for the atom pair and topological torsion fingerprints are normal (default), hashed, and bit vector (bv). The available types for the Morgan fingerprint are bit vector (bv, default) and count vector (count).
The function generating a similarity map for two fingerprints requires the specification of the fingerprint function and optionally the similarity metric. The default for the latter is the Dice similarity. Using all the default arguments of the Morgan fingerprint function, the similarity map can be generated like this:
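A minimal sketch of this step, assuming RDKit is installed. The two SMILES strings are arbitrary illustrations, not the molecules of the original example. Because the drawing call `SimilarityMaps.GetSimilarityMapForFingerprint` appears to take different arguments in different RDKit releases, the sketch stops at the per-atom weights that a similarity map visualizes, computed with `SimilarityMaps.GetAtomicWeightsForFingerprint`.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem.Draw import SimilarityMaps

# Arbitrary example molecules (phenol as reference, aniline as probe);
# these are assumptions for illustration only.
refmol = Chem.MolFromSmiles('c1ccccc1O')
mol = Chem.MolFromSmiles('c1ccccc1N')

# Default Morgan fingerprint function and the default Dice metric.
weights = SimilarityMaps.GetAtomicWeightsForFingerprint(
    refmol, mol, SimilarityMaps.GetMorganFingerprint)

# The same call with the metric spelled out explicitly.
weights_dice = SimilarityMaps.GetAtomicWeightsForFingerprint(
    refmol, mol, SimilarityMaps.GetMorganFingerprint,
    metric=DataStructs.DiceSimilarity)

# One weight per probe atom: the change in similarity to the reference
# when that atom's contribution is removed from the fingerprint.
for atom, w in zip(mol.GetAtoms(), weights):
    print(atom.GetIdx(), atom.GetSymbol(), round(w, 3))
```

Positive weights mark atoms whose removal lowers the similarity to the reference molecule, and negative weights mark atoms whose removal raises it. Passing a different `metric`, such as `DataStructs.TanimotoSimilarity`, changes the similarity measure without changing the rest of the call.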