Using the InteractiveMolecule widget

For this example we need pandas and RDKit. If you don’t have these packages yet, execute the cell below to install them. Note that you will need conda to install RDKit.

!conda install -y -c conda-forge pandas rdkit

Now we can import trident_chemwidgets, pandas, and the Chem module from RDKit.

[1]:
import trident_chemwidgets as tcw
import pandas as pd
from rdkit import Chem

Now we can create a small function to featurize our molecules with basic information per atom.

IMPORTANT: the order of the data rows in the pandas DataFrame or dict must match the standard ordering of atoms as returned by the RDKit ``.GetAtoms()`` function. You can generate this data any way you see fit (e.g. calculated values from RDKit, as in the function below, or attention values from a Graph Attention Network). The only constraint is the atom ordering. If you are using RDKit-based featurizers like those from DeepChem, this standard ordering should already be the default. Take care when using custom featurizers.

[2]:
def featurize_mol(smiles):
    # Init feature dict
    feature_dict = {
        'Chiral Tag': [],
        'Formal Charge': [],
        'Mass': [],
        'Total Hs': [],
        'Total Valence': []
    }

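    # Parse the SMILES and keep a reference to the molecule while iterating
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f'Could not parse SMILES: {smiles!r}')
    atoms = mol.GetAtoms()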

    # Use RDKit to get all the atom properties
    for atom in atoms:
        feature_dict['Chiral Tag'].append(atom.GetChiralTag())
        feature_dict['Formal Charge'].append(atom.GetFormalCharge())
        feature_dict['Mass'].append(atom.GetMass())
        feature_dict['Total Hs'].append(atom.GetTotalNumHs())
        feature_dict['Total Valence'].append(atom.GetTotalValence())

    return pd.DataFrame.from_dict(feature_dict)
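
The atom-ordering constraint described above is easy to break when the values come from somewhere other than RDKit, for example a model that reports scores keyed by atom index. The following is a minimal sketch of two sanity helpers, using only the standard library. Both `order_by_atom_index` and `check_atom_data` are hypothetical names introduced here for illustration; they are not part of trident_chemwidgets or RDKit. Note that the length check can only catch a wrong number of rows, not rows that are present but shuffled.

```python
def order_by_atom_index(values_by_index, n_atoms):
    """Turn {atom_index: value} into a list sorted by atom index.

    Hypothetical helper: assumes atom indices run from 0 to n_atoms - 1.
    """
    missing = [i for i in range(n_atoms) if i not in values_by_index]
    if missing:
        raise ValueError(f'No value for atom indices {missing}')
    return [values_by_index[i] for i in range(n_atoms)]


def check_atom_data(data, n_atoms):
    """Raise ValueError unless every column holds one value per atom.

    Hypothetical helper: `data` maps a feature name to a list of
    per-atom values, and `n_atoms` is the molecule's atom count.
    """
    bad = {name: len(values) for name, values in data.items()
           if len(values) != n_atoms}
    if bad:
        raise ValueError(f'Expected {n_atoms} values per column, got {bad}')


# Toy scores keyed by atom index, deliberately out of order
scores = {2: 0.1, 0: 0.7, 1: 0.2}
data = {'Score': order_by_atom_index(scores, n_atoms=3)}
check_atom_data(data, n_atoms=3)  # passes silently
```

With RDKit, the atom count can be taken from `mol.GetNumAtoms()` and each atom's position from `atom.GetIdx()`, so the same checks can be applied to a real molecule before handing the data to the widget.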

Here we’ll explore the atom features of the ibuprofen molecule, which has the SMILES string CC(C)CC1=CC=C(C=C1)C(C)C(=O)O. We’ll use the function defined above to compute some atom-level data.

[3]:
atom_data = featurize_mol('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O')
atom_data.head()
[3]:
   Chiral Tag  Formal Charge    Mass  Total Hs  Total Valence
0           0              0  12.011         3              4
1           0              0  12.011         1              4
2           0              0  12.011         3              4
3           0              0  12.011         2              4
4           0              0  12.011         0              4

Now we can use the InteractiveMolecule widget to explore the data attached to each atom.

[4]:
w = tcw.InteractiveMolecule('CC(C)CC1=CC=C(C=C1)C(C)C(=O)O', data=atom_data)
# w # Uncomment this line to run locally

The widget’s smiles attribute holds the SMILES string you passed in when creating it.

[5]:
w.smiles
[5]:
'CC(C)CC1=CC=C(C=C1)C(C)C(=O)O'