GOOD.utils.data

Data-processing utilities, including construction of a molecule PyG graph from a SMILES string (for compatibility).

Functions

batch_input(G, batch_size[, num_nodes, ...])

Repeat a graph batch_size times and pack into a Batch.

from_smiles(smiles[, with_hydrogen, kekulize])

Converts a SMILES string to a torch_geometric.data.data.Data instance.

GOOD.utils.data.batch_input(G: Data, batch_size: int, num_nodes: Optional[int] = None, node_attrs: list = ['color'])[source]

Repeat a graph batch_size times and pack into a Batch.

Parameters
  • G (Data) – The given graph G.

  • batch_size (int) – Batch size.

  • num_nodes (int, optional) – The number of nodes in the graph. If None, it is inferred with maybe_num_nodes. (default: None)

  • node_attrs (list) – Names of the node attributes to preserve. (default: ['color'])

Returns

Repeated graph batch.
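
To make "repeat and pack" concrete, here is a minimal plain-Python sketch of the idea. It is not the GOOD implementation: the helper name repeat_graph and the list-based graph representation are assumptions made for illustration. It follows the way PyG batching generally works, where each copy's node indices are shifted so the copies stay disjoint and a batch vector records which copy each node belongs to.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def repeat_graph(num_nodes: int, edges: List[Edge], batch_size: int):
    """Hypothetical helper: return (batched_edges, batch_vector) for
    batch_size disjoint copies of a single graph."""
    batched_edges: List[Edge] = []
    batch_vector: List[int] = []
    for copy_idx in range(batch_size):
        # Shift node ids so that copy k uses ids k*num_nodes .. (k+1)*num_nodes - 1.
        offset = copy_idx * num_nodes
        batched_edges.extend((u + offset, v + offset) for u, v in edges)
        # Every node of this copy is assigned to graph number copy_idx.
        batch_vector.extend([copy_idx] * num_nodes)
    return batched_edges, batch_vector


# A 3-node path graph 0-1-2, with each undirected edge stored in both directions.
edges, batch = repeat_graph(3, [(0, 1), (1, 0), (1, 2), (2, 1)], batch_size=2)
print(edges)
print(batch)
```

With batch_size=2 the second copy occupies node ids 3 to 5, so the packed result is one large graph with two disconnected components.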

GOOD.utils.data.from_smiles(smiles: str, with_hydrogen: bool = False, kekulize: bool = False)[source]

Converts a SMILES string to a torch_geometric.data.data.Data instance.

Parameters
  • smiles (str) – The SMILES string.

  • with_hydrogen (bool, optional) – If set to True, will store hydrogens in the molecule graph. (default: False)

  • kekulize (bool, optional) – If set to True, converts aromatic bonds to single/double bonds. (default: False)
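
A short usage sketch for the signature above. It assumes that the GOOD package and its RDKit dependency are installed; the wrapper name smiles_to_graphs is hypothetical, and the import is guarded so the snippet still loads when GOOD is missing.

```python
# Assumption: GOOD (and RDKit, which it relies on) is installed.
try:
    from GOOD.utils.data import from_smiles
except ImportError:
    from_smiles = None  # allows the sketch to load without GOOD


def smiles_to_graphs(smiles: str):
    """Hypothetical wrapper: convert one SMILES string without and with hydrogens."""
    if from_smiles is None:
        return None
    # Defaults: with_hydrogen=False, kekulize=False.
    heavy = from_smiles(smiles)
    # Same molecule, this time keeping hydrogens in the graph.
    full = from_smiles(smiles, with_hydrogen=True)
    return heavy, full


graphs = smiles_to_graphs("CCO")  # ethanol
if graphs is not None:
    heavy, full = graphs
    # Compare node counts of the two Data objects.
    print(heavy.num_nodes, full.num_nodes)
```

Passing kekulize=True works the same way and should only change how aromatic bonds are encoded, per the parameter description above.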