chemistry_tools.pubchem.compound

Attention

This package has the following additional requirements:

cawdrey>=0.1.7
mathematical>=0.1.13
pillow>=7.0.0
pyparsing>=2.4.6
tabulate>=0.8.9

These can be installed as follows:

python -m pip install chemistry-tools[pubchem]

Represents a chemical compound.

Data:

C

Invariant TypeVar bound to chemistry_tools.pubchem.compound.Compound.

Classes:

Compound(title, CID, description, **_)

Represents a single record from the PubChem Compound database.

Functions:

compounds_to_frame(compounds)

Construct a DataFrame from a list of Compound objects.

C = TypeVar('C', bound=Compound)

Type:    TypeVar

Invariant TypeVar bound to chemistry_tools.pubchem.compound.Compound.
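A bound TypeVar such as this is typically used so that a factory or classmethod is typed as returning the class it was called on, including subclasses. The following is a minimal sketch of that pattern, not the library's own code: the Compound class here is a stand-in that only mirrors the documented constructor signature, and build and LabelledCompound are hypothetical names introduced for illustration.

```python
from typing import Type, TypeVar


class Compound:
    """Hypothetical stand-in mirroring the documented constructor signature."""

    def __init__(self, title: str, CID: int, description: str, **_) -> None:
        self.title = title
        self.CID = CID
        self.description = description


# The first argument to TypeVar is a string matching the variable name.
C = TypeVar("C", bound=Compound)


def build(cls: Type[C], title: str, CID: int, description: str = "") -> C:
    """Hypothetical factory: returns an instance of whichever class is passed in."""
    return cls(title, CID, description)


class LabelledCompound(Compound):
    """Hypothetical subclass, used only to show the bound at work."""


# A type checker infers LabelledCompound here, not just Compound.
water = build(LabelledCompound, "Water", 962)
```

Because C is bound to Compound, a type checker would reject a call such as build(str, ...), while any subclass of Compound is accepted.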

class Compound(title, CID, description, **_)

Bases: Dictable

Represents a single record from the PubChem Compound database.

The PubChem Compound database is constructed from the Substance database using a standardization and deduplication process. Each Compound is uniquely identified by a CID.

Parameters
  • title (str) – The title of the compound record (usually the name of the compound)

  • CID (int)

  • description (str)

Methods:

__repr__()

Return a string representation of the Compound.

from_cid(cid[, record_type])

Returns the Compound object for the compound with the given CID.

get_iupac_name([type_])

Return the IUPAC name of this compound.

get_properties(properties)

Returns the requested properties for the Compound.

get_property(prop)

Get a single property for the compound.

precache()

Precache all properties for this compound.

to_series()

Return a pandas Series containing Compound data.

Attributes:

atoms

List of Atoms in this Compound.

bonds

List of Bonds between Atoms in this Compound.

cactvs_fingerprint

PubChem CACTVS fingerprint.

canonical_smiles

Canonical SMILES, with no stereochemistry information.

canonicalized

Whether the compound is canonicalized.

charge

The charge of the compound.

cid

The PubChem Compound ID (CID) of this compound.

coordinate_type

The coordinate type of this compound.

elements

List of element symbols for atoms in this Compound.

fingerprint

Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.

has_full_record

Returns whether this compound has a full record available.

iupac_name

The preferred IUPAC name of this compound.

molecular_formula

Molecular formula.

molecular_mass

Molecular Weight.

molecular_weight

Molecular Weight.

smiles

Canonical SMILES, with no stereochemistry information.

synonyms

Returns a list of synonyms for the Compound.

systematic_name

The systematic IUPAC name of this compound.

__repr__()

Return a string representation of the Compound.

Return type

str

property atoms

List of Atoms in this Compound.

Return type

List[Atom]

property bonds

List of Bonds between Atoms in this Compound.

Return type

List[Bond]

property cactvs_fingerprint

PubChem CACTVS fingerprint.

Each bit in the fingerprint represents the presence or absence of one of 881 chemical substructures.

Return type

Optional[str]
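As an illustration of working with such a fingerprint, the sketch below lists which bit positions are set. It assumes the property returns a string of '0' and '1' characters, which this page does not state explicitly; set_bits is a hypothetical helper, not part of the library.

```python
from typing import List, Optional


def set_bits(fingerprint: Optional[str]) -> List[int]:
    """Return the zero-based positions of the '1' characters in a bit string.

    Assumes the fingerprint is a string of '0' and '1' characters.
    A missing fingerprint (None) yields an empty list.
    """
    if fingerprint is None:
        return []
    return [i for i, bit in enumerate(fingerprint) if bit == "1"]
```

With a real record this might be called as set_bits(compound.cactvs_fingerprint); the None branch covers the Optional return type.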

property canonical_smiles

Canonical SMILES, with no stereochemistry information.

Return type

str

property canonicalized

Whether the compound is canonicalized.

Return type

bool

property charge

The charge of the compound.

Return type

int

property cid

The PubChem Compound ID (CID) of this compound.

Return type

int

property coordinate_type

The coordinate type of this compound.

Return type

Optional[str]

property elements

List of element symbols for atoms in this Compound.

Return type

List[str]
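Since this is a flat list of symbols, a per-element tally can be derived with the standard library. The sketch below assumes the list holds one entry per atom, as the description suggests; element_counts is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Iterable


def element_counts(elements: Iterable[str]) -> Dict[str, int]:
    """Count how many times each element symbol occurs in the list."""
    return dict(Counter(elements))
```

With a real record this might be called as element_counts(compound.elements).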

property fingerprint

Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.

Return type

Optional[str]

classmethod from_cid(cid, record_type='2d')

Returns the Compound object for the compound with the given CID.

Return type

Compound
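A minimal sketch of calling this classmethod follows. The wrapper fetch_compound and its loader parameter are hypothetical, added only so that the call can be exercised without network access; with the default loader, the chemistry-tools[pubchem] extra must be installed and the lookup presumably contacts PubChem.

```python
from typing import Any, Callable, Optional


def fetch_compound(
    cid: int,
    record_type: str = "2d",
    loader: Optional[Callable[..., Any]] = None,
) -> Any:
    """Look up a compound by CID using Compound.from_cid, or a substitute loader."""
    if loader is None:
        # Imported lazily so this sketch can be loaded without the package installed.
        from chemistry_tools.pubchem.compound import Compound

        loader = Compound.from_cid
    return loader(cid, record_type=record_type)
```

For example, fetch_compound(962) would request the record with CID 962 using the default '2d' record type.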

get_iupac_name(type_='Systematic')

Return the IUPAC name of this compound.

Parameters

type_ (str) – The type of IUPAC name. Default 'Systematic'.

Return type

Optional[str]
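Because the return type is Optional[str], callers may want to handle a missing name explicitly. The sketch below shows one way to do so; name_or_default is a hypothetical helper, and the SimpleNamespace objects are stand-ins for real Compound instances.

```python
from types import SimpleNamespace
from typing import Any


def name_or_default(compound: Any, type_: str = "Systematic", default: str = "(unnamed)") -> str:
    """Return the IUPAC name of the requested type, or a default if it is None."""
    name = compound.get_iupac_name(type_)
    return name if name is not None else default


# Stand-in objects for demonstration only; a real Compound would be used in practice.
named = SimpleNamespace(get_iupac_name=lambda type_="Systematic": "oxidane")
unnamed = SimpleNamespace(get_iupac_name=lambda type_="Systematic": None)
```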

get_properties(properties)

Returns the requested properties for the Compound.

Parameters

properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties.

Return type

Dict[str, Any]

Returns

Dictionary mapping the property names to their values
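The two accepted input forms can be illustrated with a small normalisation step. This is a sketch of the idea only; how the library parses the argument internally is not stated here, normalise_properties is a hypothetical helper, and the property names in the usage note are assumed from the PubChem PUG REST property table rather than taken from this page.

```python
from typing import List, Sequence, Union


def normalise_properties(properties: Union[Sequence[str], str]) -> List[str]:
    """Turn a comma-separated string or a sequence of names into a clean list."""
    if isinstance(properties, str):
        properties = properties.split(",")
    # Strip surrounding whitespace and drop empty entries.
    return [p.strip() for p in properties if p.strip()]
```

Under these assumptions, "MolecularFormula, MolecularWeight" and ["MolecularFormula", "MolecularWeight"] describe the same request.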

get_property(prop)

Get a single property for the compound.

Parameters

prop (str) – The property to retrieve for the compound. See the table at the start of this chapter for a list of valid properties.

Return type

Any

property has_full_record

Returns whether this compound has a full record available.

Return type

bool

property iupac_name

The preferred IUPAC name of this compound.

Return type

Optional[str]

property molecular_formula

Molecular formula.

Return type

Formula

property molecular_mass

Molecular Weight.

Return type

float

property molecular_weight

Molecular Weight.

Return type

float

precache()

Precache all properties for this compound.

property smiles

Canonical SMILES, with no stereochemistry information.

Return type

str

property synonyms

Returns a list of synonyms for the Compound.

Return type

Optional[List[str]]

property systematic_name

The systematic IUPAC name of this compound.

Return type

Optional[str]

to_series()

Return a pandas Series containing Compound data.

Return type

Series

compounds_to_frame(compounds)

Construct a DataFrame from a list of Compound objects.

Parameters

compounds (Union[Compound, List[Compound]])

Return type

DataFrame
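The signature accepts either a single Compound or a list of them. The sketch below shows the same single-or-list handling using plain dictionaries instead of pandas. It does not reproduce the library's DataFrame: summarise is a hypothetical helper, the choice of columns is an assumption, and the SimpleNamespace object is a stand-in for a real Compound.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Union


def summarise(compounds: Union[Any, List[Any]]) -> List[Dict[str, Any]]:
    """Collect a few documented attributes into one dict per compound."""
    if not isinstance(compounds, list):
        # Mirror the Union[Compound, List[Compound]] signature.
        compounds = [compounds]
    return [
        {
            "cid": c.cid,
            "molecular_formula": str(c.molecular_formula),
            "molecular_weight": c.molecular_weight,
        }
        for c in compounds
    ]


# Stand-in object for demonstration only.
water = SimpleNamespace(cid=962, molecular_formula="H2O", molecular_weight=18.015)
rows = summarise(water)
```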