chemistry_tools.pubchem.compound
Attention
This module has the following additional requirements:
cawdrey>=0.1.7
mathematical>=0.1.13
pillow>=7.0.0; platform_python_implementation == "PyPy" and python_version != "3.6"
pillow>=7.0.0; platform_python_implementation != "PyPy"
pillow<=8.0.0,>=7.0.0; platform_python_implementation == "PyPy" and python_version == "3.6"
pyparsing>=2.4.6
tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Represents a chemical compound.
Data:
C | Invariant TypeVar bound to chemistry_tools.pubchem.compound.Compound.
Classes:
Compound(title, CID, description, **_) | Corresponds to a single record from the PubChem Compound database.
Functions:
|
Construct a |
-
C = TypeVar(C, bound=Compound)
Type: TypeVar
Invariant TypeVar bound to chemistry_tools.pubchem.compound.Compound.
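A bound TypeVar of this kind is typically used to annotate alternative constructors so that they return the type of the class they are called on, including subclasses. The sketch below is a minimal illustration with a stand-in class, not the library's code: `StandInCompound`, `LabelledCompound`, and `from_record` are hypothetical names, and the record values are illustrative. Only the constructor signature `(title, CID, description, **_)` is taken from this page.

```python
from typing import Any, Dict, Type, TypeVar

# Forward reference as a string, because the class is defined further down.
C = TypeVar("C", bound="StandInCompound")


class StandInCompound:
    """Hypothetical stand-in that mirrors the documented constructor signature."""

    def __init__(self, title: str, CID: int, description: str, **_: Any) -> None:
        # ``**_`` swallows any extra keys in a record instead of raising.
        self.title = title
        self.CID = CID
        self.description = description

    @classmethod
    def from_record(cls: Type[C], record: Dict[str, Any]) -> C:
        # Annotating ``cls`` as Type[C] tells type checkers that the result
        # has the type of whichever class the method is called on.
        return cls(**record)


class LabelledCompound(StandInCompound):
    """Hypothetical subclass, used to show that the subclass type is preserved."""


# Illustrative record; the "Unused" key is absorbed by ``**_``.
record = {"title": "Water", "CID": 962, "description": "Example text.", "Unused": 1}
compound = LabelledCompound.from_record(record)
```

Calling the classmethod on the subclass yields a `LabelledCompound`, and a type checker infers that type without a cast.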
-
class Compound(title, CID, description, **_)
Bases: Dictable
Corresponds to a single record from the PubChem Compound database.
The PubChem Compound database is constructed from the Substance database using a standardization and deduplication process. Each Compound is uniquely identified by a CID.
- Parameters
Methods:
__repr__() | Return a string representation of the Compound.
from_cid(cid[, record_type]) | Returns the Compound object for the compound with the given CID.
get_iupac_name([type_]) | Return the IUPAC name of this compound.
get_properties(properties) | Returns the requested properties for the Compound.
get_property(prop) | Get a single property for the compound.
precache() | Precache all properties for this compound.
Return a pandas Series containing Compound data.
Attributes:
List of Atoms in this Compound.
PubChem CACTVS fingerprint.
Canonical SMILES, with no stereochemistry information.
Whether the compound is canonicalized.
The charge of the compound.
Returns the ID of this compound.
The coordinate type of this compound.
List of element symbols for atoms in this Compound.
Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.
Returns whether this compound has a full record available.
The preferred IUPAC name of this compound.
Molecular formula.
Molecular Weight.
Molecular Weight.
Canonical SMILES, with no stereochemistry information.
Returns a list of synonyms for the Compound.
The systematic IUPAC name of this compound.
-
property cactvs_fingerprint
PubChem CACTVS fingerprint.
Each bit in the fingerprint represents the presence or absence of one of 881 chemical substructures.
-
property fingerprint
Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.
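The relationship between the raw `fingerprint` and the 881-bit `cactvs_fingerprint` can be sketched as follows. This is a minimal illustration, assuming the layout commonly described for PubChem fingerprints: a 4-byte length prefix, then the 881 substructure bits, then 7 padding bits. The function name `decode_cactvs_fingerprint` and the sample value are hypothetical, and the library's own implementation may differ.

```python
FINGERPRINT_BITS = 881  # number of substructure keys, per the description above
PREFIX_HEX_CHARS = 8    # assumed 4-byte length prefix = 8 hex characters


def decode_cactvs_fingerprint(raw_hex: str) -> str:
    """Convert a padded, hex-encoded fingerprint to a string of '0'/'1' bits."""
    payload = raw_hex[PREFIX_HEX_CHARS:]
    total_bits = len(payload) * 4
    if total_bits < FINGERPRINT_BITS:
        raise ValueError("fingerprint payload is shorter than 881 bits")
    # zfill restores leading zero bits that int() would otherwise drop.
    bits = format(int(payload, 16), "b").zfill(total_bits)
    # Keep the substructure bits and discard the trailing padding.
    return bits[:FINGERPRINT_BITS]


# Synthetic input: prefix 0x00000371 (= 881), then 111 bytes with only the first bit set.
sample = "00000371" + "80" + "00" * 110
bits = decode_cactvs_fingerprint(sample)
```

Each character of the result then corresponds to one substructure key, so `bits[i] == "1"` would indicate that key `i` is present.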
-
classmethod from_cid(cid, record_type='2d')
Returns the Compound object for the compound with the given CID.
- Return type
-
get_properties(properties)
Returns the requested properties for the Compound.
- Parameters
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound, given as either a comma-separated string or a list. See the table below.
Property | Description
MolecularFormula | Molecular formula.
MolecularWeight | The molecular weight is the sum of all atomic weights of the constituent atoms in a compound, measured in g/mol. In the absence of explicit isotope labelling, averaged natural abundance is assumed. If an atom bears an explicit isotope label, 100% isotopic purity is assumed at this location.
CanonicalSMILES | Canonical SMILES (Simplified Molecular Input Line Entry System) string. It is a unique SMILES string of a compound, generated by a “canonicalization” algorithm.
IsomericSMILES | Isomeric SMILES string. It is a SMILES string with stereochemical and isotopic specifications.
InChI | Standard IUPAC International Chemical Identifier (InChI). It does not allow for user-selectable options in dealing with the stereochemistry and tautomer layers of the InChI string.
InChIKey | Hashed version of the full standard InChI, consisting of 27 characters.
IUPACName | Chemical name systematically determined according to IUPAC nomenclature.
XLogP | Computationally generated octanol-water partition coefficient or distribution coefficient. XLogP is used as a measure of hydrophilicity or hydrophobicity of a molecule.
ExactMass | The mass of the most likely isotopic composition for a single molecule, corresponding to the most intense ion/molecule peak in a mass spectrum.
MonoisotopicMass | The mass of a molecule, calculated using the mass of the most abundant isotope of each element.
TPSA | Topological polar surface area, computed by the algorithm described in the paper by Ertl et al.
Complexity | The molecular complexity rating of a compound, computed using the Bertz/Hendrickson/Ihlenfeldt formula.
Charge | The total (or net) charge of a molecule.
HBondDonorCount | Number of hydrogen-bond donors in the structure.
HBondAcceptorCount | Number of hydrogen-bond acceptors in the structure.
RotatableBondCount | Number of rotatable bonds.
HeavyAtomCount | Number of non-hydrogen atoms.
IsotopeAtomCount | Number of atoms with enriched isotope(s).
AtomStereoCount | Total number of atoms with tetrahedral (sp3) stereo [e.g., (R)- or (S)-configuration].
DefinedAtomStereoCount | Number of atoms with defined tetrahedral (sp3) stereo.
UndefinedAtomStereoCount | Number of atoms with undefined tetrahedral (sp3) stereo.
BondStereoCount | Total number of bonds with planar (sp2) stereo [e.g., (E)- or (Z)-configuration].
DefinedBondStereoCount | Number of bonds with defined planar (sp2) stereo.
UndefinedBondStereoCount | Number of bonds with undefined planar (sp2) stereo.
CovalentUnitCount | Number of covalently bound units.
Volume3D | Analytic volume of the first diverse conformer (default conformer) for a compound.
XStericQuadrupole3D | The x component of the quadrupole moment (Qx) of the first diverse conformer (default conformer) for a compound.
YStericQuadrupole3D | The y component of the quadrupole moment (Qy) of the first diverse conformer (default conformer) for a compound.
ZStericQuadrupole3D | The z component of the quadrupole moment (Qz) of the first diverse conformer (default conformer) for a compound.
FeatureCount3D | Total number of 3D features (the sum of FeatureAcceptorCount3D, FeatureDonorCount3D, FeatureAnionCount3D, FeatureCationCount3D, FeatureRingCount3D and FeatureHydrophobeCount3D).
FeatureAcceptorCount3D | Number of hydrogen-bond acceptors of a conformer.
FeatureDonorCount3D | Number of hydrogen-bond donors of a conformer.
FeatureAnionCount3D | Number of anionic centers (at pH 7) of a conformer.
FeatureCationCount3D | Number of cationic centers (at pH 7) of a conformer.
FeatureRingCount3D | Number of rings of a conformer.
FeatureHydrophobeCount3D | Number of hydrophobes of a conformer.
ConformerModelRMSD3D | Conformer sampling RMSD in Å.
EffectiveRotorCount3D | Number of effective rotors in the compound (a measure of its conformational flexibility).
ConformerCount3D | The number of conformers in the conformer model for a compound.
Fingerprint2D | Base64-encoded PubChem Substructure Fingerprint of a molecule.
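Because `properties` may be given either as a comma-separated string or as a list, calling code sometimes needs to treat both forms uniformly. The helper below is a hypothetical sketch of that normalisation. It is not part of chemistry_tools, and the library's internal handling may differ. The property names are taken from the table above.

```python
from typing import List, Sequence, Union


def normalise_properties(properties: Union[Sequence[str], str]) -> List[str]:
    """Return the requested property names as a list of stripped strings."""
    if isinstance(properties, str):
        # Split the comma-separated form and drop empty fragments.
        return [name.strip() for name in properties.split(",") if name.strip()]
    return [name.strip() for name in properties]


from_string = normalise_properties("MolecularFormula, MolecularWeight,XLogP")
from_list = normalise_properties(["MolecularFormula", "MolecularWeight", "XLogP"])
```

Either form could then be passed to `get_properties`, for example `compound.get_properties("MolecularFormula,XLogP")`, where `compound` is assumed to be an existing Compound instance.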
-
get_property(prop)
Get a single property for the compound.
- Parameters
prop (str) – The property to retrieve for the compound. Valid property names are those listed in the table under get_properties above.
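The difference between the MolecularWeight and MonoisotopicMass properties can be made concrete with a small calculation for water (H2O). This is an illustrative sketch rather than anything the library does: the dictionaries and the `mass` helper are hypothetical, and the atomic masses are rounded standard values.

```python
# Standard atomic weights (natural-abundance averages), rounded, in g/mol.
AVERAGE_MASS = {"H": 1.008, "O": 15.999}
# Masses of the most abundant isotopes (1H and 16O), rounded, in Da.
MONOISOTOPIC_MASS = {"H": 1.00782503, "O": 15.99491462}

WATER = {"H": 2, "O": 1}  # element symbol -> atom count


def mass(composition, table):
    """Sum the per-element masses weighted by atom count."""
    return sum(table[element] * count for element, count in composition.items())


molecular_weight = mass(WATER, AVERAGE_MASS)
monoisotopic_mass = mass(WATER, MONOISOTOPIC_MASS)
print(f"MolecularWeight  ~ {molecular_weight:.3f}")
print(f"MonoisotopicMass ~ {monoisotopic_mass:.5f}")
```

The two values differ because MolecularWeight averages over natural isotope abundances, whereas MonoisotopicMass uses only the most abundant isotope of each element.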
- Return type
-
property has_full_record
Returns whether this compound has a full record available.
- Return type