chemistry_tools.pubchem.properties
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions and classes to access properties of compounds in the PubChem database.
Data:
PROPERTY_MAP | Allows properties to optionally be specified as underscore_separated, consistent with Compound attributes |
valid_properties | Properties for PubChem REST API |
Classes:
PropData | Metadata about a property. |
PubChemProperty | Represents a property parsed from the full PubChem record. |
Functions:
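force_valid_properties | Coerce properties into a list of strings and exclude any invalid properties, or raise a ValueError if that is not possible. |
get_properties | Returns the requested properties for the compound with the given identifier. |
get_property | Returns the requested property for the compound with the given identifier. |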
|
Parse raw data from the |
rest_get_properties | Returns the properties for the compound with the given identifier in the desired format. |
rest_get_properties_json | Returns the properties for the compound with the given identifier as a dictionary. |
-
PROPERTY_MAP= {'atom_stereo_count': 'AtomStereoCount', 'bond_stereo_count': 'BondStereoCount', 'canonical_smiles': 'CanonicalSMILES', 'charge': 'Charge', 'complexity': 'Complexity', 'conformer_count_3d': 'ConformerCount3D', 'conformer_model_rmsd_3d': 'ConformerModelRMSD3D', 'covalent_unit_count': 'CovalentUnitCount', 'defined_atom_stereo_count': 'DefinedAtomStereoCount', 'defined_bond_stereo_count': 'DefinedBondStereoCount', 'effective_rotor_count_3d': 'EffectiveRotorCount3D', 'exact_mass': 'ExactMass', 'feature_acceptor_count_3d': 'FeatureAcceptorCount3D', 'feature_anion_count_3d': 'FeatureAnionCount3D', 'feature_cation_count_3d': 'FeatureCationCount3D', 'feature_count_3d': 'FeatureCount3D', 'feature_donor_count_3d': 'FeatureDonorCount3D', 'feature_hydrophobe_count_3d': 'FeatureHydrophobeCount3D', 'feature_ring_count_3d': 'FeatureRingCount3D', 'fingerprint_2d': 'Fingerprint2D', 'h_bond_acceptor_count': 'HBondAcceptorCount', 'h_bond_donor_count': 'HBondDonorCount', 'heavy_atom_count': 'HeavyAtomCount', 'inchi': 'InChI', 'inchikey': 'InChIKey', 'isomeric_smiles': 'IsomericSMILES', 'isotope_atom_count': 'IsotopeAtomCount', 'iupac_name': 'IUPACName', 'molecular_formula': 'MolecularFormula', 'molecular_weight': 'MolecularWeight', 'monoisotopic_mass': 'MonoisotopicMass', 'rotatable_bond_count': 'RotatableBondCount', 'tpsa': 'TPSA', 'undefined_atom_stereo_count': 'UndefinedAtomStereoCount', 'undefined_bond_stereo_count': 'UndefinedBondStereoCount', 'volume3d': 'Volume3D', 'volume_3d': 'XStericQuadrupole3D', 'x_steric_quadrupole_3d': 'YStericQuadrupole3D', 'xlogp': 'XLogP', 'y_steric_quadrupole_3d': 'ZStericQuadrupole3D'} -
Allows properties to optionally be specified as underscore_separated, consistent with Compound attributes
-
namedtuple
PropData(name, description, type, attr_name)
Bases: NamedTuple
Metadata about a property.
- Fields
name (str) – The name of the property.
description (str) – The description of the property.
type (Callable) – The type of the property.
attr_name (str) – The Python attribute name of the property in a chemistry_tools.pubchem.compound.Compound.
-
__repr__() Return a nicely formatted representation string
-
namedtuple
PubChemProperty(label, name=None, value=None, dtype=None, source=None)
Bases: NamedTuple
Represents a property parsed from the full PubChem record.
- Fields
-
force_valid_properties(properties)
Coerce properties into a list of strings and exclude any invalid properties, or raise a ValueError if that is not possible.
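The coercion described above can be pictured with a minimal, self-contained sketch. It is not the library's implementation: `coerce_properties` is a hypothetical helper, the lookup tables are short excerpts of `PROPERTY_MAP` and `valid_properties`, and the rule that only non-string input triggers the `ValueError` is an assumption.

```python
from typing import List, Sequence, Union

# Short excerpts for illustration; the real tables are much longer.
PROPERTY_MAP_EXCERPT = {
    "molecular_formula": "MolecularFormula",
    "molecular_weight": "MolecularWeight",
    "iupac_name": "IUPACName",
}
VALID_EXCERPT = set(PROPERTY_MAP_EXCERPT.values())


def coerce_properties(properties: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical stand-in for the coercion step, not the library's code."""
    if isinstance(properties, str):
        # A comma-separated string becomes a list of names.
        candidates = properties.split(",")
    else:
        try:
            candidates = list(properties)
        except TypeError:
            raise ValueError("properties must be a string or a sequence") from None

    if not all(isinstance(item, str) for item in candidates):
        # Assumption: entries that are not strings cannot be coerced.
        raise ValueError("every property must be a string")

    result = []
    for item in candidates:
        name = item.strip()
        # Translate underscore_separated names to the REST API spelling.
        name = PROPERTY_MAP_EXCERPT.get(name, name)
        if name in VALID_EXCERPT:
            result.append(name)  # names that are not valid are dropped
    return result
```

Calling `coerce_properties("molecular_weight, IUPACName,bogus")` in this sketch keeps the two recognised names and drops `bogus`.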
-
get_properties(identifier, properties='', namespace=<PubChemNamespace.Name: 'name'>, as_dataframe=False)
Returns the requested properties for the compound with the given identifier. As more than one compound may be identified, the results are returned in a list.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
as_dataframe (bool) – Automatically extract the properties into a pandas DataFrame. Default False.
- Raises
ValueError – If the response body does not contain valid JSON.
NotFoundError – If the compound with the requested identifier was not found in PubChem.
- Return type
- Returns
List of dictionaries mapping properties to values
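To make the "list of dictionaries" return shape concrete, here is a minimal sketch that pulls the rows out of a property response. It assumes the `PropertyTable` / `Properties` layout that PubChem's PUG REST service uses for property requests. `extract_property_rows` is a hypothetical helper, and the sample body is hand-written for illustration rather than a captured response.

```python
import json
from typing import Any, Dict, List

# Hand-written sample in the PUG REST property-table layout.
SAMPLE_RESPONSE = """
{"PropertyTable": {"Properties": [
    {"CID": 2244, "MolecularFormula": "C9H8O4"},
    {"CID": 702, "MolecularFormula": "C2H6O"}
]}}
"""


def extract_property_rows(body: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: return one dictionary per matched compound."""
    # json.JSONDecodeError subclasses ValueError, so invalid JSON
    # surfaces as the ValueError listed under "Raises".
    data = json.loads(body)
    return data["PropertyTable"]["Properties"]


rows = extract_property_rows(SAMPLE_RESPONSE)
for row in rows:
    print(row["CID"], row["MolecularFormula"])
```

Each compound that matches the identifier contributes one dictionary, which is why the result is a list even when a single name is looked up.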
-
get_property(identifier, property='', namespace=<PubChemNamespace.Name: 'name'>)
Returns the requested property for the compound with the given identifier.
This convenience function retrieves a single property for a single compound. If you require multiple properties and/or properties for multiple compounds, use chemistry_tools.pubchem.properties.get_properties, which helps reduce the burden on the PubChem servers.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up.
property (str) – The property to retrieve for the compound. See the table at the start of this chapter for a list of valid properties. Default ''.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
- Raises
ValueError – If the response body does not contain valid JSON.
NotFoundError – If the compound with the requested identifier was not found in PubChem.
- Return type
- Returns
The requested property. Type depends on the property requested.
-
rest_get_properties(identifier, namespace=<PubChemNamespace.Name: 'name'>, properties='', format_=<PubChemFormats.CSV: 'CSV'>)
Returns the properties for the compound with the given identifier in the desired format.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
format_ (Union[PubChemFormats, str]) – The format to obtain the data in. Default <PubChemFormats.CSV: 'CSV'>.
- Return type
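As a rough picture of how the identifier, namespace, properties and format combine into one request, the sketch below assembles a URL following the documented PUG REST layout (`compound/<namespace>/<identifiers>/property/<properties>/<format>`). `build_property_url` and its defaults are hypothetical, and whether the library builds its requests this way is an assumption.

```python
from typing import Sequence, Union
from urllib.parse import quote

API_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(
    identifier: Union[str, int, Sequence[Union[str, int]]],
    namespace: str = "name",
    properties: Union[Sequence[str], str] = "MolecularFormula",
    format_: str = "CSV",
) -> str:
    """Hypothetical helper: assemble a PUG REST property request URL."""
    if isinstance(identifier, (str, int)):
        identifier = [identifier]
    # Join multiple identifiers with commas, then percent-encode
    # everything except the commas themselves.
    ids = quote(",".join(str(item) for item in identifier), safe=",")

    if not isinstance(properties, str):
        properties = ",".join(properties)

    return f"{API_BASE}/compound/{namespace}/{ids}/property/{properties}/{format_}"


print(build_property_url([2244, 702], "cid", ["MolecularFormula", "MolecularWeight"], "JSON"))
```

Joining the identifiers into a single path segment is what lets one request cover several CIDs.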
-
rest_get_properties_json(identifier, namespace=<PubChemNamespace.Name: 'name'>, properties='', **kwargs)
Returns the properties for the compound with the given identifier as a dictionary.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[str, PubChemNamespace]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
**kwargs – Optional arguments that json.loads takes.
- Raises
ValueError – If the response body does not contain valid JSON.
- Return type
- Returns
Parsed JSON data
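The `**kwargs` pass-through can be illustrated with the standard library alone. In this sketch, `parse_body` is a hypothetical stand-in for the parsing step, and the body is a hand-written sample rather than a captured response. `parse_float` is a genuine `json.loads` argument.

```python
import json
from decimal import Decimal

# Hand-written sample body for illustration.
BODY = '{"PropertyTable": {"Properties": [{"CID": 2244, "XLogP": 1.2}]}}'


def parse_body(body: str, **kwargs):
    """Hypothetical stand-in: forward extra keyword arguments to json.loads."""
    return json.loads(body, **kwargs)


plain = parse_body(BODY)
exact = parse_body(BODY, parse_float=Decimal)

# Without extra arguments, JSON numbers with a fraction become floats.
print(type(plain["PropertyTable"]["Properties"][0]["XLogP"]).__name__)
# With parse_float=Decimal they are parsed as Decimal instead.
print(type(exact["PropertyTable"]["Properties"][0]["XLogP"]).__name__)
```

Forwarding keyword arguments this way leaves decisions such as numeric precision to the caller.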
-
valid_properties= {'AtomStereoCount': <class 'int'>, 'BondStereoCount': <class 'int'>, 'CanonicalSMILES': <class 'str'>, 'Charge': <class 'int'>, 'Complexity': <class 'float'>, 'ConformerCount3D': <class 'int'>, 'ConformerModelRMSD3D': <class 'float'>, 'CovalentUnitCount': <class 'int'>, 'DefinedAtomStereoCount': <class 'int'>, 'DefinedBondStereoCount': <class 'int'>, 'EffectiveRotorCount3D': <class 'int'>, 'ExactMass': <class 'float'>, 'FeatureAcceptorCount3D': <class 'int'>, 'FeatureAnionCount3D': <class 'int'>, 'FeatureCationCount3D': <class 'int'>, 'FeatureCount3D': <class 'int'>, 'FeatureDonorCount3D': <class 'int'>, 'FeatureHydrophobeCount3D': <class 'int'>, 'FeatureRingCount3D': <class 'int'>, 'Fingerprint2D': <class 'str'>, 'HBondAcceptorCount': <class 'int'>, 'HBondDonorCount': <class 'int'>, 'HeavyAtomCount': <class 'int'>, 'IUPACName': <class 'str'>, 'InChI': <class 'str'>, 'InChIKey': <class 'str'>, 'IsomericSMILES': <class 'str'>, 'IsotopeAtomCount': <class 'int'>, 'MolecularFormula': <bound method Formula.from_string of <class 'chemistry_tools.formulae.formula.Formula'>>, 'MolecularWeight': <class 'float'>, 'MonoisotopicMass': <class 'float'>, 'RotatableBondCount': <class 'int'>, 'TPSA': <class 'float'>, 'UndefinedAtomStereoCount': <class 'int'>, 'UndefinedBondStereoCount': <class 'int'>, 'Volume3D': <class 'str'>, 'XLogP': <class 'float'>, 'XStericQuadrupole3D': <class 'float'>, 'YStericQuadrupole3D': <class 'float'>, 'ZStericQuadrupole3D': <class 'float'>} -
Properties for PubChem REST API
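Because each value in `valid_properties` is a callable, one plausible use of the mapping is to convert raw response values to the listed type. The sketch below uses a three-entry excerpt. `coerce_row` is a hypothetical helper, and applying the mapping this way is an assumption rather than documented behaviour.

```python
from typing import Any, Callable, Dict

# Excerpt of valid_properties; the full mapping covers every REST property.
TYPES_EXCERPT: Dict[str, Callable[[Any], Any]] = {
    "HeavyAtomCount": int,
    "MolecularWeight": float,
    "IUPACName": str,
}


def coerce_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: convert each known property with its callable."""
    converted = {}
    for key, value in row.items():
        # Keys without an entry (such as CID) are passed through unchanged.
        converter = TYPES_EXCERPT.get(key, lambda raw: raw)
        converted[key] = converter(value)
    return converted


raw_row = {"CID": 2244, "HeavyAtomCount": "13", "MolecularWeight": "180.16"}
print(coerce_row(raw_row))
```

Keeping the converters in a dictionary means an entry such as `MolecularFormula`, whose value is `Formula.from_string`, would need no special handling.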