Chemistry Tools
Python tools for analysis of chemical compounds.
Installation
To install with pip:
python3 -m pip install chemistry_tools --user
To install with conda, first add the required channels:
conda config --add channels https://conda.anaconda.org/conda-forge
conda config --add channels https://conda.anaconda.org/domdfcoding
Then install:
conda install chemistry_tools
To install the latest version from GitHub:
python3 -m pip install git+https://github.com/domdfcoding/chemistry_tools@master --user
Contents
chemistry_tools.elements
Properties of the chemical elements.
Each chemical element is represented as an object instance. Physicochemical and descriptive properties of the elements are stored as instance attributes.
Originally created by Christoph Gohlke. Licensed under the BSD 3-Clause license.
Examples
>>> from chemistry_tools.elements import ELEMENTS
>>> ele = ELEMENTS['C']
>>> ele.number
6
>>> ele.symbol
'C'
>>> ele.name
'Carbon'
>>> ele.description[:21]
'Carbon is a member of'
>>> ele.eleconfig
'[He] 2s2 2p2'
>>> ele.eleconfig_dict
{(1, 's'): 2, (2, 's'): 2, (2, 'p'): 2}
>>> str(ELEMENTS[6])
'Carbon'
>>> len(ELEMENTS)
109
>>> sum(ele.mass for ele in ELEMENTS)
14693.181589001004
>>> for ele in ELEMENTS:
...     ele.validate()
alkali_metals
Group 1: Alkali Metals in the Periodic Table.
alkaline_earth_metals
Group 2: Alkaline Earth Metals in the Periodic Table.
transition_metals
Transition Metals block in the Periodic Table.
pnictogens
Group 15: Pnictogens in the Periodic Table.
chalcogens
Group 16: Chalcogens in the Periodic Table.
noble_gases
Group 18: Noble Gases in the Periodic Table.
lanthanides
Lanthanides (or lanthanoids) in the Periodic Table.
classes
Provides classes to model periodic table elements.
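Classes:
Element | Chemical element.
Elements | Ordered dict of Elements with lookup by number, symbol, and name.
HeavyHydrogen | Subclass of Element to handle the Heavy Hydrogen isotopes Deuterium and Tritium.
Isotope | Isotope massnumber, relative atomic mass, and abundance.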
Data:
Type alias for isotope dictionaries. |
-
class Element(number, symbol, name, group=0, period=0, block='', series=0, mass=0.0, eleneg=0.0, eleaffin=0.0, covrad=0.0, atmrad=0.0, vdwrad=0.0, tboil=0.0, tmelt=0.0, density=0.0, eleconfig='', oxistates='', ionenergy=None, isotopes=None, description='')
Bases: Dictable
Chemical element.
- Parameters
number (int) – The atomic number of the element.
symbol (str) – The chemical symbol of the element.
name (str) – The name of the element in English.
group (int) – The group of the element in the periodic table. Default 0.
period (int) – The period of the element in the periodic table. Default 0.
block (str) – The block of the element in the periodic table. Default ''.
series (int) – Index to chemical series. Default 0.
mass (float) – The relative atomic mass. Default 0.0.
eleneg (float) – The electronegativity (Pauling scale). Default 0.0.
eleaffin (float) – The electron affinity in eV. Default 0.0.
covrad (float) – The covalent radius in Angstrom. Default 0.0.
atmrad (float) – The atomic radius in Angstrom. Default 0.0.
vdwrad (float) – The Van der Waals radius in Angstrom. Default 0.0.
tboil (float) – The boiling temperature in K. Default 0.0.
tmelt (float) – The melting temperature in K. Default 0.0.
density (float) – The density at 295 K, in g/cm³ or g/L. Default 0.0.
eleconfig (str) – The ground state electron configuration. Default ''.
oxistates (str) – The oxidation states. Default ''.
ionenergy (Optional[Tuple]) – The ionization energies in eV. Default None.
isotopes (Optional[Dict[int, Union[Isotope, Tuple[float, float]]]]) – The isotopic composition. A mapping of isotope mass numbers to Isotope objects. Default None.
description (str) – A description of the element. Default ''.
Methods:
__repr__() – Return a string representation of the Element.
__str__() – Return str(self).
validate() – Check consistency of the data.
Attributes:
The Atomic radius in Angstrom.
The Block of the element in the periodic table.
The Covalent radius in Angstrom.
The density at 295K in g/cm³ respectively g/L.
A description of the element.
The electron affinity in eV.
The Ground state electron configuration.
The ground state electron configuration.
The number of electrons in the element.
The Electronegativity (Pauling scale).
The number of electrons per shell as tuple.
The relative atomic mass calculated from the isotopic composition.
The group of the element in the periodic table.
The ionization energies in eV.
The Isotopic composition.
The relative atomic mass.
The relative atomic mass.
The name of the element in English.
The number of neutrons in the most abundant natural stable isotope.
The mass number of the most abundant natural stable isotope.
The atomic number of the element.
The oxidation states.
The Period of the element in the periodic table.
The number of protons in the element.
Index to chemical series.
The chemical symbol of the element.
The boiling temperature in K.
The melting temperature in K.
The Van der Waals radius in Angstrom.
-
property
eleconfig_dict
The ground state electron configuration.
Mapping of Tuple(shell, subshell): electrons.
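The shape of this mapping can be pictured with a small standalone sketch. The helper `parse_eleconfig` and the `NOBLE_GAS_CORES` table below are hypothetical names introduced for illustration; they are not part of chemistry_tools, and the table only covers two cores. The sketch expands a configuration string such as '[He] 2s2 2p2' into the same (shell, subshell) to electrons mapping shown in the carbon example.

```python
import re
from typing import Dict, Tuple

# Hypothetical lookup of noble-gas cores, limited to two entries for illustration.
NOBLE_GAS_CORES = {
    "He": "1s2",
    "Ne": "1s2 2s2 2p6",
}


def parse_eleconfig(eleconfig: str) -> Dict[Tuple[int, str], int]:
    """Expand a configuration string into {(shell, subshell): electrons}."""
    config: Dict[Tuple[int, str], int] = {}
    for token in eleconfig.split():
        if token.startswith("["):
            # Expand the noble-gas core, e.g. "[He]" -> "1s2".
            config.update(parse_eleconfig(NOBLE_GAS_CORES[token.strip("[]")]))
        else:
            match = re.fullmatch(r"(\d+)([spdf])(\d+)", token)
            if match is None:
                raise ValueError(f"Unrecognised term: {token!r}")
            shell, subshell, electrons = match.groups()
            config[(int(shell), subshell)] = int(electrons)
    return config


carbon = parse_eleconfig("[He] 2s2 2p2")
```

Tuples are used as keys because a shell number alone does not identify a subshell; the pair (2, 'p') does.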
-
property
exactmass
The relative atomic mass calculated from the isotopic composition.
- Return type
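As a rough illustration, a mass derived from the isotopic composition is usually the abundance-weighted mean of the isotope masses. The sketch below assumes that definition; the function name `mass_from_isotopes` and the carbon values are illustrative and are not taken from the library.

```python
from typing import Dict, Tuple


def mass_from_isotopes(isotopes: Dict[int, Tuple[float, float]]) -> float:
    """Abundance-weighted mean of isotope masses.

    ``isotopes`` maps mass number -> (relative atomic mass, abundance),
    with abundances given as fractions that sum to 1.
    """
    return sum(mass * abundance for mass, abundance in isotopes.values())


# Illustrative values for carbon (assumed, not read from chemistry_tools).
carbon_isotopes = {
    12: (12.0, 0.9893),
    13: (13.0033548378, 0.0107),
}

carbon_mass = mass_from_isotopes(carbon_isotopes)
```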
-
property
isotopes
The Isotopic composition.
keys: isotope mass number
values: Isotope(relative atomic mass, abundance)
-
property
molecular_weight
The relative atomic mass.
Ratio of the average mass of atoms.
- Return type
-
property
neutrons
The number of neutrons in the most abundant natural stable isotope.
- Return type
-
validate
()[source] Check consistency of the data.
- Raises
ValueError – If there are any validation issues.
-
class Elements(*elements)
Ordered dict of Elements with lookup by number, symbol, and name.
- Parameters
*elements (Element) – The elements to add to the dictionary.
Methods:
__contains__(item) – Return key in self.
__getitem__(key) – Return self[key].
__iter__() – Returns an iterator over the elements, in order.
__len__() – Returns the number of elements.
__repr__() – Return a string representation of the Elements.
__str__() – Return str(self).
add_alternate_spelling(element, spelling) – Adds an alternate spelling for an element.
split_isotope(string) – Returns the symbol and mass number for the isotope represented by string.
Attributes:
The names of the elements, all in lowercase.
The names of the elements.
The symbols of the elements.
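The idea of one ordered container with lookup by number, symbol, and name can be sketched as follows. `MiniElement` and `MiniElements` are hypothetical stand-ins written for this example; the library's own classes carry far more data and behaviour.

```python
from typing import Dict, Iterator, NamedTuple, Union


class MiniElement(NamedTuple):
    number: int
    symbol: str
    name: str


class MiniElements:
    """Hypothetical container: ordered, with lookup by number, symbol, or name."""

    def __init__(self, *elements: MiniElement) -> None:
        self._ordered = list(elements)
        # One index shared by all three kinds of key.
        self._index: Dict[Union[int, str], MiniElement] = {}
        for element in elements:
            self._index[element.number] = element
            self._index[element.symbol] = element
            self._index[element.name] = element

    def __getitem__(self, key: Union[int, str]) -> MiniElement:
        return self._index[key]

    def __contains__(self, key: object) -> bool:
        return key in self._index

    def __iter__(self) -> Iterator[MiniElement]:
        # Iteration yields each element once, in insertion order.
        return iter(self._ordered)

    def __len__(self) -> int:
        return len(self._ordered)


table = MiniElements(
    MiniElement(1, "H", "Hydrogen"),
    MiniElement(6, "C", "Carbon"),
)
```

Keeping the ordered list separate from the index means `len()` and iteration count each element once, even though every element is reachable through three keys.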
-
class HeavyHydrogen(number, symbol, name, group=0, period=0, block='', series=0, mass=0.0, eleneg=0.0, eleaffin=0.0, covrad=0.0, atmrad=0.0, vdwrad=0.0, tboil=0.0, tmelt=0.0, density=0.0, eleconfig='', oxistates='', ionenergy=None, isotopes=None, description='')
Bases: Element
Subclass of Element to handle the Heavy Hydrogen isotopes Deuterium and Tritium.
- Parameters
number (int) – The atomic number of the element.
symbol (str) – The chemical symbol of the element.
name (str) – The name of the element in English.
group (int) – The group of the element in the periodic table. Default 0.
period (int) – The period of the element in the periodic table. Default 0.
block (str) – The block of the element in the periodic table. Default ''.
series (int) – Index to chemical series. Default 0.
mass (float) – The relative atomic mass. Default 0.0.
eleneg (float) – The electronegativity (Pauling scale). Default 0.0.
eleaffin (float) – The electron affinity in eV. Default 0.0.
covrad (float) – The covalent radius in Angstrom. Default 0.0.
atmrad (float) – The atomic radius in Angstrom. Default 0.0.
vdwrad (float) – The Van der Waals radius in Angstrom. Default 0.0.
tboil (float) – The boiling temperature in K. Default 0.0.
tmelt (float) – The melting temperature in K. Default 0.0.
density (float) – The density at 295 K, in g/cm³ or g/L. Default 0.0.
eleconfig (str) – The ground state electron configuration. Default ''.
oxistates (str) – The oxidation states. Default ''.
ionenergy (Optional[Tuple]) – The ionization energies in eV. Default None.
isotopes (Optional[Dict[int, Union[Isotope, Tuple[float, float]]]]) – The isotopic composition. A mapping of isotope mass numbers to Isotope objects. Default None.
description (str) – A description of the element. Default ''.
Attributes:
Return the isotope in H[X] format.
Return mass number of most abundant natural stable isotope.
-
class Isotope(mass=0.0, abundance=1.0, massnumber=0)
Bases: Dictable
Isotope massnumber, relative atomic mass, and abundance.
Methods:
__repr__() – Return a string representation of the Isotope.
__str__() – Return str(self).
Attributes:
The natural abundance of the isotope.
The mass of the isotope.
The mass number of the isotope.
chemistry_tools.formulae
Parse formulae into a Python object.
Attention
This package has the following additional requirements:
cawdrey>=0.5.0 mathematical>=0.5.1 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[formulae]
chemistry_tools.formulae.composition
Elemental composition of a Formula.
Classes:
Composition | Class to represent the elemental composition of a Formula.
CompositionSort | Lookup for sorting elemental composition output.
-
class Composition(formula)
Bases: DataArray
Class to represent the elemental composition of a Formula.
Methods:
__str__() – Return str(self).
as_array([sort_by, reverse]) – Returns the elemental composition as a list of lists.
Attributes:
The number of elements in the composition.
The total mass of the composition.
-
enum CompositionSort(value)
Bases: enum.Enum
Lookup for sorting elemental composition output.
Valid values are as follows:
symbol = <CompositionSort.symbol: 'symbol'>
count = <CompositionSort.count: 'count'>
rel_mass = <CompositionSort.rel_mass: 'rel_mass'>
mass_fraction = <CompositionSort.mass_fraction: 'mass_fraction'>
-
chemistry_tools.formulae.compound
Parse formulae into a Python object.
Classes:
Compound | Class representing a chemical compound.
-
class Compound(name, formula=None, data=None, latex_name=None, unicode_name=None, html_name=None)
Bases: Dictable
Class representing a chemical compound.
- Parameters
data could be simple, such as {'mp': 0, 'bp': 100}, or considerably more involved, e.g.:
{'diffusion_coefficient': {'water': lambda T: 2.1*m**2/s/K*(T - 273.15*K)}}
Methods:
__eq__(other) – Return self == other.
__repr__() – Return a string representation of the Compound.
__str__() – Return str(self).
Returns the molar mass (with units) of the substance.
Attributes:
The charge of the compound.
The mass of the compound.
chemistry_tools.formulae.dataarray
Provides a base class which can output data as a pandas.DataFrame, to CSV, or as a pretty-printed table in a variety of formats.
-
class DataArray(formula, data)
Bases: FrozenOrderedDict
A class which can output data as a pandas.DataFrame, to CSV, or as a pretty-printed table in a variety of formats.
To use this class it must first be subclassed. Subclasses must implement as_array(), which handles the conversion of the data to a list of lists of values.
Attributes:
Methods:
__contains__(key) – Return key in self.
__eq__(other) – Return self == other.
__getitem__(key) – Return self[key].
__iter__() – Iterates over the dictionary's keys.
__len__() – Returns the number of keys in the dictionary.
__repr__() – Return a string representation of the DataArray.
__str__() – Return str(self).
as_array(sort_by[, reverse]) – Must be implemented in subclasses to handle the conversion of the data to a list of lists of values.
as_csv(*args[, sep]) – Returns the data as a CSV formatted string.
as_dataframe(*args, **kwargs) – Returns the isotope distribution data as a pandas.DataFrame.
as_table(*args, **kwargs) – Returns the isotope distribution data as a table using tabulate.
copy(*args, **kwargs) – Return a copy of the FrozenOrderedDict.
fromkeys(iterable[, value]) – Create a new dictionary with keys from iterable and values set to value.
get(k[, default]) – Return the value for k if k is in the dictionary, else default.
items() – Returns a set-like object providing a view on the FrozenOrderedDict's items.
keys() – Returns a set-like object providing a view on the FrozenOrderedDict's keys.
values() – Returns an object providing a view on the FrozenOrderedDict's values.
-
__class_getitem__
= <bound method GenericAlias of <class 'chemistry_tools.formulae.dataarray.DataArray'>> Type:
MethodType
-
__getitem__
(key) Return
self[key]
.- Parameters
key (
~KT
)- Return type
~VT
-
abstract as_array(sort_by, reverse=False)
Must be implemented in subclasses to handle the conversion of the data to a list of lists of values.
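The subclassing contract described here follows the usual abstract-base-class pattern: the base class provides the output helpers and delegates row construction to as_array(). The sketch below is a simplified, hypothetical stand-in (`TableLike` and `Counts` are invented names, and the defaults on as_array() are an assumption made to keep the example short); it is not the library's DataArray.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class TableLike(ABC):
    """Hypothetical base class whose subclasses supply the rows."""

    @abstractmethod
    def as_array(self, sort_by: int = 0, reverse: bool = False) -> List[List[Any]]:
        """Return the data as a list of rows, header first."""

    def as_csv(self, *args: Any, sep: str = ",", **kwargs: Any) -> str:
        # Delegate row construction to the subclass, then serialise the rows.
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=sep, lineterminator="\n")
        writer.writerows(self.as_array(*args, **kwargs))
        return buffer.getvalue()


class Counts(TableLike):
    """Example subclass: atom counts keyed by element symbol."""

    def __init__(self, data: Dict[str, int]) -> None:
        self.data = dict(data)

    def as_array(self, sort_by: int = 0, reverse: bool = False) -> List[List[Any]]:
        rows = sorted(self.data.items(), key=lambda row: row[sort_by], reverse=reverse)
        return [["Element", "Count"], *[list(row) for row in rows]]


water_csv = Counts({"O": 1, "H": 2}).as_csv()
```

Because as_array() is abstract, instantiating `TableLike` directly raises TypeError, which mirrors the requirement that the class be subclassed before use.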
-
as_csv
(*args, sep=',', **kwargs)[source] Returns the data as a CSV formatted string.
- Parameters
*args – Arguments passed to
as_array()
.sep (
str
) – The separator for the CSV data. Default','
.**kwargs – Additional keyword arguments passed to
as_array()
.
- Return type
-
as_dataframe
(*args, **kwargs)[source] Returns the isotope distribution data as a
pandas.DataFrame
.Any arguments taken by
as_array()
can also be used here.- Return type
-
as_table
(*args, **kwargs)[source] Returns the isotope distribution data as a table using tabulate.
Any arguments taken by
as_array()
can also be used here.Additionally, any valid keyword argument for
tabulate.tabulate()
can be used.- Return type
-
copy
(*args, **kwargs) Return a copy of the
FrozenOrderedDict
.- Parameters
args
kwargs
-
classmethod
fromkeys
(iterable, value=None) Create a new dictionary with keys from iterable and values set to value.
- Return type
FrozenBase
[~KT
,~VT
]
-
get
(k, default=None) Return the value for
k
ifk
is in the dictionary, elsedefault
.- Parameters
k – The key to return the value for.
default – The value to return if k is not in the dictionary. Default None.
-
items
() Returns a set-like object providing a view on the
FrozenOrderedDict
's items.- Return type
AbstractSet
[Tuple
[~KT
,~VT
]]
-
keys
() Returns a set-like object providing a view on the
FrozenOrderedDict
's keys.- Return type
AbstractSet
[~KT
]
-
values
() Returns an object providing a view on the
FrozenOrderedDict
's values.- Return type
ValuesView
[~VT
]
chemistry_tools.formulae.formula
Parse formulae into a Python object.
Data:
F | Invariant TypeVar bound to chemistry_tools.formulae.formula.Formula.
Classes:
Formula | A Formula object stores a chemical composition of a compound.
-
F = TypeVar(F, bound=Formula)
Type: TypeVar
Invariant TypeVar bound to chemistry_tools.formulae.formula.Formula.
-
class Formula(composition=None, charge=0)
Bases: defaultdict, Counter
A Formula object stores a chemical composition of a compound. It is based on dict, with the symbols of chemical elements as keys and the values equal to the number of atoms of the corresponding element in the compound.
- Parameters
Methods:
__add__(other) – Return self + value.
__eq__(other) – Return self == other.
__iadd__(other) – Inplace add from another counter, keeping only positive counts.
__imul__(other)
__isub__(other) – Inplace subtract counter, but keep only results with positive counts.
__mul__(other) – Return self * value.
__radd__(other) – Return value + self.
__repr__() – Return a string representation of the Formula.
__rmul__(other) – Return value * self.
__rsub__(other) – Return value - self.
__setitem__(key, value) – Set self[key] to value.
__str__() – Return str(self).
__sub__(other) – Return self - value.
copy() – Returns a copy of the Formula.
from_kwargs(*[, charge]) – Create a new Formula object from keyword arguments representing the elements in the compound.
from_mass_fractions(fractions[, charge, …]) – Create a new Formula object from elemental mass fractions.
from_string(formula[, charge]) – Create a new Formula object by parsing a string.
get_mz([average, charge]) – Calculate the average mass:charge ratio (m/z) of a Formula.
isotope_distribution() – Returns an IsotopeDistribution object representing the distribution of the isotopologues of the formula.
iter_isotopologues([report_abundance, …]) – Iterate over possible isotopic states of the molecule.
most_probable_isotopic_composition([elements_with_isotopes]) – Calculate the most probable isotopic composition of a molecule/ion.
Attributes:
Calculate the average mass of a Formula.
The average mass to charge ratio of the formula.
A Composition object representing the elemental composition of the Formula.
A list of the element symbols in the formula.
Returns the empirical formula in Hill notation.
Calculate the monoisotopic mass of a Formula.
Returns the formula in Hill notation.
Calculate the relative abundance of the current isotopic composition of this molecule.
Calculate the average mass of a Formula.
Calculate the monoisotopic mass of a Formula.
The mass to charge ratio of the formula.
Return the number of atoms in the formula.
Return the number of elements in the formula.
Returns formula in Hill notation, without any isotopes specified.
Calculate the monoisotopic mass of a Formula.
-
__iadd__
(other)[source] Inplace add from another counter, keeping only positive counts.
>>> c = Counter('abbb')
>>> c += Counter('bcc')
>>> c
Counter({'b': 4, 'c': 2, 'a': 1})
- Return type
-
__isub__
(other)[source] Inplace subtract counter, but keep only results with positive counts.
>>> c = Counter('abbbc')
>>> c -= Counter('bccd')
>>> c
Counter({'b': 2, 'a': 1})
- Return type
-
property average_mass
Calculate the average mass of a Formula.
Note that mass is not averaged for elements with specified isotopes.
- Return type
-
property composition
A Composition object representing the elemental composition of the Formula.
- Return type
-
property
empirical_formula
Returns the empirical formula in Hill notation.
The empirical formula has the simplest whole number ratio of atoms of each element present in the formula.
Examples:
>>> Formula.from_string('H2O').empirical_formula
'H2O'
>>> Formula.from_string('S4').empirical_formula
'S'
>>> Formula.from_string('C6H12O6').empirical_formula
'CH2O'
- Return type
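Reducing a formula to its simplest whole number ratio amounts to dividing every atom count by the greatest common divisor of all counts. The following is a minimal sketch of that step on a plain dict; `empirical_counts` is a hypothetical helper, not the library's implementation, and it assumes a non-empty mapping of positive integer counts.

```python
from functools import reduce
from math import gcd
from typing import Dict


def empirical_counts(counts: Dict[str, int]) -> Dict[str, int]:
    """Divide every atom count by the greatest common divisor of all counts."""
    divisor = reduce(gcd, counts.values())
    return {symbol: count // divisor for symbol, count in counts.items()}


glucose = empirical_counts({"C": 6, "H": 12, "O": 6})
sulfur = empirical_counts({"S": 4})
```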
-
property exact_mass
Calculate the monoisotopic mass of a Formula. If any isotopes are already present in the formula, the mass of these will be preserved.
- Return type
-
classmethod from_kwargs(*, charge=0, **kwargs)
Create a new Formula object from keyword arguments representing the elements in the compound.
-
classmethod from_mass_fractions(fractions, charge=0, maxcount=10, precision=0.0001)
Create a new Formula object from elemental mass fractions.
Note
Isotopes cannot (currently) be parsed using this method.
- Parameters
Examples:
>>> Formula.from_mass_fractions({'H': 0.112, 'O': 0.888})
'H2O'
>>> Formula.from_mass_fractions({'D': 0.2, 'O': 0.8})
'O[2H]2'
>>> Formula.from_mass_fractions({'H': 8.97, 'C': 59.39, 'O': 31.64})
'C5H9O2'
>>> Formula.from_mass_fractions({'O': 0.26, '30Si': 0.74})
'O2[30Si]3'
- Return type
-
classmethod from_string(formula, charge=0)
Create a new Formula object by parsing a string.
Note
Isotopes cannot (currently) be parsed using this method.
- Return type
-
get_mz(average=True, charge=None)
Calculate the average mass:charge ratio (m/z) of a Formula.
- Parameters
- Return type
-
property
hill_formula
Returns the formula in Hill notation.
Example:
>>> Formula.from_string('BrC2H5').hill_formula
'C2H5Br'
>>> Formula.from_string('HBr').hill_formula
'BrH'
>>> Formula.from_string('[(CH3)3Si2]2NNa').hill_formula
'C6H18NNaSi4'
- Return type
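Hill notation orders the symbols as follows: if carbon is present, carbon comes first, hydrogen second, and all other elements follow alphabetically; without carbon, every element (hydrogen included) is listed alphabetically. The sketch below applies that rule to a plain dict of counts. `hill_order` is a hypothetical helper for illustration and ignores charges and isotopes.

```python
from typing import Dict


def hill_order(counts: Dict[str, int]) -> str:
    """Render element counts as a formula string in Hill order."""
    symbols = sorted(counts)
    if "C" in counts:
        # Carbon first, then hydrogen, then the remaining symbols alphabetically.
        rest = [symbol for symbol in symbols if symbol not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    return "".join(
        symbol + (str(counts[symbol]) if counts[symbol] != 1 else "")
        for symbol in symbols
    )


bromoethane = hill_order({"Br": 1, "C": 2, "H": 5})
hydrogen_bromide = hill_order({"H": 1, "Br": 1})
```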
-
isotope_distribution()
Returns an IsotopeDistribution object representing the distribution of the isotopologues of the formula.
- Return type
-
property
isotopic_composition_abundance
Calculate the relative abundance of the current isotopic composition of this molecule.
- Return type
- Returns
The relative abundance of the current isotopic composition.
-
iter_isotopologues(report_abundance=False, elements_with_isotopes=None, isotope_threshold=0.0005, overall_threshold=0)
Iterate over possible isotopic states of the molecule.
The space of possible isotopic compositions is restricted by the parameters elements_with_isotopes, isotope_threshold, and overall_threshold.
- Parameters
report_abundance (bool) – If True, the output will contain 2-tuples: (composition, abundance). Otherwise, only compositions are yielded. Default False.
elements_with_isotopes (Optional[Sequence[str]]) – A set of elements to be considered in isotopic distributions (by default, every element has an isotopic distribution). Default None.
isotope_threshold (float) – The threshold abundance of a specific isotope to be considered. Default 0.0005.
overall_threshold (float) – The threshold abundance of the calculated isotopic composition. Default 0.
- Return type
- Returns
Iterator over possible isotopic compositions.
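To make the two thresholds concrete, the sketch below enumerates isotopic states for a tiny formula. It is a simplified illustration, not the library's algorithm: the names `ABUNDANCES`, `element_states`, and `iter_states` are hypothetical, the abundance values are illustrative, and the choice of `>=` for both threshold comparisons is an assumption. For each element, the probability of an isotope combination is the multinomial coefficient times the product of the isotope abundances.

```python
from collections import Counter
from itertools import combinations_with_replacement, product
from math import factorial
from typing import Dict, Iterator, Tuple

# Illustrative abundances as fractions (assumed, not read from chemistry_tools).
ABUNDANCES: Dict[str, Dict[int, float]] = {
    "H": {1: 0.999885, 2: 0.000115},
    "Cl": {35: 0.7576, 37: 0.2424},
}

State = Dict[Tuple[str, int], int]


def element_states(symbol: str, count: int, isotope_threshold: float) -> Iterator[Tuple[State, float]]:
    """Yield (isotope counts, probability) for ``count`` atoms of one element."""
    isotopes = {a: p for a, p in ABUNDANCES[symbol].items() if p >= isotope_threshold}
    for combo in combinations_with_replacement(sorted(isotopes), count):
        tally = Counter(combo)
        ways = factorial(count)  # becomes the multinomial coefficient
        probability = 1.0
        for mass_number, n in tally.items():
            ways //= factorial(n)
            probability *= isotopes[mass_number] ** n
        yield {(symbol, a): n for a, n in tally.items()}, ways * probability


def iter_states(
    formula: Dict[str, int],
    isotope_threshold: float = 0.0005,
    overall_threshold: float = 0.0,
) -> Iterator[Tuple[State, float]]:
    """Combine the per-element states and drop compositions below the overall threshold."""
    per_element = [list(element_states(s, n, isotope_threshold)) for s, n in formula.items()]
    for choice in product(*per_element):
        composition: State = {}
        abundance = 1.0
        for part, probability in choice:
            composition.update(part)
            abundance *= probability
        if abundance >= overall_threshold:
            yield composition, abundance


hcl_states = list(iter_states({"H": 1, "Cl": 1}))
cl2_states = list(iter_states({"Cl": 2}))
```

With the default isotope threshold of 0.0005, deuterium is excluded in this sketch, so HCl yields two states rather than four.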
-
property mass
Calculate the average mass of a Formula.
Note that mass is not averaged for elements with specified isotopes.
- Return type
-
property monoisotopic_mass
Calculate the monoisotopic mass of a Formula. If any isotopes are already present in the formula, the mass of these will be preserved.
- Return type
-
most_probable_isotopic_composition(elements_with_isotopes=None)
Calculate the most probable isotopic composition of a molecule/ion.
For each element, only the two most abundant isotopes are considered. Any isotopes already in the Formula will be changed to the most abundant isotope.
-
property
n_atoms
Return the number of atoms in the formula.
Example:
>>> Formula.from_string('CH3COOH').n_atoms
8
- Return type
-
property
n_elements
Return the number of elements in the formula.
- Return type
Example:
>>> Formula.from_string('CH3COOH').n_elements
3
-
property
no_isotope_hill_formula
Returns formula in Hill notation, without any isotopes specified.
Example:
>>> Formula.from_string('BrC2H5').no_isotope_hill_formula
'C2H5Br'
>>> Formula.from_string('HBr').no_isotope_hill_formula
'BrH'
>>> Formula.from_string('[(CH3)3Si2]2NNa').no_isotope_hill_formula
'C6H18NNaSi4'
- Return type
chemistry_tools.formulae.html
Functions and constants for converting formulae to HTML.
Functions:
|
Returns the HTML subscript of the given value. |
|
Returns the HTML superscript of the given value. |
|
Convert formula string to HTML string representation. |
-
string_to_html
(formula, prefixes=None, infixes=None, suffixes=('(s)', '(l)', '(g)', '(aq)'))[source] Convert formula string to HTML string representation.
Examples:
>>> string_to_html("NH4+")
'NH<sub>4</sub><sup>+</sup>'
>>> string_to_html("Fe(CN)6+2")
'Fe(CN)<sub>6</sub><sup>2+</sup>'
>>> string_to_html("Fe(CN)6+2(aq)")
'Fe(CN)<sub>6</sub><sup>2+</sup>(aq)'
>>> string_to_html(".NHO-(aq)")
'⋅NHO<sup>-</sup>(aq)'
>>> string_to_html("alpha-FeOOH(s)")
'α-FeOOH(s)'
- Parameters
formula (str) – Chemical formula, e.g. 'H2O', 'Fe+3', 'Cl-'.
prefixes (Optional[Dict[str, str]]) – Mapping of prefixes to their HTML equivalents. Defaults to Greek letters and '.'.
infixes (Optional[Dict[str, str]]) – Mapping of infixes to their HTML equivalents. Default None.
suffixes (Sequence[str]) – Suffixes to keep. Default ('(s)', '(l)', '(g)', '(aq)').
- Return type
- Returns
The HTML representation of the formula
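The core of such a conversion can be sketched with two regular expressions: one to peel off a trailing charge, one to subscript the atom counts. The function below is a deliberately reduced, hypothetical illustration (`simple_formula_to_html` is not part of the library); it does not handle prefixes, infixes, or phase suffixes such as '(aq)'.

```python
import re


def simple_formula_to_html(formula: str) -> str:
    """Subscript atom counts and superscript a trailing charge such as '+2' or '-'."""
    match = re.fullmatch(r"(.*?)([+-])(\d*)", formula)
    if match:
        body, sign, magnitude = match.groups()
        # The charge is written magnitude first, e.g. "+2" becomes "2+".
        charge = f"<sup>{magnitude}{sign}</sup>"
    else:
        body, charge = formula, ""
    # Every run of digits left in the body is treated as an atom count.
    body = re.sub(r"(\d+)", r"<sub>\1</sub>", body)
    return body + charge


ammonium = simple_formula_to_html("NH4+")
```

The charge is split off before the digits are subscripted so that the "2" in "Fe(CN)6+2" ends up in the superscript rather than a subscript.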
chemistry_tools.formulae.iso_dist
Isotope Distributions.
Classes:
IsoDistSort | Lookup for sorting isotope distribution output.
IsotopeDistribution | An isotope distribution.
-
enum IsoDistSort(value)
Bases: enum_tools.custom_enums.IntEnum
Lookup for sorting isotope distribution output.
- Member Type
Valid values are as follows:
Formula = <IsoDistSort.Formula: 0>
Sort the isotope distribution by the formulae.
Mass = <IsoDistSort.Mass: 1>
Sort the isotope distribution by the masses.
Abundance = <IsoDistSort.Abundance: 2>
Sort the isotope distribution by the abundances.
Relative_Abundance = <IsoDistSort.Relative_Abundance: 3>
Sort the isotope distribution by the relative abundances.
-
class IsotopeDistribution(formula)
Bases: DataArray
An isotope distribution.
Each composition can be accessed by its Hill formula, like a dictionary (e.g. iso_dict['H[1]2O[16]']).
Attributes:
Methods:
__contains__(key) – Return key in self.
__eq__(other) – Return self == other.
__getitem__(key) – Return self[key].
__iter__() – Iterates over the dictionary's keys.
__len__() – Returns the number of keys in the dictionary.
__repr__() – Return a string representation of the DataArray.
__str__() – Return str(self).
as_array([sort_by, reverse, format_percentage]) – Returns the isotope distribution data as a list of lists.
as_csv(*args[, sep]) – Returns the data as a CSV formatted string.
as_dataframe(*args, **kwargs) – Returns the isotope distribution data as a pandas.DataFrame.
as_table(*args, **kwargs) – Returns the isotope distribution data as a table using tabulate.
copy(*args, **kwargs) – Return a copy of the FrozenOrderedDict.
fromkeys(iterable[, value]) – Create a new dictionary with keys from iterable and values set to value.
get(k[, default]) – Return the value for k if k is in the dictionary, else default.
items() – Returns a set-like object providing a view on the FrozenOrderedDict's items.
keys() – Returns a set-like object providing a view on the FrozenOrderedDict's keys.
values() – Returns an object providing a view on the FrozenOrderedDict's values.
-
__class_getitem__
= <bound method GenericAlias of <class 'chemistry_tools.formulae.iso_dist.IsotopeDistribution'>> Type:
MethodType
-
__getitem__
(key) Return
self[key]
.- Parameters
key (
~KT
)- Return type
~VT
-
as_array
(sort_by=<IsoDistSort.Formula: 0>, reverse=False, format_percentage=True)[source] Returns the isotope distribution data as a list of lists.
- Parameters
- Return type
-
as_csv
(*args, sep=',', **kwargs) Returns the data as a CSV formatted string.
- Parameters
*args – Arguments passed to
as_array()
.sep (
str
) – The separator for the CSV data. Default','
.**kwargs – Additional keyword arguments passed to
as_array()
.
- Return type
-
as_dataframe
(*args, **kwargs) Returns the isotope distribution data as a
pandas.DataFrame
.Any arguments taken by
as_array()
can also be used here.- Return type
-
as_table
(*args, **kwargs) Returns the isotope distribution data as a table using tabulate.
Any arguments taken by
as_array()
can also be used here.Additionally, any valid keyword argument for
tabulate.tabulate()
can be used.- Return type
-
copy
(*args, **kwargs) Return a copy of the
FrozenOrderedDict
.- Parameters
args
kwargs
-
classmethod
fromkeys
(iterable, value=None) Create a new dictionary with keys from iterable and values set to value.
- Return type
FrozenBase
[~KT
,~VT
]
-
get
(k, default=None) Return the value for
k
ifk
is in the dictionary, elsedefault
.- Parameters
k – The key to return the value for.
default – The value to return if k is not in the dictionary. Default None.
-
items
() Returns a set-like object providing a view on the
FrozenOrderedDict
's items.- Return type
AbstractSet
[Tuple
[~KT
,~VT
]]
-
keys
() Returns a set-like object providing a view on the
FrozenOrderedDict
's keys.- Return type
AbstractSet
[~KT
]
-
values
() Returns an object providing a view on the
FrozenOrderedDict
's values.- Return type
ValuesView
[~VT
]
-
chemistry_tools.formulae.latex
Functions and constants for converting formulae to LaTeX.
Functions:
|
Returns the LaTeX subscript of the given value. |
|
Returns the LaTeX superscript of the given value. |
|
Convert a formula string to its LaTeX representation. |
-
string_to_latex
(formula, prefixes=None, infixes=None, suffixes=('(s)', '(l)', '(g)', '(aq)'))[source] Convert a formula string to its LaTeX representation.
Examples:
>>> string_to_latex('NH4+')
'NH_{4}^{+}'
>>> string_to_latex('Fe(CN)6+2')
'Fe(CN)_{6}^{2+}'
>>> string_to_latex('Fe(CN)6+2(aq)')
'Fe(CN)_{6}^{2+}(aq)'
>>> string_to_latex('.NHO-(aq)')
'^\bullet NHO^{-}(aq)'
>>> string_to_latex('alpha-FeOOH(s)')
'\alpha-FeOOH(s)'
- Parameters
formula (str) – Chemical formula, e.g. 'H2O', 'Fe+3', 'Cl-'.
prefixes (Optional[Dict[str, str]]) – Mapping of prefixes to their LaTeX equivalents. Defaults to Greek letters and '.'.
infixes (Optional[Dict[str, str]]) – Mapping of infixes to their LaTeX equivalents. Default None.
suffixes (Sequence[str]) – Suffixes to keep. Default ('(s)', '(l)', '(g)', '(aq)').
- Return type
- Returns
The LaTeX representation of the formula.
chemistry_tools.formulae.parser
Functions for parsing formulae.
Functions:
mass_from_composition | Calculates molecular mass, in atomic mass units, from atomic weights.
string_to_composition | Parse the composition of a string representing a chemical formula.
-
mass_from_composition(composition, charge=0)
Calculates molecular mass, in atomic mass units, from atomic weights.
Note
Atomic number 0 denotes charge or "net electron deficiency".
Example:
>>> f'{mass_from_composition({0: -1, "H": 1, 8: 1}):.2f}'
'17.01'
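The role of key 0 can be shown with a short standalone sketch: atomic masses are summed per atom, and each unit of positive charge removes one electron mass. The names `ATOMIC_MASSES`, `ELECTRON_MASS`, and `simple_mass` are hypothetical, the mass table is a small approximate excerpt, and unlike the library function this sketch only accepts element symbols (not atomic numbers) as keys.

```python
from typing import Dict, Union

# Approximate standard atomic weights, for illustration only.
ATOMIC_MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}
ELECTRON_MASS = 0.000548579909  # in atomic mass units


def simple_mass(composition: Dict[Union[int, str], int]) -> float:
    """Sum atomic masses; key 0 holds the net charge (electron deficiency)."""
    mass = 0.0
    for key, count in composition.items():
        if key == 0:
            # A positive charge means electrons are missing, so subtract them.
            mass -= count * ELECTRON_MASS
        else:
            mass += count * ATOMIC_MASSES[key]
    return mass


hydroxide = simple_mass({0: -1, "H": 1, "O": 1})
```

For the hydroxide ion the charge is -1, so one electron mass is added to the sum of the hydrogen and oxygen masses.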
-
string_to_composition(formula, prefixes=None, suffixes=('(s)', '(l)', '(g)', '(aq)'))
Parse the composition of a string representing a chemical formula.
Examples:
>>> string_to_composition('NH4+') == {0: 1, "H": 4, "N": 1}
True
>>> string_to_composition('.NHO-(aq)') == {0: -1, "H": 1, "N": 1, "O": 1}
True
>>> string_to_composition('Na2CO3.7H2O') == {"Na": 2, "C": 1, "O": 10, "H": 14}
True
- Parameters
- Return type
- Returns
The composition, as a dictionary mapping atomic number -> multiplicity. “Atomic number” 0 represents net charge.
chemistry_tools.formulae.species
Class to represent a formula with phase information (e.g. solid, liquid, gas, or aqueous).
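Data:
S | Invariant TypeVar bound to chemistry_tools.formulae.species.Species.
Classes:
Species | Formula with phase information (e.g. solid, liquid, gas, or aqueous).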
-
S = TypeVar(S, bound=Species)
Type: TypeVar
Invariant TypeVar bound to chemistry_tools.formulae.species.Species.
-
class Species(composition=None, charge=0, phase=None)
Bases: Formula
Formula with phase information (e.g. solid, liquid, gas, or aqueous).
Species extends Formula with the new attribute phase.
- Parameters
composition (Optional[Dict[str, int]]) – A Formula object with the elemental composition of a substance, or a dict representing the same. If None, an empty object is created. Default None.
charge (int) – Default 0.
phase (Optional[Literal['s', 'l', 'g', 'aq']]) – Either 's', 'l', 'g', or 'aq'. None represents an unknown phase. Default None.
Methods:
__eq__(other) – Returns self == other.
copy() – Returns a copy of the Species.
from_kwargs(*[, charge, phase]) – Create a new Species object from keyword arguments representing the elements in the compound.
from_string(formula[, charge, phase]) – Create a new Species object by parsing a string.
Attributes:
Returns the empirical formula in Hill notation.
Returns the formula in Hill notation.
The phase of the species (e.g. solid, liquid, gas, or aqueous).
-
property
empirical_formula
Returns the empirical formula in Hill notation.
The empirical formula has the simplest whole number ratio of atoms of each element present in the formula.
Examples:
>>> Formula.from_string('H2O').empirical_formula
'H2O'
>>> Formula.from_string('S4').empirical_formula
'S'
>>> Formula.from_string('C6H12O6').empirical_formula
'CH2O'
- Return type
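The "simplest whole number ratio" described above amounts to dividing every count by their greatest common divisor. A minimal sketch, assuming a plain `dict` composition; the helper name `sketch_empirical` is hypothetical and is not part of the library:

```python
from functools import reduce
from math import gcd
from typing import Dict


def sketch_empirical(composition: Dict[str, int]) -> Dict[str, int]:
    """Divide all element counts by their greatest common divisor."""
    # Starting from 0 works because gcd(0, n) == n, and it keeps an empty dict safe.
    divisor = reduce(gcd, composition.values(), 0)
    return {symbol: count // divisor for symbol, count in composition.items()}


print(sketch_empirical({"C": 6, "H": 12, "O": 6}))  # glucose reduces to CH2O
```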
-
classmethod
from_kwargs
(*, charge=0, phase=None, **kwargs)[source] Create a new Species object from keyword arguments representing the elements in the compound.
-
classmethod
from_string
(formula, charge=0, phase=None)[source] Create a new Species object by parsing a string.
Note
Isotopes cannot (currently) be parsed using this method.
- Parameters
- Return type
Examples:
>>> water = Species.from_string('H2O')
>>> water.phase
None
>>> NaCl = Species.from_string('NaCl(s)')
>>> NaCl.phase
s
>>> Hg_l = Species.from_string('Hg(l)')
>>> Hg_l.phase
l
>>> CO2g = Species.from_string('CO2(g)')
>>> CO2g.phase
g
>>> CO2aq = Species.from_string('CO2(aq)')
>>> CO2aq.phase
aq
-
property
hill_formula
Returns the formula in Hill notation.
Examples:
>>> Species.from_string('BrC2H5').hill_formula
'C2H5Br'
>>> Species.from_string('HBr').hill_formula
'BrH'
>>> Species.from_string('[(CH3)3Si2]2NNa').hill_formula
'C6H18NNaSi4'
- Return type
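Hill notation orders carbon first, then hydrogen, then the remaining elements alphabetically. When no carbon is present, every element (including hydrogen) is sorted alphabetically. The sketch below reproduces the three results above from plain dictionaries; `sketch_hill_formula` is a hypothetical helper, not the library's property.

```python
from typing import Dict, List


def sketch_hill_formula(composition: Dict[str, int]) -> str:
    """Format a {symbol: count} mapping in Hill order."""
    symbols: List[str] = sorted(composition)
    if "C" in composition:
        # Carbon first, hydrogen second, everything else alphabetical.
        rest = [s for s in symbols if s not in ("C", "H")]
        head = ["C"] + (["H"] if "H" in composition else [])
        symbols = head + rest

    parts = []
    for symbol in symbols:
        count = composition[symbol]
        # A count of 1 is conventionally left implicit.
        parts.append(symbol if count == 1 else f"{symbol}{count}")
    return "".join(parts)


print(sketch_hill_formula({"Br": 1, "C": 2, "H": 5}))
```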
chemistry_tools.formulae.unicode
Attention
This module has the following additional requirements:
cawdrey>=0.5.0 mathematical>=0.5.1 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[formulae]
Functions and constants for converting formulae to unicode.
Functions:
|
Convert the given formula string to a unicode string representation. |
|
Returns the Unicode subscript of the given value. |
|
Returns the Unicode superscript of the given value. |
-
string_to_unicode
(formula, prefixes=None, infixes=None, suffixes=('(s)', '(l)', '(g)', '(aq)'))[source] Convert the given formula string to a unicode string representation.
Examples:
>>> string_to_unicode('NH4+')
'NH₄⁺'
>>> string_to_unicode('Fe(CN)6+2')
'Fe(CN)₆²⁺'
>>> string_to_unicode('Fe(CN)6+2(aq)')
'Fe(CN)₆²⁺(aq)'
>>> string_to_unicode('.NHO-(aq)')
'⋅NHO⁻(aq)'
>>> string_to_unicode('alpha-FeOOH(s)')
'α-FeOOH(s)'
- Parameters
formula (str) – Chemical formula, e.g. 'H2O', 'Fe+3', 'Cl-'.
prefixes (Optional[Dict[str, str]]) – Mapping of prefixes to their Unicode equivalents. Defaults to Greek letters and '.'.
infixes (Optional[Dict[str, str]]) – Mapping of infixes to their Unicode equivalents. Default None.
suffixes (Sequence[str]) – Suffixes to keep. Default ('(s)', '(l)', '(g)', '(aq)').
- Return type
- Returns
The Unicode representation of the formula.
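The core of the conversion is mapping counts to subscript digits and the trailing charge to superscripts. The following sketch shows that idea with `str.translate`; it is an assumption-laden simplification (the name `sketch_to_unicode` is hypothetical, and prefix and infix handling such as 'alpha-' or '.' is left out).

```python
import re
from typing import Sequence

SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUPERSCRIPTS = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")
_CHARGE = re.compile(r"([+-])(\d*)$")  # trailing charge such as "+", "-", "+2"


def sketch_to_unicode(
    formula: str,
    suffixes: Sequence[str] = ("(s)", "(l)", "(g)", "(aq)"),
) -> str:
    """Render counts as subscripts and a trailing charge as a superscript."""
    # Set the phase suffix aside so its characters are kept unchanged.
    tail = ""
    for suffix in suffixes:
        if formula.endswith(suffix):
            formula, tail = formula[: -len(suffix)], suffix
            break

    # Move the charge magnitude in front of the sign, e.g. "+2" becomes "²⁺".
    charge = ""
    match = _CHARGE.search(formula)
    if match:
        sign, digits = match.groups()
        charge = (digits + sign).translate(SUPERSCRIPTS)
        formula = formula[: match.start()]

    return formula.translate(SUBSCRIPTS) + charge + tail


print(sketch_to_unicode("Fe(CN)6+2(aq)"))
```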
chemistry_tools.formulae.utils
Attention
This module has the following additional requirements:
cawdrey>=0.5.0 mathematical>=0.5.1 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[formulae]
General utility functions.
Data:
Common chemical groups |
Functions:
|
Returns an iterator over the given element symbols in order of Hill notation. |
|
Returns the symbol and mass number for the isotope represented by |
-
GROUPS
= {'Abu': 'C4H7NO', 'Acet': 'C2H3O', 'Acm': 'C3H6NO', 'Adao': 'C10H15O', 'Aib': 'C4H7NO', 'Ala': 'C3H5NO', 'Arg': 'C6H12N4O', 'Argp': 'C6H11N4O', 'Asn': 'C4H6N2O2', 'Asnp': 'C4H5N2O2', 'Asp': 'C4H5NO3', 'Aspp': 'C4H4NO3', 'Asu': 'C8H13NO3', 'Asup': 'C8H12NO3', 'Boc': 'C5H9O2', 'Bom': 'C8H9O', 'Bpy': 'C10H8N2', 'Brz': 'C8H6BrO2', 'Bu': 'C4H9', 'Bum': 'C5H11O', 'Bz': 'C7H5O', 'Bzl': 'C7H7', 'Bzlo': 'C7H7O', 'Cha': 'C9H15NO', 'Chxo': 'C6H11O', 'Cit': 'C6H11N3O2', 'Citp': 'C6H10N3O2', 'Clz': 'C8H6ClO2', 'Cp': 'C5H5', 'Cy': 'C6H11', 'Cys': 'C3H5NOS', 'Cysp': 'C3H4NOS', 'Dde': 'C10H13O2', 'Dnp': 'C6H3N2O4', 'Et': 'C2H5', 'Fmoc': 'C15H11O2', 'For': 'CHO', 'Gln': 'C5H8N2O2', 'Glnp': 'C5H7N2O2', 'Glp': 'C5H5NO2', 'Glu': 'C5H7NO3', 'Glup': 'C5H6NO3', 'Gly': 'C2H3NO', 'Hci': 'C7H13N3O2', 'Hcip': 'C7H12N3O2', 'His': 'C6H7N3O', 'Hisp': 'C6H6N3O', 'Hser': 'C4H7NO2', 'Hserp': 'C4H6NO2', 'Hx': 'C6H11', 'Hyp': 'C5H7NO2', 'Hypp': 'C5H6NO2', 'Ile': 'C6H11NO', 'Ivdde': 'C14H21O2', 'Leu': 'C6H11NO', 'Lys': 'C6H12N2O', 'Lysp': 'C6H11N2O', 'Mbh': 'C15H15O2', 'Me': 'CH3', 'Mebzl': 'C8H9', 'Meobzl': 'C8H9O', 'Met': 'C5H9NOS', 'Mmt': 'C20H17O', 'Mtc': 'C14H19O3S', 'Mtr': 'C10H13O3S', 'Mts': 'C9H11O2S', 'Mtt': 'C20H17', 'Nle': 'C6H11NO', 'Npys': 'C5H3N2O2S', 'Nva': 'C5H9NO', 'Odmab': 'C20H26NO3', 'Orn': 'C5H10N2O', 'Ornp': 'C5H9N2O', 'Pbf': 'C13H17O3S', 'Pen': 'C5H9NOS', 'Penp': 'C5H8NOS', 'Ph': 'C6H5', 'Phe': 'C9H9NO', 'Phepcl': 'C9H8ClNO', 'Phg': 'C8H7NO', 'Pmc': 'C14H19O3S', 'Ppa': 'C8H7O2', 'Pro': 'C5H7NO', 'Prop': 'C3H7', 'Py': 'C5H5N', 'Pyr': 'C5H5NO2', 'Sar': 'C3H5NO', 'Ser': 'C3H5NO2', 'Serp': 'C3H4NO2', 'Sta': 'C8H15NO2', 'Stap': 'C8H14NO2', 'Tacm': 'C6H12NO', 'Tbdms': 'C6H15Si', 'Tbu': 'C4H9', 'Tbuo': 'C4H9O', 'Tbuthio': 'C4H9S', 'Tfa': 'C2F3O', 'Thi': 'C7H7NOS', 'Thr': 'C4H7NO2', 'Thrp': 'C4H6NO2', 'Tips': 'C9H21Si', 'Tms': 'C3H9Si', 'Tos': 'C7H7O2S', 'Trp': 'C11H10N2O', 'Trpp': 'C11H9N2O', 'Trt': 'C19H15', 'Tyr': 'C9H9NO2', 'Tyrp': 'C9H8NO2', 'Val': 'C5H9NO', 'Valoh': 'C5H9NO2', 
'Valohp': 'C5H8NO2', 'Xan': 'C13H9O'} -
Common chemical groups
chemistry_tools.pubchem
This module provides a wrapper around the PubChem PUG_REST API.
Data for compounds can be accessed using the
pubchem.lookup.get_compounds
function.
The following table lists the various properties that can be obtained from the PubChem API:
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
chemistry_tools.pubchem.atom
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Represents an atom in a Compound.
Classes:
|
Class to represent an atom in a |
Functions:
|
Parse atoms from the given dictionary. |
-
class
Atom
(aid, number, x=None, y=None, z=None, charge=0)[source] Bases: object
Class to represent an atom in a Compound.
- Parameters
aid (int) – The Atom ID within the owning Compound.
number (int) – The Atomic number for this atom.
x (Optional[float]) – The x coordinate for this atom. Default None.
y (Optional[float]) – The y coordinate for this atom. Default None.
z (Optional[float]) – The z coordinate for this atom. Will be None in 2D Compound records. Default None.
charge (int) – Formal charge on atom. Default 0.
Methods:
__eq__(other) – Return self == other.
__repr__() – Return a string representation of the Atom.
set_coordinates(x, y[, z]) – Set all coordinate dimensions at once.
to_dict() – Return a dictionary containing Atom data.
Attributes:
Returns whether this atom has 2D or 3D coordinates.
The element symbol for this atom.
chemistry_tools.pubchem.bond
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Represents a bond between atoms in a Compound.
Classes:
|
Class to represent a bond between two atoms in a |
|
Enumeration of possible bond types. |
Functions:
|
Parse bonds from the given dictionary. |
-
class
Bond
(aid1, aid2, order=<BondType.SINGLE: 1>, style=None)[source] Bases: object
Class to represent a bond between two atoms in a Compound.
- Parameters
Methods:
__eq__(other) – Return self == other.
__repr__() – Return a string representation of the Bond.
to_dict() – Return a dictionary containing bond data.
-
enum
BondType
(value)[source] Bases:
enum_tools.custom_enums.IntEnum
Enumeration of possible bond types.
- Member Type
Valid values are as follows:
-
SINGLE
= <BondType.SINGLE: 1>
-
DOUBLE
= <BondType.DOUBLE: 2>
-
TRIPLE
= <BondType.TRIPLE: 3>
-
QUADRUPLE
= <BondType.QUADRUPLE: 4>
-
DATIVE
= <BondType.DATIVE: 5>
-
COMPLEX
= <BondType.COMPLEX: 6>
-
IONIC
= <BondType.IONIC: 7>
-
UNKNOWN
= <BondType.UNKNOWN: 255>
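Because the members are integer-valued, a raw bond order from a record can be turned into a named member by calling the enum. The sketch below mirrors the documented values with the standard library's `IntEnum` rather than `enum_tools`; the class name `BondTypeSketch` and the fallback to `UNKNOWN` for unrecognised values are assumptions made for illustration.

```python
from enum import IntEnum


class BondTypeSketch(IntEnum):
    """Stand-in for BondType using the member values listed above."""

    SINGLE = 1
    DOUBLE = 2
    TRIPLE = 3
    QUADRUPLE = 4
    DATIVE = 5
    COMPLEX = 6
    IONIC = 7
    UNKNOWN = 255


def coerce_order(order: int) -> BondTypeSketch:
    """Look up a member by value, falling back to UNKNOWN (an assumed policy)."""
    try:
        return BondTypeSketch(order)
    except ValueError:
        return BondTypeSketch.UNKNOWN


print(coerce_order(2).name)
```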
chemistry_tools.pubchem.compound
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Represents a chemical compound.
Data:
Invariant |
Classes:
|
Represents a single record from the PubChem Compound database. |
Functions:
|
-
C
= TypeVar(C, bound=Compound) Type: TypeVar
Invariant TypeVar bound to chemistry_tools.pubchem.compound.Compound.
-
class
Compound
(title, CID, description, **_)[source] Bases: Dictable
Represents a single record from the PubChem Compound database.
The PubChem Compound database is constructed from the Substance database using a standardization and deduplication process. Each Compound is uniquely identified by a CID.
- Parameters
Methods:
__repr__() – Return a string representation of the Compound.
from_cid(cid[, record_type]) – Returns the Compound objects for the compound with the given CID.
get_iupac_name([type_]) – Return the IUPAC name of this compound.
get_properties(properties) – Returns the requested properties for the Compound.
get_property(prop) – Get a single property for the compound.
precache() – Precache all properties for this compound.
Return a pandas Series containing Compound data.
Attributes:
List of Atoms in this Compound.
PubChem CACTVS fingerprint.
Canonical SMILES, with no stereochemistry information.
Whether the compound is canonicalized.
The charge of the compound.
Returns the ID of this compound.
The coordinate type of this compound.
List of element symbols for atoms in this Compound.
Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.
Returns whether this compound has a full record available.
The preferred IUPAC name of this compound.
Molecular formula.
Molecular Weight.
Molecular Weight.
Canonical SMILES, with no stereochemistry information.
Returns a list of synonyms for the Compound.
The systematic IUPAC name of this compound.
-
property
cactvs_fingerprint
PubChem CACTVS fingerprint.
Each bit in the fingerprint represents the presence or absence of one of 881 chemical substructures.
-
property
fingerprint
Raw padded and hex-encoded fingerprint, as returned by the PUG REST API.
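The relationship between the raw hex string and the 881-bit fingerprint can be sketched as follows. The layout assumed here (a 4-byte length prefix, 881 data bits, then padding bits up to a whole byte) is an assumption about the PubChem encoding and not something stated on this page; `sketch_decode_fingerprint` is a hypothetical helper.

```python
FINGERPRINT_BITS = 881
PREFIX_HEX_CHARS = 8  # assumed 4-byte length prefix, i.e. 8 hex characters


def sketch_decode_fingerprint(hex_fingerprint: str) -> str:
    """Return the fingerprint as a string of 881 '0'/'1' characters."""
    payload = hex_fingerprint[PREFIX_HEX_CHARS:]
    total_bits = len(payload) * 4
    # zfill restores leading zero bits that int() would otherwise drop.
    bits = bin(int(payload, 16))[2:].zfill(total_bits)
    # Discard the trailing padding bits.
    return bits[:FINGERPRINT_BITS]


# Synthetic example: only the first substructure bit is set.
example = "00000371" + "8" + "0" * 221
decoded = sketch_decode_fingerprint(example)
print(len(decoded), decoded[:8])
```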
-
classmethod
from_cid
(cid, record_type='2d')[source] Returns the Compound objects for the compound with the given CID.
- Return type
-
get_properties
(properties)[source] Returns the requested properties for the Compound.
- Parameters
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties.
- Return type
- Returns
Dictionary mapping the property names to their values
-
get_property
(prop)[source] Get a single property for the compound.
- Parameters
prop (str) – The property to retrieve for the compound. See the table at the start of this chapter for a list of valid properties.
- Return type
-
property
has_full_record
Returns whether this compound has a full record available.
- Return type
chemistry_tools.pubchem.description
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions to access the name and description of compounds in the PubChem database.
Functions:
|
Returns the common name for the compound with the given name. |
|
Returns the compound ID (CID) for the compound with the given name. |
|
Returns the description of the compound with the given name. |
|
Returns the systematic IUPAC name for the compound with the given name. |
|
Parse raw data from the |
|
Obtains the description for the given compound from the PubChem REST API. |
-
get_iupac_name
(name)[source] Returns the systematic IUPAC name for the compound with the given name.
-
parse_description
(description_data)[source] Parse raw data from the description endpoint of the REST API.
-
rest_get_description
(identifier, namespace=<PubChemNamespace.Name: 'name'>, **kwargs)[source] Obtains the description for the given compound from the PubChem REST API.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
kwargs – Optional arguments that json.loads takes.
- Raises
ValueError – If the response body does not contain valid JSON.
- Return type
- Returns
Parsed JSON data
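The documented ValueError fits how the standard library behaves: `json.loads` raises `json.JSONDecodeError`, which is a subclass of `ValueError`. The sketch below shows keyword arguments being forwarded to `json.loads` and the error surfacing for a malformed body; `sketch_parse_body` is a hypothetical helper and performs no network request.

```python
import json
from decimal import Decimal
from typing import Any


def sketch_parse_body(body: str, **kwargs: Any) -> Any:
    """Decode a response body, forwarding extra options to json.loads."""
    return json.loads(body, **kwargs)


# parse_float is a standard json.loads option; here floats become Decimal.
data = sketch_parse_body('{"MolecularWeight": 180.16}', parse_float=Decimal)
print(data["MolecularWeight"])

try:
    sketch_parse_body("<html>Service unavailable</html>")
except ValueError as error:
    # json.JSONDecodeError is caught here because it subclasses ValueError.
    print(type(error).__name__)
```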
chemistry_tools.pubchem.enums
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Enumerations.
Classes:
|
Enumeration of valid values for the coordinate type. |
|
Enumeration of supported formats for the PubChem REST API. |
|
Enumeration of possible values for the PubChem namespace. |
-
enum
CoordinateType
(value)[source] Bases:
enum_tools.custom_enums.IntEnum
Enumeration of valid values for the coordinate type.
- Member Type
Valid values are as follows:
-
TWO_D
= <CoordinateType.TWO_D: 1>
-
THREE_D
= <CoordinateType.THREE_D: 2>
-
SUBMITTED
= <CoordinateType.SUBMITTED: 3>
-
EXPERIMENTAL
= <CoordinateType.EXPERIMENTAL: 4>
-
COMPUTED
= <CoordinateType.COMPUTED: 5>
-
STANDARDIZED
= <CoordinateType.STANDARDIZED: 6>
-
AUGMENTED
= <CoordinateType.AUGMENTED: 7>
-
ALIGNED
= <CoordinateType.ALIGNED: 8>
-
COMPACT
= <CoordinateType.COMPACT: 9>
-
UNITS_ANGSTROMS
= <CoordinateType.UNITS_ANGSTROMS: 10>
-
UNITS_NANOMETERS
= <CoordinateType.UNITS_NANOMETERS: 11>
-
UNITS_PIXEL
= <CoordinateType.UNITS_PIXEL: 12>
-
UNITS_POINTS
= <CoordinateType.UNITS_POINTS: 13>
-
UNITS_STDBONDS
= <CoordinateType.UNITS_STDBONDS: 14>
-
UNITS_UNKNOWN
= <CoordinateType.UNITS_UNKNOWN: 255>
The Enum and its members also have the following methods:
-
enum
PubChemFormats
(value)[source] Bases:
enum_tools.custom_enums.StrEnum
Enumeration of supported formats for the PubChem REST API.
- Member Type
Valid values are as follows:
-
JSON
= <PubChemFormats.JSON: 'JSON'> JSON Format
-
XML
= <PubChemFormats.XML: 'XML'> XML Format
-
CSV
= <PubChemFormats.CSV: 'CSV'> CSV Format
-
PNG
= <PubChemFormats.PNG: 'PNG'> PNG Format
The Enum and its members also have the following methods:
-
enum
PubChemNamespace
(value)[source] Bases:
enum_tools.custom_enums.StrEnum
Enumeration of possible values for the PubChem namespace.
- Member Type
Valid values are as follows:
-
CID
= <PubChemNamespace.CID: 'cid'> PubChem Compound ID
-
Name
= <PubChemNamespace.Name: 'name'> Compound Name
-
SMILES
= <PubChemNamespace.SMILES: 'smiles'> SMILES String
-
INCHIKEY
= <PubChemNamespace.INCHIKEY: 'inchikey'> InChI Key
The Enum and its members also have the following methods:
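Several functions in this package accept either a PubChemNamespace member or a plain string. A string-valued enum makes that coercion a single call. The sketch below uses a `str`/`Enum` mixin from the standard library in place of `enum_tools.custom_enums.StrEnum`; `NamespaceSketch` and `coerce_namespace` are hypothetical names.

```python
from enum import Enum
from typing import Union


class NamespaceSketch(str, Enum):
    """Stand-in for PubChemNamespace using the member values listed above."""

    CID = "cid"
    Name = "name"
    SMILES = "smiles"
    INCHIKEY = "inchikey"


def coerce_namespace(namespace: Union[NamespaceSketch, str]) -> NamespaceSketch:
    """Accept a member or its string value; unknown strings raise ValueError."""
    return NamespaceSketch(namespace)


print(coerce_namespace("smiles").value)
```

Calling the enum with an existing member returns that member unchanged, so the same function covers both accepted input types.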
chemistry_tools.pubchem.errors
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Error handling.
Exceptions:
|
Request is improperly formed (syntax error in the URL, POST body, etc.). |
|
The request timed out, from server overload or too broad a request. |
|
Request not allowed (such as invalid MIME type in the HTTP Accept header). |
|
The input record was not found (e.g. |
Generic error class to handle all HTTP error codes. |
|
PubChem response is uninterpretable. |
|
|
Some problem on the server side (such as a database server down, etc.). |
|
The requested operation has not (yet) been implemented by the server. |
Data:
Numerical list of HTTP status codes considered to be errors. |
-
exception
BadRequestError
(msg='Request is improperly formed')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
Request is improperly formed (syntax error in the URL, POST body, etc.).
-
exception
HTTPTimeoutError
(msg='The request timed out')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
The request timed out, from server overload or too broad a request.
Changed in version 0.4.0: Renamed from TimeoutError
-
HTTP_ERROR_CODES
= [400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 421, 422, 423, 424, 425, 426, 428, 429, 431, 451, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 511] Type:
list
Numerical list of HTTP status codes considered to be errors.
-
exception
MethodNotAllowedError
(msg='Request not allowed')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
Request not allowed (such as invalid MIME type in the HTTP Accept header).
-
exception
NotFoundError
(msg='The input record was not found')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
The input record was not found (e.g. invalid CID).
-
exception
PubChemHTTPError
(e)[source] Bases:
Exception
Generic error class to handle all HTTP error codes.
-
exception
ServerError
(msg='Some problem on the server side')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
Some problem on the server side (such as a database server down, etc.).
-
TimeoutError
-
exception
UnimplementedError
(msg='The requested operation has not been implemented')[source] Bases:
chemistry_tools.pubchem.errors.PubChemHTTPError
The requested operation has not (yet) been implemented by the server.
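One way such a hierarchy is typically used is to translate an HTTP status code into the most specific exception. The sketch below is an illustration only: the class names carry a `Sketch` suffix to mark them as stand-ins, and the code-to-exception mapping is an assumption based on common HTTP semantics, not taken from this page.

```python
from typing import Dict, Type


class HTTPErrorSketch(Exception):
    """Stand-in for the generic PubChemHTTPError base class."""


class BadRequestSketch(HTTPErrorSketch):
    def __init__(self, msg: str = "Request is improperly formed"):
        super().__init__(msg)


class NotFoundSketch(HTTPErrorSketch):
    def __init__(self, msg: str = "The input record was not found"):
        super().__init__(msg)


class ServerErrorSketch(HTTPErrorSketch):
    def __init__(self, msg: str = "Some problem on the server side"):
        super().__init__(msg)


# Assumed mapping of status codes to specific exception classes.
ERRORS_BY_STATUS: Dict[int, Type[HTTPErrorSketch]] = {
    400: BadRequestSketch,
    404: NotFoundSketch,
    500: ServerErrorSketch,
}


def error_for_status(status: int) -> HTTPErrorSketch:
    """Return the most specific exception for a status code."""
    error_class = ERRORS_BY_STATUS.get(status)
    if error_class is None:
        # Fall back to the generic base class for unmapped codes.
        return HTTPErrorSketch(f"HTTP error {status}")
    return error_class()


print(error_for_status(404))
```

Since every specific error derives from the base class, callers can catch the base class to handle any HTTP failure in one place.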
chemistry_tools.pubchem.full_record
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions for accessing the complete set of data held by PubChem for a compound.
Functions:
|
Parse the complete PubChem record for a compound. |
|
Obtains the full record for the given compound from the PubChem REST API. |
-
rest_get_full_record
(identifier, namespace=<PubChemNamespace.Name: 'name'>, record_type='2d', **kwargs)[source] Obtains the full record for the given compound from the PubChem REST API.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
record_type (str) – Default '2d'.
kwargs – Optional arguments that json.loads takes.
- Raises
ValueError – If the response body does not contain valid JSON.
- Return type
- Returns
Parsed JSON data
chemistry_tools.pubchem.images
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions for handling images.
Functions:
|
Returns an image of the structure of the compound with the given name. |
-
get_structure_image
(identifier, namespace=<PubChemNamespace.Name: 'name'>, width=300, height=300)[source] Returns an image of the structure of the compound with the given name.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
width (int) – The image width in pixels. Default 300.
height (int) – The image height in pixels. Default 300.
- Return type
Image
- Returns
Pillow Image data
chemistry_tools.pubchem.lookup
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Look up properties for compounds by name or CAS number.
Functions:
|
Returns a list of Compound objects for compounds that match the search criteria. |
-
get_compounds
(identifier, namespace=<PubChemNamespace.Name: 'name'>)[source] Returns a list of Compound objects for compounds that match the search criteria.
As more than one compound may be identified the results are returned in a list.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
- Return type
chemistry_tools.pubchem.properties
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions and classes to access properties of compounds in the PubChem database.
Data:
Allows properties to optionally be specified as underscore_separated, consistent with Compound attributes |
|
Properties for PubChem REST API |
Classes:
|
Metadata about a property. |
|
Represents a property parsed from the full PubChem record. |
Functions:
|
Coerce |
|
Returns the requested properties for the compound with the given identifier. |
|
Returns the requested property for the compound with the given identifier. |
|
Parse raw data from the |
|
Returns the properties for the compound with the given identifier in the desired format. |
|
Returns the properties for the compound with the given identifier as a dictionary. |
-
PROPERTY_MAP
= {'atom_stereo_count': 'AtomStereoCount', 'bond_stereo_count': 'BondStereoCount', 'canonical_smiles': 'CanonicalSMILES', 'charge': 'Charge', 'complexity': 'Complexity', 'conformer_count_3d': 'ConformerCount3D', 'conformer_model_rmsd_3d': 'ConformerModelRMSD3D', 'covalent_unit_count': 'CovalentUnitCount', 'defined_atom_stereo_count': 'DefinedAtomStereoCount', 'defined_bond_stereo_count': 'DefinedBondStereoCount', 'effective_rotor_count_3d': 'EffectiveRotorCount3D', 'exact_mass': 'ExactMass', 'feature_acceptor_count_3d': 'FeatureAcceptorCount3D', 'feature_anion_count_3d': 'FeatureAnionCount3D', 'feature_cation_count_3d': 'FeatureCationCount3D', 'feature_count_3d': 'FeatureCount3D', 'feature_donor_count_3d': 'FeatureDonorCount3D', 'feature_hydrophobe_count_3d': 'FeatureHydrophobeCount3D', 'feature_ring_count_3d': 'FeatureRingCount3D', 'fingerprint_2d': 'Fingerprint2D', 'h_bond_acceptor_count': 'HBondAcceptorCount', 'h_bond_donor_count': 'HBondDonorCount', 'heavy_atom_count': 'HeavyAtomCount', 'inchi': 'InChI', 'inchikey': 'InChIKey', 'isomeric_smiles': 'IsomericSMILES', 'isotope_atom_count': 'IsotopeAtomCount', 'iupac_name': 'IUPACName', 'molecular_formula': 'MolecularFormula', 'molecular_weight': 'MolecularWeight', 'monoisotopic_mass': 'MonoisotopicMass', 'rotatable_bond_count': 'RotatableBondCount', 'tpsa': 'TPSA', 'undefined_atom_stereo_count': 'UndefinedAtomStereoCount', 'undefined_bond_stereo_count': 'UndefinedBondStereoCount', 'volume3d': 'Volume3D', 'volume_3d': 'XStericQuadrupole3D', 'x_steric_quadrupole_3d': 'YStericQuadrupole3D', 'xlogp': 'XLogP', 'y_steric_quadrupole_3d': 'ZStericQuadrupole3D'} -
Allows properties to optionally be specified as underscore_separated, consistent with Compound attributes
-
namedtuple
PropData
(name, description, type, attr_name)[source] Bases:
NamedTuple
Metadata about a property.
- Fields
name (str) – The name of the property.
description (str) – The description of the property.
type (Callable) – The type of the property.
attr_name (str) – The Python attribute name of the property in a chemistry_tools.pubchem.compound.Compound.
-
__repr__
() Return a nicely formatted representation string
-
namedtuple
PubChemProperty
(label, name=None, value=None, dtype=None, source=None)[source] Bases:
NamedTuple
Represents a property parsed from the full PubChem record.
- Fields
-
force_valid_properties
(properties)[source] Coerce properties into a list of strings and exclude any invalid properties, or raise a ValueError if that is not possible.
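The coercion described above can be sketched as: split a comma-separated string, translate underscore_separated names through a map like PROPERTY_MAP, and drop anything unrecognised. This is a hedged illustration, not the library's code: the helper name, the four-entry subset of the map, and the choice to raise only for inputs that are neither strings nor sequences are all assumptions.

```python
from collections.abc import Sequence
from typing import List, Union

# Small subset of PROPERTY_MAP, copied from the table above for illustration.
PROPERTY_MAP_SUBSET = {
    "iupac_name": "IUPACName",
    "molecular_formula": "MolecularFormula",
    "molecular_weight": "MolecularWeight",
    "xlogp": "XLogP",
}
VALID_NAMES = set(PROPERTY_MAP_SUBSET.values())


def sketch_force_valid_properties(properties: Union[str, Sequence]) -> List[str]:
    """Return the valid API property names found in the input."""
    if isinstance(properties, str):
        properties = properties.split(",")
    elif not isinstance(properties, Sequence):
        raise ValueError("properties must be a string or a sequence of strings")

    cleaned: List[str] = []
    for prop in properties:
        name = str(prop).strip()
        # Translate underscore_separated names to the API spelling.
        name = PROPERTY_MAP_SUBSET.get(name, name)
        if name in VALID_NAMES:
            cleaned.append(name)
    return cleaned


print(sketch_force_valid_properties("molecular_weight, XLogP, bogus"))
```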
-
get_properties
(identifier, properties='', namespace=<PubChemNamespace.Name: 'name'>, as_dataframe=False)[source] Returns the requested properties for the compound with the given identifier. As more than one compound may be identified the results are returned in a list.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
as_dataframe (bool) – Automatically extract the properties into a pandas DataFrame. Default False.
- Raises
ValueError – If the response body does not contain valid JSON.
NotFoundError – If the compound with the requested identifier was not found in PubChem.
- Return type
- Returns
List of dictionaries mapping properties to values
-
get_property
(identifier, property='', namespace=<PubChemNamespace.Name: 'name'>)[source] Returns the requested property for the compound with the given identifier.
This convenience function only allows for a single property to be accessed at once, and for only a single compound. If you require multiple properties and/or properties for multiple compounds, use chemistry_tools.pubchem.properties.get_properties, which helps reduce the burden on the PubChem servers.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up.
property – The property to retrieve for the compound. See the table at the start of this chapter for a list of valid properties.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
- Raises
ValueError – If the response body does not contain valid JSON.
NotFoundError – If the compound with the requested identifier was not found in PubChem.
- Return type
- Returns
The requested property. Type depends on the property requested.
-
rest_get_properties
(identifier, namespace=<PubChemNamespace.Name: 'name'>, properties='', format_=<PubChemFormats.CSV: 'CSV'>)[source] Returns the properties for the compound with the given identifier in the desired format.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
format_ (Union[PubChemFormats, str]) – The format to obtain the data in. Default <PubChemFormats.CSV: 'CSV'>.
- Return type
-
rest_get_properties_json
(identifier, namespace=<PubChemNamespace.Name: 'name'>, properties='', **kwargs)[source] Returns the properties for the compound with the given identifier as a dictionary.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[str, PubChemNamespace]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
properties (Union[Sequence[str], str]) – The properties to retrieve for the compound. Can be either a comma-separated string or a list. See the table at the start of this chapter for a list of valid properties. Default ''.
kwargs – Optional arguments that json.loads takes.
- Raises
ValueError – If the response body does not contain valid JSON.
- Return type
- Returns
Parsed JSON data
-
valid_properties
= {'AtomStereoCount': <class 'int'>, 'BondStereoCount': <class 'int'>, 'CanonicalSMILES': <class 'str'>, 'Charge': <class 'int'>, 'Complexity': <class 'float'>, 'ConformerCount3D': <class 'int'>, 'ConformerModelRMSD3D': <class 'float'>, 'CovalentUnitCount': <class 'int'>, 'DefinedAtomStereoCount': <class 'int'>, 'DefinedBondStereoCount': <class 'int'>, 'EffectiveRotorCount3D': <class 'int'>, 'ExactMass': <class 'float'>, 'FeatureAcceptorCount3D': <class 'int'>, 'FeatureAnionCount3D': <class 'int'>, 'FeatureCationCount3D': <class 'int'>, 'FeatureCount3D': <class 'int'>, 'FeatureDonorCount3D': <class 'int'>, 'FeatureHydrophobeCount3D': <class 'int'>, 'FeatureRingCount3D': <class 'int'>, 'Fingerprint2D': <class 'str'>, 'HBondAcceptorCount': <class 'int'>, 'HBondDonorCount': <class 'int'>, 'HeavyAtomCount': <class 'int'>, 'IUPACName': <class 'str'>, 'InChI': <class 'str'>, 'InChIKey': <class 'str'>, 'IsomericSMILES': <class 'str'>, 'IsotopeAtomCount': <class 'int'>, 'MolecularFormula': <bound method Formula.from_string of <class 'chemistry_tools.formulae.formula.Formula'>>, 'MolecularWeight': <class 'float'>, 'MonoisotopicMass': <class 'float'>, 'RotatableBondCount': <class 'int'>, 'TPSA': <class 'float'>, 'UndefinedAtomStereoCount': <class 'int'>, 'UndefinedBondStereoCount': <class 'int'>, 'Volume3D': <class 'str'>, 'XLogP': <class 'float'>, 'XStericQuadrupole3D': <class 'float'>, 'YStericQuadrupole3D': <class 'float'>, 'ZStericQuadrupole3D': <class 'float'>} -
Properties for PubChem REST API
chemistry_tools.pubchem.pug_rest
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions for interacting with PubChem PUG_REST API.
Functions:
|
Request wrapper that automatically handles asynchronous requests. |
|
Responsible for performing the actual GET request. |
|
Returns the full JSON record for the compound with the given ID. |
|
Construct API request from parameters and return the response. |
-
async_get
(identifier, namespace='cid', operation=None, output='JSON', searchtype=None, **kwargs)[source] Request wrapper that automatically handles asynchronous requests.
- Parameters
identifier – Identifiers (e.g. name, CID) for the compounds to look up. When using the CID namespace data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default 'cid'.
operation – Default None.
output – Default 'JSON'.
searchtype – Default None.
**kwargs – Keyword parameters passed along with the GET request.
- Return type
-
do_rest_get(namespace, identifier, format_=<PubChemFormats.JSON: 'JSON'>, domain=None, record_type='2d', png_width=300, png_height=300)
Responsible for performing the actual GET request.
- Parameters
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace.
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compounds to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
format_ (Union[PubChemFormats, str]) – The file format to retrieve the data in. Valid values are in PubChemFormats, plus 'PNG'. Default <PubChemFormats.JSON: 'JSON'>.
record_type (str) – Default '2d'.
png_width (int) – Default 300.
png_height (int) – Default 300.
- Return type
-
request(identifier, namespace='cid', operation=None, output='JSON', searchtype=None, **kwargs)
Construct API request from parameters and return the response.
Full specification at http://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html
- Parameters
identifier – Identifiers (e.g. name, CID) for the compounds to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default 'cid'.
operation – Default None.
output (Union[PubChemFormats, str]) – Default 'JSON'.
searchtype – Default None.
**kwargs – Keyword parameters passed along with the GET request.
- Return type
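To illustrate what "construct API request from parameters" can look like, here is a minimal sketch that assembles a PUG REST URL for the compound domain. The helper name `build_compound_url` and its defaults are assumptions made for illustration; it is not the library's implementation and performs no network access.

```python
from urllib.parse import quote

BASE_URL = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_compound_url(identifier, namespace="cid", operation=None, output="JSON"):
    """Assemble a PUG REST URL of the form <base>/compound/<namespace>/<ids>[/<operation>]/<output>."""
    # A list or tuple of identifiers is sent as one comma-separated path segment.
    if isinstance(identifier, (list, tuple)):
        identifier = ",".join(str(item) for item in identifier)
    # Percent-encode the identifier (names may contain spaces) but keep the commas.
    parts = [BASE_URL, "compound", namespace, quote(str(identifier), safe=",")]
    if operation:
        parts.append(operation)
    parts.append(output)
    return "/".join(parts)


print(build_compound_url(2244))
print(build_compound_url([2244, 962], operation="property/MolecularFormula"))
print(build_compound_url("acetic acid", namespace="name"))
```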
chemistry_tools.pubchem.synonyms
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
Functions for obtaining the synonyms of a compound from the PubChem database.
Classes:
|
Contains a list of synonyms for a compound. |
Functions:
|
Returns a list of synonyms for the compound with the given identifier. |
|
Get the list of synonyms for the given compound. |
-
class Synonyms(initlist)
Contains a list of synonyms for a compound.
- Parameters
initlist – The content to initialise the list with.
Methods:
__contains__(synonym): Return synonym in self.
append(synonym): Append synonym to the end of the list.
-
get_synonyms(identifier, namespace=<PubChemNamespace.Name: 'name'>)
Returns a list of synonyms for the compound with the given identifier. As more than one compound may be identified, the results are returned in a list.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
- Return type
- Returns
List of dictionaries containing the CID and a list of synonyms for the compounds.
-
rest_get_synonyms(identifier, namespace=<PubChemNamespace.Name: 'name'>, **kwargs)
Get the list of synonyms for the given compound.
- Parameters
identifier (Union[str, int, Sequence[Union[str, int]]]) – Identifiers (e.g. name, CID) for the compound to look up. When using the CID namespace, data for multiple compounds can be retrieved at once by supplying either a comma-separated string or a list.
namespace (Union[PubChemNamespace, str]) – The type of identifier to look up. Valid values are in PubChemNamespace. Default <PubChemNamespace.Name: 'name'>.
kwargs – Optional arguments that json.loads takes.
- Raises
ValueError – If the response body does not contain valid JSON.
- Return type
- Returns
Parsed JSON data.
chemistry_tools.pubchem.utils
Attention
This package has the following additional requirements:
cawdrey>=0.1.7 mathematical>=0.1.13 pillow>=7.0.0 pyparsing>=2.4.6 tabulate>=0.8.9
These can be installed as follows:
python -m pip install chemistry-tools[pubchem]
General utility functions.
Functions:
|
Convert a PubChem formatted string into an HTML formatted string. |
chemistry_tools.cache
Cache for HTTP requests.
Data:
The cache object. |
|
The cache directory |
|
Instance of |
Functions:
Clear the cache. |
-
cache_dir
Type:
PosixPathPlus
The cache directory
-
cached_requests
Type:
Session
Instance of
requests.Session
with a rate limit of 5 requests per second and a 28 day on-disk cache.
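The entry above states a rate limit of 5 requests per second but does not say how it is enforced. One common approach is to enforce a minimum interval between calls; the sketch below shows that idea with a hypothetical `MinIntervalLimiter` class. The clock and sleep functions are injectable so the example runs without real delays.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: block so that calls are at least 1/max_per_second apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demonstration with a fake clock, so nothing actually sleeps.
fake_now = [0.0]
naps = []


def fake_sleep(seconds):
    naps.append(seconds)
    fake_now[0] += seconds


limiter = MinIntervalLimiter(5, clock=lambda: fake_now[0], sleep=fake_sleep)
limiter.wait()  # first call passes immediately
limiter.wait()  # second call has to wait one interval
print(naps)
```

A limiter like this would be called immediately before each HTTP request; the on-disk cache is a separate concern and is not sketched here.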
chemistry_tools.cas
Functions for working with CAS registry numbers.
Functions:
|
Converts an integer CAS registry number to a hyphenated string. |
|
Converts a hyphenated string CAS registry number to an integer. |
|
Checks the CAS registry number to ensure the check digit is valid with respect to the rest of the number. |
-
cas_string_to_int
(cas_no)[source] Converts a hyphenated string CAS registry number to an integer.
- Parameters
cas_no
- Raises
ValueError – If the CAS registry number is invalid.
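As an illustration of the conversion described above, the following sketch converts between the hyphenated and integer forms of a CAS registry number. The function names `cas_to_int` and `int_to_cas` and the validation rules are assumptions for this example, not the library's code.

```python
def cas_to_int(cas_no: str) -> int:
    """Convert e.g. '7732-18-5' to 7732185, raising ValueError for malformed input."""
    parts = cas_no.split("-")
    if (
        len(parts) != 3
        or not all(part.isdigit() for part in parts)
        or len(parts[1]) != 2
        or len(parts[2]) != 1
    ):
        raise ValueError(f"Invalid CAS registry number: {cas_no!r}")
    return int("".join(parts))


def int_to_cas(cas_no: int) -> str:
    """Convert e.g. 7732185 to '7732-18-5': the last digit is the check digit, the two before it the middle block."""
    digits = str(cas_no)
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


print(cas_to_int("7732-18-5"))
print(int_to_cas(7732185))
```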
-
check_cas_number
(cas_no)[source] Checks the CAS registry number to ensure the check digit is valid with respect to the rest of the number.
If the CAS registry number is valid, 0 is returned. If there is a problem, the difference between the computed check digit and that given as part of the CAS registry number is returned.
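The check digit of a CAS registry number is the weighted sum of the preceding digits, taken right to left with weights 1, 2, 3 and so on, modulo 10. A minimal sketch of that calculation follows. `check_digit_difference` is a hypothetical stand-in that takes the integer form, and the sign convention (computed minus given) is an assumption.

```python
def check_digit_difference(cas_no: int) -> int:
    """Return 0 if the check digit is valid, otherwise computed minus given (sign is an assumption)."""
    digits = [int(char) for char in str(cas_no)]
    *body, given = digits
    # The rightmost body digit has weight 1, the next weight 2, and so on.
    computed = sum(weight * digit for weight, digit in enumerate(reversed(body), start=1)) % 10
    return computed - given


# Water, 7732-18-5: 8*1 + 1*2 + 2*3 + 3*4 + 7*5 + 7*6 = 105, and 105 % 10 = 5.
print(check_digit_difference(7732185))
print(check_digit_difference(7732184))
```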
chemistry_tools.constants
Scientific constants.
Classes:
|
Represents a scientific constant. |
Data:
The atomic mass constant. |
|
Avogadro’s constant (Avogadro’s number) |
|
Boltzmann constant |
|
Electron Radius |
|
Faraday constant |
|
Molar gas constant |
|
Neutron mass |
|
Planck’s constant |
|
Numerical IUPAC prefixes (e.g. |
|
The speed of light in a vacuum. |
|
Vacuum permittivity |
-
class
Constant
(name, value, unit, symbol=None)[source] Bases:
tuple
Represents a scientific constant.
Methods:
__float__(): Returns the constant as a float (without the unit).
__int__(): Returns the constant as an integer (without the unit).
__repr__(): Return a nicely formatted representation string.
as_quantity(): Returns the constant as a quantities.quantity.Quantity object.
Attributes:
name: The name of the constant.
symbol: An optional symbol for the constant.
unit: The constant’s unit.
value: The value of the constant.
-
__repr__
() Return a nicely formatted representation string
-
as_quantity()
Returns the constant as a quantities.quantity.Quantity object.
- Return type
Quantity
-
unit
Type:
Quantity
The constant’s unit.
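The class above is documented as a tuple subclass that converts to float and int without its unit. A dependency-free sketch of that shape follows; `SketchConstant` is a hypothetical name, and it stores the unit as a plain string where the documented class uses a quantities unit.

```python
from typing import NamedTuple, Optional


class SketchConstant(NamedTuple):
    """Illustrative stand-in for a named scientific constant."""

    name: str
    value: float
    unit: str  # assumption: a plain string instead of a quantities unit
    symbol: Optional[str] = None

    def __float__(self) -> float:
        # Numeric value only; the unit is dropped.
        return float(self.value)

    def __int__(self) -> int:
        return int(self.value)


avogadro = SketchConstant("Avogadro constant", 6.02214076e23, "1/mol", "N_A")
print(float(avogadro))
print(avogadro.symbol)
```

Basing the class on a tuple keeps instances immutable, which suits values that should never change at run time.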
-
-
prefixes
= {1: 'mono', 2: 'di', 3: 'tri', 4: 'tetra', 5: 'penta', 6: 'hexa', 7: 'hepta', 8: 'octa', 9: 'nona', 10: 'deca', 11: 'undeca', 12: 'dodeca', 13: 'trideca', 14: 'tetradeca', 15: 'pentadeca', 16: 'hexadeca', 17: 'heptadeca', 18: 'octadeca', 19: 'nonadeca', 20: 'icosa', 21: 'henicosa', 22: 'docosa', 23: 'tricosa', 30: 'triaconta', 31: 'hentriaconta', 32: 'dotriaconta', 40: 'tetraconta', 50: 'pentaconta', 60: 'hexaconta', 70: 'heptaconta', 80: 'octaconta', 90: 'nonaconta', 100: 'hecta', 200: 'dicta', 300: 'tricta', 400: 'tetracta', 500: 'pentacta', 600: 'hexacta', 700: 'heptacta', 800: 'octacta', 900: 'nonacta', 1000: 'kilia', 2000: 'dilia', 3000: 'trilia', 4000: 'tetralia', 5000: 'pentalia', 6000: 'hexalia', 7000: 'heptalia', 8000: 'octalia', 9000: 'nonalia'} -
Numerical IUPAC prefixes (e.g. mono-).
chemistry_tools.names
Functions for working with IUPAC names for chemicals.
Functions:
|
Returns the corresponding CAS registry number for the given IUPAC name. |
|
Splits an IUPAC name for a compound into its constituent parts. |
|
Returns the order the given IUPAC names should be sorted in. |
|
Returns the constituent parts of the IUPAC names sorted into order. |
|
Returns the corresponding IUPAC name for the given CAS registry number. |
|
Sort a list of IUPAC names into order. |
|
Sort a list of lists by the IUPAC name in each row. |
|
Sorts a |
Data:
Regular expression to match “multiple” prefixes such as mono-. |
|
List of regular expressions to decompose an IUPAC name. |
-
cas_from_iupac_name
(iupac_name)[source] Returns the corresponding CAS registry number for the given IUPAC name.
-
get_IUPAC_sort_order
(iupac_names)[source] Returns the order the given IUPAC names should be sorted in.
Useful when sorting arrays containing data in addition to the name. e.g.
>>> sort_order = get_IUPAC_sort_order([row[0] for row in data])
>>> sorted_data = sorted(data, key=lambda row: sort_order[row[0]])
where row[0] is the name of the compound.
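The pattern above relies on a mapping from each name to its position in the sorted order. The sketch below runs the same pattern on its own, with a hand-written `sort_order` dictionary standing in for the value returned by get_IUPAC_sort_order; the positions in it are made up for illustration.

```python
# Rows of (name, molar mass in g/mol); the name is in column 0.
data = [["toluene", 92.14], ["aniline", 93.13], ["benzene", 78.11]]

# Hypothetical stand-in for get_IUPAC_sort_order([row[0] for row in data]).
sort_order = {"aniline": 0, "benzene": 1, "toluene": 2}

# Each row is placed according to the position of its name in the mapping.
sorted_data = sorted(data, key=lambda row: sort_order[row[0]])
print(sorted_data)
```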
-
get_sorted_parts
(iupac_names)[source] Returns the constituent parts of the IUPAC names sorted into order.
The parts returned are in reverse order (i.e. 'diphenylamine' becomes ['amine', 'phenyl', 'di']).
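To make the reverse-order decomposition concrete, here is a simplified sketch that tokenises a name with a single regular expression and reverses the result. The token lists are a small hypothetical subset and the approach is an illustration only; the library's own decomposition uses its list of regular expressions and may differ.

```python
import re
from typing import List

# Hypothetical, deliberately small vocabularies for this sketch.
MULTIPLIERS = ["mono", "di", "tri", "tetra"]
FRAGMENTS = ["phenyl", "methyl", "ethyl", "nitro", "amine"]

# Longest tokens first, so a longer token wins over a shorter one that shares its start.
_TOKEN_RE = re.compile("|".join(sorted(MULTIPLIERS + FRAGMENTS, key=len, reverse=True)))


def split_name_reversed(name: str) -> List[str]:
    """Split a name into known tokens and return them last-to-first."""
    return _TOKEN_RE.findall(name.lower())[::-1]


print(split_name_reversed("diphenylamine"))
print(split_name_reversed("trimethylamine"))
```

Putting the parent fragment first means that sorting the reversed lists groups related compounds together.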
-
iupac_name_from_cas
(cas_number)[source] Returns the corresponding IUPAC name for the given CAS registry number.
-
multiplier_regex
Type:
Pattern
Regular expression to match “multiple” prefixes such as mono-.
Pattern
(mono)*(di)*(tri)*(tetra)*(penta)*(hexa)*(hepta)*(octa)*(nona)*(deca)*(undeca)*(dodeca)*(trideca)*(tetradeca)*(pentadeca)*(hexadeca)*(heptadeca)*(octadeca)*(nonadeca)*(icosa)*(henicosa)*(docosa)*(tricosa)*(triaconta)*(hentriaconta)*(dotriaconta)*(tetraconta)*(pentaconta)*(hexaconta)*(heptaconta)*(octaconta)*(nonaconta)*(hecta)*(dicta)*(tricta)*(tetracta)*(pentacta)*(hexacta)*(heptacta)*(octacta)*(nonacta)*(kilia)*(dilia)*(trilia)*(tetralia)*(pentalia)*(hexalia)*(heptalia)*(octalia)*(nonalia)*
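The pattern shown above has the same shape as each value of the prefixes mapping wrapped in a group with a `*` quantifier, in order. The sketch below rebuilds a pattern of that shape from a truncated copy of the mapping; whether the library builds it this way is an assumption.

```python
import re

# Truncated copy of the prefixes mapping; the full mapping continues up to 9000: 'nonalia'.
prefixes = {1: "mono", 2: "di", 3: "tri", 4: "tetra"}

# Every prefix becomes an optional, repeatable group.
pattern = "".join(f"({prefix})*" for prefix in prefixes.values())
multiplier_regex = re.compile(pattern)

print(pattern)
print(repr(multiplier_regex.match("dimethyl").group(0)))
print(repr(multiplier_regex.match("methyl").group(0)))
```

Because every group is optional, a pattern of this shape also matches the empty string, so callers need to check that the match is non-empty.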
-
re_strings
= [re.compile('((\\d+),?)+(\\d+)-'), re.compile('(mono)*(di)*(tri)*(tetra)*(penta)*(hexa)*(hepta)*(octa)*(nona)*(deca)*(undeca)*(dodeca)*(trideca)*(tetradeca)*(pentadeca)*(hexadeca)*(heptadeca)*(octadeca)*(nonadeca)*(icosa)*(henicosa)*(docosa)*(tri), re.compile('nitro'), re.compile('phenyl'), re.compile('aniline'), re.compile('anisole'), re.compile('benzene'), re.compile('centralite'), re.compile('formamide'), re.compile('glycerine'), re.compile('nitrate'), re.compile('glycol'), re.compile('phthalate'), re.compile('picrate'), re.compile('toluene'), re.compile('methyl'), re.compile('(?<!m)ethyl'), re.compile('propyl'), re.compile('butyl'), re.compile(' '), re.compile('\\('), re.compile('\\)'), re.compile('hydroxyl'), re.compile('amin[oe]'), re.compile('amide')] -
List of regular expressions to decompose an IUPAC name.
-
sort_array_by_name
(array, name_col=0, reverse=False)[source] Sort a list of lists by the IUPAC name in each row.
-
sort_dataframe_by_name
(df, name_col, reverse=False)[source] Sorts a
pandas.DataFrame
by the IUPAC name in each row.
chemistry_tools.spectrum_similarity
Mass spectrum similarity calculations.
Classes:
|
Calculate the similarity score for two mass spectra. |
Functions:
|
Create a |
|
Returns the normalised intensity for each row of a |
|
Calculate the similarity score for two mass spectra. |
-
class
SpectrumSimilarity
(spec_top, spec_bottom, b=1, xlim=(50, 1200))[source] Calculate the similarity score for two mass spectra.
- Parameters
spec_top (ndarray) – Array containing the experimental spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second.
spec_bottom (ndarray) – Array containing the reference spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second.
b (float) – Numeric value specifying the baseline threshold for peak identification, expressed as a percentage of the maximum intensity. Default 1.
xlim (Tuple[int, int]) – Tuple of length 2 defining the beginning and ending values of the x-axis. Default (50, 1200).
New in version 1.0.0.
Methods:
plot
([top_label, bottom_label, filter])Plot the mass spectra head to tail.
Print the dataframe giving aligned peaks in the top and bottom spectra.
score
()Returns the similarity score.
-
create_array(intensities, mz)
Create a numpy.ndarray, in a format appropriate for SpectrumSimilarity, from a list of intensities and a list of m/z values.
-
normalize(row, max_val)
Returns the normalised intensity for each row of a pandas.DataFrame.
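The baseline threshold b is described as a percentage of the maximum intensity. The sketch below shows that preprocessing on a plain Python peak list: intensities are scaled so the largest peak is 100, and peaks below b are dropped. The helper name, the use of tuples in place of an array or DataFrame, and the inclusive comparison are assumptions for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def normalise_and_filter(peaks: List[Peak], b: float = 1.0) -> List[Peak]:
    """Scale intensities to a percentage of the base peak and drop peaks below b percent."""
    max_intensity = max(intensity for _, intensity in peaks)
    normalised = [(mz, 100.0 * intensity / max_intensity) for mz, intensity in peaks]
    # Assumption: a peak exactly at the threshold is kept.
    return [(mz, percent) for mz, percent in normalised if percent >= b]


peaks = [(50.0, 5.0), (77.0, 500.0), (105.0, 1000.0)]
print(normalise_and_filter(peaks, b=1.0))
```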
-
spectrum_similarity(spec_top, spec_bottom, t=0.25, b=10, top_label=None, bottom_label=None, xlim=(50, 1200), x_threshold=0, print_alignment=False, print_graphic=True, output_list=False)
Calculate the similarity score for two mass spectra.
Attention
The SpectrumSimilarity class is recommended over this function.
- Parameters
spec_top (ndarray) – Array containing the experimental spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second.
spec_bottom (ndarray) – Array containing the reference spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second.
t (float) – Numeric value specifying the tolerance used to align the m/z values of the two spectra. Default 0.25.
b (float) – Numeric value specifying the baseline threshold for peak identification, expressed as a percentage of the maximum intensity. Default 10.
top_label (Optional[str]) – String to label the top spectrum. Default None.
bottom_label (Optional[str]) – String to label the bottom spectrum. Default None.
xlim (Tuple[int, int]) – Tuple of length 2 defining the beginning and ending values of the x-axis. Default (50, 1200).
x_threshold (float) – Default 0.
print_alignment (bool) – Whether the intensities should be printed. Default False.
output_list (bool) – Whether the intensities should be returned as a third element of the tuple. Default False.
- Return type
chemistry_tools.units
Functions for handling SI units.
Data:
SI_base_registry | Mapping of SI measurements to their units. |
cm3 | Cubic centimetre |
dimension_codes | Mapping of dimension names to symbols. |
dm | Decimetre |
dm3 | Cubic decimetre |
kilogray | Kilogray |
kilojoule | Kilojoule |
m3 | Cubic metre |
m_math_space | A medium mathematical space, \u205f. |
micromole | Micromole |
molal | Molal (moles per kilogram) |
nanomolar | Nanomolar |
nanomole | Nanomole |
per100eV | Per 100 electronvolts. |
perMolar_perSecond | Per molar per second. |
umol_per_J | Micromole per joule. |
Functions:
allclose | Analogous to numpy.allclose(). |
as_latex | Returns the LaTeX representation of the unit of a quantity. |
compare_equality | Returns True if two arguments are equal. |
format_si_units | Returns the given value, followed by the given units, and separated by a medium mathematical space. |
format_string | Formats a scalar with unit as two strings. |
-
SI_base_registry
= {'amount': UnitSubstance('mole', 'mol'), 'current': UnitCurrent('ampere', 'A'), 'length': UnitLength('meter', 'm'), 'luminous_intensity': UnitLuminousIntensity('candela', 'cd'), 'mass': UnitMass('kilogram', 'kg'), 'temperature': UnitTemperature('Kelvin', 'K'), 'time': UnitTime('second', 's')} Type:
dict
Mapping of SI measurements to their units.
-
allclose
(a, b, rtol=1e-08, atol=None)[source] Analogous to
numpy.allclose()
.
-
as_latex(quant)
Returns the LaTeX representation of the unit of a quantity.
Example:
>>> print(as_latex(1/quantities.kelvin))
\mathrm{\frac{1}{K}}
- Return type
-
cm3
= array(1.) * cm**3 Type:
Quantity
Cubic centimetre
-
compare_equality(a, b)
Returns True if two arguments are equal.
Both arguments need to have the same dimensionality.
Examples:
>>> km, m = quantities.kilometre, quantities.metre
>>> compare_equality(3*km, 3)
False
>>> compare_equality(3*km, 3000*m)
True
-
dimension_codes
= {'amount': 'N', 'current': 'I', 'length': 'L', 'mass': 'M', 'temperature': 'Θ', 'time': 'T'} Type:
dict
Mapping of dimension names to symbols.
-
dm
= UnitQuantity('decimetre', 0.1 * m) Type:
UnitQuantity
Decimetre
-
dm3
= array(1.) * decimetre**3 Type:
Quantity
Cubic decimetre
-
format_si_units
(value, *units)[source] Returns the given value, followed by the given units, and separated by a medium mathematical space.
New in version 0.4.0.
- Return type
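As a rough illustration of the behaviour described for format_si_units, the sketch below joins a value and its units with the medium mathematical space (U+205F). `format_with_units` is a hypothetical stand-in, and separating multiple units with the same space is an assumption.

```python
M_MATH_SPACE = "\u205f"  # medium mathematical space


def format_with_units(value, *units) -> str:
    """Join the value and each unit with a medium mathematical space."""
    return M_MATH_SPACE.join(str(part) for part in (value, *units))


print(format_with_units(1.5, "kg"))
print(repr(format_with_units(5, "mL", "min")))
```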
-
format_string
(value, precision='%.5g', tex=False)[source] Formats a scalar with unit as two strings.
Examples:
>>> print(' '.join(format_string(0.42*quantities.mol/decimetre**3)))
0.42 mol/decimetre**3
>>> print(' '.join(format_string(2/quantities.s, tex=True)))
2 \mathrm{\frac{1}{s}}
-
kilogray
= UnitQuantity('kilogray', 1000.0 * Gy) Type:
UnitQuantity
Kilogray
-
kilojoule
= UnitQuantity('kilojoule', 1000.0 * J) Type:
UnitQuantity
Kilojoule
-
m3
= array(1.) * m**3 Type:
Quantity
Cubic metre
-
m_math_space
= '\u205f' Type:
str
A medium mathematical space, \u205f.
New in version 0.4.0.
-
micromole
= UnitQuantity('micromole', 1e-06 * mol) Type:
UnitQuantity
Micromole
-
molal
= UnitQuantity('molal', 1.0 * mol/kg) Type:
UnitQuantity
Molal (moles per kilogram)
-
nanomolar
= UnitQuantity('nM', 1e-06 * mol/m**3) Type:
UnitQuantity
Nanomolar
-
nanomole
= UnitQuantity('nanomole', 1e-09 * mol) Type:
UnitQuantity
Nanomole
-
per100eV
= UnitQuantity('per_100_eV', 0.01 * 1/(N_A*eV)) Type:
UnitQuantity
Per 100 electronVolts.
-
perMolar_perSecond
= array(1.) * 1/(s*M) Type:
Quantity
Per Molar per second.
-
umol_per_J
= array(1.) * umol/J Type:
Quantity
Micro mole per joule.
Contributing
chemistry_tools uses tox to automate testing and packaging, and pre-commit to maintain code quality.
Install pre-commit with pip and install the git hook:
python -m pip install pre-commit
pre-commit install
Coding style
formate is used for code formatting.
It can be run manually via pre-commit:
pre-commit run formate -a
Or, to run the complete autoformatting suite:
pre-commit run -a
Automated tests
Tests are run with tox and pytest.
To run tests for a specific Python version, such as Python 3.6:
tox -e py36
To run tests for all Python versions, simply run:
tox
Build documentation locally
The documentation is powered by Sphinx. A local copy of the documentation can be built with tox:
tox -e docs
Downloading source code
The chemistry_tools source code is available on GitHub, and can be accessed from the following URL: https://github.com/domdfcoding/chemistry_tools
If you have git installed, you can clone the repository with the following command:
git clone https://github.com/domdfcoding/chemistry_tools
Cloning into 'chemistry_tools'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 173 (delta 16), reused 17 (delta 6), pack-reused 126
Receiving objects: 100% (173/173), 126.56 KiB | 678.00 KiB/s, done.
Resolving deltas: 100% (66/66), done.

Downloading a ‘zip’ file of the source code
Building from source
The recommended way to build chemistry_tools is to use tox:
tox -e build
The source and wheel distributions will be in the directory dist.
If you wish, you may also use pep517.build or another PEP 517-compatible build tool.
License
chemistry_tools is licensed under the GNU Lesser General Public License v3.0.
Permissions of this copyleft license are conditioned on making available complete source code of licensed works and modifications under the same license or the GNU GPLv3. Copyright and license notices must be preserved. Contributors provide an express grant of patent rights. However, a larger work using the licensed work through interfaces provided by the licensed work may be distributed under different terms and without source code for the larger work.
GNU LESSER GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.
0. Additional Definitions.
As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.
"The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.
An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.
A "Combined Work" is a work produced by combining or linking an
Application with the Library. The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".
The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.
The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.
1. Exception to Section 3 of the GNU GPL.
You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.
2. Conveying Modified Versions.
If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:
a) under this License, provided that you make a good faith effort to
ensure that, in the event an Application does not supply the
function or data, the facility still operates, and performs
whatever part of its purpose remains meaningful, or
b) under the GNU GPL, with none of the additional permissions of
this License applicable to that copy.
3. Object Code Incorporating Material from Library Header Files.
The object code form of an Application may incorporate material from
a header file that is part of the Library. You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:
a) Give prominent notice with each copy of the object code that the
Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the object code with a copy of the GNU GPL and this license
document.
4. Combined Works.
You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:
a) Give prominent notice with each copy of the Combined Work that
the Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the Combined Work with a copy of the GNU GPL and this license
document.
c) For a Combined Work that displays copyright notices during
execution, include the copyright notice for the Library among
these notices, as well as a reference directing the user to the
copies of the GNU GPL and this license document.
d) Do one of the following:
0) Convey the Minimal Corresponding Source under the terms of this
License, and the Corresponding Application Code in a form
suitable for, and under terms that permit, the user to
recombine or relink the Application with a modified version of
the Linked Version to produce a modified Combined Work, in the
manner specified by section 6 of the GNU GPL for conveying
Corresponding Source.
1) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (a) uses at run time
a copy of the Library already present on the user's computer
system, and (b) will operate properly with a modified version
of the Library that is interface-compatible with the Linked
Version.
e) Provide Installation Information, but only if you would otherwise
be required to provide such information under section 6 of the
GNU GPL, and only to the extent that such information is
necessary to install and execute a modified version of the
Combined Work produced by recombining or relinking the
Application with a modified version of the Linked Version. (If
you use option 4d0, the Installation Information must accompany
the Minimal Corresponding Source and Corresponding Application
Code. If you use option 4d1, you must provide the Installation
Information in the manner specified by section 6 of the GNU GPL
for conveying Corresponding Source.)
5. Combined Libraries.
You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:
a) Accompany the combined library with a copy of the same work based
on the Library, uncombined with any other library facilities,
conveyed under the terms of this License.
b) Give prominent notice with the combined library that part of it
is a work based on the Library, and explaining where to find the
accompanying uncombined form of the same work.
6. Revised Versions of the GNU Lesser General Public License.
The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.
If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.