chemistry_tools.spectrum_similarity

Mass spectrum similarity calculations.

Classes:

SpectrumSimilarity(spec_top, spec_bottom[, …])

Calculate the similarity score for two mass spectra.

Functions:

create_array(intensities, mz)

Create a numpy.ndarray, in a format appropriate for SpectrumSimilarity, from a list of intensities and a list of m/z values.

normalize(row, max_val)

Returns the normalised intensity for each row of a pandas.DataFrame.

spectrum_similarity(spec_top, spec_bottom[, …])

Calculate the similarity score for two mass spectra.

class SpectrumSimilarity(spec_top, spec_bottom, b=1, xlim=(50, 1200))

Calculate the similarity score for two mass spectra.

Parameters
  • spec_top (ndarray) – Array containing the experimental spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second column.

  • spec_bottom (ndarray) – Array containing the reference spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second column.

  • b (float) – numeric value specifying the baseline threshold for peak identification. Expressed as a percent of the maximum intensity. Default 1.

  • xlim (Tuple[int, int]) – tuple of length 2, defining the beginning and ending values of the x-axis. Default (50, 1200).

New in version 1.0.0.
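
The baseline threshold b is described above as a percentage of the maximum intensity. The following is a minimal sketch of how such a threshold could be applied to a two-column peak array. The helper apply_baseline is hypothetical and not part of chemistry_tools, and whether the library keeps or discards a peak lying exactly on the cut-off is an assumption.

```python
import numpy as np


def apply_baseline(spec: np.ndarray, b: float = 1.0) -> np.ndarray:
    """Hypothetical helper: drop peaks below ``b`` percent of the maximum intensity.

    ``spec`` is a two-column array: m/z in column 0, intensity in column 1.
    """
    intensities = spec[:, 1]
    # Convert the percentage into an absolute intensity cut-off.
    cutoff = intensities.max() * b / 100.0
    # Assumption: peaks exactly at the cut-off are kept.
    return spec[intensities >= cutoff]


spec = np.array([
    [57.0, 1000.0],
    [71.0, 5.0],      # 0.5 % of the largest peak, below a 1 % baseline
    [85.0, 250.0],
])
filtered = apply_baseline(spec, b=1.0)
```

With b=1.0 the cut-off is 1 % of 1000, so the peak at m/z 71 is removed and the other two remain.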

Methods:

plot([top_label, bottom_label, filter])

Plot the mass spectra head to tail.

print_alignment()

Print the dataframe giving aligned peaks in the top and bottom spectra.

score()

Returns the similarity score.

plot(top_label=None, bottom_label=None, filter=False)

Plot the mass spectra head to tail.

Parameters
  • top_label (Optional[str]) – string to label the top spectrum. Default None.

  • bottom_label (Optional[str]) – string to label the bottom spectrum. Default None.

print_alignment()

Print the dataframe giving aligned peaks in the top and bottom spectra.

score()

Returns the similarity score.

Return type

Tuple[float, float]
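
This page does not state which formula the score uses, nor what each of the two returned floats represents. As an illustration only, a widely used measure for comparing aligned mass spectra is the cosine of the angle between the two intensity vectors. The helper cosine_score below is hypothetical and should not be read as the library's implementation.

```python
import numpy as np


def cosine_score(top: np.ndarray, bottom: np.ndarray) -> float:
    """Hypothetical helper: cosine similarity of two aligned intensity vectors."""
    denom = np.linalg.norm(top) * np.linalg.norm(bottom)
    if denom == 0:
        # At least one spectrum has no signal, so treat the pair as dissimilar.
        return 0.0
    return float(np.dot(top, bottom) / denom)


# Identical intensity patterns give a score of (approximately) 1.
identical = cosine_score(np.array([100.0, 50.0, 0.0]),
                         np.array([100.0, 50.0, 0.0]))

# Spectra with no peaks in common give a score of 0.
disjoint = cosine_score(np.array([100.0, 0.0]),
                        np.array([0.0, 100.0]))
```

Both vectors must already be aligned, that is, position i in each vector must refer to the same m/z value.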

create_array(intensities, mz)

Create a numpy.ndarray, in a format appropriate for SpectrumSimilarity, from a list of intensities and a list of m/z values.

Parameters
  • intensities – list of intensities.

  • mz – list of m/z values.

Return type

ndarray
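
As a rough sketch of the layout involved, the snippet below pairs a list of m/z values with a list of intensities. It assumes that "a format appropriate for SpectrumSimilarity" means the two-column layout described for spec_top (m/z in the first column, intensities in the second). The function make_peak_array is a hypothetical stand-in, not the library's create_array.

```python
import numpy as np


def make_peak_array(intensities, mz) -> np.ndarray:
    """Hypothetical stand-in: pair m/z values with intensities, one peak per row."""
    if len(intensities) != len(mz):
        raise ValueError("intensities and mz must have the same length")
    # Column 0 holds m/z, column 1 holds intensity.
    return np.column_stack((mz, intensities)).astype(float)


peaks = make_peak_array(intensities=[1000, 5, 250], mz=[57, 71, 85])
```

Note the argument order: intensities come first in the signature, even though m/z occupies the first column of the result.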

normalize(row, max_val)

Returns the normalised intensity for each row of a pandas.DataFrame.

Parameters
Return type

float
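
The scaling itself is not spelled out here. One plausible reading, consistent with the baseline threshold being a percentage of the maximum intensity, is that each intensity is expressed as a percentage of max_val. The sketch below works under that assumption and uses plain Python values rather than DataFrame rows to stay self-contained. The helper normalize_intensity is hypothetical.

```python
def normalize_intensity(intensity: float, max_val: float) -> float:
    """Hypothetical helper: express one intensity as a percentage of ``max_val``."""
    return intensity / max_val * 100.0


intensities = [1000.0, 5.0, 250.0]
max_val = max(intensities)

# Apply the scaling to every peak; the largest peak becomes 100.
normalised = [normalize_intensity(i, max_val) for i in intensities]
```

With a pandas.DataFrame, the same per-row function could be passed to DataFrame.apply with axis=1.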

spectrum_similarity(spec_top, spec_bottom, t=0.25, b=10, top_label=None, bottom_label=None, xlim=(50, 1200), x_threshold=0, print_alignment=False, print_graphic=True, output_list=False)

Calculate the similarity score for two mass spectra.

Attention

The SpectrumSimilarity class is recommended over this function.

Parameters
  • spec_top (ndarray) – Array containing the experimental spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second column.

  • spec_bottom (ndarray) – Array containing the reference spectrum’s peak list, with the m/z values in the first column and the corresponding intensities in the second column.

  • t (float) – numeric value specifying the tolerance used to align the m/z values of the two spectra. Default 0.25.

  • b (float) – numeric value specifying the baseline threshold for peak identification. Expressed as a percent of the maximum intensity. Default 10.

  • top_label (Optional[str]) – string to label the top spectrum. Default None.

  • bottom_label (Optional[str]) – string to label the bottom spectrum. Default None.

  • xlim (Tuple[int, int]) – tuple of length 2, defining the beginning and ending values of the x-axis. Default (50, 1200).

  • x_threshold (float) – Default 0.

  • print_alignment (bool) – whether the intensities should be printed. Default False.

  • print_graphic (bool) – Default True.

  • output_list (bool) – whether the intensities should be returned as a third element of the tuple. Default False.

Return type

Union[Tuple[float, float], Tuple[float, float, DataFrame]]
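
The tolerance t controls how m/z values from the two spectra are matched, but the matching procedure is not described on this page. The following greedy nearest-neighbour sketch shows one way peaks could be aligned within a tolerance. The helper align_peaks and its treatment of unmatched peaks (paired with an intensity of 0) are assumptions made for illustration.

```python
import numpy as np


def align_peaks(top: np.ndarray, bottom: np.ndarray, t: float = 0.25):
    """Hypothetical helper: pair peaks whose m/z values differ by at most ``t``.

    Returns a sorted list of (mz, top_intensity, bottom_intensity) tuples.
    A peak with no partner in the other spectrum gets an intensity of 0 there.
    """
    aligned = []
    used = set()  # indices of bottom peaks that already have a partner
    for mz_top, int_top in top:
        diffs = np.abs(bottom[:, 0] - mz_top)
        j = int(np.argmin(diffs))  # closest bottom peak
        if diffs[j] <= t and j not in used:
            used.add(j)
            aligned.append((float(mz_top), float(int_top), float(bottom[j, 1])))
        else:
            aligned.append((float(mz_top), float(int_top), 0.0))
    # Bottom peaks that were never matched still appear in the alignment.
    for j, (mz_bot, int_bot) in enumerate(bottom):
        if j not in used:
            aligned.append((float(mz_bot), 0.0, float(int_bot)))
    return sorted(aligned)


top = np.array([[57.0, 100.0], [85.1, 40.0]])
bottom = np.array([[57.2, 90.0], [99.0, 10.0]])
alignment = align_peaks(top, bottom, t=0.25)
```

Here the peaks at m/z 57.0 and 57.2 fall within the tolerance and are paired, while 85.1 and 99.0 have no partner and are kept with a zero intensity on the opposite side.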