pyms.GCMS

Module to handle raw data.

pyms.GCMS.Class

Class to model GC-MS data.

Classes:

GCMS_data(time_list, scan_list)

Generic object for GC-MS data.

Data:

IntStr

Invariant TypeVar constrained to int and str.

class GCMS_data(time_list, scan_list)

Bases: pymsBaseClass, TimeListMixin, MaxMinMassMixin, GetIndexTimeMixin

Generic object for GC-MS data.

Contains the raw data as a list of scans and a list of times.

Parameters:
  • time_list – List of scan retention times.

  • scan_list – List of Scan objects.

Authors:

Qiao Wang, Andrew Isaac, Vladimir Likic, Dominic Davis-Foster (type assertions and properties)

Methods:

__eq__(other)

Return whether this GCMS_data object is equal to another object.

__len__()

Returns the length of the data object, defined as the number of scans.

__repr__()

Return a string representation of the GCMS_data.

__str__()

Return str(self).

dump(file_name[, protocol])

Dumps an object to a file through pickle.dump().

get_index_at_time(time)

Returns the nearest index corresponding to the given time.

get_time_at_index(ix)

Returns time at given index.

info([print_scan_n])

Prints some information about the data.

trim([begin, end])

Trims data in the time domain.

write(file_root)

Writes the entire raw data to two CSV files.

write_intensities_stream(file_name)

Loop over all scans and, for each scan, write the intensities to the given file, one intensity per line.

Attributes:

max_mass

Returns the maximum m/z value in the spectrum.

max_rt

Returns the maximum retention time for the data in seconds.

min_mass

Returns the minimum m/z value in the spectrum.

min_rt

Returns the minimum retention time for the data in seconds.

scan_list

Return a list of the scan objects.

tic

Returns the total ion chromatogram.

time_list

Return a copy of the time list.

time_step

Returns the time step of the data.

time_step_std

Returns the standard deviation of the time step of the data.

__eq__(other)

Return whether this GCMS_data object is equal to another object.

Parameters:

other – The other object to test equality with.

Return type:

bool

__len__()

Returns the length of the data object, defined as the number of scans.

Author:

Vladimir Likic

Return type:

int

__repr__()

Return a string representation of the GCMS_data.

Return type:

str

__str__()

Return str(self).

Return type:

str

dump(file_name, protocol=3)

Dumps an object to a file through pickle.dump().

Parameters:
  • file_name (Union[str, Path, PathLike]) – Filename to save the dump as.

  • protocol (int) – The pickle protocol to use. Default 3.

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib and pickle protocol support)
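
The behaviour described for dump amounts to a pickle round trip. The following is a minimal sketch using only the standard library, with a plain dictionary standing in for a GCMS_data object. The helper name dump_object and the file name data.dump are hypothetical and not part of pyms.

```python
import pickle
import tempfile
from pathlib import Path


def dump_object(obj, file_name, protocol=3):
    """Pickle ``obj`` to ``file_name`` (hypothetical stand-in for ``dump``)."""
    with Path(file_name).open("wb") as fp:
        pickle.dump(obj, fp, protocol)


# A plain dict stands in for a GCMS_data object (assumption for illustration).
data = {"time_list": [0.0, 0.5, 1.0], "n_scans": 3}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.dump"
    dump_object(data, target)

    # Reading the file back with pickle.load() restores an equal object.
    with target.open("rb") as fp:
        restored = pickle.load(fp)
```

Protocol 3 is readable by any Python 3 interpreter, which is presumably why it is the documented default.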

get_index_at_time(time)

Returns the nearest index corresponding to the given time.

Parameters:

time (float) – Time in seconds

Return type:

int

Returns:

Nearest index corresponding to given time

Authors:

Lewis Lee, Tim Erwin, Vladimir Likic

Changed in version 2.3.0: Now returns -1 if no index is found.
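
A nearest-index lookup of this kind can be sketched in a few lines. The function nearest_index below is a hypothetical illustration, not the library's implementation. It returns -1 only when the time list is empty; how pyms treats times outside the recorded range is not stated here.

```python
from typing import Sequence


def nearest_index(time_list: Sequence[float], time: float) -> int:
    """Return the index of the entry closest to ``time``, or -1 if there is none."""
    best_index = -1
    best_diff = float("inf")
    for index, value in enumerate(time_list):
        diff = abs(value - time)
        if diff < best_diff:
            best_index, best_diff = index, diff
    return best_index


times = [0.0, 0.5, 1.0, 1.5]   # retention times in seconds
idx = nearest_index(times, 0.7)  # 0.5 s is the closest entry
empty = nearest_index([], 0.7)   # nothing to match against
```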

get_time_at_index(ix)

Returns time at given index.

Parameters:

ix (int)

Authors:

Lewis Lee, Vladimir Likic

Return type:

float

info(print_scan_n=False)

Prints some information about the data.

Parameters:

print_scan_n (bool) – If set to True will print the number of m/z values in each scan. Default False.

Author:

Vladimir Likic

property max_mass

Returns the maximum m/z value in the spectrum.

Author:

Andrew Isaac

Return type:

Optional[float]

property max_rt

Returns the maximum retention time for the data in seconds.

Return type:

float

property min_mass

Returns the minimum m/z value in the spectrum.

Author:

Andrew Isaac

Return type:

Optional[float]

property min_rt

Returns the minimum retention time for the data in seconds.

Return type:

float

property scan_list

Return a list of the scan objects.

Authors:

Qiao Wang, Andrew Isaac, Vladimir Likic

Return type:

List[Scan]

property tic

Returns the total ion chromatogram.

Author:

Andrew Isaac

Return type:

IonChromatogram
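
A total ion chromatogram holds, for each scan, the sum of all intensities recorded in that scan. A minimal sketch of that summation, with plain lists standing in for Scan objects and for the returned IonChromatogram (both stand-ins are assumptions for illustration):

```python
# Each inner list holds the intensities recorded in one scan.
scans = [
    [10.0, 20.0, 5.0],
    [7.0, 3.0],
    [1.0, 2.0, 3.0, 4.0],
]

# One summed intensity per scan, in scan order.
tic_values = [sum(intensities) for intensities in scans]
```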

property time_list

Return a copy of the time list.

Return type:

List[float]

property time_step

Returns the time step of the data.

Return type:

float

property time_step_std

Returns the standard deviation of the time step of the data.

Return type:

float
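
One plausible reading of time_step and time_step_std is the mean and standard deviation of the differences between consecutive retention times. The sketch below assumes exactly that, and uses the sample standard deviation; the estimator used by pyms is not stated here.

```python
import statistics

time_list = [0.0, 0.5, 1.0, 1.6, 2.0]  # retention times in seconds

# Differences between consecutive retention times.
steps = [later - earlier for earlier, later in zip(time_list, time_list[1:])]

time_step = statistics.mean(steps)       # average spacing between scans
time_step_std = statistics.stdev(steps)  # sample standard deviation (assumption)
```

A small standard deviation relative to the time step indicates evenly spaced scans.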

trim(begin=None, end=None)

Trims data in the time domain.

The arguments begin and end can be either integers (in which case they are taken as the first/last scan number for trimming) or strings, in which case they are treated as time strings and converted to scan numbers.

At present, begin and end must be of the same type: either both scan numbers or both time strings.

At least one of begin and end is required.

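Parameters:
  • begin – The first scan number, or the start time as a time string. Default None.

  • end – The last scan number, or the end time as a time string. Default None.

Author: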

Vladimir Likic
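
The rules above (at least one bound, both bounds of the same type, time strings converted to scan numbers) can be sketched as follows. The names time_str_to_seconds and resolve_bounds are hypothetical. The '<NUMBER>s' / '<NUMBER>m' time string format and the 1-based scan numbering are assumptions, not confirmed behaviour of trim.

```python
from typing import Optional, Sequence, Tuple, Union


def time_str_to_seconds(time_str: str) -> float:
    """Convert '<NUMBER>s' or '<NUMBER>m' to seconds (assumed format)."""
    value, unit = float(time_str[:-1]), time_str[-1].lower()
    if unit == "s":
        return value
    if unit == "m":
        return value * 60.0
    raise ValueError(f"unrecognised time string: {time_str!r}")


def resolve_bounds(
    time_list: Sequence[float],
    begin: Optional[Union[int, str]] = None,
    end: Optional[Union[int, str]] = None,
) -> Tuple[int, int]:
    """Return (first_scan, last_scan) as 1-based scan numbers (assumption)."""
    if begin is None and end is None:
        raise ValueError("at least one of 'begin' and 'end' is required")
    if begin is not None and end is not None and type(begin) is not type(end):
        raise TypeError("'begin' and 'end' must be of the same type")

    def to_scan(bound, default):
        if bound is None:
            return default
        if isinstance(bound, int):
            return bound
        seconds = time_str_to_seconds(bound)
        # Index of the retention time closest to the requested time.
        nearest = min(range(len(time_list)), key=lambda i: abs(time_list[i] - seconds))
        return nearest + 1

    return to_scan(begin, 1), to_scan(end, len(time_list))


times = [0.0, 30.0, 60.0, 90.0, 120.0]  # retention times in seconds
by_time = resolve_bounds(times, "1m", "1.5m")
by_scan = resolve_bounds(times, 2, 4)
open_ended = resolve_bounds(times, begin="0.5m")
```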

write(file_root)

Writes the entire raw data to two CSV files:

  • <file_root>.I.csv, containing the intensities; and

  • <file_root>.mz.csv, containing the corresponding m/z values.

In general these are not two-dimensional matrices, because different scans may have different numbers of m/z values recorded.

Parameters:

file_root (Union[str, Path, PathLike]) – The root for the output file names

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)
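
To illustrate why the two files are not rectangular matrices, here is a minimal sketch that writes ragged scan data to a pair of CSV files with the suffixes named above. The helper write_ragged_csv and the one-scan-per-row layout are assumptions for illustration; the exact layout produced by pyms may differ.

```python
import csv
import tempfile
from pathlib import Path


def write_ragged_csv(file_root, scans):
    """Write intensities to <file_root>.I.csv and m/z values to <file_root>.mz.csv."""
    root = str(file_root)
    with open(root + ".I.csv", "w", newline="") as i_fp, \
            open(root + ".mz.csv", "w", newline="") as mz_fp:
        i_writer = csv.writer(i_fp)
        mz_writer = csv.writer(mz_fp)
        for masses, intensities in scans:
            mz_writer.writerow(masses)      # one scan per row (assumed layout)
            i_writer.writerow(intensities)


# Two scans with different numbers of recorded m/z values.
scans = [
    ([50.0, 51.0, 52.0], [10.0, 20.0, 5.0]),
    ([50.0, 52.0], [7.0, 3.0]),
]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "sample"
    write_ragged_csv(root, scans)

    # Rows have different lengths, so the file is not a rectangular matrix.
    with open(f"{root}.I.csv", newline="") as fp:
        row_lengths = [len(row) for row in csv.reader(fp)]
    written = sorted(p.name for p in Path(tmp).iterdir())
```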

write_intensities_stream(file_name)

Loop over all scans and, for each scan, write the intensities to the given file, one intensity per line.

Intensities from different scans are joined without any delimiters.

Parameters:

file_name (Union[str, Path, PathLike]) – Output file name.

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)
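
The resulting file is a flat stream: once written, scan boundaries cannot be recovered from it. A minimal sketch of the loop, with plain lists standing in for Scan objects; the number format and the file name intensities.txt are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Intensities for three scans of different lengths.
scans = [[10.0, 20.0], [5.0], [1.0, 2.0, 3.0]]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "intensities.txt"
    with target.open("w") as fp:
        for intensities in scans:
            for value in intensities:
                # One intensity per line; nothing marks where a scan ends.
                fp.write(f"{value:.4f}\n")

    lines = target.read_text().splitlines()
```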

IntStr = TypeVar('IntStr', int, str)

Type: TypeVar

Invariant TypeVar constrained to int and str.
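
A constrained TypeVar lets a signature say "both arguments are int, or both are str" without accepting a mixture. The sketch below re-declares IntStr locally so that it runs without pyms; the function pick_first is hypothetical.

```python
from typing import TypeVar

# Local re-declaration for illustration; pyms defines its own IntStr.
IntStr = TypeVar("IntStr", int, str)


def pick_first(first: IntStr, second: IntStr) -> IntStr:
    """Return ``first``; both arguments must resolve to the same constraint."""
    return first


scan_bound = pick_first(10, 200)       # IntStr resolves to int
time_bound = pick_first("1m", "2.5m")  # IntStr resolves to str

# A static type checker such as mypy is expected to reject a mixed call like
# pick_first(10, "2.5m"). Nothing is enforced at runtime.
constraints = IntStr.__constraints__
```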

pyms.GCMS.Function

Provides conversion and information functions for GC-MS data objects.

Functions:

diff(data1, data2)

Compares two GCMS_data objects.

ic_window_points(ic, window_sele[, half_window])

Converts the window selection parameter into points based on the time step in an ion chromatogram.

diff(data1, data2)

Compares two GCMS_data objects.

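Parameters:
  • data1 (GCMS_data) – The first GCMS_data object.

  • data2 (GCMS_data) – The second GCMS_data object.

Authors: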

Qiao Wang, Andrew Isaac, Vladimir Likic

ic_window_points(ic, window_sele, half_window=False)

Converts the window selection parameter into points based on the time step in an ion chromatogram.

Parameters:
  • ic (IonChromatogram) – ion chromatogram object relevant for the conversion

  • window_sele (Union[int, str]) – The window selection parameter. This can be an integer or a time string. If an integer, it is taken as the number of points. If a string, it must be of the form '<NUMBER>s' or '<NUMBER>m', specifying a time in seconds or minutes, respectively.

  • half_window (bool) – Specifies whether to return half-window. Default False.

Author:

Vladimir Likic

Return type:

int
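
The conversion described above can be sketched as follows. The function window_points is a hypothetical stand-in that takes the time step directly instead of an IonChromatogram. The rounding (floor) and the halving rule are assumptions, and the validation pyms performs on the result is omitted.

```python
import math
from typing import Union


def window_points(time_step: float, window_sele: Union[int, str],
                  half_window: bool = False) -> int:
    """Convert a window given in points or as a time string to a point count."""
    if isinstance(window_sele, int):
        # Already a number of points; optionally halve it.
        return window_sele // 2 if half_window else window_sele

    # Time string: '<NUMBER>s' for seconds, '<NUMBER>m' for minutes.
    value, unit = float(window_sele[:-1]), window_sele[-1].lower()
    if unit not in ("s", "m"):
        raise ValueError(f"unrecognised time string: {window_sele!r}")
    seconds = value * 60.0 if unit == "m" else value
    if half_window:
        seconds /= 2
    return math.floor(seconds / time_step)


full_from_points = window_points(0.5, 7)
half_from_points = window_points(0.5, 7, half_window=True)
full_from_seconds = window_points(0.5, "10s")
half_from_minutes = window_points(0.5, "1m", half_window=True)
```

With a time step of 0.5 s, a 10 s window spans 20 points, and half of a one-minute window spans 60 points.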

pyms.GCMS.IO

Input/output functions for GC-MS data files.

pyms.GCMS.IO.ANDI

Functions for reading ANDI-MS data files.

Functions:

ANDI_reader(file_name)

A reader for ANDI-MS NetCDF files.

ANDI_reader(file_name)

A reader for ANDI-MS NetCDF files.

Parameters:

file_name (Union[str, Path, PathLike]) – The path of the ANDI-MS file

Return type:

GCMS_data

Returns:

GC-MS data object

Authors:

Qiao Wang, Andrew Isaac, Vladimir Likic, Dominic Davis-Foster

pyms.GCMS.IO.JCAMP

Functions for I/O of data in JCAMP-DX format.

Functions:

JCAMP_reader(file_name)

Generic reader for JCAMP-DX files.

JCAMP_reader(file_name)

Generic reader for JCAMP-DX files.

Parameters:

file_name (Union[str, Path]) – Path of the file to read

Return type:

GCMS_data

Returns:

GC-MS data object

Authors:

Qiao Wang, Andrew Isaac, Vladimir Likic, David Kainer, Dominic Davis-Foster (pathlib support)

pyms.GCMS.IO.MZML

Functions for reading mzML format data files.

Functions:

mzML_reader(file_name)

A reader for mzML files.

mzML_reader(file_name)

A reader for mzML files.

Parameters:

file_name (Union[str, Path, PathLike]) – The name of the mzML file.

Return type:

GCMS_data

Returns:

GC-MS data object.

Authors:

Sean O’Callaghan, Dominic Davis-Foster (pathlib support)