pyms.GCMS

Module to handle raw data.

pyms.GCMS.Class

Class to model GC-MS data.

Classes:

GCMS_data(time_list, scan_list)

Generic object for GC-MS data.

Data:

IntStr

Invariant TypeVar constrained to int and str.

class GCMS_data(time_list, scan_list)

Bases: pymsBaseClass, TimeListMixin, MaxMinMassMixin, GetIndexTimeMixin

Generic object for GC-MS data.

Contains the raw data as a list of scans and a list of times.

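Parameters
  • time_list (List[float]) – List of scan retention times.

  • scan_list (List[Scan]) – List of Scan objects.

Authors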

Qiao Wang, Andrew Isaac, Vladimir Likic, Dominic Davis-Foster (type assertions and properties)

Methods:

__eq__(other)

Return whether this GCMS_data object is equal to another object.

__len__()

Returns the length of the data object, defined as the number of scans.

__repr__()

Return a string representation of the GCMS_data.

__str__()

Return str(self).

dump(file_name[, protocol])

Dumps an object to a file through pickle.dump().

get_index_at_time(time)

Returns the nearest index corresponding to the given time.

get_time_at_index(ix)

Returns time at given index.

info([print_scan_n])

Prints some information about the data.

trim([begin, end])

Trims data in the time domain.

write(file_root)

Writes the entire raw data to two CSV files.

write_intensities_stream(file_name)

Loop over all scans and, for each scan, write the intensities to the given file, one intensity per line.

Attributes:

max_mass

Returns the maximum m/z value in the spectrum.

max_rt

Returns the maximum retention time for the data in seconds.

min_mass

Returns the minimum m/z value in the spectrum.

min_rt

Returns the minimum retention time for the data in seconds.

scan_list

Return a list of the scan objects.

tic

Returns the total ion chromatogram.

time_list

Return a copy of the time list.

time_step

Returns the time step of the data.

time_step_std

Returns the standard deviation of the time step of the data.

__eq__(other)

Return whether this GCMS_data object is equal to another object.

Parameters

other – The other object to test equality with.

Return type

bool

__len__()

Returns the length of the data object, defined as the number of scans.

Author

Vladimir Likic

Return type

int

__repr__()

Return a string representation of the GCMS_data.

Return type

str

__str__()

Return str(self).

Return type

str

dump(file_name, protocol=3)

Dumps an object to a file through pickle.dump().

Parameters
  • file_name (Union[str, Path, PathLike]) – Filename to save the dump as.

  • protocol (int) – The pickle protocol to use. Default 3.

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib and pickle protocol support)
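
Because dump() is documented as a wrapper around pickle.dump(), a dumped object can presumably be restored with pickle.load(). The round trip below is a minimal sketch using only the standard library; the dictionary stands in for a data object and the file name is hypothetical. Restoring a real GCMS_data object would additionally require pyms to be importable when unpickling.

```python
import os
import pickle
import tempfile

# Stand-in for a data object; any picklable object behaves the same way.
payload = {"time_list": [0.0, 1.0, 2.0], "n_scans": 3}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl")  # hypothetical file name
    with open(path, "wb") as fp:
        # protocol=3 matches the documented default of dump()
        pickle.dump(payload, fp, protocol=3)
    with open(path, "rb") as fp:
        restored = pickle.load(fp)

print(restored == payload)  # True
```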

get_index_at_time(time)

Returns the nearest index corresponding to the given time.

Parameters

time (float) – Time in seconds

Return type

int

Returns

Nearest index corresponding to given time

Authors

Lewis Lee, Tim Erwin, Vladimir Likic

Changed in version 2.3.0: Now returns -1 if no index is found.
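
The lookup can be pictured as a nearest-neighbour search over the time list. The following is a minimal sketch, not the library's implementation: the function name nearest_index is hypothetical, and treating an empty time list as the "no index found" case is an assumption.

```python
from typing import Sequence


def nearest_index(time_list: Sequence[float], time: float) -> int:
    """Return the index of the entry in time_list closest to time (hypothetical helper)."""
    best_ix = -1  # assumption: -1 signals that no index was found
    best_diff = float("inf")
    for ix, t in enumerate(time_list):
        diff = abs(t - time)
        if diff < best_diff:  # strict comparison keeps the earliest index on ties
            best_ix, best_diff = ix, diff
    return best_ix


times = [0.0, 1.0, 2.0, 3.0]
print(nearest_index(times, 1.4))  # 1
print(nearest_index([], 1.4))     # -1
```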

get_time_at_index(ix)

Returns time at given index.

Parameters

ix (int)

Authors

Lewis Lee, Vladimir Likic

Return type

float

info(print_scan_n=False)

Prints some information about the data.

Parameters

print_scan_n (bool) – If set to True will print the number of m/z values in each scan. Default False.

Author

Vladimir Likic
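
When a summary is needed as a string rather than printed output, one can be assembled from the documented members (len(), min_rt, max_rt). The sketch below is illustrative only: summarise and StubData are hypothetical names, and StubData merely mimics the documented attributes so that the example runs without pyms installed.

```python
class StubData:
    """Stand-in exposing the documented members used below (not a pyms class)."""

    def __init__(self, time_list):
        self.time_list = list(time_list)
        self.min_rt = min(self.time_list)
        self.max_rt = max(self.time_list)

    def __len__(self):
        # The stub has one time per scan, so this equals the number of scans.
        return len(self.time_list)


def summarise(data) -> str:
    """Build a one-line summary from len(), min_rt and max_rt (hypothetical helper)."""
    return f"{len(data)} scans from {data.min_rt:.1f} s to {data.max_rt:.1f} s"


print(summarise(StubData([1.0, 2.0, 3.5])))  # 3 scans from 1.0 s to 3.5 s
```

The same function should accept any object that provides these three members, such as the object returned by one of the reader functions.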

property max_mass

Returns the maximum m/z value in the spectrum.

Author

Andrew Isaac

Return type

Optional[float]

property max_rt

Returns the maximum retention time for the data in seconds.

Return type

float

property min_mass

Returns the minimum m/z value in the spectrum.

Author

Andrew Isaac

Return type

Optional[float]

property min_rt

Returns the minimum retention time for the data in seconds.

Return type

float

property scan_list

Return a list of the scan objects.

Authors

Qiao Wang, Andrew Isaac, Vladimir Likic

Return type

List[Scan]

property tic

Returns the total ion chromatogram.

Author

Andrew Isaac

Return type

IonChromatogram
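
A total ion chromatogram records, for each scan, the sum of all intensities in that scan. The sketch below shows that calculation on plain lists; total_ion_current is a hypothetical name, and the real property returns an IonChromatogram object rather than a list.

```python
from typing import List, Sequence


def total_ion_current(scan_intensities: Sequence[Sequence[float]]) -> List[float]:
    """Sum the intensities of each scan (hypothetical helper)."""
    return [sum(intensities) for intensities in scan_intensities]


# Three scans with different numbers of recorded m/z values.
scans = [[10.0, 20.0, 5.0], [0.0, 50.0], [7.0]]
tic_values = total_ion_current(scans)
print(tic_values)  # [35.0, 50.0, 7.0]
```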

property time_list

Return a copy of the time list.

Return type

List[float]

property time_step

Returns the time step of the data.

Return type

float

property time_step_std

Returns the standard deviation of the time step of the data.

Return type

float
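
One plausible reading of time_step and time_step_std is the mean and standard deviation of the differences between consecutive scan times. The sketch below rests on that assumption, and on the further assumption that the population standard deviation is meant; the library may compute either value differently. The name time_step_stats is hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def time_step_stats(time_list: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population std dev) of consecutive time differences."""
    diffs = [later - earlier for earlier, later in zip(time_list, time_list[1:])]
    return statistics.mean(diffs), statistics.pstdev(diffs)


step, step_std = time_step_stats([0.0, 0.5, 1.0, 1.5])
print(step, step_std)  # 0.5 0.0
```

A standard deviation close to zero indicates evenly spaced scans.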

trim(begin=None, end=None)

Trims data in the time domain.

The arguments begin and end can be either integers (in which case they are taken as the first/last scan number for trimming) or strings in which case they are treated as time strings and converted to scan numbers.

At present both begin and end must be of the same type, either both scan numbers or time strings.

At least one of begin and end is required.

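Parameters
  • begin (Optional[IntStr]) – The first scan number, or the start time as a time string. Default None.

  • end (Optional[IntStr]) – The last scan number, or the end time as a time string. Default None.

Author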

Vladimir Likic
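
The type rule described above (both arguments integers, or both time strings) can be sketched as follows. This is not the library's code. It assumes time strings take the form '<NUMBER>s' or '<NUMBER>m', that a time maps to the nearest entry of the time list, and that positions are zero-based; it also covers only the case where both bounds are given. The names time_str_to_seconds and resolve_bounds are hypothetical.

```python
from typing import Sequence, Tuple, Union


def time_str_to_seconds(time_str: str) -> float:
    """Parse '<NUMBER>s' or '<NUMBER>m' into seconds (hypothetical helper)."""
    if len(time_str) < 2:
        raise ValueError(f"unrecognised time string: {time_str!r}")
    value, unit = time_str[:-1], time_str[-1].lower()
    if unit == "s":
        return float(value)
    if unit == "m":
        return float(value) * 60.0
    raise ValueError(f"unrecognised time string: {time_str!r}")


def resolve_bounds(
    time_list: Sequence[float],
    begin: Union[int, str],
    end: Union[int, str],
) -> Tuple[int, int]:
    """Map a pair of integers, or a pair of time strings, to scan positions."""
    if isinstance(begin, int) and isinstance(end, int):
        return begin, end  # integers are taken as scan positions unchanged
    if isinstance(begin, str) and isinstance(end, str):
        def nearest(seconds: float) -> int:
            return min(range(len(time_list)), key=lambda ix: abs(time_list[ix] - seconds))
        return nearest(time_str_to_seconds(begin)), nearest(time_str_to_seconds(end))
    raise TypeError("begin and end must both be integers or both be time strings")


times = [0.0, 30.0, 60.0, 90.0, 120.0]
print(resolve_bounds(times, "30s", "1.5m"))  # (1, 3)
print(resolve_bounds(times, 2, 4))           # (2, 4)
```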

write(file_root)

Writes the entire raw data to two CSV files:

  • <file_root>.I.csv, containing the intensities; and

  • <file_root>.mz.csv, containing the corresponding m/z values.

In general these are not two-dimensional matrices, because different scans may have different numbers of m/z values recorded.

Parameters

file_root (Union[str, Path, PathLike]) – The root for the output file names

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)
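
The ragged layout described above means each CSV row can have a different number of fields, so the files are best read row by row rather than loaded as a matrix. The sketch below writes and re-reads such files with the standard csv module. It follows the documented file naming, but the number formatting and the helper name write_ragged_csv are assumptions, not the library's behaviour.

```python
import csv
import os
import tempfile


def write_ragged_csv(file_root, scans):
    """Write <file_root>.I.csv and <file_root>.mz.csv (hypothetical helper).

    scans is a list of (mass_list, intensity_list) pairs; rows may differ in length.
    """
    with open(f"{file_root}.I.csv", "w", newline="") as fp_i, \
            open(f"{file_root}.mz.csv", "w", newline="") as fp_mz:
        writer_i, writer_mz = csv.writer(fp_i), csv.writer(fp_mz)
        for mass_list, intensity_list in scans:
            writer_i.writerow(intensity_list)
            writer_mz.writerow(mass_list)


scans = [([50.0, 51.0, 52.0], [10.0, 0.0, 3.0]), ([50.0, 52.0], [7.0, 1.0])]

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "run1")  # hypothetical file root
    write_ragged_csv(root, scans)
    # Read the intensities back one row (one scan) at a time.
    with open(root + ".I.csv", newline="") as fp:
        row_lengths = [len(row) for row in csv.reader(fp)]

print(row_lengths)  # [3, 2]
```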

write_intensities_stream(file_name)

Loop over all scans and, for each scan, write the intensities to the given file, one intensity per line.

Intensities from different scans are joined without any delimiters.

Parameters

file_name (Union[str, Path, PathLike]) – Output file name.

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)
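
The resulting file is a flat stream: scan boundaries are not recoverable from it. A minimal sketch of the same layout, written to an in-memory buffer, is shown below; the function name write_stream and the number formatting are assumptions.

```python
import io
from typing import Sequence, TextIO


def write_stream(scan_intensities: Sequence[Sequence[float]], fp: TextIO) -> None:
    """Write every intensity of every scan on its own line (hypothetical helper)."""
    for intensities in scan_intensities:
        for value in intensities:
            fp.write(f"{value}\n")  # no delimiter is written between scans


buffer = io.StringIO()
write_stream([[10.0, 20.0], [5.0]], buffer)
lines = buffer.getvalue().splitlines()
print(lines)  # ['10.0', '20.0', '5.0']
```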

IntStr = TypeVar('IntStr', int, str)

Type: TypeVar

Invariant TypeVar constrained to int and str.

pyms.GCMS.Function

Provides conversion and information functions for GC-MS data objects.

Functions:

diff(data1, data2)

Compares two GCMS_data objects.

ic_window_points(ic, window_sele[, half_window])

Converts the window selection parameter into points based on the time step in an ion chromatogram.

diff(data1, data2)

Compares two GCMS_data objects.

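Parameters
  • data1 (GCMS_data) – The first GCMS_data object.

  • data2 (GCMS_data) – The second GCMS_data object.

Authors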

Qiao Wang, Andrew Isaac, Vladimir Likic

ic_window_points(ic, window_sele, half_window=False)

Converts the window selection parameter into points based on the time step in an ion chromatogram.

Parameters
  • ic (IonChromatogram) – ion chromatogram object relevant for the conversion

  • window_sele (Union[int, str]) – The window selection parameter. This can be an integer or a time string. If an integer, it is taken as the number of points. If a string, it must be of the form '<NUMBER>s' or '<NUMBER>m', specifying a time in seconds or minutes, respectively.

  • half_window (bool) – Specifies whether to return half-window. Default False.

Author

Vladimir Likic

Return type

int
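
The conversion amounts to dividing the window duration by the chromatogram's time step. The sketch below takes the time step directly as a float instead of an IonChromatogram, so that it runs without pyms. Rounding down and halving by integer division are assumptions, and the library may validate or round its inputs differently. The name window_points is hypothetical.

```python
import math
from typing import Union


def window_points(time_step: float, window_sele: Union[int, str],
                  half_window: bool = False) -> int:
    """Convert a window given as points or as a time string into points."""
    if isinstance(window_sele, int):
        points = window_sele  # integers are already a number of points
    else:
        value, unit = float(window_sele[:-1]), window_sele[-1]
        if unit == "s":
            seconds = value
        elif unit == "m":
            seconds = value * 60.0
        else:
            raise ValueError(f"unrecognised time string: {window_sele!r}")
        points = int(math.floor(seconds / time_step))  # assumption: round down
    return points // 2 if half_window else points


print(window_points(0.5, "3s"))                 # 6
print(window_points(0.5, "1m"))                 # 120
print(window_points(0.5, 7, half_window=True))  # 3
```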

pyms.GCMS.IO

Input/output functions for GC-MS data files.

pyms.GCMS.IO.ANDI

Functions for reading ANDI-MS data files.

Functions:

ANDI_reader(file_name)

A reader for ANDI-MS NetCDF files.

ANDI_reader(file_name)

A reader for ANDI-MS NetCDF files.

Parameters

file_name (Union[str, Path, PathLike]) – The path of the ANDI-MS file

Return type

GCMS_data

Returns

GC-MS data object

Authors

Qiao Wang, Andrew Isaac, Vladimir Likic, Dominic Davis-Foster

pyms.GCMS.IO.JCAMP

Functions for I/O of data in JCAMP-DX format.

Functions:

JCAMP_reader(file_name)

Generic reader for JCAMP-DX files.

JCAMP_reader(file_name)

Generic reader for JCAMP-DX files.

Parameters

file_name (Union[str, Path]) – Path of the file to read

Return type

GCMS_data

Returns

GC-MS data object

Authors

Qiao Wang, Andrew Isaac, Vladimir Likic, David Kainer, Dominic Davis-Foster (pathlib support)

pyms.GCMS.IO.MZML

Functions for reading mzML format data files.

Functions:

mzML_reader(file_name)

A reader for mzML files.

mzML_reader(file_name)

A reader for mzML files.

Parameters

file_name (Union[str, Path, PathLike]) – The name of the mzML file.

Return type

GCMS_data

Returns

GC-MS data object.

Authors

Sean O’Callaghan, Dominic Davis-Foster (pathlib support)