pyms.Utils

Utility functions for use throughout PyMassSpec.

pyms.Utils.IO

General I/O functions.

Functions:

dump_object(obj, file_name)

Dumps an object to a file through pickle.dump().

file_lines(file_name[, strip])

Returns lines from a file, as a list.

load_object(file_name)

Loads an object previously dumped with dump_object().

prepare_filepath(file_name[, mkdirs])

Convert string filename into pathlib.Path object and create parent directories if required.

save_data(file_name, data[, format_str, ...])

Saves a list of numbers or a list of lists of numbers to a file with specific formatting.

dump_object(obj, file_name)[source]

Dumps an object to a file through pickle.dump().

Parameters:
  • obj (Any) – Object to be dumped

  • file_name (Union[str, Path, PathLike]) – Name of the file for the object dump

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)

file_lines(file_name, strip=False)[source]

Returns lines from a file, as a list.

Parameters:
  • file_name (Union[str, Path, PathLike]) – Name of a file

  • strip (bool) – If True, lines are pre-processed: newline characters and leading and trailing whitespace are removed, and lines starting with ‘#’ are discarded. Default False.

Return type:

List[str]

Returns:

A list of lines

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)
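The effect of strip can be illustrated with a small stand-in. This is a sketch written against the standard library only, not the library's own code; file_lines_sketch is a hypothetical name, and whether blank lines are also dropped is not stated above, so the sketch keeps them.

```python
import tempfile
from pathlib import Path


def file_lines_sketch(file_name, strip=False):
    """Hypothetical stand-in that mirrors the documented behaviour."""
    with open(file_name, encoding="utf-8") as fp:
        lines = fp.readlines()
    if strip:
        # Remove newlines and surrounding whitespace, then drop '#' lines.
        lines = [line.strip() for line in lines]
        lines = [line for line in lines if not line.startswith("#")]
    return lines


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("# header\n  a b  \nc\n", encoding="utf-8")
    raw = file_lines_sketch(sample)                 # lines kept verbatim
    cleaned = file_lines_sketch(sample, strip=True)  # comment line removed
```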

load_object(file_name)[source]

Loads an object previously dumped with dump_object().

Parameters:

file_name (Union[str, Path, PathLike]) – Name of the object dump file.

Return type:

object

Returns:

Object contained in the file.

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)
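The dump and load round trip described above can be sketched with pickle directly. The helper names dump_sketch and load_sketch are hypothetical stand-ins, and the file name and sample object are made up for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def dump_sketch(obj, file_name):
    """Hypothetical stand-in: pickle obj into file_name."""
    with open(file_name, "wb") as fp:
        pickle.dump(obj, fp)


def load_sketch(file_name):
    """Hypothetical stand-in: unpickle the object stored in file_name."""
    with open(file_name, "rb") as fp:
        return pickle.load(fp)


with tempfile.TemporaryDirectory() as tmp:
    dump_path = Path(tmp) / "peak.dump"
    original = {"rt": 12.5, "ions": [73, 147]}
    dump_sketch(original, dump_path)
    restored = load_sketch(dump_path)
```

As with any pickle-based format, a dump file should only be loaded if it comes from a trusted source.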

prepare_filepath(file_name, mkdirs=True)[source]

Convert string filename into pathlib.Path object and create parent directories if required.

Parameters:
  • file_name (Union[str, Path, PathLike]) – file_name to process

  • mkdirs (bool) – Whether the parent directory of the file should be created if it doesn’t exist. Default True.

Return type:

Path

Returns:

file_name as a pathlib.Path object

Author:

Dominic Davis-Foster
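A minimal sketch of the behaviour described above, using pathlib only. prepare_filepath_sketch is a hypothetical name; the assumption here is that only the parent directories are created and the file itself is left untouched.

```python
import tempfile
from pathlib import Path


def prepare_filepath_sketch(file_name, mkdirs=True):
    """Hypothetical stand-in: return a Path, creating parents if asked."""
    path = Path(file_name)
    if mkdirs:
        # exist_ok avoids an error when the directory is already there.
        path.parent.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = prepare_filepath_sketch(Path(tmp) / "results" / "run1" / "area.csv")
    parent_created = target.parent.is_dir()
    file_created = target.exists()
```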

save_data(file_name, data, format_str='%.6f', prepend='', sep=' ', compressed=False)[source]

Saves a list of numbers or a list of lists of numbers to a file with specific formatting.

Parameters:
  • file_name (Union[str, Path, PathLike]) – Name of a file

  • data (Union[List[float], List[List[float]]]) – A list of numbers, or a list of lists

  • format_str (str) – A format string for individual entries. Default '%.6f'.

  • prepend (str) – A string, printed before each row. Default ''.

  • sep (str) – A string, printed after each number. Default '␣'.

  • compressed (bool) – If True, the output will be gzipped. Default False.

Authors:

Vladimir Likic, Dominic Davis-Foster (pathlib support)
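How format_str, prepend, and sep combine for a list of lists might look like the sketch below. format_rows_sketch is a hypothetical helper that only builds the text rows; it assumes the separator is placed between numbers with no trailing separator, and it does not cover flat lists, file writing, or gzip output.

```python
def format_rows_sketch(data, format_str="%.6f", prepend="", sep=" "):
    """Hypothetical helper: format each row of a list of lists as one line."""
    rows = []
    for row in data:
        # Apply the per-entry format, then join the entries with the separator.
        entries = [format_str % value for value in row]
        rows.append(prepend + sep.join(entries))
    return rows


rows = format_rows_sketch(
    [[1, 2.5], [3, 4]],
    format_str="%.2f",
    prepend="> ",
    sep=",",
)
```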

pyms.Utils.Math

Provides mathematical functions.

Functions:

MAD(v)

Median absolute deviation.

is_float(s)

Test whether a string, or each string in a list, contains a numeric value.

mad_based_outlier(data[, thresh])

Identify outliers using the median absolute deviation (MAD).

mean(data)

Return the sample arithmetic mean of data.

median(data)

Return the median (middle value) of numeric data.

median_outliers(data[, m])

Identify outliers using the median value.

percentile_based_outlier(data[, threshold])

Identify outliers using a percentile.

rmsd(list1, list2)

Calculates the root-mean-square deviation (RMSD) between two lists.

std(data[, xbar])

Return the square root of the sample variance.

vector_by_step(start, stop, step)

Generates a list by using start, stop, and step values.

MAD(v)[source]

Median absolute deviation.

Parameters:

v (Union[Sequence, ndarray]) – List of values to calculate the median absolute deviation of.

Return type:

float

Returns:

median absolute deviation

Author:

Vladimir Likic
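For reference, the plain median absolute deviation can be sketched as follows. mad_sketch is a hypothetical name, and the sketch applies no scaling constant; the library function may scale its result differently.

```python
from statistics import median


def mad_sketch(values):
    """Hypothetical stand-in: unscaled median absolute deviation."""
    centre = median(values)
    # Median of the absolute distances from the median.
    return median([abs(x - centre) for x in values])


spread = mad_sketch([1, 2, 3, 4, 100])
```

Because it is built on medians, the result is barely affected by the single extreme value 100.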

is_float(s)[source]

Test whether a string, or each string in a list, contains a numeric value.

Parameters:

s (Union[str, List[str]]) – The string or list of strings to test.

Return type:

Union[bool, List[bool]]

Returns:

A single boolean or list of boolean values indicating whether each input can be converted into a float.
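The string and list cases can be sketched like this. is_float_sketch is a hypothetical stand-in that relies on float() accepting the text; the library's own test may differ in edge cases.

```python
def is_float_sketch(s):
    """Hypothetical stand-in: can each input be converted to a float?"""

    def _convertible(text):
        try:
            float(text)
        except (TypeError, ValueError):
            return False
        return True

    if isinstance(s, str):
        return _convertible(s)
    # Otherwise treat the input as a list of strings.
    return [_convertible(item) for item in s]


single = is_float_sketch("3.14")
several = is_float_sketch(["1e5", "abc", ""])
```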

mad_based_outlier(data, thresh=3.5)[source]

Identify outliers using the median absolute deviation (MAD).

Parameters:
  • data

  • thresh (float) – Default 3.5.

Author:

David Kainer

Url:

http://stackoverflow.com/questions/22354094/pythonic-way-of-detecting-outliers-in-one-dimensional-observation-data
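A MAD-based outlier test is commonly written as a modified z-score check, which can be sketched as below. mad_outliers_sketch is a hypothetical name; the 0.6745 constant and the handling of a zero MAD are assumptions of this sketch, not something stated above.

```python
from statistics import median


def mad_outliers_sketch(data, thresh=3.5):
    """Hypothetical stand-in: flag points whose modified z-score exceeds thresh."""
    centre = median(data)
    deviations = [abs(x - centre) for x in data]
    mad = median(deviations)
    if mad == 0:
        # Assumption: with no spread, nothing is flagged.
        return [False] * len(data)
    # 0.6745 rescales the MAD to be comparable with a standard deviation.
    return [0.6745 * d / mad > thresh for d in deviations]


flags = mad_outliers_sketch([1, 2, 3, 4, 100])
```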

mean(data)[source]

Return the sample arithmetic mean of data.

>>> mean([1, 2, 3, 4, 4])
2.8
>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)
>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
Decimal('0.5625')

If data is empty, StatisticsError will be raised.

median(data)[source]

Return the median (middle value) of numeric data.

When the number of data points is odd, return the middle data point. When the number of data points is even, the median is interpolated by taking the average of the two middle values:

>>> median([1, 3, 5])
3
>>> median([1, 3, 5, 7])
4.0

median_outliers(data, m=2.5)[source]

Identify outliers using the median value.

Parameters:
  • data

  • m (float) – Default 2.5.

Authors:

David Kainer, eumiro (https://stackoverflow.com/users/449449/eumiro), Benjamin Bannier (https://stackoverflow.com/users/176922/benjamin-bannier)

Url:

http://stackoverflow.com/questions/11686720/is-there-a-numpy-builtin-to-reject-outliers-from-a-list

percentile_based_outlier(data, threshold=95)[source]

Identify outliers using a percentile.

Parameters:
  • data

  • threshold – Default 95.

Author:

David Kainer

Url:

http://stackoverflow.com/questions/22354094/pythonic-way-of-detecting-outliers-in-one-dimensional-observation-data

rmsd(list1, list2)[source]

Calculates the root-mean-square deviation (RMSD) between two lists.

Parameters:
  • list1

  • list2

Return type:

float

Returns:

RMSD value

Authors:

Qiao Wang, Andrew Isaac, Vladimir Likic
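The RMSD of two equal-length lists can be sketched as follows. rmsd_sketch is a hypothetical name, and raising ValueError on a length mismatch is an assumption of the sketch.

```python
import math


def rmsd_sketch(list1, list2):
    """Hypothetical stand-in: root-mean-square deviation of paired values."""
    if len(list1) != len(list2):
        raise ValueError("both lists must have the same length")
    # Mean of the squared pairwise differences, then the square root.
    squared = [(a - b) ** 2 for a, b in zip(list1, list2)]
    return math.sqrt(sum(squared) / len(squared))


identical = rmsd_sketch([1, 2, 3], [1, 2, 3])
shifted = rmsd_sketch([0, 0], [3, 4])
```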

std(data, xbar=None)

Return the square root of the sample variance.

See variance for arguments and other details.

>>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.0810874155219827

vector_by_step(start, stop, step)[source]

Generates a list by using start, stop, and step values.

Parameters:
  • start (float) – Initial value

  • stop (float) – Max value

  • step (float) – Step

Author:

Vladimir Likic

Return type:

List[float]
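One way to generate such a list is sketched below. vector_by_step_sketch is a hypothetical name, and the sketch assumes that stop is an exclusive upper bound; the text above only calls it the max value.

```python
def vector_by_step_sketch(start, stop, step):
    """Hypothetical stand-in: values from start up to (not including) stop."""
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    current = start
    while current < stop:
        values.append(current)
        current += step
    return values


grid = vector_by_step_sketch(0.0, 1.0, 0.25)
```

Repeated addition accumulates floating-point error for steps such as 0.1, so the last element may differ slightly from what exact arithmetic would give.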

pyms.Utils.Time

Time conversion and related functions.

Functions:

is_str_num(arg)

Returns whether the argument is a string in the format of a number.

time_str_secs(time_str)

Resolves time string of the form '<NUMBER>s' or '<NUMBER>m' and returns the time in seconds.

window_sele_points(ic, window_sele[, ...])

Converts a window selection parameter into a number of points, based on the time step of an ion chromatogram.

is_str_num(arg)[source]

Returns whether the argument is a string in the format of a number.

The number can be an integer, or alternatively a floating point number in scientific or engineering format.

Parameters:

arg (str) – A string to be evaluated as a number

Author:

Gyro Funch (from Active State Python Cookbook)

Return type:

bool
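A check of this kind is often done with a regular expression, as in the sketch below. is_str_num_sketch and the pattern are assumptions for illustration and are not taken from the library or the cookbook recipe.

```python
import re

# Optional sign, digits with optional fraction (or a bare fraction),
# and an optional exponent such as e-3 or E+6.
_NUMBER_RE = re.compile(r"[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?")


def is_str_num_sketch(arg):
    """Hypothetical stand-in: is arg a string formatted as a number?"""
    return isinstance(arg, str) and _NUMBER_RE.fullmatch(arg) is not None


checks = [is_str_num_sketch(text) for text in ("42", "-1.5e-3", "1.2.3", "abc")]
```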

time_str_secs(time_str)[source]

Resolves time string of the form '<NUMBER>s' or '<NUMBER>m' and returns the time in seconds.

Parameters:

time_str (str) – A time string, which must be of the form '<NUMBER>s' or '<NUMBER>m' where '<NUMBER>' is a valid number

Return type:

float

Returns:

Time in seconds

Author:

Vladimir Likic
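The conversion can be sketched as follows. time_str_secs_sketch is a hypothetical name, and raising ValueError for malformed input is an assumption of the sketch.

```python
def time_str_secs_sketch(time_str):
    """Hypothetical stand-in: convert '<NUMBER>s' or '<NUMBER>m' to seconds."""
    number, unit = time_str[:-1], time_str[-1:]
    if unit not in ("s", "m"):
        raise ValueError(f"time string must end in 's' or 'm': {time_str!r}")
    try:
        value = float(number)
    except ValueError:
        raise ValueError(f"not a valid number: {number!r}") from None
    # Minutes are converted to seconds; seconds pass through unchanged.
    return value * 60.0 if unit == "m" else value


seconds = time_str_secs_sketch("30s")
minutes = time_str_secs_sketch("1.5m")
```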

window_sele_points(ic, window_sele, half_window=False)[source]

Converts a window selection parameter into a number of points, based on the time step of an ion chromatogram.

Parameters:
  • ic (IonChromatogram) – ion chromatogram object relevant for the conversion

  • window_sele (Union[int, str]) – The window selection parameter. This can be an integer or a time string. If an integer, it is taken as the number of points. If a string, it must be of the form '<NUMBER>s' or '<NUMBER>m', specifying a time in seconds or minutes, respectively.

  • half_window (bool) – Specifies whether to return half-window. Default False.

Return type:

int

Returns:

The number of points in the window

Author:

Vladimir Likic
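The conversion from a time window to points can be sketched without an IonChromatogram by passing the time step directly. window_points_sketch and its time_step argument (in seconds) are hypothetical, and rounding down in both the conversion and the half-window case is an assumption.

```python
import math


def window_points_sketch(time_step, window_sele, half_window=False):
    """Hypothetical stand-in: window size in points for a given time step."""
    if isinstance(window_sele, int):
        # An integer is already a number of points.
        points = window_sele
    else:
        number, unit = window_sele[:-1], window_sele[-1:]
        if unit not in ("s", "m"):
            raise ValueError("time string must end in 's' or 'm'")
        seconds = float(number) * (60.0 if unit == "m" else 1.0)
        points = math.floor(seconds / time_step)
    if half_window:
        points //= 2
    return points


full = window_points_sketch(0.5, "10s")
half = window_points_sketch(0.5, "1m", half_window=True)
```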

pyms.Utils.Utils

General utility functions.

Functions:

is_number(obj)

Returns whether obj is a numerical value (int, float, etc.).

is_path(obj)

Returns whether the object represents a filesystem path.

is_sequence(obj)

Returns whether the object is a Sequence, and not a string.

is_sequence_of(obj, of)

Returns whether the object is a Sequence, and not a string, of the given type.

is_number(obj)[source]

Returns whether obj is a numerical value (int, float, etc.).

Parameters:

obj (Any)

Return type:

bool

is_path(obj)[source]

Returns whether the object represents a filesystem path.

Parameters:

obj (Any)

Return type:

bool

is_sequence(obj)[source]

Returns whether the object is a Sequence, and not a string.

Parameters:

obj (Any)

Return type:

bool

is_sequence_of(obj, of)[source]

Returns whether the object is a Sequence, and not a string, of the given type.

Parameters:
  • obj (Any)

  • of

Return type:

bool
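The two sequence checks can be sketched with collections.abc. The names is_sequence_sketch and is_sequence_of_sketch are hypothetical; the sketch only recognises types registered as Sequence, so objects such as NumPy arrays, which the library may treat differently, are not covered.

```python
from collections.abc import Sequence


def is_sequence_sketch(obj):
    """Hypothetical stand-in: a Sequence that is not a string."""
    return isinstance(obj, Sequence) and not isinstance(obj, str)


def is_sequence_of_sketch(obj, of):
    """Hypothetical stand-in: a non-string Sequence whose items are all of type `of`."""
    return is_sequence_sketch(obj) and all(isinstance(item, of) for item in obj)


results = [
    is_sequence_sketch([1, 2]),
    is_sequence_sketch("abc"),
    is_sequence_of_sketch([1, 2], int),
    is_sequence_of_sketch([1, "a"], int),
]
```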

pyms.Utils.Utils.signedinteger

numpy.signedinteger at runtime; int when type checking.