pyms.Utils

Utility functions for PyMassSpec-wide use.

pyms.Utils.IO

General I/O functions.

Functions:

dump_object(obj, file_name)

Dumps an object to a file through pickle.dump().

file_lines(file_name[, strip])

Returns lines from a file, as a list.

load_object(file_name)

Loads an object previously dumped with dump_object().

prepare_filepath(file_name[, mkdirs])

Convert string filename into pathlib.Path object and create parent directories if required.

save_data(file_name, data[, format_str, …])

Saves a list of numbers or a list of lists of numbers to a file with specific formatting.

dump_object(obj, file_name)[source]

Dumps an object to a file through pickle.dump().

Parameters
  • obj (Any) – Object to be dumped

  • file_name (Union[str, Path, PathLike]) – Name of the file for the object dump

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)

file_lines(file_name, strip=False)[source]

Returns lines from a file, as a list.

Parameters
  • file_name (Union[str, Path, PathLike]) – Name of a file

  • strip (bool) – If True, lines are pre-processed: newline characters are removed, leading and trailing whitespace is removed, and lines starting with ‘#’ are discarded. Default False.

Return type

List[str]

Returns

A list of lines

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)
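
The effect of strip can be illustrated with a minimal standard-library sketch. file_lines_sketch is a hypothetical stand-in that follows the description above, not the PyMassSpec implementation; whether blank lines are also dropped is not stated above, so this sketch keeps them.

```python
import tempfile
from pathlib import Path
from typing import List, Union


def file_lines_sketch(file_name: Union[str, Path], strip: bool = False) -> List[str]:
    """Hypothetical stand-in mirroring the documented behaviour of file_lines()."""
    with open(file_name, encoding="UTF-8") as fp:
        lines = fp.readlines()

    if strip:
        # Remove newlines and surrounding whitespace, then drop '#' comment lines.
        stripped = [line.strip() for line in lines]
        lines = [line for line in stripped if not line.startswith("#")]

    return lines


with tempfile.TemporaryDirectory() as tmpdir:
    sample = Path(tmpdir) / "sample.txt"
    sample.write_text("# header\n  1.0 2.0  \n3.0\n", encoding="UTF-8")
    raw_lines = file_lines_sketch(sample)
    clean_lines = file_lines_sketch(sample, strip=True)
```

With strip left at False the lines come back untouched, including their trailing newlines.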

load_object(file_name)[source]

Loads an object previously dumped with dump_object().

Parameters

file_name (Union[str, Path, PathLike]) – Name of the object dump file.

Return type

object

Returns

Object contained in the file.

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)
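
A round trip through dump_object() and load_object() amounts to a pickle dump followed by a pickle load. The sketch below shows that pattern with the standard library only; dump_object_sketch and load_object_sketch are hypothetical stand-ins, and the file name and sample data are invented for illustration.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Union


def dump_object_sketch(obj: Any, file_name: Union[str, Path]) -> None:
    """Hypothetical stand-in: serialise obj to file_name with pickle.dump()."""
    with open(file_name, "wb") as fp:
        pickle.dump(obj, fp)


def load_object_sketch(file_name: Union[str, Path]) -> Any:
    """Hypothetical stand-in: read back an object written by dump_object_sketch()."""
    with open(file_name, "rb") as fp:
        return pickle.load(fp)


with tempfile.TemporaryDirectory() as tmpdir:
    dump_path = Path(tmpdir) / "example.dump"
    original = {"rt": [1.5, 2.5], "label": "sample"}
    dump_object_sketch(original, dump_path)
    restored = load_object_sketch(dump_path)
```

The restored object compares equal to the original but is a separate copy. As with any pickle-based format, dump files should only be loaded from trusted sources.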

prepare_filepath(file_name, mkdirs=True)[source]

Convert string filename into pathlib.Path object and create parent directories if required.

Parameters
  • file_name (Union[str, Path, PathLike]) – file_name to process

  • mkdirs (bool) – Whether the parent directory of the file should be created if it doesn’t exist. Default True.

Return type

Path

Returns

file_name

Author

Dominic Davis-Foster
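
The documented behaviour can be approximated with pathlib alone. prepare_filepath_sketch below is a hypothetical stand-in written from the description above, and the directory names are invented for illustration.

```python
import tempfile
from pathlib import Path
from typing import Union


def prepare_filepath_sketch(file_name: Union[str, Path], mkdirs: bool = True) -> Path:
    """Hypothetical stand-in: return a Path and optionally create its parent directory."""
    path = Path(file_name)
    if mkdirs and not path.parent.is_dir():
        # parents=True also creates any missing intermediate directories.
        path.parent.mkdir(parents=True)
    return path


with tempfile.TemporaryDirectory() as tmpdir:
    target = prepare_filepath_sketch(Path(tmpdir) / "results" / "run1" / "out.csv")
    parent_created = target.parent.is_dir()
    file_created = target.exists()
```

Only the parent directories are created; the file itself is left for the caller to write.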

save_data(file_name, data, format_str='%.6f', prepend='', sep=' ', compressed=False)[source]

Saves a list of numbers or a list of lists of numbers to a file with specific formatting.

Parameters
  • file_name (Union[str, Path, PathLike]) – Name of a file

  • data (Union[List[float], List[List[float]]]) – A list of numbers, or a list of lists

  • format_str (str) – A format string for individual entries. Default '%.6f'.

  • prepend (str) – A string, printed before each row. Default ''.

  • sep (str) – A string, printed after each number. Default '␣'.

  • compressed (bool) – If True, the output will be gzipped. Default False.

Authors

Vladimir Likic, Dominic Davis-Foster (pathlib support)
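
How format_str, prepend and sep combine can be sketched as follows. format_rows_sketch is a hypothetical helper that only builds the output lines and does not write a file. It assumes that a flat list is written one number per line and that sep is placed between the numbers of a row; the actual function may differ on both points.

```python
from numbers import Real
from typing import List, Sequence, Union


def format_rows_sketch(
    data: Sequence[Union[float, Sequence[float]]],
    format_str: str = "%.6f",
    prepend: str = "",
    sep: str = " ",
) -> List[str]:
    """Hypothetical helper: build one formatted text line per row of data."""
    lines = []
    for row in data:
        # Assumption: a bare number is treated as a one-element row.
        values = [row] if isinstance(row, Real) else list(row)
        lines.append(prepend + sep.join(format_str % value for value in values))
    return lines


matrix_lines = format_rows_sketch([[1, 2.5], [3, 4]], format_str="%.2f", sep=",")
vector_lines = format_rows_sketch([1, 2], format_str="%.1f", prepend="> ")
```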

pyms.Utils.Math

Provides mathematical functions.

Functions:

MAD(v)

Median absolute deviation.

is_float(s)

Tests whether a string, or each string in a list, can be interpreted as a numeric value.

mad_based_outlier(data[, thresh])

Identify outliers using the median absolute deviation (MAD).

mean(data)

Return the sample arithmetic mean of data.

median(data)

Return the median (middle value) of numeric data.

median_outliers(data[, m])

Identify outliers using the median value.

percentile_based_outlier(data[, threshold])

Identify outliers using a percentile.

rmsd(list1, list2)

Calculates the root-mean-square deviation (RMSD) between two lists.

std(data[, xbar])

Return the square root of the sample variance.

vector_by_step(start, stop, step)

Generates a list by using start, stop, and step values.

MAD(v)[source]

Median absolute deviation.

Parameters

v (Union[Sequence, ndarray]) – List of values to calculate the median absolute deviation of.

Return type

float

Returns

median absolute deviation

Author

Vladimir Likic
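
For reference, the textbook median absolute deviation is the median of the absolute distances from the median. The sketch below implements that unscaled definition with the standard library; mad_sketch is a hypothetical name, and the library function may additionally apply a normalising constant, which is not stated above.

```python
from statistics import median
from typing import Sequence


def mad_sketch(values: Sequence[float]) -> float:
    """Unscaled median absolute deviation (textbook definition)."""
    centre = median(values)
    # Median of the absolute distances from the median.
    return median(abs(value - centre) for value in values)


# The single extreme value barely affects the result.
spread = mad_sketch([1, 2, 3, 4, 100])
```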

is_float(s)[source]

Tests whether a string, or each string in a list, can be interpreted as a numeric value.

Parameters

s (Union[str, List[str]]) – The string or list of strings to test.

Return type

Union[bool, List[bool]]

Returns

A single boolean or list of boolean values indicating whether each input can be converted into a float.
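
The check described above can be sketched by attempting the conversion and catching the failure. is_float_sketch is a hypothetical stand-in, not the library code.

```python
from typing import List, Union


def is_float_sketch(s: Union[str, List[str]]) -> Union[bool, List[bool]]:
    """Hypothetical stand-in: report whether each input converts to float."""
    if isinstance(s, str):
        try:
            float(s)
        except ValueError:
            return False
        return True
    # For a list, test every element and return one boolean per element.
    return [is_float_sketch(item) for item in s]


single = is_float_sketch("1.5e3")
several = is_float_sketch(["1", "x", "-2.0"])
```

Note that float() also accepts strings such as "nan" and "inf", so a check of this kind treats them as numeric.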

mad_based_outlier(data, thresh=3.5)[source]

Identify outliers using the median absolute deviation (MAD).

Parameters
  • data

  • thresh (float) – Default 3.5.

Author

David Kainer

Url

http://stackoverflow.com/questions/22354094/pythonic-way-of-detecting-outliers-in-one-dimensional-observation-data
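
A common way to apply a MAD threshold is the modified z-score, 0.6745 * |x - median| / MAD, with values scoring above the threshold flagged as outliers. The pure-Python sketch below follows that formulation as an assumption; mad_outlier_sketch is a hypothetical name, the return format of the library function is not documented above, and the handling of a zero MAD is this sketch's own choice.

```python
from statistics import median
from typing import List, Sequence


def mad_outlier_sketch(data: Sequence[float], thresh: float = 3.5) -> List[bool]:
    """Flag values whose modified z-score exceeds thresh (illustrative only)."""
    centre = median(data)
    deviations = [abs(value - centre) for value in data]
    mad = median(deviations)
    if mad == 0:
        # Assumption: with no spread at all, flag anything that differs from the median.
        return [deviation > 0 for deviation in deviations]
    return [0.6745 * deviation / mad > thresh for deviation in deviations]


flags = mad_outlier_sketch([10, 11, 9, 10, 12, 50])
```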

mean(data)[source]

Return the sample arithmetic mean of data.

>>> mean([1, 2, 3, 4, 4])
2.8
>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)
>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
Decimal('0.5625')

If data is empty, StatisticsError will be raised.

median(data)[source]

Return the median (middle value) of numeric data.

When the number of data points is odd, return the middle data point. When the number of data points is even, the median is interpolated by taking the average of the two middle values:

>>> median([1, 3, 5])
3
>>> median([1, 3, 5, 7])
4.0

median_outliers(data, m=2.5)[source]

Identify outliers using the median value.

Parameters
  • data

  • m (float) – Default 2.5.

Authors

David Kainer, eumiro (https://stackoverflow.com/users/449449/eumiro), Benjamin Bannier (https://stackoverflow.com/users/176922/benjamin-bannier)

Url

http://stackoverflow.com/questions/11686720/is-there-a-numpy-builtin-to-reject-outliers-from-a-list

percentile_based_outlier(data, threshold=95)[source]

Identify outliers using a percentile.

Parameters
  • data

  • threshold – Default 95.

Author

David Kainer

Url

http://stackoverflow.com/questions/22354094/pythonic-way-of-detecting-outliers-in-one-dimensional-observation-data

rmsd(list1, list2)[source]

Calculates the root-mean-square deviation (RMSD) between two lists.

Parameters
  • list1

  • list2

Return type

float

Returns

RMSD value

Authors

Qiao Wang, Andrew Isaac, Vladimir Likic
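
The root-mean-square deviation of two equal-length lists is the square root of the mean squared element-wise difference. A minimal sketch of that formula follows; rmsd_sketch is a hypothetical name, and the length check is an assumption about how mismatched inputs should be handled.

```python
import math
from typing import Sequence


def rmsd_sketch(list1: Sequence[float], list2: Sequence[float]) -> float:
    """Root-mean-square deviation between two equal-length sequences."""
    if len(list1) != len(list2):
        # Assumption: mismatched lengths are treated as an error.
        raise ValueError("both lists must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(list1, list2)]
    return math.sqrt(sum(squared) / len(squared))


identical = rmsd_sketch([1, 2, 3], [1, 2, 3])
offset = rmsd_sketch([0, 0, 0, 0], [2, 2, 2, 2])
```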

std(data, xbar=None)

Return the square root of the sample variance.

See variance for arguments and other details.

>>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.0810874155219827

vector_by_step(start, stop, step)[source]

Generates a list by using start, stop, and step values.

Parameters
  • start (float) – Initial value

  • stop (float) – Max value

  • step (float) – Step

Author

Vladimir Likic

Return type

List[float]
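
The idea resembles range() for floating point values. The sketch below assumes that stop is exclusive and that step is positive; neither detail is stated above, and vector_by_step_sketch is a hypothetical name.

```python
from typing import List


def vector_by_step_sketch(start: float, stop: float, step: float) -> List[float]:
    """Hypothetical stand-in: values from start up to (but excluding) stop."""
    if step <= 0:
        raise ValueError("step must be positive")
    values = []
    index = 0
    value = start
    while value < stop:
        values.append(value)
        index += 1
        # Multiply rather than add repeatedly to limit accumulated rounding error.
        value = start + index * step
    return values


quarters = vector_by_step_sketch(0.0, 1.0, 0.25)
```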

pyms.Utils.Time

Time conversion and related functions.

Functions:

is_str_num(arg)

Returns whether the argument is a string in the format of a number.

time_str_secs(time_str)

Resolves time string of the form '<NUMBER>s' or '<NUMBER>m' and returns the time in seconds.

window_sele_points(ic, window_sele[, …])

Converts window selection parameter into points based on the time step in an ion chromatogram.

is_str_num(arg)[source]

Returns whether the argument is a string in the format of a number.

The number can be an integer, or alternatively a floating point number in scientific or engineering format.

Parameters

arg (str) – A string to be evaluated as a number

Author

Gyro Funch (from Active State Python Cookbook)

Return type

bool

time_str_secs(time_str)[source]

Resolves time string of the form '<NUMBER>s' or '<NUMBER>m' and returns the time in seconds.

Parameters

time_str (str) – A time string, which must be of the form '<NUMBER>s' or '<NUMBER>m' where '<NUMBER>' is a valid number

Return type

float

Returns

Time in seconds

Author

Vladimir Likic
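
The conversion can be sketched as splitting off the final unit character and scaling minutes by 60. time_str_secs_sketch is a hypothetical stand-in; the exception type raised for malformed input is an assumption.

```python
def time_str_secs_sketch(time_str: str) -> float:
    """Hypothetical stand-in: convert '<NUMBER>s' or '<NUMBER>m' to seconds."""
    number, unit = time_str[:-1], time_str[-1:]
    if unit not in ("s", "m"):
        raise ValueError(f"time string must end in 's' or 'm': {time_str!r}")
    try:
        value = float(number)
    except ValueError:
        raise ValueError(f"invalid number in time string: {time_str!r}") from None
    # Minutes are converted to seconds; seconds pass through unchanged.
    return value * 60.0 if unit == "m" else value


half_minute = time_str_secs_sketch("30s")
ninety = time_str_secs_sketch("1.5m")
```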

window_sele_points(ic, window_sele, half_window=False)[source]

Converts window selection parameter into points based on the time step in an ion chromatogram.

Parameters
  • ic (IonChromatogram) – ion chromatogram object relevant for the conversion

  • window_sele (Union[int, str]) – The window selection parameter. This can be an integer or a time string. If an integer, it is taken as the number of points. If a string, it must be of the form '<NUMBER>s' or '<NUMBER>m', specifying a time in seconds or minutes, respectively.

  • half_window (bool) – Specifies whether to return half-window. Default False.

Return type

int

Returns

The number of points in the window

Author

Vladimir Likic
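
The conversion from a time window to a number of points can be sketched as dividing the window length by the time step. In the sketch below a plain time_step value in seconds stands in for the ion chromatogram, and window_points_sketch and _seconds are hypothetical names. Rounding down, and halving an integer window with floor division, are assumptions rather than documented behaviour.

```python
import math
from typing import Union


def _seconds(time_str: str) -> float:
    """Parse '<NUMBER>s' or '<NUMBER>m' into seconds (helper for this sketch)."""
    value, unit = float(time_str[:-1]), time_str[-1]
    if unit not in ("s", "m"):
        raise ValueError(f"time string must end in 's' or 'm': {time_str!r}")
    return value * 60.0 if unit == "m" else value


def window_points_sketch(
    window_sele: Union[int, str],
    time_step: float,
    half_window: bool = False,
) -> int:
    """Hypothetical stand-in: window size in points for a given time step (seconds)."""
    if isinstance(window_sele, int):
        # An integer is already a number of points.
        return window_sele // 2 if half_window else window_sele
    seconds = _seconds(window_sele)
    if half_window:
        seconds *= 0.5
    # Assumption: a fractional number of points is rounded down.
    return int(math.floor(seconds / time_step))


full = window_points_sketch("3s", time_step=0.5)
half = window_points_sketch("3s", time_step=0.5, half_window=True)
```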

pyms.Utils.Utils

General utility functions.

Functions:

is_number(obj)

Returns whether obj is a numerical value (int, float, etc.).

is_path(obj)

Returns whether the object represents a filesystem path.

is_sequence(obj)

Returns whether the object is a Sequence, and not a string.

is_sequence_of(obj, of)

Returns whether the object is a Sequence, and not a string, of the given type.

is_number(obj)[source]

Returns whether obj is a numerical value (int, float, etc.).

Parameters

obj (Any)

Return type

bool

is_path(obj)[source]

Returns whether the object represents a filesystem path.

Parameters

obj (Any)

Return type

bool

is_sequence(obj)[source]

Returns whether the object is a Sequence, and not a string.

Parameters

obj (Any)

Return type

bool

is_sequence_of(obj, of)[source]

Returns whether the object is a Sequence, and not a string, of the given type.

Parameters
  • obj

  • of

Return type

bool
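
The two sequence checks can be sketched with collections.abc.Sequence. These are hypothetical stand-ins written from the descriptions above; which concrete types the library accepts as sequences is not stated there, so the use of the Sequence ABC is an assumption.

```python
from collections.abc import Sequence
from typing import Any


def is_sequence_sketch(obj: Any) -> bool:
    """Hypothetical stand-in: True for sequences other than str."""
    return isinstance(obj, Sequence) and not isinstance(obj, str)


def is_sequence_of_sketch(obj: Any, of: Any) -> bool:
    """Hypothetical stand-in: True if obj is a non-string sequence of `of` instances."""
    # `of` may be a type or a tuple of types, as accepted by isinstance().
    return is_sequence_sketch(obj) and all(isinstance(item, of) for item in obj)


ints_ok = is_sequence_of_sketch([1, 2, 3], int)
mixed = is_sequence_of_sketch([1, "a"], int)
```

An empty sequence passes the second check in this sketch because all() of no items is True.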

pyms.Utils.Utils.signedinteger

numpy.signedinteger at runtime; int when type checking.