Submodule: opexebo.general¶
General signal processing functions
Overview¶
Signal processing
opexebo.general.smooth (data, sigma, **kwargs) |
Smooth provided data with a Gaussian kernel |
opexebo.general.upsample (array, upscale, …) |
A MaskedArray aware upsampling routine. |
opexebo.general.peak_search (image, **kwargs) |
Given a 1D or 2D array, return a list of co-ordinates of the local maxima or minima |
opexebo.general.circular_mask (axes, …) |
Given a set of axes, construct a circular mask defining pixels as within or without a circular arena. |
opexebo.general.power_spectrum (values, …) |
Calculate the power spectrum of a time-series of data |
opexebo.general.spatial_cross_correlation (…) |
Calculate the Pearson cross correlation between two arrays. |
opexebo.general.shuffle (times, offset_lim, …) |
Duplicate the provided time series iterations number of times. |
opexebo.general.walk_filter (speed, …[, fmt]) |
It is common practice when studying a freely moving subject to exclude data from periods when the subject was stationary, or nearly stationary. |
General Helper functions
opexebo.general.accumulate_spatial (pos, …) |
Given a list of positions, create a histogram of those positions. |
opexebo.general.normxcorr2_general (array) |
Calculate spatial autocorrelation. |
opexebo.general.fit_ellipse (X, Y) |
Fit an ellipse to the provided set of X, Y co-ordinates |
opexebo.general.bin_width_to_bin_number (…) |
This conversion is done in several separate places, and must be done the same in every case. |
Detailed Documentation¶
-
opexebo.general.
normxcorr2_general
(array)¶ Calculate spatial autocorrelation.
Python implementation of the Matlab generalized-normalized cross correlation function, adapted by Vadim Frolov. Some generality was abandoned in the adaption as unnecessary for autocorrelogram calculation.
For the original function, see https://se.mathworks.com/matlabcentral/fileexchange/29005-generalized-normalized-cross-correlation
Parameters: - array: NxM matrix
firing array. array is not necessary a numpy array. Must not contain NaNs!
Returns: - np.ndarray
Resulting correlation matrix
-
opexebo.general.
smooth
(data, sigma, **kwargs)¶ Smooth provided data with a Gaussian kernel
The smoothing is done with a routine from the astronomical package astropy Like scipy.ndimage.gaussian_filter, this does not handle MaskedArrays - but it handles NaNs much better. Specifically, astropy.convolution.convolve replaces NaN values with an interpolation across the void region.
Therefore, to handle masked arrays, the data at masked positions is replaced by np.nan prior to smoothing, and thus avoids influencing nearby, unmasked cells. The masked cells are then returned to their original values prior to return.
The package is discussed at http://docs.astropy.org/en/stable/convolution/index.html
Parameters: - data: np.ndarray or np.ma.MaskedArray
Data that will be smoothed
- sigma: float
Standard deviations for Gaussian kernel in units of pixels/bins
- mask_fill: float, optional
The value that masked locations should be treated as This can either be provided as an absolute number (e.g. 0), or nan If nan, then each masked location will get a value by interpolating from nearby cells. This will only apply if the input is a MaskedArray to start with. If the input is a standard np.ndarray, then no values will be substituted, even if there are nans present.
- circular: bool, optional
If True, then smoothing at the edge of the array will be handled in a circular manner, i.e. the value to the left of data[0] will be data[-1]. If False, the edge will be handled by padding with values equal to the boundary value. Default False
Returns: - smoothed_data: np.ndarray or np.ma.MaskedArray
Smoothed data
Notes
BNT.+general.smooth http://docs.astropy.org/en/stable/convolution/index.html https://github.com/astropy/astropy/issues/6511
Copyright (C) 2019 by Simon Ball
-
opexebo.general.
accumulate_spatial
(pos, arena_size, **kwargs)¶ Given a list of positions, create a histogram of those positions. The resulting histogram is typically referred to as a map.
The complexity in this function comes down to selecting where the edges of the arena are, and generating the bins within those limits.
The histogram bin edges must be defined in one of 3 different ways
bin_width: based on the keyword arena_size, the number of bins will be calculated as
opexebo.general.bin_width_to_bin_number
The histogram will use num_bins between the minimum and maximum of the positions (or limit if provided)
bin_number: the histogram will use bin_number of bins between the minimum and maximum of the positions (or limit if provided)
bin_edges: the histogram will use the provided `bin_edg`e arrays
Either zero or one of the three bin_* keyword arguments must be defined. If none are defined, then a default bin_width is used. If more than 1 is defined, an error is raised
Parameters: - pos: np.ndarray
1D or 2D array of positions in row-major format, i.e. x = pos[0], y = pos[1]. This matches the simplest input creation pos = np.array( [x, y] )
- arena_size: float or tuple of floats
- Dimensions of arena (in cm)
- For a linear track, length
- For a circular arena, diameter
- For a rectangular arena, length or (length, length)
- bin_width: float
Bin size in cm. Default 2.5cm. If bin_width is supplied, limit must also be supplied. One of bin_width, bin_number, bin_edges must be provided
- bin_number: int or tuple of int
Number of bins. If provided as a tuple, then (x_bins, y_bins). One of bin_width, bin_number, bin_edges must be provided
- bin_edges: array-like
Edges of the bins. Provided either as edges or (x_edges, y_edges). One of bin_width, bin_number, bin_edges must be provided
- limits: tuple or np.ndarray
(x_min, x_max) or (x_min, x_max, y_min, y_max) Provide concrete limits to the range over which the histogram searches Any observations outside these limits are discarded If no limits are provided, then use np.nanmin(data), np.nanmax(data) to generate default limits. As is standard in python, acceptable values include the lower bound and exclude the upper bound
Returns: - hist: np.ndarray
1D or 2D histogram of the occurrences of the input observations Dimensions given by arena_size/bin_width Not normalised - each cell gives the integer number of occurrences of the observation in that cell
- edges: list-like
x, or (x, y), where x, y are 1d np.ndarrays Here x, y correspond to the output histogram
-
opexebo.general.
shuffle
(times: numpy.ndarray, offset_lim: float, iterations: int, t_start: float = None, t_stop: float = None)¶ Duplicate the provided time series
iterations
number of times. Each duplicate will be incremented circularly by a random value not smaller thanoffset_lim
.Circular incrementation results in (the majority) of time _differences_ remaining preserved
- Initially, we have a time series,
times
, both with values in the range[min(times), max(times)]
.t_start
may be smaller thanmin(times)
, andt_stop
may be larger thanmax(times)
iterations
number of duplicates oftimes
are created.- In each iteraction, a random increment
T
is generated, and added to each value in that iteration, such that values now fall into the range[min(times)+T, max(times)+T]
.max(times)+T
may exceedt_stop
. - All timestamps matching
t_n > t_stop
are mapped back into the range
[t_start, t_stop]
by subtracting(t_stop-t_start)
* The iteration is re-ordered by value (moving those beyond the far edgeback to the beginning)Parameters: - times : np.ndarray
1D array of floats. Time series data to be shuffled
- offset_lim : float
Minimum offset from the original time values. Each iteration is incremented by a random value evenly distributed in the range
[offset_lim, t_stop-offset_lim]
- iterations : int
Number of repeats of
times
to be returned- t_start : float, optional
Lower bound of time domain. Must meet the criteria
t_start <= min(times)
Defaults tomin(times)
- t_stop : float, optional
Upper bound of time domain. Must meet the criteria
t_stop >= max(times)
Defaults tomax(times)
Returns: - output : np.ndarray
iterations x N array of times. A single iteration is accessed as
output[i]
- increments : np.ndarray
1D array of offset values that were used
- Initially, we have a time series,
-
opexebo.general.
validatekeyword__arena_size
(kwv, provided_dimensions)¶ Decipher the possible meanings of the keyword “arena_size”.
“arena_size” is given to describe the arena in which the animal is moving It should be either a float, or an array-like of 2 floats (x, y)
Parameters: - kw: float or array-like of floats
The value given for the keyword arena_size
- provided_dimensions : int
the number of spatial dimensions provided to the original function. Acceptable values are 1 or 2 E.g. if the original function was provided with positions = [t, x, y], then provided_dimensions=2 (x and y)
Returns: - arena_size : float or np.ndarray of floats
Raises: - ValueError
- IndexError
-
opexebo.general.
validate_keyword_arena_shape
(arena_shape)¶ Ensure that the arena_shape is a meaningful value
Parameters: - arena_shape : str
the value given for the keyword arena_shape
Returns: - arena_shape : str
A value that is guaranteed to be an acceptable member of one of the recognised groups of arena_shapes
-
opexebo.general.
fit_ellipse
(X, Y)¶ Fit an ellipse to the provided set of X, Y co-ordinates
Based on the approach taken in Authors: Andrew Fitzgibbon, Maurizio Pilu, Bob Fisher Reference: “Direct Least Squares Fitting of Ellipses”, IEEE T-PAMI, 1999
@Article{Fitzgibbon99, author = “Fitzgibbon, A.~W.and Pilu, M. and Fisher, R.~B.”, title = “Direct least-squares fitting of ellipses”, journal = pami, year = 1999, volume = 21, number = 5, month = may, pages = “476–480” } and implemented in MatLab by Vadim Frolov
Parameters: - X - np.ndarray
x co-ordinates of points
- Y - np.ndarray
y co-ordinates of points
Returns: - x_centre : float
Centre of ellipse
- y_centre : float
Centre of ellipse
- Ru : float
Major radius
- Rv : float
Minor radius
- theta_rad : float
Ellipse orientation (in radians)
Notes
BNT.+general.fitEllipse opexebo.analysis.gridscore
Copyright (C) 2019 by Simon Ball
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
-
opexebo.general.
peak_search
(image, **kwargs)¶ Given a 1D or 2D array, return a list of co-ordinates of the local maxima or minima
Multiple searching techniques are provided:
- default: uses skimage.morphology.get_maxima
- sep: uses the Python wrapper to the Source Extractor astronomy tool to identify peaks
Parameters: - image: np.ndarray
1D or 2D array of data
- search_method : str, optional, {“default”, “sep”}
- mask : np.ndarray, optional
Array of masked locations in the image with the same dimensions. Locations where the mask value is True are ignored for the purpose of searching.
- maxima: bool, optional
[default search method only] Define whether to search for maxima or minima in the provided array
- null_background: bool
[sep search method only] Set the image background to zero for calculation purposes rather than attempt to calculate a background gradient. This should generally be True, as our images are not directly comparable to standard telescope output
- threshold : float, optional
[sep search method only] Threshold for identifiying maxima area
Returns: - peak_coords: tuple
Co-ordinates of peaks, in the form ((x0, x1, x2…), (y0, y1, y2…))
Notes
Copyright (C) 2019 by Simon Ball
-
opexebo.general.
power_spectrum
(values, time_stamps, sampling_frequency, **kwargs)¶ Calculate the power spectrum of a time-series of data
Parameters: - values: np.ndarray
Amplitude of time series at times time_stamps. Must be sampled at a constant frequency
- time_stamps: np.ndarray
Time stamps of time-series. Must be the same length as values. Assumed to be in [seconds]
- sampling_frequency: float
Sampling rate. Assumed to be in [Hz]
- method: str, optional, {“welch”, “fft”}
Method for calculating a power spectrum. Accepted values are: [“welch”, “fft”]. Defaults to “welch”
- scale: str, optiona, {“linear”, “log”}
Should the power spectrum be returned in a linear form propto v**2/sqrt(Hz)) (“linear”); or decibel scale (dB/sqrt(Hz)) (“log”) Defaults to “linear”
- resolution: float
Desired output frequency resolution. Applicable only to Welch’s method. Due to the averaging effect of Welch’s method, this sets the “effective” resolution, which is NOT the same as the minimum difference in frequencies Default 1 [Hz]. Values lower than 1/(time_series_duration) are meaningless
Returns: - frequencies: np.ndarray
Discrete Fourier Transform sample frequencies
- power_spectrum: np.ndarray
Power spectral density at that sample frequency
-
opexebo.general.
spatial_cross_correlation
(arr_0, arr_1, **kwargs)¶ Calculate the Pearson cross correlation between two arrays.
Arrays may be either 1d or 2d. If MaskedArrays are supplied, all locations that are masked are excluded from the correlation. Invalid numbers (NaN) are also excluded from the correlation.
If 2d arrays are provided, then calculates a 1d array of row(or column)-wise correlations, controlled by the row_major keyword.
Parameters: - arr_0: np.ndarray
1d or 2d array. Must either contain no NaNs, or be an np.ma.MaskedArray with NaN values masked.
- arr_1: np.ndarray
2nd array to correlate with. Must have the same dimensions as arr_0
Returns: - output_single: float
Pearson correlation coefficient between entire arrays. 2d input arrays are flattened to 1d to calculate
- output_array: np.ndarray
1D array of pearson correlation coefficients, where the i’th value is the coefficient between the i’th rows (if row_major), or the i’th columns of the two arrays Return value is np.nan if 1d arrays are supplied
Other Parameters: - row_major: bool, optional
If the arrays are 2D, process first by row, or first by column. Default True
Notes
BNT.+analyses.spatialCrossCorrelation()
Copyright (C) 2019 by Simon Ball
-
opexebo.general.
bin_width_to_bin_number
(length, bin_width)¶ This conversion is done in several separate places, and must be done the same in every case. Therefore, refactor into a quick helper function.
In cases where the bin_width is not a perfect divisor of length, the actual bins will be slightly smaller
Parameters: - length: float or np.ndarray
Length of a side to be divided into equally spaced bins
- bin_width: float
Dimension of a square bin or pixel
Returns: - num_bins: int or np.ndarray
Same type as length. Integer number of bins.
-
opexebo.general.
circular_mask
(axes, diameter, **kwargs)¶ Given a set of axes, construct a circular mask defining pixels as within or without a circular arena.
A pixel is assumed to be inside the circle iff the centre-point is within the radius of the circle. For small radii, this will result in including pixels with <50% of their area inside.
Parameters: - axes: list of np.ndarray
[x, y] - pair of nd-arrays defining the edges of the pixels. This is the return value from np.histogram2d, for instance. Each one is size+1 relative to the number of pixels This function assumes that each dimension has a CONSTANT bin width, although each dimension may have its own bin width.
- diameter: float
diameter of the circle
- origin: list of floats
[x, y] co-ordinates of the centre of the arena. Optional, defaults to (0,0)
Returns: - in_field: np.ndarray
2D boolean array of dimensions (axes[0].size-1, axs[1].size-1) Values are true if INSIDE circle and false if OUTSIDE
- distance_map: np.ndarray
2D float array of same size. Values are the distance of the centre of the pixel from the origin
- angular_map: np.ndarray
2D float array of the same size. Values are the angle from the origin to the pixel, in degrees. Angles are given with 0° pointing to increasing y, clockwise.
-
opexebo.general.
upsample
(array, upscale, **kwargs)¶ A MaskedArray aware upsampling routine.
skimage provides more reliable methods for interpolated upsampling of simple ndarrays. However, this is not appropriate to MaskedArrays, i.e. where certain locations have invalid values that should not expand.
For non-masked arrays, this function can optionally rely on the skimage routine allowing fractional upscaling. For MaskedArrays, integer upscaling is enforced, with no interpolation
Parameters: - array: array-like
Supports list, tuple, np.ndarray or np.ma.MaskedArray. Supports 1d and 2d (2d for true arrays only). If the array contains any non-finite values, or is a MaskedArray, then integer upscaling is enforced
- upscale: int or float
Upscaling factor. The output array shape will be this factor larger. Method of upscaling depends on type(upscale).
- int -> integer upscaling with no interpolation
- float -> upscaling with interpolation
Note the type distinction: type(1) == int; type(1.) == float
- debug: bool, optional
Print out debugging information while running. Default False
Returns: - new_array : np.ndarray or np.ma.MaskedArray
Upsampled array
-
opexebo.general.
walk_filter
(speed: numpy.ndarray, speed_cutoff: float, *args, fmt='remove')¶ It is common practice when studying a freely moving subject to exclude data from periods when the subject was stationary, or nearly stationary. This method is described as a “walk-filter” - a high-pass filter on subject speed.
This function allows an arbitrary number of arrays to be filtered in parallel to the speed (or whatever other filtering criteria are used). Filters can be performed either by removing the unwanted elements, or by masking them and retunring a MaskedArray.
Parameters: - speed : np.ndarray
Array of speeds for other data points
- speed_cutoff : float
The cutoff, below which values in
speed
will be excluded.- *args : np.ndarray, optional
Any other arrays that should be filtered in parallel with speed Optional arguments here _must_ be np.ndarrays with size equal to that of
speed
- fmt : str, optional
Either “remove” or “mask”. Determines how the values are returned “remove” (default) - the invalid valaues are removed from the array “mask” - the original array is returned as a MaskedArray, with the invalid values masked out.
Returns: - np.ndarray
Filtered copy of
speed
- [np.ndarray]
Arbitrary other filtered arrays, if any other arrays were provided as *args