cooltools.lib package

common

cooltools.lib.common.align_track_with_cooler(track, clr, view_df=None, clr_weight_name='weight', mask_clr_bad_bins=True, drop_track_na=True)[source]

Sync a track dataframe with a cooler bintable.

Checks that bin sizes match between a track and a cooler, merges the cooler bintable with the track, and propagates masked regions from a cooler bintable to a track.

Parameters
  • track (pd.DataFrame) – bedGraph-like track DataFrame to check

  • clr (cooler) – cooler object to check against

  • view_df (bioframe.viewframe or None) – Optional viewframe of regions to check for their number of bins with assigned track values. If None, constructs a view_df from cooler chromsizes.

  • clr_weight_name (str) – Name of the column in the bin table with weight

  • mask_clr_bad_bins (bool) – Whether to propagate null bins from cooler bintable column clr_weight_name to the ‘value’ column of the output clr_track. Default True.

  • drop_track_na (bool) – Whether to ignore missing values in the track (as if they are absent). Important for raising errors for unassigned regions and warnings for partial assignment. Default True, so NaN values are treated as not assigned. False means that NaN values are treated as assigned.

Returns

clr_track – track dataframe that has been aligned with the cooler bintable and has columns [‘chrom’,’start’,’end’,’value’]
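
Examples

A minimal usage sketch; the cooler URI, track values, and view below are hypothetical:

>>> import cooler
>>> import pandas as pd
>>> from cooltools.lib.common import align_track_with_cooler
>>> clr = cooler.Cooler("test.mcool::resolutions/10000")  # hypothetical cooler
>>> track = pd.DataFrame({
...     "chrom": ["chr1", "chr1"],
...     "start": [0, 10000],
...     "end": [10000, 20000],
...     "value": [0.5, 1.2],
... })
>>> view_df = pd.DataFrame(
...     {"chrom": ["chr1"], "start": [0], "end": [20000], "name": ["chr1:0-20000"]}
... )
>>> clr_track = align_track_with_cooler(track, clr, view_df=view_df)
>>> # clr_track has columns ['chrom', 'start', 'end', 'value'], one row per bin in the view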

cooltools.lib.common.assign_regions(features, supports)[source]

DEPRECATED. Will be removed in future versions and replaced with bioframe.overlap(). For each feature in the features dataframe, assign the genomic region (support) that overlaps with it. If a feature overlaps multiple supports, the region with the largest overlap is reported.

cooltools.lib.common.assign_regions_to_bins(bin_ids, regions_span)[source]
cooltools.lib.common.assign_supports(features, supports, labels=False, suffix='')[source]

Assign support regions to a table of genomic intervals. Obsolete; replaced by assign_regions.

Parameters
  • features (DataFrame) – Dataframe with columns chrom, start, end or chrom1, start1, end1, chrom2, start2, end2

  • supports (array-like) – Support areas

cooltools.lib.common.assign_view_auto(features, view_df, cols_unpaired=['chrom', 'start', 'end'], cols_paired=['chrom1', 'start1', 'end1', 'chrom2', 'start2', 'end2'], cols_view=['chrom', 'start', 'end'], features_view_col_unpaired='region', features_view_cols_paired=['region1', 'region2'], view_name_col='name', drop_unassigned=False, combined_assignments_column='region', force=True)[source]

Assign region names from the view to each feature

Determines whether the features are unpaired (1D, bed-like) or paired (2D, bedpe-like) based on the presence of column names (cols_unpaired vs cols_paired). Assigns a regular 1D view, independently to each side in the case of paired features. Adds one column with region names (features_view_col_unpaired) for unpaired features, or two columns (features_view_cols_paired) for paired features.

Parameters
  • features (pd.DataFrame) – bedpe-style dataframe

  • view_df (pandas.DataFrame) – ViewFrame specifying region starts and ends for assignment. Dictionary and pd.Series formats are converted to viewframes when possible.

  • cols_unpaired (list of str) – The names of columns containing the chromosome, start and end of the genomic intervals for unpaired features. The default values are “chrom”, “start”, “end”.

  • cols_paired (list of str) – The names of columns containing the chromosome, start and end of the genomic intervals for paired features. The default values are “chrom1”, “start1”, “end1”, “chrom2”, “start2”, “end2”.

  • cols_view (list of str) – The names of columns containing the chromosome, start and end of the genomic intervals in the view. The default values are “chrom”, “start”, “end”.

  • features_view_col_unpaired (str) – Name of the column in which to save the assigned region name for unpaired features

  • features_view_cols_paired (list of str) – Names of the columns in which to save the assigned region names for paired features

  • view_name_col (str) – Column of view_df with region names. Default “name”.

  • drop_unassigned (bool) – If True, drop intervals in features that do not overlap a region in the view. Default False.

  • combined_assignments_column (str or None) – If a string, combine the assignments from the two sides of paired features into a column with this name: the shared region name when both sides are assigned to the same region, np.nan otherwise. Default “region”.

  • force (bool) – If features already have the assigned region columns (features_view_col_unpaired or features_view_cols_paired, depending on the feature type), whether to overwrite them. Default True.
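
Examples

A minimal usage sketch with an unpaired (bed-like) feature table; the view and features below are toy inputs:

>>> import pandas as pd
>>> from cooltools.lib.common import assign_view_auto
>>> view_df = pd.DataFrame(
...     {"chrom": ["chr1"], "start": [0], "end": [1000], "name": ["regA"]}
... )
>>> features = pd.DataFrame({"chrom": ["chr1"], "start": [100], "end": [200]})
>>> assigned = assign_view_auto(features, view_df)
>>> # a 'region' column is added, with value 'regA' for the overlapping interval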

cooltools.lib.common.assign_view_paired(features, view_df, cols_paired=['chrom1', 'start1', 'end1', 'chrom2', 'start2', 'end2'], cols_view=['chrom', 'start', 'end'], features_view_cols=['region1', 'region2'], view_name_col='name', drop_unassigned=False)[source]

Assign region names from the view to each feature

Assigns a regular 1D view independently to each side of a bedpe-style dataframe. Will add two columns with region names (features_view_cols)

Parameters
  • features (pd.DataFrame) – bedpe-style dataframe

  • view_df (pandas.DataFrame) – ViewFrame specifying region starts and ends for assignment. Dictionary and pd.Series formats are converted to viewframes when possible.

  • cols_paired (list of str) – The names of columns containing the chromosome, start and end of the genomic intervals. The default values are “chrom1”, “start1”, “end1”, “chrom2”, “start2”, “end2”.

  • cols_view (list of str) – The names of columns containing the chromosome, start and end of the genomic intervals in the view. The default values are “chrom”, “start”, “end”.

  • features_view_cols (list of str) – Names of the columns in which to save the assigned region names

  • view_name_col (str) – Column of view_df with region names. Default “name”.

  • drop_unassigned (bool) – If True, drop intervals in features that do not overlap a region in the view. Default False.

cooltools.lib.common.make_cooler_view(clr, ucsc_names=False)[source]

Generate a full chromosome viewframe using cooler’s chromsizes

Parameters
  • clr (cooler) – cooler-object to extract chromsizes

  • ucsc_names (bool) – Use full UCSC formatted names instead of short chromosome names.

Returns

cooler_view (viewframe) – full chromosome viewframe
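
Examples

For instance (the cooler path is hypothetical):

>>> import cooler
>>> from cooltools.lib.common import make_cooler_view
>>> clr = cooler.Cooler("test.cool")  # hypothetical cooler
>>> cooler_view = make_cooler_view(clr)
>>> # one row per chromosome, with columns ['chrom', 'start', 'end', 'name']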

cooltools.lib.common.mask_cooler_bad_bins(track, bintable)[source]

Mask (set to NaN) values in track where bin is masked in bintable.

Currently used in cli.get_saddle(). TODO: determine if this should be used elsewhere.

Parameters
  • track (tuple of (DataFrame, str)) – bedGraph-like dataframe along with the name of the value column.

  • bintable (tuple of (DataFrame, str)) – bedGraph-like dataframe along with the name of the weight column.

Returns

track (DataFrame) – New bedGraph-like dataframe with bad bins masked in the value column

cooltools.lib.common.pool_decorator(func)[source]

A decorator that enables multiprocessing for a given function. The decorated function must have a map_functor keyword argument. The decorator works by hijacking the map_functor argument and substituting it with the parallel multiprocess.Pool(nproc).imap when nproc > 1.

Parameters

func (callable) – The function to be decorated.

Returns

A wrapper function that enables multiprocessing for the given function.
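
Examples

A minimal sketch of the intended usage; the worker function below is hypothetical, and the number of processes is passed via the nproc keyword described above (serial execution is assumed when nproc is not given):

>>> from cooltools.lib.common import pool_decorator
>>> @pool_decorator
... def total(values, map_functor=map):
...     # map_functor is the serial built-in map unless substituted by the decorator
...     return sum(map_functor(abs, values))
>>> total(range(10))           # serial execution
>>> total(range(10), nproc=4)  # uses multiprocess.Pool(4).imap instead of map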

cooltools.lib.common.view_from_track(track_df)[source]

numutils

cooltools.lib.numutils.COMED(xs, ys, has_nans=False)[source]

Calculate the comedian - the robust median-based counterpart of Pearson’s r.

comedian = median((xs-median(xs))*(ys-median(ys))) / MAD(xs) / MAD(ys)
Parameters

has_nans (bool) – if True, mask (x,y) pairs with at least one NaN

Notes

Citations: “On MAD and comedians” by Michael Falk (1997), “Robust Estimation of the Correlation Coefficient: An Attempt of Survey” by Georgy Shevlyakov and Pavel Smirnov (2011)

cooltools.lib.numutils.MAD(arr, axis=None, has_nans=False)[source]

Calculate the Median Absolute Deviation from the median.

Parameters
  • arr (np.ndarray) – Input data.

  • axis (int) – The axis along which to calculate MAD.

  • has_nans (bool) – If True, use the slower NaN-aware method to calculate medians.
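
Examples

Under the definition above (the median of absolute deviations from the median), a single outlier barely affects the result:

>>> import numpy as np
>>> from cooltools.lib.numutils import MAD
>>> MAD(np.array([1.0, 2.0, 3.0, 4.0, 100.0]))  # median 3; deviations 2, 1, 0, 1, 97
1.0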

cooltools.lib.numutils.adaptive_coarsegrain(ar, countar, cutoff=5, max_levels=8, min_shape=8)[source]

Adaptively coarsegrain a Hi-C matrix based on local neighborhood pooling of counts.

Parameters
  • ar (array_like, shape (n, n)) – A square Hi-C matrix to coarsegrain. Usually this would be a balanced matrix.

  • countar (array_like, shape (n, n)) – The raw count matrix for the same area. Has to be the same shape as the Hi-C matrix.

  • cutoff (float, optional) – A minimum number of raw counts per pixel required to stop 2x2 pooling. Larger cutoff values lead to a more coarse-grained but smoother map. A value of 3 works well for display purposes; the cutoff can be lowered to 1 or 2 to make the map less pixelated. Setting it to 1 will only ensure there are no zeros in the map.

  • max_levels (int, optional) – How many levels of coarsening to perform. It is safe to keep this number large, as a very coarsened map will have large counts and no substitutions will be made at coarser levels.

  • min_shape (int, optional) – Stop coarsegraining when the coarsegrained array shape is smaller than this.

Returns

Smoothed array, shape (n, n)

Notes

The algorithm works as follows:

First, it pads the array with NaNs to the nearest power of two. Second, it coarsens the array in powers of two until its size is less than min_shape.

Third, it starts with the most coarsened array and goes one level up. It looks at the 4 pixels that make up each pixel of the second-most coarsened array. If the raw counts for any valid (non-NaN) pixel are less than cutoff, it replaces the values of the valid (4 or fewer) pixels with their NaN-aware average. This is then applied to the next (less coarsened) level, until the original resolution is reached.

In the resulting matrix there are guaranteed to be no zeros, unless very large zero-only areas were provided, such that zeros were produced max_levels times during coarsening.

Examples

>>> c = cooler.Cooler("/path/to/some/cooler/at/about/2000bp/resolution")
>>> # sample region of about 6000x6000
>>> mat = c.matrix(balance=True).fetch("chr1:10000000-22000000")
>>> mat_raw = c.matrix(balance=False).fetch("chr1:10000000-22000000")
>>> mat_cg = adaptive_coarsegrain(mat, mat_raw)
>>> plt.figure(figsize=(16,7))
>>> ax = plt.subplot(121)
>>> plt.imshow(np.log(mat), vmax=-3)
>>> plt.colorbar()
>>> plt.subplot(122, sharex=ax, sharey=ax)
>>> plt.imshow(np.log(mat_cg), vmax=-3)
>>> plt.colorbar()
cooltools.lib.numutils.coarsen(reduction, x, axes, trim_excess=False)[source]

Coarsen an array by applying reduction to fixed size neighborhoods. Adapted from dask.array.coarsen to work on regular numpy arrays.

Parameters
  • reduction (function) – Function like np.sum, np.mean, etc…

  • x (np.ndarray) – Array to be coarsened

  • axes (dict) – Mapping of axis to coarsening factor

  • trim_excess (bool, optional) – Remove excess elements. Default is False.

Examples

Provide dictionary of scale per dimension

>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])
>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
       [12, 15]])

See also

dask.array.coarsen

cooltools.lib.numutils.dist_to_mask(mask, side='min')[source]

Calculate the distance to the nearest True element of an array.

Parameters
  • mask (iterable of bool) – A boolean array.

  • side (str) – Which side to compute the distance to. Accepted values: ‘left’ – the distance to the nearest True value on the left; ‘right’ – the distance to the nearest True value on the right; ‘min’ – the distance to the closest True value; ‘max’ – the distance to the furthest of the two neighbouring True values.

Returns

dist (array of int)

Notes

The solution is borrowed from https://stackoverflow.com/questions/18196811/cumsum-reset-at-nan
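
Examples

With the default side=‘min’, each position receives the distance to the closest True (output expected under the definition above):

>>> import numpy as np
>>> from cooltools.lib.numutils import dist_to_mask
>>> dist_to_mask(np.array([False, True, False, False, True]))
array([1, 0, 1, 1, 0])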

cooltools.lib.numutils.fill_diag(arr, x, i=0, copy=True)[source]

Identical to set_diag, but returns a copy by default

cooltools.lib.numutils.fill_inf(arr, pos_value=0, neg_value=0, copy=True)[source]

Replaces positive and negative infinity entries in an array with the provided values.

Parameters
  • arr (np.array) –

  • pos_value (float) – Fill value for np.inf

  • neg_value (float) – Fill value for -np.inf

  • copy (bool, optional) – If True, creates a copy of arr, otherwise replaces values in-place. By default, True.

cooltools.lib.numutils.fill_na(arr, value=0, copy=True)[source]

Replaces np.nan entries in an array with the provided value.

Parameters
  • arr (np.array) –

  • value (float) –

  • copy (bool, optional) – If True, creates a copy of arr, otherwise replaces values in-place. By default, True.

cooltools.lib.numutils.fill_nainf(arr, value=0, copy=True)[source]

Replaces np.nan and np.inf entries in an array with the provided value.

Parameters
  • arr (np.array) –

  • value (float) –

  • copy (bool, optional) – If True, creates a copy of arr, otherwise replaces values in-place. By default, True.

Notes

Differs from np.nan_to_num in that it replaces np.inf with the same number as np.nan.
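
Examples

For example, NaN and both infinities are replaced by the same value:

>>> import numpy as np
>>> from cooltools.lib.numutils import fill_nainf
>>> fill_nainf(np.array([1.0, np.nan, np.inf, -np.inf]))
array([1., 0., 0., 0.])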

cooltools.lib.numutils.get_diag(arr, i=0)[source]

Get the i-th diagonal of a matrix. This solution was borrowed from http://stackoverflow.com/questions/9958577/changing-the-values-of-the-diagonal-of-a-matrix-in-numpy

cooltools.lib.numutils.get_eig(mat, n=3, mask_zero_rows=False, subtract_mean=False, divide_by_mean=False)[source]

Perform an eigenvector decomposition.

Parameters
  • mat (np.ndarray) – A square matrix, must not contain nans, infs or zero rows.

  • n (int) – The number of eigenvectors to return. Output is backfilled with NaNs when n exceeds the size of the input matrix.

  • mask_zero_rows (bool) – If True, mask empty rows/columns before eigenvector decomposition. Works only with symmetric matrices.

  • subtract_mean (bool) – If True, subtract the mean from the matrix.

  • divide_by_mean (bool) – If True, divide the matrix by its mean.

Returns

  • eigvecs (np.ndarray) – An array of eigenvectors (in rows), sorted by a decreasing absolute eigenvalue.

  • eigvals (np.ndarray) – An array of sorted eigenvalues.
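
Examples

A minimal sketch on a toy symmetric matrix (eigenvector signs may differ between platforms, so only the eigenvalues are noted in the comment):

>>> import numpy as np
>>> from cooltools.lib.numutils import get_eig
>>> mat = np.array([[2.0, 1.0], [1.0, 2.0]])
>>> eigvecs, eigvals = get_eig(mat, n=2)
>>> # eigvals is sorted by decreasing absolute value: 3.0, then 1.0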

cooltools.lib.numutils.get_finite(arr)[source]

Select only finite elements of an array.

cooltools.lib.numutils.get_kernel(w, p, ktype)[source]

Return typical kernels given size parameters w, p, and kernel type.

Parameters
  • w (int) – Outer kernel size (actually half of it).

  • p (int) – Inner kernel size (half of it).

  • ktype (str) – Name of the kernel type, could be one of the following: ‘donut’, ‘vertical’, ‘horizontal’, ‘lowleft’, ‘upright’.

Returns

kernel (ndarray) – A square matrix of int type filled with 1 and 0, according to the kernel type.
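
Examples

A minimal sketch; assuming the half-size convention above, the full kernel is a (2*w+1) x (2*w+1) matrix of ones and zeros:

>>> from cooltools.lib.numutils import get_kernel
>>> kernel = get_kernel(3, 1, 'donut')
>>> # a square 0/1 matrix; with the half-size convention above, shape (7, 7)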

cooltools.lib.numutils.infer_mask2D(mat)[source]
cooltools.lib.numutils.interp_nan(a_init, pad_zeros=True, method='linear', verbose=False)[source]

Linearly interpolate to fill NaN rows and columns in a matrix. Also interpolates NaNs in 1D arrays.

Parameters
  • a_init (np.array) –

  • pad_zeros (bool, optional) – If True, pads the matrix with zeros to fill NaNs at the edges. By default, True.

  • method (str, optional) – For 2D: “linear”, “nearest”, or “splinef2d” For 1D: “linear”, “nearest”, “zero”, “slinear”, “quadratic”, “cubic”

Returns

array with NaNs linearly interpolated

Notes

The 1D case was adapted from https://stackoverflow.com/a/39592604. The 2D case assumes that entire rows or columns are masked and that the edges are NaN-free, but it is much faster than a griddata implementation.
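
Examples

In the 1D case, an interior NaN is filled by linear interpolation:

>>> import numpy as np
>>> from cooltools.lib.numutils import interp_nan
>>> filled = interp_nan(np.array([1.0, np.nan, 3.0]))
>>> # the interior NaN is replaced by the linear interpolant, 2.0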

cooltools.lib.numutils.interpolate_bad_singletons(mat, mask=None, fillDiagonal=True, returnMask=False, secondPass=True, verbose=False)[source]

Interpolate singleton missing bins for visualization

Examples

>>> ax = plt.subplot(121)
>>> maxval = np.log(np.nanmean(np.diag(mat, 3)) * 2)
>>> plt.matshow(np.log(mat), vmax=maxval, fignum=False)
>>> plt.set_cmap('fall');
>>> plt.subplot(122, sharex=ax, sharey=ax)
>>> plt.matshow(
...     np.log(interpolate_bad_singletons(remove_good_singletons(mat))),
...     vmax=maxval,
...     fignum=False
... )
>>> plt.set_cmap('fall');
>>> plt.show()
cooltools.lib.numutils.is_symmetric(mat)[source]

Check if a matrix is symmetric.

cooltools.lib.numutils.normalize_score(arr, norm='z', axis=None, has_nans=True)[source]

Normalize an array by subtracting the first moment and dividing the residual by the second.

Parameters
  • arr (np.ndarray) – Input data.

  • norm (str) – The type of normalization:

    ‘z’ – report z-scores, norm_arr = (arr - mean(arr)) / std(arr)

    ‘mad’ – report deviations from the median in units of MAD (Median Absolute Deviation from the median), norm_arr = (arr - median(arr)) / MAD(arr)

    ‘madz’ – report robust z-scores, i.e. estimate the mean as the median and the standard error as MAD / 0.67499, norm_arr = (arr - median(arr)) / MAD(arr) * 0.67499

  • axis (int) – The axis along which to calculate the normalization parameters.

  • has_nans (bool) – If True, use slower NaN-aware methods to calculate the normalization parameters.
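
Examples

A minimal sketch of z-score normalization, expressed via the formula above rather than printed output:

>>> import numpy as np
>>> from cooltools.lib.numutils import normalize_score
>>> arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> z = normalize_score(arr, norm='z')
>>> # z equals (arr - mean(arr)) / std(arr), up to the NaN-aware handling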

cooltools.lib.numutils.persistent_log_bins(end=10, bins_per_order_magnitude=10)[source]

Creates nicely looking log-spaced integer bins starting at 1, with the specified number of bins per order of magnitude.

Parameters
  • end (number, int recommended) – log10 of the last value. It is safe to put a large value here and select your range of bins later.

  • bins_per_order_magnitude (int > 0) – How many bins per order of magnitude.

Notes

This is not a replacement for logbins, and it has a different purpose.

Difference between this and logbins

Logbins creates bins from lo to hi, spaced logarithmically with an approximate ratio. Logbins makes sure that the last bin is large (i.e. hi/ratio … hi), and will not allow the last bin to be much less than ratio; it slightly adjusts the ratio to achieve that. As a result, by construction, logbins bins differ for different lo or hi.

This function is designed to create exactly the same bins that only depend on one parameter, bins_per_order_magnitude. The goal is to make things calculated for different datasets/organisms/etc. comparable. For example, if these bins are used, it would allow us to divide P(s) for two different organisms by each other because it was calculated for the same bins.

The price you pay for such versatility is that, in a real application, the last bin can be much smaller than the others. For example, if you have 10 bins per order of magnitude (a ratio of about 1.26) but your data ends at 10500, the only points in the last bin would be 10000..10500, 1/5 of what it could be. This may make the last point noisy.

The main part is done using np.logspace and rounding to the nearest integer, followed by unique. The gaps are then re-sorted to ensure that gaps are strictly increasing. The re-sorting of gaps was essential, and produced better results than manual adjustment.

Alternatives that produce irregular bins

Using np.unique(np.logspace(a, b, N, dtype=int)) can be sub-optimal. For example, np.unique(np.logspace(0, 1, 11, dtype=int)) = [1, 2, 3, 5, 6, 7, 10]; note how the gaps jump from 1 to 2 and back to 1.

Similarly, using np.unique(np.rint(np.logspace(...))) can be sub-optimal: np.unique(np.array(np.rint(np.logspace(0, 1, 9)), dtype=int)) = [1, 2, 3, 4, 6, 7, 10].

For bins_per_order_magnitude=16, 10 is not in the bins. Other than that, 10, 100, 1000, etc. are always included.
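
Examples

A minimal sketch; per the description above, the bins start at 1, end at 10**end, and depend only on bins_per_order_magnitude:

>>> from cooltools.lib.numutils import persistent_log_bins
>>> bins = persistent_log_bins(end=2, bins_per_order_magnitude=10)
>>> # integer bin edges from 1 to 100, identical for any dataset using these settings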

cooltools.lib.numutils.remove_good_singletons(mat, mask=None, returnMask=False)[source]
cooltools.lib.numutils.robust_gauss_filter(ar, sigma=2, functon=scipy.ndimage.gaussian_filter1d, kwargs=None)[source]

Implements an edge-handling mode for the gaussian filter that effectively ignores anything beyond the edge, and also handles NaNs.

Parameters
  • ar (array-like) – Input array

  • sigma (float) – sigma to be passed to the filter

  • functon (callable) – Filter to use. Default is scipy.ndimage.gaussian_filter1d.

  • kwargs (dict) – Additional keyword arguments to pass to the filter. Default: None.

Notes

Available edge-handling modes in ndimage.filters attempt to somehow “extrapolate” the edge value and then apply the filter (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.convolve.html). That is likely because convolve uses the fast Fourier transform, which requires the kernel to be constant. Here we design a better edge-handling mode for gaussian smoothing.

In a gaussian-filtered array, a pixel away from the edge is a mean of nearby pixels with gaussian weights. With this mode, pixels near the start/end are also a mean of nearby pixels with gaussian weights. That’s it. If we encounter NaNs, we simply ignore them, following the same definition: a mean of nearby valid pixels. Yes, this raises the weights of the first/last pixels, because only a part of the whole gaussian is used (up to 1/2 for the first/last pixel at large sigma), but it preserves the “mean of nearby pixels” definition. It is different from padding with zeros, which would drag the first pixel down to be more like zero. It is also different from “nearest”, which gives too much weight to the first/last pixel.

To achieve this smoothing, we perform regular gaussian smoothing using mode=”constant” (pad with zeros). Then we take an array of valid pixels and smooth it the same way. This calculates how many “average valid pixels” contributed to each point of the smoothed array. Dividing one by the other achieves the desired result.
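
Examples

A minimal sketch; NaNs in the input are ignored rather than propagated, per the notes above:

>>> import numpy as np
>>> from cooltools.lib.numutils import robust_gauss_filter
>>> arr = np.array([1.0, np.nan, 3.0, 4.0, 5.0])
>>> smoothed = robust_gauss_filter(arr, sigma=1)
>>> # each output value is a gaussian-weighted mean of the nearby valid pixels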

cooltools.lib.numutils.set_diag(arr, x, i=0, copy=False)[source]

Rewrite the i-th diagonal of a matrix with a value or an array of values. Supports 2D arrays, square or rectangular. In-place by default.

Parameters
  • arr (2-D array) – Array whose diagonal is to be filled.

  • x (scalar or 1-D vector of correct length) – Values to be written on the diagonal.

  • i (int, optional) – Which diagonal to write to. Default is 0. Main diagonal is 0; upper diagonals are positive and lower diagonals are negative.

  • copy (bool, optional) – Return a copy. The diagonal is written in-place if False. Default is False.

Returns

Array with diagonal filled.

Notes

Similar to numpy.fill_diagonal, but allows for kth diagonals as well. This solution was borrowed from http://stackoverflow.com/questions/9958577/changing-the-values-of-the-diagonal-of-a-matrix-in-numpy
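
Examples

Writing the first superdiagonal in place:

>>> import numpy as np
>>> from cooltools.lib.numutils import set_diag
>>> a = np.zeros((3, 3), dtype=int)
>>> set_diag(a, 1, i=1)  # modifies a in place and returns it
array([[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]])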

cooltools.lib.numutils.slice_sorted(arr, lo, hi)[source]

Get the subset of a sorted array with values >= lo and < hi. A faster version of arr[(arr >= lo) & (arr < hi)].
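
Examples

For example:

>>> import numpy as np
>>> from cooltools.lib.numutils import slice_sorted
>>> slice_sorted(np.array([1, 3, 5, 7, 9]), 3, 8)
array([3, 5, 7])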

cooltools.lib.numutils.smooth(y, box_pts)[source]
cooltools.lib.numutils.stochastic_sd(arr, n=10000, seed=0)[source]

Estimate the standard deviation of an array from a random subset of its elements.

Parameters
  • n (int) – The number of elements to consider. If the array contains fewer elements, use all.

  • seed (int) – The seed for the random number generator.

cooltools.lib.numutils.weighted_groupby_mean(df, group_by, weigh_by, mode='mean')[source]

Compute the weighted mean, std, or std in log space for a dataframe groupby.

Parameters
  • df (dataframe) – Dataframe to groupby

  • group_by (str or list) – Columns to group by

  • weigh_by (str) – Column to use as weights

  • mode ("mean", "std" or "logstd") – Compute the weighted mean, the weighted standard deviation, or the weighted std in log-space from the mean-log value (useful for P(s) etc.)

cooltools.lib.numutils.zoom_array(in_array, final_shape, same_sum=False, zoom_function=functools.partial(scipy.ndimage.zoom, order=1), **zoom_kwargs)[source]

Rescale an array or image.

Normally, one can use scipy.ndimage.zoom to do array/image rescaling. However, scipy.ndimage.zoom does not coarsegrain images well: it takes the nearest neighbor rather than averaging all the pixels when coarsegraining arrays, which increases noise. Photoshop doesn’t do that; it performs some smart interpolation-averaging instead.

If you were to coarsegrain an array by an integer factor, e.g. 100x100 -> 25x25, you just need to do block-averaging, that’s easy, and it reduces noise. But what if you want to coarsegrain 100x100 -> 30x30?

Then, my friend, you are in trouble. But this function will help you: it will blow up your 100x100 array to a 120x120 array using scipy.ndimage.zoom, and then coarsegrain the 120x120 array by block-averaging in 4x4 chunks.

It will do it independently for each dimension, so if you want a 100x100 array to become a 60x120 array, it will blow up the first and the second dimension to 120, and then block-average only the first dimension.

(Copied from mirnylib.numutils)

Parameters
  • in_array (ndarray) – n-dimensional numpy array (1D also works)

  • final_shape (shape tuple) – resulting shape of an array

  • same_sum (bool, optional) – Preserve the sum of the array rather than the values. By default, values are preserved.

  • zoom_function (callable) – By default, scipy.ndimage.zoom with order=1. You can plug in your own.

  • **zoom_kwargs – Options to pass to zoom_function.

Returns

rescaled (ndarray) – Rescaled version of in_array
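
Examples

Coarsegraining by a non-integer factor:

>>> import numpy as np
>>> from cooltools.lib.numutils import zoom_array
>>> zoom_array(np.random.random((100, 100)), (30, 30)).shape
(30, 30)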

peaks

cooltools.lib.peaks.find_peak_prominence(arr, max_dist=None)[source]

Find the local maxima of an array and their prominence. The prominence of a peak is defined as the maximal difference between the height of the peak and the lowest point in the range extending to the nearest higher peak.

Parameters
  • arr (array_like) –

  • max_dist (int) – If specified, the distance to the adjacent higher peaks is limited by max_dist.

Returns

  • loc_max_poss (numpy.array) – The positions of local maxima of a given array.

  • proms (numpy.array) – The prominence of the detected maxima.
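
Examples

A minimal sketch on a toy array; the expected peaks follow from the definition above:

>>> import numpy as np
>>> from cooltools.lib.peaks import find_peak_prominence
>>> poss, proms = find_peak_prominence(np.array([0.0, 2.0, 0.0, 1.0, 0.0]))
>>> # local maxima at indices 1 and 3; the peak at index 3 has prominence 1
>>> # (its height, 1, minus the lowest point, 0, on the way to the higher peak)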

cooltools.lib.peaks.find_peak_prominence_iterative(arr, min_prom=None, max_prom=None, steps_prom=1000, log_space_proms=True, min_n_peak_pairs=5)[source]

Finds the minima/maxima of an array using the peakdet algorithm at different values of the threshold prominence. For each location, returns the maximal threshold prominence at which it is called a minimum/maximum.

Note that this function is inferior in every aspect to find_peak_prominence. We keep it for testing purposes and will remove it in the future.

Parameters
  • arr (array_like) –

  • min_prom (float) – The minimal value of the threshold prominence to probe. If None, it is inferred as the minimal non-zero difference between any two elements of arr.

  • max_prom (float) – The maximal value of the threshold prominence to probe. If None, it is inferred as the maximal non-zero difference between any two elements of arr.

  • steps_prom (int) – The number of threshold prominence values to probe in the range between min_prom and max_prom.

  • log_space_proms (bool) – If True, probe logarithmically spaced values of the threshold prominence in the range between min_prom and max_prom.

  • min_n_peak_pairs (int) – If the number of detected minima/maxima at a certain threshold prominence is < min_n_peak_pairs, the detected peaks are ignored.

Returns

minproms, maxproms (numpy.array) – The prominence of detected minima and maxima.

cooltools.lib.peaks.peakdet(arr, min_prominence)[source]

Detect local peaks in an array. Finds a sequence of minima and maxima such that the two consecutive extrema have a value difference (i.e. a prominence) >= min_prominence. This is analogous to the definition of prominence in topography: https://en.wikipedia.org/wiki/Topographic_prominence

The original peakdet algorithm was designed by Eli Billauer and described at http://billauer.co.il/peakdet.html (v. 3.4.05, “Explicitly not copyrighted. This function is released to the public domain; any use is allowed.”). The Python implementation was published by endolith on GitHub: https://gist.github.com/endolith/250860 .

Here, we use endolith’s implementation with minimal to no modifications to the algorithm, but with significant changes to the interface and the documentation.

Parameters
  • arr (array_like) –

  • min_prominence (float) – The minimal prominence of detected extrema.

Returns

maxidxs, minidxs (numpy.array) – The indices of the maxima and minima in arr.
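
Examples

A minimal sketch on a toy array; the expected extrema follow from the definition above:

>>> import numpy as np
>>> from cooltools.lib.peaks import peakdet
>>> maxidxs, minidxs = peakdet(np.array([0.0, 2.0, 0.0, 3.0, 1.0]), min_prominence=1.0)
>>> # maxima at indices 1 and 3; a minimum at index 2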

plotting

Migrated from mirnylib.plotting.

cooltools.lib.plotting.get_cmap(name)[source]
cooltools.lib.plotting.gridspec_inches(wcols, hrows, fig_kwargs={})[source]
cooltools.lib.plotting.list_to_colormap(color_list, name=None)[source]

schemas