Release notes
v0.7.0
New features
Add pool decorator to functions for supporting multiprocess
expected_cis
now accepts unbalanced cool file too
API changes
expected_cis
output cvd table now also includes “dist_bp”, “contact_frequency”, and “n_valid” columns
now returns “count.avg.smoothed” and “count.avg.smoothed.agg”, when
clr_weight_name=None, smooth=True, aggregate_smoothed=True
Maintenance
Replaced np.int with int in adaptive_coarsegrain
OE update in sandbox
Cross score sandbox fixes
Support for pandas 2
v0.6.0
New features
New function/tool
rearrange_cooler
to reorder/subset/flip regions of the genome in a coolerNew test dataset for micro-C from hESCs
API changes
snipping: reorder the axes of the output snipper array to (snippet_idx, i, j).
Maintenance
snipping: fix spurious nan->0 conversion of bad bins in on-diagonal pileups
snipping: fix snipping without provided view
snipping: fix for storing the stack in a file
virtual4c: fix for the case when viewpoint has no contacts
fix: Fix numba deprecation warnings by adding
nopython=True
Other small bugfixes
v0.5.3
Maintenance
Improvements for
read_expected_from_file
Bug fix for dot caller 0/0 occurrences
Remove
cytoolz
dependencyPin cooler <0.9 until compatibility
v0.5.2
API changes
remove custom bad_bins from expected & eigdecomp #336
coverage can store total cis counts in the cooler, and sampling can use cis counts #332
can now calculate coverge for balanced data #385
new drop_track_na argument for align_track_with_cooler, allows calcultions that that missing data in tracks as absent #360
multi-thread insulation by chromosome (TODO: by chunk)
Virtual 4C tool #378
CLI changes
CLI tool for
coverage()
Documentation
snipping documentation
dots tutorial
CLI tutorial
Maintenance
Dropped support for Python 3.7 (due to Pandas compatability issues)
Added support for Python 3.10
Minor bugfixes and compatibility updates
Pandas compatibility, pinned above 1.5.1
bioframe compatability
scikit-learn, pinned above >=1.1.2
saddle binedges, value limits #361
pileup CLI bugfix for reading features
Other
v0.5.1
API changes
cooltools.dots is the new user-facing function for calling dots
Maintenance
Compatibility with pandas 1.4
Strict dictinary typing for new numba versions
Update to bioframe 0.3.3
v0.5.0
NOTE: THIS RELEASE BREAKS BACKWARDS COMPATIBILITY!
This release addresses two major issues:
Integration with bioframe viewframes defined as of bioframe v0.3.
Synchronization of the CLI and Python API
Additionally, the documentation has been greatly improved and now includes detailed tutorials that show how to use the cooltools
API in conjunction with other Open2C libraries. These tutorials are automatically re-built from notebooks copied from https://github.com/open2c/open2c_examples repository.
API changes
More clear separation of top-level user-facing functions and low-level API.
Most standard analyses can be performed using just the user-facing functions which are imported into the top-level namespace. Some of them are new or heavily modified from earlier versions.
cooltools.expected_cis
andcooltools.expected_trans
for average by-diagonal contact frequency in intra-chromosomal data and in inter-chromosomal data, respectivelycooltools.eigs_cis
andcooltools.eigs_trans
for eigenvectors (compartment profiles) of cis and trans data, repectivelycooltools.digitize
andcooltools.saddle
can be used together for creation of 2D summary tables of Hi-C interactions in relation to a digitized genomic track, such as eigenvectorscooltools.insulation
for insulation score and annotation of insulating boundariescooltools.directionality
for directionality indexcooltools.pileup
for average signal at 1D or 2D genomic features, including APAcooltools.coverage
for calculation of per-bin sequencing depthcooltools.sample
for random downsampling of cooler filesFor non-standard analyses that require custom algorithms, a lower level API is available under
cooltools.api
Most functions now take an optional
view_df
argument. A pandas dataframe defining a genomic view (https://bioframe.readthedocs.io/en/latest/guide-technical-notes.html) can be provided to limit the analyses to regions included in the view. If not provided, the analysis is performed on whole chromosomes according to what’s stored in the cooler.All functions apart from
coverage
now take aclr_weight_name
argument to specify how the desired balancing weight column is named. Providing aNone
value allows one to use unbalanced data (except theeigs_cis
,eigs_trans
methods, since eigendecomposition is only defined for balanced Hi-C data).The output of
expected-cis
function has changed: it now containsregion1
andregion2
columns (with identical values in case of within-region expected). Additionally, it now allows smoothing of the result to avoid noisy values at long distances (enabled by default and result saved in additional columns of the dataframe)The new
cooltools.insulation
method includes a thresholding step to detect strong boundaries, using either the Li or the Otsu method (fromskimage.thresholding
), or a fixed float value. The result of thresholding for each window size is stored as a boolean in a new columnis_boundary_{window}
.New subpackage
sandbox
for experimental codes that are either candidates for merging into cooltools or candidates for removal. No documentation and tests are expected, proceed at your own risk.New subpackage
lib
for auxiliary modules
CLI changes
CLI tools are renamed with prefixes dropped (e.g.
diamond-insulation
is nowinsulation
), to align with names of user-facing API functions.The CLI tool for expected has been split in two for intra- and inter-chromosomal data (
expected-cis
andexpected-trans
, repectively).Similarly, the compartment profile calculation is now separate for cis and trans (
eigs-cis
andeigs-trans
).New CLI tool
cooltools pileup
for creation of average features based on Hi-C data. It takes a .bed- or .bedpe-style file to create average on-diagonal or off-diagonal pileups, respectively.
Maintenance
Support for Python 3.6 dropped
v0.4.0
Date: 2021-04-06
Maintenance
Make saddle strength work with NaNs
Add output option to diamond-insulation
Upgrade bioframe dependency
Parallelize random sampling
Various compatibility fixes to expected, saddle and snipping and elsewhere to work with standard formats for “expected” and “regions”: https://github.com/open2c/cooltools/issues/217
New features
New dataset download API
New functionality for smoothing P(s) and derivatives (API is not yet stable):
logbin_expected
,interpolate_expected
v0.3.2
Date: 2020-05-05
Updates and bug fixes
Error checking for vmin/vmax in compute-saddle
Various updates and fixes to expected and dot-caller code
Project health
Added docs on RTD, tutorial notebooks, code formatting, linting, and contribution guidelines.
v0.3.0
Date: 2019-11-04
Several library utilities added:
plotting.gridspec_inches
,adaptive_coarsegrain
, singleton interpolation, and colormaps.New tools:
cooltools sample
for random downsampling,cooltools coverage
for marginalization.
Improvements to saddle functions:
compute-saddle
now saves saddledata without transformation, and thescale
argument (with optionslog
orlinear
) now only determines how the saddle is plotted. Consequently,saddleplot
function now expects untransformedsaddledata
, and plots it directly or with log-scaling of the colormap. (https://github.com/open2c/cooltools/pull/105)Added
saddle.mask_bad_bins
method to filter bins in a track based on Hi-C bin-level filtering - improves saddle and histograms when using ChIP-seq and similar tracks. It is automatically applied in the CLI interface. Shouldn’t affect the results when using eigenvectors calculated from the same data.make_saddle
Python function andcompute-saddle
CLI now allow setting min and max distance to use for calculating saddles.