torch_topological.nn
Layers and loss terms for persistence-based optimisation.
- class torch_topological.nn.AlphaComplex(p=2)
Calculate persistence diagrams of an alpha complex.
This module calculates persistence diagrams of an alpha complex, i.e. a subcomplex of the Delaunay triangulation, which is sparse and thus often substantially smaller than other complexes.
It was first described in [Edelsbrunner94] and is particularly useful when analysing low-dimensional data.
Notes
At the moment, this alpha complex implementation, following other implementations, provides distance-based filtrations only. This means that the resulting persistence diagrams do not correspond to the circumradius of a simplex.
In addition, this implementation is work in progress. Some of the core features, such as handling of infinite features, are not available at the moment.
References
- Edelsbrunner94
H. Edelsbrunner and E.P. Mücke, “Three-dimensional alpha shapes”, ACM Transactions on Graphics, Volume 13, Number 1, pp. 43–72, 1994.
- __init__(p=2)
Initialise new alpha complex calculation module.
- Parameters
p (float) – Exponent for the p-norm calculation of distances.
Notes
This module currently only supports Minkowski norms. It does not yet support other metrics.
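To make the role of the exponent concrete, the following is a minimal sketch of a Minkowski p-norm distance in plain Python. The helper names `minkowski_distance` and `pairwise_distances` are hypothetical and not part of torch_topological; the module itself operates on tensors and may compute distances differently.

```python
import math


def minkowski_distance(a, b, p=2.0):
    """Return the Minkowski p-norm distance between two points."""
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        # p = inf corresponds to the maximum (Chebyshev) norm.
        return max(diffs, default=0.0)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def pairwise_distances(points, p=2.0):
    """Return the full n x n matrix of pairwise distances."""
    return [[minkowski_distance(a, b, p) for b in points] for a in points]


points = [(0.0, 0.0), (3.0, 4.0)]
D2 = pairwise_distances(points, p=2)  # Euclidean distances
D1 = pairwise_distances(points, p=1)  # Manhattan distances
```

Changing p only changes how coordinate differences are aggregated; the point cloud itself stays the same.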
- forward(x)
Implement forward pass for persistence diagram calculation.
The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.
- Parameters
x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.
- Returns
List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.
Generators will be represented in the persistence pairing based on proper creator–destroyer pairs of simplices. In dimension k, for instance, every generator is stored as a k-simplex followed by a (k+1)-simplex.
- Return type
list of PersistenceInformation
- training: bool
- class torch_topological.nn.CubicalComplex(superlevel=False, dim=None)
Calculate cubical complex persistence diagrams.
This module calculates ‘differentiable’ persistence diagrams for structured data, such as images. This is achieved by calculating a cubical complex.
Cubical complexes are the natural choice for calculating topological features of highly-structured inputs. See [Rieck20a] for an example of how to apply such topological features in practice.
References
- Rieck20a
B. Rieck et al., “Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence”, Advances in Neural Information Processing Systems 33, pp. 6900–6912, 2020.
- __init__(superlevel=False, dim=None)
Initialise new module.
- Parameters
superlevel (bool) – Indicates whether to calculate topological features based on superlevel sets. By default, sublevel set filtrations are used.
dim (int or None) – If set, describes dimension of input data. This is meant to be the dimension of an individual image without channel information, if any. The value of dim will change the way an input tensor is being handled: additional dimensions, if present, will be treated as batches or channels. If not set to an integer value, forward() will just guess what to do with an input (which should work in most cases).
- forward(x)
Implement forward pass for persistence diagram calculation.
The forward pass entails calculating persistent homology on a cubical complex and returning a set of persistence diagrams. The way the input will be interpreted depends on the presence of the dim attribute of this class. If dim is set, the last dim dimensions of an input tensor will be considered to contain the image data. If dim is not set, image dimensions will be guessed as follows:
Tensor of dim = 2: a single 2D image
Tensor of dim = 3: a single 2D image with channels
Tensor of dim = 4: a batch of 2D images with channels
See parameters for more details.
- Parameters
x (array_like) – Input image(s). If dim has not been set, will guess how to handle the input as follows: x can either be a 2D array of shape (H, W), which is treated as a single image, or a 3D array/tensor of the form (C, H, W), with C representing the number of channels, or a 4D array/tensor of the form (B, C, H, W), with B being the batch size. If dim has been set, the same handling strategy applies, but the last dim dimensions of the tensor are used for the cubical complex calculation. All remaining (leading) dimensions will be assumed to represent batches or channels (in this order). Hence, if dim is set, the tensor must have at most dim + 2 dimensions.
- Returns
List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements. The same applies to higher-order tensors.
- Return type
list of PersistenceInformation
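The shape-handling rules described for forward() can be sketched as a small helper in plain Python. The function `split_image_shape` is hypothetical and only mirrors the documented rules; it is not the code the module runs, and the actual implementation may treat edge cases differently.

```python
def split_image_shape(shape, dim=None):
    """Split a tensor shape into (batch, channels, image) parts.

    Hypothetical helper that only mirrors the documented rules.
    """
    shape = tuple(shape)
    if dim is None:
        # Guessing mode: 2D, 3D, and 4D inputs are read as 2D images.
        if len(shape) not in (2, 3, 4):
            raise ValueError("cannot guess layout for this shape")
        dim = 2
    if not dim <= len(shape) <= dim + 2:
        raise ValueError("expected between dim and dim + 2 dimensions")

    image = shape[-dim:]    # the last `dim` dimensions hold image data
    leading = shape[:-dim]  # remaining ones: batches, then channels
    if len(leading) == 2:
        batch, channels = leading
    elif len(leading) == 1:
        batch, channels = None, leading[0]
    else:
        batch, channels = None, None
    return batch, channels, image


single = split_image_shape((28, 28))
batched = split_image_shape((8, 3, 28, 28))
volumes = split_image_shape((8, 1, 4, 4, 4), dim=3)
```

With dim=3, a five-dimensional tensor is read as a batch of volumes with channels, which a pure guessing strategy could not infer.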
- training: bool
- class torch_topological.nn.PersistenceInformation(pairing, diagram, dimension=None)
Persistence information data structure.
This is a light-weight data structure for carrying information about the calculation of persistent homology. It consists of the following components:
A persistence pairing
A persistence diagram
An (optional) dimension
Due to its lightweight nature, no validity checks are performed, but all calculation modules should return a sequence of instances of the PersistenceInformation class.
Since this data class is shared with modules that are capable of calculating persistent homology, the exact form of the persistence pairing might change. Please refer to the respective classes for more documentation.
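As an illustration of the three components, here is a minimal stand-in written with the standard library. The class `PersistenceInfoSketch` is hypothetical; it only mimics the documented fields and should not be taken as the definition used by torch_topological.

```python
from typing import Any, NamedTuple, Optional


class PersistenceInfoSketch(NamedTuple):
    """Hypothetical stand-in with the three documented components."""

    pairing: Any                     # generators, e.g. simplex indices
    diagram: Any                     # (birth, death) pairs
    dimension: Optional[int] = None  # homology dimension, if known


info = PersistenceInfoSketch(
    pairing=[(1, 0, 1), (2, 1, 2)],
    diagram=[(0.0, 1.0), (0.0, 1.5)],
    dimension=0,
)

# No validity checks take place: fields are stored exactly as given.
lifetimes = [death - birth for birth, death in info.diagram]
```

Because nothing is validated, the consumer of such an object has to know which calculation module produced it in order to interpret the pairing.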
- class torch_topological.nn.SignatureLoss(p=2, normalise=True, dimensions=0)
Implement topological signature loss.
This module implements the topological signature loss first described in [Moor20a]. In contrast to the original code provided by the authors, this module also provides extensions to higher-dimensional generators if desired.
The module can be used in conjunction with any set of generators and persistence diagrams, i.e. with any set of persistence pairings and persistence diagrams. At the moment, it is restricted to calculating Minkowski distances for the loss calculation.
References
- Moor20a
M. Moor et al., “Topological Autoencoders”, Proceedings of the 37th International Conference on Machine Learning, PMLR 119, pp. 7045–7054, 2020.
- __init__(p=2, normalise=True, dimensions=0)
Create new loss instance.
- Parameters
p (float) – Exponent for the p-norm calculation of distances.
normalise (bool) – If set, normalises distances for each point cloud. This can be useful when working with batches.
dimensions (int or tuple of int) – Dimensions to use in the signature calculation. Following [Moor20a], this is set by default to 0.
- training: bool
- class torch_topological.nn.SummaryStatisticLoss(summary_statistic='total_persistence', **kwargs)
Implement loss based on summary statistic.
This is a generic loss function based on topological summary statistics. It implements a loss of the following form:
\[\|s(X) - s(Y)\|^p\]
In the preceding equation, \(s\) refers to a function that results in a scalar-valued summary of a persistence diagram.
- __init__(summary_statistic='total_persistence', **kwargs)
Create new loss function based on summary statistic.
- Parameters
summary_statistic (str) – Indicates which summary statistic function to use. Must be a summary statistics function that exists in the utilities module, i.e. torch_topological.utils.
At present, the following choices are valid:
torch_topological.utils.total_persistence (the default)
torch_topological.utils.p_norm
**kwargs – Optional keyword arguments, to be passed to the summary statistic function.
- forward(X, Y=None)
Calculate loss based on input tensor(s).
- Parameters
X (list of PersistenceInformation) – Source information. Supposed to contain persistence diagrams and persistence pairings.
Y (list of PersistenceInformation or None) – Optional target information. If set, evaluates a difference in loss functions as shown in the introduction. If None, a simpler variant of the loss will be evaluated.
- Returns
Loss based on the summary statistic selected by the client. Given a statistic \(s\), the function returns the following expression:
\[\|s(X) - s(Y)\|^p\]
In case no target tensor Y has been provided, the latter part of the expression amounts to 0.
- Return type
torch.tensor
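The expression above can be sketched in plain Python for diagrams given as lists of (birth, death) pairs. Both functions below are hypothetical illustrations: summing the statistic over all diagrams in a list is an assumption made here for simplicity, and the module's own aggregation and exponent handling may differ.

```python
def total_persistence(diagram, power=1):
    """Sum of (death - birth) ** power over all points of a diagram."""
    return sum(abs(death - birth) ** power for birth, death in diagram)


def summary_statistic_loss(X, Y=None, statistic=total_persistence, p=1):
    """Evaluate |s(X) - s(Y)| ** p; s(Y) is taken to be 0 if Y is None."""
    # Assumption: the statistic is summed over all diagrams in the list.
    s_x = sum(statistic(diagram) for diagram in X)
    s_y = 0.0 if Y is None else sum(statistic(diagram) for diagram in Y)
    return abs(s_x - s_y) ** p


X = [[(0.0, 1.0), (0.0, 2.0)]]  # one diagram with two points
Y = [[(0.0, 1.0)]]              # one diagram with one point

loss_without_target = summary_statistic_loss(X)
loss_with_target = summary_statistic_loss(X, Y)
```

Without a target, minimising the loss shrinks the statistic of X itself; with a target, it pulls the statistic of X towards that of Y.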
- training: bool
- class torch_topological.nn.VietorisRipsComplex(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)
Calculate Vietoris–Rips complex of a data set.
This module calculates ‘differentiable’ persistence diagrams for point clouds. The underlying topological approximations are done by calculating a Vietoris–Rips complex of the data.
- __init__(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)
Initialise new module.
- Parameters
dim (int) – Calculates persistent homology up to (and including) the prescribed dimension.
p (float) – Exponent indicating which Minkowski p-norm to use for the calculation of pairwise distances between points. Note that if treat_as_distances is supplied to forward(), the parameter is ignored and will have no effect. The rationale is to permit clients to use a pre-computed distance matrix, while always falling back to Minkowski norms.
threshold (float) – If set to a finite number, only calculates topological features up to the specified distance threshold. Thus, any persistence pairings may contain infinite features as well.
keep_infinite_features (bool) – If set, keeps infinite features. This flag is disabled by default. The rationale for this is that infinite features require more deliberate handling and, in case threshold is not changed, only a single infinite feature will not be considered in subsequent calculations.
**kwargs – Additional arguments to be provided to ripser, i.e. the backend for calculating persistent homology. The n_threads parameter, which controls parallelisation, is probably the most relevant parameter to be adjusted. Please refer to the giotto-ph documentation for more details on admissible parameters.
Notes
This module currently only supports Minkowski norms. It does not yet support other metrics internally. To use custom metrics, you need to set treat_as_distances in the forward() function instead.
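A custom metric therefore has to be turned into a square distance matrix before it is handed to forward(). The sketch below builds such a matrix in plain Python; `great_circle_distance` and `distance_matrix` are hypothetical helpers, and the commented-out call at the end is an assumption about typical usage that is not executed here.

```python
import math


def great_circle_distance(a, b):
    """Angle in radians between two unit vectors (a custom metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def distance_matrix(points, metric):
    """Build the square (n, n) matrix of pairwise distances."""
    return [[metric(a, b) for b in points] for a in points]


points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
D = distance_matrix(points, great_circle_distance)

# With torch and torch_topological installed, the matrix could then be
# passed on roughly as follows (assumed usage, not executed here):
#   vr = VietorisRipsComplex(dim=1)
#   result = vr(torch.as_tensor(D), treat_as_distances=True)
```

Note that the exponent p set in the constructor plays no role in this case, since no Minkowski distances are computed.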
- forward(x, treat_as_distances=False)
Implement forward pass for persistence diagram calculation.
The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.
- Parameters
x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.
treat_as_distances (bool) – If set, treats x as containing pre-computed distances between points. The semantics of how x is handled are not changed; the only difference is that when x has a shape of (n, d), the values of n and d need to be the same.
- Returns
List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.
Generators will be represented in the persistence pairing based on vertex–edge pairs (dimension 0) or edge–edge pairs. Thus, the persistence pairing in dimension zero will have three components, corresponding to a vertex and an edge, respectively, while the persistence pairing for higher dimensions will have four components.
- Return type
list of PersistenceInformation
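The three-component layout in dimension zero can be illustrated with a small union-find sketch that processes edges in order of increasing length. The function `zero_dimensional_pairs` is hypothetical and is not how the ripser backend works internally; in particular, the tie-breaking rule for which component dies is an assumption made for this example.

```python
def zero_dimensional_pairs(D):
    """Return (pairing, diagram) for dimension 0 of a distance matrix D.

    Each pairing entry is (vertex, u, v): the vertex whose component
    is destroyed, followed by the two vertices of the merging edge.
    """
    n = len(D)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((D[u][v], u, v) for u in range(n) for v in range(u + 1, n))
    pairing, diagram = [], []
    for weight, u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue  # this edge closes a cycle instead of merging
        # All vertices are born at 0; as a tie-break convention, the
        # component with the larger root index is the one that dies.
        young, old = max(root_u, root_v), min(root_u, root_v)
        parent[young] = old
        pairing.append((young, u, v))
        diagram.append((0.0, weight))
    return pairing, diagram


# Three points on a line at positions 0, 1, and 3.
D = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
pairing, diagram = zero_dimensional_pairs(D)
```

The sketch returns only the finite pairs; the one component that never dies corresponds to the single infinite feature mentioned for keep_infinite_features.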
- training: bool
- class torch_topological.nn.WassersteinDistance(p=inf, q=1)
Implement Wasserstein distance between persistence diagrams.
This module calculates the Wasserstein distance between two persistence diagrams. The Wasserstein distance is arguably the most common metric that is applied when dealing with such diagrams. Notice that calculating the metric involves solving optimal transport problems, which are known to suffer from scalability problems. When dealing with large persistence diagrams, other losses may be more appropriate.
- __init__(p=inf, q=1)
Create new Wasserstein distance calculation module.
- Parameters
p (float or inf) – Specifies the exponent of the norm to calculate. By default, p = torch.inf, corresponding to the maximum norm.
q (float) – Specifies the order of Wasserstein metric to calculate. This raises all internal matching costs to the power of q, hence subsequently returning the q-th root of the total cost.
- forward(X, Y)
Calculate Wasserstein metric based on input tensors.
- Parameters
X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.
Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.
- Returns
A single scalar tensor containing the distance between the persistence diagram(s) contained in X and Y.
- Return type
torch.tensor
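For very small diagrams, the roles of p and q can be made explicit with a brute-force sketch that tries every matching. The function `wasserstein_brute_force` is hypothetical and scales factorially, so it is only meant to illustrate the definition; the module itself solves an optimal transport problem and is not implemented this way.

```python
import itertools
import math


def _norm(a, b, p):
    """p-norm of the difference of two points in the plane."""
    diffs = [abs(a[0] - b[0]), abs(a[1] - b[1])]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)


def _project(point):
    """Orthogonal projection of a diagram point onto the diagonal."""
    mid = (point[0] + point[1]) / 2
    return (mid, mid)


def wasserstein_brute_force(X, Y, p=math.inf, q=1):
    """Exact q-Wasserstein distance between two tiny diagrams."""
    # Pad both diagrams with diagonal projections of the other one, so
    # that every point may also be matched to the diagonal.
    A = [(pt, False) for pt in X] + [(_project(pt), True) for pt in Y]
    B = [(pt, False) for pt in Y] + [(_project(pt), True) for pt in X]

    def cost(a, b):
        if a[1] and b[1]:
            return 0.0  # matching two diagonal points is free
        return _norm(a[0], b[0], p) ** q

    best = min(
        sum(cost(a, B[j]) for a, j in zip(A, perm))
        for perm in itertools.permutations(range(len(B)))
    )
    return best ** (1 / q)


distance = wasserstein_brute_force([(0.0, 2.0)], [(0.0, 3.0)])
```

In this example, matching the two points directly costs 1, whereas sending both to the diagonal would cost 2.5, so the direct matching is optimal.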
- training: bool
- class torch_topological.nn.SlicedWassersteinDistance(num_directions=10)
Calculate sliced Wasserstein distance between persistence diagrams.
This is an implementation of the sliced Wasserstein distance between persistence diagrams, following [Carriere17a].
This module calculates the sliced Wasserstein distance between two persistence diagrams. It is an efficient variant of the Wasserstein distance, and it is commonly used in the Sliced Wasserstein Kernel. It computes the expected value of the Wasserstein distance when the persistence diagram is projected on a random line passing through the origin.
- __init__(num_directions=10)
Create new sliced Wasserstein distance calculation module.
- Parameters
num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.
- forward(X, Y)
Calculate sliced Wasserstein metric based on input tensors.
- Parameters
X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.
Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.
- Returns
A single scalar tensor containing the sliced Wasserstein distance between the persistence diagram(s) contained in X and Y.
- Return type
torch.tensor
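The projection idea can be sketched in plain Python as a Monte Carlo average over random directions. The function `sliced_wasserstein` and its `seed` parameter are hypothetical; the normalisation (a plain mean over directions) is an assumption and may differ from the constant used in [Carriere17a] or in the module.

```python
import math
import random


def sliced_wasserstein(X, Y, num_directions=10, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein distance."""

    def project(point):
        mid = (point[0] + point[1]) / 2
        return (mid, mid)

    # Pad each diagram with the diagonal projections of the other, so
    # that both point sets have the same cardinality.
    A = list(X) + [project(pt) for pt in Y]
    B = list(Y) + [project(pt) for pt in X]

    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_directions):
        theta = rng.uniform(-math.pi / 2, math.pi / 2)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        proj_a = sorted(a[0] * cos_t + a[1] * sin_t for a in A)
        proj_b = sorted(b[0] * cos_t + b[1] * sin_t for b in B)
        # In 1D, optimal transport matches sorted values in order.
        total += sum(abs(u - v) for u, v in zip(proj_a, proj_b))
    return total / num_directions


X = [(0.0, 1.0), (0.5, 2.0)]
Y = [(0.0, 1.5)]
estimate = sliced_wasserstein(X, Y, num_directions=50)
```

Sorting replaces the matching problem of the full Wasserstein distance, which is why each direction costs only O(n log n).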
- training: bool
- class torch_topological.nn.SlicedWassersteinKernel(num_directions=10, sigma=1.0)
Calculate sliced Wasserstein kernel between persistence diagrams.
This is an implementation of the sliced Wasserstein kernel between persistence diagrams, following [Carriere17a].
References
- Carriere17a
M. Carrière et al., “Sliced Wasserstein Kernel for Persistence Diagrams”, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, pp. 664–673, 2017.
- __init__(num_directions=10, sigma=1.0)
Create new sliced Wasserstein kernel module.
- Parameters
num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.
sigma (float) – Variance term of the sliced Wasserstein kernel expression.
- forward(X, Y)
Calculate sliced Wasserstein kernel based on input tensors.
- Parameters
X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.
Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.
- Returns
A single scalar tensor containing the sliced Wasserstein kernel between the persistence diagram(s) contained in X and Y.
- Return type
torch.tensor
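As a rough illustration of how a kernel value follows from a distance, the sketch below applies the Gaussian-type form given in [Carriere17a], exp(-SW / (2 sigma^2)), to a precomputed sliced Wasserstein distance. The function name is hypothetical, and how exactly this module scales the distance by sigma is an assumption that should be checked against the source.

```python
import math


def sliced_wasserstein_kernel(sw_distance, sigma=1.0):
    """Kernel value exp(-SW / (2 * sigma ** 2)) for a given distance."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-sw_distance / (2.0 * sigma ** 2))


identical = sliced_wasserstein_kernel(0.0)         # same diagram twice
distant = sliced_wasserstein_kernel(2.0, sigma=1.0)
```

Identical diagrams yield the maximum kernel value of 1, and the value decays towards 0 as the distance grows; a larger sigma slows this decay.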
- training: bool
- class torch_topological.nn.MultiScaleKernel(sigma)
Implement the multi-scale kernel between two persistence diagrams.
This class implements the multi-scale kernel between two persistence diagrams (also known as the scale space kernel) as defined by Reininghaus et al. [Reininghaus15a] as
\[\begin{split}k_\sigma(F,G) = \frac{1}{8 \pi \sigma} \sum_{\substack{p \in F\\q \in G}} \left( \exp\left(-\frac{\|p-q\|^2}{8\sigma}\right) - \exp\left(-\frac{\|p-\overline{q}\|^2}{8\sigma}\right) \right)\end{split}\]
where \(z=(z_1, z_2)\) and \(\overline{z}=(z_2, z_1)\).
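The kernel expression can be evaluated directly for two small diagrams given as lists of (birth, death) pairs. The function `multi_scale_kernel` below is a hypothetical plain-Python sketch of the formula with the Euclidean norm; it does not handle tensors, gradients, or multiple homology dimensions as the module does.

```python
import math


def multi_scale_kernel(F, G, sigma):
    """Evaluate k_sigma(F, G) for diagrams given as (birth, death) pairs."""
    total = 0.0
    for p in F:
        for q in G:
            q_bar = (q[1], q[0])  # q mirrored at the diagonal
            d_pq = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
            d_pq_bar = (p[0] - q_bar[0]) ** 2 + (p[1] - q_bar[1]) ** 2
            total += math.exp(-d_pq / (8 * sigma))
            total -= math.exp(-d_pq_bar / (8 * sigma))
    return total / (8 * math.pi * sigma)


F = [(0.0, 1.0)]
G = [(0.0, 1.0)]
value = multi_scale_kernel(F, G, sigma=1.0)
```

A point lying on the diagonal coincides with its mirror image, so both exponentials cancel and the point contributes nothing to the kernel value.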
References
- Reininghaus15a
J. Reininghaus, U. Bauer and R. Kwitt, “A Stable Multi-Scale Kernel for Topological Machine Learning”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4741–4748, 2015.
- __init__(sigma)
Create new instance of the kernel.
- Parameters
sigma (float) – scale parameter of the kernel
- forward(X, Y, p=2.0)
Calculate the kernel value between two persistence diagrams.
The kernel value is computed for each dimension of the persistence diagram individually, according to Equation 10 from Reininghaus et al. The final kernel value is computed as the sum of kernel values over all dimensions.
- Parameters
X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.
Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.
p (float or inf, default 2.) – Specifies which p-norm to use for the distance calculation. For the infinity/maximum norm, pass p=float('inf'). Please note that norms other than the 2-norm (Euclidean norm) are not guaranteed to give positive definite results.
- Returns
A single scalar tensor containing the kernel value between the persistence diagram(s) contained in X and Y.
- Return type
torch.tensor
Examples
>>> from torch_topological.data.shapes import sample_from_disk
>>> from torch_topological.nn import MultiScaleKernel, VietorisRipsComplex
>>> # sample randomly from two disks
>>> x = sample_from_disk(r=0.5, R=0.6, n=100)
>>> y = sample_from_disk(r=0.9, R=1.0, n=100)
>>> # compute Vietoris–Rips filtration for both point clouds
>>> vr = VietorisRipsComplex(dim=1)
>>> vr_x = vr(x)
>>> vr_y = vr(y)
>>> # compute kernel value between persistence
>>> # diagrams with sigma set to 1
>>> msk = MultiScaleKernel(1.)
>>> msk_value = msk(vr_x, vr_y)
- training: bool