torch_topological.nn

Layers and loss terms for persistence-based optimisation.

class torch_topological.nn.AlphaComplex(p=2)[source]

Calculate persistence diagrams of an alpha complex.

This module calculates persistence diagrams of an alpha complex, i.e. a subcomplex of the Delaunay triangulation, which is sparse and thus often substantially smaller than other complexes.

It was first described in [Edelsbrunner94] and is particularly useful when analysing low-dimensional data.

Notes

At the moment, this alpha complex implementation, following other implementations, provides distance-based filtrations only. This means that the resulting persistence diagrams do not correspond to the circumradius of a simplex.

In addition, this implementation is work in progress. Some of the core features, such as handling of infinite features, are not available at the moment.

References

Edelsbrunner94

H. Edelsbrunner and E.P. Mücke, “Three-dimensional alpha shapes”, ACM Transactions on Graphics, Volume 13, Number 1, pp. 43–72, 1994.

__init__(p=2)[source]

Initialise new alpha complex calculation module.

Parameters

p (float) – Exponent for the p-norm calculation of distances.

Notes

This module currently only supports Minkowski norms. It does not yet support other metrics.

forward(x)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.

Parameters

x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.

Returns

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.

Generators will be represented in the persistence pairing based on proper creator–destroyer pairs of simplices. In dimension k, for instance, every generator is stored as a k-simplex followed by a (k+1)-simplex.

Return type

list of PersistenceInformation

training: bool
class torch_topological.nn.CubicalComplex(superlevel=False, dim=None)[source]

Calculate cubical complex persistence diagrams.

This module calculates ‘differentiable’ persistence diagrams for structured data, such as images. This is achieved by calculating a cubical complex.

Cubical complexes are the natural choice for calculating topological features of highly-structured inputs. See [Rieck20a] for an example of how to apply such topological features in practice.

References

Rieck20a

B. Rieck et al., “Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence”, Advances in Neural Information Processing Systems 33, pp. 6900–6912, 2020.

__init__(superlevel=False, dim=None)[source]

Initialise new module.

Parameters
  • superlevel (bool) – Indicates whether to calculate topological features based on superlevel sets. By default, sublevel set filtrations are used.

  • dim (int or None) – If set, describes dimension of input data. This is meant to be the dimension of an individual image without channel information, if any. The value of dim will change the way an input tensor is being handled: additional dimensions, if present, will be treated as batches or channels. If not set to an integer value, forward() will just guess what to do with an input (which should work in most cases).

forward(x)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a cubical complex and returning a set of persistence diagrams. The way the input will be interpreted depends on the presence of the dim attribute of this class. If dim is set, the last dim dimensions of an input tensor will be considered to contain the image data. If dim is not set, image dimensions will be guessed as follows:

  1. Tensor of dim = 2: a single 2D image

  2. Tensor of dim = 3: a single 2D image with channels

  3. Tensor of dim = 4: a batch of 2D images with channels

See parameters for more details.

Parameters

x (array_like) – Input image(s). If dim has not been set, will guess how to handle the input as follows: x can either be a 2D array of shape (H, W), which is treated as a single image, or a 3D array/tensor of the form (C, H, W), with C representing the number of channels, or a 4D array/tensor of the form (B, C, H, W), with B being the batch size. If dim has been set, the same handling strategy applies, but the last dim dimensions of the tensor are used for the cubical complex calculation. All preceding dimensions are assumed to represent batches or channels (in this order). Hence, if dim is set, the tensor must have at most dim + 2 dimensions.

Returns

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements. Similar for higher-order tensors.

Return type

list of PersistenceInformation
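The shape-handling rule described for forward() can be sketched in plain Python, without the library. The helper name interpret_shape and its return format are hypothetical; they only mirror the documented behaviour (image axes last, at most two leading axes read as batch and channels) and are not part of torch_topological.

```python
def interpret_shape(shape, dim=None):
    """Split a tensor shape into leading (batch/channel) and image axes.

    Hypothetical helper mirroring the documented rule; not library code.
    """
    # Without `dim`, the documentation assumes 2D images.
    image_dims = 2 if dim is None else dim
    n_leading = len(shape) - image_dims

    if n_leading < 0 or n_leading > 2:
        raise ValueError(
            f"expected between {image_dims} and {image_dims + 2} "
            f"dimensions, got {len(shape)}"
        )

    leading = shape[:n_leading]
    image = shape[n_leading:]

    # Leading axes are read as batches or channels, in this order.
    labels = {0: (), 1: ("channels",), 2: ("batch", "channels")}[n_leading]
    return dict(zip(labels, leading)), tuple(image)


print(interpret_shape((28, 28)))            # ({}, (28, 28))
print(interpret_shape((3, 28, 28)))         # ({'channels': 3}, (28, 28))
print(interpret_shape((8, 3, 28, 28)))      # ({'batch': 8, 'channels': 3}, (28, 28))
print(interpret_shape((3, 16, 16, 16), 3))  # ({'channels': 3}, (16, 16, 16))
```

Setting dim explicitly removes the ambiguity for volumetric data: a 4D tensor is read as a batch of 2D images with channels by default, but as a single 3D volume with channels when dim is 3.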

training: bool
class torch_topological.nn.PersistenceInformation(pairing, diagram, dimension=None)[source]

Persistence information data structure.

This is a lightweight data structure for carrying information about the calculation of persistent homology. It consists of the following components:

  • A persistence pairing

  • A persistence diagram

  • An (optional) dimension

Due to its lightweight nature, no validity checks are performed, but all calculation modules should return a sequence of instances of the PersistenceInformation class.

Since this data class is shared with modules that are capable of calculating persistent homology, the exact form of the persistence pairing might change. Please refer to the respective classes for more documentation.
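As a rough illustration of the three components, the following stand-in uses a NamedTuple with plain Python lists. The class name PersistenceInfoSketch and the example values are hypothetical; the actual class is expected to carry tensors, and the layout of the pairing depends on the module that produced it.

```python
from typing import List, NamedTuple, Optional, Tuple


class PersistenceInfoSketch(NamedTuple):
    """Hypothetical stand-in; the real class stores tensors."""

    pairing: List[Tuple[int, ...]]        # generators, one tuple per feature
    diagram: List[Tuple[float, float]]    # (birth, death) per feature
    dimension: Optional[int] = None       # homology dimension, if known


info = PersistenceInfoSketch(
    pairing=[(0, 0, 1), (2, 1, 2)],
    diagram=[(0.0, 0.5), (0.0, 1.2)],
    dimension=0,
)

# No validity checks are performed, so consumers rely on the producer
# keeping pairing and diagram aligned row by row.
for generator, (birth, death) in zip(info.pairing, info.diagram):
    print(info.dimension, generator, death - birth)
```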

class torch_topological.nn.SignatureLoss(p=2, normalise=True, dimensions=0)[source]

Implement topological signature loss.

This module implements the topological signature loss first described in [Moor20a]. In contrast to the original code provided by the authors, this module also provides extensions to higher-dimensional generators if desired.

The module can be used in conjunction with any set of generators and persistence diagrams, i.e. with any set of persistence pairings and persistence diagrams. At the moment, it is restricted to calculating Minkowski distances for the loss calculation.

References

Moor20a

M. Moor et al., “Topological Autoencoders”, Proceedings of the 37th International Conference on Machine Learning, PMLR 119, pp. 7045–7054, 2020.

__init__(p=2, normalise=True, dimensions=0)[source]

Create new loss instance.

Parameters
  • p (float) – Exponent for the p-norm calculation of distances.

  • normalise (bool) – If set, normalises distances for each point cloud. This can be useful when working with batches.

  • dimensions (int or tuple of int) – Dimensions to use in the signature calculation. Following [Moor20a], this is set by default to 0.

forward(X, Y)[source]

Calculate signature loss between two data sets.

training: bool
class torch_topological.nn.SummaryStatisticLoss(summary_statistic='total_persistence', **kwargs)[source]

Implement loss based on summary statistic.

This is a generic loss function based on topological summary statistics. It implements a loss of the following form:

\[\|s(X) - s(Y)\|^p\]

In the preceding equation, s refers to a function that results in a scalar-valued summary of a persistence diagram.

__init__(summary_statistic='total_persistence', **kwargs)[source]

Create new loss function based on summary statistic.

Parameters
forward(X, Y=None)[source]

Calculate loss based on input tensor(s).

Parameters
  • X (list of PersistenceInformation) – Source information. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list of PersistenceInformation or None) – Optional target information. If set, evaluates a difference in loss functions as shown in the introduction. If None, a simpler variant of the loss will be evaluated.

Returns

Loss based on the summary statistic selected by the client. Given a statistic \(s\), the function returns the following expression:

\[\|s(X) - s(Y)\|^p\]

In case no target tensor Y has been provided, the latter part of the expression amounts to 0.

Return type

torch.tensor
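The returned expression can be illustrated with a small pure-Python sketch. It operates on diagrams given as lists of (birth, death) pairs rather than on PersistenceInformation objects, and the function names and the definition of total persistence used here (sum of persistence values) are assumptions made for illustration only.

```python
def total_persistence(diagram, p=1):
    """Sum of persistence values raised to the power p (illustrative)."""
    return sum(abs(death - birth) ** p for birth, death in diagram)


def summary_statistic_loss(X, Y=None, statistic=total_persistence, p=1):
    """Evaluate |s(X) - s(Y)|^p, with s(Y) = 0 if Y is None."""
    s_x = statistic(X)
    s_y = statistic(Y) if Y is not None else 0.0
    return abs(s_x - s_y) ** p


X = [(0.0, 1.0), (0.5, 2.0)]   # persistence values 1.0 and 1.5
Y = [(0.0, 2.0)]               # persistence value 2.0

print(summary_statistic_loss(X, Y))       # 0.5
print(summary_statistic_loss(X))          # 2.5
print(summary_statistic_loss(X, Y, p=2))  # 0.25
```

Without a target, minimising the loss shrinks the statistic of X itself, which is a common way to regularise topological complexity.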

training: bool
class torch_topological.nn.VietorisRipsComplex(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)[source]

Calculate Vietoris–Rips complex of a data set.

This module calculates ‘differentiable’ persistence diagrams for point clouds. The underlying topological approximations are done by calculating a Vietoris–Rips complex of the data.

__init__(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)[source]

Initialise new module.

Parameters
  • dim (int) – Calculates persistent homology up to (and including) the prescribed dimension.

  • p (float) – Exponent indicating which Minkowski p-norm to use for the calculation of pairwise distances between points. Note that if treat_as_distances is supplied to forward(), the parameter is ignored and will have no effect. The rationale is to permit clients to use a pre-computed distance matrix, while always falling back to Minkowski norms.

  • threshold (float) – If set to a finite number, only calculates topological features up to the specified distance threshold. Thus, any persistence pairings may contain infinite features as well.

  • keep_infinite_features (bool) – If set, keeps infinite features. This flag is disabled by default. The rationale for this is that infinite features require more deliberate handling and, if threshold is left unchanged, only a single infinite feature is dropped from subsequent calculations.

  • **kwargs – Additional arguments to be provided to ripser, i.e. the backend for calculating persistent homology. The n_threads parameter, which controls parallelisation, is probably the most relevant parameter to be adjusted. Please refer to the giotto-ph documentation for more details on admissible parameters.

Notes

This module currently only supports Minkowski norms. It does not yet support other metrics internally. To use custom metrics, you need to set treat_as_distances in the forward() function instead.

forward(x, treat_as_distances=False)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.

Parameters
  • x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.

  • treat_as_distances (bool) – If set, treats x as containing pre-computed distances between points. The semantics of how x is handled are not changed; the only difference is that when x has a shape of (n, d), the values of n and d need to be the same.

Returns

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.

Generators will be represented in the persistence pairing based on vertex–edge pairs (dimension 0) or edge–edge pairs. Thus, the persistence pairing in dimension zero will have three components, corresponding to a vertex and an edge, respectively, while the persistence pairing for higher dimensions will have four components.

Return type

list of PersistenceInformation
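To make the three- and four-component generators more concrete, the sketch below maps them to (birth, death) values through a pairwise distance matrix. It is written in plain Python and rests on two assumptions that are not stated in this documentation: dimension-0 features are born at filtration value 0, and the filtration value of an edge is the distance between its endpoints. The function names and the example pairing are hypothetical.

```python
def minkowski_distances(points, p=2):
    """Return the full (n, n) matrix of pairwise p-norm distances."""
    return [
        [sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p) for v in points]
        for u in points
    ]


def diagram_from_pairing(distances, pairing, dimension):
    """Look up (birth, death) values for each generator tuple."""
    diagram = []
    for generator in pairing:
        if dimension == 0:
            # (vertex, edge_u, edge_v): born at 0, dies with the edge.
            _, u, v = generator
            diagram.append((0.0, distances[u][v]))
        else:
            # (birth_u, birth_v, death_u, death_v): two edges.
            bu, bv, du, dv = generator
            diagram.append((distances[bu][bv], distances[du][dv]))
    return diagram


points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
D = minkowski_distances(points)

# Hypothetical dimension-0 pairing: vertex 1 dies when edge (0, 1) appears,
# vertex 2 dies when edge (1, 2) appears.
print(diagram_from_pairing(D, [(1, 0, 1), (2, 1, 2)], dimension=0))
# [(0.0, 3.0), (0.0, 4.0)]
```

Because every diagram entry is a lookup into the distance matrix, gradients can flow from the diagram back to the input coordinates. A square matrix such as D is also the kind of input that treat_as_distances expects.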

training: bool
class torch_topological.nn.WassersteinDistance(p=inf, q=1)[source]

Implement Wasserstein distance between persistence diagrams.

This module calculates the Wasserstein distance between two persistence diagrams. The Wasserstein distance is arguably the most common metric that is applied when dealing with such diagrams. Notice that calculating the metric involves solving optimal transport problems, which are known to suffer from scalability problems. When dealing with large persistence diagrams, other losses may be more appropriate.

__init__(p=inf, q=1)[source]

Create new Wasserstein distance calculation module.

Parameters
  • p (float or inf) – Specifies the exponent of the norm to calculate. By default, p = torch.inf, corresponding to the maximum norm.

  • q (float) – Specifies the order of Wasserstein metric to calculate. This raises all internal matching costs to the power of q, hence subsequently returning the q-th root of the total cost.

forward(X, Y)[source]

Calculate Wasserstein metric based on input tensors.

Parameters
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns

A single scalar tensor containing the distance between the persistence diagram(s) contained in X and Y.

Return type

torch.tensor
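The roles of p and q can be seen in a brute-force sketch for very small diagrams. It is not the library's implementation: it enumerates all matchings, which is only feasible for a handful of points, and it assumes the usual construction in which each diagram is augmented with the diagonal projections of the other so that points may be matched to the diagonal. All function names are hypothetical.

```python
import itertools
import math


def ground_distance(a, b, p=math.inf):
    """p-norm distance between two points of a persistence diagram."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    if math.isinf(p):
        return max(dx, dy)
    return (dx ** p + dy ** p) ** (1.0 / p)


def diagonal_projection(point):
    """Closest point on the diagonal birth = death."""
    mid = 0.5 * (point[0] + point[1])
    return (mid, mid)


def wasserstein_bruteforce(X, Y, p=math.inf, q=1):
    """Order-q Wasserstein distance via exhaustive matching (tiny inputs only)."""
    # Flag diagonal points so that diagonal-to-diagonal matches cost nothing.
    A = [(pt, False) for pt in X] + [(diagonal_projection(pt), True) for pt in Y]
    B = [(pt, False) for pt in Y] + [(diagonal_projection(pt), True) for pt in X]

    def cost(a, b):
        if a[1] and b[1]:
            return 0.0
        return ground_distance(a[0], b[0], p) ** q

    best = min(
        sum(cost(A[i], B[j]) for i, j in enumerate(perm))
        for perm in itertools.permutations(range(len(A)))
    )
    # Costs were raised to the power q, so take the q-th root of the total.
    return best ** (1.0 / q)


print(wasserstein_bruteforce([(0.0, 4.0)], [(0.0, 3.0)]))  # 1.0
print(wasserstein_bruteforce([(0.0, 2.0)], []))            # 1.0
```

The factorial number of matchings is the scalability problem mentioned above in its crudest form; a practical implementation would rely on an optimal transport solver instead.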

training: bool
class torch_topological.nn.SlicedWassersteinDistance(num_directions=10)[source]

Calculate sliced Wasserstein distance between persistence diagrams.

This is an implementation of the sliced Wasserstein distance between persistence diagrams, following [Carriere17a].

This module calculates the sliced Wasserstein distance between two persistence diagrams. It is an efficient variant of the Wasserstein distance, and it is commonly used in the Sliced Wasserstein Kernel. It computes the expected value of the Wasserstein distance when the persistence diagram is projected on a random line passing through the origin.
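The projection idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the module's code: directions are spaced evenly over a half circle instead of being sampled at random, and, following the construction commonly attributed to [Carriere17a], each diagram is augmented with the diagonal projections of the other so that both contain the same number of points. The function name is hypothetical.

```python
import math


def sliced_wasserstein(X, Y, num_directions=10):
    """Average 1D Wasserstein distance over lines through the origin."""

    def diagonal_projection(point):
        mid = 0.5 * (point[0] + point[1])
        return (mid, mid)

    # Equalise cardinalities by adding diagonal projections of the other diagram.
    A = list(X) + [diagonal_projection(pt) for pt in Y]
    B = list(Y) + [diagonal_projection(pt) for pt in X]

    total = 0.0
    for k in range(num_directions):
        # Evenly spaced angles in [-pi/2, pi/2); an assumption of this sketch.
        theta = -math.pi / 2 + k * math.pi / num_directions
        c, s = math.cos(theta), math.sin(theta)

        # In 1D, optimal transport reduces to matching sorted projections.
        a = sorted(x * c + y * s for x, y in A)
        b = sorted(x * c + y * s for x, y in B)
        total += sum(abs(u - v) for u, v in zip(a, b))

    return total / num_directions


X = [(0.0, 1.0), (0.5, 2.0)]
Y = [(0.0, 2.0)]
print(sliced_wasserstein(X, X))  # 0.0
print(sliced_wasserstein(X, Y) > 0.0)  # True
```

Each direction costs only a sort, which is what makes this variant cheaper than solving a full matching problem.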

__init__(num_directions=10)[source]

Create new sliced Wasserstein distance calculation module.

Parameters

num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.

forward(X, Y)[source]

Calculate sliced Wasserstein metric based on input tensors.

Parameters
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns

A single scalar tensor containing the sliced Wasserstein distance between the persistence diagram(s) contained in X and Y.

Return type

torch.tensor

training: bool
class torch_topological.nn.SlicedWassersteinKernel(num_directions=10, sigma=1.0)[source]

Calculate sliced Wasserstein kernel between persistence diagrams.

This is an implementation of the sliced Wasserstein kernel between persistence diagrams, following [Carriere17a].

References

Carriere17a

M. Carrière et al., “Sliced Wasserstein Kernel for Persistence Diagrams”, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, pp. 664–673, 2017.

__init__(num_directions=10, sigma=1.0)[source]

Create new sliced Wasserstein kernel module.

Parameters
  • num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.

  • sigma (float) – Variance term of the sliced Wasserstein kernel expression.

forward(X, Y)[source]

Calculate sliced Wasserstein kernel based on input tensors.

Parameters
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns

A single scalar tensor containing the sliced Wasserstein kernel between the persistence diagram(s) contained in X and Y.

Return type

torch.tensor

training: bool
class torch_topological.nn.MultiScaleKernel(sigma)[source]

Implement the multi-scale kernel between two persistence diagrams.

This class implements the multi-scale kernel between two persistence diagrams (also known as the scale space kernel) as defined by Reininghaus et al. [Reininghaus15a] as

\[k_\sigma(F,G) = \frac{1}{8 \pi \sigma} \sum_{\substack{p \in F\\q \in G}} \left[ \exp\left(-\frac{\|p-q\|^2}{8\sigma}\right) - \exp\left(-\frac{\|p-\overline{q}\|^2}{8\sigma}\right) \right]\]

where \(z=(z_1, z_2)\) and \(\overline{z}=(z_2, z_1)\).
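For a single pair of diagrams given as lists of (birth, death) points, the formula can be evaluated directly. The following pure-Python sketch uses the squared Euclidean norm and a hypothetical function name; it does not handle PersistenceInformation objects or multiple homology dimensions.

```python
import math


def multi_scale_kernel(F, G, sigma):
    """Evaluate k_sigma(F, G) for two diagrams of (birth, death) points."""
    total = 0.0
    for p in F:
        for q in G:
            # Squared distance to q and to its mirror image across the diagonal.
            d = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
            d_bar = (p[0] - q[1]) ** 2 + (p[1] - q[0]) ** 2
            total += math.exp(-d / (8 * sigma)) - math.exp(-d_bar / (8 * sigma))
    return total / (8 * math.pi * sigma)


F = [(0.0, 1.0), (0.5, 2.0)]
G = [(0.0, 2.0)]

print(multi_scale_kernel(F, G, sigma=1.0))

# Points on the diagonal coincide with their mirror image and contribute nothing.
print(multi_scale_kernel([(1.0, 1.0)], G, sigma=1.0))  # 0.0
```

Subtracting the mirrored term is what makes features close to the diagonal, i.e. features of low persistence, contribute little to the kernel value.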

References

Reininghaus15a

J. Reininghaus, S. Huber, U. Bauer and R. Kwitt, “A Stable Multi-Scale Kernel for Topological Machine Learning”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4741–4748, 2015.

__init__(sigma)[source]

Create new instance of the kernel.

Parameters

sigma (float) – scale parameter of the kernel

forward(X, Y, p=2.0)[source]

Calculate the kernel value between two persistence diagrams.

The kernel value is computed for each dimension of the persistence diagram individually, according to Equation 10 from Reininghaus et al. The final kernel value is computed as the sum of kernel values over all dimensions.

Parameters
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

  • p (float or inf, default 2.) – Specify which p-norm to use for distance calculation. For the infinity/maximum norm, pass p=float('inf'). Please note that using a norm other than the 2-norm (Euclidean norm) is not guaranteed to give positive definite results.

Returns

A single scalar tensor containing the kernel value between the persistence diagram(s) contained in X and Y.

Return type

torch.tensor

Examples

>>> from torch_topological.data.shapes import sample_from_disk
>>> from torch_topological.nn import MultiScaleKernel
>>> from torch_topological.nn import VietorisRipsComplex
>>> # sample randomly from two disks
>>> x = sample_from_disk(r=0.5, R=0.6, n=100)
>>> y = sample_from_disk(r=0.9, R=1.0, n=100)
>>> # compute Vietoris–Rips filtration for both point clouds
>>> vr = VietorisRipsComplex(dim=1)
>>> vr_x = vr(x)
>>> vr_y = vr(y)
>>> # compute kernel value between persistence
>>> # diagrams with sigma set to 1
>>> msk = MultiScaleKernel(1.)
>>> msk_value = msk(vr_x, vr_y)
training: bool