torch_topological.nn

Layers and loss terms for persistence-based optimisation.

class torch_topological.nn.AlphaComplex(p=2)[source]

Calculate persistence diagrams of an alpha complex.

This module calculates persistence diagrams of an alpha complex, i.e. a subcomplex of the Delaunay triangulation, which is sparse and thus often substantially smaller than other complexes.

It was first described in [Edelsbrunner94] and is particularly useful when analysing low-dimensional data.

Notes

At the moment, this alpha complex implementation, following other implementations, provides distance-based filtrations only. This means that the values in the resulting persistence diagrams are distances rather than the circumradii of simplices.

In addition, this implementation is work in progress. Some of the core features, such as handling of infinite features, are not available at the moment.

References

[Edelsbrunner94]

H. Edelsbrunner and E.P. Mücke, “Three-dimensional alpha shapes”, ACM Transactions on Graphics, Volume 13, Number 1, pp. 43–72, 1994.

__init__(p=2)[source]

Initialise new alpha complex calculation module.

Parameters:

p (float) – Exponent for the p-norm calculation of distances.

Notes

This module currently only supports Minkowski norms. It does not yet support other metrics.

forward(x)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.

Parameters:

x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.

Returns:

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.

Generators will be represented in the persistence pairing based on proper creator–destroyer pairs of simplices. In dimension k, for instance, every generator is stored as a k-simplex (the creator) followed by a (k+1)-simplex (the destroyer).

Return type:

list of PersistenceInformation
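
Examples

A minimal usage sketch on a randomly sampled point cloud; the access to the diagram attribute assumes that PersistenceInformation exposes its components under the names of its constructor arguments.

>>> import torch
>>> from torch_topological.nn import AlphaComplex
>>> x = torch.rand(100, 2)            # a single 2D point cloud
>>> alpha_complex = AlphaComplex()
>>> pers_info = alpha_complex(x)      # list of PersistenceInformation
>>> pers_info[0].diagram              # persistence diagram in dimension 0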

training: bool
class torch_topological.nn.CubicalComplex(superlevel=False, dim=None)[source]

Calculate cubical complex persistence diagrams.

This module calculates ‘differentiable’ persistence diagrams for structured data, such as images. This is achieved by calculating a cubical complex.

Cubical complexes are the natural choice for calculating topological features of highly-structured inputs. See [Rieck20a] for an example of how to apply such topological features in practice.

References

[Rieck20a]

B. Rieck et al., “Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence”, Advances in Neural Information Processing Systems 33, pp. 6900–6912, 2020.

__init__(superlevel=False, dim=None)[source]

Initialise new module.

Parameters:
  • superlevel (bool) – Indicates whether to calculate topological features based on superlevel sets. By default, sublevel set filtrations are used.

  • dim (int or None) –

    If set, describes dimension of input data. This is meant to be the dimension of an individual image without channel information, if any. The value of dim will change the way an input tensor is being handled: additional dimensions, if present, will be treated as batches or channels. If not set to an integer value, forward() will just guess what to do with an input (which should work in most cases).

    For example, when dealing with volume data, i.e. 3D tensors, set dim=3 when instantiating the class. This will permit a seamless user experience with both batched and non-batched input data sets.

forward(x)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a cubical complex and returning a set of persistence diagrams. The way the input will be interpreted depends on the presence of the dim attribute of this class. If dim is set, the last dim dimensions of an input tensor will be considered to contain the image data. If dim is not set, image dimensions will be guessed as follows:

  1. Tensor of dimension 2: a single image

  2. Tensor of dimension 3: a single 2D image with channels

  3. Tensor of dimension 4: a batch of 2D images with channels

This is a conservative way of handling the data, ensuring that, by default, 2D images with channel information and optional batch information can be handled, since this is the most common setting in many applications.

To ensure that the class can handle e.g. 3D volume data, it is sufficient to set dim = 3 when initialising the class. Refer to the examples and parameters sections for more details.

Parameters:

x (array_like) – Input image(s). If dim has not been set, will guess how to handle the input as follows: x can either be a 2D array of shape (H, W), which is treated as a single image, or a 3D array/tensor of the form (C, H, W), with C representing the number of channels, or a 4D array/tensor of the form (B, C, H, W), with B being the batch size. If dim has been set, the same handling strategy applies, but the last dim dimensions of the tensor are used for the cubical complex calculation, and all remaining leading dimensions are assumed to represent batches or channels (in this order). Hence, if dim is set, the tensor must have at most dim + 2 dimensions.

Returns:

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements. Similar for higher-order tensors.

Return type:

list of PersistenceInformation

Examples

>>> # Handling 3D tensors (volumes), either in batches or presented
>>> # individually to the function.
>>> cubical_complex = CubicalComplex(dim=3)
>>> cubical_complex(x)
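
A further sketch for 2D image data, assuming a randomly generated batch of single-channel images; per the handling rules above, a 4D tensor is interpreted as (B, C, H, W).

>>> import torch
>>> from torch_topological.nn import CubicalComplex
>>> images = torch.rand(8, 1, 28, 28)      # batch of single-channel 2D images
>>> cubical_complex = CubicalComplex()
>>> pers_info = cubical_complex(images)    # nested lists of PersistenceInformation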

training: bool
class torch_topological.nn.EulerDistance[source]

Calculate the L2 norm between two (weighted) Euler curves or transforms.

__init__()[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(ec1, ec2)[source]

Compute the distance between the two input Euler curves or transforms, ec1 and ec2.

Note

Although the forward pass is defined in this function, the Module instance itself should be called instead, since calling the instance runs any registered hooks, whereas calling forward() directly silently ignores them.
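
Examples

A minimal sketch, assuming that the Euler curves are obtained from the WeightedEulerCurve module documented below and that random volumes serve as placeholder inputs.

>>> import torch
>>> from torch_topological.nn import EulerDistance, WeightedEulerCurve
>>> wect = WeightedEulerCurve(num_directions=16, num_steps=20)
>>> vol1 = torch.rand(16, 16, 16)
>>> vol2 = torch.rand(16, 16, 16)
>>> dist = EulerDistance()(wect(vol1), wect(vol2))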

training: bool
class torch_topological.nn.MultiScaleKernel(sigma)[source]

Implement the multi-scale kernel between two persistence diagrams.

This class implements the multi-scale kernel between two persistence diagrams (also known as the scale space kernel) as defined by Reininghaus et al. [Reininghaus15a] as

\[k_\sigma(F, G) = \frac{1}{8 \pi \sigma} \sum_{\substack{p \in F\\q \in G}} \exp\!\left(-\frac{\|p - q\|^2}{8\sigma}\right) - \exp\!\left(-\frac{\|p - \overline{q}\|^2}{8\sigma}\right)\]

where \(z = (z_1, z_2)\) and \(\overline{z} = (z_2, z_1)\).

References

[Reininghaus15a]

J. Reininghaus, U. Bauer and R. Kwitt, “A Stable Multi-Scale Kernel for Topological Machine Learning”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4741–4748, 2015.

__init__(sigma)[source]

Create new instance of the kernel.

Parameters:

sigma (float) – scale parameter of the kernel

forward(X, Y, p=2.0)[source]

Calculate the kernel value between two persistence diagrams.

The kernel value is computed for each dimension of the persistence diagram individually, according to Equation 10 from Reininghaus et al. The final kernel value is computed as the sum of kernel values over all dimensions.

Parameters:
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

  • p (float or inf, default 2.0) – Specifies which p-norm to use for the distance calculation. For the infinity/maximum norm, pass p=float('inf'). Please note that norms other than the 2-norm (Euclidean norm) are not guaranteed to yield positive definite results.

Returns:

A single scalar tensor containing the kernel value between the persistence diagram(s) contained in X and Y.

Return type:

torch.tensor

Examples

>>> from torch_topological.data.shapes import sample_from_disk
>>> from torch_topological.nn import MultiScaleKernel, VietorisRipsComplex
>>> # sample randomly from two disks
>>> x = sample_from_disk(r=0.5, R=0.6, n=100)
>>> y = sample_from_disk(r=0.9, R=1.0, n=100)
>>> # compute Vietoris–Rips filtration for both point clouds
>>> vr = VietorisRipsComplex(dim=1)
>>> vr_x = vr(x)
>>> vr_y = vr(y)
>>> # compute kernel value between persistence
>>> # diagrams with sigma set to 1
>>> msk = MultiScaleKernel(1.)
>>> msk_value = msk(vr_x, vr_y)
training: bool
class torch_topological.nn.PersistenceInformation(pairing, diagram, dimension=None)[source]

Persistence information data structure.

This is a light-weight data structure for carrying information about the calculation of persistent homology. It consists of the following components:

  • A persistence pairing

  • A persistence diagram

  • An (optional) dimension

Due to its lightweight nature, no validity checks are performed, but all calculation modules should return a sequence of instances of the PersistenceInformation class.

Since this data class is shared by all modules capable of calculating persistent homology, the exact form of the persistence pairing may vary between modules. Please refer to the respective classes for more documentation.
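
Examples

A small sketch of how instances are typically consumed, assuming the components are accessible as attributes named after the constructor arguments (pairing, diagram, dimension) and that the diagrams are tensors.

>>> import torch
>>> from torch_topological.nn import VietorisRipsComplex
>>> pers_info = VietorisRipsComplex(dim=1)(torch.rand(100, 3))
>>> for info in pers_info:
...     print(info.dimension, info.diagram.shape)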

class torch_topological.nn.SignatureLoss(p=2, normalise=True, dimensions=0)[source]

Implement topological signature loss.

This module implements the topological signature loss first described in [Moor20a]. In contrast to the original code provided by the authors, this module also provides extensions to higher-dimensional generators if desired.

The module can be used in conjunction with any set of generators and persistence diagrams, i.e. with any set of persistence pairings and persistence diagrams. At the moment, it is restricted to Minkowski distances for the loss calculation.

References

[Moor20a]

M. Moor et al., “Topological Autoencoders”, Proceedings of the 37th International Conference on Machine Learning, PMLR 119, pp. 7045–7054, 2020.

__init__(p=2, normalise=True, dimensions=0)[source]

Create new loss instance.

Parameters:
  • p (float) – Exponent for the p-norm calculation of distances.

  • normalise (bool) – If set, normalises distances for each point cloud. This can be useful when working with batches.

  • dimensions (int or tuple of int) – Dimensions to use in the signature calculation. Following [Moor20a], this is set by default to 0.

forward(X, Y)[source]

Calculate the signature loss between two point clouds.

This loss function uses the persistent homology from each point cloud in order to retrieve the topologically relevant distances from a distance matrix calculated from the point clouds. For more information, see [Moor20a].

Parameters:
  • X (Tuple[torch.tensor, PersistenceInformation]) – A tuple consisting of the point cloud and its persistence information. The persistence information is obtained from a persistent homology calculation, which retrieves a list of topologically relevant edges.

  • Y (Tuple[torch.tensor, PersistenceInformation]) – A tuple consisting of the point cloud and its persistence information. The persistence information is obtained from a persistent homology calculation, which retrieves a list of topologically relevant edges.

Returns:

A scalar representing the topological loss term for the two data sets.

Return type:

torch.tensor
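
Examples

A minimal sketch of the loss in a dimensionality-reduction setting, following the tuple format described above; the point clouds are random placeholders, and the persistence information is passed as returned by VietorisRipsComplex.

>>> import torch
>>> from torch_topological.nn import SignatureLoss, VietorisRipsComplex
>>> vr = VietorisRipsComplex(dim=0)
>>> x = torch.rand(100, 8)                      # input space
>>> z = torch.rand(100, 2, requires_grad=True)  # latent space
>>> loss_fn = SignatureLoss(p=2)
>>> loss = loss_fn((x, vr(x)), (z, vr(z)))
>>> loss.backward()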

training: bool
class torch_topological.nn.SlicedWassersteinDistance(num_directions=10)[source]

Calculate sliced Wasserstein distance between persistence diagrams.

This is an implementation of the sliced Wasserstein distance between persistence diagrams, following [Carriere17a].

This module calculates the sliced Wasserstein distance between two persistence diagrams. It is an efficient variant of the Wasserstein distance and is commonly used in the sliced Wasserstein kernel. It computes the expected value of the Wasserstein distance obtained when the persistence diagrams are projected onto a random line passing through the origin.

__init__(num_directions=10)[source]

Create new sliced Wasserstein distance calculation module.

Parameters:

num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.

forward(X, Y)[source]

Calculate sliced Wasserstein metric based on input tensors.

Parameters:
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns:

A single scalar tensor containing the sliced Wasserstein distance between the persistence diagram(s) contained in X and Y.

Return type:

torch.tensor
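
Examples

A minimal sketch, assuming two random point clouds whose persistence diagrams are obtained from a Vietoris–Rips filtration.

>>> import torch
>>> from torch_topological.nn import SlicedWassersteinDistance, VietorisRipsComplex
>>> vr = VietorisRipsComplex(dim=1)
>>> X = vr(torch.rand(100, 2))
>>> Y = vr(torch.rand(100, 2))
>>> swd = SlicedWassersteinDistance(num_directions=20)
>>> dist = swd(X, Y)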

training: bool
class torch_topological.nn.SlicedWassersteinKernel(num_directions=10, sigma=1.0)[source]

Calculate sliced Wasserstein kernel between persistence diagrams.

This is an implementation of the sliced Wasserstein kernel between persistence diagrams, following [Carriere17a].

References

[Carriere17a]

M. Carrière et al., “Sliced Wasserstein Kernel for Persistence Diagrams”, Proceedings of the 34th International Conference on Machine Learning, PMLR 70, pp. 664–673, 2017.

__init__(num_directions=10, sigma=1.0)[source]

Create new sliced Wasserstein kernel module.

Parameters:
  • num_directions (int) – Specifies the number of random directions to be sampled for computation of the sliced Wasserstein distance.

  • sigma (float) – Variance term of the sliced Wasserstein kernel expression.

forward(X, Y)[source]

Calculate sliced Wasserstein kernel based on input tensors.

Parameters:
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns:

A single scalar tensor containing the sliced Wasserstein kernel between the persistence diagram(s) contained in X and Y.

Return type:

torch.tensor
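
Examples

A minimal sketch mirroring the distance module above; the value of sigma and the random inputs are placeholder choices.

>>> import torch
>>> from torch_topological.nn import SlicedWassersteinKernel, VietorisRipsComplex
>>> vr = VietorisRipsComplex(dim=1)
>>> X = vr(torch.rand(100, 2))
>>> Y = vr(torch.rand(100, 2))
>>> swk = SlicedWassersteinKernel(num_directions=20, sigma=1.0)
>>> k = swk(X, Y)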

training: bool
class torch_topological.nn.SummaryStatisticLoss(summary_statistic='total_persistence', **kwargs)[source]

Implement loss based on summary statistic.

This is a generic loss function based on topological summary statistics. It implements a loss of the following form:

\[\|s(X) - s(Y)\|^p\]

In the preceding equation, s refers to a function that results in a scalar-valued summary of a persistence diagram.

__init__(summary_statistic='total_persistence', **kwargs)[source]

Create new loss function based on summary statistic.

Parameters:
  • summary_statistic (str) – Indicates which summary statistic to use; by default, the total persistence of a persistence diagram is calculated.

  • **kwargs – Optional keyword arguments, passed to the selected summary statistic function.

forward(X, Y=None)[source]

Calculate loss based on input tensor(s).

Parameters:
  • X (list of PersistenceInformation) – Source information. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list of PersistenceInformation or None) – Optional target information. If set, evaluates a difference in loss functions as shown in the introduction. If None, a simpler variant of the loss will be evaluated.

Returns:

Loss based on the summary statistic selected by the client. Given a statistic \(s\), the function returns the following expression:

\[\|s(X) - s(Y)\|^p\]

In case no target tensor Y has been provided, the latter part of the expression amounts to 0.

Return type:

torch.tensor
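
Examples

A minimal sketch using the default total persistence statistic; the point clouds are random placeholders.

>>> import torch
>>> from torch_topological.nn import SummaryStatisticLoss, VietorisRipsComplex
>>> vr = VietorisRipsComplex(dim=1)
>>> x = torch.rand(100, 2, requires_grad=True)
>>> y = torch.rand(100, 2)
>>> loss_fn = SummaryStatisticLoss('total_persistence')
>>> loss = loss_fn(vr(x), vr(y))
>>> loss.backward()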

training: bool
class torch_topological.nn.VietorisRipsComplex(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)[source]

Calculate Vietoris–Rips complex of a data set.

This module calculates ‘differentiable’ persistence diagrams for point clouds. The underlying topological approximations are done by calculating a Vietoris–Rips complex of the data.

__init__(dim=1, p=2, threshold=inf, keep_infinite_features=False, **kwargs)[source]

Initialise new module.

Parameters:
  • dim (int) – Calculates persistent homology up to (and including) the prescribed dimension.

  • p (float) – Exponent indicating which Minkowski p-norm to use for the calculation of pairwise distances between points. Note that if treat_as_distances is supplied to forward(), the parameter is ignored and will have no effect. The rationale is to permit clients to use a pre-computed distance matrix, while always falling back to Minkowski norms.

  • threshold (float) – If set to a finite number, only calculates topological features up to the specified distance threshold. As a consequence, the resulting persistence pairings may also contain infinite features.

  • keep_infinite_features (bool) – If set, keeps infinite features. This flag is disabled by default. The rationale for this is that infinite features require more deliberate handling and, in case threshold is not changed, only a single infinite feature exists, so only that one feature is excluded from subsequent calculations by default.

  • **kwargs – Additional arguments to be provided to ripser, i.e. the backend for calculating persistent homology. The n_threads parameter, which controls parallelisation, is probably the most relevant parameter to be adjusted. Please refer to the giotto-ph documentation for more details on admissible parameters.

Notes

This module currently only supports Minkowski norms. It does not yet support other metrics internally. To use custom metrics, you need to set treat_as_distances in the forward() function instead.

forward(x, treat_as_distances=False)[source]

Implement forward pass for persistence diagram calculation.

The forward pass entails calculating persistent homology on a point cloud and returning a set of persistence diagrams.

Parameters:
  • x (array_like) – Input point cloud(s). x can either be a 2D array of shape (n, d), which is treated as a single point cloud, or a 3D array/tensor of the form (b, n, d), with b representing the batch size. Alternatively, you may also specify a list, possibly containing point clouds of non-uniform sizes.

  • treat_as_distances (bool) – If set, treats x as containing pre-computed distances between points. The semantics of how x is handled are not changed; the only difference is that when x has a shape of (n, d), the values of n and d need to be the same.

Returns:

List of PersistenceInformation, containing both the persistence diagrams and the generators, i.e. the pairings, of a certain dimension of topological features. If x is a 3D array, returns a list of lists, in which the first dimension denotes the batch and the second dimension refers to the individual instances of PersistenceInformation elements.

Generators will be represented in the persistence pairing based on vertex–edge pairs (dimension 0) or edge–edge pairs (higher dimensions). Thus, the persistence pairing in dimension zero will have three components, corresponding to the creator vertex and the two vertices of the destroyer edge, while the persistence pairing for higher dimensions will have four components, corresponding to the two vertices of the creator edge and the two vertices of the destroyer edge.

Return type:

list of PersistenceInformation
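
Examples

A minimal sketch covering a batch of point clouds as well as a pre-computed (square) distance matrix; torch.cdist is only used here to construct such a matrix.

>>> import torch
>>> from torch_topological.nn import VietorisRipsComplex
>>> vr = VietorisRipsComplex(dim=1)
>>> batch = torch.rand(4, 100, 3)               # batch of point clouds
>>> pers_info = vr(batch)                       # list of lists of PersistenceInformation
>>> distances = torch.cdist(batch[0], batch[0])
>>> pers_info_d = vr(distances, treat_as_distances=True)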

training: bool
class torch_topological.nn.WassersteinDistance(p=inf, q=1)[source]

Implement Wasserstein distance between persistence diagrams.

This module calculates the Wasserstein distance between two persistence diagrams. The Wasserstein distance is arguably the most common metric that is applied when dealing with such diagrams. Notice that calculating the metric involves solving optimal transport problems, which are known to suffer from scalability issues. When dealing with large persistence diagrams, other losses may be more appropriate.

__init__(p=inf, q=1)[source]

Create new Wasserstein distance calculation module.

Parameters:
  • p (float or inf) – Specifies the exponent of the norm to calculate. By default, p = torch.inf, corresponding to the maximum norm.

  • q (float) – Specifies the order of Wasserstein metric to calculate. This raises all internal matching costs to the power of q, hence subsequently returning the q-th root of the total cost.

forward(X, Y)[source]

Calculate Wasserstein metric based on input tensors.

Parameters:
  • X (list or instance of PersistenceInformation) – Topological features of the first space. Supposed to contain persistence diagrams and persistence pairings.

  • Y (list or instance of PersistenceInformation) – Topological features of the second space. Supposed to contain persistence diagrams and persistence pairings.

Returns:

A single scalar tensor containing the distance between the persistence diagram(s) contained in X and Y.

Return type:

torch.tensor
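
Examples

A minimal sketch, assuming two random point clouds and otherwise default parameters of the module.

>>> import torch
>>> from torch_topological.nn import VietorisRipsComplex, WassersteinDistance
>>> vr = VietorisRipsComplex(dim=1)
>>> X = vr(torch.rand(100, 2))
>>> Y = vr(torch.rand(100, 2))
>>> wasserstein = WassersteinDistance()
>>> dist = wasserstein(X, Y)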

training: bool
class torch_topological.nn.WeightedEulerCurve(num_directions=100, num_steps=30, prod=False)[source]

Calculate the Weighted Euler Characteristic Transform of a given 3D tensor.

This is an implementation of the WECT, following [Jiang_2020].

References

[Jiang_2020]

Q. Jiang et al., “The Weighted Euler Curve Transform for Shape and Image Analysis”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pp. 844–845, 2020.

__init__(num_directions=100, num_steps=30, prod=False)[source]

Create new WECT module.

Parameters:
  • num_directions (int) – Specifies the number of random directions to be sampled for computation of the WECT.

  • num_steps (int) – Number of steps to be used for the approximation of a single curve.

  • prod (bool, default=False) – Specifies whether the product of the constituent vertices of an edge/square/cube is used as the value of that edge/square/cube; if disabled, the maximum of the constituent vertices is used instead.

forward(x)[source]

Calculate the Weighted Euler Characteristic Transform (WECT) of an input 3D float tensor.

Parameters:

x (3D float torch tensor) – Input volume for which to calculate the WECT.

Returns:

A 3D tensor of shape (num_directions, num_steps, 1), obtained by stacking num_directions weighted Euler curves of x, each of which is a 1D tensor of length num_steps.

Return type:

torch.tensor
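
Examples

A minimal sketch on a random volume; the shape comment follows the return description above.

>>> import torch
>>> from torch_topological.nn import WeightedEulerCurve
>>> wect = WeightedEulerCurve(num_directions=32, num_steps=20)
>>> volume = torch.rand(16, 16, 16)
>>> curves = wect(volume)
>>> curves.shape    # (num_directions, num_steps, 1)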

training: bool