Exascale Data Analysis

Since 2016, I have been extending my work on the contour tree to target exascale data, in collaboration with Lawrence Berkeley National Laboratory and Los Alamos National Laboratory in the United States, under the ECP-Alpine Exascale Computing Project.

Existing algorithms for contour tree computation (including my prior work) were defined serially, and based on serial mathematics. Since modern pre-exascale clusters such as Summit can have 200 million GPU cores spread across 20,000 machines, making effective use of the compute resources requires redefining the algorithms, and in some cases the mathematics that underlie them.

In an initial attempt, we built on my work on the Joint Contour Net by quantising the data set. While this succeeded in parallelising the problem, it had a huge memory footprint and was hard to validate.

We went back to the drawing board and came up with an effective solution for massively multicore local computations (i.e. GPUs, OpenMP and similar), using parallel array transformations, parallel path collapse and parallel sorting to enforce topological invariants. We presented it at the IEEE Large Data Analysis & Visualization (LDAV) 2016 workshop, and were awarded Best Paper.
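To give a flavour of what parallel path collapse means, here is a minimal sketch of pointer doubling, a standard technique in which every vertex repeatedly jumps to its successor's successor until it points directly at the end of its path. It is an illustration only, not the algorithm from the paper: the function names (collapse_paths, steepest_ascent) and the 1D height field are hypothetical, and each round is written as a plain Python list comprehension standing in for what would be a single data-parallel array operation on a GPU.

```python
def steepest_ascent(heights):
    """For a 1D height field, point each vertex at its highest neighbour.

    A vertex with no higher neighbour (a local maximum) points at itself.
    Hypothetical helper, used only to build example input.
    """
    successor = []
    for v in range(len(heights)):
        best = v
        for u in (v - 1, v + 1):
            if 0 <= u < len(heights) and heights[u] > heights[best]:
                best = u
        successor.append(best)
    return successor


def collapse_paths(successor):
    """Pointer doubling: make every vertex point at the root of its path.

    successor[v] is the next vertex on v's ascending path; roots point at
    themselves. Assumes the pointers contain no cycles other than these
    self-loops, so the loop terminates after O(log n) rounds.
    """
    current = list(successor)
    while True:
        # One round: every vertex independently jumps two steps at once.
        doubled = [current[current[v]] for v in range(len(current))]
        if doubled == current:
            return current
        current = doubled


heights = [1, 3, 7, 4, 2, 5, 9, 6]
peaks = collapse_paths(steepest_ascent(heights))
print(peaks)  # each vertex labelled with the index of the peak it climbs to
```

Because every vertex updates independently within a round, the work maps naturally onto array-style parallel primitives, and the number of rounds grows only logarithmically with the longest path.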


We then added further information to the contour tree (technically referred to as augmenting it) and built a hyperstructure to support parallel acceleration of operations in the tree.

My student Petar Hristov had by this point resolved some mathematical and complexity issues caused by W-structures in the contour tree.

This gave him the background to spend an internship at Lawrence Berkeley National Laboratory, where he introduced hypersweeps to compute secondary properties of the contour tree for the purposes of decomposition, simplification and feature detection, then connected them up to the Cinema in situ tools.
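As an illustration of the kind of secondary property involved, the sketch below accumulates subtree sizes from the leaves of a rooted tree towards its root, one frontier of ready vertices at a time. This is a deliberately simplified stand-in, assuming a plain parent array rather than the hyperstructure, and it is not the hypersweep itself; the name subtree_sizes and the example tree are hypothetical.

```python
def subtree_sizes(parent):
    """Count the vertices in each subtree of a rooted tree.

    parent[v] is the parent of vertex v; the root has parent -1.
    Vertices are processed in rounds: a vertex becomes ready once all
    of its children have contributed their counts.
    """
    n = len(parent)
    children_left = [0] * n
    for p in parent:
        if p >= 0:
            children_left[p] += 1

    size = [1] * n  # every vertex counts itself
    frontier = [v for v in range(n) if children_left[v] == 0]  # the leaves
    while frontier:
        next_frontier = []
        for v in frontier:
            p = parent[v]
            if p < 0:
                continue  # the root has nowhere to pass its count
            size[p] += size[v]
            children_left[p] -= 1
            if children_left[p] == 0:
                next_frontier.append(p)
        frontier = next_frontier
    return size


# Hypothetical tree: vertex 0 is the root with children 1 and 2;
# vertex 1 has children 3 and 4; vertex 2 has child 5.
sizes = subtree_sizes([-1, 0, 0, 1, 1, 2])
print(sizes)
```

Quantities of this kind, such as the number of samples below an arc, are what drive decisions about which branches to simplify away and which to keep as features.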

Since then, we have implemented a fully distributed version that computes separate contour trees on multiple nodes of a cluster, then combines them to produce a distributed hierarchical contour tree. This work won a second Best Paper award at LDAV 2022.

All of this has been implemented and released in the modern multicore VTK-m visualization toolkit, with generous support from the US DoE/NNSA Exascale Computing Project. There is more to come.

Authors from VCG

Hamish Carr