Parallel Octrees, Multigrid and Elastic Image Registration

Rahul Sampath
Seminar

The first component of this work is a parallel algorithm for constructing non-uniform octree meshes for finite element computations. Prior to octree meshing, the linear octree data structure must be constructed and a constraint known as 2:1 balancing" must be enforced; parallel algo-
rithms for these two subproblems will also be presented.

The second component of this work is a parallel matrix-free geometric multigrid algorithm for solving elliptic partial dierential equations (PDEs) using these octree meshes. The last component of this work is a parallel multiscale Gauss
Newton optimization algorithm for solving the elastic image registration problem. The registration problem is discretized using nite elements on octree meshes and the parallel geometric multigrid algorithm is used as a preconditioner in the Conjugate Gradient (CG) algorithm to solve the linear system of equations formed in each Gauss Newton iteration.

The parallel octree meshing and multigrid algorithms have several physical and computer science applications such as in solid/fluid mechanics, heat/mass transfer, electromagnetism, image processing and unstructured mesh generation. Potential applications for the image registration algorithm include automatic identication of abnormalities in medical images, motion reconstruction from temporal sequences of images and planning of surgeries.
Several ideas were used to reduce the overhead for constructing the octree meshes. These include (a) a way to lower communication costs by reducing the number of synchronizations and reducing the communication message size, (b) a way to reduce the number of searches required to
build element-to-vertex mappings, and (c) a compression scheme to reduce the memory footprint of the entire data structure. To our knowledge, the multigrid algorithm presented in this work is the only matrix-free multiplicative geometric multigrid implementation for solving nite element equations on octree meshes using thousands of processors. The proposed registration algorithm is also unique; it is a combination of many different ideas: adaptivity, parallelism, fast optimization
algorithms, and fast linear solvers.

All the algorithms were implemented in C++ using the Message Passing Interface (MPI) standard and were built on top of the PETSc library from Argonne National Laboratory. The multigrid implementation has been released as an open source software: Dendro. Dendro has been tested on several NSF TeraGrid platforms. Our largest run was a highly nonuniform, 8- billion-unknown, elasticity calculation on 32,000 processors.