Scalable High Performance Multidimensional Scaling

Seung-Hee Bae
Seminar

We live in the so-called data deluge era: huge amounts of data are generated in many domains of modern society, driven by advances in technology and social networks. Dimension reduction is a useful tool for visualizing such high-dimensional or abstract data, making analysis of large-scale scientific data feasible. Among the known dimension reduction algorithms, multidimensional scaling (MDS) is investigated in my research because of its theoretical robustness and broad applicability. Large-scale MDS must overcome two main challenges. First, it demands huge amounts of computation and memory, since it requires O(N²) memory and computation for N data points. Second, MDS is a non-linear optimization problem, so an EM-like hill-climbing approach is easily trapped in local optima.
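As a concrete formulation (a standard one, not spelled out above), MDS is typically posed as minimizing the STRESS objective over the target-space configuration X = {x_1, ..., x_N}:

```latex
\sigma(X) \;=\; \sum_{i<j} w_{ij}\,\bigl(d_{ij}(X) - \delta_{ij}\bigr)^{2},
\qquad d_{ij}(X) = \lVert x_i - x_j \rVert,
```

where δ_ij is the given pairwise dissimilarity and w_ij a non-negative weight. The sum ranges over all N(N-1)/2 pairs, which is the source of the O(N²) computation and memory cost mentioned above.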

To tackle the two challenges mentioned above, we apply three methodologies to multidimensional scaling: i) parallelization, ii) interpolation, and iii) deterministic annealing (DA) optimization. Parallelization supplies the required computation and memory by utilizing large-scale distributed-memory systems, such as multicore clusters. In addition, we investigate an interpolation method that uses the known mappings of a subset of the given data, called the in-sample data, to generate mappings for the remaining out-of-sample points; this approach dramatically reduces the computational complexity and memory requirement. Finally, the DA optimization method is applied to the MDS problem in order to avoid local optima. Experimental results illustrate that the proposed methodologies effectively scale up the mapping capacity of MDS and improve mapping quality by avoiding local optima.
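To illustrate the interpolation idea, the sketch below (an assumption-laden simplification, not the exact algorithm of this work) places a single out-of-sample point using a SMACOF-style majorization update against the fixed mappings of its k nearest in-sample neighbors. The function name `interpolate_point` and all parameters are hypothetical.

```python
import numpy as np

def interpolate_point(delta, P, iters=100, eps=1e-12):
    """Map one out-of-sample point into the target space while holding
    the in-sample mappings P (k x d array) fixed.

    delta : (k,) original dissimilarities from the new point to its
            k in-sample neighbors.
    Iterates a majorization update for the single point, so the cost is
    O(k) per point instead of O(N^2) for a full MDS run."""
    x = P.mean(axis=0)                    # start from the neighbors' centroid
    for _ in range(iters):
        diff = x - P                      # (k, d) vectors toward x
        dist = np.linalg.norm(diff, axis=1)
        dist = np.maximum(dist, eps)      # guard against division by zero
        # Pull x toward each neighbor's target distance and average.
        x = (P + (delta / dist)[:, None] * diff).mean(axis=0)
    return x
```

Because each out-of-sample point depends only on its k in-sample neighbors, the points can be mapped independently, which is what makes the approach both cheap and trivially parallel.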