Patchwork Kriging for Large Datasets

Chiwoo Park
Seminar

Gaussian process (GP) regression or stochastic kriging is a popular Bayesian nonparametric approach for nonlinear regression and simulation meta-modeling, but its omputation does not scale very well for large datasets. This talk presents a new approach for Gaussian process (GP) regression for large datasets. The approach involves partitioning the regression input domain into multiple local regions with a different local GP model fitted in each region. Unlike existing local partitioned GP approaches, a technique for patching together the local GP models nearly seamlessly is introduced to ensure that the local GP models for two neighboring regions produce nearly the same response prediction and prediction error variance on the boundary between the two regions. This effectively solves the well-known discontinuity problem that degrades the boundary accuracy of existing local partitioned GP methods. Our main innovation is to represent the continuity conditions as additional pseudo-observations that the differences between neighboring GP responses are identically zero at an appropriately chosen set of boundary input locations. In contrast to heuristic continuity adjustments, this has an advantage of working within a formal GP framework, so that the GP-based predictive uncertainty quantification remains valid. This new approach also inherits a sparse block-like structure for the sample covariance matrix, which results in computationally efficient closed-form expressions for the predictive mean and variance. The numerical performance of the approach will be presented with comparison to several state-of-the-art approaches.