Reining in computational complexity

A more efficient approach to modeling spatial data involving thousands of variables keeps computation time in check.

With the rapid expansion of worldwide climate monitoring networks, more data is being collected at higher resolution and at more locations than ever before. While such a wealth of data promises unprecedented insight into climate behavior and could vastly improve our ability to predict the frequency and severity of future extreme events, it also presents an enormous computational challenge. Simply adding more computing power helps for a time, but as the number of observation sites and variables expands into the thousands and tens of thousands, brute force alone cannot keep pace with the data explosion.

To address this looming dilemma, Jian Cao, a doctoral student in Marc Genton's research group at KAUST, has now developed an approach that cuts the cost of such computations by a factor of 50 or more.

“Our goal was to develop a more effective method for computing multivariate normal probabilities, which appear frequently in statistical models used to analyze extreme events,” says Cao.
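
For context, the multivariate normal probability Cao refers to is the chance that a set of jointly normal variables all fall within given limits. Written generically for d variables with covariance matrix Σ (the notation here is generic, not taken from the paper), it is the d-dimensional integral

$$
\Phi_d(\mathbf{a},\mathbf{b};\boldsymbol{\Sigma}) \;=\; \int_{a_1}^{b_1}\!\cdots\!\int_{a_d}^{b_d} \frac{1}{\sqrt{(2\pi)^d\,|\boldsymbol{\Sigma}|}}\, \exp\!\Big(-\tfrac{1}{2}\,\mathbf{x}^\top\boldsymbol{\Sigma}^{-1}\mathbf{x}\Big)\,\mathrm{d}\mathbf{x},
$$

so the dimension of the integral grows with the number of monitored variables, which is exactly where the computational burden comes from.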

One of the key motivations behind Cao’s work was the limited number of variables, or dimensions, that can be handled by commonly used statistical platforms. 

“Most current statistical software can only handle hundreds of dimensions,” explains Cao. “R, for example, is the most prevalent statistical platform for computing multivariate normal probabilities, and it can only accept 1,000 dimensions.”
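
To make the scale of the task concrete, here is a minimal sketch, not Cao's method, of the brute-force route: estimating such a probability by plain Monte Carlo in Python. The dimension, covariance structure and upper limits are illustrative assumptions.

```python
# A minimal sketch (not Cao's method): estimating a multivariate normal
# probability P(X <= b) by plain Monte Carlo. The dimension, covariance
# matrix and upper limits below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d = 100                                  # number of variables (dimensions)
# Illustrative covariance: correlation that decays with distance between sites
idx = np.arange(d)
cov = 0.7 ** np.abs(idx[:, None] - idx[None, :])
upper = np.full(d, 2.0)                  # upper integration limits b

# Draw samples from N(0, cov) and count how often every coordinate stays below b
n_samples = 100_000
samples = rng.multivariate_normal(np.zeros(d), cov, size=n_samples)
inside = np.all(samples <= upper, axis=1)
prob = inside.mean()
stderr = inside.std(ddof=1) / np.sqrt(n_samples)

print(f"Estimated P(X <= b): {prob:.4f} +/- {stderr:.4f}")
```

Standard library routines, such as pmvnorm in R or SciPy's multivariate_normal.cdf, use smarter quasi-Monte Carlo integration than this, but their cost still climbs quickly with the number of dimensions, which is the bottleneck Cao's work targets.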
