Simple statistics can be good enough

Gaussian distributions are simple and easy to understand, but for some data, such as rainfall and wind speed, they can produce physically impossible tails that extend to negative values.

© Marek Uliasz / Alamy Stock Photo

Environmental scientists and their statistician colleagues face a common dilemma: Do simpler statistical tests properly characterize a data set? And is it worth the effort to derive and apply statistical methods that are possibly better matched but more difficult to interpret? In most cases the path of least resistance wins, but the choice of a simple statistical basis can cast slight doubt on the validity of statistically derived study results.

KAUST researcher Marc Genton and his doctoral student Yuan Yan developed a framework to test how much inaccuracy a mismatch between the data and the assumed statistical model can introduce, and the results are surprising.

“Researchers tend to fit spatial data with a simple Gaussian model—the classic symmetric bell curve around the average value—even though data might have an asymmetric distribution with features that diverge from Gaussian,” says Yan. “We investigated the effect of the ‘non-Gaussianity’ of data on statistical estimation and prediction under the wrong Gaussian assumption.”
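
To make the mismatch concrete, here is a minimal sketch of the problem Yan describes. It is not the researchers' framework. The rainfall values are synthetic, drawn from a gamma distribution whose shape and scale were chosen arbitrarily for illustration, and the Gaussian is fitted simply by matching the sample mean and standard deviation.

```python
import random
from statistics import NormalDist, fmean, stdev

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for daily rainfall (mm): strictly positive and
# right-skewed. Shape 1.5 and scale 4.0 are illustrative assumptions.
rain = [random.gammavariate(1.5, 4.0) for _ in range(5000)]

# Fit a Gaussian by matching the sample mean and standard deviation.
fitted = NormalDist(mu=fmean(rain), sigma=stdev(rain))

# Probability the fitted Gaussian assigns to impossible negative rainfall.
p_negative = fitted.cdf(0.0)

print(f"smallest observed value: {min(rain):.3f}")
print(f"Gaussian P(rain < 0):    {p_negative:.3f}")
```

Every simulated observation is positive, yet the fitted bell curve still places a noticeable share of its probability below zero. That share is the physically impossible tail that a Gaussian assumption can create for data of this kind.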
