By leveraging the power of machine learning, researchers have devised a fast and accurate approach that can fit statistical models to large, complex datasets in a fraction of the time required by traditional methods. The “neural estimation” approach, developed by a joint team of researchers from KAUST and the University of Wollongong (UoW) in Australia, is a groundbreaking demonstration of the power of machine learning to efficiently solve computationally intensive data problems.
“Statistical models are used everywhere, for example in the geosciences for modeling sea-level rise, in the health sciences for modeling epidemics, and in the social sciences for modeling crime,” says Matthew Sainsbury-Dale from UoW, who collaborated on the study with KAUST’s Raphaël Huser. “These models contain unknown parameters that must be estimated from data, and the standard approach for this is to use what is known as the likelihood function.”
Estimating these parameters by maximizing the likelihood function is straightforward in some cases. But in many instances, and particularly for environmental data, the likelihood function cannot be derived mathematically, or is too computationally expensive to evaluate.
“Instead, we looked at how the parameters of a statistical model could be estimated without evaluating its likelihood function,” says Sainsbury-Dale. “We took the idea of an ‘estimator’ as a general function that takes in data and outputs parameter estimates. Through the use of neural networks, we can construct general likelihood-free neural estimators, which are both accurate and extremely fast, for almost any statistical model.”
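The idea of an estimator as a function from data to parameter estimates can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the study: the model (independent Gaussian observations with unknown standard deviation), the uniform prior, the squared-error training criterion, and the names `simulate`, `features` and `estimator`. To keep the example short, a single linear layer fitted by least squares on the sorted data stands in for the neural network, and the NeuralEstimators package is not used. The point is only that the likelihood function is never evaluated: the estimator is built entirely from simulated pairs of parameters and data.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_sets, n_obs=30):
    """Draw parameters from a prior, then one data set from the model per parameter."""
    theta = rng.uniform(0.5, 3.0, size=n_sets)  # assumed prior on the standard deviation
    z = rng.normal(0.0, theta[:, None], size=(n_sets, n_obs))
    return theta, z

def features(z):
    """Sort each data set (order carries no information for i.i.d. data); add an intercept."""
    return np.hstack([np.sort(z, axis=1), np.ones((z.shape[0], 1))])

# "Training": choose the weights that minimise the average squared error between
# estimate and true parameter over many simulated (parameter, data) pairs.
# A linear least-squares fit stands in here for training a neural network.
theta_train, z_train = simulate(20_000)
weights, *_ = np.linalg.lstsq(features(z_train), theta_train, rcond=None)

def estimator(z):
    """Map data sets (one per row) to parameter estimates with one matrix product."""
    return features(z) @ weights

# Once fitted, the estimator is applied to fresh data at negligible cost.
theta_test, z_test = simulate(2_000)
theta_hat = estimator(z_test)
mean_abs_error = float(np.mean(np.abs(theta_hat - theta_test)))
print(f"mean absolute error on fresh data: {mean_abs_error:.3f}")
```

The simulation and fitting step is done once; after that, each new dataset costs only a sort and a matrix product, which mirrors, in a very simplified way, why a trained estimator can be reused cheaply.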
The researchers used the method to fit a complex and highly parameterized spatial model to a large dataset of Red Sea temperature extremes. After training, the neural estimator provided parameter estimates and uncertainty quantification in a fraction of a second, with an excellent fit to the data.
“The great benefit of this approach,” says Huser, “is that once trained, the estimator can be used repeatedly with new data at almost no computational cost. We believe this is the future of statistical inference.”
Andrew Zammit-Mangion from UoW, who co-authored the study, also points out the value of this research partnership.
“The project was the product of a very fruitful ongoing collaboration between the University of Wollongong and KAUST and would not have been possible without the dedication and commitment of personnel and resources from both institutions,” he says.
The researchers have developed and released the open-source software package NeuralEstimators to facilitate the construction of the neural estimators used in their study.
- Sainsbury-Dale, M., Zammit-Mangion, A. & Huser, R. Likelihood-free parameter estimation with neural Bayes estimators. The American Statistician, 17 August 2023.