NNGPs

One of my primary research projects focuses on theoretical and applied developments in neural network Gaussian processes (NNGPs). Alexey Lindo and Devanshu Agrawal are my two main collaborators on this topic.

NNGPs are Gaussian processes (GPs) that arise as infinite-width limits of neural networks. Why conduct research on NNGPs? Firstly, NNGPs provide a theoretical understanding of how the properties of neural networks and of GPs are interconnected. Secondly, NNGPs characterize parameter priors for wide neural networks in terms of the corresponding GP priors, which is of practical value: parameter priors are typically used to initialize weights and biases in stochastic optimization of neural networks, and they are also candidate priors for Bayesian inference in deep learning.
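
To make the limiting statement concrete, here is a minimal NumPy sketch, not part of our project code: it compares the empirical output covariance of wide one-hidden-layer ReLU networks with standard-normal weights against the corresponding analytic NNGP kernel (the degree-1 arc-cosine kernel of Cho and Saul). The ReLU activation, the width and the number of Monte Carlo draws are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_nngp_kernel(x1, x2):
    """Analytic NNGP kernel for one ReLU hidden layer with standard-normal
    weights: the degree-1 arc-cosine kernel."""
    n1, n2 = np.linalg.norm(x1), np.linalg.norm(x2)
    cos_theta = np.clip(x1 @ x2 / (n1 * n2), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    return n1 * n2 * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def random_network_outputs(x1, x2, width, n_draws):
    """Sample outputs of one-hidden-layer ReLU networks
    f(x) = v . relu(W x) / sqrt(width), with independent N(0, 1) parameters."""
    d = x1.shape[0]
    W = rng.standard_normal((n_draws, width, d))
    v = rng.standard_normal((n_draws, width))
    h1 = np.maximum(W @ x1, 0.0)  # hidden activations, shape (n_draws, width)
    h2 = np.maximum(W @ x2, 0.0)
    f1 = np.einsum("nw,nw->n", v, h1) / np.sqrt(width)
    f2 = np.einsum("nw,nw->n", v, h2) / np.sqrt(width)
    return f1, f2

x1 = np.array([1.0, 0.5, -0.3])
x2 = np.array([0.2, -1.0, 0.8])

f1, f2 = random_network_outputs(x1, x2, width=1024, n_draws=5000)
print("empirical E[f(x1) f(x2)] ~", np.mean(f1 * f2))
print("analytic NNGP kernel     =", relu_nngp_kernel(x1, x2))
```

As the width grows, the empirical second moment of the network outputs approaches the analytic kernel value, which is the covariance of the limiting GP at the two inputs.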

My collaborators and I have made two contributions to NNGPs. We have shown that wide neural networks with bottlenecks (layers kept at finite width) converge to deep GPs (Agrawal, Papamarkou and Hinkle, 2020); in other words, the wide limit of a composition of neural networks is the composition of the corresponding limiting GPs. More recently, we have developed NNGPs with so-called PGF kernels (Lindo et al., 2021) by utilizing a known duality between activations and probability generating functions (PGFs). Moreover, we have developed θ NNGPs, a family of NNGPs arising from a special case of PGF kernels related to θ processes.
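
Schematically, and leaving the precise assumptions (independence across layers, which widths grow, the mode of convergence) to the papers, the bottleneck result and the standard PGF definition underlying the second line of work can be written as follows.

```latex
% Informal statement of the bottleneck result (Agrawal, Papamarkou and Hinkle, 2020):
% if two stacked networks converge to GPs with kernels K_1 and K_2 as their
% non-bottleneck widths grow, their composition converges to the deep GP
\[
  f^{\mathrm{NN}}_2 \circ f^{\mathrm{NN}}_1 \;\longrightarrow\; f_2 \circ f_1,
  \qquad f_1 \sim \mathrm{GP}(0, K_1), \quad f_2 \sim \mathrm{GP}(0, K_2).
\]

% Probability generating function of a random variable X on {0, 1, 2, ...},
% the object paired with activations in the PGF-kernel construction:
\[
  G_X(s) \;=\; \mathbb{E}\!\left[s^X\right]
          \;=\; \sum_{k \ge 0} \mathbb{P}(X = k)\, s^k .
\]
```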

We are currently working on GPs with PGF kernels, aiming to demonstrate their relative advantages. The upcoming research manuscript will be accompanied by software implementing such GPs.
