Doctoral Thesis Defense of Sara Pérez Vieites

  • Title: “Nested filtering methods for Bayesian inference in state space models”
  • Advisor: Joaquín Míguez Arenas.


A feature common to many problems in some of the most active fields of science is the need to calibrate (i.e., estimate the parameters of) high-dimensional dynamical systems and then forecast their time evolution using sequentially collected data. In this dissertation we introduce a generalised nested filtering methodology that is structured in two or more intertwined layers in order to estimate the static parameters and the dynamic state variables of nonlinear dynamical systems. This methodology is essentially probabilistic: it aims at recursively computing the sequence of posterior probability distributions of the unknown model parameters and the (time-varying) state variables conditional on the available observations. Specifically, in the first layer of the filter we approximate the posterior probability distribution of the static parameters, and in the subsequent layers we employ filtering (or data assimilation) techniques to track and predict different conditional probability distributions of the state variables. We have investigated the use of different Monte Carlo-based methods and Gaussian filtering techniques in each of the layers, leading to a wealth of algorithms.

In a first approach, we have introduced a two-layer nested filtering methodology that aims at recursively estimating the static parameters and the dynamical state variables of a state space model. This probabilistic scheme uses Monte Carlo-based methods in the first layer of the filter, combined with Gaussian filters in the second layer. Unlike the nested particle filter (NPF) of [25], the use of Gaussian filtering techniques in the second layer allows for fast implementations, leading to algorithms that are better suited to high-dimensional systems. As each layer uses a different type of method, we refer to the proposed methodology as nested hybrid filtering.
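
The two-layer idea described above can be illustrated with a minimal sketch. It is not the algorithm of the dissertation: the scalar model (`f`, `f_jac`), the uniform prior on [0, 1], the jittering step and all numerical values are assumptions chosen only for illustration. The first layer keeps Monte Carlo samples of a single static parameter, and each sample carries its own extended Kalman filter for the state in the second layer; the EKF predictive likelihood of each new observation is used as the importance weight of the corresponding parameter sample.

```python
import numpy as np

def f(x, theta):
    # Hypothetical state transition (assumption): unknown gain theta plus a bounded nonlinearity.
    return theta * x + np.sin(x)

def f_jac(x, theta):
    # Derivative of f with respect to x, needed by the EKF linearisation.
    return theta + np.cos(x)

def simulate(theta, T, q, r, rng):
    # Synthetic observations y_t = x_t + noise from the toy model above.
    x, ys = 0.0, np.empty(T)
    for t in range(T):
        x = f(x, theta) + np.sqrt(q) * rng.standard_normal()
        ys[t] = x + np.sqrt(r) * rng.standard_normal()
    return ys

def nested_hybrid_filter(ys, N=200, q=0.1, r=0.1, jitter_std=0.01, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 1.0, size=N)  # layer 1: parameter samples from the prior
    m, P = np.zeros(N), np.ones(N)         # layer 2: one EKF mean and variance per sample
    theta_est = np.empty(len(ys))
    for t, y in enumerate(ys):
        # Layer 1: jitter the parameter samples (kept inside the prior support).
        theta = np.clip(theta + jitter_std * rng.standard_normal(N), 0.0, 1.0)
        # Layer 2: EKF prediction, run for all parameter samples at once.
        F = f_jac(m, theta)
        m_pred = f(m, theta)
        P_pred = F**2 * P + q
        # Layer 2: EKF update with the new observation.
        S = P_pred + r
        K = P_pred / S
        innov = y - m_pred
        m = m_pred + K * innov
        P = (1.0 - K) * P_pred
        # Layer 1: weight each sample by the Gaussian predictive likelihood of y.
        logw = -0.5 * (np.log(2.0 * np.pi * S) + innov**2 / S)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        theta_est[t] = np.sum(w * theta)
        # Layer 1: multinomial resampling of the samples together with their EKF statistics.
        idx = rng.choice(N, size=N, p=w)
        theta, m, P = theta[idx], m[idx], P[idx]
    return theta_est
```

Calling `nested_hybrid_filter(simulate(0.7, 200, 0.1, 0.1, np.random.default_rng(1)))` returns the running posterior-mean estimate of the parameter. Because the second layer only propagates a mean and a variance per sample, its cost per time step is that of N Kalman-type updates rather than N inner particle filters.
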
We specifically explore the combination of Monte Carlo and quasi-Monte Carlo approximations in the first layer, including sequential Monte Carlo (SMC) and sequential quasi-Monte Carlo (SQMC), with standard Gaussian filtering methods in the second layer, such as the ensemble Kalman filter (EnKF) and the extended Kalman filter (EKF). However, other algorithms fit naturally within the framework. Additionally, we prove a general convergence result for a class of procedures that use SMC in the first layer, and we show numerical results for a stochastic two-scale Lorenz 96 system, a model commonly used to assess data assimilation (filtering) procedures in geophysics. We apply different implementations of the methodology to the tracking of the state and the estimation of the fixed parameters, and compare them. We show estimation and forecasting results, obtained with a desktop computer, for up to 5000 dynamic state variables.

As an extension of the nested hybrid filtering methodology, we have introduced a class of schemes that can incorporate deterministic sampling techniques (such as the cubature Kalman filter (CKF) or the unscented Kalman filter (UKF)) in the first layer of the algorithm, instead of the Monte Carlo-based methods employed in the original procedure. As all the methods used in this scheme are Gaussian, we refer to this class of algorithms as nested Gaussian filters. Once again, the proposed scheme reduces the computational cost, making the resulting algorithms potentially better suited to high-dimensional state and parameter spaces. In the numerical results, we describe and implement a specific instance of the new method (a UKF-EKF algorithm) and evaluate its average performance in terms of estimation errors and running times for nonlinear stochastic models. Specifically, we present numerical results for a stochastic Lorenz 63 model using synthetic data, as well as for a stochastic volatility model with real-world data.
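
To make the notion of deterministic sampling concrete, the following sketch generates the sigma points and weights of the standard scaled unscented transform for a Gaussian with a given mean and covariance. It is only an illustration under assumptions: the function name `sigma_points`, the default values of `alpha`, `beta` and `kappa`, and the example mean and covariance are not taken from the dissertation. In a nested Gaussian filter of the UKF-EKF type, each such point would play the role of a candidate parameter value for which a second-layer EKF is run, in place of the random samples of the hybrid scheme.

```python
import numpy as np

def sigma_points(mean, cov, alpha=1.0, beta=2.0, kappa=1.0):
    """Scaled unscented transform: 2n+1 sigma points with mean and covariance weights."""
    mean = np.asarray(mean, dtype=float)
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    # Columns of L are the offsets; L @ L.T equals (n + lam) * cov.
    L = np.linalg.cholesky((n + lam) * np.asarray(cov, dtype=float))
    # Row 0 is the mean; rows 1..n add the columns of L; rows n+1..2n subtract them.
    points = np.vstack([mean, mean + L.T, mean - L.T])
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return points, wm, wc

# Example values (assumptions): a two-dimensional parameter with a correlated covariance.
mean = np.array([0.5, 2.0])
cov = np.array([[0.2, 0.05], [0.05, 0.1]])
points, wm, wc = sigma_points(mean, cov)

# The weighted points reproduce the first two moments of the Gaussian.
dev = points - mean
recovered_mean = wm @ points
recovered_cov = (wc[:, None] * dev).T @ dev
```

The number of points grows linearly with the dimension (2n + 1), which is the source of the computational savings relative to a Monte Carlo first layer that may need many more samples.
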
Finally, we have extended the proposed methodology in order to estimate the static parameters and the dynamical variables of a class of heterogeneous multi-scale state-space models [1]. This scheme combines three or more layers of filters, one nested inside the other. Each layer corresponds to one of the time scales that are relevant to the dynamics of this kind of state-space model: the variables with the largest time scales (the slowest ones) are allocated to the outermost layer and the variables with the smallest time scales (the fastest ones) to the innermost layer. In particular, we describe a three-layer filter that approximates the posterior probability distribution of the parameters in the first layer, the posterior probability distribution of the slow state variables in the second layer, and the posterior probability distribution of the fast state variables in the third layer. We describe two possible algorithms that derive from this scheme, combining Monte Carlo methods and Gaussian filters at different layers. The first method uses SMC methods in both the first and second layers, together with a bank of UKFs in the third layer (i.e., an SMC-SMC-UKF algorithm). The second method employs an SMC method in the first layer, EnKFs in the second layer and a bank of EKFs in the third layer (i.e., an SMC-EnKF-EKF algorithm). We present numerical results for a two-scale stochastic Lorenz 96 model with synthetic data.
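
One way to see why three layers arise naturally is the chain-rule factorisation of the joint posterior. The notation below is an assumption introduced only for illustration (θ for the static parameters, x_t for the slow state variables, z_t for the fast state variables, y_{1:t} for the observations up to time t); the dissertation may condition on different quantities in its actual approximations.

```latex
\[
  p(\theta, x_t, z_t \mid y_{1:t})
  = \underbrace{p(z_t \mid x_t, \theta, y_{1:t})}_{\text{third layer (fast states)}}
    \;\underbrace{p(x_t \mid \theta, y_{1:t})}_{\text{second layer (slow states)}}
    \;\underbrace{p(\theta \mid y_{1:t})}_{\text{first layer (parameters)}}
\]
```

Read from right to left, each factor is conditioned on the variables handled by the layers outside it, which matches the ordering of parameters, slow states and fast states described above.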