Inference and robotic path planning over high dimensional categorical observations
Date
2024-06
Authors
San Soucie, John E.
DOI
10.1575/1912/69627
Keywords
Path planning
Bayesian modeling
Phytoplankton ecology
Abstract
Advances in marine autonomy, deep learning, and in-situ marine sensing technology have enabled oceanographers to collect vast amounts of spatiotemporally distributed, sparse, high-dimensional categorical data. Statistical models, particularly in streaming and computationally constrained settings, have lagged behind data collection. Recent developments in topic modeling for robotics have highlighted the potential to efficiently extract meaningful relationships from categorical data and to adjust robotic path planning based on real-time inference. This dissertation seeks to fill the gap in streaming statistical models for sparse, high-dimensional categorical data, in the context of open-ocean phytoplankton community ecology.
We begin by exploring the use of existing topic modeling approaches for plankton community characterization. Topic models are compared to standard ecological techniques for dimensionality reduction. The increased fidelity and expressiveness of the topic modeling approach allow for greater resolution of plankton co-occurrence relationships. By analyzing these relationships and ocean physics in and around a retentive eddy, the source of phytoplankton variability is traced to storm-driven advection on the ocean surface. We conclude that topic models offer unique insights into the causal mechanisms underlying plankton community variability.
Next, we turn our focus to the development of a streaming belief model for categorical path planning. Such a model must be capable of predicting in regions without data, and it must be able to process streaming data in a computationally efficient manner. We introduce the Gaussian Dirichlet Random Field model, a novel topic model with spatially continuous latent log-probabilities. In addition to producing a more accurate model than the state of the art in locations with data, the Gaussian Dirichlet Random Field model can interpolate and extrapolate. The model is initially presented with a batch hybrid Markov chain Monte Carlo inference procedure.
We develop a streaming, fully variational inference approach, called Streaming Gaussian Dirichlet Random Fields, which satisfies both the prediction and efficiency requirements for path planning belief models. In-silico experiments demonstrate the ability of this model to accurately map latent co-occurrence patterns. Comparisons to a standard Gaussian process on both path-planning and observation-mapping tasks show that the ability of Streaming Gaussian Dirichlet Random Fields to leverage additional categorical observations enables superior performance.
Description
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical and Oceanographic Engineering at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2024.
Citation
San Soucie, J. E. (2024). Inference and robotic path planning over high dimensional categorical observations [Doctoral thesis, Massachusetts Institute of Technology and Woods Hole Oceanographic Institution]. Woods Hole Open Access Server. https://doi.org/10.1575/1912/69627