Inference and robotic path planning over high dimensional categorical observations

dc.contributor.advisor Girdhar, Yogesh
dc.contributor.advisor Sosik, Heidi M.
dc.contributor.author San Soucie, John E.
dc.date.accessioned 2024-05-31T15:36:31Z
dc.date.available 2024-05-31T15:36:31Z
dc.date.issued 2024-06
dc.description Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical and Oceanographic Engineering at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution June 2024.
dc.description.abstract Advances in marine autonomy, deep learning, and in-situ marine sensing technology have enabled oceanographers to collect vast amounts of spatiotemporally-distributed, sparse, high-dimensional categorical data. Statistical models, particularly in streaming and computationally constrained settings, have lagged behind data collection. Recent developments in topic modeling for robotics have highlighted the potential to efficiently extract meaningful relationships from categorical data, and adjust robotic path-planning based on real-time inference. This dissertation seeks to fill the gap in streaming statistical models for sparse, high-dimensional categorical data, in the context of open-ocean phytoplankton community ecology. We begin by exploring the use of existing topic modeling approaches for plankton community characterization. Topic models are compared to standard ecological techniques for dimensionality reduction. The increased fidelity and expressiveness of the topic modeling approach allows for greater resolution of plankton co-occurrence relationships. By analyzing these relationships and ocean physics in and around a retentive eddy, the source of phytoplankton variability is traced to storm-driven advection on the ocean surface. We conclude that topic models offer unique insights into the causal mechanisms underlying plankton community variability. Next, we turn our focus to the development of a streaming belief model for categorical path planning. Such a model must be capable of predicting in regions without data, and it must be able to process streaming data in a computationally efficient manner. We introduce the Gaussian Dirichlet Random Field model, a novel topic model with spatially continuous latent log-probabilities. In addition to producing a more accurate model than the state-of-the-art in locations with data, the Gaussian Dirichlet Random Field model can interpolate and extrapolate.
The model is initially presented with a batch hybrid Markov chain Monte Carlo inference procedure. We then develop a streaming, fully variational inference approach, called Streaming Gaussian Dirichlet Random Fields, which satisfies both the prediction and efficiency requirements for path planning belief models. In-silico experiments demonstrate the ability of this model to accurately map latent co-occurrence patterns. Comparisons to a standard Gaussian process on both path-planning and observation-mapping tasks show that the ability of Streaming Gaussian Dirichlet Random Fields to leverage additional categorical observations enables superior performance.
dc.description.sponsorship This work was based on material supported by a National Defense Science and Engineering Graduate fellowship and an internal WHOI fellowship, as well as by grant 561126 from the Simons Foundation, NSF-NRI Award Number 1734400, NSF-NRI Award Number 2133029, NSF-LTER Award Number 1655686, grant 80NSSC17K0700 from the NASA Ocean Biology and Biogeochemistry program, and the Woods Hole Oceanographic Institution’s Ocean Twilight Zone Project funded as part of the Audacious Project housed at TED.
dc.identifier.citation San Soucie, J. E. (2024). Inference and robotic path planning over high dimensional categorical observations [Doctoral thesis, Massachusetts Institute of Technology and Woods Hole Oceanographic Institution]. Woods Hole Open Access Server. https://doi.org/10.1575/1912/69627
dc.identifier.doi 10.1575/1912/69627
dc.identifier.uri https://hdl.handle.net/1912/69627
dc.language.iso en_US
dc.publisher Massachusetts Institute of Technology and Woods Hole Oceanographic Institution
dc.relation.ispartofseries WHOI Theses
dc.rights ©2024 John E. San Soucie. The author hereby grants to MIT and WHOI a nonexclusive, worldwide, irrevocable, royalty-free license to exercise any and all rights under copyright, including to reproduce, preserve, distribute and publicly display copies of the thesis, or release the thesis under an open-access license.
dc.subject Path planning
dc.subject Bayesian modeling
dc.subject Phytoplankton ecology
dc.title Inference and robotic path planning over high dimensional categorical observations
dc.type Thesis
dspace.entity.type Publication
relation.isAuthorOfPublication 30ea3c44-132e-47c6-b0f1-6b99bc9d2f8a
relation.isAuthorOfPublication.latestForDiscovery 30ea3c44-132e-47c6-b0f1-6b99bc9d2f8a
Files
Original bundle
Name: San_Soucie_Thesis.pdf
Size: 17.19 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.88 KB
Format: Item-specific license agreed upon to submission