BCO-DMO Publications & Presentations

Recent Submissions

  • Presentation
    Fitting square pegs into a round hole. Curating heterogeneous oceanographic data at BCO-DMO
    (Woods Hole Oceanographic Institution, 2024-02-22) Soenen, Karen ; Kinkade, Danie ; Shepherd, Adam ; Saito, Mak A. ; Gerlach, Dana ; Merchant, Lynne M. ; Newman, Sawyer ; Rauch, Shannon ; York, Amber D.
    BCO-DMO is a domain-specific repository containing 18 years of curated, heterogeneous oceanographic data. Data managers are at the core of the repository, applying the F.A.I.R. principles to every dataset coming in. This talk steers the audience through such a curated dataset, covering the advancements and challenges that come with domain curation.
  • Presentation
    Science on Schema.org
    (Woods Hole Oceanographic Institution, 2023-07-27) Shepherd, Adam
    Presented at ZB Med Research Colloquia 2023, Virtual, July 27, 2023
  • Presentation
    Science on Schema.org
    (Woods Hole Oceanographic Institution, 2023-04-05) Shepherd, Adam
    Presented Science on Schema.org at the Ontology Summit 2023. Conference Website: https://ontologforum.org/index.php/OntologySummit2023
  • Presentation
    BCO-DMO: Surfing the Crests and Troughs of Data Sharing
    (Woods Hole Oceanographic Institution, 2022-03-02) Kinkade, Danie
    Many of the challenges that researchers, and the repositories through which they share their data, currently face in sharing oceanographic data are cultural rather than technical. This talk presents an overview of obstacles and opportunities related to data sharing within the oceanographic community.
  • Article
    Knowledge graphs to support real‐time flood impact evaluation
    (Association for the Advancement of Artificial Intelligence, 2022-03-31) Johnson, J. Michael ; Narock, Tom ; Singh-Mohudpur, Justin ; Fils, Douglas ; Clarke, Keith C. ; Saksena, Siddharth ; Shepherd, Adam ; Arumugam, Sankar ; Yeghiazarian, Lilit
    A digital map of the built environment is useful for a range of economic, emergency response, and urban planning exercises such as helping find places in app-driven interfaces, helping emergency managers know what locations might be impacted by a flood or fire, and helping city planners proactively identify vulnerabilities and plan for how a city is growing. Since its inception in 2004, OpenStreetMap (OSM) has set the benchmark for open geospatial data and has become a key player in the public, research, and corporate realms. Following the foundations laid by OSM, several open geospatial products describing the built environment have blossomed including the Microsoft USA building footprint layer and the OpenAddress project. Each of these products uses different data collection methods ranging from public contributions to artificial intelligence, and, taken together, they could provide a comprehensive description of the built environment. Yet, these projects are still siloed, and their variety makes integration and interoperability a major challenge. Here, we document an approach for merging data from these three major open building datasets and outline a workflow that is scalable to the continental United States (CONUS). We show how the results can be structured as a knowledge graph over which machine learning models are built. These models can help propagate and complete unknown quantities that can then be leveraged in disaster management.
  • Presentation
    Biological and Chemical Oceanography Data Management Office: Supporting a New Vision for Adaptive Management of Oceanographic Data [poster]
    (Woods Hole Oceanographic Institution, 2022-06-21) Shepherd, Adam ; Gerlach, Dana ; Heyl, Taylor ; Kinkade, Danie ; Nagala, Shravani ; Newman, Sawyer ; Rauch, Shannon ; Saito, Mak A. ; Schloer, Conrad ; Soenen, Karen ; Wiebe, Peter ; York, Amber
    An unparalleled data catalog of well-documented, interoperable oceanographic data and information, openly accessible to all end-users through an intuitive web-based interface for the purposes of advancing marine research, education, and policy. Conference Website: https://web.whoi.edu/ocb-workshop/
  • Presentation
    Use of Controlled Vocabularies: potential applications for ocean time series data
    (Woods Hole Oceanographic Institution, 2022-02-25) Shepherd, Adam
    Use of controlled vocabularies: potential applications for ocean time series data
  • Presentation
    How can BCO-DMO help with your oceanographic data?
    (Woods Hole Oceanographic Institution, 2021-12-10) Soenen, Karen ; Gerlach, Dana ; Haskins, Christina ; Heyl, Taylor ; Kinkade, Danie ; Newman, Sawyer ; Rauch, Shannon ; Saito, Mak A. ; Shepherd, Adam ; Wiebe, Peter ; York, Amber D.
    BCO-DMO curates a database of research-ready data spanning the full range of marine ecosystem related measurements including in-situ and remotely sensed observations, experimental and model results, and synthesis products. We work closely with investigators to publish data and information from research projects supported by the National Science Foundation (NSF), as well as those supported by state, private, and other funding sources. BCO-DMO supports all phases of the data life cycle and ensures open access of well-curated project data and information. We employ F.A.I.R. Principles that comprise a set of values intended to guide data producers and publishers in establishing good data management practices that will enable effective reuse.
  • Article
    Geoscience data publication: practices and perspectives on enabling the FAIR guiding principles
    (Royal Meteorological Society, 2021-05-04) Kinkade, Danie ; Shepherd, Adam
    Introduced in 2016, the FAIR Guiding Principles endeavour to significantly improve the process of today's data-driven research. The Principles present a concise set of fundamental concepts that can facilitate the findability, accessibility, interoperability and reuse (FAIR) of digital research objects by both machines and human beings. The emergence of FAIR has initiated a flurry of activity within the broader data publication community, yet the principles are still not fully understood by many community stakeholders. This has led to challenges such as misinterpretation and co-opted use, along with persistent gaps in current data publication culture, practices and infrastructure that need to be addressed to achieve a FAIR data end-state. This paper presents an overview of the practices and perspectives related to the FAIR Principles within the Geosciences and offers discussion on the value of the principles in the larger context of what they are trying to achieve. The authors of this article recommend using the principles as a tool to bring awareness to the types of actions that can improve the practice of data publication to meet the needs of all data consumers. FAIR Guiding Principles should be interpreted as an aspirational guide to focus behaviours that lead towards a more FAIR data environment. The intentional discussions and incremental changes that bring us closer to these aspirations provide the best value to our community as we build the capacity that will support and facilitate new discovery of earth systems.
  • Presentation
    Capturing Provenance of Data Curation at BCO-DMO
    (Woods Hole Oceanographic Institution, 2020-11-09) Shepherd, Adam ; York, Amber ; Schloer, Conrad ; Kinkade, Danie ; Rauch, Shannon ; Copley, Nancy ; Gerlach, Dana ; Haskins, Christina ; Soenen, Karen ; Saito, Mak A. ; Wiebe, Peter
    At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) in an effort to improve its data curation process efficiency. In doing so, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions to final data products. Because these pipelines are defined using a declarative language, they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While there may still be some curation steps that cannot be easily automated, this method is a step towards reproducible transforms that bridge the original data submission to its published state in machine-actionable ways that benefit the research community through transparency in the data curation process. BCO-DMO has built a user interface on top of these modular tools to make it easier for data managers to process submissions, reuse existing workflows, and make transparent the added value of domain-specific data curation.
  • Presentation
    Share Your Thoughts [poster]
    (Woods Hole Oceanographic Institution, 2020-02-21) Haskins, Christina ; Biddle, Matt ; Copley, Nancy J. ; Rauch, Shannon ; Soenen, Karen ; York, Amber ; Kinkade, Danie ; Saito, Mak A. ; Shepherd, Adam ; Wiebe, Peter
    Oceanographic data, when well-documented and stewarded toward preservation, have the potential to accelerate new science and facilitate our understanding of complex natural systems. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is funded by the NSF to document and manage marine biological, chemical, physical, and biogeochemical data, ensuring their discovery and access, and facilitating their reuse. The task of curating and providing access to research data is a collaborative process, with associated actors and critical activities occurring throughout the data’s life cycle. BCO-DMO supports all phases of the data life cycle and works closely with investigators to ensure open access of well-documented project data and information. Supporting this curation process is a flexible cyberinfrastructure that provides the means for data submission, discovery, and access, ultimately enabling reuse. Based upon community feedback, this infrastructure is undergoing evaluation and improvement to better meet oceanographic research needs. This poster will introduce the repository and describe some of the strategic enhancements coming to BCO-DMO, and present an opportunity for you to provide feedback on enhancements yet to come. We invite you to think about your own research workflow of searching and accessing new data for research, and to provide your feedback through the poster’s interactive sections. Your input can help BCO-DMO improve its service to the research community.
  • Presentation
    Code and Software: How would you share yours? [poster]
    (Woods Hole Oceanographic Institution, 2020-02-21) Biddle, Matt ; Copley, Nancy ; Haskins, Christina ; Rauch, Shannon ; Soenen, Karen ; York, Amber ; Kinkade, Danie ; Saito, Mak A. ; Shepherd, Adam ; Wiebe, Peter
    BCO-DMO curates earth science data in which models are becoming increasingly important. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is a publicly accessible earth science data repository created to curate, publicly serve (publish), and archive digital data and information from biological, chemical and biogeochemical research conducted in coastal, marine, Great Lakes and laboratory environments. Recently, more and more of the projects submitted to BCO-DMO represent modeling efforts which further increase our knowledge of chemical and biological properties within the ocean ecosystem. We feel the time is at hand for the scientific community to begin a concerted and holistic approach to the curation of code and software.
  • Presentation
    Data Science Training Camp at Woods Hole Oceanographic Institution: Syllabus and slide presentations in 2020
    (Woods Hole Oceanographic Institution, 2020-08-21) Beaulieu, Stace E. ; Raymond, Lisa ; Mickle, Audrey ; Futrelle, Joe ; Symmonds, Nick ; Mazzoli, Roberta ; Brey, Rich ; Kinkade, Danie ; Rauch, Shannon
    With data and software increasingly recognized as scholarly research products, and aiming towards open science and reproducibility, it is imperative for today's oceanographers to learn foundational practices and skills for data management and research computing, as well as practices specific to the ocean sciences. This educational package was developed as a data science training camp for graduate students and professionals in the ocean sciences and implemented at the Woods Hole Oceanographic Institution (WHOI) in 2019 and 2020. Here we provide materials for the 2020 camp which was delivered in-person during two afternoons (total of 8 hours), with two modules per afternoon. We aimed for ~40 participants per camp, with disciplines spanning Earth and life sciences and engineering. Disciplines at each table were mixed on the first afternoon but similar on the second afternoon. Contents of this package include the syllabus and slide presentations for each of the four modules: 1 "Good enough practices in scientific computing," 2 Data management, 3 Software development and research computing, and 4 Best practices in the ocean sciences. The 3rd module is split into two parts. We also include a poster presented at the 2020 Ocean Science Meeting, which has some results from pre- and post-surveys. Funding: The camp was funded by WHOI Academic Programs Office through a Doherty Chair in Education Award, with additional support from WHOI Ocean Informatics Working Group, WHOI Information Services, MBLWHOI Library, the NSF-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO), and an NSF-funded XSEDE Jetstream Education Allocation TG-OCE190011. We also utilized resources from the NSF-funded Pangeo project.
  • Presentation
    Data Management in (Ocean) Sciences – Interactive Class
    (Woods Hole Oceanographic Institution, 2020-02-26) Soenen, Karen ; Harden, Benjamin E.
    Ocean 101, engaging classes to help SEA students understand the frontiers of ocean climate science. This particular class focuses on data management in oceanography. Covered topics are the importance of open data, the data life cycle and F.A.I.R. Principles. The interactive part consists of creating the content for a data management plan and applying general data management practices.
  • Presentation
    Data Help Desk BCO-DMO Lightning Talk
    (Woods Hole Oceanographic Institution, 2020-02-18) Biddle, Matt ; Shepherd, Adam ; Kinkade, Danie ; Haskins, Christina ; Soenen, Karen ; Rauch, Shannon ; Copley, Nancy ; York, Amber ; Schloer, Conrad ; Saito, Mak A. ; Wiebe, Peter
    BCO-DMO is the Biological and Chemical Oceanography Data Management Office. We help oceanography researchers who are funded by the National Science Foundation’s (NSF's) Division of Ocean Sciences' (OCE) Biological or Chemical Oceanography Sections or the Division of Polar Programs' Antarctic Organisms & Ecosystems Program manage their data, making them accessible over the internet. This lightning talk gives a brief overview of who we are, who we work with, and the types of data we manage.
  • Presentation
    Capturing Provenance of Data Curation at BCO-DMO
    (Woods Hole Oceanographic Institution, 2020-05-15) Shepherd, Adam ; York, Amber ; Schloer, Conrad ; Kinkade, Danie ; Rauch, Shannon ; Biddle, Matt ; Copley, Nancy ; Haskins, Christina ; Soenen, Karen ; Saito, Mak A. ; Wiebe, Peter
    At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) in an effort to improve its data curation process efficiency. In doing so, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions to final data products. Because these pipelines are defined using a declarative language, they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While there may still be some curation steps that cannot be easily automated, this method is a step towards reproducible transforms that bridge the original data submission to its published state in machine-actionable ways that benefit the research community through transparency in the data curation process. BCO-DMO has built a user interface on top of these modular tools to make it easier for data managers to process submissions, reuse existing workflows, and make transparent the added value of domain-specific data curation.
  • Presentation
    Sharing Data Through the Biological and Chemical Oceanography Data Management Office [talk]
    (Woods Hole Oceanographic Institution, 2020-01-15) Kinkade, Danie
    This talk provides an overview of the Biological and Chemical Oceanography Data Management Office and the collaborative data sharing process that occurs between individual investigators and the BCO-DMO repository. The presentation includes background on the repository, what to expect after submitting your data, and helpful data management practices that can streamline data sharing and support open science.
  • Working Paper
    NSF EarthCube Workshop for Shipboard Ocean Time Series Data Meeting Report
    (Woods Hole Oceanographic Institution, 2020-02) Benway, Heather M. ; Buck, Justin J. H. ; Fujieki, Lance ; Kinkade, Danie ; Lorenzoni, Laura ; Schildhauer, Mark ; Shepherd, Adam ; White, Angelicque
    Prior to the OceanObs’19 Meeting, the Ocean Carbon and Biogeochemistry (OCB) Project Office planned and hosted an NSF EarthCube Workshop focused on shipboard ocean time series data (https://www.us-ocb.org/earthcube-workshop-ocean-time-series-data/). Data synthesis and modeling efforts across ocean time series represent important and necessary steps forward in broadening our view of a changing ocean, and maximizing the return on our continued investment in these programs. Despite the scientific insights and technology advances of the past couple of decades, significant barriers remain that hinder important synthesis work across time series. This workshop convened 37 participants, including seagoing oceanographers, data managers, and experts in data science and informatics. The goal of the workshop was to identify key ocean time series data challenges related to access and discoverability, metadata reporting, interoperability across databases, and broadening users, and to develop recommendations to address those challenges. The workshop adopted the FAIR (Findable, Accessible, Interoperable, Reusable; Wilkinson et al., 2016) Guiding Principles to frame these issues, and included presentations on existing data models and use of controlled vocabularies, guidelines and frameworks for conducting data synthesis and establishing community best practices, and existing and planned ocean time series data products.
  • Presentation
    BCO-DMO's migration to ERDDAP
    (Woods Hole Oceanographic Institution, 2019-10-24) Biddle, Matt
    As a domain-specific repository, BCO-DMO supports data stewardship throughout the data lifecycle. One key aspect of that data lifecycle is making data and metadata available online in a variety of file formats. This presentation will walk through BCO-DMO's current data serving system, our migration to ERDDAP, and what that might mean for the future. There will be a focus on the nuts-and-bolts of our migration, the benefits of this activity, and some of the difficulties we've encountered along the way.
  • Article
    Ocean FAIR data services
    (Frontiers Media, 2019-08-07) Tanhua, Toste ; Pouliquen, Sylvie ; Hausman, Jessica ; O’Brien, Kevin ; Bricher, Phillippa ; de Bruin, Taco ; Buck, Justin J. H. ; Burger, Eugene ; Carval, Thierry ; Casey, Kenneth S. ; Diggs, Stephen ; Giorgetti, Alessandra ; Glaves, Helen ; Harscoat, Valerie ; Kinkade, Danie ; Muelbert, Jose H. ; Novellino, Antonio ; Pfeil, Benjamin ; Pulsifer, Peter L. ; Van de Putte, Anton ; Robinson, Erin ; Schaap, Dick ; Smirnov, Alexander ; Smith, Neville ; Snowden, Derrick ; Spears, Tobias ; Stall, Shelley ; Tacoma, Marten ; Thijsse, Peter ; Tronstad, Stein ; Vandenberghe, Thomas ; Wengren, Micah ; Wyborn, Lesley ; Zhao, Zhiming
    Well-founded data management systems are of vital importance for ocean observing systems as they ensure that essential data are not only collected but also retained and made accessible for analysis and application by current and future users. Effective data management requires collaboration across activities including observations, metadata and data assembly, quality assurance and control (QA/QC), and data publication that enables local and interoperable discovery and access and secures archiving that guarantees long-term preservation. To achieve this, data should be findable, accessible, interoperable, and reusable (FAIR). Here, we outline how these principles apply to ocean data and illustrate them with a few examples. In recent decades, ocean data managers, in close collaboration with international organizations, have played an active role in the improvement of environmental data standardization, accessibility, and interoperability through different projects, enhancing access to observation data at all stages of the data life cycle and fostering the development of integrated services targeted to research, regulatory, and operational users. As ocean observing systems evolve and an increasing number of autonomous platforms and sensors are deployed, the volume and variety of data increase dramatically. For instance, there are more than 70 data catalogs that contain metadata records for the polar oceans, a situation that makes comprehensive data discovery beyond the capacity of most researchers. To better serve research, operational, and commercial users, more efficient turnaround of quality data in known formats and made available through Web services is necessary. In particular, automation of data workflows will be critical to reduce friction throughout the data value chain. 
    Adhering to the FAIR principles with free, timely, and unrestricted access to ocean observation data is beneficial for the originators, has obvious benefits for users, and is an essential foundation for the development of new services made possible with big data technologies.