Clark Dru

No Thumbnail Available
Last Name
Clark
First Name
Dru
ORCID

Search Results

Now showing 1 - 2 of 2
  • Presentation
    Lessons learned from 104 years of mobile observatories [poster]
    ( 2007-12-10) Miller, Stephen P. ; Neiswender, Caryn ; Clark, Dru ; Raymond, Lisa ; Rioux, Margaret A. ; Norton, Cathy N. ; Detrick, Robert S. ; Helly, John ; Sutton, Don ; Weatherford, John
    As the oceanographic community ventures into a new era of integrated observatories, it may be helpful to look back on the era of "mobile observatories" to see what Cyberinfrastructure lessons might be learned. For example, SIO has been operating research vessels for 104 years, supporting a wide range of disciplines: marine geology and geophysics, physical oceanography, geochemistry, biology, seismology, ecology, fisheries, and acoustics. In the last 6 years progress has been made with diverse data types, formats and media, resulting in a fully-searchable online SIOExplorer Digital Library of more than 800 cruises (http://SIOExplorer.ucsd.edu). Public access to SIOExplorer is considerable, with 795,351 files (206 GB) downloaded last year. During the last 3 years the efforts have been extended to WHOI, with a "Multi-Institution Testbed for Scalable Digital Archiving" funded by the Library of Congress and NSF (IIS 0455998). The project has created a prototype digital library of data from both institutions, including cruises, Alvin submersible dives, and ROVs. In the process, the team encountered technical and cultural issues that will be facing the observatory community in the near future. Technological Lessons Learned: Shipboard data from multiple institutions are extraordinarily diverse, and provide a good training ground for observatories. Data are gathered from a wide range of authorities, laboratories, servers and media, with little documentation. Conflicting versions exist, generated by alternative processes. Domain- and institution-specific issues were addressed during initial staging. Data files were categorized and metadata harvested with automated procedures. With our second-generation approach to staging, we achieve higher levels of automation with greater use of controlled vocabularies. Database and XML- based procedures deal with the diversity of raw metadata values and map them to agreed-upon standard values, in collaboration with the Marine Metadata Interoperability (MMI) community. All objects are tagged with an expert level, thus serving an educational audience, as well as research users. After staging, publication into the digital library is completely automated. The technical challenges have been largely overcome, thanks to a scalable, federated digital library architecture from the San Diego Supercomputer Center, implemented at SIO, WHOI and other sites. The metadata design is flexible, supporting modular blocks of metadata tailored to the needs of instruments, samples, documents, derived products, cruises or dives, as appropriate. Controlled metadata vocabularies, with content and definitions negotiated by all parties, are critical. Metadata may be mapped to required external standards and formats, as needed. Cultural Lessons Learned: The cultural challenges have been more formidable than expected. They became most apparent during attempts to categorize and stage digital data objects across two institutions, each with their own naming conventions and practices, generally undocumented, and evolving across decades. Whether the questions concerned data ownership, collection techniques, data diversity or institutional practices, the solution involved a joint discussion with scientists, data managers, technicians and archivists, working together. Because metadata discussions go on endlessly, significant benefit comes from dictionaries with definitions of all community-authorized metadata values.
  • Article
    Rolling Deck to Repository: supporting the marine science community with data management services from academic research expeditions
    (Frontiers Media, 2022-12-08) Carbotte, Suzanne M. ; O’Hara, Suzanne ; Stocks, Karen ; Clark, P. Dru ; Stolp, Laura ; Smith, Shawn R. ; Briggs, Kristen ; Hudak, Rebecca ; Miller, Emily ; Olson, Chris J. ; Shane, Neville ; Uribe, Rafael ; Arko, Robert ; Chandler, Cynthia L. ; Ferrini, Vicki ; Miller, Stephen P. ; Doyle, Alice ; Holik, James
    Direct observations of the oceans acquired on oceanographic research ships operated across the international community support fundamental research into the many disciplines of ocean science and provide essential information for monitoring the health of the oceans. A comprehensive knowledge base is needed to support the responsible stewardship of the oceans with easy access to all data acquired globally. In the United States, the multidisciplinary shipboard sensor data routinely acquired each year on the fleet of coastal, regional and global ranging vessels supporting academic marine research are managed by the Rolling Deck to Repository (R2R, rvdata.us) program. With over a decade of operations, the R2R program has developed a robust routinized system to transform diverse data contributions from different marine data providers into a standardized and comprehensive collection of global-ranging observations of marine atmosphere, ocean, seafloor and subseafloor properties that is openly available to the international research community. In this article we describe the elements and framework of the R2R program and the services provided. To manage all expeditions conducted annually, a fleet-wide approach has been developed using data distributions submitted from marine operators with a data management workflow designed to maximize automation of data curation. Other design goals are to improve the completeness and consistency of the data and metadata archived, to support data citability, provenance tracking and interoperable data access aligned with FAIR (findable, accessible, interoperable, reusable) recommendations, and to facilitate delivery of data from the fleet for global data syntheses. Findings from a collection-level review of changes in data acquisition practices and quality over the past decade are presented. Lessons learned from R2R operations are also discussed including the benefits of designing data curation around the routine practices of data providers, approaches for ensuring preservation of a more complete data collection with a high level of FAIRness, and the opportunities for homogenization of datasets from the fleet so that they can support the broadest re-use of data across a diverse user community.