The Frictionless Data Package : data containerization for automated scientific workflows [poster]
Shepherd, Adam; Fils, Douglas; Kinkade, Danie; Saito, Mak A.
As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical questions facing data repositories is how data and supporting materials should be packaged for consumption. Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation. In attempts to shorten the time to science and to access data resources across many disciplines, expectations that machines will mediate discovery and access are challenging data repository infrastructure. The challenge is to deliver data and information in ways that let machines make better decisions by understanding the data and metadata of many data types. Additionally, once machines have recommended a data resource as relevant to an investigator's needs, the data resource should be easy to integrate into that investigator's toolkits for analysis and visualization. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) supports NSF-funded OCE and PLR investigators with their projects' data management needs. These needs involve a number of varying data types, some of which require multiple files in differing formats. Presently, BCO-DMO describes these data types, and the important relationships among each type's data files, through human-readable documentation on web pages. Machines accessing data files directly from BCO-DMO could overlook this documentation and misinterpret the data. Instead, BCO-DMO is exploring the idea of data containerization: packaging data and related information for easier transport, interpretation, and use.
Within the landscape of data containerization, the Frictionless Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar solutions. This presentation will focus on these advantages and on how the Frictionless Data Package addresses a number of real-world use cases in data discovery, access, analysis, and visualization.
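As an illustration of what packaging data and related information can look like, the sketch below builds a minimal Data Package descriptor (the content of a `datapackage.json` file) using only Python's standard library. The package name, file paths, and column names are hypothetical and are not taken from any BCO-DMO dataset; the descriptor keys (`name`, `resources`, `path`, `format`, `schema`, `fields`, `type`) follow the Data Package and Table Schema specifications.

```python
import json


def build_descriptor(name, resources):
    """Return a Data Package descriptor as a plain dict."""
    return {"name": name, "resources": resources}


def field_names(descriptor, resource_name):
    """List the column names declared for one resource (empty if it has no schema)."""
    for resource in descriptor["resources"]:
        if resource["name"] == resource_name:
            fields = resource.get("schema", {}).get("fields", [])
            return [field["name"] for field in fields]
    raise KeyError(resource_name)


# Hypothetical tabular resource: a CSV file whose columns are described
# by a schema, so a machine does not have to guess names or types.
ctd_resource = {
    "name": "ctd-profiles",
    "path": "data/ctd_profiles.csv",
    "format": "csv",
    "schema": {
        "fields": [
            {"name": "station", "type": "string"},
            {"name": "depth_m", "type": "number"},
            {"name": "temperature_c", "type": "number"},
        ]
    },
}

# Hypothetical supporting document shipped in the same package.
docs_resource = {"name": "methods", "path": "docs/methods.pdf", "format": "pdf"}

descriptor = build_descriptor(
    "example-cruise-dataset", [ctd_resource, docs_resource]
)

# Serialized form, suitable for saving as datapackage.json next to the files.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Because the data file, its schema, and its supporting document are listed together in one machine-readable descriptor, a client can call a helper such as `field_names(descriptor, "ctd-profiles")` to learn the column names without reading a web page.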
Presented at the Fall AGU Meeting, New Orleans, LA, 11-15 December 2017
Suggested Citation
Presentation: Shepherd, Adam, Fils, Douglas, Kinkade, Danie, Saito, Mak A., "The Frictionless Data Package : data containerization for automated scientific workflows [poster]", Presented at the Fall AGU Meeting, New Orleans, LA, 11-15 December 2017, https://hdl.handle.net/1912/9418
Showing items related by title, author, creator and subject.
The Biodiversity Heritage Library : advancing metadata practices in a collaborative digital library. Pilsk, Suzanne C.; Person, Matthew A.; deVeer, Joseph M.; Furfey, John F.; Kalfatovic, Martin R. (Taylor & Francis, 2010-04). The Biodiversity Heritage Library is an open access digital library of taxonomic literature, forming a single point of access to this collection for use by a worldwide audience of professional taxonomists, as well as ...
Shepherd, Adam; Chandler, Cynthia L.; Arko, Robert A.; Fils, Douglas; Kinkade, Danie (2017-05-31). One of the central incentives of deploying linked open data is the opportunity to leverage the linkages between source datasets to retrieve related information. The Biological and Chemical Oceanography Data Management ...
Toward cyberinfrastructure to facilitate collaboration and reproducibility for marine integrated ecosystem assessments. Beaulieu, Stace E.; Fox, Peter A.; Di Stefano, Massimo; Maffei, Andrew R.; West, Patrick; Hare, Jonathan A.; Fogarty, Michael J. (2016-10-20). There is a growing need for cyberinfrastructure to support science-based decision making in management of natural resources. In particular, our motivation was to aid the development of cyberinfrastructure for Integrated ...