Towards capturing data curation provenance using Frictionless Data Package Pipelines [poster]
At domain-specific data repositories, curation that strives for FAIR principles often entails transforming data submissions to improve understanding and reuse. The Biological and Chemical Oceanography Data Management Office (BCO-DMO, https://www.bco-dmo.org) has been adopting the data containerization specification of the Frictionless Data project (https://frictionlessdata.io) to make its data curation process more efficient. As part of this work, BCO-DMO has been using the Frictionless Data Package Pipelines library (https://github.com/frictionlessdata/datapackage-pipelines) to define the processing steps that transform original submissions into final data products. Because these pipelines are defined in a declarative language, they can be serialized into formal provenance data structures using the Provenance Ontology (PROV-O, https://www.w3.org/TR/prov-o/). While some curation steps may still be difficult to automate, this method is a step towards reproducible transforms that link the original data submission to its published state in machine-actionable ways, benefiting the research community by making the data curation process transparent.
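As an illustration of the idea, the following minimal Python sketch walks a declarative pipeline definition and emits a PROV-O-flavoured JSON-LD document. It is not BCO-DMO's implementation. The step layout (`run` plus `parameters`) follows the general shape of a Data Package Pipelines spec, but the pipeline id, file names, processor names, the `urn:example:` identifiers, and the `ex:parameters` property are all assumptions made for the example.

```python
import json

# Hypothetical pipeline definition, shaped like a parsed pipeline spec.
# The id, file names, and processor names are illustrative assumptions.
PIPELINE = {
    "id": "example-dataset",
    "pipeline": [
        {"run": "load", "parameters": {"from": "submission.csv"}},
        {"run": "set_types", "parameters": {"types": {"depth": {"type": "number"}}}},
        {"run": "dump_to_path", "parameters": {"out-path": "published"}},
    ],
}


def pipeline_to_prov(spec, base="urn:example:"):
    """Map each declared step to a prov:Activity and its result to a prov:Entity."""
    pid = spec["id"]
    # The original submission is the first entity in the provenance chain.
    source = {"@id": f"{base}{pid}/submission", "@type": "prov:Entity"}
    graph = [source]
    previous = source
    for i, step in enumerate(spec["pipeline"], start=1):
        activity = {
            "@id": f"{base}{pid}/step/{i}",
            "@type": "prov:Activity",
            "rdfs:label": step["run"],
            "prov:used": {"@id": previous["@id"]},
            # ex:parameters is a made-up property holding the step's settings.
            "ex:parameters": json.dumps(step.get("parameters", {}), sort_keys=True),
        }
        output = {
            "@id": f"{base}{pid}/state/{i}",
            "@type": "prov:Entity",
            "prov:wasGeneratedBy": {"@id": activity["@id"]},
            "prov:wasDerivedFrom": {"@id": previous["@id"]},
        }
        graph.extend([activity, output])
        previous = output
    return {
        "@context": {
            "prov": "http://www.w3.org/ns/prov#",
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
            "ex": "urn:example:vocab#",
        },
        "@graph": graph,
    }


if __name__ == "__main__":
    print(json.dumps(pipeline_to_prov(PIPELINE), indent=2))
```

Each step consumes the previous state and generates a new one, so the chain of `prov:wasDerivedFrom` links leads from the final data product back to the original submission. Curation steps performed by hand would need to be recorded separately, since they do not appear in the pipeline definition.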
Presented at FORCE2018 Conference, Montreal, Canada, October 10-12, 2018. FORCE: Future of Research Communications and e-Scholarship
Suggested Citation
Presentation: Shepherd, Adam, Schloer, Conrad, York, Amber, Kinkade, Danie, "Towards capturing data curation provenance using Frictionless Data Package Pipelines [poster]", Presented at FORCE2018 Conference, Montreal, Canada, October 10-12, 2018. FORCE: Future of Research Communications and e-Scholarship, DOI:10.5281/zenodo.1451679, https://hdl.handle.net/1912/10631
Showing items related by title, author, creator and subject.
Shepherd, Adam; Rauch, Shannon; Schloer, Conrad; Kinkade, Danie; Biddle, Matt; Copley, Nancy; Saito, Mak A.; Wiebe, Peter; York, Amber (2018-12-14) Data repositories often transform submissions to improve understanding and reuse of data by researchers other than the original submitter. However, scientific workflows built by the data submitters often depend on the ...
Shepherd, Adam; Fils, Douglas; Kinkade, Danie; Saito, Mak A. (2017-12-13) As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical questions facing data repositories is how data and supporting materials should be packaged for ...