The Frictionless Data Package : data containerization for addressing big data challenges [poster]

dc.contributor.author Shepherd, Adam
dc.contributor.author Fils, Douglas
dc.contributor.author Kinkade, Danie
dc.contributor.author Saito, Mak A.
dc.date.accessioned 2018-02-15T14:51:13Z
dc.date.available 2018-02-15T14:51:13Z
dc.date.issued 2018-02-15
dc.description Presented at AGU Ocean Sciences, 11 - 16 February 2018, Portland, OR en_US
dc.description.abstract At the Biological and Chemical Oceanography Data Management Office (BCO-DMO), Big Data challenges have been steadily increasing. Data submissions have grown in size as instrumentation improves, and complex data types are sometimes stored across different repositories. This signals a paradigm shift: data and information that are meant to be tightly coupled, and that have traditionally been stored under the same roof, are now distributed across repositories and data stores. For domain-specific repositories like BCO-DMO, a new mechanism for assembling data, metadata, and supporting documentation is needed. Traditionally, data repositories have relied on human involvement throughout discovery and access workflows. A person could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation, and distributed storage could be communicated in text that a person could read and understand. However, as machines play larger roles in the discovery and access of data, distributed resources must be described and packaged in ways that fit into machine-automated discovery and access workflows, so that the end user can assess fitness for purpose. Once machines have recommended a data resource as relevant to an investigator's needs, the data should be easy to integrate into that investigator's toolkits for analysis and visualization. BCO-DMO is exploring the idea of data containerization, or packaging data and related information for easier transport, interpretation, and use. Data containerization reduces friction both for data repositories trying to describe complex data resources and for end users trying to access data with their own toolkits. Among the data containerization approaches surveyed, the Frictionlessdata Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar solutions. This presentation will focus on these advantages and on how the Frictionlessdata Data Package addresses a number of real-world use cases in data discovery, access, analysis, and visualization in the age of Big Data. en_US
dc.description.sponsorship NSF #1435578, NSF #1639714 en_US
dc.identifier.uri https://hdl.handle.net/1912/9577
dc.language.iso en_US en_US
dc.rights Attribution 4.0 International *
dc.rights.uri http://creativecommons.org/licenses/by/4.0/ *
dc.subject Frictionless Data en_US
dc.subject Data management en_US
dc.subject Data exchange en_US
dc.subject Data Transport en_US
dc.subject Distributed data en_US
dc.subject Data tools en_US
dc.subject Big data en_US
dc.title The Frictionless Data Package : data containerization for addressing big data challenges [poster] en_US
dc.type Presentation en_US
dspace.entity.type Publication
relation.isAuthorOfPublication acaa04eb-34c3-4dcd-a8a7-e2a6c525e6cb
relation.isAuthorOfPublication 0fd499a5-2c8f-4e73-afd8-b33db071dd97
relation.isAuthorOfPublication cb145654-8987-45bf-8412-902f2c36b648
relation.isAuthorOfPublication c4bdb97f-7a7b-4b96-8441-962b9ac43442
relation.isAuthorOfPublication.latestForDiscovery acaa04eb-34c3-4dcd-a8a7-e2a6c525e6cb
Files
Original bundle
Name: FrictionlessData_Addressing-Big-Data-Challenges.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.89 KB
Format: Item-specific license agreed upon to submission