Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement

dc.contributor.author Goldstein, Evan B.
dc.contributor.author Buscombe, Daniel
dc.contributor.author Lazarus, Eli
dc.contributor.author Mohanty, Somya D.
dc.contributor.author Rafique, Shah Nafis
dc.contributor.author Anarde, Katherine A.
dc.contributor.author Ashton, Andrew D.
dc.contributor.author Beuzen, Tomas
dc.contributor.author Castagno, Katherine
dc.contributor.author Cohn, Nicholas
dc.contributor.author Conlin, Matthew P.
dc.contributor.author Ellenson, Ashley
dc.contributor.author Gillen, Megan N.
dc.contributor.author Hovenga, Paige A.
dc.contributor.author Over, Jin-Si
dc.contributor.author Palermo, Rose V.
dc.contributor.author Ratliff, Katherine M.
dc.contributor.author Reeves, Ian R. B.
dc.contributor.author Sanborn, Lily H.
dc.contributor.author Straub, Jessamin A.
dc.contributor.author Taylor, Luke A.
dc.contributor.author Wallace, Elizabeth J.
dc.contributor.author Warrick, Jonathan
dc.contributor.author Wernette, Phillipe
dc.contributor.author Williams, Hannah E.
dc.date.accessioned 2021-12-13T22:28:34Z
dc.date.available 2021-12-13T22:28:34Z
dc.date.issued 2021-09-03
dc.description © The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Goldstein, E. B., Buscombe, D., Lazarus, E. D., Mohanty, S. D., Rafique, S. N., Anarde, K. A., Ashton, A. D., Beuzen, T., Castagno, K. A., Cohn, N., Conlin, M. P., Ellenson, A., Gillen, M., Hovenga, P. A., Over, J.-S. R., Palermo, R., Ratliff, K. M., Reeves, I. R. B., Sanborn, L. H., Straub, J. A., Taylor, L. A., Wallace, E. J., Warrick, J., Wernette, P., Williams, H. E. Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement. Earth and Space Science, 8(9), (2021): e2021EA001896, https://doi.org/10.1029/2021EA001896. en_US
dc.description.abstract Classifying images using supervised machine learning (ML) relies on labeled training data (classes or text descriptions, for example) associated with each image. Data-driven models are only as good as the data used for training, which makes high-quality labeled data essential for developing an ML model that has predictive skill. Labeling data is typically a time-consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were each labeled by at least two and as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research. en_US
dc.description.sponsorship The authors gratefully acknowledge support from the U.S. Geological Survey (G20AC00403 to EBG and SDM), NSF (1953412 to EBG and SDM; 1939954 to EBG), Microsoft AI for Earth (to EBG and SDM), The Leverhulme Trust (RPG-2018-282 to EDL and EBG), and an Early Career Research Fellowship from the Gulf Research Program of the National Academies of Sciences, Engineering, and Medicine (to EBG). U.S. Geological Survey researchers (DB, J-SRO, JW, and PW) were supported by the U.S. Geological Survey Coastal and Marine Hazards and Resources Program as part of the response and recovery efforts under congressional appropriations through the Additional Supplemental Appropriations for Disaster Relief Act, 2019 (Public Law 116-20; 133 Stat. 871). en_US
dc.identifier.citation Goldstein, E. B., Buscombe, D., Lazarus, E. D., Mohanty, S. D., Rafique, S. N., Anarde, K. A., Ashton, A. D., Beuzen, T., Castagno, K. A., Cohn, N., Conlin, M. P., Ellenson, A., Gillen, M., Hovenga, P. A., Over, J.-S. R., Palermo, R., Ratliff, K. M., Reeves, I. R. B., Sanborn, L. H., Straub, J. A., Taylor, L. A., Wallace, E. J., Warrick, J., Wernette, P., Williams, H. E. (2021). Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement. Earth and Space Science, 8(9), e2021EA001896. en_US
dc.identifier.doi 10.1029/2021EA001896
dc.identifier.uri https://hdl.handle.net/1912/27811
dc.publisher American Geophysical Union en_US
dc.relation.uri https://doi.org/10.1029/2021EA001896
dc.rights Attribution 4.0 International *
dc.rights.uri http://creativecommons.org/licenses/by/4.0/ *
dc.subject Data labeling en_US
dc.subject Classification en_US
dc.subject Hurricane impacts en_US
dc.subject Machine learning en_US
dc.subject Imagery en_US
dc.subject Data annotation en_US
dc.title Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement en_US
dc.type Article en_US
dspace.entity.type Publication
relation.isAuthorOfPublication 6efd9e62-07ff-4469-be34-af292b7a1ef2
relation.isAuthorOfPublication 45c0b7d9-4af7-41ec-b7f4-1816d51bd0ee
relation.isAuthorOfPublication 72839d75-f2b3-47a5-887c-2c2a101c9097
relation.isAuthorOfPublication 403f7ff4-b6ea-418c-b053-775399bbf6c8
relation.isAuthorOfPublication 8716d250-5f60-4a57-b7f6-04b46b04cfef
relation.isAuthorOfPublication 69123260-3cef-4cc1-8caa-7f127a28fb62
relation.isAuthorOfPublication c7f450d8-ba4e-4f1a-b706-e01f4539f317
relation.isAuthorOfPublication a10ee743-254f-458f-8b69-95c4126bdb85
relation.isAuthorOfPublication 7ef58f8d-830c-4226-ab08-1402c997d459
relation.isAuthorOfPublication 6f1ef68d-f43f-40dd-9997-a98e4a36282a
relation.isAuthorOfPublication f8c9fe8f-cc9c-4ab6-b6c6-4e7d740da6e8
relation.isAuthorOfPublication 02ebdf91-465b-47c6-acba-7189485f6e7b
relation.isAuthorOfPublication fc26a259-29b6-444b-a770-1f87e7d5ead7
relation.isAuthorOfPublication 8aec8e3c-8849-4bbe-83cd-15b35527ae42
relation.isAuthorOfPublication 31ba29b3-a623-437b-9912-70e8e61bf73d
relation.isAuthorOfPublication b69798e8-6a32-4c23-b0b8-f803785a1f3f
relation.isAuthorOfPublication bbbce3f2-06ad-473d-ab5f-93e0c767c0af
relation.isAuthorOfPublication 38dab07c-8301-4310-a07f-2e88a6efd85a
relation.isAuthorOfPublication c5891832-8684-4d1d-8034-701bdc42f77d
relation.isAuthorOfPublication 04d0a9fb-ebe6-402e-81c3-d3ea453e93d4
relation.isAuthorOfPublication 8d0b8878-b98b-45e4-854b-f92f7b4e53e4
relation.isAuthorOfPublication bd9958e1-8a1e-4a6e-9e6f-1677874d0c2c
relation.isAuthorOfPublication ae90f4da-a319-4e55-b362-400637678581
relation.isAuthorOfPublication d5f0b8ab-26db-4703-bd7c-c307d8554bfe
relation.isAuthorOfPublication 6eeff5ae-f490-4c10-bfb9-2456d55ace7f
relation.isAuthorOfPublication.latestForDiscovery 6efd9e62-07ff-4469-be34-af292b7a1ef2
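The abstract in this record reports interrater agreement as percent agreement and Krippendorff's alpha. The following is a minimal sketch of how those two scores can be computed for nominal labels when each image has a varying number of raters. It is not the authors' code: the function names, the pairwise definition of percent agreement, and the sample labels are assumptions made for illustration, and the alpha calculation follows the standard coincidence-matrix formulation for nominal data.

```python
from collections import Counter
from itertools import combinations, permutations


def percent_agreement(units):
    """Share of within-image rater pairs that chose the same label.

    `units` is a list of per-image label lists; None marks a rater who
    did not label that image. (Pairwise pooling is an assumption here.)
    """
    agree = total = 0
    for labels in units:
        present = [x for x in labels if x is not None]
        for a, b in combinations(present, 2):
            agree += int(a == b)
            total += 1
    return agree / total if total else float("nan")


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels, allowing missing ratings."""
    coincidences = Counter()
    for labels in units:
        present = [x for x in labels if x is not None]
        m = len(present)
        if m < 2:
            continue  # an image with a single rating carries no pair information
        # Every ordered pair of ratings in the image is weighted by 1 / (m - 1).
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n < 2:
        return float("nan")

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        return float("nan")  # only one label ever used: alpha is undefined
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical answers to one yes/no question for four images.
    sample = [
        ["yes", "yes", None],
        ["no", "no", "no"],
        ["yes", "no", None],
        ["no", None, "no"],
    ]
    print(percent_agreement(sample))
    print(krippendorff_alpha_nominal(sample))
```

Percent agreement is easy to read but does not correct for agreement expected by chance, whereas alpha does, which is one reason to report the two together.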
Files
Original bundle
Name: 2021EA001896.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format
Description: Article
License bundle
Name: license.txt
Size: 1.88 KB
Format: Item-specific license agreed upon to submission