
NetiNeti : Discovery of Scientific Names from Text Using Machine Learning Methods Table 2

WHOAS at MBLWHOI Library


dc.contributor.author Akella, Lakshmi Manohar
dc.date.accessioned 2012-01-27T20:19:07Z
dc.date.available 2012-01-27T20:19:07Z
dc.date.issued 2012-01-27
dc.identifier.uri http://hdl.handle.net/1912/5002
dc.description A comparison of NetiNeti, TaxonFinder and the FAT tool on the American Seashell Book (http://www.biodiversitylibrary.org/item/31699) is presented in Table 2. The FAT approach has lower precision and recall than the NetiNeti and TaxonFinder approaches on this corpus. The names marked up by the FAT tool were compared with the manual markup; 869 of the names identified by FAT did not match the manually marked-up set. Most of these unmatched names are species epithets with authorship information. We further analyzed a random sample of 100 of these 869 names and examined the genus information interpreted by the tool in the marked-up tags. Of the 100 mismatched names, 32 have correctly interpreted genus names; the remaining 68 are genuine false positives with incorrect genus tags. Extrapolating from this sample, we estimated that 278 of the 869 are correct identifications, and the adjusted precision and recall values for the FAT approach are summarized in Table 2. For many of the genuine false positives, the FAT tool tags the species epithet but does not seem to recognize the genus name immediately preceding it. en_US
dc.description.abstract A scientific name for an organism can be associated with almost all biological data. Name identification is an important step in many text mining tasks that aim to extract useful information from biological, biomedical and biodiversity text sources. A scientific name acts as an important metadata element for linking biological information. We present NetiNeti, a machine-learning-based approach for the identification and discovery of scientific names. The system implementing the approach can be accessed at http://namefinding.ubio.org. We present the comparison results of various machine learning algorithms on our annotated corpus. Naïve Bayes and Maximum Entropy with Generalized Iterative Scaling (GIS) parameter estimation are the top two performing algorithms. en_US
dc.format.mimetype text/plain
dc.relation.ispartof http://hdl.handle.net/1912/6236
dc.subject Precision and recall values en_US
dc.title NetiNeti : Discovery of Scientific Names from Text Using Machine Learning Methods Table 2 en_US
dc.type Dataset en_US
dc.identifier.doi 10.1575/1912/5002
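
The adjustment described in dc.description can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the counts of matched names and manually marked-up names used in the usage lines are placeholder assumptions, since this record does not state them. Only the figures 32, 100, 869 and 278 come from the description. The recall formula also assumes that every recovered name corresponds to a name in the manual markup.

```python
def extrapolate_correct(sample_correct, sample_size, unmatched_total):
    """Scale the fraction judged correct in a random sample up to the
    full set of unmatched names (rounded to the nearest whole name)."""
    return round(sample_correct / sample_size * unmatched_total)


def adjusted_precision_recall(matched, unmatched_total, gold_total, recovered):
    """Recompute precision and recall after counting `recovered`
    unmatched names as correct identifications.

    matched         -- names that matched the manual markup (assumed input)
    unmatched_total -- names that did not match the manual markup
    gold_total      -- names in the manual markup (assumed input)
    recovered       -- unmatched names estimated to be correct
    """
    true_positives = matched + recovered
    predicted = matched + unmatched_total
    precision = true_positives / predicted
    # Assumption: each recovered name maps to one manually marked-up name.
    recall = true_positives / gold_total
    return precision, recall


# Figures from the description: 32 of 100 sampled names were correct,
# out of 869 unmatched names in total.
recovered = extrapolate_correct(32, 100, 869)
print(recovered)  # 278

# Placeholder counts (NOT from the source) to show the call shape only.
precision, recall = adjusted_precision_recall(
    matched=1000, unmatched_total=869, gold_total=2000, recovered=recovered
)
print(precision, recall)
```

The extrapolation step reproduces the estimate of 278 given in the description; the precision and recall printed by the last lines depend entirely on the placeholder counts and are not the values reported in Table 2.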

