Extraction of mitigation-related text from Endangered Species Act documents using machine learning: a case study
Arun Varghese,
Kasey Allen,
George Agyeman-Badu,
Jennifer Haire and
Rebecca Madsen
Additional contact information
Arun Varghese: ICF
Kasey Allen: ICF
George Agyeman-Badu: ICF
Jennifer Haire: ICF
Rebecca Madsen: Electric Power Research Institute
Environment Systems and Decisions, 2022, vol. 42, issue 1, 63-74
Abstract:
Various industrial and development projects have the potential to adversely affect threatened and endangered species and their habitats. The federal Endangered Species Act (ESA) requires preparation of a biological assessment or habitat conservation plan before federal agencies can authorize, through decision documents and permits, unintentional and otherwise prohibited “take” (i.e., harm) of listed species. These documents describe the potential effects of proposed projects on listed species and include measures to mitigate those effects. Collectively, these assessments, plans, decision documents, and permits (termed ESA documents in our study) are valuable for identifying approved mitigation options that could apply to future projects. However, owing to the volume, length, and complexity of these documents, manual review would be time- and labor-intensive. In this study, we apply three supervised machine learning algorithms, including two based on state-of-the-art transfer learning, to develop and evaluate predictive models capable of extracting mitigation-related text from ESA documents. The machine learning models were developed based on a training dataset that was created as part of this study. The best-performing model showed an estimated ROC-AUC score of 0.98 and a precision-recall AUC score of 0.86 during cross-validation, indicating great potential for effectively extracting mitigation-related content from existing documents. To illustrate the utility of this technology, we present a simulated case study application in which the use of pretrained machine learning models capable of recognizing mitigation measures, coupled with a large historical corpus of ESA documents and keyword filters, provided a means to rapidly assess the commonly used mitigation measures for a given species. While this technology did not eliminate the requirement for biological expertise, it did allow for rapid scoping assessments and could serve as a supporting resource even for experienced biologists.
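The two scores reported in the abstract, ROC-AUC and precision-recall AUC, can be computed from any classifier's ranked output. The following is a minimal sketch in plain Python and is not the authors' implementation: the function names, the toy labels, and the toy scores are all hypothetical, and average precision is used here as one common summary of the precision-recall curve (the abstract does not say which estimator was used).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC computed as the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: the mean of the precision values measured at the
    rank of each positive example. This simple version assumes no tied scores."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Hypothetical toy data: 1 = sentence describes a mitigation measure, 0 = it does not.
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
# Hypothetical classifier scores for the same ten sentences.
toy_scores = [0.90, 0.80, 0.20, 0.70, 0.10, 0.30, 0.05, 0.40, 0.35, 0.15]

print("ROC-AUC:", round(roc_auc(toy_labels, toy_scores), 3))
print("Average precision:", round(average_precision(toy_labels, toy_scores), 3))
```

Because precision depends on how many negatives are ranked above each positive, the precision-recall summary is generally more sensitive to class imbalance than ROC-AUC. A gap between the two scores, such as the one reported above, often arises when the target text is a minority of the corpus, although the abstract does not state the class balance.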
Keywords: Endangered Species Act; Text mining; Machine learning; Artificial intelligence; Natural language processing; BERT
Date: 2022
Downloads:
http://link.springer.com/10.1007/s10669-021-09830-2 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:envsyd:v:42:y:2022:i:1:d:10.1007_s10669-021-09830-2
Ordering information: This journal article can be ordered from
https://www.springer.com/journal/10669
DOI: 10.1007/s10669-021-09830-2