Big-Data Analysis and Machine Learning Based on Oil Pollution Remediation Cases from CERCLA Database
Hangyu Li,
Ze Zhou,
Tao Long,
Yao Wei,
Jianchun Xu,
Shuyang Liu and
Xiaopu Wang
Additional contact information
Hangyu Li: School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266000, China
Ze Zhou: School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266000, China
Tao Long: State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China
Yao Wei: School of Computer Science & Engineering, South China University of Technology, Guangzhou 510006, China
Jianchun Xu: School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266000, China
Shuyang Liu: School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266000, China
Xiaopu Wang: School of Petroleum Engineering, China University of Petroleum (East China), Qingdao 266000, China
Energies, 2022, vol. 15, issue 15, 1-10
Abstract:
Since 1980, the U.S. Environmental Protection Agency (EPA) has built an open-source database under Superfund, the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA), covering nearly 2000 US soil remediation cases and providing detailed information and references for researchers worldwide to carry out remediation work. However, the cases are relatively independent of each other, so the database as a whole is, to some extent, neither systematic nor instructive. In this study, the basic features of all 144 soil remediation projects in four major oil-producing states (California, Texas, Oklahoma and Alaska) were extracted from the CERCLA database, and the correlations among pollutant species, polluted-site characteristics and the selection of remediation methods were analyzed using traditional and machine learning techniques. The Decision Tree Classifier was selected as the machine learning model. The results showed that the growth of new contaminated sites has slowed in recent years; physical remediation was the most commonly used method, with a probability of application of more than 80%. The presence of benzene, toluene, ethylbenzene and xylene (BTEX) substances and the geographical location of the site were the two most influential factors in the choice of remediation method for a specific site; the maximum weights of these two features reached 0.304 and 0.288, respectively.
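The abstract does not say how the feature weights were computed. As a rough illustration only, the sketch below scores two categorical features by the Gini impurity decrease of a single split, which is the building block of the feature importances that decision tree classifiers commonly report. The records, the feature names `btex` and `state`, and the labels are invented for the example and are not taken from the CERCLA database or from the paper.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(rows, labels, feature):
    """Gini decrease obtained by partitioning the rows on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    n = len(labels)
    weighted = sum(len(group) / n * gini(group) for group in groups.values())
    return gini(labels) - weighted


# Hypothetical toy records (assumption): NOT data from the CERCLA database.
rows = [
    {"btex": 1, "state": "CA"},
    {"btex": 1, "state": "TX"},
    {"btex": 1, "state": "CA"},
    {"btex": 0, "state": "TX"},
    {"btex": 0, "state": "CA"},
    {"btex": 0, "state": "TX"},
]
labels = ["physical", "physical", "physical", "chemical", "biological", "physical"]

# Score each feature by its one-level impurity decrease, then normalize to sum to 1.
gains = {feature: split_gain(rows, labels, feature) for feature in ("btex", "state")}
total = sum(gains.values())
weights = {feature: gain / total for feature, gain in gains.items()}

for feature, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{feature}: {weight:.3f}")
```

On this toy data the hypothetical `btex` feature receives the larger share of the normalized weight. A full decision tree would accumulate such decreases over every node where a feature is used, so these one-level scores are only a simplified stand-in.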
Keywords: CERCLA; oil-contaminated soil; soil remediation; machine learning
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2022
Downloads: (external link)
https://www.mdpi.com/1996-1073/15/15/5698/pdf (application/pdf)
https://www.mdpi.com/1996-1073/15/15/5698/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:15:y:2022:i:15:p:5698-:d:881282
Energies is currently edited by Ms. Agatha Cao
Bibliographic data for series maintained by MDPI Indexing Manager.