A Statistical Approach for Identifying Private Wells Susceptible to Perfluoroalkyl Substances (PFAS) Contamination

Citation:

Xindi C. Hu, Beverly Ge, Bridger J. Ruyle, Jennifer Sun, and Elsie M. Sunderland. 5/11/2021. “A Statistical Approach for Identifying Private Wells Susceptible to Perfluoroalkyl Substances (PFAS) Contamination.” Environmental Science and Technology Letters.

Abstract:

Drinking water concentrations of per- and polyfluoroalkyl substances (PFAS) exceed provisional guidelines for millions of Americans. Data on private well PFAS concentrations are limited in many regions, and monitoring initiatives are costly and time-consuming. Here, we examine modeling approaches for predicting private wells likely to have detectable PFAS concentrations that could be used to prioritize monitoring initiatives. We used nationally available data on PFAS sources and on geologic, hydrologic, and soil properties that affect PFAS transport as predictors, and trained and evaluated models using PFAS data (n ∼ 2300 wells) collected by the state of New Hampshire between 2014 and 2017. Models were developed for the five most frequently detected PFAS: perfluoropentanoate, perfluorohexanoate, perfluoroheptanoate, perfluorooctanoate, and perfluorooctanesulfonate. Classification random forest models that allow nonlinearity in interactions among predictors performed the best (area under the receiver operating characteristic curve: 0.74–0.86). Point sources such as the plastics/rubber and textile industries accounted for the highest contribution to accuracy. Groundwater recharge, precipitation, soil sand content, and hydraulic conductivity were secondary predictors. Our study demonstrates the utility of machine learning models for predicting PFAS in private wells, and the classification random forest model based on nationally available predictors is readily extendable to other regions.
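
The abstract summarizes model skill as the area under the receiver operating characteristic curve (AUC). As a rough illustration of what that number measures for a detect/non-detect classifier, the minimal sketch below computes AUC as the fraction of (detect, non-detect) well pairs in which the detect received the higher predicted probability, with ties counted as half. The function name `roc_auc` and the toy labels and probabilities are hypothetical and are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Return AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a well with detectable PFAS, 0 otherwise (hypothetical coding).
    scores: predicted probability of detection from any classifier.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one detect and one non-detect")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # detect ranked above non-detect
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up example: eight wells with observed detection status
# and predicted detection probabilities.
detected = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_prob = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1, 0.3, 0.6]

auc = roc_auc(detected, predicted_prob)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to ranking wells no better than chance and 1.0 to a perfect ranking, so the reported range of 0.74–0.86 means a well with detectable PFAS outranked a well without in roughly three quarters or more of such pairs.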