Water quality prediction using machine learning algorithms in recreational beaches from Montevideo, Uruguay

Authors

  • Ángel Segura, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-1989-8899
  • Lía Sampognaro, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-7718-9820
  • Guzmán López, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-1343-492X
  • Carolina Crisci, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-3089-8048
  • Mathías Bourel, Instituto de Matemática y Estadística Prof. Rafael Laguardia, Facultad de Ingeniería, Universidad de la República. Montevideo, Uruguay. https://orcid.org/0000-0002-7472-7179
  • Victoria Vidal, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-8623-7804
  • Karina Eirin, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay. https://orcid.org/0000-0002-6588-4738
  • Claudia Piccini, Instituto de Investigaciones Biológicas Clemente Estable, Ministerio de Educación y Cultura. Montevideo, Uruguay. https://orcid.org/0000-0002-2762-1953
  • Carla Kruk, Instituto de Ecología y Ciencias Ambientales (IECA), Facultad de Ciencias, Universidad de la República. Montevideo, Uruguay. https://orcid.org/0000-0003-0760-1186
  • Gonzalo Perera, Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este (CURE), Universidad de la República. Rocha, Uruguay; Instituto de Matemática y Estadística Prof. Rafael Laguardia, Facultad de Ingeniería, Universidad de la República. Montevideo, Uruguay. https://orcid.org/0000-0002-7530-3503

DOI:

https://doi.org/10.26461/22.07

Keywords:

random forests, unbalanced data, contamination, recreational beach, human health

Abstract

We built artificial intelligence (AI) models to predict faecal water quality, measured as faecal coliforms (CF), to support the management of recreational beaches. A historical database generated by the Laboratorio de Calidad Ambiental of the Intendencia de Montevideo (IM) was analysed, and AI models were constructed to predict CF exceedance (CF > 2,000). Ten years of monitoring at 21 recreational beaches (N = 19,359; November 2009 to September 2019) showed wide variability in salinity and turbidity among beaches. CF had an asymmetric distribution (min = 4, median = 250, mean = 1,047, max = 1,280,000), with values exceeding the threshold at all beaches. Variables recorded in situ, together with meteorological and oceanographic variables, were used to train the AI models. A stratified random forest performed best on the evaluated metrics, with an overall accuracy of 86% and a 60% improvement in the true positive rate relative to the baseline. High-quality data generated by a governmental institution, combined with modelling strategies, provide a relevant framework to support beach and public health management.
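
As a rough illustration of the ideas named in the abstract, the following Python sketch labels CF exceedances, draws a class-balanced bootstrap sample, and computes accuracy and the true positive rate. It is not the authors' code. It assumes that "stratified" means each tree is trained on a sample with equal draws from each class, which is one common way to handle unbalanced data. The function names and the toy CF values are hypothetical; only the threshold of 2,000 and the summary values 4, 250, 1,047 and 1,280,000 come from the abstract.

```python
import random

THRESHOLD = 2000  # CF exceedance threshold stated in the abstract


def label_exceedance(cf_values, threshold=THRESHOLD):
    """Return 1 where CF exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in cf_values]


def stratified_bootstrap(labels, per_class, rng):
    """Draw `per_class` indices with replacement from each class.

    Assumption: a stratified forest gives every tree a class-balanced
    sample like this one, so the rare exceedance class is not swamped.
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    sample = []
    for y in sorted(by_class):
        sample.extend(rng.choices(by_class[y], k=per_class))
    return sample


def accuracy_and_tpr(y_true, y_pred):
    """Overall accuracy and true positive rate (sensitivity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    positives = sum(y_true)
    accuracy = (tp + tn) / len(y_true)
    tpr = tp / positives if positives else float("nan")
    return accuracy, tpr


# Toy CF values (made up, apart from the summary values quoted above).
cf = [4, 250, 1047, 3000, 120, 90, 5000, 300, 60, 1280000]
y = label_exceedance(cf)
idx = stratified_bootstrap(y, per_class=3, rng=random.Random(0))
balanced_labels = [y[i] for i in idx]  # three draws from each class
```

Reporting the true positive rate alongside accuracy matters here because exceedances are rare: a model that never predicts an exceedance can still score a high accuracy while missing every contamination event.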

Published

2021-10-18

How to Cite

Segura, Á., Sampognaro, L., López, G., Crisci, C., Bourel, M., Vidal, V., Eirin, K., Piccini, C., Kruk, C., & Perera, G. (2021). Water quality prediction using machine learning algorithms in recreational beaches from Montevideo, Uruguay. INNOTEC, (22 jul-dic), e555. https://doi.org/10.26461/22.07

Section

Articles