Forecasting the Spread of Dengue Outbreaks with a Synthesis of Machine Learning Models Utilizing Exogenous Variables
Research
Keywords: Dengue Fever, Forecasting, Exogenous Variables, Stagnant Water, Machine Learning
Abstract
Dengue fever, a mosquito-borne viral disease, affects four billion people worldwide and imposes substantial economic and health burdens. No antiviral drugs are available to treat dengue infections, so patients must rely solely on palliative treatment. Forecasting dengue cases ahead of future epidemics can help public health officials implement mitigation efforts. The purpose of this study was to develop a machine learning model that forecasts the incidence of dengue outbreaks temporally and geographically using eco-climatic and socioeconomic factors. Monthly dengue case, precipitation, temperature, and socioeconomic datasets from seven countries (2014 to 2023) were preprocessed and then subjected to a principal component analysis. A novel topographical feature included in the model was stagnant water, a critical breeding ground for mosquitoes. Ridge regression was used to manage multicollinearity within the data, which were then passed to a seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) model that accounts for the seasonality of the variables examined. Overall, the forecasting algorithm predicted dengue incidence up to at least six months in advance with a mean absolute error of 2.420e-6. When the stagnant water feature was removed from the datasets, prediction accuracy over the same six-month horizon decreased significantly, demonstrating its importance as a feature when forecasting dengue. This algorithm can therefore assist public health officials in planning proactive measures, which could reduce economic stress and dengue transmission and thereby improve the quality of life in dengue-endemic countries.