COVID-19 epidemic trend prediction based on hybrid decomposition and a Time2Vec-BiLSTM-SA model

Abstract: The COVID-19 epidemic poses a major challenge to international public health, and accurate prediction of case numbers is important for health resource planning and outbreak assessment. The epidemic is a macroscopic emergent outcome of complex interactions among the virus, individuals, the environment, and control organisations under the influence of prevention and control policies. Its spread exhibits complex nonlinear dynamics, and the time series of newly reported cases is nonlinear and non-stationary. To improve the accuracy of COVID-19 trend prediction, this paper proposes a three-branch Time2Vec-BiLSTM-SA integrated prediction method that combines a hybrid of variational mode decomposition (VMD) and singular spectrum decomposition (SSD) with a time-to-vector embedding layer (Time2Vec), a bidirectional long short-term memory module (BiLSTM), and a self-attention module (SA). VMD and SSD decompose the complex new-case series into multiscale components of different modes that have low complexity and strong regularity. The sub-components from the two decompositions are combined into a hybrid component, so that the multiscale components of different modes complement one another, enrich the input features, and provide a basis for extracting different aspects of the complex nonlinear series. In each branch, the Time2Vec module encodes temporal information and captures periodic and non-periodic features, the BiLSTM module captures bidirectional long-term dependencies in the sequence, and the SA module weights important features and adjusts the weights for the final integration. The method was validated on county-level data from eight counties in California, USA, and on national-level data from India and Italy, and was compared with several data processing schemes, model structures, and methods from the literature that use the same datasets. The results show that the proposed method significantly outperforms existing methods on all prediction error metrics, demonstrating its effectiveness in improving the accuracy of COVID-19 trend prediction and its good generality and generalisation ability.
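As a rough illustration of the time encoding mentioned in the abstract, the sketch below implements the commonly used Time2Vec form, in which the first output is a linear function of time (the non-periodic part) and the remaining outputs are sine functions of time (the periodic part). It is not taken from the paper: the function name `time2vec`, the embedding size `k`, the 14-day window, and the randomly drawn `omega` and `phi` are assumptions for illustration only. In a trained model these frequencies and phases would be learned parameters.

```python
import numpy as np

def time2vec(tau, omega, phi):
    """Encode scalar time steps as (k+1)-dimensional vectors.

    tau:   array of shape (T,), the time indices.
    omega: array of shape (k+1,), frequencies (learned in a real model).
    phi:   array of shape (k+1,), phase shifts (learned in a real model).
    Returns an array of shape (T, k+1).
    """
    tau = np.asarray(tau, dtype=float)
    # Affine map of every time step for every output dimension: (T, k+1).
    z = tau[:, None] * omega[None, :] + phi[None, :]
    out = np.sin(z)        # periodic components for dimensions 1..k
    out[:, 0] = z[:, 0]    # dimension 0 stays linear (non-periodic)
    return out

# Hypothetical setup: a 14-day window and k = 4 periodic components.
rng = np.random.default_rng(0)
k = 4
omega = rng.normal(size=k + 1)
phi = rng.normal(size=k + 1)
emb = time2vec(np.arange(14), omega, phi)
print(emb.shape)
```

In an architecture like the one described, an embedding of this kind would be concatenated with the decomposed case-count components before the BiLSTM layer; that wiring is likewise an assumption here, not a detail stated in the abstract.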

     
