Abstract:
The COVID-19 epidemic poses a major challenge to international public health, and accurate prediction of case numbers is important for health resource planning and outbreak assessment. The epidemic is a macro-level emergent outcome of complex interactions among viruses, individuals, the environment, and control organisations under the influence of prevention and control policies. Its spread therefore exhibits complex non-linear dynamics, and the time series of new cases is non-linear and non-stationary. To improve the prediction accuracy of COVID-19 pandemic trends, this paper proposes a three-branch-input Time2Vec-BiLSTM-SA integrated prediction method. It combines a hybrid decomposition based on variational mode decomposition (VMD) and singular spectrum decomposition (SSD) with a time vector embedding layer (Time2Vec), a bidirectional long short-term memory module (BiLSTM), and a self-attention module (SA). VMD and SSD decompose the complex new-case series into multiscale components of different modalities with low complexity and high regularity. The two sets of subcomponents are then combined into a hybrid component, so that the multiscale components of different modalities complement one another, enrich the input feature information, and provide a basis for extracting features from different aspects of the complex non-linear series.
The Time2Vec module encodes temporal information and captures periodic and non-periodic features, the BiLSTM module captures bidirectional long-term dependencies in the series, and the SA module weights the important features and adjusts the weights for comprehensive integration. The proposed method is validated experimentally on county-level data from eight counties in California, USA, and on national-level data from two countries, India and Italy. It is compared with a variety of data processing methods, model structures, and methods from the literature that use the same datasets. The results show that the proposed method significantly outperforms the existing methods on all prediction error metrics, demonstrating its effectiveness in improving the accuracy of COVID-19 outbreak trend prediction and its good generalisation ability.
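
To illustrate how a time embedding can carry both periodic and non-periodic information, the following minimal sketch follows the commonly published Time2Vec formulation: the first element is a linear function of the time index and the remaining elements are sinusoids. The function name `time2vec`, the parameter values, and the 7-day period are assumptions made for illustration only. They are not taken from this paper, and in a trained model the frequencies and phases would be learned.

```python
import math


def time2vec(tau, omega, phi):
    """Encode a scalar time index tau as a vector of length len(omega).

    Element 0 is linear in tau (non-periodic trend);
    elements 1..k are sin(omega_i * tau + phi_i) (periodic patterns).
    """
    if len(omega) != len(phi):
        raise ValueError("omega and phi must have the same length")
    linear = omega[0] * tau + phi[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return [linear] + periodic


# Hypothetical parameters: a linear slope of 0.5 and one sinusoid with an
# assumed 7-day period (e.g. a weekly reporting cycle).
omega = [0.5, 2 * math.pi / 7]
phi = [0.0, 0.0]

embedding = time2vec(7.0, omega, phi)
```

In this sketch, time indices one week apart share the same periodic element while the linear element keeps increasing, which is what allows downstream layers to separate trend from cyclic behaviour.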