Abstract:
To address the poor noise robustness of traditional speech activity detection methods and their weak performance in detecting speech segments, a speech activity detection method based on multi-feature fusion is proposed. First, the band-partitioning spectral entropy and a projection feature based on Mel-frequency cepstral coefficients (MFCC) are extracted from the noisy speech signal, and the GFCC0 feature is applied to the speech activity detection task. Then, a fusion feature suited to speech activity detection is obtained by adaptive weighted fusion of the three types of features. Finally, the thresholds of the fusion feature are estimated adaptively using fuzzy C-means clustering, and the speech activity detection result is obtained with the double-threshold method. Compared with existing traditional methods, the proposed method achieves better speech activity detection results in seven noise environments and improves detection accuracy. In particular, in the Volvo noise environment the accuracy reaches more than 94.5%.
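The final decision stage described above can be illustrated with a minimal sketch. It assumes a one-dimensional fused feature per frame, a two-cluster fuzzy C-means (noise versus speech), and thresholds placed between the two cluster centers. The function names and the interpolation factors `alpha` and `beta` are hypothetical choices made for illustration; the abstract does not state how the thresholds are derived from the clustering result.

```python
import numpy as np

def fcm_1d(x, m=2.0, n_iter=100, tol=1e-6):
    """Two-cluster fuzzy C-means on a 1-D array; returns (low_center, high_center)."""
    x = np.asarray(x, dtype=float)
    centers = np.array([x.min(), x.max()])  # deterministic initialisation
    for _ in range(n_iter):
        # Distance of every frame to every center, shape (2, N).
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        # Standard membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)
        # Center update: mean of the frames weighted by u^m.
        w = u ** m
        new_centers = (w @ x) / w.sum(axis=1)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return float(centers.min()), float(centers.max())

def estimate_thresholds(fused, alpha=0.2, beta=0.5):
    """Place a low and a high threshold between the cluster centers.

    alpha and beta are assumed values, not taken from the paper.
    """
    c_noise, c_speech = fcm_1d(fused)
    span = c_speech - c_noise
    return c_noise + alpha * span, c_noise + beta * span

def double_threshold(fused, low, high):
    """Mark a run of frames above `low` as speech if any frame in it exceeds `high`."""
    fused = np.asarray(fused, dtype=float)
    above_low = fused > low
    labels = np.zeros(fused.size, dtype=bool)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last frame.
    for i, flag in enumerate(np.append(above_low, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if np.any(fused[start:i] > high):
                labels[start:i] = True
            start = None
    return labels

# Synthetic fused feature: five noise frames, five speech frames, five noise frames.
fused = np.array([0.1] * 5 + [1.0] * 5 + [0.1] * 5)
low, high = estimate_thresholds(fused)
labels = double_threshold(fused, low, high)
```

With this scheme the high threshold confirms that a segment contains speech, while the low threshold determines how far the segment extends on either side, which is the usual motivation for using two thresholds rather than one.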