Abstract:
Voice and music often overlap in an audio signal, and many applications need to separate the two effectively. However, common separation methods operate on frequency-domain signals, and restoring a signal from the frequency domain requires phase information, which introduces deviations into the restored speech. To address the poor separation performance of time-domain single-channel audio signal separation, this paper proposes introducing joint training and temporal convolution into a generative adversarial network. First, the time-domain speech is preprocessed. Then, the preprocessed data is sent to the generator of the temporal convolutional generative adversarial network for separation. Finally, the separated interference speech and the clean interference speech are sent to the discriminator of the generative adversarial network for discrimination, and the discrimination results are fed back to the generator. The MIR-1K and data_thchs30 datasets are used to test the algorithm's performance. The experimental results show that the PESQ and STOI scores of the proposed single-channel audio separation model improve by 0.31 and 0.07, respectively, demonstrating that the proposed algorithm effectively improves the separation of speech and music in the audio signal.
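The abstract does not give the generator's layer configuration. As an illustration only, the following is a minimal sketch of a dilated causal 1-D convolution, a common building block of temporal convolutional networks that work directly on time-domain samples. The function names, the kernel values, and the dilation schedule are assumptions made for this example and are not taken from the paper; whether the proposed generator is causal is likewise an assumption.

```python
def dilated_causal_conv1d(x, kernel, dilation=1):
    """Illustrative (hypothetical) helper, not the paper's implementation.

    Computes y[t] = sum_k kernel[k] * x[t - k * dilation],
    treating samples before the start of the signal as zero.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # causal: only current and past samples contribute
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples seen by one output sample of a stack of
    dilated convolutions with the given kernel size and dilations."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    # Each output adds the current sample and the one two steps earlier.
    print(dilated_causal_conv1d([1, 2, 3, 4], [1, 1], dilation=2))
    # Assumed dilation schedule 1, 2, 4, 8 with kernel size 3.
    print(receptive_field(3, [1, 2, 4, 8]))
```

Doubling the dilation from layer to layer lets the receptive field grow quickly with depth, which is one reason temporal convolution is attractive for long time-domain waveforms.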