Abstract:
Traditional toxic comment detection models have two weaknesses: they cannot adapt to constantly evolving online culture and language habits, and they lose information inside the neural network. To address both, this paper proposes a detection model based on a capsule network. First, a BERT model extracts word-vector features so that the latent semantic information of the text is retained. A capsule network then extracts feature representations over a local range, while a Bi-LSTM captures features over the global range, yielding a more comprehensive representation. An attention mechanism fuses the local and global feature representations, extracting key information and reducing the dimensionality of the representation. Finally, a sigmoid classifier produces the detection result. Experimental results show that the proposed combined model extracts finer-grained semantic information than traditional models, effectively improves classification performance, and achieves an accuracy of 0.922 on the toxic comment detection task.
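The fusion and classification stages described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple dot-product attention with a hypothetical query vector, toy three-dimensional stand-ins for the capsule (local) and Bi-LSTM (global) features, and two hypothetical output labels. All names and values are illustrative.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_fuse(features, query):
    """Weight each feature vector by softmax(query . feature) and sum them.

    Assumption: dot-product attention; the paper may use another scoring form.
    """
    weights = softmax([dot(query, f) for f in features])
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights

def classify(fused, label_weights, label_biases):
    """Apply one independent sigmoid per (hypothetical) label."""
    return [sigmoid(dot(w, fused) + b) for w, b in zip(label_weights, label_biases)]

# Toy stand-ins for the capsule (local) and Bi-LSTM (global) outputs.
local_features = [0.9, 0.1, 0.4]
global_features = [0.2, 0.8, 0.5]
query = [1.0, 0.0, 0.0]  # hypothetical learned attention query

fused, weights = attention_fuse([local_features, global_features], query)
probs = classify(fused, [[1.0, -1.0, 0.5], [-0.5, 1.0, 0.0]], [0.0, 0.1])
```

Here two feature vectors are collapsed into a single vector of the same width, which is one way the fusion step could reduce dimensionality before the sigmoid output.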