A Cluster-Aware Multi-Label Text Classification Model

Zhao Jinbang (赵金榜); Qin Shaowei (秦绍伟); Wu Hao (武浩)

doi:10.7540/j.ynu.20210621

Clustering perceptual text multi-label classification model

Abstract

Text is the most widely used information carrier in human society, so classifying it accurately is of great practical importance. Existing methods have made progress on multi-label text classification, but they still underuse the cues available in documents and labels. Starting from the polysemy of text, this paper proposes a cluster-aware multi-label text classification model. First, a deep model extracts the original features of the text. Next, multiple cluster-center vectors, combined with an attention mechanism, extract text features under different contexts. Finally, these features are fused and enhanced, and the result is classified by taking its dot product with the label embeddings. Experiments on four datasets show that the method achieves significant improvements on multiple evaluation metrics.
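
The pipeline described in the abstract (encoder features, attention driven by several cluster-center vectors, fusion, then a dot product with label embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the features are fused or how scores become predictions, so the mean fusion and the sigmoid used here are assumptions. The random vectors stand in for the output of a deep encoder, and all function names are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cluster_attention(token_feats, centers):
    """Return one context vector per cluster center.

    Each center scores every token by dot product; the softmax of those
    scores weights the token features into a single context vector.
    """
    dim = len(token_feats[0])
    contexts = []
    for center in centers:
        weights = softmax([dot(center, h) for h in token_feats])
        ctx = [sum(w * h[d] for w, h in zip(weights, token_feats))
               for d in range(dim)]
        contexts.append(ctx)
    return contexts


def fuse(contexts):
    """Assumed fusion step: element-wise mean of the context vectors."""
    k = len(contexts)
    return [sum(col) / k for col in zip(*contexts)]


def classify(token_feats, centers, label_embs):
    """Score each label by dot product with the fused document vector.

    The sigmoid that maps scores to probabilities is an assumption.
    """
    doc_vec = fuse(cluster_attention(token_feats, centers))
    return [sigmoid(dot(doc_vec, e)) for e in label_embs]


# Toy usage: random vectors stand in for encoder outputs and learned parameters.
random.seed(0)
dim, n_tokens, n_centers, n_labels = 8, 5, 3, 4


def rand_vec():
    return [random.gauss(0.0, 1.0) for _ in range(dim)]


token_feats = [rand_vec() for _ in range(n_tokens)]
centers = [rand_vec() for _ in range(n_centers)]
label_embs = [rand_vec() for _ in range(n_labels)]

probs = classify(token_feats, centers, label_embs)
print([round(p, 3) for p in probs])
```

In a trained model the cluster centers and label embeddings would be learned jointly with the encoder. Here they are random, so the printed probabilities only demonstrate the data flow.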
