Abstract:
Chinese semantic role labeling (SRL) based on statistical machine learning suffers from several problems: manual feature extraction is cumbersome and inefficient, and models struggle to capture the contextual semantic information of long sentences. To address these issues, this paper proposes a BiLSTM-MaxPool-CRF fusion model for Chinese SRL and studies how to optimize its performance. First, multi-level linguistic features, such as part of speech, argument markers, and short syntax, are incorporated into the training corpus. Then, max pooling is used to sample and select among multiple groups of feature vectors, eliminating redundant feature information. Finally, multiple sets of experiments show that, compared with multi-level features used without sampling, the features selected through max pooling can significantly improve the performance of the sequence labeling model.
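As a rough illustration of the pooling step, the sketch below reduces several feature vectors for one token to a single vector. It is not the paper's implementation. It assumes that every feature group (word, part of speech, argument marker, syntax) is embedded in the same dimension and that pooling is an element-wise maximum across the groups, as the model name BiLSTM-MaxPool-CRF suggests. The function name `max_pool_features`, the dictionary keys, and all numeric values are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def max_pool_features(groups: Dict[str, Vector]) -> Vector:
    """Element-wise max over equally sized feature vectors for one token.

    Assumption: each feature group is already embedded in the same
    dimension, so pooling keeps the strongest activation per dimension.
    """
    vectors = list(groups.values())
    if not vectors:
        raise ValueError("at least one feature vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all feature vectors must share one dimension")
    # zip(*vectors) walks the vectors dimension by dimension.
    return [max(column) for column in zip(*vectors)]


# Hypothetical embeddings for a single token (values are made up).
token_features = {
    "word": [0.2, -0.1, 0.7],
    "pos": [0.5, 0.0, -0.3],
    "argument_marker": [-0.4, 0.9, 0.1],
    "syntax": [0.1, 0.3, 0.6],
}
pooled = max_pool_features(token_features)
print(pooled)  # [0.5, 0.9, 0.7]
```

Under these assumptions, the pooled vector has the same size as a single feature embedding, so the input to the BiLSTM layer does not grow as more feature groups are added.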