Abstract:
Node representation learning is a fundamental technique for analyzing graph-structured data. Such data exhibits complex structural relationships and rich node information, so integrating graph structure with node information to learn high-quality node representations remains a challenging problem. To address this, a node representation learning model based on mutual information maximization and cluster perception is proposed. First, a diffusion graph is constructed by applying a graph diffusion method to the original graph. Then, a graph convolutional network encodes the two graphs into a low-dimensional latent space to obtain node representations and a global representation for each graph. Finally, following the principle of mutual information maximization, the mutual information between the two graphs is maximized by comparing the node representations of one graph with the global representation of the other graph, and vice versa. Meanwhile, nodes with similar semantics are grouped into the same cluster, and the clustering consistency between the node representations of the two graphs is maximized. Experimental results on node classification and node clustering on two citation datasets show that the proposed model outperforms baseline methods on several metrics. On the Cora dataset, for example, the model improves node classification accuracy and F1 score by 2.7 and 0.6 percentage points, respectively, compared with the baseline method.
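
The first step, building a diffusion graph from the original graph, can be illustrated with a minimal sketch. The abstract does not say which diffusion method is used, so Personalized PageRank (PPR) diffusion is assumed here purely for illustration. The function name `ppr_diffusion`, the teleport probability `alpha`, and the truncation length `num_terms` are hypothetical choices, not values from the paper.

```python
def ppr_diffusion(adj, alpha=0.15, num_terms=200):
    """Approximate S = alpha * sum_k (1 - alpha)^k T^k, with T = A D^-1.

    Assumption: PPR diffusion on an undirected graph without isolated
    nodes; the paper's actual diffusion method may differ.
    """
    n = len(adj)
    # Degree of each node (column sums of the adjacency matrix).
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # Column-stochastic transition matrix T = A D^-1.
    trans = [[adj[i][j] / deg[j] for j in range(n)] for i in range(n)]
    # term holds (1 - alpha)^k T^k, starting from the identity (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    diff = [[0.0] * n for _ in range(n)]
    for _ in range(num_terms):
        # Accumulate the current term of the truncated series.
        for i in range(n):
            for j in range(n):
                diff[i][j] += alpha * term[i][j]
        # Advance to the next power: term <- (1 - alpha) * T @ term.
        term = [
            [(1 - alpha) * sum(trans[i][k] * term[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return diff


# Path graph 0 - 1 - 2: nodes 0 and 2 are not adjacent in the original graph.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
diffusion = ppr_diffusion(adj)
```

In this sketch the diffusion matrix assigns a positive weight between nodes 0 and 2 even though they share no edge, which is how a diffusion graph can expose multi-hop structure as a second view of the data. A practical implementation would typically use sparse matrices and sparsify the result, for example by keeping only the largest entries per node.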