Abstract:
As an important class of Probabilistic Graphical Model (PGM), the Bayesian Network with Latent Variables (BNLV) is an effective framework for representing and inferring uncertain knowledge: it incorporates latent variables to describe, both qualitatively and quantitatively, the implicit knowledge contained in data. In recent years, BNLV learning has become an important research topic in uncertainty in artificial intelligence and knowledge discovery. In this paper, we analyze and summarize the challenges of BNLV learning and survey the representative methods in three aspects: determining the number and cardinality of latent variables, parameter learning, and structure learning. For each method, we give the applicable scenarios, basic ideas, and principal steps, together with a comparative analysis. For determining the number and cardinality of latent variables, we explain the cluster-based and clique-based methods. For parameter learning, we discuss imputation, gradient ascent, and the Expectation-Maximization (EM) algorithm, as well as EM-based improvements. For structure learning, we present the basic ideas of the conditional-independence-based and score-and-search approaches, as well as improvements built on score-and-search. Based on this analysis and summary of state-of-the-art research, we also point out open problems and directions for further study of BNLV learning.