Citation: LIU Yun, XIANG Chan. Research on association rule mining for minority classes of unbalanced database by confabulation theory[J]. Journal of Yunnan University: Natural Sciences Edition, 2017, 39(1): 33-38. DOI: 10.7540/j.ynu.20160221

Research on association rule mining for minority classes of unbalanced database by confabulation theory

  • Abstract: In network intrusion detection systems, data mining often has to work with unbalanced data sets, and mining the minority classes in such data sets is an active research topic. To address this problem, this paper proposes ARUD, an association rule mining algorithm for unbalanced databases. The algorithm first constructs a knowledge link strength matrix that stores the support counts of all 2-itemsets. It then uses this matrix to mine all association rules that satisfy a minimum cogency, where cogency is computed from pairwise item conditional probabilities. ARUD needs to scan the entire transaction database only once. In experiments on four typical unbalanced data sets from the UCI repository, compared with the Apriori and CFP-Growth algorithms, ARUD effectively extracts the minority classes and improves on both algorithms in mining run time and memory usage.
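The two stages described in the abstract (a single pass that records pairwise counts, then rule extraction against a minimum cogency) can be illustrated with a minimal Python sketch. This is not the authors' ARUD implementation. The function names are hypothetical, and the cogency of a rule is assumed here to be the product of the pairwise conditional probabilities p(antecedent item | consequent), which is one common reading in confabulation theory; the abstract itself does not give the exact formula.

```python
from collections import Counter
from itertools import combinations


def build_pair_counts(transactions):
    """Scan the transactions once, counting single items and unordered item pairs."""
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # deduplicate and fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return item_counts, pair_counts


def pair_count(pair_counts, a, b):
    """Look up the co-occurrence count of a and b regardless of argument order."""
    return pair_counts[(a, b) if a < b else (b, a)]


def cogency(antecedent, consequent, item_counts, pair_counts):
    """Assumed definition: product of p(a | consequent) over antecedent items."""
    consequent_count = item_counts[consequent]
    if consequent_count == 0:
        return 0.0
    score = 1.0
    for a in antecedent:
        score *= pair_count(pair_counts, a, consequent) / consequent_count
    return score


def mine_pair_rules(transactions, min_cogency):
    """Return (antecedent, consequent, cogency) for single-item rules above the threshold."""
    item_counts, pair_counts = build_pair_counts(transactions)
    rules = []
    for a, b in pair_counts:
        for ante, cons in ((a, b), (b, a)):
            score = cogency([ante], cons, item_counts, pair_counts)
            if score >= min_cogency:
                rules.append((ante, cons, score))
    return rules
```

In a toy log with eight `["tcp", "normal"]` records, one `["icmp", "normal"]` record, and one `["icmp", "attack"]` record, the pair {icmp, attack} occurs in only 10% of transactions, so a minimum-support threshold of, say, 20% would discard it. Under the assumed cogency measure the rule icmp → attack still scores 1.0, because every "attack" record contains "icmp". That is the kind of minority-class rule the abstract is concerned with.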

     
