An Improved Vague Similarity Measure via Dynamic Weighting and Kernel Divergence

Abstract: Similarity measures are central to how Vague sets handle uncertain information. Traditional Vague-set similarity models tend to ignore the influence of indeterminacy (the unknown degree) and to use fixed parameter weights. To address these shortcomings, this paper proposes a new Vague similarity measure that incorporates indeterminacy, allocates weights dynamically on the basis of Vague entropy, and adds a kernel divergence term. The measure adaptively adjusts the weights of its three components (truth-membership, falsity-membership, and indeterminacy) and uses kernel divergence to characterize the overall shift between the truth-membership and falsity-membership degrees. We prove that the proposed measure satisfies boundedness, symmetry, normalization, and degeneration consistency. The experiments comprise small-sample comparison tests and tests on triples constructed from NASA software defect datasets. Combined with KNN classification and a discrimination index, the proposed method is compared with eight existing similarity measures. The results show that the proposed method generally outperforms traditional measures on F1-score, precision, and recall for defect identification, indicating that it is effective and stable when handling class imbalance and uncertain information in software defect data.
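The limitation the abstract attributes to traditional models can be illustrated with a small sketch. It does not implement the proposed measure, whose formula is not given in the abstract. It shows a classic kernel-only (score-based) similarity from the earlier Vague-set literature, under the assumption that "kernel" denotes the difference t − f between truth- and falsity-membership and that indeterminacy is 1 − t − f. The names `VagueValue` and `kernel_only_similarity` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VagueValue:
    """A Vague value given by truth-membership t and falsity-membership f."""
    t: float
    f: float

    def __post_init__(self):
        # A valid Vague value needs t >= 0, f >= 0 and t + f <= 1.
        if self.t < 0.0 or self.f < 0.0 or self.t + self.f > 1.0 + 1e-12:
            raise ValueError("require t >= 0, f >= 0 and t + f <= 1")

    @property
    def indeterminacy(self) -> float:
        # Unknown degree: the part that is neither supported nor opposed.
        return 1.0 - self.t - self.f

    @property
    def kernel(self) -> float:
        # Assumption: kernel (score) is truth minus falsity, in [-1, 1].
        return self.t - self.f


def kernel_only_similarity(x: VagueValue, y: VagueValue) -> float:
    """Classic score-based similarity; it ignores indeterminacy entirely."""
    return 1.0 - abs(x.kernel - y.kernel) / 2.0


# Two clearly different situations: nothing known vs. evidence split evenly.
unknown = VagueValue(t=0.0, f=0.0)
split = VagueValue(t=0.5, f=0.5)

print(unknown.indeterminacy, split.indeterminacy)
print(kernel_only_similarity(unknown, split))
```

Both values have kernel 0, so the kernel-only measure rates them as identical even though one is fully indeterminate and the other has no indeterminacy at all. This is the kind of case that motivates giving the indeterminacy component its own weight.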

     
