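Abstract:
Objective: To address the problem that a printing registration recognition model trained on an imbalanced training set cannot reliably recognize misregistered images, a feature selection method for printing mark image data based on maximum relevance and minimum redundancy is proposed. Methods: Multi-dimensional feature data are extracted from printing mark images, and the relevance between each feature and the two classes (registered and misregistered), as well as the redundancy among features, is computed. An objective function for feature selection is defined, and an incremental search is used to find the best feature and add it to the feature subset, thereby performing feature selection for imbalanced printing mark images. Results: On three performance metrics for imbalanced data classification, the proposed feature selection method achieved an A of 0.9900, an R of 0.9400, and a Gmean of 0.9466. Conclusion: For registration recognition on imbalanced printing mark images, the proposed method outperforms the untreated method, the PCA method, the Relief method, and the NCA method in the experiments.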
Keywords: imbalanced data; printing registration; feature selection
DOI: 10.19554/j.cnki.1001-3563.2021.05.035
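Classification number: TP391.41
Funding: Project of the Guangdong Provincial Key Laboratory of Cyber-Physical Systems (2016B030301008); Guangdong University of Technology Youth Fund Key Project (17QNZD001); Undergraduate Innovation and Entrepreneurship Training Program (xj202011845017, xj202011845015, xj202011845016)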
Printing Registration Recognition Based on Feature Selection of Imbalanced Training Set
JIAN Chuan-xia, SHU Zhi-peng, XIE Hao-zhe, ZHOU Yu-qi, WANG Hua-ming
(School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Abstract:
This work proposes a feature selection method for printing mark image datasets based on max-relevance and min-redundancy, addressing the problem that a printing registration recognition model trained on an imbalanced training set cannot accurately identify misregistered images. Multi-dimensional features of printing mark images were extracted, and the relevance between each feature and the two classes (registration and misregistration), as well as the redundancy between features, was calculated. The objective function for feature selection was defined, and an incremental search was used to find the optimal feature and add it to the feature subset, thereby performing feature selection for imbalanced printing mark images. On three evaluation indicators for imbalanced data classification, the proposed method achieved an A of 0.9900, an R of 0.9400, and a Gmean of 0.9466. In the experiments, the proposed method outperformed the untreated method, the PCA method, the Relief method, and the NCA method in registration recognition on imbalanced printing mark images.
Key words: imbalanced data; printing registration; feature selection
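
The incremental max-relevance, min-redundancy search summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes features have already been discretized, uses mutual information for both relevance and redundancy, and scores each candidate as relevance minus mean redundancy (a common form of the mRMR criterion; the paper's exact objective function is not given in the abstract). The feature names and the toy data (8 registered samples, 2 misregistered) are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy (incremental) mRMR selection.

    features: dict mapping feature name -> list of discretized values
    labels:   list of class labels (here 1 = registered, 0 = misregistered)
    k:        number of features to keep
    """
    # Relevance: mutual information between each feature and the class label.
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    candidates = sorted(features)  # sorted for deterministic tie-breaking

    def score(name):
        # Assumed objective: relevance minus mean redundancy with chosen features.
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: 8 registered (1) and 2 misregistered (0) samples.
labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
features = {
    "shift_x":      [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # strongly relevant
    "shift_x_copy": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # duplicate of shift_x
    "shift_y":      [1, 1, 0, 0, 0, 0, 0, 0, 1, 1],  # relevant, less redundant
    "noise":        [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

selected = mrmr_select(features, labels, 2)
print(selected)  # ['shift_x', 'shift_y']
```

Ranking by relevance alone would keep both `shift_x` and its duplicate. The redundancy term is what steers the second pick toward `shift_y`, which adds information the first feature does not carry.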