
Influence of Training Error Probability Distribution on Generalization Performance

Published: 2018-08-31  Views: 1867  Downloads: 240
Authors: ZHAO Jinwei, WANG Yufei, LIU Yu, HEI Xinhong
Organization: School of Computer Science and Engineering, Xi'an University of Technology
Abstract: Once the loss function has been fixed, it is incorrect to treat all samples of a given data set equally in the empirical risk; when analyzing the risk bound of a set of unbounded loss functions, the heaviness of the tail of the probability distribution of the training samples' prediction errors must therefore be taken into account. In this paper, the prediction errors of the training samples are sorted in descending order, the empirical risk and the structural risk are revisited on this basis, and the upper bound of the expected risk is analyzed, yielding a new generalization error bound: the tail synthetic index lower bound. Based on quotient space theory, this bound and the leave-one-out (LOO) error bound are considered jointly, and a multi-layer topology is constructed on the hypothesis space to search for an optimal compact hypothesis subspace. Experiments confirm that, for several learning algorithms, the proposed method improves the robustness and effectiveness of their generalization performance.
Key words: artificial intelligence; unbounded nonnegative loss function set; statistical learning theory; topological structure; tail synthetic index
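The abstract only outlines the procedure, and the paper's exact definition of the tail synthetic index is not reproduced on this page. As a rough illustration of the first step described above — sorting the training samples' prediction errors in descending order and quantifying how heavy the tail of their distribution is — the following Python sketch uses the classical Hill estimator as a stand-in for a tail index; the function name hill_tail_index, the choice of estimator, and the parameter k are illustrative assumptions, not the authors' method.

```python
import numpy as np

def hill_tail_index(errors, k):
    """Hill estimator of the tail index alpha from the k largest values.

    A small alpha indicates a heavy tail, i.e. a few training samples
    carry disproportionately large prediction errors. This is a standard
    stand-in, not the paper's tail synthetic index.
    """
    # Sort prediction errors in descending order, as the paper does
    # before re-examining the empirical and structural risk.
    sorted_errors = np.sort(np.asarray(errors, dtype=float))[::-1]
    top = sorted_errors[:k + 1]  # the k+1 largest order statistics
    if top[-1] <= 0:
        raise ValueError("Hill estimator requires positive order statistics")
    # Mean log-excess of the k largest errors over the (k+1)-th largest.
    return 1.0 / np.mean(np.log(top[:-1] / top[-1]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    light = np.abs(rng.normal(size=1000))            # light-tailed errors
    heavy = np.abs(rng.standard_t(df=2, size=1000))  # heavy-tailed errors
    print("light-tailed alpha:", hill_tail_index(light, k=100))
    print("heavy-tailed alpha:", hill_tail_index(heavy, k=100))
```

A noticeably smaller estimate on the second sample reflects its heavier tail; in the paper's terms, that is the situation in which weighting all training errors equally in the empirical risk becomes questionable.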
Issue: No. 16, August 2018
Citation: ZHAO Jinwei, WANG Yufei, LIU Yu, et al. Influence of training error probability distribution on generalization performance[J]. 中国科技论文在线精品论文, 2018, 11(16): 1586-1598.
 