
Implementation of Subspace Gaussian Mixture Models in a Chinese Speech Recognition System

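Published: 2013-08-31  Views: 1614  Downloads: 997
Authors: XIAO Yunpeng, ZHU Weibin
Affiliation: School of Computer and Information Technology, Beijing Jiaotong University
Abstract: Speech recognition systems currently rely mainly on the hidden Markov model (HMM) as the basic model for acoustic modeling and acoustic decoding. Once context is taken into account, however, the number of refined models and the size of the parameter set grow sharply, and with limited training data the parameters cannot be trained adequately. This paper introduces a new framework, the subspace Gaussian mixture model (SGMM). Unlike the traditional HMM framework, in which each state is associated with several mean and variance parameters for computing output probabilities, each SGMM state is associated with only one low-dimensional projection vector, and its means and variances are computed through projection matrices shared by all states, which makes the parameter representation very compact. Experimental results show that, with limited speech training data, SGMM lowers the word error rate of the traditional HMM by 6.44% absolute (from 23.43% to 16.82%), a 28% relative reduction. Existing optimization techniques such as model refinement, feature optimization and discriminative training also remain effective under this framework.
Keywords: pattern recognition; speech recognition; hidden Markov model; subspace Gaussian mixture model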
Title: Implementation of subspace Gaussian mixture models in a Chinese speech recognition system
Author: XIAO Yunpeng, ZHU Weibin
Organization: School of Computer and Information Technology, Beijing Jiaotong University
Abstract: The hidden Markov model (HMM) currently serves as the basic acoustic model in speech recognition systems. However, once context is taken into account, the number of models and parameters grows dramatically, and with limited training speech the parameters of the conventional model may not be fully trained. This paper introduces a framework called the subspace Gaussian mixture model (SGMM). Unlike the traditional HMM structure, an SGMM is defined by a low-dimensional vector associated with each state, together with a global mapping from this vector space to the space of SGMM parameters. The results show that the SGMM approach outperforms the conventional HMM, reducing word error rate by 6.44% absolute (from 23.43% to 16.82%), or 28% relative. In addition, the new method is compatible with existing model refinement methods such as discriminative training, feature transformation and speaker adaptation.
Key words: pattern recognition; speech recognition; hidden Markov model; subspace Gaussian mixture model
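Issue: No. 16, August 2013
Citation: XIAO Yunpeng, ZHU Weibin. Implementation of subspace Gaussian mixture models in a Chinese speech recognition system [J]. 中国科技论文在线精品论文, 2013, 6(16): 1537-1541.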
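
The abstract describes each SGMM state as holding only a low-dimensional vector, with the Gaussian parameters obtained through projections shared by all states. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes the commonly described SGMM form in which the mean of shared Gaussian i in state j is M_i v_j and the mixture weights are a softmax over w_i · v_j, with covariances simply shared across states. The abstract gives no equations, so every function name, the toy matrices and the model sizes passed to `param_counts` are illustrative assumptions.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Normalise scores into weights that sum to 1 (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sgmm_state_params(state_vector, projections, weight_vectors):
    """Derive one state's means and mixture weights from its low-dimensional vector.

    projections: one shared (feat_dim x sub_dim) matrix M_i per shared Gaussian.
    weight_vectors: one shared length-sub_dim vector w_i per shared Gaussian.
    (Assumed formulation; covariances are shared and not derived here.)
    """
    means = [matvec(m_i, state_vector) for m_i in projections]
    scores = [sum(w * v for w, v in zip(w_i, state_vector)) for w_i in weight_vectors]
    return means, softmax(scores)


def param_counts(num_states, gaussians_per_state, num_shared, feat_dim, sub_dim):
    """Rough parameter counts for a conventional GMM-HMM and an SGMM.

    Conventional: every Gaussian stores a mean, a diagonal variance and a weight.
    SGMM: shared projections, weight vectors and full covariances,
    plus one sub_dim-dimensional vector per state.
    """
    conventional = num_states * gaussians_per_state * (2 * feat_dim + 1)
    shared = num_shared * (
        feat_dim * sub_dim + sub_dim + feat_dim * (feat_dim + 1) // 2
    )
    return conventional, shared + num_states * sub_dim


# Toy example: 2 shared Gaussians, 3-dimensional features, 2-dimensional state vector.
toy_projections = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
]
toy_weight_vectors = [[0.0, 0.0], [0.0, 0.0]]
means, weights = sgmm_state_params([1.0, 2.0], toy_projections, toy_weight_vectors)
print(means, weights)

# Hypothetical model sizes, chosen only to illustrate the compactness argument.
print(param_counts(5000, 16, 400, 39, 40))
```

In this sketch the per-state cost is a single vector of length `sub_dim`, so adding context-dependent states grows the model far more slowly than in the conventional layout, which is the compactness the abstract refers to.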
 