
Chinese accent identification based on multi-layered features with gender-dependent model

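Published: 2011-11-30  Views: 1664  Downloads: 494
Authors: HOU Jue (侯珏), LIU Yi (刘轶)
Affiliations: Center for Speech and Language Technologies, Division of Technology Innovation and Development, Tsinghua National Laboratory for Information Science and Technology; Department of Computer Science and Technology, Tsinghua University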
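Abstract (translated from the Chinese): A two-stage method for Chinese accent identification is proposed that uses multi-layered features and gender-dependent models. Conventional Mel frequency cepstrum coefficient (MFCC) parameters and pitch contour features, representing segmental and suprasegmental features respectively, are used to characterize Chinese accents. Pitch contour segments are fitted with cubic polynomials to model the differences between accents. In the staged scheme, gender-dependent Gaussian mixture models (GMM) describe the features. Because the conventional GMM-based approach cannot make use of multiple features, a support vector machine (SVM) is used to make the decision. The method is tested and evaluated on the 863 Chinese accent corpus. Compared with the conventional approach that uses only MFCC features and does not model gender, it achieves a relative error reduction of about 20%.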
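Keywords (translated from the Chinese): signal and information processing; Chinese accent identification; support vector machine; multiple features; gender-dependent model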
Title: Chinese accent identification based on multi-layered features with gender-dependent model
Author: HOU Jue, LIU Yi
Organization: Center for Speech and Language Technologies, Division of Technology Innovation and Development, Tsinghua National Laboratory for Information Science and Technology; Department of Computer Science and Technology, Tsinghua University
Abstract: This paper proposes a two-stage approach to Chinese accent identification that uses multi-layered features with gender-dependent models. Conventional Mel frequency cepstrum coefficients (MFCC) and pitch contour features, taken as representative segmental and suprasegmental features respectively, are combined to capture the characteristics of Chinese accents. Cubic polynomials are fitted to pitch contour segments to model the differences between accents. The two-stage scheme handles gender variation by training gender-dependent Gaussian mixture models (GMM) to describe the features. Because the conventional GMM-based decision rule cannot exploit multiple feature types, a support vector machine (SVM) is used to make the decision. The approach is evaluated on the 863 Chinese accent corpus. The results show a relative error rate reduction of about 20% compared with the conventional approach, which uses only MFCC features and does not model gender.
Key words: signal and information processing; Chinese accent identification; support vector machine; multi-layered features; gender-dependent model
Issue: No. 22, November 2011
Citation: HOU Jue, LIU Yi. Chinese accent identification based on multi-layered features with gender-dependent model[J]. 中国科技论文在线精品论文, 2011, 4(22): 2061-2067.
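
The abstract states that pitch contour segments are fitted with cubic polynomials. A minimal sketch of such a fit is given below, as an illustration only and not the authors' implementation. The function name `fit_cubic`, the normalization of time to [0, 1], and the use of the four fitted coefficients as a segment's feature vector are all assumptions; the abstract does not say how segments are delimited or which quantities are passed to the later stages.

```python
def fit_cubic(times, f0):
    """Least-squares fit of f0(u) ~ c0 + c1*u + c2*u^2 + c3*u^3.

    times: frame times of one voiced pitch-contour segment (increasing).
    f0:    pitch values (e.g. in Hz) at those times.
    Returns [c0, c1, c2, c3] for time normalized to u in [0, 1].
    (Hypothetical helper; the normalization is an assumption.)
    """
    if len(times) != len(f0) or len(times) < 4:
        raise ValueError("need at least 4 paired samples for a cubic fit")
    span = times[-1] - times[0]
    if span <= 0:
        raise ValueError("times must be increasing")
    # Normalize time so segments of different duration are comparable.
    u = [(t - times[0]) / span for t in times]

    # Normal equations A c = b for the monomial basis 1, u, u^2, u^3.
    n = 4
    A = [[sum(x ** (i + j) for x in u) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(u, f0)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


# Synthetic example: ten frames at a 10 ms step with made-up pitch values.
times = [i * 0.01 for i in range(10)]
f0 = [210.0, 214.0, 219.0, 222.0, 223.0, 221.0, 217.0, 212.0, 206.0, 201.0]
segment_features = fit_cubic(times, f0)  # four coefficients for this segment
```

In a pipeline like the one the abstract outlines, such per-segment coefficients could serve as the suprasegmental features alongside the MFCC features. How the two feature streams are scored by the gender-dependent GMMs and combined by the SVM is not specified in the abstract and is not sketched here.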
 