
Speech noise reduction model for railway passenger station scenes

  • Abstract: To further improve speech recognition in the noisy environment of railway passenger stations, this paper proposes ConformerGAN, a Conformer-based speech noise reduction model. Its training procedure resembles that of a generative adversarial network (GAN): the generator uses a Conformer to extract and model speech features, while the discriminator uses a proxy evaluation function to assess the perceptual quality of speech. To strengthen generalization and improve noise reduction on unseen noise, noise is overlaid by mixing in randomly cropped segments, and a noise dataset for railway passenger station scenes is constructed. Comparison with related speech noise reduction models shows that ConformerGAN raises the Perceptual Evaluation of Speech Quality (PESQ) score by 0.19, effectively improving speech recognition accuracy in noisy railway passenger stations and improving the voice interaction experience of railway passengers.
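The abstract describes overlaying noise by mixing in randomly cropped segments but gives no implementation details. The following is a minimal sketch of one way such augmentation could look, not the authors' method. The function name `mix_random_noise_segment`, the `snr_db` parameter, the SNR-based scaling, and the tiling of short noise clips are all assumptions introduced for illustration. Plain Python lists are used to keep the example dependency-free.

```python
import math
import random


def mix_random_noise_segment(speech, noise, snr_db, rng=None):
    """Overlay a randomly cropped noise segment on `speech`.

    Hypothetical helper: scaling to a target SNR (`snr_db`) is an
    assumption; the abstract only states that random segments are mixed in.
    """
    rng = rng or random.Random()
    n = len(speech)
    if len(noise) < n:
        # Assumption: tile short noise clips so a full-length crop exists.
        noise = list(noise) * math.ceil(n / len(noise))
    # A random start offset gives each training example a different slice.
    start = rng.randint(0, len(noise) - n)
    segment = noise[start:start + n]

    speech_power = sum(s * s for s in speech) / n
    noise_power = sum(v * v for v in segment) / n
    if speech_power == 0.0 or noise_power == 0.0:
        # Nothing to scale against; return the speech unchanged.
        return list(speech)
    # Choose gain so 10*log10(speech_power / scaled_noise_power) == snr_db.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(speech, segment)]
```

Calling the function repeatedly with the same clean utterance and noise recording yields a different noisy mixture each time, which is the property that random cropping is meant to provide for generalization to unseen noise.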

     
