王心雨, 景辉. 面向铁路旅客服务应用的语音识别模型研究[J]. 铁路计算机应用, 2022, 31(4): 7-15. DOI: 10.3969/j.issn.1005-8451.2022.04.02
WANG Xinyu, JING Hui. Research on speech recognition model for railway passenger service application[J]. Railway Computer Application, 2022, 31(4): 7-15. DOI: 10.3969/j.issn.1005-8451.2022.04.02

面向铁路旅客服务应用的语音识别模型研究

Research on speech recognition model for railway passenger service application

  • 摘要: 为扩大面向铁路旅客服务的语音识别应用,文章研究适用于铁路旅客服务应用的语音识别模型,使用基于卷积增强的Conformer编码结构和RNN-T模型结构,构建基于Conformer-Transducer的语音识别模型。由于卷积网络容易忽视输入信号整体与局部间关联,在Conformer结构中的卷积模块加入注意力机制,用以修正卷积模块的计算结果。构建铁路旅客服务语音数据集,对改进的语音识别模型进行测评;结果表明:改进后的语音识别模型准确率达到92.09%,相较于一般的Conformer-Transducer模型,语音识别字错误率降低0.33%。鉴于铁路旅客服务涉及铁路出行条例、旅客常问问题等众多文本信息,在语音识别模型中融入语言模型与热词赋权2种文本处理机制,使其在铁路专有名词的识别上优于通用的语音识别算法;文章研究提出的语音识别模型已应用于旅客常问问题查询设备和车站智能服务机器人,有助于提高铁路旅客服务水平,改善铁路旅客出行体验,促进铁路旅客服务工作实现减员增效。

     

    Abstract: In order to broaden the use of speech recognition in railway passenger services, this paper studies a speech recognition model suited to such applications. It combines the convolution-augmented Conformer encoder structure with the RNN-Transducer (RNN-T) model structure to build a Conformer-Transducer speech recognition model. Because convolutional networks tend to overlook the relationship between the input signal as a whole and its local parts, an attention mechanism is added to the convolution module of the Conformer structure to correct that module's output. A railway passenger service speech data set is built to evaluate the improved model. The results show that the improved model reaches an accuracy of 92.09% and reduces the character error rate by 0.33% compared with a general Conformer-Transducer model. Because railway passenger services involve a large amount of text information, such as railway travel regulations and passengers' frequently asked questions, two text-processing mechanisms, a language model and hot-word weighting, are integrated into the speech recognition model, which enables it to recognize railway-specific terms better than general-purpose speech recognition algorithms. The proposed model has been applied in passenger FAQ inquiry equipment and station intelligent service robots, which helps to raise the level of railway passenger services, improve the passenger travel experience, and allow railway passenger service work to reduce staff while increasing efficiency.
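    For readers unfamiliar with the metric, the character error rate mentioned in the abstract is conventionally computed as the character-level edit distance between the recognized text and the reference transcript, divided by the number of reference characters. The following is a minimal sketch of that conventional definition only; the function names and the sample sentences are hypothetical, and the abstract does not state the exact scoring procedure the authors used.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the first i-1 reference characters
    # and the first j hypothesis characters.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference character missing
                curr[j - 1] + 1,     # insertion: extra hypothesis character
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total character edits divided by total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_chars


if __name__ == "__main__":
    # Hypothetical example: one substituted character out of four.
    print(character_error_rate(["我要退票"], ["我要推票"]))
```

    In this hypothetical example one of the four reference characters is misrecognized, so the sketch reports a character error rate of 0.25.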

     
