
Research on the Application of Data Warehousing and Data Mining Techniques in Railway Freight

Abstract: In recent years, freight volume at China Railway Nanning Group Co., Ltd. (hereafter Nanning Bureau) has risen sharply, and the data held by its existing freight business systems has grown rapidly. Because the data is scattered across databases that each system built independently, complex freight queries and online analytical applications that span databases perform poorly. To fully exploit the value of freight business data assets, this article adopts a data warehouse product based on a massively parallel processing (MPP) cluster architecture to integrate, process, and efficiently store massive multi-source freight business data. This reduces the response time of online analytical queries from the minute level of traditional databases to the second level and lays the foundation for data mining research. An improved K-means clustering algorithm and a Naive Bayes classification algorithm are then applied to freight customer value analysis and loading failure prediction, respectively. The results show that a freight customer segmentation model built on the K-means algorithm helps the freight department quickly identify the value of different customers and provides a reliable basis for setting marketing direction and adjusting pricing strategy, while the predictions of a loading failure prediction model built on the Naive Bayes algorithm strengthen the ability of freight organization to foresee potential risks. These results help move freight management at Nanning Bureau from experience-driven to data-driven and support the high-quality development of its freight business.
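To make the customer segmentation idea concrete, the following is a minimal sketch of plain K-means (Lloyd's algorithm) applied to a handful of freight customers. It is not the improved K-means variant the article refers to, whose details the abstract does not give. The feature columns (annual tonnage, annual revenue, shipment count), the sample values, and the helper names `min_max_scale` and `kmeans` are all hypothetical and chosen only for illustration.

```python
import math


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [tuple((v - lo) / sp for v, lo, sp in zip(row, lows, spans)) for row in rows]


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means on a list of equal-length numeric tuples."""
    # Deterministic initialisation: use the first k points as starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (annual tonnage, annual revenue, shipment count).
customers = [
    (120.0, 15.0, 4),
    (900.0, 110.0, 40),
    (100.0, 12.0, 3),
    (950.0, 120.0, 45),
    (130.0, 14.0, 5),
    (870.0, 105.0, 38),
]

scaled = min_max_scale(customers)
labels, centroids = kmeans(scaled, k=2)
print(labels)
```

With these illustrative values the small-volume and large-volume customers fall into separate clusters. A production model would need real features drawn from the data warehouse, a principled choice of the number of clusters, and a more robust initialisation than taking the first k rows.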

     
