Improvement of the Apriori algorithm based on Hadoop and its application in the operation and maintenance of EMUs
Abstract: This paper addresses association rule mining for electric multiple unit (EMU) data at big-data scale and proposes T-MR-Apriori, an improved association rule mining algorithm based on the Apriori algorithm. The algorithm builds on Hadoop and completes the entire mining process in two MapReduce (MR) distributed computing passes, improving the efficiency and accuracy of association rule mining on massive data. Validation on actual EMU operation and maintenance data shows that the algorithm mines massive data quickly without degrading mining performance. The method is also applied to mining EMU traction motor operation and maintenance data, and the results are presented visually.
Keywords:
- big data
- EMU
- data mining
- Hadoop
- Apriori algorithm
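
The abstract states that the mining process finishes in two MapReduce passes but does not spell out how T-MR-Apriori divides the work. As an illustration only, the sketch below simulates one well-known two-pass scheme in plain Python: partition-based mining, where pass 1 runs Apriori locally on each data partition to collect candidate itemsets and pass 2 counts those candidates over the full data set. This is an assumed stand-in, not the paper's algorithm. The function names, the in-memory partitions, and the item labels are hypothetical, and no Hadoop API is used.

```python
import math
from collections import Counter
from itertools import combinations


def local_apriori(transactions, min_count):
    """Classic Apriori on one partition; returns locally frequent itemsets."""
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    frequent = {s for s, c in counts.items() if c >= min_count}
    result = set(frequent)
    k = 2
    while frequent:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        frequent = {s for s, c in counts.items() if c >= min_count}
        result |= frequent
        k += 1
    return result


def two_pass_frequent_itemsets(partitions, min_support):
    """Simulate two map/reduce passes over in-memory partitions (assumed scheme)."""
    parts = [[frozenset(t) for t in part] for part in partitions]
    total = sum(len(part) for part in parts)

    # Pass 1, map: mine each partition with a proportionally scaled threshold.
    # Rounding down can only add candidates; pass 2 removes false positives.
    # Pass 1, reduce: union the local results into one global candidate set.
    candidates = set()
    for part in parts:
        local_min = max(1, math.floor(min_support * len(part)))
        candidates |= local_apriori(part, local_min)

    # Pass 2, map: count every candidate within each partition.
    # Pass 2, reduce: sum the counts and keep itemsets meeting global support.
    counts = Counter()
    for part in parts:
        for t in part:
            counts.update(c for c in candidates if c <= t)
    return {s: n / total for s, n in counts.items() if n / total >= min_support}


# Hypothetical event labels standing in for operation and maintenance records.
partitions = [
    [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}],
    [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}],
]
for itemset, support in sorted(
    two_pass_frequent_itemsets(partitions, 0.5).items(),
    key=lambda kv: (len(kv[0]), sorted(kv[0])),
):
    print(sorted(itemset), round(support, 3))
```

The scheme works in two passes because any itemset that is frequent over the whole data set must be frequent in at least one partition, so the union of local results from pass 1 cannot miss a globally frequent itemset. Rule generation (confidence filtering) would follow from the returned supports and is omitted here.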