Research on railway large vision model
Abstract: Studying the key technologies and applications of the railway large vision model is of far-reaching significance for coordinating and promoting the development of railway artificial intelligence. Relying on the computing power and large-model support components of the railway artificial intelligence platform, this paper proposes an architecture that progresses from a basic large model to a railway large vision model and then to railway scenario large vision models. Starting from a basic large model, the paper designs a model training framework and applies model pruning and multi-scale inference to balance inference speed and accuracy, completing the construction of the railway large vision model. The paper also proposes application scenarios for the railway large vision model and selects the intelligent recognition scenario of line environment safety management and control to verify the model's capabilities. The experimental results show that the railway large vision model performs strongly in detecting small targets against complex backgrounds and has good application prospects. It is expected to play an increasingly important role in railway transportation safety, mobile equipment detection, railway passenger and freight transport services, and other business domains.
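Table 1 Experimental environment configuration
Hardware configuration    Model / specification
CPU                       Intel(R) Xeon(R) Gold 5218 CPU @ 2.30 GHz
GPU                       NVIDIA GeForce RTX 6000Ada
GPU memory                48 GB
Operating system version  Ubuntu 18.04.6 LTS
CUDA                      11.7

Table 2 Core model training parameters
Parameter                     Value
Optimizer                     AdamW
Optimizer learning rate       2e-05
Optimizer weight decay        0.05
Layer decay rate              0.9
Maximum number of iterations  120

Table 3 Comparison of experimental results
Model                                                mAP50  mAP50:95  Inference speed (ms/image)
Railway large vision model                           87.42  70.91     357.5
Railway large vision model (after pruning)           85.35  65.81     201.7
YOLOv8                                               81.35  61.49     30.2
Faster R-CNN                                         66.31  39.94     68.2
ViT-Adapter-L                                        86.25  63.72     302.3
Railway scenario large vision model                  94.64  77.48     341.5
Railway scenario large vision model (after pruning)  92.85  75.61     198.7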
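The speed and accuracy trade-off that pruning introduces can be read directly off Table 3. The short Python sketch below only restates that arithmetic; the dictionary keys and the function name `pruning_tradeoff` are illustrative choices and do not come from the paper.

```python
# Values copied from Table 3: (mAP50, mAP50:95, inference time in ms per image).
RESULTS = {
    "railway": {"full": (87.42, 70.91, 357.5), "pruned": (85.35, 65.81, 201.7)},
    "scenario": {"full": (94.64, 77.48, 341.5), "pruned": (92.85, 75.61, 198.7)},
}


def pruning_tradeoff(full, pruned):
    """Return the latency reduction (%) and the mAP drops (points) after pruning."""
    map50_full, map5095_full, ms_full = full
    map50_pruned, map5095_pruned, ms_pruned = pruned
    return {
        # Share of per-image inference time saved by pruning.
        "latency_reduction_pct": round(100.0 * (ms_full - ms_pruned) / ms_full, 1),
        # Absolute accuracy lost, in mAP points.
        "map50_drop": round(map50_full - map50_pruned, 2),
        "map50_95_drop": round(map5095_full - map5095_pruned, 2),
    }


if __name__ == "__main__":
    for name, rows in RESULTS.items():
        print(name, pruning_tradeoff(rows["full"], rows["pruned"]))
```

For the railway large vision model, pruning cuts per-image inference time by about 43.6% at a cost of 2.07 mAP50 points and 5.10 mAP50:95 points. For the railway scenario large vision model, the reduction is about 41.8% at a cost of 1.79 and 1.87 points respectively.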