
Appearance anomaly detection method for trackside cable troughs based on the EGA-CLIP multimodal large model

Abstract: To address image degradation, a shortage of learnable features, and the scarcity of anomalous samples in railway trackside cable inspection images, this paper proposes EGA-CLIP, a multimodal large model that introduces an Edge-Guided Attention (EGA) module into the Contrastive Language-Image Pretraining (CLIP) architecture, together with an appearance anomaly detection method for trackside cable troughs built on it. The detection pipeline combines Contrast Limited Adaptive Histogram Equalization (CLAHE) enhancement, YOLO (You Only Look Once) v11 localization, and Gaussian filtering to improve input image quality. Multi-scale fusion of Canny-Sobel edge features with Vision Transformer features strengthens structural perception and produces anomaly segmentation maps. Experimental results show that EGA-CLIP reaches 99.00% pixel-level area under the receiver operating characteristic curve (Pixel AUROC), 89.52% image-level AUROC (Image AUROC), and 99.19% accuracy, outperforming the comparison models. It also generalizes well in few-shot scenarios and can provide a reliable solution for inspecting railway trackside equipment.
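The two AUROC figures reported above measure different things: Pixel AUROC ranks every pixel of every anomaly map against its ground-truth mask, while Image AUROC ranks one score per image. The following is a minimal sketch of that distinction, not the authors' evaluation code. The function names are hypothetical, and taking the maximum of the anomaly map as the image-level score is an assumed (though common) convention that the abstract does not confirm.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one 2-D anomaly map or binary mask, as nested lists


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores pairs[i:j].
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def pixel_auroc(maps: Sequence[Grid], masks: Sequence[Grid]) -> float:
    """Pool all pixels of all images, then rank pixel scores against mask labels."""
    scores = [v for m in maps for row in m for v in row]
    labels = [int(v) for m in masks for row in m for v in row]
    return auroc(scores, labels)


def image_auroc(maps: Sequence[Grid], masks: Sequence[Grid]) -> float:
    """One score per image. Assumption: image score = max of its anomaly map."""
    scores = [max(v for row in m for v in row) for m in maps]
    labels = [int(any(v for row in m for v in row)) for m in masks]
    return auroc(scores, labels)


if __name__ == "__main__":
    # Toy data: the first image contains one anomalous pixel, the second is normal.
    toy_maps = [[[0.1, 0.2], [0.9, 0.3]], [[0.1, 0.4], [0.2, 0.1]]]
    toy_masks = [[[0, 0], [1, 0]], [[0, 0], [0, 0]]]
    print("Pixel AUROC:", pixel_auroc(toy_maps, toy_masks))
    print("Image AUROC:", image_auroc(toy_maps, toy_masks))
```

Because pixel-level evaluation pools a very large number of mostly normal pixels, it is usual for the two numbers to differ, as they do in the reported 99.00% and 89.52%.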

     
