Abstract:
This paper proposes a method for building a railway multimodal knowledge-base question-answering system based on hybrid retrieval-augmented generation (Hybrid RAG). The method addresses four problems: inefficient querying of railway line design specifications, deviations in the implementation of track maintenance and repair standards, difficulties in cross-departmental collaboration, and barriers to multidisciplinary knowledge fusion in energy-saving train traction decision-making. A large language model is deployed on a local server and fine-tuned with the DyLoRA framework. A dual-database architecture based on MongoDB and PostgreSQL is built around differentiated processing schemes for unstructured text and tabular data. Text, tables, drawings, and vector images are processed in a unified multimodal pipeline that uses table encoding with a table-oriented large model, vectorization of vector images, and multimodal image embedding. Retrieval combines semantic vector retrieval with full-text sparse retrieval and applies strategies such as reranking and search filtering to improve retrieval quality and reduce the risk of model hallucination. After deployment and testing on multiple platforms, including Web and mobile devices, the method can significantly improve the efficiency of specification queries and the accuracy of decision-making in railway line design, track maintenance and repair, and train traction scenarios, thereby promoting the intelligent development of the railway industry.
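The hybrid retrieval step summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that the dense and sparse result lists are merged by reciprocal rank fusion (RRF), which the abstract does not specify, and it substitutes toy three-dimensional embeddings and a term-overlap score for the real embedding model and full-text index. All function names, documents, and vectors are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_rank(query_vec, doc_vecs):
    """Rank document ids by cosine similarity to the query embedding."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)


def sparse_rank(query, docs):
    """Rank document ids by query-term overlap (toy stand-in for full-text search)."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(text.lower().split())) for d, text in docs.items()}
    return sorted(docs, key=lambda d: scores[d], reverse=True)


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per document."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


def hybrid_search(query, query_vec, docs, doc_vecs):
    """Fuse the dense and sparse rankings into one candidate list."""
    return rrf_fuse([dense_rank(query_vec, doc_vecs), sparse_rank(query, docs)])


# Hypothetical toy corpus and embeddings, for illustration only.
docs = {
    "a": "minimum curve radius for line design",
    "b": "track maintenance inspection interval",
    "c": "traction energy saving driving strategy",
}
doc_vecs = {"a": [0.9, 0.1, 0.0], "b": [0.1, 0.9, 0.1], "c": [0.0, 0.2, 0.9]}

results = hybrid_search("track maintenance interval", [0.2, 0.8, 0.1], docs, doc_vecs)
print(results)
```

In a full system, the fused candidate list would then be passed to a reranker and any metadata filters before the top passages are given to the language model.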