Abstract:
The massive volume of data generated by railway informatization poses challenges to data security, making research on sensitive data identification methods particularly urgent. This paper systematically analyzes the current research status and development trends of sensitive data identification methods domestically and internationally, summarizes multidimensional sensitive data identification methods and their classifications, and compares in depth the methods based on rule matching with those based on machine learning. Rule matching-based methods are quick to set up and have low resource requirements, which makes them suitable for identifying sensitive data that follows specific patterns. Machine learning-based methods are highly adaptable and handle unstructured data better, improving identification accuracy and efficiency. The choice among methods should weigh factors such as the application scenario, the properties of the data, and the available resources. This study can provide theoretical support for data security in the railway field.