
Risk user identification based on horizontal federated learning

FAN Chunmei, YANG Lipeng, LI Wen, ZHANG Zhi

Citation: FAN Chunmei, YANG Lipeng, LI Wen, ZHANG Zhi. Risk user identification based on horizontal federated learning[J]. Railway Computer Application, 2024, 33(8): 72-77. DOI: 10.3969/j.issn.1005-8451.2024.08.12


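Funding: China State Railway Group Co., Ltd. 2023 Systematic Major Project (P2023W002); Project of the Institute of Computing Technologies, China Academy of Railway Sciences Corporation Limited (DZYF23-11)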
    Author biographies:

    FAN Chunmei, assistant researcher

    YANG Lipeng, associate researcher

  • CLC number: U293.221; TP39

Risk user identification based on horizontal federated learning

  • Abstract:

    Ticket-grabbing services offered by third-party platforms place heavy load on the China Railway 12306 Internet ticketing system (hereafter 12306). To keep 12306 stable and ticket purchasing fair for passengers, risk users need to be identified. Because 12306 is deployed at several physical locations and aggregating data across centers carries risk, this paper studies a risk user identification method based on horizontal federated learning for the case where user data are distributed. Starting from users' access behavior, the paper constructs and extracts user features, builds horizontal federated learning models based on XGBoost, logistic regression, and neural networks, and validates them. The experimental results show that the horizontal federated learning model based on XGBoost identifies risk users well, providing technical support for the secure use of railway data.
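
    The idea of horizontal federated learning described in the abstract can be illustrated with a minimal sketch: each center trains on its own rows (same feature columns, different users) and only model parameters are shared and averaged. The sketch below uses plain federated averaging of logistic regression weights on synthetic data. This page does not state which aggregation protocol or framework the paper uses, so the averaging rule, the function names (local_update, fed_avg, train_federated), the two synthetic centers, and all hyperparameters are assumptions for illustration only; a production system would typically also protect the exchanged parameters.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, rows, labels, lr=0.5, epochs=5):
    """A few epochs of batch gradient descent on one center's private data."""
    w = list(weights)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def make_center(rng, n):
    """Synthetic stand-in for one center: label is 1 when the feature is positive."""
    rows, labels = [], []
    for _ in range(n):
        f = rng.uniform(-1.0, 1.0)
        rows.append([1.0, f])  # bias term + one feature
        labels.append(1 if f > 0 else 0)
    return rows, labels


def accuracy(w, rows, labels):
    hits = sum(
        (sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= 0.5) == bool(y)
        for x, y in zip(rows, labels)
    )
    return hits / len(rows)


def train_federated(rounds=20, seed=0):
    """Raw rows never leave a center; only weight vectors are exchanged."""
    rng = random.Random(seed)
    centers = [make_center(rng, 200), make_center(rng, 300)]
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_w, rows, labels) for rows, labels in centers]
        global_w = fed_avg(updates, [len(rows) for rows, _ in centers])
    return global_w, centers


if __name__ == "__main__":
    w, centers = train_federated()
    for i, (rows, labels) in enumerate(centers, start=1):
        print(f"center {i} accuracy: {accuracy(w, rows, labels):.3f}")
```

    Weighting the average by sample count keeps a larger center from being under-represented in the shared model; other weighting schemes are possible.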

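  • Figure 1   User ticket purchasing process

    Figure 2   User feature mining architecture

    Figure 3   Partial feature set of 12306 users

    Figure 4   Workflow of the Fed_XGb model components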

    Table 1   Weights of selected parameters of the Fed_lr model

    Parameter                               Weight
    Intercept (constant term)               -1.978
    len_full                                -1.03212
    min_dur                                 -0.94088
    len_uniq                                -0.89904
    getwaittime_num_5min                    -0.37734
    confirmpassengerinfosingle_num_15min    -0.29207
    url3                                    -0.28003
    querypassenger_num_5min                 -0.24953
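
    As a rough illustration of how logistic regression weights such as those in Table 1 turn into a score, the sketch below applies the standard logistic function to a weighted sum of feature values. Table 1 lists only part of the model's parameters, and this page does not say how the features are scaled or which class is treated as positive, so the function name, the treatment of unlisted or missing features as zero, and the example inputs are all assumptions; the result is not the paper's actual model output.

```python
import math

# Partial weights copied from Table 1 (the full Fed_lr model has more parameters).
INTERCEPT = -1.978
PARTIAL_WEIGHTS = {
    "len_full": -1.03212,
    "min_dur": -0.94088,
    "len_uniq": -0.89904,
    "getwaittime_num_5min": -0.37734,
    "confirmpassengerinfosingle_num_15min": -0.29207,
    "url3": -0.28003,
    "querypassenger_num_5min": -0.24953,
}


def positive_class_probability(features):
    """Logistic score from the listed weights; missing features count as 0 (assumption)."""
    z = INTERCEPT + sum(
        weight * features.get(name, 0.0) for name, weight in PARTIAL_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical, already-scaled feature values for two users.
    print(positive_class_probability({"len_full": -1.0, "min_dur": -1.0}))
    print(positive_class_probability({"len_full": 1.0, "min_dur": 1.0}))
```

    Because every listed weight is negative, smaller values of these features push the score toward the positive class under this reading.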

    Table 2   Metrics on the Center 1 dataset

    Model      AUC      F1-score   Accuracy   Recall   Precision
    Fed_XGb    0.9856   0.9061     0.9591     0.9490   0.8670
    Fed_lr     0.9545   0.7948     0.9036     0.8981   0.7129
    Fed_nn     0.9550   0.8130     0.9102     0.9393   0.7167
    XGBoost    0.9868   0.8828     0.9510     0.9389   0.8330

    Table 3   Metrics on the Center 2 dataset

    Model      AUC      F1-score   Accuracy   Recall   Precision
    Fed_XGb    0.9837   0.8715     0.9444     0.9491   0.8056
    Fed_lr     0.9609   0.8168     0.9186     0.9135   0.7387
    Fed_nn     0.9680   0.8598     0.9394     0.9364   0.7948
    XGBoost    0.9825   0.8543     0.9335     0.9443   0.7800
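
    The F1-score columns in Tables 2 and 3 can be cross-checked against the reported recall and precision, since F1 is the harmonic mean of the two. The short sketch below recomputes F1 for every row; the dictionary layout and names are illustrative, and small gaps are expected because the published values are rounded to four decimals.

```python
# (recall, precision, reported F1) copied from Tables 2 and 3.
REPORTED = {
    ("center1", "Fed_XGb"): (0.9490, 0.8670, 0.9061),
    ("center1", "Fed_lr"): (0.8981, 0.7129, 0.7948),
    ("center1", "Fed_nn"): (0.9393, 0.7167, 0.8130),
    ("center1", "XGBoost"): (0.9389, 0.8330, 0.8828),
    ("center2", "Fed_XGb"): (0.9491, 0.8056, 0.8715),
    ("center2", "Fed_lr"): (0.9135, 0.7387, 0.8168),
    ("center2", "Fed_nn"): (0.9364, 0.7948, 0.8598),
    ("center2", "XGBoost"): (0.9443, 0.7800, 0.8543),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def f1_gaps(reported):
    """Absolute difference between recomputed and reported F1 for each row."""
    return {
        key: abs(f1_score(precision, recall) - f1)
        for key, (recall, precision, f1) in reported.items()
    }


if __name__ == "__main__":
    for (center, model), gap in f1_gaps(REPORTED).items():
        print(f"{center} {model}: gap {gap:.5f}")
```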
    [19] 曲 强,于洪涛,黄瑞阳. 社交网络异常用户检测技术研究进展[J]. 网络与信息安全学报,2018,4(3):13-23.
Publication history
  • Received: 2024-05-30
  • Published: 2024-08-24
