
Prediction and analysis model for ground peak acceleration based on XGBoost and SHAP

QI Wanwan, SUN Rui, ZHENG Tong, QI Jinlei

Citation: QI Wanwan, SUN Rui, ZHENG Tong, QI Jinlei. Prediction and analysis model for ground peak acceleration based on XGBoost and SHAP[J]. Chinese Journal of Geotechnical Engineering, 2023, 45(9): 1934-1943. DOI: 10.11779/CJGE20220417


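Funding:

Special Fund for Basic Scientific Research of the Institute of Engineering Mechanics, China Earthquake Administration (2020C04)

Joint Guidance Project of the Natural Science Foundation of Heilongjiang Province (LH2020E019)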

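More information
    About the author:

    QI Wanwan (born 1995), female, master's student, mainly engaged in research on geotechnical earthquake engineering. E-mail: iem_qiww@163.com

    Corresponding author:

    ZHENG Tong, E-mail: Zhengt0928@163.com

  • Chinese Library Classification (CLC) number: P315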

Prediction and analysis model for ground peak acceleration based on XGBoost and SHAP

  • Abstract: To predict the peak ground acceleration (PGA) at the surface without relying on a soil constitutive model, using only the main characteristics of the ground motion and the site, 3104 sets of bedrock and surface ground-motion records are collected from the KiK-net strong-motion seismograph network in Japan, and six feature parameters are chosen through feature selection. The input ground motion is characterized by its peak acceleration and its predominant frequency. The site is characterized by the soil depth at which the shear wave velocity reaches 800 m/s, the fundamental period of the site, the bedrock shear wave velocity and the surface shear wave velocity. An XGBoost model is then built on these six features to predict the PGA. Comparison with the recorded data and with one-dimensional numerical simulations shows that the XGBoost predictions are stable and reproduce the PGA well: the coefficient of determination is 0.958 for the training set and 0.925 for the test set, and the mean absolute percentage errors are about 20% (Table 3), which is better than the one-dimensional numerical simulation methods. SHAP is also introduced to analyze how each input feature influences the predicted results and how the predictions depend on it, which enhances the interpretability of the model and supports the reliability of the predictions.
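The SHAP analysis mentioned in the abstract rests on Shapley values, which split a prediction among the input features. The sketch below is only an illustration of that idea and is not the authors' code: it computes exact Shapley values by enumerating all feature coalitions for a hypothetical three-feature `toy_model`, with features outside a coalition replaced by an assumed all-zero `baseline`. The SHAP library uses faster, tree-specific algorithms and a background data set instead of a fixed baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take the sample value, the rest the baseline.
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(v):
    # Hypothetical model with one interaction term (not a PGA model).
    return 2.0 * v[0] + 3.0 * v[1] + v[0] * v[2]


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for `x` and the prediction for `baseline`, which is the additivity property that SHAP summary, dependence and waterfall plots rely on. The interaction term is shared equally between the two features involved.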
  • Figure 1.   Distribution of PGA

    Figure 2.   Ranking of feature importance

    Figure 3.   Predicted residuals of training and test sets

    Figure 4.   Comparison of MAPE values among different categories

    Figure 5.   SHAP summary plot

    Figure 6.   SHAP dependence plots

    Figure 7.   Waterfall plots of predicted results for specific samples

    Table 1   Numbers of stations and ground-motion records by site class

    Site class    Stations    Ground-motion records
    Class II      6           1524
    Class III     26          1296
    Class IV      8           284
    Total         40          3104

    Table 2   Best hyperparameters of XGBoost regression model

    Parameter        Value
    n_estimators     500
    learning_rate    0.42
    subsample        0.6
    booster          gbtree
    max_depth        2
    reg_alpha        0
    reg_lambda       16
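As a rough illustration of how the Table 2 values could be passed to the scikit-learn style interface of XGBoost, the sketch below stores them in a dictionary and builds an untrained `XGBRegressor`. The hyperparameter values come from Table 2. The column names in `FEATURES`, the `build_model` helper and the `random_state` argument are assumptions made for this example and are not given in the source.

```python
# Hyperparameters copied from Table 2; everything else here is illustrative.
BEST_PARAMS = {
    "n_estimators": 500,
    "learning_rate": 0.42,
    "subsample": 0.6,
    "booster": "gbtree",
    "max_depth": 2,
    "reg_alpha": 0,
    "reg_lambda": 16,
}

# Hypothetical column names for the six inputs described in the abstract.
FEATURES = [
    "pga_bedrock",    # peak acceleration of the input (bedrock) motion
    "f_predominant",  # predominant frequency of the input motion
    "depth_vs800",    # soil depth at which shear wave velocity reaches 800 m/s
    "site_period",    # fundamental period of the site
    "vs_bedrock",     # bedrock shear wave velocity
    "vs_surface",     # surface shear wave velocity
]


def build_model(random_state=0):
    """Return an untrained XGBRegressor configured with the Table 2 values."""
    # Imported lazily so the configuration can be inspected without xgboost installed.
    from xgboost import XGBRegressor
    return XGBRegressor(random_state=random_state, **BEST_PARAMS)
```

With a training table whose columns follow `FEATURES`, a call such as `build_model().fit(X_train, y_train)` would train the regressor. The shallow trees (`max_depth` of 2) combined with strong L2 regularization (`reg_lambda` of 16) suggest that the tuned model favors many simple trees over a few deep ones.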

    Table 3   Predicted results by XGBoost model

    Data set        MAE     RMSE     MAPE    R²
    Training set    10.3    14.55    0.18    0.958
    Test set        13.7    20.06    0.20    0.925
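The four indicators in Table 3 follow standard definitions, and the sketch below shows one way to compute them in plain Python. The sample values are invented for illustration and are not taken from the KiK-net data; MAPE is returned as a fraction, which matches the 0.18 and 0.20 entries in the table.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a fraction) and R2 for paired observations."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = sqrt(ss_res / n)
    # MAPE divides each absolute error by the observed value, so y_true must be nonzero.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Made-up observed and predicted values, used only to exercise the function.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 380.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because MAPE is relative to the observed value, it weights errors on weak motions more heavily than MAE or RMSE do, which is one reason to report it alongside the absolute error measures.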

Publication history
  • Received:  2022-04-06
  • Published online:  2023-09-06
  • Issue date:  2023-08-31
