Prediction and analysis model for ground peak acceleration based on XGBoost and SHAP
Abstract: To establish a method for predicting the peak ground acceleration (PGA) that does not depend on a soil constitutive model and relies only on the main characteristics of the ground motion and the site, six feature parameters are selected through feature selection from 3104 sets of bedrock and surface ground-motion records collected by the KiK-net strong-motion seismograph network in Japan. The input ground motion is characterized by its peak acceleration and predominant frequency. The site is characterized by the depth at which the shear wave velocity reaches 800 m/s, the fundamental site period, the bedrock shear wave velocity, and the surface shear wave velocity. An XGBoost model is then built to predict the PGA from these six features. Comparison with the recorded data and with one-dimensional numerical simulation results shows that the XGBoost model gives stable predictions and predicts the PGA well: the coefficients of determination for the training and test sets are both greater than 0.925 and the mean absolute percentage errors are both about 20%, which is clearly better than the one-dimensional numerical simulation methods. SHAP is also introduced to analyze how the input features influence the predictions and how the predictions depend on them, which improves the interpretability of the model and supports the reliability of its predictions.
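The idea behind SHAP can be illustrated with a brute-force Shapley value calculation. The sketch below is not the paper's model or the `shap` library: `toy_model`, the three-feature input `x`, and the all-zero `baseline` are hypothetical stand-ins chosen only to show how a prediction is split into per-feature contributions. Features outside a coalition are replaced by their baseline value, which is one common simplification and an assumption here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Enumerates every coalition of features, so the cost grows as 2**n;
    this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Features in the coalition keep their real value,
        # all other features are set to the baseline value.
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical response with one interaction term (z[0] * z[1]).
    return z[0] * (1.0 + 0.5 * z[1]) + 2.0 * z[2]


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions in `phi` add up to `toy_model(x) - toy_model(baseline)`, which is the additivity property that makes the per-feature values readable as a decomposition of a single prediction. For tree ensembles such as XGBoost, a dedicated tree algorithm would normally be used instead of this enumeration.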
Keywords: machine learning; XGBoost; prediction of PGA; SHAP; interpretability
Table 1 Numbers of stations and records of ground motion at various sites

| Site class | Stations | Ground-motion records |
| --- | --- | --- |
| Class II | 6 | 1524 |
| Class III | 26 | 1296 |
| Class IV | 8 | 284 |
| Total | 40 | 3104 |
Table 2 Best hyperparameters of XGBoost regression model

| Parameter | Value |
| --- | --- |
| n_estimators | 500 |
| learning_rate | 0.42 |
| subsample | 0.6 |
| booster | gbtree |
| max_depth | 2 |
| reg_alpha | 0 |
| reg_lambda | 16 |
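As a minimal sketch, the Table 2 values can be collected in a dictionary and passed to the scikit-learn style `XGBRegressor` wrapper. The helper name `build_model` is hypothetical, and the import is guarded because the `xgboost` package may not be installed; the source does not show the authors' actual training code.

```python
# Hyperparameter values copied from Table 2.
BEST_PARAMS = {
    "n_estimators": 500,
    "learning_rate": 0.42,
    "subsample": 0.6,
    "booster": "gbtree",
    "max_depth": 2,
    "reg_alpha": 0,
    "reg_lambda": 16,
}


def build_model(params=BEST_PARAMS):
    """Return an unfitted XGBRegressor, or None if xgboost is unavailable.

    build_model is an illustrative helper, not part of the paper.
    """
    try:
        from xgboost import XGBRegressor
    except ImportError:
        return None
    return XGBRegressor(**params)


model = build_model()
```

If a model is returned, it would be fitted in the usual way with `model.fit(X_train, y_train)`, where the six feature columns and the PGA target are whatever arrays the reader has prepared.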
Table 3 Predicted results by XGBoost model

| Data set | MAE | RMSE | MAPE | R² |
| --- | --- | --- | --- | --- |
| Training set | 10.3 | 14.55 | 0.18 | 0.958 |
| Test set | 13.7 | 20.06 | 0.20 | 0.925 |
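The four evaluation metrics in Table 3 follow their standard definitions, sketched below in plain Python. The sample values in `y_true` and `y_pred` are made up for illustration and are not taken from the paper's data. MAPE is returned as a fraction, matching the way Table 3 reports it (0.20 rather than 20%).

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (requires nonzero targets)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted values, for illustration only.
y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]

scores = {
    "MAE": mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "R2": r2(y_true, y_pred),
}
```

MAE and RMSE carry the units of the predicted quantity, while MAPE and R² are dimensionless, which is why the two pairs sit on such different scales in Table 3.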