Evaluation of geological disaster susceptibility based on machine learning averaging

Abstract: Building a geological disaster susceptibility model and carrying out susceptibility evaluation is of great significance for improving the efficiency and accuracy of regional geological disaster forecasting and early warning. However, how to build a susceptibility model that both fits regional conditions and can be generalized to other areas remains a key scientific problem restricting geological disaster forecasting and early warning. Using the 2015 detailed geological disaster survey data of Nanhua County, Yunnan Province, 11 factors were selected: distance from roads, distance from rivers, distance from faults, soil type, precipitation, land use type, rock mass type, vegetation coverage, slope, aspect, and elevation. Susceptibility was then evaluated with the mean method, which averages three gradient boosting tree algorithms (XGBoost, LightGBM, CatBoost), and compared against the information value model and the geographically weighted regression model. The results show that: ① the predictions of the geographically weighted regression model were over-fitted, while those of the information value model were under-fitted; ② the mean method performed best, with an AUC (Area Under Curve) value of 0.9337, an accuracy improvement of 1.7%, 1.8%, 2.0%, 3.8%, and 4.0% over the geographically weighted regression model, XGBoost, LightGBM, CatBoost, and the information value model, respectively; ③ CatBoost was the worst at predicting positive samples but the best at predicting negative samples, whereas XGBoost was the best at predicting positive samples but poor at predicting negative samples, and the mean method based on the three gradient boosting algorithms clearly improved prediction accuracy for both positive and negative samples; ④ the main triggers of geological disasters in Nanhua County were road construction, fault activity, rainfall scouring, and river erosion, and the high-susceptibility areas were concentrated along rivers, roads, and faults.
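The mean method and the AUC metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the mean method is an equal-weight average of the per-sample probabilities from the three gradient boosting models; the abstract does not state the weighting, so this is an assumption. The function names (`mean_ensemble`, `auc`) and all score values are hypothetical toy values chosen for illustration, not data or results from the study.

```python
from statistics import mean


def mean_ensemble(*model_scores):
    """Average per-sample probabilities across models (equal weights, assumed)."""
    return [mean(sample) for sample in zip(*model_scores)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disaster point (positive), 0 = non-disaster point.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical predicted probabilities standing in for the three models.
xgb_scores = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2]
lgbm_scores = [0.8, 0.4, 0.7, 0.3, 0.5, 0.1]
cat_scores = [0.7, 0.3, 0.6, 0.2, 0.1, 0.4]

ensemble = mean_ensemble(xgb_scores, lgbm_scores, cat_scores)

for name, scores in [
    ("XGBoost", xgb_scores),
    ("LightGBM", lgbm_scores),
    ("CatBoost", cat_scores),
    ("Mean method", ensemble),
]:
    print(f"{name}: AUC = {auc(labels, scores):.4f}")
```

In this toy setup each model misranks a different sample, so averaging cancels the individual errors. That is one plausible reading of why an average of models with complementary strengths on positive and negative samples could outperform each of them; real data will not behave this cleanly.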

     
