Smart Agriculture

Geographically Weighted Random Forest for County-scale Digital Mapping of Soil Organic Matter in the Central Shandong Mountains

ZHANG Shulin1, CUI Liqin3, LIU Jian1, ZHANG Canting1, WANG Hongjia1, ZHANG Tingting1, WANG Ailing1,2

  1. College of Resources and Environment, Shandong Agricultural University, Tai'an 271018, China
  2. National Engineering Research Center for Efficient Utilization of Soil and Fertilizer, Shandong Agricultural University, Tai'an 271018, China
  3. Yiyuan County Agriculture and Rural Affairs Bureau, Zibo 256100, China
  • Received: 2025-08-21 Online: 2026-01-06
  • Foundation items:
    National Natural Science Foundation of China (42171378); Natural Science Foundation of Shandong Province (ZR2021MD018); Special Funds of Taishan Scholar of Shandong Province (tsqnz20231205)
  • About author:

    ZHANG Shulin, master's student; research interest: digital soil mapping. E-mail:

  • Corresponding author:
    WANG Ailing, Ph.D., professor; research interests: land use and information technology. E-mail:

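Abstract (Chinese edition):

[Objective/Significance] Soil organic matter (SOM) is an important indicator of soil fertility and soil quality. In mountainous areas with complex terrain, SOM varies strongly in space and conventional digital soil mapping methods achieve limited prediction accuracy. To address this problem, this study introduces the geographically weighted random forest (GWRF) model and verifies its applicability and accuracy advantage for county-scale SOM digital mapping in complex terrain, with the goal of predicting SOM content spatially with high accuracy. Such predictions matter for the rational use and scientific management of soil resources.

[Methods] Yiyuan County, a typical mountainous agricultural area in Shandong Province, was taken as the study area. Using measured SOM sample data and environmental variables covering climate, topography, soil, vegetation, and land use, a GWRF model that combines local spatial modeling with nonlinear modeling capability was built to predict and map SOM in the study area. Its accuracy was compared with that of ordinary kriging (OK), multiple linear regression (MLR), geographically weighted regression (GWR), and random forest (RF) models.

[Results and Discussion] The GWRF model performed best, with a coefficient of determination (R²) of 0.48 and a root mean square error (RMSE) of 5.12 g/kg. Compared with the OK, MLR, GWR, and RF models, R² increased by 0.24, 0.16, 0.13, and 0.07, and RMSE decreased by 1.06, 0.73, 0.59, and 0.36 g/kg, respectively. SOM content in the study area was low overall, higher in the center and lower in the southwest and northeast, and was mainly influenced by environmental factors such as soil type, annual evapotranspiration, slope, and sand content.

[Conclusions] The results demonstrate that the GWRF model has a clear advantage for county-scale SOM prediction in complex terrain and can provide technical support for high-accuracy SOM digital mapping in the context of the Third National Soil Census.

Keywords: soil organic matter, digital soil mapping, mountainous area, geographically weighted random forest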
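
The accuracy gains reported in the abstract can be turned back into approximate metrics for each reference model. The short Python sketch below does this arithmetic, assuming the reported gains are simple differences between rounded values; the function and variable names are illustrative and do not come from the paper. The implied baseline RMSE values are therefore approximations, not figures reported by the authors.

```python
# Metrics reported for the GWRF model in the abstract.
GWRF_R2 = 0.48
GWRF_RMSE = 5.12  # g/kg

# Reported gains of GWRF over each reference model:
# (increase in R2, decrease in RMSE in g/kg).
GAINS = {
    "OK": (0.24, 1.06),
    "MLR": (0.16, 0.73),
    "GWR": (0.13, 0.59),
    "RF": (0.07, 0.36),
}


def implied_baselines(gwrf_r2, gwrf_rmse, gains):
    """Return {model: (R2, RMSE)} implied by the reported differences.

    Assumption: each gain is a plain difference of two-decimal values,
    so the results are rounded back to two decimals.
    """
    return {
        model: (round(gwrf_r2 - d_r2, 2), round(gwrf_rmse + d_rmse, 2))
        for model, (d_r2, d_rmse) in gains.items()
    }


baselines = implied_baselines(GWRF_R2, GWRF_RMSE, GAINS)
for model, (r2, rmse) in baselines.items():
    print(f"{model}: R2 = {r2:.2f}, RMSE = {rmse:.2f} g/kg")
```

Under this assumption the implied R² values are 0.24 for OK, 0.32 for MLR, 0.35 for GWR, and 0.41 for RF, so the relative ranking of the reference models follows directly from the reported gains.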

Abstract:

[Objective] Soil organic matter (SOM) is a fundamental indicator of soil fertility and soil quality. In mountainous counties with complex terrain and pronounced environmental heterogeneity, SOM varies strongly even over short distances, which often limits the prediction accuracy of conventional digital soil mapping (DSM) models. With the nationwide implementation of the Third National Soil Census, the demand for high-resolution, high-accuracy SOM mapping at the county scale has become increasingly urgent. Against this backdrop, Yiyuan County in Shandong Province was selected as the study area. The aim was to assess the applicability of the geographically weighted random forest (GWRF) model to SOM mapping in regions of complex terrain and to compare its predictive performance systematically with that of several commonly used models, thereby providing technical support for soil resource surveys, census result compilation, and county-level land management.

[Methods] A dataset of 1 565 measured topsoil SOM samples was used, along with nineteen environmental variables in five categories: topography, climate, vegetation, soil properties, and land use. Through correlation analysis and collinearity diagnostics, twelve key variables were retained for model construction. The GWRF model, which integrates localized spatial modeling with nonlinear machine-learning capability, was developed to generate high-resolution SOM predictions across the study region. An adaptive bandwidth strategy was employed, and the optimal bandwidth was determined to be 500. Grid search combined with cross-validation identified an optimal mtry value of 4 for the random forest component. In addition to GWRF, four reference models were constructed for comparison: ordinary kriging (OK), multiple linear regression (MLR), geographically weighted regression (GWR), and random forest (RF). Model performance was evaluated using two commonly adopted accuracy metrics: the coefficient of determination (R²) and the root-mean-square error (RMSE).

[Results and Discussions] This study explored the spatial pattern of SOM in the study area while systematically comparing the performance of multiple DSM models. Overall, SOM levels in Yiyuan County were relatively low, with a mean value of 15.62 g/kg. The spatial variation was moderate and exhibited a clear pattern: SOM values were higher in the central region and lower in the northeastern and southwestern areas. Prediction accuracy differed considerably among the five models. The GWRF model achieved the best overall performance, with an R² of 0.48 and an RMSE of 5.12 g/kg. This accuracy surpassed that of RF (R² = 0.41) and GWR (R² = 0.35), and the advantage over MLR and OK was even more pronounced. A paired-sample t-test further confirmed that the accuracy improvements of GWRF over the other four models were statistically significant, supporting the robustness and reliability of the improvement. According to the mapping results, the OK model produced an excessively smooth surface that obscured local details. The MLR and GWR models characterized certain environmental effects but showed significant biases, underestimating high values and overestimating low values. In contrast, the GWRF model captured both global trends and subtle local variations. The analysis of variable importance showed that soil type, annual evapotranspiration, slope, and sand content were the most influential factors governing SOM distribution in the study area. Moreover, the importance of these variables varied notably across space.

[Conclusions] This study demonstrated that the GWRF model has significant advantages for county-scale SOM digital mapping in mountainous regions. Its prediction accuracy markedly exceeded that of RF and conventional linear models, owing to its ability to capture nonlinear environmental relationships and localized spatial variations simultaneously. The enhanced mapping precision and improved representation of spatial details highlight the strong potential of GWRF for applications requiring high-accuracy soil information, such as the Third National Soil Census. The successful implementation of GWRF in this study suggests that the model is well suited to SOM prediction under complex terrain conditions and can serve as an effective technical tool for county-level soil property estimation. Future research may incorporate human-activity-related variables, employ localized variable-selection strategies within the GWRF framework to further refine model performance, and explore the potential of more advanced deep learning models in soil property mapping.

Key words: soil organic matter, digital soil mapping, mountainous regions, geographically weighted random forest
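
The abstract reports an adaptive bandwidth with an optimum of 500 but does not state the kernel or the unit of that bandwidth. The Python sketch below illustrates one common reading, as an assumption rather than the authors' implementation: the bandwidth is a count of nearest samples, and those samples are weighted with a bi-square kernel whose radius is the distance to the farthest selected sample. The function name `adaptive_bisquare_weights` and the toy coordinates are hypothetical.

```python
import math


def adaptive_bisquare_weights(target, samples, k):
    """Select the k samples nearest to `target` and weight them (bi-square).

    target  : (x, y) coordinates of the prediction location
    samples : list of (x, y) coordinates of measured points
    k       : adaptive bandwidth, read here as a neighbour count (assumption)
    Returns a list of (sample_index, weight) pairs, nearest first.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Sort by distance (ties broken by index) and keep the k nearest samples.
    nearest = sorted(
        (math.dist(target, xy), i) for i, xy in enumerate(samples)
    )[:k]
    bandwidth = nearest[-1][0]  # distance to the k-th nearest sample
    if bandwidth == 0.0:
        # All selected samples coincide with the target: weight them equally.
        return [(i, 1.0) for _, i in nearest]
    # Bi-square kernel: 1 at the target, falling to 0 at the bandwidth.
    return [(i, (1.0 - (d / bandwidth) ** 2) ** 2) for d, i in nearest]


# Toy example with four sample points and a neighbourhood of three.
toy_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
for index, weight in adaptive_bisquare_weights((0.0, 0.0), toy_samples, k=3):
    print(index, weight)
```

In a full GWRF workflow, the samples selected around each location would then be used to train a local random forest (here with the reported mtry of 4); that step is omitted because it depends on the modeling library used.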

CLC number: