
Smart Agriculture, 2025, Vol. 7, Issue 6: 136-148. doi: 10.12133/j.smartag.SA202508025

• Special Issue: Remote Sensing + AI Empowering Agricultural and Rural Modernization •

Construction and Evaluation of Lightweight and Interpretable Soybean Remote Sensing Identification Model

WANG Yinhui¹, ZHAO Anzhou², LI Dan¹, ZHU Xiufang³,⁴, ZHAO Jun⁵, WANG Ziqing⁶

  1. School of Earth Science and Engineering, Hebei University of Engineering, Handan 056038, China
  2. School of Mining and Geomatics Engineering, Hebei University of Engineering, Handan 056038, China
  3. State Key Laboratory of Remote Sensing and Digital Earth, Beijing Normal University, Beijing 100875, China
  4. Key Laboratory of Environmental Change and Natural Disaster, Ministry of Education, Beijing Normal University, Beijing 100875, China
  5. Qingdao Smart Village Development Service Center, Qingdao 266199, China
  6. Qingdao Acelmage Technologis Information Technology Co., Ltd., Qingdao 266114, China
  • Received: 2025-08-27 Online: 2025-11-30
  • Foundation items: National Key R&D Program of China (2023YFB3906201)
  • About author:
    WANG Yinhui, master's student, research interest: crop identification by remote sensing. E-mail:
  • Corresponding author:
    ZHU Xiufang, Ph.D., professor, research interest: remote sensing applications. E-mail:

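Abstract:

Objective/Significance: Soybean is an important food and cash crop worldwide, and rapid, accurate identification of its planting areas matters for food security monitoring and the development of precision agriculture. However, existing remote sensing methods for soybean identification commonly suffer from low efficiency and insufficient accuracy.

Methods: Using Sentinel-2 imagery from 2021 to 2023 together with soybean growth-stage information, a soybean remote sensing identification model was built with a binary logistic model. Identification experiments were carried out in six typical regions of the main U.S. soybean-producing area, with the aim of improving identification accuracy and model transferability.

Results and Discussions: The optimal phenological window for soybean identification was late July to mid-September (day of year 210-260). In the model construction region in 2022, the overall accuracy and Kappa coefficient were 0.90 and 0.79, respectively; in the remaining regions in 2022, the average accuracy and average Kappa coefficient were 0.88 and 0.76, respectively. Further validation showed that, across all regions from 2021 to 2023, the model's average overall accuracy and Kappa coefficient were 0.87 and 0.76, respectively, indicating good stability and adaptability across regions and years. The model is also interpretable and lightweight.

Conclusions: The soybean identification method based on Sentinel-2 imagery and a binary logistic model can be applied stably across regions and years. It offers a reference method for the rapid and accurate identification of soybean fields and provides technical support for visual management and scientific decision-making in precision agriculture.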
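The overall accuracy and Kappa coefficient reported above are both derived from a confusion matrix of reference versus predicted labels. The following is a minimal sketch of that calculation; the function name `overall_accuracy_and_kappa` and the pixel counts are hypothetical and are not the study's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's Kappa).

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: product of the row and column marginals per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (soybean, other), columns = predicted.
example_matrix = [[45, 5],
                  [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(example_matrix)
print(f"overall accuracy = {accuracy:.2f}, Kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than overall accuracy in the figures reported above.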

Key words: Sentinel-2, binary logistic model, soybean, mapping, remote sensing, crop identification, lightweight

Abstract:

Objective: Soybean is one of the world's most important crops, serving as a vital source of plant-based protein and vegetable oil and playing an indispensable role in sustainable agricultural systems and global food security. Accurate and timely mapping of soybean cultivation areas is essential for agricultural monitoring, policy-making, and precision farming. However, existing remote sensing methods for soybean identification, such as threshold-based approaches, traditional machine learning, and deep learning, often face challenges related to model complexity, computational efficiency, and interpretability. These limitations highlight the need for a method that maintains classification accuracy while offering computational efficiency, operational simplicity, and interpretable results. To address them, this study proposed a lightweight and interpretable soybean mapping framework based on Sentinel-2 imagery and a binary logistic regression model.

Methods: Six representative agricultural regions within the primary U.S. soybean production belt were selected to capture the diversity of cultivation practices and environmental conditions across this major production area. The analysis used Sentinel-2 satellite imagery covering the complete growing season (April-October) from 2021 to 2023. The USDA's Cropland Data Layer served as reference data for model training and validation, benefiting from its extensive ground verification and statistical rigor. All Sentinel-2 images underwent preprocessing, including atmospheric correction, cloud and shadow masking with the scene classification layer, and spatial subsetting to the regions of interest. The Jeffries-Matusita distance was employed as a quantitative metric to identify the optimal temporal window for soybean discrimination.
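The Jeffries-Matusita distance mentioned above has a closed form when each class is assumed to be Gaussian. The following is a minimal sketch for a single feature under that assumption. The reflectance values, the period labels, and the function names `jm_distance` and `best_period` are illustrative and not taken from the study, which may have used a multivariate form.

```python
import math
from statistics import mean, variance


def jm_distance(samples_a, samples_b):
    """Jeffries-Matusita distance between two classes for one feature.

    Assumes each class is normally distributed. The result lies in [0, 2];
    values near 2 indicate well-separated classes.
    """
    mean_a, mean_b = mean(samples_a), mean(samples_b)
    var_a, var_b = variance(samples_a), variance(samples_b)
    if var_a <= 0 or var_b <= 0:
        raise ValueError("each class needs a non-zero variance")

    # Univariate Bhattacharyya distance between the two Gaussians.
    bhattacharyya = ((mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
                     + 0.5 * math.log((var_a + var_b)
                                      / (2.0 * math.sqrt(var_a * var_b))))
    return 2.0 * (1.0 - math.exp(-bhattacharyya))


def best_period(samples_by_period):
    """Return the period label whose two classes are most separable.

    samples_by_period maps a label to (soybean_samples, other_samples).
    """
    return max(samples_by_period,
               key=lambda label: jm_distance(*samples_by_period[label]))


# Made-up reflectance samples for two 10-day composites.
composites = {
    "DOY 150-160": ([0.30, 0.32, 0.34], [0.31, 0.33, 0.35]),
    "DOY 220-230": ([0.18, 0.20, 0.22], [0.30, 0.32, 0.34]),
}
for label, (soybean, other) in composites.items():
    print(label, round(jm_distance(soybean, other), 3))
print("most separable:", best_period(composites))
```

Scanning every composite period with such a score and keeping the periods with the highest values is one way to pick a temporal window from data rather than by inspection.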
The Jeffries-Matusita distance was used to evaluate the separability between soybean and other major crops across the growing season, with calculations performed on 10-day composite periods to ensure data quality and temporal consistency. The analysis revealed that late July to mid-September (day of year 210-260) provided maximum spectral separability. This window corresponds to soybean's critical reproductive stages (pod setting and filling), when its spectral signature is most distinct from that of other crops, particularly in the short-wave infrared region, which is sensitive to canopy structure and water content. Within this window, a binary logistic regression model was implemented that treated soybean identification as a probabilistic classification problem. The model was trained on spectral features from the optimal period by maximum likelihood estimation, giving a computationally efficient framework that required optimizing only a limited number of parameters while remaining interpretable through explicit feature coefficients.

Results and Discussions: The evaluation showed that the integrated approach balanced classification performance and operational practicality. The temporal optimization identified late July to mid-September as the peak discriminative period, which matches soybean's reproductive phenological stages, when its canopy spectral characteristics differ most from those of other crops. This finding was consistent across the three study years and multiple regions, supporting the robustness of the data-driven window selection. The binary logistic regression model, trained on features from this optimal period, performed well: in the model construction region in 2022, it achieved an overall accuracy of 0.90 and a Kappa coefficient of 0.79.
When applied to independent validation regions in the same year, it maintained strong performance (overall accuracy 0.88, Kappa 0.76) without region-specific parameter adjustments, demonstrating good spatial transferability. Temporal validation further confirmed the model's robustness: across all regions from 2021 to 2023, it achieved an average overall accuracy of 0.87 and an average Kappa of 0.76. This inter-annual stability held despite potential variations in annual weather, management practices, and planting schedules, and highlights the advantage of basing the model on a stable phenological period rather than fixed calendar dates. The model's lightweight architecture offered practical benefits: compared with complex ensemble or deep learning methods, it requires optimizing only a limited number of parameters. This parsimonious structure enhances computational efficiency, enabling rapid training and deployment over large areas while reducing reliance on extensive labeled datasets, a key advantage in regions lacking sufficient ground truth data. Beyond accuracy and efficiency, the model was interpretable through its probabilistic framework and transparent feature weighting. Coefficient analysis provided quantifiable insights into feature contributions, revealing that short-wave infrared bands and specific vegetation indices had the highest discriminative power during the optimal temporal window.

Conclusions: This study proposed an effective soybean mapping approach that balances accuracy with operational practicality by combining temporal optimization with binary logistic regression. The method offers a viable solution for operational agricultural monitoring, especially in resource-constrained environments.
Future work can enhance the robustness of the model across regional conditions through cross-regional validation in different climate zones and cropping systems, or by integrating transfer learning with domain adaptation methods, which would improve its potential for global-scale application. Integrating additional data, methodologies, and models to achieve end-to-end feature learning should also be considered.
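The abstract describes a binary logistic regression fitted by maximum likelihood and interpreted through its coefficients. The following is a minimal sketch of that idea, assuming two hypothetical input features (a short-wave infrared reflectance and a vegetation index) with invented values. The study's actual feature set, software, and optimizer are not stated here; plain batch gradient ascent and z-score standardization are used only for illustration.

```python
import math
from statistics import mean, pstdev


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Z-score each column so coefficient magnitudes are comparable."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


def fit_logistic(rows, labels, learning_rate=0.5, epochs=1000):
    """Maximize the mean log-likelihood by batch gradient ascent."""
    n, k = len(rows), len(rows[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            residual = y - sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            grad_b += residual
            for j in range(k):
                grad_w[j] += residual * x[j]
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias += learning_rate * grad_b / n
    return weights, bias


def soybean_probability(row, weights, bias, means, stds):
    """Probability that one raw (unscaled) feature row is soybean."""
    scaled = [(v - m) / s for v, m, s in zip(row, means, stds)]
    return sigmoid(bias + sum(w * v for w, v in zip(weights, scaled)))


# Invented pixels, not real spectra: [SWIR reflectance, vegetation index].
pixels = [[0.18, 0.85], [0.20, 0.80], [0.17, 0.88], [0.21, 0.82],
          [0.28, 0.70], [0.30, 0.65], [0.27, 0.72], [0.31, 0.68]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = soybean, 0 = other

scaled, means, stds = standardize(pixels)
weights, bias = fit_logistic(scaled, labels)
for name, weight in zip(["swir", "vegetation_index"], weights):
    print(f"{name}: {weight:+.2f}")
```

The model has one coefficient per feature plus an intercept, which is what keeps it lightweight. The sign and size of each standardized coefficient show the direction and relative strength of that feature's contribution to the soybean probability.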

Key words: Sentinel-2, binary logistic model, soybean, mapping, remote sensing, crop identification, lightweight

CLC number: