Smart Agriculture

Estimation of Maize Aboveground Biomass Based on CNN-LSTM-SA

WANG Yi1, XUE Rong1, HAN Wenting2, SHAO Guomin3, HOU Yanqiao1, CUI Xitong1

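  1. College of Information, Xi'an University of Finance and Economics, Xi'an 710100, Shaanxi, China
  2. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China
  3. State Key Laboratory of Eco-hydraulics in Northwest Arid Region of China, Xi'an University of Technology, Xi'an 710048, Shaanxi, China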
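  • Received: 2024-12-01 Published: 2025-06-27
  • Funding:
    Shaanxi Province Natural Science Basic Research Program (2022JQ-363); Shaanxi Provincial Social Science Foundation Program (2021R022); Shaanxi Provincial Key Research and Development Program (S2024-YF-ZDCXL-ZDLNY-0158)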
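  • About the author:

    WANG Yi, Ph.D., lecturer. Research interests: integrated space-air-ground intelligent sensing of agricultural conditions and precision operation technology. E-mail: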

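  • Corresponding author:
    HAN Wenting, Ph.D., researcher. Research interests: integrated space-air-ground intelligent sensing of agricultural water information, and precision irrigation technology and equipment. E-mail: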

Estimation of Corn Aboveground Biomass Based on CNN-LSTM-SA

WANG Yi1, XUE Rong1, HAN Wenting2, SHAO Guomin3, HOU Yanqiao1, CUI Xitong1

  1. College of Information, Xi'an University of Finance and Economics, Xi'an 710100, China
  2. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China
  3. State Key Laboratory of Eco-hydraulics in Northwest Arid Region of China, Xi'an University of Technology, Xi'an 710048, China
  • Received: 2024-12-01 Online: 2025-06-27
  • Foundation items: Shaanxi Province Natural Science Basic Research Program (2022JQ-363); Shaanxi Provincial Social Science Foundation Program (2021R022); Shaanxi Provincial Key Research and Development Program (S2024-YF-ZDCXL-ZDLNY-0158)
  • About author:

    WANG Yi, E-mail:

  • Corresponding author:
    HAN Wenting, E-mail:

Summary:

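[Objective/Significance] The aboveground biomass (AGB) of maize reflects crop growth status, but its formation is affected by many dynamic factors, so AGB varies in complex ways across space and time. This study therefore introduces a model architecture that combines a convolutional neural network (CNN), a long short-term memory network (LSTM), and a self-attention (SA) mechanism to estimate maize AGB at the field scale. [Methods] First, an optimized CNN-LSTM-SA model was built on this architecture. The Pearson correlation coefficients between the influencing factors and maize AGB were analyzed, and recursive feature elimination was used to determine the optimal input features. Second, the local interpretable model-agnostic explanations method was used to explain individual samples. Finally, ablation experiments examined the effect of adding CNN and SA to the CNN-LSTM-SA model, and the model was compared with random forest (RF) and support vector machine (SVM) models. [Results and Discussion] The CNN-LSTM-SA model achieved a coefficient of determination (R²) of 0.92, a root mean square error (RMSE) of 107.53 g/m², and a mean absolute error (MAE) of 55.19 g/m², outperforming the standalone LSTM model, the CNN-LSTM model, and the LSTM-SA model. It also outperformed the RF and SVM models on every metric. [Conclusions] By taking a spatiotemporal perspective, the model improves the accuracy of maize AGB estimation and is interpretable. The study offers ideas and methods for the dynamic modeling of crop AGB and has some reference value.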
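
The three evaluation metrics reported here (R², RMSE, MAE) can be computed as in the minimal sketch below. It assumes the common definition R² = 1 - SS_res/SS_tot, which the text does not spell out. The function name `regression_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values.

    Assumes R2 = 1 - SS_res / SS_tot (an assumption, not stated in the source).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up AGB values in g/m^2, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}, RMSE={rmse:.2f} g/m^2, MAE={mae:.2f} g/m^2")
```

RMSE and MAE carry the units of AGB (g/m²), while R² is dimensionless, which is why the three are reported together.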

Keywords: maize, aboveground biomass, convolutional neural network, long short-term memory network, self-attention mechanism

Abstract:

[Objective] Maize is one of the most widely cultivated staple crops worldwide, and its aboveground biomass (AGB) is a crucial indicator of crop growth status. Accurate estimation of maize AGB is vital for ensuring food security and enhancing agricultural productivity. However, maize AGB is influenced by many dynamic factors and exhibits complex spatial and temporal variations that make precise estimation difficult. Although deep learning techniques have demonstrated strong capabilities in feature extraction and spatiotemporal modeling, their application in agricultural remote sensing still has significant potential for development and exploration. At present, most studies on maize AGB estimation rely primarily on single-source remote sensing data and conventional machine learning algorithms, which limits the accuracy and generalizability of the models. To overcome these limitations, this study develops a model architecture that integrates convolutional neural networks (CNN), long short-term memory networks (LSTM), and a self-attention (SA) mechanism to estimate maize AGB at the field scale.
[Methods] The experimental site was located in Zhaojun Town, Dalad Banner, Ordos City, Inner Mongolia Autonomous Region. The study used vegetation indices, crop parameters, and meteorological data collected under a gradient of water treatments in the experimental area. First, an optimized CNN-LSTM-SA model was constructed. The model employed two-dimensional convolutional layers to extract both spatial and temporal features, with max-pooling and dropout to mitigate overfitting. The LSTM module captured temporal dependencies in the data. The SA mechanism was then introduced to compute global attention weights, enhancing the representation of critical time steps. Additionally, nonlinear activation functions were applied to mitigate multicollinearity among features. A fully connected layer output the estimated AGB values. Second, the Pearson correlation coefficients between influencing factors and maize AGB were analyzed, and the importance of multi-source data was validated. Recursive feature elimination (RFE) was used to select the optimal input features. The local interpretable model-agnostic explanations (LIME) method was employed to interpret individual samples. Finally, ablation experiments were conducted to assess the effects of incorporating CNN and SA into the model, and performance was compared against random forest (RF) and support vector machine (SVM) models.
[Results and Discussions] The correlation analysis revealed that crop parameters exhibited strong correlations with AGB (-0.66 ≤ r ≤ 0.79). Among the vegetation indices, the normalized difference red edge index (NDREI) demonstrated the highest correlation (r = 0.63). To address multicollinearity, the visible atmospherically resistant index (VARI), soil-adjusted vegetation index (SAVI), and normalized difference red edge index (NDRE) were excluded from the analysis. The CNN-LSTM-SA model integrating crop parameters, vegetation indices, and meteorological data initially achieved a coefficient of determination (R²) of 0.89, a root mean square error (RMSE) of 129.38 g/m², and a mean absolute error (MAE) of 65.99 g/m². When only vegetation indices and meteorological data were included, performance declined to an R² of 0.83, an RMSE of 161.36 g/m², and an MAE of 89.37 g/m². Using a single vegetation index further reduced model accuracy. With multi-source data integrated, RFE removed redundant features. After the 2-meter average wind speed was excluded, the model reached its best performance, with an R² of 0.92, an RMSE of 107.53 g/m², and an MAE of 55.19 g/m². LIME interpretation of feature contributions for individual maize samples showed that during the rapid growth stage, the model was primarily influenced by the current growth status and vegetation indices. For samples in the mid-growth stage, multi-day crop physiological characteristics had a substantial impact on model predictions. In the late growth stage, higher vegetation index values showed a clear suppressive effect on the model outputs. During the mid-growth stage of maize under varying moisture conditions, the model was consistently more sensitive to low temperatures, moderate humidity levels, and optimal vegetation indices. The CNN-LSTM-SA model fitted more consistently across growth stages and water conditions than the LSTM, LSTM-SA, and CNN-LSTM models, and its accuracy surpassed that of all three. It also outperformed the RF and SVM models on all evaluation metrics.
[Conclusions] This study leveraged the feature extraction capabilities of CNN, the temporal modeling strength of LSTM, and the dynamic attention mechanism of SA to enhance the accuracy of maize AGB estimation from a spatiotemporal perspective. The approach not only reduced estimation errors but also improved model interpretability. This research provides insights and a reference for the dynamic modeling of crop AGB.
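
The SA step described in [Methods], which computes global attention weights over time steps, can be illustrated with a minimal sketch. It assumes scaled dot-product attention with identity query/key/value projections; the abstract does not specify the formulation, and a trained model would use learned projections. The function names and the toy input (standing in for LSTM hidden states) are hypothetical and are not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sequence):
    """Scaled dot-product self-attention with identity projections.

    sequence: list of T time steps, each a list of d features.
    Returns (context, weights): context is T x d, weights is T x T,
    where weights[i][j] is how much step i attends to step j.
    """
    d = len(sequence[0])
    scale = math.sqrt(d)
    context, weights = [], []
    for query in sequence:
        # Similarity of this time step to every time step (global scope).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in sequence]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all time steps, feature by feature.
        context.append([sum(w * step[j] for w, step in zip(row, sequence))
                        for j in range(d)])
    return context, weights


# Toy sequence: 3 time steps, 2 features each (illustrative values only).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = self_attention(hidden_states)
for t, row in enumerate(weights):
    print(t, [round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so a time step with a larger weight contributes more to the context vector. This is the sense in which attention can emphasize critical time steps.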

Key words: maize, aboveground biomass, convolutional neural network, long short-term memory network, self-attention mechanism

CLC number: