0 Introduction
1 Materials and methods
1.1 Study area
1.2 Data
1.2.1 Field data acquisition
Fig. 1 Five-point sampling combined with grid method for soybean field survey at Jianshan Farm
1.2.2 Sentinel-2 imagery acquisition and preprocessing
Table 1 Band information of acquired Sentinel-2 imagery
| Band number | Band | Central wavelength/μm | Bandwidth/nm | Spatial resolution/m |
|---|---|---|---|---|
| 1 | Coastal aerosol | 0.443 | 20 | 60 |
| 2 | Blue | 0.490 | 65 | 10 |
| 3 | Green | 0.560 | 35 | 10 |
| 4 | Red | 0.665 | 30 | 10 |
| 5 | Vegetation red edge | 0.705 | 15 | 20 |
| 6 | Vegetation red edge | 0.740 | 15 | 20 |
| 7 | Vegetation red edge | 0.783 | 20 | 20 |
| 8 | Near Infrared | 0.842 | 115 | 10 |
| 8A | Narrow NIR | 0.865 | 20 | 20 |
| 9 | Water vapour | 0.945 | 20 | 60 |
| 10 | SWIR-Cirrus | 1.375 | 20 | 60 |
| 11 | SWIR | 1.610 | 90 | 20 |
1.2.3 UAV imagery acquisition and preprocessing
Fig. 2 Comparison of UAV multispectral band information before and after radiometric correction
1.3 Feature extraction from remote sensing imagery
Table 2 GLCM texture parameters and their physical significance for soybean canopy characterization
| Texture feature | Abbreviation | Physical significance |
|---|---|---|
| Mean | Mean | Describes the overall brightness of the image |
| Homogeneity | Hom | Measures the local gray-level uniformity of the image |
| Dissimilarity | Dis | Characterizes the differences in texture between pixels |
| Entropy | Ent | Indicates the amount of information in the image |
| Second moment | Sm | Describes the uniformity and coarseness of the texture |
| Correlation | Corr | Indicates the dominant trend of the texture |
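As a minimal sketch of how the six parameters in Table 2 can be derived from a gray-level co-occurrence matrix, the following pure-Python example builds a symmetric, normalized GLCM for a single pixel offset and evaluates the commonly used Haralick-style definitions. The function names, the single offset, and the quantization to `levels` gray levels are illustrative assumptions; the window size, offsets, and software actually used in the study are not stated here.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0  # pair (i, j)
                counts[j][i] += 1.0  # mirrored pair makes the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Mean, Hom, Dis, Ent, Sm and Corr of a normalized symmetric GLCM."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    mean = sum(i * v for i, _, v in cells)
    var = sum((i - mean) ** 2 * v for i, _, v in cells)
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "Mean": mean,
        "Hom": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "Dis": sum(abs(i - j) * v for i, j, v in cells),
        "Ent": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "Sm": sum(v * v for _, _, v in cells),
        # Correlation is undefined for a constant window (zero variance).
        "Corr": cov / var if var > 0 else float("nan"),
    }


# A 2 x 2 checkerboard: every horizontal neighbor pair differs by one level.
features = glcm_features(glcm([[0, 1], [1, 0]], levels=2))
```

For the checkerboard window the dissimilarity is 1 and the correlation is -1, which is the expected behaviour for a texture in which neighboring pixels always differ.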
Fig. 3 Spatial distribution of GLCM texture features extracted from Sentinel-2 and UAV imagery during a typical soybean growth stage
1.4 Feature fusion methods for UAV-Sentinel-2 imagery
1.4.1 Transfer learning fusion algorithm
1.4.2 Recurrent neural network algorithm
Fig. 4 Diagram of the gate units in an LSTM network
Fig. 5 Diagram of the gate units in the GRU network
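The gate units of a GRU can be illustrated with a single time step written in plain Python. This is a sketch of one widely used formulation (update gate z, reset gate r, candidate state, and the interpolation h' = (1 - z) * h + z * candidate); the parameter names (`Wz`, `Uz`, `bz`, and so on) are hypothetical, and the exact variant and layer sizes used in this work are assumptions not confirmed by the text.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def gru_step(x, h, params):
    """One GRU time step for input vector x and previous hidden state h."""

    def affine(gate, hidden):
        # W_gate @ x + U_gate @ hidden + b_gate
        wx = matvec(params["W" + gate], x)
        uh = matvec(params["U" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, params["b" + gate])]

    z = [sigmoid(v) for v in affine("z", h)]        # update gate
    r = [sigmoid(v) for v in affine("r", h)]        # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]     # reset applied to old state
    cand = [math.tanh(v) for v in affine("h", gated_h)]  # candidate state
    # Interpolate between the old state and the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
```

An LSTM step differs mainly in keeping a separate cell state controlled by input, forget, and output gates, whereas the GRU merges these roles into the update and reset gates shown above.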
1.5 Improved CNN models
1.5.1 CNN model improved by support vector machine
Fig. 6 SVM-CNN model structure diagram
1.5.2 CNN model improved by attention mechanism
Fig. 7 Calculation process of the self-attention mechanism module
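The calculation in a self-attention module is usually expressed as scaled dot-product attention, softmax(QK^T / sqrt(d)) V. The sketch below implements that standard form in plain Python for a short sequence of feature vectors; the projection matrices `w_q`, `w_k`, `w_v` and the single-head layout are illustrative assumptions rather than the configuration used in this work.

```python
import math


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x is a list of T feature vectors; returns (output, attention weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    attn = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(attn, v), attn
```

Each output vector is a weighted average of the value vectors of all time steps, so a time step can draw on information from any other step in the sequence.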
Fig. 8 Schematic diagram of the squeeze-and-excitation (SE) attention mechanism module
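A squeeze-and-excitation block can be summarized in three steps: squeeze each channel to a single value by global average pooling, pass the result through a small bottleneck (reduction, ReLU, expansion, sigmoid) to obtain per-channel weights, and rescale the channels by those weights. The following sketch applies this to one-dimensional feature maps; the weight matrices `w_reduce` and `w_expand` are hypothetical placeholders, biases are omitted for brevity, and the reduction ratio used in this work is not assumed.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def se_block(feature_maps, w_reduce, w_expand):
    """Squeeze-and-excitation over a list of channels (each a list of values)."""
    # Squeeze: global average pooling per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_maps]
    # Excitation: bottleneck with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, v) for v in matvec(w_reduce, squeezed)]
    weights = [sigmoid(v) for v in matvec(w_expand, hidden)]
    # Scale: reweight every channel by its learned importance.
    scaled = [[w * v for v in channel] for w, channel in zip(weights, feature_maps)]
    return scaled, weights
```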
Fig. 9 Structure of CNN-GRU/LSTM-Attention model
1.6 Feature fusion of multi-source data based on UAV-Sentinel-2
1.6.1 Construction of time series multi-feature imagery
Fig. 10 Sentinel-2 multi-feature time-series imagery for each full month of 2022 at Jianshan Farm, Heilongjiang Province
Fig. 11 Sampling point extraction scheme based on ArcGIS at Jianshan Farm
1.6.2 Feature fusion based on transfer learning algorithms
Fig. 12 Feature distribution alignment results of UAV and Sentinel-2 data using the TCA algorithm: a. UAV source domain feature distribution; b. Sentinel-2 target domain feature distribution
1.6.3 UAV-Sentinel-2 time-series feature fusion via LSTM/GRU
Fig. 13 Sliding time window processing for LSTM/GRU sequential feature fusion
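Sliding time window processing can be sketched as cutting a time-ordered feature sequence into overlapping sub-sequences that are then fed to the recurrent network one window at a time. The window length and step below are hypothetical; the values used in this work are not assumed.

```python
def sliding_windows(sequence, window, step=1):
    """Split a time-ordered sequence into overlapping windows of fixed length."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(sequence) - window
    return [sequence[s:s + window] for s in range(0, last_start + 1, step)]


# Five acquisition dates, window of three dates, shifted one date at a time.
windows = sliding_windows(["t1", "t2", "t3", "t4", "t5"], window=3)
# windows == [["t1", "t2", "t3"], ["t2", "t3", "t4"], ["t3", "t4", "t5"]]
```

A sequence shorter than the window yields no windows, so short series need padding or a smaller window before they can be used.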
Fig. 14 LSTM/GRU feature fusion process for UAV-Sentinel-2 multi-source data
2 Results and analysis
2.1 Establishment of soybean yield estimation models using CNN
Table 3 Performance comparison of CNN yield estimation models using half-monthly and monthly multi-feature time-series data
| Model | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE | R² |
|---|---|---|---|---|
| CNN-UAV-15 | 22.294 1 | 30.850 5 | 0.106 6 | 0.39 |
| CNN-Sentinel-15 | 26.461 6 | 32.930 7 | 0.111 6 | 0.32 |
| CNN-US-15 | 21.827 5 | 28.974 6 | 0.095 7 | 0.44 |
| CNN-UAV-30 | 24.895 8 | 31.450 9 | 0.105 8 | 0.37 |
| CNN-Sentinel-30 | 27.005 5 | 34.359 5 | 0.120 5 | 0.21 |
| CNN-US-30 | 22.177 5 | 29.737 7 | 0.096 8 | 0.40 |
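The four evaluation metrics reported in the tables can be computed as in the following sketch, which uses their standard definitions. Two points are assumptions: MAPE is returned as a fraction (consistent with the magnitude of the tabulated values), and R² is taken as 1 - SS_res/SS_tot rather than the squared correlation coefficient. The sample values in the usage lines are hypothetical and are not data from this study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a fraction) and R^2 for paired samples."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an observed value is zero.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Hypothetical observed and predicted values, for illustration only.
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
```

Because RMSE squares the errors before averaging, it is never smaller than MAE for the same predictions, and it penalizes large individual errors more heavily.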
2.2 Comparative analysis of multi-source data fusion methods
Table 4 Performance comparison of CNN models with different fusion methods for half-monthly and monthly time-series data
| Model | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE | R² |
|---|---|---|---|---|
| CNN-US-15 | 21.827 5 | 28.974 6 | 0.095 7 | 0.44 |
| CNN-TCA-15 | 15.172 2 | 18.633 1 | 0.064 3 | 0.88 |
| CNN-GRU-15 | 12.526 8 | 17.254 5 | 0.055 5 | 0.90 |
| CNN-LSTM-15 | 10.093 1 | 14.137 4 | 0.045 0 | 0.93 |
| CNN-US-30 | 22.177 5 | 29.737 7 | 0.096 8 | 0.40 |
| CNN-TCA-30 | 21.568 8 | 27.953 3 | 0.092 6 | 0.71 |
| CNN-GRU-30 | 19.322 2 | 26.188 7 | 0.083 2 | 0.79 |
| CNN-LSTM-30 | 18.030 7 | 25.131 2 | 0.080 7 | 0.80 |
2.3 Comparative analysis of improved CNN models
Table 5 Performance of improved CNN models with different fusion methods for half-monthly time-series data
| Model | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE | R² |
|---|---|---|---|---|
| SVM-US-CNN | 42.304 6 | 55.776 7 | 0.191 5 | 0.38 |
| SVM-TCA-CNN | 35.184 3 | 45.623 3 | 0.168 3 | 0.66 |
| SVM-GRU-CNN | 31.577 1 | 38.886 2 | 0.153 2 | 0.73 |
| SVM-LSTM-CNN | 26.336 9 | 32.266 6 | 0.132 6 | 0.80 |
| SA-US-CNN | 10.730 5 | 14.740 8 | 0.050 4 | 0.85 |
| SA-TCA-CNN | 9.488 0 | 13.941 9 | 0.048 9 | 0.87 |
| SA-GRU-CNN | 8.403 8 | 12.283 8 | 0.037 8 | 0.94 |
| SA-LSTM-CNN | 10.219 8 | 14.569 3 | 0.045 4 | 0.91 |
| SE-US-CNN | 14.983 6 | 12.230 6 | 0.104 2 | 0.70 |
| SE-TCA-CNN | 11.693 5 | 13.983 2 | 0.096 3 | 0.71 |
| SE-GRU-CNN | 11.707 1 | 12.191 7 | 0.053 0 | 0.90 |
| SE-LSTM-CNN | 8.520 9 | 12.609 2 | 0.036 8 | 0.92 |
Table 6 Performance of improved CNN models with different fusion methods for monthly time-series data
| Model | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE | R² |
|---|---|---|---|---|
| SVM-US-CNN | 48.918 5 | 58.501 2 | 0.211 0 | 0.25 |
| SVM-TCA-CNN | 45.172 2 | 56.643 1 | 0.189 9 | 0.58 |
| SVM-GRU-CNN | 41.448 2 | 55.486 7 | 0.173 6 | 0.69 |
| SVM-LSTM-CNN | 28.893 5 | 36.069 6 | 0.152 6 | 0.79 |
| SA-US-CNN | 10.524 5 | 14.188 6 | 0.048 1 | 0.83 |
| SA-TCA-CNN | 7.777 5 | 10.654 0 | 0.035 5 | 0.86 |
| SA-GRU-CNN | 12.201 6 | 17.046 4 | 0.052 0 | 0.91 |
| SA-LSTM-CNN | 12.183 3 | 17.698 3 | 0.052 4 | 0.90 |
| SE-US-CNN | 14.974 6 | 32.250 4 | 0.108 2 | 0.66 |
| SE-TCA-CNN | 11.568 8 | 25.953 3 | 0.083 2 | 0.71 |
| SE-GRU-CNN | 17.902 1 | 25.219 7 | 0.076 3 | 0.78 |
| SE-LSTM-CNN | 14.953 4 | 20.938 2 | 0.063 8 | 0.85 |
Table 7 Ablation study results of the SA-GRU-CNN model on half-monthly time-series data
| Model | R² | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE |
|---|---|---|---|---|
| UAV-CNN | 0.37 | 24.90 | 31.45 | 0.106 |
| US-CNN fusion | 0.44 | 21.83 | 28.97 | 0.096 |
| SG-CNN | 0.71 | 15.17 | 18.63 | 0.064 |
| GRU-CNN | 0.90 | 12.53 | 17.25 | 0.056 |
| SA-GRU-CNN | 0.94 | 8.40 | 12.28 | 0.038 |
2.4 Yield analysis of the optimal soybean yield estimation model
Fig. 15 Spatial distribution of predicted soybean yield for North 13, 14 and 15 Plots at Jianshan Farm in 2022
2.5 Validation of the optimal yield estimation model
Table 8 Performance of improved CNN models with different fusion methods for half-monthly time-series data in 2023
| Model | MAE/(kg/hm²) | RMSE/(kg/hm²) | MAPE | R² |
|---|---|---|---|---|
| SVM-US-CNN | 43.456 2 | 55.967 5 | 0.196 6 | 0.37 |
| SVM-TCA-CNN | 34.683 2 | 44.326 8 | 0.153 4 | 0.67 |
| SVM-GRU-CNN | 31.867 4 | 39.326 5 | 0.157 6 | 0.72 |
| SVM-LSTM-CNN | 26.635 9 | 31.324 4 | 0.113 5 | 0.81 |
| SA-US-CNN | 15.786 5 | 14.964 2 | 0.056 8 | 0.84 |
| SA-TCA-CNN | 9.569 0 | 13.082 6 | 0.040 9 | 0.86 |
| SA-GRU-CNN | 8.253 8 | 9.603 9 | 0.036 5 | 0.93 |
| SA-LSTM-CNN | 16.326 5 | 15.834 5 | 0.073 2 | 0.82 |
| SE-US-CNN | 14.504 9 | 12.130 9 | 0.103 3 | 0.71 |
| SE-TCA-CNN | 10.236 9 | 12.026 0 | 0.086 3 | 0.73 |
| SE-GRU-CNN | 8.962 3 | 11.160 2 | 0.070 9 | 0.79 |
| SE-LSTM-CNN | 8.054 2 | 9.863 4 | 0.059 6 | 0.91 |





