Smart Agriculture

Research Progress and Challenges of Oil Crop Yield Monitoring by Remote Sensing | Open Access

MA Yujing, WU Shangrong, YANG Peng, CAO Hong, TAN Jieyang, ZHAO Rongkun

2023, 5(3): 1-16. doi:10.12133/j.smartag.SA202303002

[Significance] Oil crops play a significant role in the food supply and are an important source of edible vegetable oils and plant proteins. Real-time, dynamic and large-scale monitoring of oil crop growth is essential for guiding agricultural production, stabilizing markets, and maintaining health. Previous studies have made considerable progress in regional-scale yield simulation of staple crops based on remote sensing methods, but regional-scale yield simulation of oil crops remains poor because of the complexity of their plant traits and structural characteristics. Research on regional oil crop yield estimation based on remote sensing technology is therefore urgently needed. [Progress] This paper summarized remote sensing technology in oil crop monitoring from three aspects: background, progress, and opportunities and challenges. Firstly, the significance and advantages of using remote sensing technology to estimate the yield of oil crops were expounded, and it was pointed out that parameter inversion and crop area monitoring are both vital components of yield estimation. Secondly, the current state of remote sensing based oil crop monitoring was summarized from three aspects: remote sensing parameter inversion, crop area monitoring and yield estimation. For parameter inversion, it was noted that previous studies of oil crops used optical remote sensors more than other sensors. Then, the advantages and disadvantages of empirical and physical model inversion methods were analyzed. In addition, the advantages and disadvantages of optical and microwave data were illustrated with respect to the structure and traits of oil crops. Finally, the optimal choices of data and methods for oil crop parameter inversion were given. For crop area monitoring, this paper elaborated mainly on two parts: optical and microwave remote sensing data.
Combined with the structure of oil crops and the characteristics of planting areas, the researches on area monitoring of oil crops based on different types of remote sensing data sources were reviewed, including the advantages and limitations of different data sources in area monitoring. Then, two yield estimation methods were introduced: remote sensing yield estimation and data assimilation yield estimation. The phenological period of oil crop yield estimation, remote sensing data source and modeling method were summarized. Next, data assimilation technology was introduced, and it was proposed that data assimilation technology has great potential in oil crop yield estimation, and the assimilation research of oil crops was expounded from the aspects of assimilation method and grid selection. All of them indicate that data assimilation technology could improve the accuracy of regional yield estimation of oil crops. Thirdly, this paper pointed out the opportunities of remote sensing technology in oil crop monitoring, put forward some problems and challenges in crop feature selection, spatial scale determination and remote sensing data source selection of oil crop yield, and forecasted the development trend of oil crop yield estimation research in the future. [Conclusions and Prospects] The paper puts forward the following suggestions for the three aspects: (1) Regarding crop feature selection, when estimating yields for oil crops such as rapeseed and soybeans, which have active photosynthesis in siliques or pods, relying solely on canopy leaf area index (LAI) as the assimilation state variable for crop yield estimation may result in significant underestimation of yields, thereby impacting the accuracy of regional crop yield simulation. Therefore, it is necessary to consider the crop plant characteristics and the agronomic mechanism of yield formation through siliques or pods when estimating yields for oil crops. 
(2) In determining the spatial scale, some oil crops are distributed in hilly and mountainous areas with mixed land cover. Using regularized yield simulation grids may result in the confusion of numerous background objects, introducing additional errors and affecting the assimilation accuracy of yield estimation. This poses a challenge to yield estimation research. Thus, it is necessary to choose appropriate methods to divide irregular unit grids and determine the optimal scale for yield estimation, thereby improving the accuracy of yield estimation. (3) In terms of remote sensing data selection, the monitoring of oil crops can be influenced by crop structure and meteorological conditions. Depending solely on spectral data monitoring may have a certain impact on yield estimation results. It is important to incorporate radar off-nadir remote sensing measurement techniques to perceive the response relationship between crop leaves and siliques or pods and remote sensing data parameters. This can bridge the gap between crop characteristics and remote sensing information for crop yield simulation. This paper can serve as a valuable reference and stimulus for further research on regional yield estimation and growth monitoring of oil crops. It supplements existing knowledge and provides insightful considerations for enhancing the accuracy and efficiency of oil crop production monitoring and management.

The Key Issues and Evaluation Methods for Constructing Agricultural Pest and Disease Image Datasets: A Review | Open Access

GUAN Bolun, ZHANG Liping, ZHU Jingbo, LI Runmei, KONG Juanjuan, WANG Yan, DONG Wei

2023, 5(3): 17-34. doi:10.12133/j.smartag.SA202306012

[Significance] A scientific dataset of agricultural pests and diseases is the foundation for monitoring and early warning of agricultural pests and diseases. It is of great significance for the development of agricultural pest control and is an important component of smart agriculture. As deep learning has proven important for the intelligent monitoring of agricultural pests and diseases, and because dataset quality affects the effectiveness of image recognition algorithms, the construction of high-quality agricultural pest and disease datasets is gradually attracting attention from scholars in this field. In image recognition tasks, the recognition effect depends on the improvement strategy of the algorithm on the one hand and on the quality of the dataset on the other. The same recognition algorithm learns different features from datasets of different quality, so its recognition performance also varies. In order to propose a dataset evaluation index that measures the quality of agricultural pest and disease datasets, this article analyzes existing datasets and reviews their construction, taking the challenges faced in constructing agricultural pest and disease image datasets as the starting point. [Progress] Firstly, disease and pest datasets are divided into two categories: private datasets and public datasets. Private datasets are characterized by high annotation quality, high image quality, and a large number of inter-class samples, but they are not publicly available. Public datasets are characterized by multiple types, low image quality, and poor annotation quality. Secondly, the problems faced in dataset construction are summarized, including imbalanced categories at the dataset level, difficulty in feature extraction at the sample level, and difficulty in measuring the dataset size at the usage level.
These include imbalanced inter-class and intra-class samples, selection bias, multi-scale targets, dense targets, uneven data distribution, uneven image quality, insufficient dataset size, and dataset availability. The main causes of these problems are analyzed from two key aspects of dataset construction, image acquisition and annotation methods, and the algorithmic improvement strategies and suggestions that address them are summarized. The collection devices can be divided into handheld devices, drone platforms, and fixed collection devices. Collection with handheld devices is flexible and convenient, but it is inefficient and requires strong photography skills. Drone platform acquisition is suitable for data collection over contiguous areas, but the detailed features captured are not clear enough. Fixed device acquisition is more efficient, but the shooting scene is often relatively fixed. Image annotation is divided into rectangular annotation and polygonal annotation. In image recognition and detection, rectangular annotation is used more frequently. Images in which the target is hard to separate from the background are difficult to label, and improper annotation can introduce more noise or lead to incomplete feature extraction by the algorithm. In response to the problems in the above three aspects, evaluation methods for data distribution consistency, dataset size, and image annotation quality are summarized at the end of the article. [Conclusions and Prospects] Based on the actual needs of agricultural pest and disease image recognition, the following suggestions for future research and development on constructing high-quality agricultural pest and disease image datasets are proposed: (1) Construct agricultural pest and disease datasets combined with practical usage scenarios.
In order to enable the algorithm to extract richer target features, image data can be collected from multiple perspectives and environments to construct a dataset. According to actual needs, data categories can be divided scientifically and reasonably from the perspective of algorithm feature extraction, avoiding unreasonable inter-class and intra-class distances, and thus constructing a dataset that meets the classification requirements of the task and has a balanced feature distribution. (2) Balance the relationship between datasets and algorithms. When improving algorithms, consider a more sufficient distribution of categories and features in the dataset, as well as a dataset size that matches the model, to improve algorithm accuracy, robustness, and practicality. Comparative experiments on algorithm improvements should be conducted on the same standard evaluation dataset, so that the pest and disease image recognition algorithm can be improved. Research the correlation between the scale of agricultural pest and disease image data and algorithm performance, study the relationship between annotation methods and algorithms for pest and disease images that are difficult to annotate, integrate recognition algorithms for fuzzy, dense, and occluded targets, and propose evaluation indicators for agricultural pest and disease datasets. (3) Enhance the use value of datasets. Datasets can be used not only for research on image recognition, but also for research on other business needs. The identification, collection, and annotation of target images is a challenging task in the construction of pest and disease datasets. When collecting image data, attention can also be paid to collecting surrounding environmental information and host information. This approach can be used to construct a multimodal agricultural pest and disease dataset, fully leveraging the value of the dataset.
In order to focus researchers on business innovation research, it is necessary to innovate the organizational form of data collection, develop a big data platform for agricultural diseases and pests, explore the correlation between multimodal data, improve the accessibility and convenience of data, and provide efficient services for application implementation and business innovation.

Spectroscopic Detection of Rice Leaf Blast Infection at Different Leaf Positions at The Early Stages With Solar-Induced Chlorophyll Fluorescence | Open Access

CHENG Yuxin, XUE Bowen, KONG Yuanyuan, YAO Dongliang, TIAN Long, WANG Xue, YAO Xia, ZHU Yan, CAO Weixing, CHENG Tao

2023, 5(3): 35-48. doi:10.12133/j.smartag.SA202309008

[Objective] Rice blast is considered the most destructive disease threatening global rice production and causes severe economic losses worldwide. Early detection of rice blast plays an important role in resistance breeding and plant protection. At present, most studies on rice blast detection have been devoted to its symptomatic stage, and no previous study has used solar-induced chlorophyll fluorescence (SIF) to monitor rice leaf blast (RLB) at early stages. This research was conducted to investigate the early identification of RLB-infected leaves based on solar-induced chlorophyll fluorescence at different leaf positions. [Methods] Greenhouse experiments and field trials were conducted separately in Nanjing and Nantong in July and August 2021 to record SIF data of the top 1st to 4th leaves of rice plants at the jointing and heading stages with an Analytical Spectral Devices (ASD) spectrometer coupled with a FluoWat leaf clip and a halogen lamp. At the same time, the disease severity levels of the measured samples were recorded manually according to the GB/T 15790-2009 standard. After the continuous wavelet transform (CWT) of the SIF spectra, separability assessment and feature selection were applied. Wavelet features sensitive to RLB were extracted, and the sensitive features and their accuracy in identifying infected leaves were compared across leaf positions. Finally, RLB identification models were constructed based on linear discriminant analysis (LDA). [Results and Discussion] The results showed that the upward and downward SIF in the far-red region of infected leaves at each leaf position were significantly higher than those of healthy leaves. This may be due to infection by the fungal pathogen Magnaporthe oryzae, which may have destroyed the chloroplast structure and ultimately inhibited the primary reaction of photosynthesis.
In addition, both the upward and downward SIF in the red region and the far-red region increased as leaf position decreased. The sensitive wavelet features varied by leaf position, while most of them were distributed on the steep slope of the SIF spectrum and at wavelet scales 3, 4 and 5. The sensitive features of the top 1st leaf were mainly located at 665-680 nm, 755-790 nm and 815-830 nm. For the top 2nd leaf, the sensitive features were mainly found at 665-680 nm and 815-830 nm. For the top 3rd leaf, most of the sensitive features lay at 690 nm, 755-790 nm and 815-830 nm, and sensitive bands around 690 nm were observed. The sensitive features of the top 4th leaf were primarily located at 665-680 nm, 725 nm and 815-830 nm, and sensitive bands around 725 nm were observed. The wavelet features of the common sensitive region (665-680 nm) not only had physiological significance, but also coincided with the chlorophyll absorption peak, which allowed for reasonable spectral interpretation. The accuracy of the RLB identification models differed among leaf positions. Based on the upward and downward SIF, the overall accuracies for the top 1st leaf were 70% and 71%, respectively, which were higher than those for the other leaf positions. As a result, the top 1st leaf was an ideal indicator leaf for diagnosing RLB in the field. The classification accuracy of the SIF wavelet features was higher than that of the original SIF bands. Based on CWT and feature selection, the overall accuracies of the upward and downward optimal features of the top 1st to 4th leaves reached 70.13%, 63.70%, 64.63%, 64.53% and 70.90%, 63.12%, 62.00%, 64.02%, respectively. All of them were higher than those of the canopy monitoring feature F760, whose overall accuracies were 69.79%, 61.31%, 54.41%, 61.33% and 69.99%, 58.79%, 54.62%, 60.92%, respectively. This may be caused by the differences in physiological states of the top four leaves.
In addition to RLB infection, the SIF data of some top 3rd and top 4th leaves may also be affected by leaf senescence, while the SIF data of the top 1st leaf, the most recently unfolded leaf of the rice plant, were less affected by other physical and chemical parameters. This may explain why the top 1st leaf responded to RLB earlier than other leaves. The results also showed that the common sensitive features of the four leaf positions were concentrated on the steep slope of the SIF spectrum, with better classification performance around 675 and 815 nm. The classification accuracy of the optimal common features, ↑WF_832,3 and ↓WF_809,3, reached 69.45%, 62.19%, 60.35%, 63.00% and 69.98%, 62.78%, 60.51%, 61.30% for the top 1st to top 4th leaf positions, respectively. The optimal common features, ↑WF_832,3 and ↓WF_809,3, were both located at wavelet scale 3 and within 800-840 nm, which may be related to the destruction of the cell structure in response to Magnaporthe oryzae infection. [Conclusions] In this study, the SIF spectral response to RLB was revealed, and the identification models for the top 1st leaf were found to be the most precise among the top four leaves. In addition, the common wavelet features sensitive to RLB, ↑WF_832,3 and ↓WF_809,3, were extracted with an identification accuracy of 70%. The results proved the potential of CWT and SIF for RLB detection, which can provide an important reference and technical support for the early, rapid and non-destructive diagnosis of RLB in the field.
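To make the notion of a wavelet feature such as WF_832,3 (a CWT coefficient at one wavelength and one scale) more concrete, the following minimal sketch computes a single coefficient from a one-dimensional spectrum. The abstract does not state the mother wavelet, the scale convention, or the sampling interval, so the Mexican hat wavelet, the dyadic scale 2^3, and the 1 nm sampling used here are assumptions, and the two spectra are synthetic illustrations rather than SIF measurements.

```python
import math


def mexican_hat(t):
    """Mexican hat mother wavelet (assumed here; not confirmed by the abstract)."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)


def wavelet_feature(spectrum, center, scale_power):
    """CWT coefficient at sample index `center` for the dyadic scale 2**scale_power.

    `spectrum` is a list of values sampled at a uniform interval (assumed 1 nm).
    """
    a = 2.0 ** scale_power
    half_width = int(5 * a)  # the wavelet is negligible beyond 5 scale units
    total = 0.0
    for k in range(center - half_width, center + half_width + 1):
        if 0 <= k < len(spectrum):  # ignore samples outside the spectrum
            total += spectrum[k] * mexican_hat((k - center) / a)
    return total / math.sqrt(a)


# Synthetic spectra: a flat line and a smooth bump centred on index 100.
flat = [1.0] * 200
bump = [math.exp(-0.5 * ((k - 100) / 8.0) ** 2) for k in range(200)]

flat_coeff = wavelet_feature(flat, 100, 3)
bump_coeff = wavelet_feature(bump, 100, 3)
```

Because the wavelet has zero mean, the coefficient for the flat spectrum is close to zero while the bump gives a clearly positive value, which is why such features respond to the shape of the spectrum (for example its steep slopes) rather than to its overall level.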

Diagnosis of Grapevine Leafroll Disease Severity Infection via UAV Remote Sensing and Deep Learning | Open Access

LIU Yixue, SONG Yuyang, CUI Ping, FANG Yulin, SU Baofeng

2023, 5(3): 49-61. doi:10.12133/j.smartag.SA202308013

[Objective] Wine grapes are severely affected by leafroll disease, which impairs their growth and reduces the quality of the color, taste, and flavor of wine. Timely and accurate diagnosis of leafroll disease severity is crucial for preventing and controlling the disease and for improving wine grape fruit quality and wine-making potential. Unmanned aerial vehicle (UAV) remote sensing technology provides high-resolution images of wine grape vineyards, which can capture the features of grapevine canopies with different levels of leafroll disease severity. Deep learning networks extract complex and high-level features from UAV remote sensing images and perform fine-grained classification of leafroll disease infection severity. However, the diagnosis of leafroll disease severity is challenging due to the imbalanced data distribution of different infection levels and categories in UAV remote sensing images. [Method] A novel method for diagnosing leafroll disease severity at the canopy scale was developed using UAV remote sensing technology and deep learning. The main challenge of this task was the imbalanced data distribution of different infection levels and categories in UAV remote sensing images. To address this challenge, a method that combined deep learning fine-grained classification and generative adversarial networks (GANs) was proposed. In the first stage, GANformer, a Transformer-based GAN model, was used to generate diverse and realistic virtual canopy images of grapevines with different levels of leafroll disease severity. To further analyze the image generation effect of GANformer, t-distributed stochastic neighbor embedding (t-SNE) was used to visualize the learned features of real and simulated images. In the second stage, the CA-Swin Transformer, an improved image classification model based on the Swin Transformer and a channel attention mechanism, was used to classify the patch images into different classes of leafroll disease infection severity.
CA-Swin Transformer could also use a self-attention mechanism to capture the long-range dependencies of image patches and enhance the feature representation of the Swin Transformer model by adding a channel attention mechanism after each Transformer layer. The channel attention (CA) mechanism consisted of two fully connected layers and an activation function, which could extract correlations between different channels and amplify the informative features. The ArcFace loss function and an instance normalization layer were also used to enhance the fine-grained feature extraction and downsampling ability for grapevine canopy images. The UAV images of wine grape vineyards were collected and processed into orthomosaic images. They were labeled into three categories (healthy, moderate infection, and severe infection) using the in-field survey data. A sliding window method was used to extract patch images and labels from the orthomosaic images for training and testing. The performance of the improved method was compared with the baseline model using different loss functions and normalization methods. The distribution of leafroll disease severity in vineyards was mapped using the trained CA-Swin Transformer model. [Results and Discussions] The experimental results showed that GANformer could generate high-quality virtual canopy images of grapevines with an FID score of 93.20. The images generated by GANformer were visually very similar to real images and covered different levels of leafroll disease severity. The t-SNE visualization showed that the features of real and simulated images were well clustered and separated in two-dimensional space, indicating that GANformer learned meaningful and diverse features, which enriched the image dataset. Compared to CNN-based deep learning models, Transformer-based deep learning models had more advantages in diagnosing leafroll disease infection.
Swin Transformer achieved an optimal accuracy of 83.97% on the enhanced dataset, which was higher than that of other models such as GoogLeNet, MobileNetV2, NasNet Mobile, ResNet18, ResNet50, CVT, and T2TViT. Replacing the cross entropy loss function with the ArcFace loss function improved the classification accuracy by 1.50%, and applying instance normalization instead of layer normalization further improved the accuracy by 0.30%. Moreover, the proposed model with the channel attention mechanism, named CA-Swin Transformer, enhanced the feature representation of the Swin Transformer model and achieved the highest classification accuracy on the test set, 86.65%, which was 6.54% higher than that of the Swin Transformer on the original test dataset. The distribution map of leafroll disease severity in vineyards showed a certain correlation between leafroll disease severity and grape rows. Areas of Cabernet Sauvignon with more severe leafroll disease were more prone to have missing or weak plants. [Conclusions] A novel method for diagnosing grapevine leafroll disease severity at the canopy scale using UAV remote sensing technology and deep learning was proposed. This method can generate diverse and realistic virtual canopy images of grapevines with different levels of leafroll disease severity using GANformer, and classify them into different classes using CA-Swin Transformer. It can also map the distribution of leafroll disease severity in vineyards using a sliding window method, and it provides a new approach for crop disease monitoring based on UAV remote sensing technology.
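The channel attention mechanism described above (two fully connected layers and an activation function that reweight channels) can be sketched in a few lines. This is a minimal illustration in the squeeze-and-excitation style, not the authors' implementation: the global average pooling step, the ReLU between the two layers, the sigmoid gate, and all weights and feature values below are assumptions chosen for illustration.

```python
import math


def channel_attention(feature_map, w1, w2):
    """Reweight channels with a two-layer gating network.

    feature_map: list of channels, each a flat list of activations.
    w1: weights of the first fully connected layer (hidden x channels).
    w2: weights of the second fully connected layer (channels x hidden).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: one descriptor per channel (global average, assumed here).
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # First fully connected layer followed by ReLU (assumed activation).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Second fully connected layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Excite: scale every activation of a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_map)]
    return scaled, gates


# Hypothetical toy input: two channels with two activations each.
features = [[1.0, 3.0], [0.0, 0.0]]
fc1 = [[1.0, 1.0]]        # one hidden unit
fc2 = [[1.0], [-1.0]]     # one output gate per channel
scaled, gates = channel_attention(features, fc1, fc2)
```

In this toy case the first channel receives a gate near 0.88 and the second a gate near 0.12, so informative channels are amplified relative to the others.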

Wheat Lodging Types Detection Based on UAV Image Using Improved EfficientNetV2 | Open Access

LONG Jianing, ZHANG Zhao, LIU Xiaohang, LI Yunxia, RUI Zhaoyu, YU Jiangfan, ZHANG Man, FLORES Paulo, HAN Zhexiong, HU Can, WANG Xufeng

2023, 5(3): 62-74. doi:10.12133/j.smartag.SA202308010

[Objective] Wheat, as one of the major global food crops, plays a key role in food production and food supply. Different influencing factors can lead to different types of wheat lodging; for example, root lodging may be due to improper use of fertilizers, while stem lodging is mostly due to harsh environments. Different types of wheat lodging can have different impacts on yield and quality. The aim of this study was to categorize the types of wheat lodging from unmanned aerial vehicle (UAV) images and to investigate the effect of UAV flight altitude on classification performance. [Methods] Three UAV flight altitudes (15, 45, and 91 m) were set to acquire images of wheat test fields. The main research methods contained three parts: an automatic segmentation algorithm, wheat classification model selection, and an improved classification model, EfficientNetV2-C. In the first part, the automatic segmentation algorithm was used to segment the UAV images of the wheat test field acquired at the three heights and to build the training dataset needed for the classification model. First, the original wheat test field images acquired by the UAV were preprocessed through scaling, skew correction, and other methods to save computation time and improve segmentation accuracy. Subsequently, the pre-processed image information was analyzed, and the green part of the image was extracted using the excess green (ExG) algorithm; the result was binarized and combined with an edge contour extraction algorithm to remove the redundant part of the image and extract the region of interest, which completed the first segmentation.
Finally, pixel values were accumulated to locate abrupt increases, which gave the segmentation coordinates of the two sizes of wheat test field in the image. The region of interest was then segmented a second time into long rectangular and short rectangular test fields, so as to obtain the structural parameters of the different sizes of wheat test field and to generate the datasets for the different heights. In the second part, four machine learning classification models, support vector machine (SVM), K nearest neighbor (KNN), decision tree (DT), and naive Bayes (NB), and two deep learning classification models (ResNet101 and EfficientNetV2) were selected. Without any improvement, the six classification models were used to classify the UAV images collected at the three flight altitudes, and the optimal classification model was selected for improvement. In the third part, an improved model, EfficientNetV2-C, with EfficientNetV2 as the base model, was proposed to classify and recognize the lodging type of wheat in test field images. The main improvements concerned the attention mechanism and the loss function. For the attention mechanism, the squeeze and excitation (SE) module of the original model was replaced with coordinate attention (CA), which embeds position information into the channel attention and aggregates features along the width and height directions separately during feature extraction. It captures the long-distance correlation in the width direction while retaining the long-distance correlation in the height direction and accurate location information, thereby enhancing the spatial feature extraction capability of the network.
The loss function was replaced by class-balanced focal loss (CB-Focal Loss), which assigns different loss weights according to the number of valid samples in each class when dealing with unbalanced datasets, effectively mitigating the impact of data imbalance on the classification accuracy of the model. [Results and Discussions] For the four machine learning models, the average classification accuracies were 81.95% for SVM, 79.56% for DT, 59.32% for KNN, and 59.48% for NB. The average classification accuracies of the two deep learning models, ResNet101 and EfficientNetV2, were 78.04% and 81.61%, respectively. Comparing the six classification models, the EfficientNetV2 classification model performed best at all heights. The improved EfficientNetV2-C had an average accuracy of 90.59%, which was 8.98% higher than the average accuracy of EfficientNetV2. The SVM classification accuracies at the three flight altitudes of 15, 45, and 91 m were 81.33%, 83.57%, and 81.00%, respectively, with the highest accuracy at 45 m. The SVM results were similar across altitudes, which indicated that the imbalance of the input data categories did not affect the model's classification effect, and that the SVM classification model was able to handle the high dimensionality of the data efficiently and performed well on small and medium-sized datasets. For the deep learning classification models, however, as the flight altitude increased from 15 to 91 m, the classification performance decreased due to the loss of image feature information.
Among them, the classification accuracy of ResNet101 decreased from 81.57% to 78.04%, the classification accuracy of EfficientNetV2 decreased from 84.40% to 81.61%, and the classification accuracy of EfficientNetV2-C decreased from 97.65% to 90.59%. At each of the three altitudes, the differences among the precision, recall, and F₁-score values of EfficientNetV2-C were small, which indicated that the improved model in this study could effectively solve the problems of unbalanced classification results and poor classification effect caused by data imbalance. [Conclusions] The improved EfficientNetV2-C achieved high accuracy in wheat lodging type detection, which provides a new solution for wheat lodging early warning and crop management and is of great significance for improving wheat production efficiency and sustainable agricultural development.
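The idea behind class-balanced focal loss, weighting each class by the inverse of its effective number of samples and down-weighting easy examples, can be illustrated with a minimal sketch. The weighting formula (1 - β) / (1 - β^n) follows the commonly used class-balanced loss formulation; the values of `beta` and `gamma`, the use of the predicted probability of the true class, and the sample counts are assumptions for illustration and are not taken from the study.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights inversely proportional to the effective number of samples."""
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]
    raw = [1.0 / e for e in effective]
    # Normalise so the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def cb_focal_loss(prob_true_class, label, weights, gamma=2.0):
    """Focal loss for one sample, scaled by the class-balanced weight of its label."""
    p = min(max(prob_true_class, 1e-12), 1.0)  # guard against log(0)
    return -weights[label] * (1.0 - p) ** gamma * math.log(p)


# Hypothetical counts: a common lodging class and a rare one.
weights = class_balanced_weights([1000, 50])
loss_common = cb_focal_loss(0.9, 0, weights)
loss_rare = cb_focal_loss(0.9, 1, weights)
```

With these hypothetical counts the rare class receives the larger weight, so an equally confident prediction contributes more loss when it belongs to the under-represented class.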

Identification Method of Wheat Field Lodging Area Based on Deep Learning Semantic Segmentation and Transfer Learning | Open Access

ZHANG Gan, YAN Haifeng, HU Gensheng, ZHANG Dongyan, CHENG Tao, PAN Zhenggao, XU Haifeng, SHEN Shuhao, ZHU Keyu

2023, 5(3): 75-85. doi:10.12133/j.smartag.SA202309013

[Objective] Lodging constitutes a severe crop-related catastrophe, resulting in reduced photosynthesis intensity, diminished nutrient absorption efficiency, diminished crop yield, and compromised crop quality. The use of unmanned aerial vehicles (UAVs) to acquire agricultural remote sensing imagery, despite providing high-resolution details and clear indications of crop lodging, is limited by the size of the study area and the duration of the specific growth stages of the plants. This limitation hinders the acquisition of an adequate quantity of low-altitude remote sensing images of wheat fields, thereby detrimentally affecting the performance of the monitoring model. The aim of this study is to explore a method for precise segmentation of lodging areas given limited crop growth periods and research areas. [Methods] Compared to images captured at lower flight altitudes, images taken by UAVs at higher altitudes cover a larger area. Consequently, for the same area, fewer images are taken at higher altitudes than at lower altitudes. However, the training of deep learning models requires a large supply of images. To compensate for the insufficient quantity of high-altitude UAV images for training the lodging area monitoring model, a transfer learning strategy was proposed. To verify the effectiveness of the transfer learning strategy, a control model, a hybrid training model and a transfer learning model were obtained, based on the Swin-Transformer framework, by training on UAV images from 4 years (2019, 2020, 2021, 2023) and 3 study areas (Shucheng, Guohe, Baihu) at 2 flight altitudes (40 and 80 m). To test model performance, a comparative experimental approach was adopted to assess the accuracy of the three models in segmenting 80 m altitude images.
The assessment relied on five metrics: intersection over union (IoU), accuracy, precision, recall, and F₁-score. [Results and Discussions] The transfer learning model showed the highest accuracy in lodging area detection, with mean IoU, accuracy, precision, recall, and F₁-score of 85.37%, 94.98%, 91.30%, 92.52% and 91.84%, respectively. Notably, when the training dataset consisted solely of images obtained at 40 m altitude, lodging area detection was more accurate for 40 m images than for 80 m images. However, when mixed training or transfer learning was adopted and the training dataset was augmented with 80 m images, detection accuracy for 80 m images improved, albeit at the expense of reduced accuracy for 40 m images. The mixed training model and the transfer learning model performed similarly in lodging area detection for both 40 and 80 m altitude images. In a cross-study-area comparison of the mean values of the model evaluation indices, lodging area detection accuracy was slightly higher for images obtained in the Baihu area than in the Shucheng area, while accuracy for Shucheng images surpassed that for Guohe. These variations could be attributed to the diverse wheat varieties cultivated in the Guohe area through drill seeding. The high planting density of wheat in Guohe resulted in substantial lodging areas, accounting for 64.99% during the late mature period. The prevalence of semi-lodging wheat further exacerbated the issue, potentially leading to misidentification of non-lodging areas.
Consequently, this led to a reduction in the recall rate (mean recall for Guohe images was 89.77%, which was 4.88% and 3.57% lower than that for Baihu and Shucheng, respectively) and IoU (mean IoU for Guohe images was 80.38%, which was 8.80% and 3.94% lower than that for Baihu and Shucheng, respectively). Additionally, the accuracy, precision, and F₁-score for Guohe were also lower compared to Baihu and Shucheng. [Conclusions] This study inspected the efficacy of a strategy aimed at reducing the challenges associated with the insufficient number of high-altitude images for semantic segmentation model training. By pre-training the semantic segmentation model with low-altitude images and subsequently employing high-altitude images for transfer learning, improvements of 1.08% to 3.19% were achieved in mean IoU, accuracy, precision, recall, and F₁-score, alongside a notable mean weighted frame rate enhancement of 555.23 fps/m². The approach proposed in this study holds promise for improving lodging monitoring accuracy and the speed of image segmentation. In practical applications, it is feasible to leverage a substantial quantity of 40 m altitude UAV images collected from diverse study areas including various wheat varieties for pre-training purposes. Subsequently, a limited set of 80 m altitude images acquired in specific study areas can be employed for transfer learning, facilitating the development of a targeted lodging detection model. Future research will explore the utilization of UAV images captured at even higher flight altitudes for further enhancing lodging area detection efficiency.
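
The five evaluation metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes a binary labeling (1 = lodging pixel, 0 = background) and computes the metrics for a single flattened mask; the function name and the labeling are illustrative assumptions, and the abstract does not state how the authors averaged the metrics across images or classes.

```python
def segmentation_metrics(pred, truth):
    """Compute IoU, accuracy, precision, recall and F1 for one binary mask.

    pred and truth are equal-length flat sequences of 0/1 labels, where 1
    marks a lodging pixel (a labeling assumed for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # false negatives
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # true negatives

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on degenerate masks.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "iou": ratio(tp, tp + fp + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


metrics = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```

In this toy example two of the three predicted lodging pixels are correct and one true lodging pixel is missed, so IoU is 2 / (2 + 1 + 1).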

In Situ Identification Method of Maize Stalk Width Based on Binocular Vision and Improved YOLOv8 | Open Access

ZUO Haoxuan, HUANG Qicheng, YANG Jiahao, MENG Fanjia, LI Sien, LI Li

2023, 5(3): 86-95. doi:10.12133/j.smartag.SA202309004

[Objective] The width of maize stalks is an important indicator of the lodging resistance of maize. Measuring it remains problematic: manual collection is cumbersome, and automatic equipment suffers from large collection and recognition errors. A method for in-situ detection and high-precision identification of maize stalk width is therefore of great application value. [Methods] A ZED2i binocular camera was fixed in the field to capture real-time images of the left and right sides of maize stalks at the same time. The image acquisition system was based on the NVIDIA Jetson TX2 NX development board, which was programmed to take timed shots of both side views of the maize. Original maize images were collected and a dataset was established. To expose more features of the target area and improve the generalization ability of the trained model, the original images were augmented with five methods (saturation, brightness, contrast, and sharpness adjustment, and horizontal flipping), expanding the dataset to 3500 images. YOLOv8 was used as the original model for identifying maize stalks against a complex background. The coordinate attention (CA) mechanism can bring large gains to downstream tasks on lightweight networks: the attention block captures long-distance relationships in one direction while retaining spatial information in the other, so that position information is preserved in the generated attention map, which helps the network focus on the area of interest and locate the target more accurately. The CA module was added multiple times and fused with the C2f module in the original Backbone: the Bottleneck in the original C2f module was replaced by the CA module, yielding the redesigned C2fCA network module.
The loss function was replaced with Efficient IoU Loss (EIoU), which splits the aspect-ratio loss term into the differences between the predicted width and height and the width and height of the minimum enclosing box. This accelerated the convergence of the prediction box, improved its regression accuracy, and further improved the recognition accuracy of maize stalks. The binocular camera was then calibrated so that the left and right cameras were on the same three-dimensional plane. Three-dimensional reconstruction of the maize stalks was then performed, and the recognition frames of the left and right cameras were matched by an algorithm. First, the algorithm checked whether the numbers of recognition frames detected in the two images were equal; if not, the binocular images were re-imported. If they were equal, it compared the coordinate information and the width and height of the bounding boxes in the left and right images and checked whether the difference was less than the given threshold T_a. If the difference was greater than T_a, the images were re-imported; if it was less than T_a, the algorithm then checked whether the confidence level of the recognition frame was less than the given threshold T_b. If it was greater than T_b, the images were re-imported; if it was less than T_b, the recognition frames were taken to be the same maize stalk identified in the left and right images. When these conditions were met, the corresponding point matching in the binocular images was complete. After the three-dimensional reconstruction of the binocular images, the three-dimensional coordinates (A_x, A_y, A_z) and (B_x, B_y, B_z) of the upper-left and upper-right corners of the recognition box in the world coordinate system were obtained, and the distance between the two points was taken as the width of the maize stalk.
Finally, a comparative analysis was conducted among the improved YOLOv8 model, the original YOLOv8 model, faster region convolutional neural networks (Faster R-CNN), and single shot multiBox detector (SSD) to verify the recognition accuracy of the model. [Results and Discussions] The precision (P), recall (R), mean average precision mAP_0.5, and mean average precision mAP_0.5:0.95 of the improved YOLOv8 model reached 96.8%, 94.1%, 96.6% and 77.0%, respectively. These values were 1.3%, 1.3%, 1.0% and 11.6% higher than those of YOLOv7; 1.8%, 2.1%, 1.2% and 15.8% higher than those of YOLOv5; 31.1%, 40.3%, 46.2% and 37.6% higher than those of Faster R-CNN; and 20.6%, 23.8%, 20.9% and 20.1% higher than those of SSD. The coefficient of determination R², root mean square error (RMSE), and mean absolute error (MAE) of the linear regression were 0.373, 0.265 cm and 0.244 cm, respectively. The proposed method can meet the requirements of actual production for the measurement accuracy of maize stalk width. [Conclusions] The in-situ recognition method of maize stalk width based on the improved YOLOv8 model can accurately identify maize stalks in situ, which addresses the time-consuming and laborious nature of manual measurement and the poor recognition accuracy of machine vision, and provides a theoretical basis for practical production applications.
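
The frame-matching check and the width calculation described in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the function names, the (x, y, w, h) box layout, and the choice to compare only box width and height against T_a are assumptions, and the confidence check against T_b is omitted because the abstract does not specify it in enough detail.

```python
import math


def frames_match(left_boxes, right_boxes, t_a):
    """Return True if the left and right views plausibly show the same stalks.

    Each box is (x, y, w, h) in pixels and boxes are compared in order.
    Comparing only width and height against t_a is a simplifying
    assumption of this sketch.
    """
    if len(left_boxes) != len(right_boxes):
        return False  # unequal detection counts: re-import the image pair
    for (_, _, lw, lh), (_, _, rw, rh) in zip(left_boxes, right_boxes):
        if abs(lw - rw) >= t_a or abs(lh - rh) >= t_a:
            return False  # box sizes differ too much between the views
    return True


def stalk_width(corner_a, corner_b):
    """Euclidean distance between 3D points (A_x, A_y, A_z) and (B_x, B_y, B_z)."""
    return math.dist(corner_a, corner_b)


left = [(10, 20, 30, 200)]
right = [(5, 21, 31, 198)]
if frames_match(left, right, t_a=5):
    # Hypothetical reconstructed corner coordinates, in centimetres.
    print(stalk_width((0.0, 0.0, 50.0), (2.4, 0.0, 50.0)))
```

The width is simply the straight-line distance between the two reconstructed corner points, so its accuracy depends on the calibration and the quality of the matched recognition frames.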

Root Image Segmentation Method Based on Improved UNet and Transfer Learning | Open Access

TANG Hui, WANG Ming, YU Qiushi, ZHANG Jiaxi, LIU Liantao, WANG Nan

2023, 5(3): 96-109. doi:10.12133/j.smartag.SA202308003

[Objective] The root system is an important component of the plant, and its growth and development are crucial for plants. Root image segmentation is an important method for obtaining root phenotype information and analyzing root growth patterns. It remains difficult because of noise and image quality limitations, the intricate and diverse soil environment, and the ineffectiveness of conventional techniques. To increase the speed, accuracy, and resilience of root image segmentation, this paper proposed a multi-scale feature extraction root segmentation algorithm that combined data augmentation and transfer learning to enhance the generalization and universality of root image segmentation models. [Methods] Firstly, the experimental datasets were divided into a single dataset and a mixed dataset. The single dataset was acquired at the experimental station of Hebei Agricultural University in Baoding city, where a self-made RhizoPot device was used to collect a total of 600 images with a resolution of 10,200×14,039 pixels. Of these, 100 images were randomly selected, manually labeled using Adobe Photoshop CC2020, cut into tiles of 768×768 pixels, and divided into training, validation, and test sets at a ratio of 7:2:1. To increase the number of experimental samples, an open-source multi-crop mixed dataset was obtained online as a supplement and re-divided into training, validation, and test sets. The model was trained with a data augmentation strategy in which each augmentation operation was applied independently with a set probability of 0.3 during the image reading phase: when the random value drawn for an operation was less than 0.3, that change was made to the image. The data augmentation methods included changing image attributes, random cropping, rotating, and flipping.
The UNet structure was improved by designing eight different multi-scale image feature extraction modules. The module structure covered two aspects: image convolution and feature fusion. The convolution variants included the convolutional block attention module (CBAM), depthwise separable convolution (DP Conv), and convolution (Conv). The feature fusion methods were divided into concatenation and addition. Subsequently, ablation tests were conducted based on the single dataset, data augmentation, and random loading of model weights, and the optimal multi-scale feature extraction module was selected and compared with the original UNet. Under the same settings (single dataset, data augmentation, and random loading of model weights), the improved model was compared with the PSPNet, SegNet, and DeeplabV3Plus algorithms to validate its advantages. The improved model was then initialized with weights pre-trained on the single dataset and trained on the mixed dataset with data augmentation, further improving its generalization ability and root segmentation ability. [Results and Discussions] The results of the ablation tests indicated that Conv_2+Add was the best improved algorithm. Compared to the original UNet, the mIoU, mRecall, and root F₁ values of the model increased by 0.37%, 0.99%, and 0.56%, respectively. The comparative experiments indicated that the UNet+Conv_2+Add model was superior to the PSPNet, SegNet, and DeeplabV3Plus models, with the best evaluation results: its mIoU, mRecall, and root F₁ (harmonic mean) were 81.62%, 86.90%, and 77.97%, respectively. The segmented images obtained by the improved model were more finely processed at the root boundary than those of the other models. However, for roots with deep color and low contrast with soil particles, the improved model could only achieve sparse root recognition, sacrificing a certain amount of information extraction ability.
This study used the root phenotype evaluation software RhizoVision to analyze the root images produced by the improved UNet+Conv_2+Add model, PSPNet, SegNet, and DeeplabV3Plus, and obtained the values of four root phenotypes (total root length, average diameter, surface area, and volume). The results showed that the average diameter and surface area values of the improved model, UNet+Conv_2+Add, differed least from the manually labeled values and the SegNet values for these two indicators, and that its total root length and volume were the closest to those of the manual labeling. The transfer learning experiments showed that, compared with ordinary training, transfer training of the improved UNet+Conv_2+Add model increased the root IoU by 1.25%, the root Recall by 1.79%, and the root F₁ (harmonic mean) by 0.92%. Moreover, the overall convergence of the model was fast. Compared with regular training, transfer training of the original UNet improved the root IoU by 0.29%, the root Recall by 0.83%, and the root F₁ value by 0.21%, which indirectly confirmed the effectiveness of transfer learning. [Conclusions] The multi-scale feature extraction strategy proposed in this study can segment roots accurately and efficiently, and transfer learning further improves the model's generalization ability, providing an important foundation for crop root phenotype research.
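
The augmentation rule in this abstract, where each operation fires independently with probability 0.3 at image reading time, can be sketched as follows. The sketch assumes images are small nested lists and uses two illustrative operations (`hflip`, `rot90`); these names and the image representation are assumptions, not the authors' pipeline.

```python
import random


def hflip(image):
    """Flip a 2D image (list of rows) horizontally."""
    return [row[::-1] for row in image]


def rot90(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment(image, operations, probability=0.3, rng=None):
    """Apply each operation independently with the given probability.

    A fresh random value is drawn per operation, so whether one operation
    fires does not affect the others.
    """
    rng = rng or random
    for operation in operations:
        if rng.random() < probability:
            image = operation(image)
    return image


sample = [[1, 2], [3, 4]]
print(augment(sample, [hflip, rot90], probability=0.3))
```

Because the draws are independent, an image may receive no change, one change, or several changes in a single read, which is one plausible reading of "each method did not affect the other".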

Identification Method of Wheat Grain Phenotype Based on Deep Learning of ImCascade R-CNN | Open Access

PAN Weiting, SUN Mengli, YUN Yan, LIU Ping

2023, 5(3): 110-120. doi:10.12133/j.smartag.SA202304006

[Objective] Wheat serves as the primary source of dietary carbohydrates for the human population, supplying 20% of the required caloric intake. Currently, the primary objective of wheat breeding is to develop wheat varieties that exhibit both high quality and high yield, ensuring an overall increase in wheat production. Additionally, the consideration of phenotype parameters, such as grain length and width, holds significant importance in the introduction, screening, and evaluation of germplasm resources. Notably, a noteworthy positive association has been observed between grain size, grain shape, and grain weight. Simultaneously, within the scope of wheat breeding, the occurrence of inadequate harvest and storage practices can readily result in damage to wheat grains, consequently leading to a direct reduction in both emergence rate and yield. In essence, the integrity of wheat grains directly influences the wheat breeding process. Nevertheless, distinguishing between intact and damaged grains remains challenging due to the minimal disparities in certain characteristics, thereby impeding the accurate identification of damaged wheat grains through manual means. Consequently, this study aims to address this issue by focusing on the detection of wheat kernel integrity and completing the attainment of grain phenotype parameters. [Methods] This study presented an enhanced approach for addressing the challenges of low detection accuracy, unclear segmentation of wheat grain contour, and missing detection. The proposed strategy involves utilizing the Cascade Mask R-CNN model and replacing the backbone network with ResNeXt to mitigate gradient dispersion and minimize the model's parameter count. Furthermore, the inclusion of Mish as an activation function enhanced the efficiency and versatility of the detection model. Additionally, a multilayer convolutional structure was introduced in the detector to thoroughly investigate the latent features of wheat grains. 
The Soft-NMS algorithm was employed to identify the candidate frame and achieve accurate segmentation of the wheat kernel adhesion region. With these modifications, the ImCascade R-CNN model was developed. Simultaneously, to address the low accuracy of grain contour parameters caused by disordered grain arrangement, a grain contour-based algorithm for parameter acquisition was devised. A wheat grain can be approximated as an oval: the grain edge contour was obtained from the mask, the distance between the farthest contour points was found iteratively and taken as the grain length, and the grain width was derived from the area. Ultimately, a method for wheat kernel phenotype identification was put forth. The ImCascade R-CNN model was utilized to analyze wheat kernel images, extracting essential features and determining the integrity of the kernels through the classification and bounding box regression branches. The mask generation branch was employed to generate a mask map for individual wheat grains, enabling segmentation of the grain contours. Subsequently, the number of grains in the image was determined, and the length and width parameters of the entire wheat grain were computed. [Results and Discussions] In the experiment on wheat kernel phenotype recognition, the identification results of the Cascade Mask R-CNN model and the ImCascade R-CNN model were compared across the various improved modules, which verified the efficacy of the model modification scheme. The comparison of results between the Cascade Mask R-CNN model and the ImCascade R-CNN model validated the proposed model's ability to significantly decrease the missed detection rate. The effectiveness and advantages of the ImCascade R-CNN model were verified by comparing its loss value, P-R value, and mAP_50 value with those of the Cascade Mask R-CNN model.
In the context of wheat grain identification and segmentation, the detection results of the ImCascade R-CNN model were compared to those of the Cascade Mask R-CNN and Deeplabv3+ models. The comparison confirmed that the ImCascade R-CNN model exhibited superior performance in identifying and locating wheat grains, accurately segmenting wheat grain contours, and achieving an average accuracy of 90.2% in detecting wheat grain integrity. These findings serve as a foundation for obtaining kernel contour parameters. The grain length and grain width exhibited average error rates of 2.15% and 3.74%, respectively, while the standard error of the aspect ratio was 0.15. The statistical analysis and fitting of the grain length and width, as obtained through the proposed wheat grain shape identification method, yielded determination coefficients of 0.9351 and 0.8217, respectively. These coefficients demonstrated a strong agreement with the manually measured values, indicating that the method is capable of meeting the demands of wheat seed testing and providing precise data support for wheat breeding. [Conclusions] The findings of this study can be utilized for the rapid and precise detection of wheat grain integrity and the acquisition of comprehensive grain contour data. In contrast to current wheat kernel recognition technology, this research capitalizes on enhanced grain contour segmentation to furnish data support for the acquisition of wheat kernel contour parameters. Additionally, the refined contour parameter acquisition algorithm effectively mitigates the impact of disordered wheat kernel arrangement, resulting in more accurate parameter data compared to existing kernel appearance detectors available in the market, providing data support for wheat breeding and accelerating the cultivation of high-quality and high-yield wheat varieties.
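
The contour-based parameter acquisition described in this abstract (length as the farthest distance between contour points, width derived from the area of an oval) can be illustrated with a short Python sketch. The ellipse area relation used for the width is an assumption of this sketch, since the abstract only says the width is obtained "according to the area"; the function names are likewise illustrative.

```python
import math
from itertools import combinations


def grain_length(contour):
    """Largest distance between any two contour points (brute force)."""
    return max(math.dist(p, q) for p, q in combinations(contour, 2))


def grain_width(area, length):
    """Width of an ellipse with the given area and major-axis length.

    Assumes area = pi * (length / 2) * (width / 2), which is an
    approximation chosen for this sketch.
    """
    return 4.0 * area / (math.pi * length)


# Hypothetical contour points (pixels) sampled from one grain mask.
contour = [(0, 0), (4, 0), (2, 1), (2, -1)]
length = grain_length(contour)
width = grain_width(area=math.pi * 2 * 1, length=length)
print(length, width)
```

Because the length is taken from the farthest pair of contour points, it does not depend on how the grain is oriented in the image, which matches the stated goal of handling disordered grain arrangement.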

A Multi-Focal Green Plant Image Fusion Method Based on Stationary Wavelet Transform and Parameter-Adaptation Dual Channel Pulse-Coupled Neural Network | Open Access

LI Jiahao, QU Hongjun, GAO Mingzhe, TONG Dezhi, GUO Ya

2023, 5(3): 121-131. doi:10.12133/j.smartag.SA202308005

[Objective] Constructing a 3D point cloud model of green plants requires a large number of clear images. Because of the limited depth of field of the lens, part of the image is out of focus when a green plant image with a large depth of field is collected, resulting in edge blurring and loss of texture detail, which greatly affects the accuracy of the 3D point cloud model. Existing processing algorithms struggle to balance processing quality and processing speed, and their practical performance is not ideal. The purpose of this research is to improve the quality of the fused image while maintaining processing speed. [Methods] A plant image fusion method was proposed that combines a non-subsampled shearlet transform (NSST) based parameter-adaptive dual channel pulse-coupled neural network (PADC-PCNN) with the stationary wavelet transform (SWT). Firstly, the RGB image of the plant was separated into three color channels, and the G channel, which carries many features such as texture details, was decomposed by NSST with four decomposition layers and 16 directions into one group of low frequency subbands and 64 groups of high frequency subbands. The low frequency subband used the gradient energy fusion rule, and the high frequency subbands used the PADC-PCNN fusion rule. In addition, the weighted eight-neighborhood modified Laplacian operator was used as the link strength of the high-frequency fusion part, which enhanced the fusion of detailed features. For the R and B channels, which carry more contour and background information, a fast and translation-invariant SWT was used to suppress the pseudo-Gibbs effect. Using a high-precision, high-stability multi-focal length plant image acquisition system, 480 images in 8 experimental groups were collected.
The 8 groups of data were divided into an indoor light group, natural light group, strong light group, distant view group, close view group, overlooking group, red group, and yellow group. Meanwhile, to study the application range of the algorithm, the focal length of the collected clear plant image (18 mm) was used as the reference, and the focal length was adjusted four times before and after the reference in steps of 1.5 mm, forming the multi-focus experimental group. Subjective and objective evaluations were carried out for each experimental group to verify the performance of the algorithm. The subjective evaluation relied on human eye observation, detail comparison, and similar means, and was mainly based on the human visual effect. The objective evaluation used four common indicators: average gradient (AG), spatial frequency (SF), entropy (EN), and standard deviation (SD). [Results and Discussions] The proposed PADC-PCNN-SWT algorithm was compared with five other algorithms: the fast guided filtering algorithm (FGF), the random walk algorithm (RW), the non-subsampled shearlet transform based PCNN (NSST-PCNN) algorithm, the SWT algorithm, and the non-subsampled shearlet transform based parameter-adaptive dual-channel pulse-coupled neural network (NSST-PADC). In the objective evaluation data for all groups except the red group and the yellow group, each index of the PADC-PCNN-SWT algorithm was second only to that of the NSST-PADC algorithm, but its processing speed was 200.0% higher than that of the NSST-PADC algorithm on average. Compared with the FGF, RW, NSST-PCNN, and SWT algorithms, the PADC-PCNN-SWT algorithm improved the clarity index by 5.6%, 8.1%, 6.1%, and 17.6%, respectively, and improved the spatial frequency index by 2.9%, 4.8%, 7.1%, and 15.9%, respectively. The differences in information entropy and standard deviation were less than 1%, and their influence was ignored.
In the yellow group and the red group, the fusion quality of the non-green part of the algorithm based on PADC-PCNN-SWT was seriously degraded. Compared with other algorithms, the sharpness index of the algorithm based on PADC-PCNN-SWT decreased by an average of 1.1%, and the spatial frequency decreased by an average of 5.1%. However, the indicators of the green part of the fused image were basically consistent with the previous several groups of experiments, and the fusion effect was good. Therefore, the algorithm based on PADC-PCNN-SWT only had a good fusion effect on green plants. Finally, by comparing the quality of four groups of fused images with different focal length ranges, the results showed that the algorithm based on PADC-PCNN-SWT had a better contour and color restoration effect for out-of-focus images in the range of 15-21 mm, and the focusing range based on PADC-PCNN-SWT was about 6 mm. [Conclusions] The multi-focal length image fusion algorithm based on PADC-PCNN-SWT achieved better detail fusion performance and higher image fusion efficiency while ensuring fusion quality, providing high-quality data, and saving a lot of time for building 3D point cloud model of green plants.
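
The four objective indicators named in this abstract (AG, SF, EN, SD) can be sketched in plain Python for a single-channel image stored as a list of rows. Normalization conventions for AG and SF vary between papers, and the abstract does not give the authors' exact formulas, so the definitions below are one common variant and should be read as assumptions.

```python
import math
from collections import Counter


def average_gradient(img):
    """Mean of sqrt((dx^2 + dy^2) / 2) over forward differences."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]
            dy = img[i + 1][j] - img[i][j]
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))


def spatial_frequency(img):
    """sqrt(RF^2 + CF^2) from mean squared horizontal and vertical differences."""
    rows, cols = len(img), len(img[0])
    row_sq = [(r[j] - r[j - 1]) ** 2 for r in img for j in range(1, cols)]
    col_sq = [(img[i][j] - img[i - 1][j]) ** 2
              for i in range(1, rows) for j in range(cols)]
    return math.sqrt(sum(row_sq) / len(row_sq) + sum(col_sq) / len(col_sq))


def entropy(img):
    """Shannon entropy (bits) of the gray-level histogram."""
    pixels = [v for row in img for v in row]
    n = len(pixels)
    return -sum(c / n * math.log2(c / n) for c in Counter(pixels).values())


def standard_deviation(img):
    """Population standard deviation of the pixel values."""
    pixels = [v for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((v - mean) ** 2 for v in pixels) / len(pixels))


image = [[0, 0], [255, 255]]  # toy image; must be at least 2x2
print(average_gradient(image), spatial_frequency(image),
      entropy(image), standard_deviation(image))
```

Higher AG and SF indicate sharper detail, while EN and SD reflect information content and contrast, which is why small differences in the latter two were treated as negligible in the comparison.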

Visible/NIR Spectral Inversion of Malondialdehyde Content in JUNCAO Based on Deep Convolutional Generative Adversarial Network | Open Access

YE Dapeng, CHEN Chen, LI Huilin, LEI Yingxiao, WENG Haiyong, QU Fangfang

2023, 5(3): 132-141. doi:10.12133/j.smartag.SA202307011

[Objective] JUNCAO is a perennial herbaceous plant that can be used as a medium for cultivating edible and medicinal fungi. It is valuable to promote, but the problem of overwintering must be overcome when it is planted in the temperate zone. Low-temperature stress can adversely affect the growth of JUNCAO plants. Malondialdehyde (MDA) is a degradation product of polyunsaturated fatty acid peroxides and can serve as a useful diagnostic indicator for studying plant growth dynamics. The more severe the damage caused by low-temperature stress, the higher the MDA content of the plant. The detection of MDA content can therefore provide guidance for low-temperature stress diagnosis and JUNCAO breeding. With the development of optical sensors and machine learning technologies, visible/near-infrared spectroscopy combined with algorithmic models has great potential for rapid, non-destructive, and high-throughput inversion of MDA content and evaluation of JUNCAO growth dynamics. [Methods] In this research, six varieties of JUNCAO plants were selected as experimental subjects. They were divided into a control group planted at ambient temperature (28°C) and a stress group planted at low temperature (4°C). The hyperspectral reflectances of JUNCAO leaves during the seedling stage were collected using an ASD spectroradiometer and a near-infrared spectrometer, and the leaf physiological indicators were then measured to obtain leaf MDA content. Machine learning methods were used to establish MDA content inversion models based on the collected spectral reflectance data. To enhance the prediction accuracy of the model, an improved one-dimensional deep convolutional generative adversarial network (DCGAN) was proposed to increase the sample size of the training set. Firstly, the original samples were divided into a training set (96 samples) and a prediction set (48 samples) using the Kennard-Stone (KS) algorithm at a ratio of 2:1.
Secondly, the 96 training set samples were fed to the DCGAN model to generate a total of 384 pseudo samples, 4 times the size of the training set. The pseudo samples were randomly shuffled and sequentially added to the training set to form an enhanced modeling set. Finally, the MDA quantitative detection models were established based on random forest (RF), partial least squares regression (PLSR), and convolutional neural network (CNN) algorithms. By comparing the prediction accuracies of the three models after increasing the sample size of the training set, the best MDA regression detection model for JUNCAO was obtained. [Results and Discussions] (1) The MDA content of the six varieties of JUNCAO plants ranged from 12.1988 to 36.7918 nmol/g. Notably, the MDA content of JUNCAO under low-temperature stress increased significantly compared to the control group (P<0.05). Moreover, the visible/near-infrared spectral reflectance in the stressed group also showed an increasing trend compared to the control group. (2) Samples generated by the DCGAN model conformed to the distribution patterns of the original samples. The spectral curves of the generated samples retained the shape and trends of the original data. The corresponding MDA contents of the generated samples consistently fell within the range of the original samples, with the average and standard deviation decreasing by only 0.6650 and 0.9743 nmol/g, respectively. (3) Prior to the inclusion of generated samples, the detection performance of the three models differed significantly, with a coefficient of determination (R²) of 0.6967 for the RF model, 0.6729 for the CNN model, and 0.5298 for the PLSR model.
After the introduction of generated samples, as the number of samples increased, all three models exhibited an initial increase followed by a decrease in R² on the prediction set, while the root mean square error of prediction (RMSEP) first decreased and then increased. (4) The prediction results of the three regression models indicated that augmenting the sample size with DCGAN could effectively enhance the prediction performance of the models. In particular, DCGAN combined with the RF model achieved the best MDA content detection performance, with an R² of 0.7922 and an RMSEP of 2.1937. [Conclusions] Under low-temperature stress, the MDA content and spectral reflectance of the leaves of the six JUNCAO varieties increased significantly compared to the control group, which might be due to damage to leaf pigments and tissue structure and a decrease in leaf water content. Augmenting the sample size using DCGAN effectively enhanced the reliability and detection accuracy of the models. This improvement was evident across different regression models, illustrating the robust generalization capabilities of the DCGAN deep learning network. Specifically, the combination of DCGAN and the RF model achieved the best MDA content detection performance, as expanding to a sufficiently large sample dataset helped improve modeling accuracy and stability. This research provides valuable insights for JUNCAO breeding and the diagnosis of low-temperature stress based on spectral technology and machine learning methods, offering a scientific basis for achieving high, stable, and efficient utilization of JUNCAO plants.
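
The Kennard-Stone (KS) split mentioned in this abstract selects training samples that cover the feature space evenly. The sketch below shows the standard selection rule in plain Python under the assumption of Euclidean distance between feature vectors; the function name is illustrative, and in the study each sample would be a spectrum, with 96 of 144 samples selected for training.

```python
import math


def kennard_stone_split(samples, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(samples)")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_train:
        # Pick the candidate whose nearest selected sample is farthest away.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected, remaining


# Toy one-dimensional "spectra"; real inputs would be reflectance vectors.
spectra = [[0.0], [1.0], [2.0], [10.0]]
train_idx, test_idx = kennard_stone_split(spectra, n_train=3)
print(train_idx, test_idx)
```

The pairwise distance matrix makes this sketch quadratic in the number of samples, which is acceptable for a dataset of 144 spectra.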

A Hyperspectral Image-Based Method for Estimating Water and Chlorophyll Contents in Maize Leaves under Drought Stress | Open Access

WANG Jingyong, ZHANG Mingzhen, LING Huarong, WANG Ziting, GAI Jingyao

2023, 5(3): 142-153. doi:10.12133/j.smartag.SA202308018

[Objectives] Chlorophyll content and water content are key physiological indicators of crop growth, and their non-destructive detection is a key technology for monitoring crop growth status, such as drought stress. This study took maize as the object and developed a hyperspectral-based approach for the rapid and non-destructive acquisition of leaf chlorophyll content and water content for drought stress assessment. [Methods] Drought treatment experiments were carried out in a greenhouse of the College of Agriculture, Guangxi University. Maize plants were subjected to drought stress treatment at the seedling stage (four leaves). Four treatments were set up: normal watering (CK), mild drought (W1), moderate drought (W2), and severe drought (W3). Leaf samples were collected on the 3rd, 6th, and 9th days after drought treatment, and 288 leaf samples were collected in total, with the corresponding chlorophyll content and water content measured following a standard laboratory protocol. A pair of push-broom hyperspectral cameras was used to collect images of the 288 seedling maize leaf samples, and image processing techniques were used to extract the mean spectra of the leaf lamina. The algorithm framework of "pre-processing - feature extraction - machine learning inversion" was adopted for processing the extracted spectral data. The effects of different pre-processing methods, feature wavelength extraction methods, and machine learning regression models on the prediction performance of chlorophyll content and water content were analyzed systematically. Accordingly, the optimal chlorophyll content and water content inversion models were constructed. Firstly, 70% of the spectral data was randomly sampled and used as the training dataset for training the inversion model, whereas the remaining 30% was used as the testing dataset to evaluate the performance of the inversion model. 
Subsequently, the effects of different spectral pre-processing methods on the prediction performance of chlorophyll content and water content were compared. Different feature wavelengths were extracted from the optimal pre-processed spectra using different algorithms, and their capabilities in preserving the information useful for the inversion of leaf chlorophyll content and water content were compared. Finally, the performances of different machine learning regression models were compared, and the optimal inversion model was constructed and used to visualize the chlorophyll content and water content. Additionally, new vegetation indices were constructed for the inversion of chlorophyll content and water content, and their inversion ability was evaluated. The performance evaluation indexes used were the determination coefficient (R²) and the root mean squared error (RMSE). [Results and Discussions] The reflectivity of leaves in the wavelength range of 400~1700 nm gradually increased with the degree of drought stress. Combining stepwise regression (SR) feature extraction with Stacking regression obtained the optimal performance for chlorophyll content prediction, with an R² of 0.878 and an RMSE of 0.317 mg/g. Compared with the full-band Stacking model, SR-Stacking not only improved R² by 2.9% and reduced RMSE by 0.0356 mg/g, but also reduced the number of model input variables from 1301 to 9. Combining the successive projection algorithm (SPA) feature extraction with Stacking regression obtained the optimal performance for water content prediction, with an R² of 0.859 and an RMSE of 3.75%. Compared with the full-band Stacking model, SPA-Stacking not only increased R² by 0.2% and reduced RMSE by 0.03%, but also reduced the number of model input variables from 1301 to 16. 
Among the newly constructed vegetation indices, the normalized difference vegetation index (NDVI) [(R₄₁₀-R₅₅₉)/(R₄₁₀+R₅₅₉)] and the ratio index (RI) (R₄₀₀/R₁₁₇₁) had the highest accuracy for chlorophyll content and water content inversion, respectively, and were significantly more accurate than the traditional vegetation indices. Their R² values were 0.803 and 0.827, and their RMSE values were 0.403 mg/g and 3.28%, respectively. The chlorophyll content and water content of leaves were visualized. The results showed that the physiological parameters of leaves could be visualized and that differences in physiological parameters between regions of the same leaf could be observed more intuitively and in more detail. [Conclusions] The inversion models and vegetation indices constructed based on hyperspectral information can achieve accurate and non-destructive measurement of chlorophyll content and water content in maize leaves. This study can provide a theoretical basis and technical support for real-time monitoring of maize growth status. Using the leaf spectral information and the optimal model, the water content and chlorophyll content of each pixel of the hyperspectral image can be predicted, and the distribution of water content and chlorophyll content can be intuitively displayed by color. Because the field environment is more complex, transfer learning will be carried out in future work to improve the generalization ability of the models in different environments, with the aim of developing an online monitoring system for field drought and nutrient stress.
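
The two index formulas reported above, (R₄₁₀-R₅₅₉)/(R₄₁₀+R₅₅₉) and R₄₀₀/R₁₁₇₁, can be evaluated from a mean leaf spectrum as in the following minimal sketch. The dictionary-based spectrum, the nearest-band lookup, the function names, and the reflectance values are all assumptions made for illustration; they are not taken from the paper.

```python
def band_reflectance(spectrum, wavelength_nm):
    """Return the reflectance at the sampled band closest to wavelength_nm."""
    nearest = min(spectrum, key=lambda w: abs(w - wavelength_nm))
    return spectrum[nearest]


def normalized_difference(spectrum, band_a, band_b):
    """Normalized difference index: (Ra - Rb) / (Ra + Rb)."""
    ra = band_reflectance(spectrum, band_a)
    rb = band_reflectance(spectrum, band_b)
    return (ra - rb) / (ra + rb)


def ratio_index(spectrum, band_a, band_b):
    """Ratio index: Ra / Rb."""
    return band_reflectance(spectrum, band_a) / band_reflectance(spectrum, band_b)


# Hypothetical mean leaf spectrum: wavelength in nm -> reflectance.
leaf = {400: 0.05, 410: 0.06, 559: 0.14, 1171: 0.40}

chlorophyll_index = normalized_difference(leaf, 410, 559)  # bands from the abstract
water_index = ratio_index(leaf, 400, 1171)                 # bands from the abstract
```

An index value computed this way would still need a fitted regression against measured chlorophyll or water content before it yields a prediction in mg/g or percent.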

Low-Cost Chlorophyll Fluorescence Imaging System Applied in Plant Physiology Status Detection | Open Access

YANG Zhenyu, TANG Hao, GE Wei, XIA Qian, TONG Dezhi, FU Lijiang, GUO Ya

2023, 5(3): 154-165. doi:10.12133/j.smartag.SA202306006

[Objective] Chlorophyll fluorescence (ChlF) emission from photosystem II (PSII) is closely coupled with photochemical reactions. As an efficient and non-destructive means of obtaining information on plant photosynthetic efficiency and physiological state, the collection of fluorescence signals is widely used in fields such as plant physiological research and smart agricultural information sensing. Chlorophyll fluorescence imaging systems, which are the experimental devices for collecting the fluorescence signal, are difficult to apply widely because of their high price and complex structure. To address these issues, this paper designs and constructs a low-cost chlorophyll fluorescence imaging system based on a micro complementary metal oxide semiconductor (CMOS) camera and a smartphone, and carries out experimental verification and applications with it. [Method] The chlorophyll fluorescence imaging system is mainly composed of three parts: the excitation light, the CMOS camera and its control circuit, and an upper computer based on a smartphone. The light source of the excitation light group was designed according to the principle and characteristics of chlorophyll fluorescence, and uses a blue light source in the 460 nm band to achieve the best fluorescence excitation effect. In terms of structure, the principle of the integrating sphere was borrowed, a bowl-shaped light source structure was adopted, and an LED surface light source design was used to meet the requirements of chlorophyll fluorescence signal measurement for the uniformity of the excitation light field. For the adjustment of light source intensity, a pulse width modulation control scheme was adopted, which could realize sequential control of different intensities of excitation light. 
Through the simulation analysis of the light field, the light intensity and distribution characteristics of the light field were studied, and the calibration of the excitation light group was completed according to the simulation results. The OV5640 micro CMOS camera was used to collect fluorescence images. Based on the imaging principle of the CMOS camera, the fluorescence imaging intensity of the CMOS camera was calculated, and its ability to collect chlorophyll fluorescence was analyzed and discussed. The control circuit of the CMOS camera uses an STM32 microcontroller as the microcontroller unit, and completes the data communication with the synchronous light group control circuit and with the smartphone through the RS232 to TTL serial communication module and the full-speed universal serial bus, respectively. The smartphone upper computer software is the operating software of the chlorophyll fluorescence imaging system user terminal and the overall control program for fluorescence image acquisition. The overall workflow can be summarized as follows: the user sets the relevant excitation light parameters and camera shooting instructions in the upper computer as needed, and the upper computer sends the instructions to the control circuit through the universal serial bus and serial port to control the excitation light and the CMOS camera image acquisition. After the chlorophyll fluorescence image collection is completed, the data are sent back to the smartphone or a server for analysis, processing, storage, and display. In order to verify the design of the proposed scheme, a prototype of the chlorophyll fluorescence imaging system based on this scheme was built for experimental verification. Firstly, the uniformity of the light field of the excitation light was measured to test the actual performance of the excitation light designed in this article. 
On this basis, chlorophyll fluorescence imaging experiments under continuous light excitation and modulated pulse light protocols were completed. Through the analysis and processing of the experimental results and comparison with mainstream chlorophyll fluorometers, the fluorescence imaging capabilities and low-cost advantages of this chlorophyll fluorometer were further verified. [Results and Discussions] The maximum excitation light intensity of the chlorophyll fluorescence imaging system designed in this article was 6250 µmol/(m²·s). Through the simulation analysis of the light field and the calculation and analysis of the fluorescence imaging intensity of the CMOS camera, the feasibility of collecting chlorophyll fluorescence images with the OV5640 micro CMOS camera was demonstrated, which provided a basis for the specific design and implementation of the fluorometer. In terms of hardware circuits, the system made full use of the software and hardware advantages of smartphones, and required only the control circuits of the excitation light and CMOS camera and the corresponding communication modules to complete fluorescence image collection, simplifying the circuit structure and reducing hardware costs to the greatest extent. The final fluorescence instrument achieved a collection resolution of 5 million pixels, a spectral range of 400~1000 nm, and a stable acquisition frequency of up to 42 frames/s. Experimental results showed that the measured data were consistent with theoretical analysis and simulation and could meet the requirements of fluorescence detection. The instrument was capable of collecting images of chlorophyll fluorescence under continuous light excitation or under the modulated pulsed light protocol. The acquired chlorophyll fluorescence images could reflect the two-dimensional heterogeneity of leaves and could effectively distinguish the photosynthetic characteristics of different leaves. 
Typical chlorophyll fluorescence parameter images, such as F_v/F_m and Rfd, were in line with expectations. Compared with existing chlorophyll fluorescence imaging systems, the system designed in this article has obvious cost advantages while realizing rapid detection of chlorophyll fluorescence. [Conclusions] The instrument has a simple structure and low cost, and has good application value for the detection of plant physiology and environmental changes. The system is also useful for developing other fluorescence instruments.
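
The parameter images mentioned above can be derived pixel by pixel from the raw fluorescence frames. The sketch below assumes the commonly used definitions F_v/F_m = (F_m - F_0)/F_m and Rfd = (F_peak - F_s)/F_s; the abstract does not give the formulas the authors applied, and the function names, the zero-valued background handling, and the 2x2 frames are hypothetical.

```python
def fv_fm_image(f0, fm):
    """Per-pixel maximum PSII quantum yield: (Fm - F0) / Fm.

    Pixels with Fm == 0 (assumed background) are set to 0.0.
    """
    return [
        [(m - o) / m if m > 0 else 0.0 for o, m in zip(row_f0, row_fm)]
        for row_f0, row_fm in zip(f0, fm)
    ]


def rfd_image(f_peak, fs):
    """Per-pixel fluorescence decrease ratio: (F_peak - Fs) / Fs.

    Pixels with Fs == 0 (assumed background) are set to 0.0.
    """
    return [
        [(p - s) / s if s > 0 else 0.0 for p, s in zip(row_p, row_s)]
        for row_p, row_s in zip(f_peak, fs)
    ]


# Hypothetical 2x2 frames of raw fluorescence intensity (arbitrary units).
f0 = [[200, 220], [0, 250]]       # minimal fluorescence, dark-adapted
fm = [[1000, 880], [0, 1000]]     # maximal fluorescence
fs = [[250, 220], [0, 400]]       # steady-state fluorescence

fv_fm = fv_fm_image(f0, fm)
rfd = rfd_image(fm, fs)           # peak frame taken to be fm in this sketch
```

Working on nested lists keeps the sketch dependency-free; with real 5-megapixel frames an array library would be the more practical choice.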

Table of Content