
Articles on Information Perception and Acquisition in Smart Agriculture

    A Rapid Detection Method for Wheat Seedling Leaf Number in Complex Field Scenarios Based on Improved YOLOv8
    HOU Yiting, RAO Yuan, SONG He, NIE Zhenjun, WANG Tan, HE Haoxu
    Smart Agriculture    2024, 6 (4): 128-137.   DOI: 10.12133/j.smartag.SA202403019

    [Objective] The enumeration of wheat leaves is an essential indicator for evaluating the vegetative state of wheat and predicting its yield potential. Currently, the process of wheat leaf counting in field settings is predominantly manual, characterized by being both time-consuming and labor-intensive. Despite advancements, the efficiency and accuracy of existing automated detection and counting methodologies have yet to satisfy the stringent demands of practical agricultural applications. This study aims to develop a method for the rapid quantification of wheat leaves to refine the precision of wheat leaf tip detection. [Methods] To enhance the accuracy of wheat leaf detection, firstly, an image dataset of wheat leaves across various developmental stages (seedling, tillering, and overwintering) was constructed under two distinct lighting conditions, using visible light images sourced from both mobile devices and field camera equipment. Considering the robust feature extraction and multi-scale feature fusion capabilities of the YOLOv8 network, the foundational architecture of the proposed model was based on YOLOv8, into which a coordinate attention mechanism was integrated. To expedite the model's convergence, the loss functions were optimized. Furthermore, a dedicated small object detection layer was introduced to refine the recognition of wheat leaf tips, which were typically difficult for conventional models to discern due to their small size and resemblance to background elements. This deep learning network, named YOLOv8-CSD and tailored for the recognition of small targets such as wheat leaf tips, ascertains the leaf count by detecting the number of leaf tips present within the image. The YOLOv8-CSD model was compared with the original YOLOv8 and six other prominent network architectures, including Faster R-CNN, Mask R-CNN, YOLOv7, and SSD, within a uniform training framework, to evaluate the model's effectiveness. In parallel, the performance of both the original and YOLOv8-CSD models was assessed under challenging conditions, such as the presence of weeds, occlusions, and fluctuating lighting, to emulate complex real-world scenarios. Ultimately, the YOLOv8-CSD model was deployed for wheat leaf number detection in intricate field conditions to confirm its practical applicability and generalization potential. [Results and Discussions] The research presented a methodology that achieved a recognition precision of 91.6% and an mAP0.5 of 85.1% for wheat leaf tips, indicative of its robust detection capabilities. This method excelled in adaptability within complex field environments, featuring an autonomous adjustment mechanism for different lighting conditions, which significantly enhanced the model's robustness. The minimal rate of missed detections in wheat seedlings' leaf counting underscored the method's suitability for wheat leaf tip recognition in intricate field scenarios, consequently elevating the precision of wheat leaf number detection. The sophisticated algorithm embedded within this model demonstrated a heightened capacity to discern and focus on the unique features of wheat leaf tips during the detection process. This capability was essential for overcoming challenges such as small target sizes, similar background textures, and the intricacies of feature extraction.
The model's consistent performance across diverse conditions, including scenarios with weeds, occlusions, and fluctuating lighting, further substantiated its robustness and its readiness for real-world application. [Conclusions] This research offers a valuable reference for accurately detecting wheat leaf numbers in intricate field conditions, as well as robust technical support for the comprehensive and high-quality assessment of wheat growth.
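    The coordinate attention mechanism integrated into YOLOv8 above factors channel attention into two direction-aware pooling steps so that positional cues for tiny leaf tips are preserved. Below is a minimal PyTorch sketch of such a block, assuming an illustrative reduction ratio and activation function rather than the authors' exact configuration.

    import torch
    import torch.nn as nn

    class CoordinateAttention(nn.Module):
        """Direction-aware channel attention: pool along H and W separately, then re-weight."""
        def __init__(self, channels: int, reduction: int = 32):
            super().__init__()
            mid = max(8, channels // reduction)            # illustrative bottleneck width
            self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (N, C, H, 1): pool over width
            self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (N, C, 1, W): pool over height
            self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
            self.bn1 = nn.BatchNorm2d(mid)
            self.act = nn.Hardswish()
            self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
            self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            n, c, h, w = x.shape
            x_h = self.pool_h(x)                           # (N, C, H, 1)
            x_w = self.pool_w(x).permute(0, 1, 3, 2)       # (N, C, W, 1)
            y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
            y_h, y_w = torch.split(y, [h, w], dim=2)
            a_h = torch.sigmoid(self.conv_h(y_h))                      # (N, C, H, 1)
            a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (N, C, 1, W)
            return x * a_h * a_w

    # Toy usage: the output keeps the input shape, so the block can follow any backbone stage.
    features = torch.rand(1, 64, 80, 80)
    print(CoordinateAttention(64)(features).shape)         # torch.Size([1, 64, 80, 80])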

    Recognition Method of Facility Cucumber Farming Behaviours Based on Improved SlowFast Model
    HE Feng, WU Huarui, SHI Yangming, ZHU Huaji
    Smart Agriculture    2024, 6 (3): 118-127.   DOI: 10.12133/j.smartag.SA202402001

    [Objective] The identification of agricultural activities plays a crucial role in greenhouse vegetable production, particularly in the precise management of cucumber cultivation. By monitoring and analyzing the timing and procedures of agricultural operations, effective guidance can be provided for agricultural production, leading to increased crop yield and quality. However, in practical applications, the recognition of agricultural activities in cucumber cultivation faces significant challenges. The complex and ever-changing growing environment of cucumbers, including dense foliage and internal facility structures that may obstruct visibility, poses difficulties in recognizing agricultural activities. Additionally, agricultural tasks involve various stages such as planting, irrigation, fertilization, and pruning, each with specific operational intricacies and skill requirements. This requires the recognition system to accurately capture the characteristics of various complex movements to ensure the accuracy and reliability of the entire recognition process. To address these complex challenges, an innovative algorithm, SlowFast-SMC-ECA (SlowFast-Spatio-Temporal Excitation, Channel Excitation, Motion Excitation-Efficient Channel Attention), was proposed for the recognition of agricultural activity behaviors in cucumber cultivation within facilities. [Methods] This algorithm represents a significant enhancement to the traditional SlowFast model, with the goal of more accurately capturing hand motion features and crucial dynamic information in agricultural activities. The fundamental concept of the SlowFast model involved processing video streams through two distinct pathways: the Slow Pathway concentrated on capturing spatial detail information, while the Fast Pathway emphasized capturing temporal changes in rapid movements. To further improve information exchange between the Slow and Fast pathways, lateral connections were incorporated at each stage. Building upon this foundation, the study introduced innovative enhancements to both pathways, improving the overall performance of the model. In the Fast Pathway, a multi-path residual network (SMC) concept was introduced, incorporating convolutional layers between different channels to strengthen temporal interconnectivity. This design enabled the algorithm to sensitively detect subtle temporal variations in rapid movements, thereby enhancing the recognition capability for swift agricultural actions. Meanwhile, in the Slow Pathway, the traditional residual block was replaced with the ECA-Res structure, integrating an efficient channel attention (ECA) mechanism to improve the model's capacity to capture channel information. The adaptive adjustment of channel weights by the ECA-Res structure enriched feature expression and differentiation, enhancing the model's understanding and grasp of key spatial information in agricultural activities. Furthermore, to address the challenge of class imbalance in practical scenarios, a balanced loss function (Smoothing Loss) was developed. By introducing regularization coefficients, this loss function could automatically adjust the weights of different categories during training, effectively mitigating the impact of class imbalance and ensuring improved recognition performance across all categories. [Results and Discussions] The experimental results significantly demonstrated the outstanding performance of the improved SlowFast-SMC-ECA model on a specially constructed agricultural activity dataset.
Specifically, the model achieved an average recognition accuracy of 80.47%, representing an improvement of approximately 3.5% compared to the original SlowFast model. This achievement highlighted the effectiveness of the proposed improvements. Further ablation studies revealed that replacing the traditional residual blocks with the multi-path residual network (SMC) and ECA-Res structures in the second and third stages of the SlowFast model led to superior results. This highlighted that the improvements made to the Fast Pathway and Slow Pathway played a crucial role in enhancing the model's ability to capture details of agricultural activities. Additional ablation studies also confirmed the significant impact of these two improvements on improving the accuracy of agricultural activity recognition. Compared to existing algorithms, the improved SlowFast-SMC-ECA model exhibited a clear advantage in prediction accuracy. This not only validated the potential application of the proposed model in agricultural activity recognition but also provided strong technical support for the advancement of precision agriculture technology. In conclusion, through careful refinement and optimization of the SlowFast model, the model's recognition capabilities in complex agricultural scenarios were successfully enhanced, contributing valuable technological advancements to precision management in greenhouse cucumber cultivation. [Conclusions] By introducing advanced recognition technologies and intelligent algorithms, this study enhances the accuracy and efficiency of monitoring agricultural activities, and assists farmers and agricultural experts in managing and guiding the operational processes within planting facilities more efficiently. Moreover, the research outcomes are of immense value in improving the traceability system for agricultural product quality and safety, ensuring the reliability and transparency of agricultural product quality.
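    The ECA-Res block described above hinges on efficient channel attention: a global average pool followed by a lightweight one-dimensional convolution across channels. A minimal PyTorch sketch is shown below, assuming the commonly used ECA-Net rule for choosing the kernel size; the authors' exact settings may differ.

    import math
    import torch
    import torch.nn as nn

    class ECA(nn.Module):
        """Efficient channel attention: local cross-channel interaction via a 1D convolution."""
        def __init__(self, channels: int, gamma: int = 2, b: int = 1):
            super().__init__()
            t = int(abs((math.log2(channels) + b) / gamma))  # kernel size adapted to channel count
            k = t if t % 2 else t + 1                        # force an odd kernel size
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            y = self.pool(x)                                 # (N, C, 1, 1) channel descriptor
            y = self.conv(y.squeeze(-1).transpose(1, 2))     # (N, 1, C): convolve across channels
            y = torch.sigmoid(y.transpose(1, 2).unsqueeze(-1))
            return x * y                                     # re-weight each channel

    # Toy usage: a residual block can wrap this as out = x + ECA(c)(conv_branch(x)).
    print(ECA(256)(torch.rand(2, 256, 14, 14)).shape)        # torch.Size([2, 256, 14, 14])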

    Identification Method of Kale Leaf Ball Based on Improved UperNet
    ZHU Yiping, WU Huarui, GUO Wang, WU Xiaoyan
    Smart Agriculture    2024, 6 (3): 128-137.   DOI: 10.12133/j.smartag.SA202401020

    [Objective] Kale is an important bulk vegetable crop worldwide; its main growth characteristics are its outer leaves and leaf bulb. The traits of the kale leaf bulb are crucial for adjusting water and fertilizer parameters in the field to achieve maximum yield. However, various factors such as soil quality, light exposure, leaf overlap, and shading can affect the growth of kale in practical field conditions. The similarity in color and texture between leaf bulbs and outer leaves complicates the segmentation process for existing recognition models. In this paper, a method for segmenting kale outer leaves and leaf bulbs against complex field backgrounds was proposed, using pixel values to determine leaf bulb size for intelligent field management. A semantic segmentation algorithm, UperNet-ESA, was proposed to efficiently and accurately segment the outer leaves and leaf bulbs of nodular kale in field scenes, using their morphological features to realize the intelligent management of nodular kale in the field. [Methods] The UperNet-ESA semantic segmentation algorithm, which uses the unified perceptual parsing network (UperNet) as an efficient semantic segmentation framework, is more suitable for extracting crop features in complex environments by integrating semantic information across different scales. The backbone network, which is responsible for feature extraction in the model, was improved using ConvNeXt. The similarity between kale leaf bulbs and outer leaves, along with issues of leaf overlap affecting accurate target contour localization, posed challenges for the baseline network, leading to low accuracy. ConvNeXt effectively combines the strengths of convolutional neural networks (CNN) and Transformers, using design principles from Swin Transformer and building upon ResNet50 to create a highly effective network structure. The simplicity of the ConvNeXt design not only enhances segmentation accuracy with minimal model complexity, but also positions it as a top performer among CNN architectures. In this study, the ConvNeXt-B version was chosen based on considerations of computational complexity and the background characteristics of the nodular kale image dataset. To enhance the model's perceptual acuity, block ratios for each stage were set at 3:3:27:3, with corresponding channel numbers of 128, 256, 512 and 1 024, respectively. Given the visual similarity between kale leaf bulbs and outer leaves, an efficient channel attention (ECA) mechanism was integrated into the backbone network to improve feature extraction in the leaf bulb region. By incorporating attention weights into feature mapping through residual inversion, attention parameters were cyclically trained within each block, resulting in feature maps with attentional weights. This iterative process facilitated the repeated training of attentional parameters and enhanced the capture of global feature information. To address challenges arising from direct pixel addition between up-sampling and local features, potentially leading to misaligned context in feature maps and erroneous classifications at kale leaf boundaries, a feature alignment module and a feature selection module were introduced into the feature pyramid network to refine target boundary information extraction and enhance model segmentation accuracy.
[Results and Discussions] The UperNet-ESA semantic segmentation model outperforms the current mainstream UNet, PSPNet, and DeepLabV3+ models in terms of segmentation accuracy, with mIoU and mPA reaching 92.45% and 94.32%, respectively, and an inference speed of up to 16.6 frames per second (fps). The mPA exceeded that of the UNet model, the PSPNet model, and the DeepLabV3+ models based on ResNet-50, MobileNetV2, and Xception backbones by 11.52%, 13.56%, 8.68%, 4.31%, and 6.21%, respectively. Similarly, the mIoU exhibited improvements of 12.21%, 13.04%, 10.65%, 3.26% and 7.11% compared to those of the UNet model, the PSPNet model, and the DeepLabV3+ models based on the ResNet-50, MobileNetV2, and Xception backbones, respectively. This performance enhancement can be attributed to the introduction of the ECA module and the improvement made to the feature pyramid network in this model, which strengthen the judgement of target features at each stage and provide effective global contextual information. In addition, although the PSPNet model had the fastest inference speed, its overall accuracy was too low for developing kale semantic segmentation models. On the contrary, the proposed model exhibited superior inference speed compared to all other network models. [Conclusions] The experimental results showed that the UperNet-ESA semantic segmentation model proposed in this study outperforms the original network. The improved model achieves the best accuracy-speed balance compared to the current mainstream semantic segmentation networks. In upcoming research, the current model will be further optimized and enhanced, while the kale dataset will be expanded to include a wider range of samples of nodular kale leaf bulbs. This expansion is intended to provide a more robust and comprehensive theoretical foundation for intelligent kale field management.
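    For reference, the mIoU and mPA reported above are per-class means of intersection-over-union and pixel accuracy computed from a confusion matrix. The small numpy sketch below illustrates the computation on a toy three-class matrix (background, outer leaf, leaf bulb); the numbers are illustrative, not the paper's data.

    import numpy as np

    def miou_mpa(conf: np.ndarray) -> tuple:
        """conf[i, j] = number of pixels of true class i predicted as class j."""
        tp = np.diag(conf).astype(float)
        iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp)   # per-class IoU
        pa = tp / conf.sum(axis=1)                              # per-class pixel accuracy
        return iou.mean(), pa.mean()

    # Toy confusion matrix for background, outer leaf, and leaf bulb classes.
    conf = np.array([[900, 30, 10],
                     [40, 800, 60],
                     [5, 50, 700]])
    miou, mpa = miou_mpa(conf)
    print(f"mIoU = {miou:.4f}, mPA = {mpa:.4f}")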

    Phenotypic Traits Extraction of Wheat Plants Using 3D Digitization
    ZHENG Chenxi, WEN Weiliang, LU Xianju, GUO Xinyu, ZHAO Chunjiang
    Smart Agriculture    2022, 4 (2): 150-162.   DOI: 10.12133/j.smartag.SA202203009

    Aiming at the difficulty of accurately extracting the phenotypic traits of plants and organs from images or point clouds, caused by the multiple tillers and serious cross-occlusion among organs of wheat plants, and to meet the needs of accurate phenotypic analysis of wheat plants, three-dimensional (3D) digitization was used to extract phenotypic parameters of wheat plants. Firstly, a digital representation method of wheat organs was given and a 3D digital data acquisition standard suitable for the whole growth period of wheat was formulated. According to this standard, data acquisition was carried out using a 3D digitizer. Based on the definition of phenotypic parameters and the semantic coordinate information contained in the 3D digitizing data, eleven conventional measurable phenotypic parameters in three categories were quantitatively extracted, including lengths, thicknesses, and angles of wheat plants and organs. Furthermore, two types of new parameters for shoot architecture and 3D leaf shape were defined. Plant girth was defined to quantitatively describe the looseness or compactness of a plant by fitting 3D discrete coordinates based on the least squares method. For leaf shape, wheat leaf curling and twisting were defined and quantified according to the direction change of the leaf surface normal vector. Three wheat cultivars, including FK13, XN979, and JM44, at three stages (rising stage, jointing stage, and heading stage) were used for method validation. The Open3D library was used to process and visualize wheat plant data. Visualization results showed that the acquired 3D digitization data of wheat plants were realistic, and the data acquisition approach was capable of presenting morphological differences among different cultivars and growth stages. Validation results showed that the errors of stem length, leaf length, stem thickness, and stem-leaf angle were relatively small, with R2 values of 0.93, 0.98, 0.93, and 0.85, respectively. The errors of leaf width and leaf inclination angle were also satisfactory, with R2 values of 0.75 and 0.73. Because wheat leaves are narrow and easy to curl, and some of the leaves have a large degree of bending, the errors of leaf width and leaf angle were relatively larger than those of other parameters. The data acquisition procedure was rather time-consuming, while the data processing was quite efficient: it took around 133 ms to extract all mentioned parameters for a wheat plant containing 7 tillers and a total of 27 leaves. The proposed method could achieve convenient and accurate extraction of wheat phenotypes at individual plant and organ levels, and provide technical support for wheat shoot architecture related research.
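    The plant girth parameter above is obtained by least-squares fitting of the 3D discrete coordinates. One plausible reading, sketched below in numpy, is to project the digitized points onto the horizontal plane and fit a circle with a linear (Kasa-style) least-squares solve; the synthetic points and the fitting variant are assumptions for illustration, not the authors' exact procedure.

    import numpy as np

    def fit_girth_circle(points_xyz: np.ndarray) -> tuple:
        """Fit x^2 + y^2 = a*x + b*y + c to the (x, y) projection; return (cx, cy, radius)."""
        x, y = points_xyz[:, 0], points_xyz[:, 1]
        A = np.column_stack([x, y, np.ones_like(x)])
        rhs = x ** 2 + y ** 2
        (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
        cx, cy = a / 2.0, b / 2.0
        return cx, cy, float(np.sqrt(c + cx ** 2 + cy ** 2))

    # Toy usage: noisy points roughly on a 5 cm circle, with arbitrary heights as the z column.
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.0, 2.0 * np.pi, 60)
    pts = np.column_stack([5.0 * np.cos(theta), 5.0 * np.sin(theta), rng.uniform(0.0, 30.0, 60)])
    pts[:, :2] += rng.normal(0.0, 0.2, (60, 2))
    print(fit_girth_circle(pts))                       # centre near (0, 0), radius near 5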

    Identification and Counting of Silkworms in Factory Farm Using Improved Mask R-CNN Model
    HE Ruimin, ZHENG Kefeng, WEI Qinyang, ZHANG Xiaobin, ZHANG Jun, ZHU Yihang, ZHAO Yiying, GU Qing
    Smart Agriculture    2022, 4 (2): 163-173.   DOI: 10.12133/j.smartag.SA202201012

    Factory-like rearing of silkworm (Bombyx mori) using an artificial diet for all instars is a brand-new rearing mode. Accurate feeding is one of the core technologies to save cost and increase efficiency in factory silkworm rearing, and automatic identification and counting of silkworms play a key role in realizing accurate feeding. In this study, a machine vision system was used to obtain digital images of silkworms during the main instars, and an improved Mask R-CNN model was proposed to detect the silkworms and residual artificial diet. The original Mask R-CNN was improved to cope with noisy annotation data by adding a pixel reweighting strategy and a bounding box fine-tuning strategy to the model framework. A more robust model was trained to improve the detection and segmentation abilities for silkworms and residual feed. Three different data augmentation methods were used to expand the training dataset. The influences of silkworm instars, data augmentation, and the overlap between silkworms on the model performance were evaluated. Then the improved Mask R-CNN was used to detect silkworms and residual feed. The AP50 (Average Precision at IoU=0.5) of the model for silkworm detection and segmentation were 0.790 and 0.795, respectively, and the detection accuracy was 96.83%. The detection and segmentation AP50 of residual feed were 0.641 and 0.653, respectively, and the detection accuracy was 87.71%. The model was deployed on the NVIDIA Jetson AGX Xavier development board with an average detection time of 1.32 s and a maximum detection time of 2.05 s for an image. The computational speed of the improved Mask R-CNN can meet the requirement of real-time detection of the moving unit of the silkworm box on the production line. The model trained on the fifth-instar data showed a better performance on test data than the fourth-instar model. The brightness enhancement method made the greatest contribution to the model performance compared with the other data augmentation methods. The overlap between silkworms also negatively affected the performance of the model. This study can provide a core algorithm for the research and development of accurate feeding information systems and feeding devices for factory silkworm rearing, which can improve the utilization rate of the artificial diet and raise the production and management level of factory silkworm rearing.
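    The counting step itself reduces to tallying instances per class above a confidence threshold in the segmentation output. The sketch below uses torchvision's stock Mask R-CNN as a stand-in for the improved model described above; the label mapping, score threshold, and random input are illustrative assumptions.

    import torch
    from torchvision.models.detection import maskrcnn_resnet50_fpn

    CLASS_NAMES = {1: "silkworm", 2: "residual_feed"}   # hypothetical label mapping

    def count_instances(model, image: torch.Tensor, score_thresh: float = 0.5) -> dict:
        """Count predicted instances per class whose confidence exceeds the threshold."""
        model.eval()
        with torch.no_grad():
            pred = model([image])[0]                    # dict with 'boxes', 'labels', 'scores', 'masks'
        keep = pred["scores"] >= score_thresh
        counts = {name: 0 for name in CLASS_NAMES.values()}
        for label in pred["labels"][keep].tolist():
            if label in CLASS_NAMES:
                counts[CLASS_NAMES[label]] += 1
        return counts

    # Toy usage on a random image tensor (3 x H x W, values in [0, 1]); an untrained
    # model will report few or no instances, which is expected for this sketch.
    model = maskrcnn_resnet50_fpn(weights=None, num_classes=3)  # background + 2 classes
    print(count_instances(model, torch.rand(3, 512, 512)))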
