Smart Agriculture

Crop Pest Target Detection Algorithm in Complex Scenes: YOLOv8-Extend

ZHANG Ronghua1, BAI Xue1, FAN Jiangchuan2,3

  1. Jinghang Chuangzhi (Beijing) Technology Co., Ltd., Beijing 102404, China
  2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
  3. Beijing Key Laboratory of Digital Plants, Beijing 100097, China
  • Received: 2023-11-02  Online: 2024-03-04
  • Corresponding author:
    FAN Jiangchuan, E-mail:
  • Supported by:
    Beijing Nova Program (Z211100002121065); Beijing Nova Program (Z20220484202); National Key R&D Program (2022YFD2002302-02)

Abstract:

Objective  Crop pest detection plays a crucial role in safeguarding crop yield. Detecting pests reveals their distribution and seasonal dynamics so that sound control plans can be formulated, providing a scientific foundation for agricultural management to improve crop yield and quality. The aim of this work is to identify and detect crop pests in complex natural environments, shifting away from the current reliance on expert visual inspection and judgment in agricultural practice, and thereby to raise both detection efficiency and accuracy and improve overall pest management.

Methods  An algorithm named YOLOv8-Extend was proposed for detecting crop pests. To address challenges such as small targets, visual similarity between crops and pests, low detection accuracy, and slow inference speed, the algorithm inherits the strong feature extraction and multi-scale feature fusion capabilities of YOLOv8. First, GSConv was introduced to enlarge the model's receptive field and enable global feature aggregation. This mechanism aggregates features at both the node and global levels simultaneously, obtaining local features from neighboring nodes through neighbor sampling and aggregation, which enhances the model's receptive field and semantic understanding. In addition, some standard convolutions were replaced with lightweight Ghost convolutions, and HorBlock was adopted to capture longer-term feature dependencies: its recursive gated convolution uses gating mechanisms to remember and transmit earlier information, capturing long-range correlations. Furthermore, Concat was replaced with BiFPN for richer feature fusion; the bidirectional, top-down and bottom-up fusion of deep features strengthens the transmission of feature information across different network layers.
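The BiFPN fusion mentioned above can be illustrated with a minimal NumPy sketch of BiFPN's "fast normalized fusion" of same-resolution feature maps. This is an illustrative approximation, not the paper's implementation; the function name, weight values, and toy feature maps are assumptions.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps with learnable non-negative weights.

    In BiFPN's fast normalized fusion, each input map is scaled by
    w_i / (sum_j w_j + eps), so the fused map is a learnable, roughly
    convex combination of its inputs.
    """
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)  # ReLU keeps weights >= 0
    w = w / (w.sum() + eps)                                     # normalize to ~1
    return sum(wi * f for wi, f in zip(w, features))

# Two toy 4x4 single-channel "feature maps" from different pyramid levels
top_down = np.ones((4, 4))
bottom_up = 3 * np.ones((4, 4))
out = fast_normalized_fusion([top_down, bottom_up], weights=[1.0, 1.0])
# each element is ~2.0, the equally weighted mean of 1 and 3
```

During training the weights would be learned per fusion node, letting the network decide how much each pyramid level contributes.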
Utilizing the VoVGSCSP module, feature maps of different scales were connected into longer feature vectors, increasing model diversity and improving small-object detection. The convolutional block attention module (CBAM) was introduced to strengthen the features of field pests and suppress the background weights caused by scene complexity. Next, the Wise-IoU (WIoU) dynamic non-monotonic focusing mechanism was adopted, which evaluates anchor-box quality by an "outlier degree" instead of IoU. Its gradient-gain allocation strategy reduces the competitiveness of high-quality anchor boxes and minimizes the harmful gradients from low-quality examples, allowing WIoU to concentrate on anchor boxes of average quality and improving the network's generalization ability and overall performance. Subsequently, the improved YOLOv8-Extend model was compared with the original YOLOv8, YOLOv5, YOLOv8-GSCONV, YOLOv8-BiFPN, and YOLOv8-CBAM models to validate detection accuracy and precision. Finally, the model was deployed on an edge device for inference verification to confirm its effectiveness in practical application scenarios.

Results and Discussions  The improved YOLOv8-Extend model achieved notable gains in accuracy, recall, mAP@0.5, and mAP@0.5:0.95, with increases of 2.6%, 3.6%, 2.4%, and 7.2%, respectively, demonstrating superior detection performance. When YOLOv8-Extend and YOLOv8 were each run on the edge computing device Jetson Orin NX (16 GB) and accelerated with TensorRT, mAP@0.5 improved by 4.6% and the frame rate reached 57.6 FPS, meeting real-time detection requirements. On practically collected data, YOLOv8-Extend adapted better to complex agricultural scenes and showed clear advantages in detecting small pests and pests that share similar growth environments with the crop.
The accuracy on challenging data increased notably, by 11.9%. Through these refinements, the model extracts and focuses on features more effectively in crop pest target detection, addressing issues such as small targets, background textures similar to the targets, and difficult feature extraction.

Conclusions  The YOLOv8-Extend model introduced in this study significantly boosts detection accuracy and recognition rates while maintaining high operational efficiency. It is suitable for deployment on edge computing devices for real-time detection of crop pests, offering techniques and methods for developing cost-effective, terminal-based automatic pest recognition systems. This research can also serve as a reference for the intelligent detection of other small targets and for optimizing model structures, and it provides a theoretical foundation for automated detection and algorithm development in crop pest management.
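The Wise-IoU focusing mechanism described in the Methods can be sketched numerically. The plain-Python sketch below approximates the WIoU v3 non-monotonic gradient gain; it is not the paper's code, the function names are hypothetical, and `alpha`/`delta` follow the commonly cited WIoU defaults.

```python
def iou(box_a, box_b):
    """Plain IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def wiou_gradient_gain(iou_loss, mean_iou_loss, alpha=1.9, delta=3.0):
    """Non-monotonic focusing coefficient in the spirit of WIoU v3.

    beta, the "outlier degree", compares this anchor's IoU loss with the
    running mean over the batch.  The gain r = beta / (delta * alpha**(beta - delta))
    peaks at a moderate beta and decays for both very good (small beta)
    and very bad (large beta) anchors, so training focuses on
    average-quality anchor boxes.
    """
    beta = iou_loss / mean_iou_loss
    return beta / (delta * alpha ** (beta - delta))

# An anchor whose loss equals delta times the mean (beta == delta) keeps
# a gain of exactly 1.0, while a severe outlier is down-weighted.
gain_avg = wiou_gradient_gain(1.5, 0.5)  # beta = 3.0 -> gain 1.0
gain_bad = wiou_gradient_gain(3.0, 0.5)  # beta = 6.0 -> gain < 1.0
```

Because very low-quality anchors receive a small gain, their noisy gradients contribute less, which is the behavior the abstract credits for the improved generalization.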

Key words: YOLOv8, pest detection, attention mechanism, edge computing, CBAM, BiFPN, VoVGSCSP, GSConv