
Smart Agriculture ›› 2025, Vol. 7 ›› Issue (1): 111-123. doi: 10.12133/j.smartag.SA202410019

• Special Topic: Intelligent Agricultural Knowledge Services and Smart Unmanned Farms (Part II) •

Lightweight Detection and Recognition Model for Small Target Pests on Sticky Traps in Multi-Source Scenarios

YANG Xinting2,3, HU Huan1,2,3, CHEN Xiao1,2,3, LI Wenzheng1,2,3, ZHOU Zijie2,3,4, LI Wenyong2,3

  1. College of Information, Shanghai Ocean University, Shanghai 201306, China
    2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    4. College of Information Technology, Jilin Agricultural University, Changchun, Jilin 130118, China
  • Received: 2024-10-21 Online: 2025-01-30
  • Foundation items:
    National Key R&D Program of China (2022YFD2001804); Special Project for the Collaborative Innovation Center of Beijing Academy of Agriculture and Forestry Sciences
  • About author:
    YANG Xinting, Ph.D., research professor; research interests: key technologies of agricultural informatization. E-mail:
  • Corresponding author:
    LI Wenyong, Ph.D., research professor; research interests: plant protection informatization. E-mail:

Lightweight Detection and Recognition Model for Small Target Pests on Sticky Traps in Multi-Source Scenarios

YANG Xinting2,3, HU Huan1,2,3, CHEN Xiao1,2,3, LI Wenzheng1,2,3, ZHOU Zijie2,3,4, LI Wenyong2,3

  1. Shanghai Ocean University, Shanghai 201306, China
    2. National Research Center for Information Technology in Agriculture, Beijing 100097, China
    3. Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    4. Jilin Agricultural University, Changchun 130118, China
  • Received: 2024-10-21 Online: 2025-01-30
  • Foundation items: National Key R&D Program of China (2022YFD2001804); Special Project for the Collaborative Innovation Center of Beijing Academy of Agriculture and Forestry Sciences
  • About author:

    YANG Xinting, E-mail:

  • Corresponding author:
    LI Wenyong, E-mail:

Abstract:

[Objective/Significance] To address the difficulty of accurately detecting two small-bodied pests, whiteflies and thrips, in sticky trap images from multi-source scenarios, as well as the limited computing resources of deployment devices, this study proposed a lightweight small-target detection and recognition model named MobileNetV4+VN-YOLOv5s based on YOLOv5s. [Methods] The framework combines the MobileNetV4 backbone to construct an EM block, optimizing the feature-extraction network structure and improving accuracy; the lightweight GSConv and VoV-GSCSP modules were introduced into the model neck to replace ordinary convolutions and reduce model complexity; finally, a normalized Wasserstein distance (NWD) loss function was added to enhance discrimination sensitivity and localization for small targets. [Results and Discussions] The proposed model performed best on small target pests in the indoor scenario, with a mean average precision of 82.5%, 8.4 percentage points higher than the original YOLOv5s, while the parameter size was reduced by 3.0 M and the frame rate increased by 6.0 fps. In the outdoor scenario, the model's average precision was 70.8%, 7.3 percentage points higher than YOLOv5s, with the parameter size reduced by 3.0 M and the frame rate increased by 5.5 fps; in the mixed scenario, the average precision was 74.7%, 8.0 percentage points higher than YOLOv5s, with the parameter size reduced by 3.0 M and the frame rate increased by 4.3 fps. It was also found that cropping and splitting the original images affected model performance: detection and recognition were best when the original images were split at a 5×5 ratio, and the model trained on indoor-scenario data achieved the best detection performance in all scenarios. [Conclusions] The proposed MobileNetV4+VN-YOLOv5s model balances lightweight design and accuracy, can be deployed on embedded devices for practical application, and provides a reference for detecting small target pests in sticky trap images under various multi-source scenarios.

Keywords: small target, pest detection, lightweight, sticky trap, multi-source scenarios, MobileNetV4, YOLOv5s

Abstract:

[Objective] In crop cultivation and production, pests have gradually become one of the main issues affecting agricultural yield. Traditional models often focus on achieving high accuracy; however, lightweight design is necessary to facilitate practical deployment. The targets in yellow sticky trap images are often very small with low pixel resolution, so modifications to the network structure, loss functions, and lightweight convolutions must be adapted to the detection of small-object pests. Striking a balance between model lightweighting and small-object pest detection is therefore particularly important. To improve the detection accuracy of small target pests in sticky trap images from multi-source scenarios, a lightweight detection model named MobileNetV4+VN-YOLOv5s was proposed in this research to detect two main small target pests in agricultural production, whiteflies and thrips. [Methods] In the backbone layer of MobileNetV4+VN-YOLOv5s, an EM block constructed with the MobileNetV4 backbone network was introduced for detecting small, high-density, and overlapping targets, making the model suitable for deployment on mobile devices. Additionally, the neck layer incorporates the GSConv and VoV-GSCSP modules, which replace regular convolutional modules with a lightweight design, effectively reducing the parameter size of the model while improving detection accuracy. Finally, a normalized Wasserstein distance (NWD) loss function was introduced into the framework to enhance sensitivity to low-resolution small target pests. Extensive experiments were conducted, including state-of-the-art comparisons, ablation evaluation, and performance analysis on image splitting, pest density, and multi-source data.
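The abstract does not reproduce the NWD formula. A minimal sketch, assuming the commonly used closed form in which each box (cx, cy, w, h) is modeled as a 2-D Gaussian N((cx, cy), diag(w²/4, h²/4)) and the normalizing constant c is a dataset-dependent hyperparameter:

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes (cx, cy, w, h).

    Each box is modeled as a 2-D Gaussian; the squared 2-Wasserstein
    distance between two such Gaussians has the closed form below.
    The constant c normalizes the distance into (0, 1].
    """
    cx_a, cy_a, w_a, h_a = box_a
    cx_b, cy_b, w_b, h_b = box_b
    # Squared 2-Wasserstein distance between the Gaussian box models.
    w2_sq = ((cx_a - cx_b) ** 2 + (cy_a - cy_b) ** 2
             + ((w_a - w_b) / 2) ** 2 + ((h_a - h_b) / 2) ** 2)
    # Exponential normalization maps the distance to a similarity in (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)

def nwd_loss(pred, target, c=12.8):
    """Loss decreases to 0 as predicted and target boxes coincide."""
    return 1.0 - nwd(pred, target, c)
```

Unlike IoU-based losses, this similarity stays smooth and informative even when two tiny boxes do not overlap at all, which is why it suits low-resolution small targets.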
[Results and Discussions] Ablation tests showed that the EM module and the VoV-GSCSP convolution module had significant effects in reducing the model parameter size and improving the frame rate, while the NWD loss function significantly improved the mean average precision (mAP) of the model. In comparative tests with different loss functions, the NWD loss function improved the mAP by 6.1, 10.8, and 8.2 percentage points over the DIoU, GIoU, and EIoU loss functions, respectively, so adding the NWD loss function achieved good results. Comparative performance tests were conducted with different lightweight models; the experimental results showed that the mAP of the proposed MobileNetV4+VN-YOLOv5s model in the three scenarios (indoor, outdoor, indoor & outdoor) was 82.5%, 70.8%, and 74.7%, respectively. Notably, the MobileNetV4+VN-YOLOv5s model had a parameter size of only 4.2 M, 58% of that of the YOLOv5s model, and its frame rate was 153.2 fps, an increase of 6.0 fps over the YOLOv5s model. Moreover, the precision and mAP reached 79.7% and 82.5%, which were 5.6 and 8.4 percentage points higher than those of the YOLOv5s model, respectively. Comparative tests were conducted in the above scenarios with four splitting ratios: 1×1, 2×2, 5×5, and 10×10. The best result was obtained with the 5×5 ratio in the indoor scenario, where the mAP reached 82.5%. The mAP in the indoor scenario was highest in the low-density case, reaching 83.8%, and the model trained on the indoor dataset achieved the best performance. Comparative tests under different pest densities showed a decreasing trend in mAP from low to high density for the MobileNetV4+VN-YOLOv5s model in all three scenarios. Comparing the experimental results of different test sets across scenarios, all three models achieved their best detection accuracy on the IN dataset.
Specifically, the IN-model had the highest mAP at 82.5%, followed by the IO-model. At the same time, detection performance showed the same trend across all three test datasets: the IN-model performed best, followed by the IO-model, and the OUT-model performed worst. Comparative tests with different improved YOLO models showed that MobileNetV4+VN-YOLOv5s had the highest mAP, EVN-YOLOv8s the second highest, and EVN-YOLOv11s the lowest. Moreover, after the model was deployed on a Raspberry Pi 4B board, the YOLOv5s model produced more misdetections and omissions than the MobileNetV4+VN-YOLOv5s model, and the inference time was shortened by about 33% compared with the YOLOv5s model, demonstrating good prospects for practical deployment. [Conclusions] The MobileNetV4+VN-YOLOv5s model proposed in this study achieves a balance between lightweight design and accuracy. It can be deployed on embedded devices, facilitating practical applications, and can provide a reference for detecting small target pests in sticky trap images under various multi-source scenarios.
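The splitting experiments above tile each original image into an r×r grid (1×1, 2×2, 5×5, 10×10) before detection. A minimal sketch of such tiling, assuming the image is given as a list of pixel rows (the paper's actual preprocessing pipeline is not specified in the abstract):

```python
def split_image(img, rows=5, cols=5):
    """Split an image (list of pixel rows) into rows x cols tiles.

    When the image size does not divide evenly, the last tile in each
    direction absorbs the remainder, so no pixels are dropped.
    Tiles are returned in row-major order.
    """
    h, w = len(img), len(img[0])
    th, tw = h // rows, w // cols  # base tile height and width
    tiles = []
    for r in range(rows):
        y0 = r * th
        y1 = (r + 1) * th if r < rows - 1 else h
        for c in range(cols):
            x0 = c * tw
            x1 = (c + 1) * tw if c < cols - 1 else w
            tiles.append([row[x0:x1] for row in img[y0:y1]])
    return tiles
```

Splitting enlarges each pest relative to the network's input resolution, which is why a moderate ratio such as 5×5 helped here, while an overly fine 10×10 grid risks cutting targets across tile borders.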

Key words: small target, pest detection, lightweight, sticky trap, multi-source scenarios, MobileNetV4, YOLOv5s

CLC number: