Light-trapping Small-sized Rice Pest Detection Method by Combining Spatial Depth Transform Convolution and Multi-scale Attention Mechanism

doi:10.12133/j.smartag.SA202507024

Abstract

[Objective] Planthoppers suck sap from the phloem of rice plants, causing malnutrition and stunted growth and leading to large-scale yield losses. Timely and effective monitoring of planthoppers and assessment of their occurrence level are therefore vital for preventing rice diseases. Traditional detection relies mainly on manual diagnosis and identification. Because planthoppers are tiny, manual field surveys are time-consuming, labor-intensive, and strongly affected by human subjectivity, so misjudgments are common. Intelligent light traps can assist with this work. However, when the trapped planthoppers are dense, occluded, low-resolution, and small-sized, detection suffers from low accuracy, false detections, and missed detections. To address this, a light-trapping small-sized rice pest detection method based on YOLOv11x was proposed, combining spatial depth transform convolution with a multi-scale attention mechanism. [Methods] The image data were collected by multiple light-trap pest monitoring devices installed in experimental rice fields. The images covered two planthopper species, the brown planthopper and the white-backed planthopper. There were 998 images in total, each 5 472 pixels × 3 648 pixels. The original dataset was divided into a training set and a validation set at a ratio of 4:1. To improve learning during training, two data augmentation operations, horizontal flipping and vertical flipping, were applied to the training images. This yielded 2 388 training images for model training, and 200 validation images were used for inference validation.
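The flipping step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes labels in the normalized YOLO format (class, x_center, y_center, width, height), uses a nested list as a stand-in for an image, and all function names are hypothetical. Whether the original images were kept alongside the two flipped copies is also an assumption.

```python
from typing import List, Tuple

# Assumed label format: (class_id, x_center, y_center, width, height),
# with coordinates normalized to [0, 1] as in YOLO-style annotations.
Box = Tuple[int, float, float, float, float]
Image = List[List[int]]  # hypothetical stand-in: rows of pixel values


def hflip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the image left-right; only x_center changes."""
    flipped = [row[::-1] for row in image]
    new_boxes = [(c, 1.0 - x, y, w, h) for c, x, y, w, h in boxes]
    return flipped, new_boxes


def vflip(image: Image, boxes: List[Box]) -> Tuple[Image, List[Box]]:
    """Mirror the image top-bottom; only y_center changes."""
    flipped = [row[:] for row in image[::-1]]
    new_boxes = [(c, x, 1.0 - y, w, h) for c, x, y, w, h in boxes]
    return flipped, new_boxes


def augment(image: Image, boxes: List[Box]) -> List[Tuple[Image, List[Box]]]:
    """Return the original sample plus its two flipped copies (assumption)."""
    return [(image, boxes), hflip(image, boxes), vflip(image, boxes)]


if __name__ == "__main__":
    img = [[1, 2], [3, 4]]
    labels = [(0, 0.25, 0.4, 0.1, 0.2)]
    for sample_img, sample_boxes in augment(img, labels):
        print(sample_img, sample_boxes)
```

Box width and height are unchanged by either flip, so only the center coordinate along the flipped axis needs to be mirrored.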
To improve model performance, the C3k2 module of the original YOLOv11x network was first enhanced with the Efficient Multi-Scale Attention (EMA) mechanism to strengthen the model's perception and feature fusion for small-sized pests in dense and occluded scenes. Second, Space-to-Depth Convolution (SPD-Conv) replaced the standard convolution (Conv) module of the original model, further improving feature extraction for low-resolution, small-sized pests while reducing the number of parameters. Third, a P2 detection layer was added and the P5 detection layer was removed, specifically improving detection of small targets. Finally, Wise-Intersection over Union (WIoU) v3, a loss function with a dynamic non-monotonic focusing mechanism, was introduced to improve localization and thereby reduce the false detection and missed detection rates. [Results and Discussions] On the self-built planthopper dataset dataset_Planthopper, the improved model reached a Precision (P) of 77.5%, a Recall (R) of 73.5%, a mean Average Precision at an IoU threshold of 0.50 (mAP₅₀) of 80.8%, and a mean Average Precision averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05 (mAP_50-95) of 44.9%. Compared with the baseline YOLOv11x, these values were higher by 4.8, 3.5, 5.5, and 4.7 percentage points, respectively. The number of parameters fell from 56 M to 40 M, a reduction of 29%. Compared with the mainstream object detection models YOLOv5x, YOLOv8x, YOLOv10x, YOLOv11x, YOLOv12x, Salience DETR-R50, Relation DETR-R50, and RT-DETR-x, the mAP₅₀ of the improved model was higher by 6.8, 7.8, 8.6, 5.5, 5.6, 8.7, 6.9, and 6.9 percentage points, respectively, giving it the best overall performance.
In practical applications, the method could support precise monitoring of farmland pests and science-based prevention and control decisions, thereby reducing chemical pesticide use and promoting intelligent agriculture. Although the method achieved significant improvements on multiple metrics, it still has limitations. First, planthopper species are numerous and morphologically diverse; the current model targets only some typical species, and its generalization ability needs further verification. Second, owing to the limits of the data collection environment, performance under extreme lighting changes and severe occlusion still has room for improvement. Finally, although the number of parameters decreased, real-time detection speed still needs to be optimized to meet the requirements of some low-power edge devices. [Conclusions] The improved YOLOv11x model effectively improves detection of low-resolution, small-sized planthoppers under dense and occluded conditions and reduces the probability of missed and false detections. Future research can focus on improving generalization, robustness, and lightweight design so that models cover more planthopper species in more complex scenarios.
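The space-to-depth idea behind the SPD-Conv module named in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it uses plain nested lists instead of a deep learning framework, a scale factor of 2, and a 1×1 convolution as a simplified stand-in for the non-strided convolution that follows the rearrangement. The function names and the channel ordering are assumptions made for illustration.

```python
from typing import List

Tensor = List[List[List[float]]]  # layout: [channel][row][column]


def space_to_depth(x: Tensor, scale: int = 2) -> Tensor:
    """Move each scale x scale spatial block into the channel dimension.

    Input C x H x W becomes (C * scale^2) x (H / scale) x (W / scale).
    No pixel is discarded, unlike a strided convolution or pooling.
    """
    height, width = len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out: Tensor = []
    for dy in range(scale):
        for dx in range(scale):
            for channel in x:
                # Keep every scale-th pixel, starting at offset (dy, dx).
                out.append([row[dx::scale] for row in channel[dy::scale]])
    return out


def pointwise_conv(x: Tensor, weights: List[List[float]]) -> Tensor:
    """Stride-1, 1x1 convolution; weights has shape [out_channels][in_channels]."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [
            [sum(w * x[c][i][j] for c, w in enumerate(w_row)) for j in range(width)]
            for i in range(height)
        ]
        for w_row in weights
    ]


if __name__ == "__main__":
    feature = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
    rearranged = space_to_depth(feature)           # 4 channels of 2 x 2
    mixed = pointwise_conv(rearranged, [[0.25] * 4])
    print(len(rearranged), mixed)
```

Because the downsampling is done by rearranging pixels rather than skipping them, the following convolution still sees every input value, which is the property that motivates the technique for low-resolution, small-sized targets.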

Key words: rice planthopper, small-sized target, dense occlusion, detection and recognition, deep learning

CLC Number:

S433
TP391.4

LI Wenzheng, YANG Xingting, SUN Chuanheng, CUI Tengpeng, WANG Hui, LI Shanshan, LI Wenyong. Light-trapping Small-sized Rice Pest Detection Method by Combining Spatial Depth Transform Convolution and Multi-scale Attention Mechanism[J]. Smart Agriculture, doi: 10.12133/j.smartag.SA202507024.

Pest condition	Mean target spacing (x)/pixels	Pest density (y)/(individuals per image)	Occluded area ratio (z)/%
Normal	x>5	y≤100	z=0
Dense	2<x≤5	100<y≤500	z≤20
Occluded	x≤2	y>500	z>20
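The thresholds in the table above can be read as a simple rule set. The sketch below assumes that a category applies only when all three of its criteria hold, and returns "unclassified" otherwise; the source does not state how mixed cases are resolved, so that behavior and the function name are assumptions.

```python
def classify_condition(spacing_px: float, density: float, occlusion_pct: float) -> str:
    """Map image statistics to a pest condition using the table thresholds.

    spacing_px:    mean spacing between targets, in pixels (x)
    density:       number of pests per image (y)
    occlusion_pct: occluded area ratio, in percent (z)
    """
    if spacing_px > 5 and density <= 100 and occlusion_pct == 0:
        return "normal"
    if 2 < spacing_px <= 5 and 100 < density <= 500 and occlusion_pct <= 20:
        return "dense"
    if spacing_px <= 2 and density > 500 and occlusion_pct > 20:
        return "occluded"
    # Mixed cases are not covered by the table (assumption: leave unlabeled).
    return "unclassified"


if __name__ == "__main__":
    print(classify_condition(8, 40, 0))
    print(classify_condition(3, 250, 10))
    print(classify_condition(1.5, 800, 35))
```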

Pest name	Training set images	Training set labels	Validation set images	Validation set labels
Brown planthopper	2 388	22 922	200	1 944
White-backed planthopper	2 388	14 844	200	1 331

Detection model	P/%	R/%	mAP₅₀/%	mAP_50-95/%	Parameters/M	GFLOPs
YOLOv5x	71.1	69.0	74.0	39.2	97	246
YOLOv8x	70.1	68.8	73.0	37.7	68	256
YOLOv10x	70.8	67.0	72.2	38.9	32	171
YOLOv11x	72.7	70.0	75.3	40.2	56	196
YOLOv12x	72.1	69.0	75.2	40.4	59	200
Salience DETR-R50	—	86.1	72.1	42.1	56	201
Relation DETR-R50	—	87.0	73.9	43.1	49	303
RT-DETR-x	74.6	70.1	73.9	41.1	65	223
Improved YOLOv11x	77.5	73.5	80.8	44.9	40	246
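The mAP₅₀ gains quoted in the abstract follow directly from the mAP₅₀ column of this comparison table. The short check below recomputes them; the values are copied from the table, while the variable names are illustrative.

```python
# mAP50 values (%) copied from the comparison table.
BASELINE_MAP50 = {
    "YOLOv5x": 74.0,
    "YOLOv8x": 73.0,
    "YOLOv10x": 72.2,
    "YOLOv11x": 75.3,
    "YOLOv12x": 75.2,
    "Salience DETR-R50": 72.1,
    "Relation DETR-R50": 73.9,
    "RT-DETR-x": 73.9,
}
IMPROVED_MAP50 = 80.8

# Gain of the improved model over each baseline, in percentage points.
gains = {
    name: round(IMPROVED_MAP50 - value, 1)
    for name, value in BASELINE_MAP50.items()
}

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain} percentage points")
```

The results are differences between two percentages, so they are expressed in percentage points rather than as relative percent changes.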

Detection model	t-test	Significant
YOLOv5x	t=217.46, p=3.10e-32	Yes
YOLOv8x	t=249.44, p=2.62e-33	Yes
YOLOv10x	t=275.03, p=4.53e-34	Yes
YOLOv11x	t=175.89, p=1.41e-30	Yes
YOLOv12x	t=179.09, p=1.02e-30	Yes
Salience DETR-R50	t=278.23, p=3.68e-34	Yes
Relation DETR-R50	t=220.66, p=2.39e-32	Yes
RT-DETR-x	t=219.10, p=2.71e-32	Yes

Model	P/%	R/%	mAP₅₀/%	mAP_50-95/%	Parameters/M	GFLOPs
YOLOv11x	72.7	70.0	75.3	40.2	56	196
YOLOv11x+C3k2-EMA	73.5	70.0	76.8	42.2	65	236
YOLOv11x+SPD-Conv	71.6	70.4	75.4	41.2	48	171
YOLOv11x+P2	74.4	71.6	78.2	42.9	59	252
YOLOv11x+P2-P5	76.5	70.0	78.8	44.0	47	242
YOLOv11x+P2-P5+C3k2-EMA	77.0	73.1	80.1	44.4	48	277
YOLOv11x+P2-P5+C3k2-EMA+SPD-Conv	76.5	73.9	80.1	44.8	40	246
YOLOv11x+P2-P5+C3k2-EMA+SPD-Conv+WIoUv3	77.5	73.5	80.8	44.9	40	246
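As a quick arithmetic check, the first and last rows of this ablation table reproduce the improvements over the baseline reported in the abstract. The sketch below uses values copied from those two rows; the dictionary keys and helper name are illustrative.

```python
# First row (baseline YOLOv11x) and last row (full improved model).
baseline = {"P": 72.7, "R": 70.0, "mAP50": 75.3, "mAP50_95": 40.2,
            "params_m": 56, "gflops": 196}
improved = {"P": 77.5, "R": 73.5, "mAP50": 80.8, "mAP50_95": 44.9,
            "params_m": 40, "gflops": 246}


def relative_change_pct(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a reduction)."""
    return (new - old) / old * 100


# Accuracy metrics: absolute gains in percentage points.
metric_gains = {
    key: round(improved[key] - baseline[key], 1)
    for key in ("P", "R", "mAP50", "mAP50_95")
}
param_change = relative_change_pct(baseline["params_m"], improved["params_m"])
gflops_change = relative_change_pct(baseline["gflops"], improved["gflops"])

if __name__ == "__main__":
    print(metric_gains)
    print(f"parameters: {param_change:.1f}%")
    print(f"GFLOPs: {gflops_change:.1f}%")
```

The same two rows also show that the parameter count fell while GFLOPs rose from 196 to 246, which is consistent with the abstract's remark that detection speed still needs optimization for low-power edge devices.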