Appearance Defect Detection Algorithm of Euryale Ferox Based on Improved YOLOv11n

doi:10.12133/j.smartag.SA202602012

Abstract

Abstract (Chinese version):

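[Objective] Manual sorting of Euryale ferox during processing is costly, inefficient, and inconsistent. To address this, an appearance defect detection algorithm based on an improved YOLOv11n is proposed, with the aim of developing a lightweight model for accurate Euryale ferox sorting. [Methods] First, the universal perception large-kernel convolution block (UniRepLKNetBlock) was integrated into the C3k2 (Cross Stage Partial with kernel size 2) structure to build a new feature extraction module, CURK (C3k2-UniRepLKNetBlock), which strengthens the model's representation of fine-grained defect textures and complex backgrounds. Second, a depth lightweight adaptive extraction module replaced some of the convolution modules in the backbone network, strengthening adaptive learning of key regions while reducing computation. Finally, the SDIoU (Scale Dynamic IoU Loss) loss function was introduced to compensate for the unstable regression and insufficient localization accuracy of the DIoU (Distance Intersection over Union) loss in multi-object detection, improving detection accuracy and bounding-box regression. [Results and Discussions] The improved model reached a mean average precision of 97.4% and a recall of 92.8%, which are 0.4 and 2.9 percentage points higher than the baseline YOLOv11n, respectively. The weight file size was 4.9 MB, the parameter count 2.31 M, and the computational cost 6.1 GFLOPs, reductions of 10.9%, 10.7%, and 3.2% relative to the baseline. Inference speed reached 189.2 f/s, which meets real-time detection requirements. In tests on the experimental platform, the overall average accuracy reached 92.25%, verifying the model's practicality and engineering adaptability. [Conclusions] The improved model maintains high detection accuracy while remaining lightweight and real-time, shows good potential for engineering application, and provides an effective technical reference for intelligent sorting of Euryale ferox.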

Keywords: Euryale ferox, YOLOv11n, defect detection, lightweight model, image recognition

Abstract:

[Objective] Manual sorting of Euryale ferox during post-harvest processing suffers from high labor cost, low efficiency, and poor consistency. To provide technical support for intelligent processing, an appearance defect detection algorithm based on an improved YOLOv11n model is proposed, with the goal of a high-precision, lightweight, real-time sorting model.

[Methods] YOLOv11n was improved in three respects: feature extraction, downsampling mechanism, and loss function. First, the Universal Perception Large-Kernel ConvNet Block (UniRepLKNetBlock) was integrated into the C3k2 structure of the neck network to construct a novel feature extraction module named CURK (C3k2-UniRepLKNetBlock). This module used a depthwise large-kernel convolution as the main branch, alongside four parallel convolutional branches. After training, these branches were merged into a single convolutional layer via structural re-parameterization, which significantly enlarged the effective receptive field and enhanced the model's representation capability. Second, a Depth Lightweight Adaptive Extraction module (DLAE) replaced the standard convolutional downsampling layer in the 8th layer of the backbone. DLAE adopted a parallel two-branch design: a feature extraction branch based on depthwise separable convolution (DWConv) to capture local texture details, and an attention branch that generated spatial attention weights through global average pooling and 1×1 DWConv, followed by Softmax normalization. The attention weights were then multiplied channel-wise with the features, reducing computational load while adaptively enhancing key defect regions and suppressing background noise. Third, the Scale Dynamic IoU Loss (SDIoU) was introduced to replace the original loss function. In the BBox branch, SDIoU calculated a dynamic coefficient from the ratio of the target bounding box area to the preset maximum target area, combined with the feature map compression ratio ($R_{OC}$), and clipped it with a threshold $\delta = 0.5$, automatically adjusting the weights of the scale loss and the location loss. A similar mechanism was applied in the Mask branch. A Euryale ferox appearance defect image dataset was built in-house. A total of 2 537 original images were collected, and after data augmentation the dataset was expanded to 5 354 images covering five categories: qualified, surface scratch, broken, dark (overripe), and shell. The dataset was randomly divided into training, validation, and test sets in a 7:1:2 ratio.

[Results and Discussions] Ablation experiments showed that the combination of CURK, DLAE, and SDIoU achieved the best overall performance while maintaining lightweight advantages: precision was 95.4%, recall was 92.8%, and mAP50 reached 97.4%. Compared with the baseline YOLOv11n, recall increased by 2.9 percentage points and mAP50 by 0.4 percentage points. The model weight file size was reduced to 4.9 MB, parameters to 2.31 M, and computational cost to 6.1 GFLOPs. The inference speed reached 189.2 f/s, meeting real-time detection requirements. In comparative experiments with mainstream models, the proposed model achieved the highest mAP50 (97.4%) and the lowest parameter count and computational cost. Heatmap visualization analysis indicated that after integrating CURK, DLAE, and SDIoU, the model's focus on Euryale ferox target regions became more concentrated, background interference was significantly suppressed, and missed and false detections were effectively reduced. A physical visual detection validation platform was built, and 402 samples each from Tianchang city and Funan county were tested. The overall accuracy was 94.5% for Tianchang samples and 90.0% for Funan samples, with an average of 92.25%, confirming the model's good generalization capability and engineering adaptability under different imaging conditions and geographical origins.

[Conclusions] Through the synergistic effects of the CURK large-kernel re-parameterization module, the DLAE lightweight adaptive downsampling module, and the SDIoU scale-dynamic loss function, the improved YOLOv11n model maintains high detection accuracy while balancing model lightweighting and real-time performance, demonstrating good engineering application potential. It provides an efficient, accurate, and lightweight technical reference for intelligent sorting of Euryale ferox.
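The dynamic weighting described for the BBox branch of SDIoU can be illustrated with a short Python sketch. It assumes the formulation of the scale-based dynamic loss of Yang et al. (AAAI 2025), on which SDIoU appears to be based; the abstract does not give the formula, so the function names, the `max_area` and `r_oc` parameters, and the plain `1 - IoU` scale term (the aspect-ratio penalty is omitted) are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a scale-dynamic box loss; all names are illustrative.

def box_area(box):
    """Area of an (x1, y1, x2, y2) box."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def center_distance_term(a, b):
    """DIoU-style location term: squared center distance divided by the
    squared diagonal of the smallest enclosing box."""
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
         + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return rho2 / diag2 if diag2 > 0 else 0.0


def dynamic_weights(gt_area, max_area, r_oc, delta=0.5):
    """Assumed form: beta = min(gt_area / max_area * r_oc * delta, delta).
    Returns (scale-loss weight, location-loss weight)."""
    beta = min(gt_area / max_area * r_oc * delta, delta)
    return 1 - delta + beta, 1 + delta - beta


def sd_box_loss(pred, gt, max_area, r_oc, delta=0.5):
    """Weighted sum of a scale term (1 - IoU) and a location term."""
    w_scale, w_loc = dynamic_weights(box_area(gt), max_area, r_oc, delta)
    return w_scale * (1 - iou(pred, gt)) + w_loc * center_distance_term(pred, gt)


if __name__ == "__main__":
    gt = (0.0, 0.0, 5.0, 5.0)
    pred = (1.0, 1.0, 6.0, 6.0)
    print(dynamic_weights(box_area(gt), max_area=100.0, r_oc=1.0))
    print(sd_box_loss(pred, gt, max_area=100.0, r_oc=1.0))
```

Under this assumed form the two weights always sum to 2, and a smaller ground-truth box shifts weight from the scale term toward the location term, up to the limit set by the threshold.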

Key words: Euryale ferox, YOLOv11n, defect detection, lightweight model, image recognition

CLC number: TP391; S24

ZHANG Kun, ZHANG Chunyu, CHEN Longmei, LIU Qicheng, LI Yongkang, LIU Kai, ZENG Wenhao. Appearance Defect Detection Algorithm of Euryale Ferox Based on Improved YOLOv11n[J]. Smart Agriculture, doi: 10.12133/j.smartag.SA202602012.

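Table 1
Parameter	Value	Parameter	Value
Operating system	Ubuntu 22.04	Input size	640×640
RAM	90 GB	Epochs	100
GPU	GeForce RTX 5090	Batch size	16
GPU memory	32 GB	Learning rate	0.01
CPU	25 vCPU Intel(R) Xeon(R) Platinum 8470Q	Momentum	0.937
PyTorch version	2.7.0	Weight decay	0.0005
CUDA version	12.8	Workers	16
Python version	3.12.3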

Table 2
CURK	DLAE	SDIoU	Precision/%	Recall/%	mAP50/%	Parameters/M	Weight file/MB	Computation/GFLOPs
×	×	×	95.4	89.9	97.0	2.58	5.5	6.3
√	×	×	95.9	94.7	97.3	2.60	5.5	6.4
×	√	×	92.7	87.4	94.0	2.29	4.9	6.1
×	×	√	94.8	94.2	97.7	2.58	5.5	6.3
√	√	×	94.5	92.3	97.1	2.31	4.9	6.1
√	√	√	95.4	92.8	97.4	2.31	4.9	6.1
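The percentage changes quoted in the abstract can be rechecked from the first and last rows of the table above. The following is a minimal Python sketch; the dictionary keys and helper names are illustrative, and values recomputed from the rounded table entries may differ slightly from the published percentages, which were presumably derived from unrounded figures.

```python
# Baseline YOLOv11n (no modules) and the full CURK + DLAE + SDIoU model,
# copied from the ablation table.
baseline = {"recall": 89.9, "map50": 97.0,
            "params_m": 2.58, "weights_mb": 5.5, "gflops": 6.3}
improved = {"recall": 92.8, "map50": 97.4,
            "params_m": 2.31, "weights_mb": 4.9, "gflops": 6.1}


def point_gain(key):
    """Absolute gain in percentage points (improved minus baseline)."""
    return improved[key] - baseline[key]


def relative_reduction(key):
    """Relative reduction with respect to the baseline, in percent."""
    return (baseline[key] - improved[key]) / baseline[key] * 100


if __name__ == "__main__":
    for key in ("recall", "map50"):
        print(f"{key}: +{point_gain(key):.1f} percentage points")
    for key in ("weights_mb", "params_m", "gflops"):
        print(f"{key}: -{relative_reduction(key):.1f}%")
```

Accuracy metrics are compared in percentage points, while model size and cost are compared as relative reductions, which matches how the abstract reports them.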

Table 3
Model	Precision/%	Recall/%	mAP50/%	Parameters/M	Weight file/MB	Computation/GFLOPs	Frame rate/(f/s)
YOLOv8n-Worldv2	94.3	92.1	97.2	2.58	7.3	9.8	232.3
YOLOv9t	94.9	92.2	97.2	2.60	4.6	7.6	142.1
YOLOv10n	92.7	91.4	96.8	2.29	5.7	6.5	235.7
YOLOv11n	95.4	89.9	97.0	2.58	5.5	6.3	273.2
YOLOv12n	93.4	91.7	96.6	2.58	5.5	6.3	169.5
YOLOv13n	92.5	92.8	96.9	2.31	5.4	6.2	133.5
Faster R-CNN	92.7	92.9	96.1	136.77	108.3	401.7	66.3
SSD	93.8	89.3	95.7	4.08	16.3	6.3	157.4
Improved YOLOv11n	95.4	92.8	97.4	2.31	4.9	6.1	189.2

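Table 4
Region	Qualified seeds, total	Qualified seeds, correctly detected	Defective seeds, total	Defective seeds, correctly detected	Defect precision/%	Defect recall/%	Overall accuracy/%
Tianchang City	200	188	202	192	94.1	95.1	94.5
Funan County	200	178	202	184	89.3	91.1	90.0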
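The precision, recall, and accuracy columns can be rechecked from the raw counts. The Python sketch below assumes that every qualified seed not correctly detected was reported as defective (a false positive) and that every defective seed not correctly detected was missed (a false negative). The table does not state this explicitly, and the class and function names are illustrative. Under this assumption the recomputed values agree with the table to within 0.1 percentage point.

```python
from dataclasses import dataclass


@dataclass
class RegionCounts:
    qualified_total: int
    qualified_correct: int
    defect_total: int
    defect_correct: int


def defect_metrics(c):
    """Return (defect precision, defect recall, overall accuracy) in percent."""
    tp = c.defect_correct                          # defects flagged as defects
    fn = c.defect_total - c.defect_correct         # defects that were missed
    fp = c.qualified_total - c.qualified_correct   # assumed: flagged as defects
    tn = c.qualified_correct                       # qualified seeds passed
    precision = tp / (tp + fp) * 100
    recall = tp / (tp + fn) * 100
    accuracy = (tp + tn) / (c.qualified_total + c.defect_total) * 100
    return precision, recall, accuracy


# Counts copied from the platform test table.
tianchang = RegionCounts(200, 188, 202, 192)
funan = RegionCounts(200, 178, 202, 184)

if __name__ == "__main__":
    for name, counts in (("Tianchang", tianchang), ("Funan", funan)):
        p, r, a = defect_metrics(counts)
        print(f"{name}: precision={p:.2f}% recall={r:.2f}% accuracy={a:.2f}%")
```

The reported average of 92.25% is the mean of the two rounded regional accuracies, (94.5 + 90.0) / 2.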