
Smart Agriculture


Appearance Defect Detection Algorithm of Euryale Ferox Based on Improved YOLOv11n

ZHANG Kun1, ZHANG Chunyu1, CHEN Longmei2, LIU Qicheng1, LI Yongkang1, LIU Kai1, ZENG Wenhao1

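  1. College of Intelligent Manufacturing, Anhui University of Science and Technology, Chuzhou 239000, Anhui, China
  2. Gaoyou Secondary Professional School, Gaoyou 225600, Jiangsu, China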
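  • Received: 2026-02-04 Published: 2026-05-22
  • Funding:
    Anhui Provincial Department of Education Natural Science Major Project (2025AHGXZK20066); Anhui Provincial Department of Industry and Information Technology Manufacturing Challenge Project (JB25116)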
  • About the author:

    ZHANG Kun, master's student; research interests: image processing and intelligent agricultural machinery. E-mail:

  • Corresponding author:
    ZHANG Chunyu, M.S., professor; research interest: intelligent agricultural machinery and equipment. E-mail:

Appearance Defect Detection Algorithm of Euryale Ferox Based on Improved YOLOv11n

ZHANG Kun1, ZHANG Chunyu1, CHEN Longmei2, LIU Qicheng1, LI Yongkang1, LIU Kai1, ZENG Wenhao1

  1. College of Intelligent Manufacturing, Anhui University of Science and Technology, Chuzhou 239000, China
  2. Gaoyou Secondary Professional School, Gaoyou 225600, China
  • Received: 2026-02-04 Online: 2026-05-22
  • Foundation items: Anhui Provincial Department of Education Natural Science Major Project (2025AHGXZK20066); Anhui Provincial Department of Industry and Information Technology Manufacturing Challenge Project (JB25116)
  • About author:

    ZHANG Kun, E-mail:

  • Corresponding author:
    ZHANG Chunyu, E-mail:

Abstract:

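[Objective/Significance] To address the high cost, low efficiency, and poor consistency of manual sorting during the processing of Euryale ferox, an appearance defect detection algorithm based on an improved YOLOv11n is proposed, with the aim of developing a lightweight model for accurate sorting of Euryale ferox. [Methods] First, the universal perception large-kernel convolution block (UniRepLKNetBlock) was integrated into the C3k2 (Cross Stage Partial with kernel size 2) structure to build a new feature extraction module, CURK (C3k2-UniRepLKNetBlock), which strengthens the model's representation of fine-grained defect textures and complex backgrounds. Second, a depth lightweight adaptive extraction module replaced some of the convolution modules in the backbone network, improving adaptive learning of key regions while reducing computation. Finally, the SDIoU (Scale Dynamic IoU Loss) loss function was introduced to compensate for the unstable regression and insufficient localization accuracy of the DIoU (Distance Intersection over Union) loss in multi-object detection, improving detection accuracy and bounding-box regression. [Results and Discussion] The improved model reached a mean average precision of 97.4% and a recall of 92.8%, which are 0.4 and 2.9 percentage points higher, respectively, than the baseline YOLOv11n. The model weight file was 4.9 MB, with 2.31 M parameters and 6.1 GFLOPs of floating-point operations, which are 10.9%, 10.7%, and 3.2% lower, respectively, than the baseline. The inference speed of the improved model reached 189.2 frames/s, which meets the requirements of real-time detection. In tests on the experimental platform, the overall average accuracy reached 92.25%, confirming the model's practicality and engineering adaptability. [Conclusions] The improved model maintains high detection accuracy while remaining lightweight and real-time capable. It shows good potential for engineering application and provides an effective technical reference for the intelligent sorting of Euryale ferox.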
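For context, the DIoU baseline named in the abstract is commonly defined as 1 − IoU + ρ²/c², where ρ is the distance between the two box centres and c is the diagonal of the smallest box enclosing both. The sketch below is a minimal pure-Python version of that standard formula only. It is not the SDIoU loss proposed in the paper, whose scale-dependent weighting is not fully specified in this abstract. The function name `diou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration, and both boxes are assumed to have positive area.

```python
def diou_loss(box_a, box_b):
    """Standard DIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection-over-union term.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between the two box centres (rho^2).
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both inputs (c^2).
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2

    return 1.0 - iou + rho2 / c2


# Identical boxes give zero loss; a shifted box is penalised both for
# the reduced overlap and for the centre offset.
identical = diou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = diou_loss((0, 0, 2, 2), (1, 0, 3, 2))
```

The centre-distance term keeps a gradient signal even when the boxes do not overlap, which is the usual motivation for DIoU over plain IoU.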

Keywords: Euryale ferox, YOLOv11n, defect detection, lightweight model, image recognition

Abstract:

[Objective] Manual sorting of Euryale ferox during post-harvest processing suffers from high labor cost, low efficiency, and poor consistency. To address these problems and provide technical support for intelligent processing, an appearance defect detection algorithm based on an improved YOLOv11n model is proposed, with the aim of developing a high-precision, lightweight, real-time sorting model. [Methods] YOLOv11n was improved in three respects: feature extraction, the downsampling mechanism, and the loss function. First, the Universal Perception Large-Kernel ConvNet Block (UniRepLKNetBlock) was integrated into the C3k2 structure of the neck network to construct a new feature extraction module named CURK (C3k2-UniRepLKNetBlock). This module used a depthwise large-kernel convolution as the main branch, together with four parallel convolutional branches. After training, these branches were merged into a single convolutional layer via structural re-parameterization, which significantly enlarged the effective receptive field and enhanced the model's representation capability. Second, a Depth Lightweight Adaptive Extraction (DLAE) module replaced the standard convolutional downsampling layer in the 8th layer of the backbone. DLAE adopted a parallel two-branch design: a feature extraction branch based on depthwise separable convolution (DWConv) to capture local texture details, and an attention branch that generated spatial attention weights through global average pooling and a 1×1 DWConv, followed by Softmax normalization. The attention weights were then multiplied channel-wise with the features, reducing the computational load while adaptively enhancing key defect regions and suppressing background noise. Third, the Scale Dynamic IoU Loss (SDIoU) was introduced to replace the original loss function.
In the BBox branch, SDIoU calculated a dynamic coefficient from the ratio of the target bounding box area to the preset maximum target area, combined with the feature map compression ratio (ROC), and clipped it with a threshold δ = 0.5, automatically adjusting the weights of the scale loss and the location loss. A similar mechanism was applied in the Mask branch. An in-house Euryale ferox appearance defect image dataset was constructed. A total of 2 537 original images were collected, and data augmentation expanded the dataset to 5 354 images covering five categories: qualified, surface scratch, broken, dark (overripe), and shell. The dataset was randomly divided into training, validation, and test sets in a 7:1:2 ratio. [Results and Discussions] Ablation experiments showed that the combination of CURK, DLAE, and SDIoU achieved the best overall performance while keeping the model lightweight: precision was 95.4%, recall was 92.8%, and mAP50 reached 97.4%. Compared with the baseline YOLOv11n, recall increased by 2.9 percentage points and mAP50 by 0.4 percentage points. The model weight file size was reduced to 4.9 MB, the parameter count to 2.31 M, and the computational cost to 6.1 GFLOPs. The inference speed reached 189.2 frames/s, meeting real-time detection requirements. In comparative experiments with mainstream models, the proposed model achieved the highest mAP50 (97.4%) and the lowest parameter count and computational cost. Heatmap visualization indicated that after integrating CURK, DLAE, and SDIoU, the model's attention on Euryale ferox target regions became more concentrated, background interference was significantly suppressed, and missed and false detections were effectively reduced. A physical visual detection validation platform was built, and 402 samples each from Tianchang city and Funan county were tested.
The overall accuracy was 94.5% for the Tianchang samples and 90.0% for the Funan samples, an average of 92.25%, confirming the model's good generalization capability and engineering adaptability across different imaging conditions and geographical origins. [Conclusions] Through the combined effect of the CURK large-kernel re-parameterization module, the DLAE lightweight adaptive downsampling module, and the SDIoU scale-dynamic loss function, the improved YOLOv11n model maintains high detection accuracy while remaining lightweight and fast enough for real-time use, demonstrating good potential for engineering application. It provides an efficient, accurate, and lightweight technical reference for the intelligent sorting of Euryale ferox.
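The structural re-parameterization mentioned for CURK rests on the linearity of convolution: summing the outputs of parallel branches equals one convolution with the sum of their kernels, once the smaller kernels are zero-padded to the largest size. The sketch below checks this numerically in plain Python under strong simplifying assumptions: a single channel, no batch normalization, no dilation, and hypothetical branch sizes (a 7×7 main kernel plus 5×5, 3×3, 3×3, and 1×1 branches), none of which are stated in the abstract. It illustrates the general idea, not the authors' module.

```python
import random


def conv2d_same(x, k):
    """Single-channel 2-D cross-correlation with zero padding (same-size output)."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, z = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= z < w:
                        s += x[y][z] * k[a][b]
            out[i][j] = s
    return out


def pad_kernel(k, size):
    """Embed an odd-sized square kernel at the centre of a size x size zero kernel."""
    off = (size - len(k)) // 2
    big = [[0.0] * size for _ in range(size)]
    for a, row in enumerate(k):
        for b, v in enumerate(row):
            big[a + off][b + off] = v
    return big


def merge_branches(kernels, size):
    """Sum all branch kernels after padding each to the common size."""
    merged = [[0.0] * size for _ in range(size)]
    for k in kernels:
        padded = pad_kernel(k, size)
        for a in range(size):
            for b in range(size):
                merged[a][b] += padded[a][b]
    return merged


def rand(n):
    return [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


random.seed(0)
x = rand(8)
# Hypothetical branch kernel sizes: one large main kernel plus four small ones.
branches = [rand(7), rand(5), rand(3), rand(3), rand(1)]

# Training-time form: run every branch separately and add the outputs.
multi = [[0.0] * 8 for _ in range(8)]
for k in branches:
    y = conv2d_same(x, k)
    multi = [[p + q for p, q in zip(rm, ry)] for rm, ry in zip(multi, y)]

# Inference-time form: a single convolution with the merged 7x7 kernel.
single = conv2d_same(x, merge_branches(branches, 7))

max_diff = max(abs(p - q) for rm, rs in zip(multi, single) for p, q in zip(rm, rs))
```

Here `max_diff` stays at floating-point rounding level, so the merged layer reproduces the multi-branch output at the cost of a single convolution at inference time.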

Key words: Euryale ferox, YOLOv11n, defect detection, lightweight model, image recognition

CLC Number: