
Smart Agriculture ›› 2026, Vol. 8 ›› Issue (2): 147-157. doi: 10.12133/j.smartag.SA202508022

• Information Processing and Decision Making •

Pear Surface Defect Detection Method Based on a Lightweight Mamba-YOLO Model

修贤超1, 费士祺1,2,3, 黄文倩2,3, 李楠1, 苗中华1

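    1. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
    2. Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China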
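  • Received: 2025-08-21 Published: 2026-03-30
  • Funding:
    National Key Research and Development Program of China (2024YFB4707400); Shanghai Key Science and Technology Project (24N32800100)
  • About the author:

    XIU Xianchao (修贤超), PhD, associate professor; research interests: artificial intelligence and embodied intelligence. E-mail:

  • Corresponding author:
    LI Nan (李楠), PhD, lecturer; research interests: intelligent equipment and robotics. E-mail: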

A Lightweight Method for Pear Surface Defect Detection Based on Improved Mamba-YOLO Architecture

XIU Xianchao1, FEI Shiqi1,2,3, HUANG Wenqian2,3, LI Nan1, MIAO Zhonghua1

    1. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
    2. Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
  • Received: 2025-08-21 Online: 2026-03-30
  • Foundation items: National Key Research and Development Program of China (2024YFB4707400); Shanghai Key Science and Technology Project (24N32800100)
  • About author:

    XIU Xianchao, E-mail:

  • Corresponding author:
    LI Nan, E-mail:

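Summary:

[Objective] To address the poor detection accuracy caused by the small scale of surface defects on Dangshan pears, this study proposes a lightweight, high-precision model based on an improved Mamba-YOLO, aiming to balance detection accuracy and efficiency. [Methods] First, a dynamic upsampling module is adopted. Compared with the existing upsampling module in Mamba-YOLO, it has fewer parameters and floating-point operations, which improves the retention of defect detail while preserving computational efficiency. Second, a frequency-adaptive dilated convolution is proposed that dynamically adjusts the convolution kernel size, so that the network adaptively selects a matching kernel according to local input features and extracts defect features more effectively. Finally, the squeeze-and-excitation module is fused with the convolutional gated linear unit channel mixer, and convolution kernels of multiple sizes are introduced to extract multi-scale features, further improving robustness and the capture of local detail. [Results and Discussions] Evaluated on the Dangshan pear test set, the improved algorithm reached a mean average precision of 95.1% and a frame rate of 72 frames/s. Compared with YOLOv8n, Gold-YOLO-N, and YOLOv12n, its mean average precision was higher by 4.7, 5.3, and 6.3 percentage points, respectively; compared with the baseline Mamba-YOLO-T, mean average precision rose by 3.4 percentage points and the frame rate by 10.8 percentage points. [Conclusions] The improved model raises overall detection performance while reducing computational complexity and parameter count, and can provide reliable algorithmic support for research on lightweight pear surface defect detection.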
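Because the reported gains are absolute differences in percentage points, each comparison model's score can be backed out by simple subtraction. The following is a minimal sketch, assuming every gain refers to the same mean average precision metric as the 95.1% figure. The resulting values are implied by the abstract's numbers, not reported in it, and `implied_baseline_map` is an illustrative helper name.

```python
# Reported in the abstract: mAP (%) of the proposed model on the Dangshan pear test set.
PROPOSED_MAP = 95.1

# Reported gains of the proposed model over each comparison model, in percentage points.
GAINS_PP = {
    "YOLOv8n": 4.7,
    "Gold-YOLO-N": 5.3,
    "YOLOv12n": 6.3,
    "Mamba-YOLO-T": 3.4,
}


def implied_baseline_map(proposed: float, gain_pp: float) -> float:
    """Back out a comparison model's mAP from an absolute gain in percentage points."""
    return round(proposed - gain_pp, 1)


# Assumption: all gains are measured on the same metric and test set as PROPOSED_MAP.
implied = {name: implied_baseline_map(PROPOSED_MAP, gain) for name, gain in GAINS_PP.items()}

for name, value in implied.items():
    print(f"{name}: about {value:.1f}% mAP (implied, not reported in the abstract)")
```

A gain in percentage points is an absolute difference, so it should not be read as a relative improvement of the same size.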

Keywords: Mamba-YOLO, defect detection, image recognition, frequency-adaptive dilated convolution, convolutional kernel, dynamic upsampling module

Abstract:

[Objective] Pears are a common fruit rich in vitamins and minerals. Traditional pear grading relies primarily on manual inspection, which is laborious and susceptible to subjective judgment, leading to unstable and inaccurate results. Manual handling may also cause varying degrees of physical damage to pears, affecting their appearance and market value. Developing automated, efficient, and reliable pear grading technology has therefore become an urgent demand in the industry. To address the poor detection accuracy caused by the small scale of surface defects on Dangshan pears, a lightweight, high-precision model was proposed based on an improved Mamba-YOLO architecture, aiming to balance detection accuracy and efficiency. [Methods] The dataset comprised 1 000 images, partitioned into training, validation, and test sets in an 8:1:1 ratio. The following improvements were made to the network architecture. First, a dynamic upsampling (Dysample) module was adopted. Compared with the existing upsampling module in Mamba-YOLO, the Dysample module had fewer parameters and floating-point operations (FLOPs). Its design eliminated complex dynamic convolution kernels, requiring only a small number of linear layers and grouping operations, thereby preserving computational efficiency while enhancing the retention of defect details. Second, pear surface defects often exhibited high-frequency local features, whereas traditional convolutional neural networks (CNNs) suffered from insufficient feature capture and an imbalanced frequency response. As the dilation rate increased, the frequency response of the convolution kernel decreased and its bandwidth narrowed, limiting its ability to process high-frequency information.
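The narrowing of bandwidth with dilation can be checked numerically on a one-dimensional toy kernel. The sketch below is illustrative only: the three-tap averaging kernel, the half-power definition of the passband edge, and the function names are assumptions chosen for the demonstration, not details taken from the paper.

```python
import cmath
import math


def dilated_kernel_response(weights, dilation, omega):
    """Magnitude of the frequency response of a 1-D kernel whose taps sit `dilation` samples apart."""
    center = (len(weights) - 1) / 2
    total = sum(
        w * cmath.exp(-1j * omega * dilation * (k - center))
        for k, w in enumerate(weights)
    )
    return abs(total)


def passband_edge(weights, dilation, steps=20000):
    """First frequency in [0, pi] where the response drops below 1/sqrt(2) of its value at omega = 0."""
    dc = dilated_kernel_response(weights, dilation, 0.0)
    for i in range(steps + 1):
        omega = math.pi * i / steps
        if dilated_kernel_response(weights, dilation, omega) < dc / math.sqrt(2):
            return omega
    return math.pi


# Illustrative smoothing kernel (an assumption, not a kernel from the paper).
kernel = [1 / 3, 1 / 3, 1 / 3]

edges = {d: passband_edge(kernel, d) for d in (1, 2, 4)}
for d, edge in edges.items():
    print(f"dilation={d}: passband edge at about {edge:.3f} rad/sample")
```

For this kernel the response depends on frequency only through the product of dilation and frequency, so the main passband shrinks in inverse proportion to the dilation rate. A larger dilation widens the receptive field at the cost of attenuating fine, high-frequency detail within that main lobe.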
To counter this loss of high-frequency response at larger dilation rates, a frequency-adaptive dilated convolution (FADC) module was proposed. It dynamically adjusted the convolution kernel size, enabling the network to adaptively select matching kernels based on local input features. Smaller kernels were used in high-frequency regions and larger kernels in low-frequency regions, thereby achieving collaborative optimization of multi-band features and enhancing the extraction of defect features. Finally, using only single-scale depthwise convolutions to capture local features might lead to insufficient perception of the input features, and traditional gating mechanisms may model global context inadequately. The squeeze-and-excitation module was therefore fused with a channel mixer based on the convolutional gated linear unit (CGLU), and the combination was extended into a multi-scale version termed MS-CGLU. Convolutional kernels of different sizes extracted multi-scale features, which were then combined by weighted fusion to obtain a stronger feature representation. [Results and Discussions] The proposed method was evaluated on the Dangshan pear test set. Ablation experiments demonstrated that introducing the CGLU, FADC, and Dysample modules each enhanced detection performance, confirming their effectiveness. Compared with YOLOv8n, Gold-YOLO-N, and YOLOv12n, the mean average precision (mAP) was higher by 4.7, 5.3, and 6.3 percentage points, respectively. Compared with the baseline Mamba-YOLO-T, the mAP increased by 3.4 percentage points and the frame rate (frames per second) improved by 10.8 percentage points.
Furthermore, in comparative experiments with larger-scale models from the same Mamba-YOLO series, the proposed algorithm still demonstrated significant advantages: its parameter count was only 41.7% of that of Mamba-YOLO-B and 15.7% of that of Mamba-YOLO-L, and its FLOPs were only 57.1% and 18.1% of theirs, yet it achieved increases in mAP@0.5 of 3.2% and 1.4%, and increases in mAP@0.5:0.95 of 3.1% and 2.6%, respectively. [Conclusions] This research developed a high-precision, lightweight algorithm for detecting surface defects on Dangshan pears. It achieved a superior balance between detection accuracy and inference speed, significantly outperforming relevant lightweight benchmarks and even larger models within its own family in terms of efficiency. This work can provide reliable algorithmic support for research on lightweight detection of pear surface defects.
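The frequency-adaptive selection described in the Methods (small effective kernels in high-frequency regions, large ones in low-frequency regions) can be illustrated with a one-dimensional toy. This is a simplified sketch under stated assumptions, not the authors' FADC module. It uses a hard choice between two dilation rates, and it measures local high-frequency content with the mean squared first difference. All function names, the threshold, and the kernel are hypothetical.

```python
def local_high_freq_energy(signal, radius=3):
    """Mean squared first difference in a window: a crude proxy for local high-frequency content."""
    n = len(signal)
    energy = []
    for i in range(n):
        lo, hi = max(1, i - radius), min(n - 1, i + radius)
        diffs = [(signal[j] - signal[j - 1]) ** 2 for j in range(lo, hi + 1)]
        energy.append(sum(diffs) / len(diffs))
    return energy


def choose_dilations(energy, threshold, small=1, large=3):
    """Small dilation where the signal varies quickly, large dilation where it is smooth."""
    return [small if e > threshold else large for e in energy]


def adaptive_dilated_conv(signal, weights, dilations):
    """Apply a 1-D kernel whose tap spacing is chosen per output position (borders are clamped)."""
    n = len(signal)
    center = (len(weights) - 1) // 2
    out = []
    for i, d in enumerate(dilations):
        acc = 0.0
        for k, w in enumerate(weights):
            j = min(max(i + (k - center) * d, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


# Toy input: a flat surface with a one-sample "defect" spike at index 10.
signal = [0.0] * 10 + [1.0] + [0.0] * 10
weights = [0.25, 0.5, 0.25]  # illustrative kernel, not from the paper

energy = local_high_freq_energy(signal)
dilations = choose_dilations(energy, threshold=0.05)
adaptive = adaptive_dilated_conv(signal, weights, dilations)
fixed = adaptive_dilated_conv(signal, weights, [3] * len(signal))

print("adaptive response at:", [i for i, v in enumerate(adaptive) if v > 0])
print("fixed d=3 response at:", [i for i, v in enumerate(fixed) if v > 0])
```

In this toy, the adaptive variant keeps the response to the spike confined to adjacent positions, while a fixed dilation of 3 scatters it across positions three samples apart. This mirrors, in a very reduced form, why a small effective kernel suits fine defect detail.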

Key words: Mamba-YOLO, defect detection, image recognition, frequency-adaptive dilated convolution (FADC), convolutional kernels, dynamic upsampling (Dysample)

Chinese Library Classification (CLC) number: