Smart Agriculture

   

Lightweight Mamba-YOLO Based Approach for Pear Surface Defect Detection

XIU Xianchao1, FEI Shiqi1,2,3, HUANG Wenqian2,3, LI Nan1, MIAO Zhonghua1

    1. School of Mechanic Engineering and Automation, Shanghai University, Shanghai 200444, China
    2. Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
  • Received: 2025-08-21 Online: 2025-12-11
  • Foundation items: National Key Research and Development Program of China (2024YFB4707400); Shanghai Key Science and Technology Project (24N32800100)
  • About author:

    XIU Xianchao, E-mail:

  • Corresponding author:
    LI Nan, E-mail:

Abstract:

[Objective] Pears are a widely consumed fruit rich in vitamins and minerals. Traditional pear grading relies primarily on manual inspection, which is laborious and subjective, so its results are unstable and inaccurate. Manual handling can also cause varying degrees of physical damage to the fruit, reducing its appearance quality and market value. The industry therefore urgently needs an automated, efficient, and reliable pear grading technology. To address the poor detection accuracy caused by the small scale of surface defects on Dangshan pears, a lightweight, high-precision model based on an improved Mamba-YOLO architecture was proposed, aiming to balance detection accuracy and efficiency. [Methods] To improve training precision and generalization, blurred or otherwise low-quality images were manually removed. The final dataset comprised 1 000 images, which were partitioned into training, validation, and test sets at an 8:1:1 ratio. Data augmentation, including rotation, cropping, mirroring, and brightness adjustment, was then applied to improve training effectiveness. Three improvements were made to the network architecture. First, a dynamic upsampling (Dysample) module was adopted. Compared with the original upsampling module in Mamba-YOLO, the Dysample module had fewer parameters and floating-point operations (FLOPs). Its design eliminated complex dynamic convolution kernels and required only a small number of linear layers and grouping operations, preserving computational efficiency while improving the retention of defect details. Second, pear surface defects often exhibited high-frequency local features, whereas traditional convolutional neural networks (CNNs) suffered from insufficient feature capture and an imbalanced frequency response.
As the dilation rate increased, the frequency response of the convolution kernel decreased and its bandwidth narrowed, limiting its ability to process high-frequency information. A frequency-adaptive dilated convolution (FADC) module was therefore proposed, which dynamically adjusted the convolution kernel size so that the network could adaptively select kernels matched to local input features. Smaller kernels were used in high-frequency regions and larger kernels in low-frequency regions, achieving collaborative optimization of multi-band features and strengthening defect feature extraction. Finally, using only single-scale depthwise convolutions to capture local features might lead to insufficient perception of the input features, and traditional gating mechanisms may lack adequate modeling of global context. The squeeze-and-excitation (SE) module was therefore fused with a channel mixer based on the convolutional gated linear unit (CGLU), and the combination was extended into a multi-scale version termed MS-CGLU. Convolutional kernels of different sizes extracted multi-scale features, which were then combined by weighted fusion to obtain a stronger feature representation. [Results and Discussions] The proposed algorithm was evaluated on the Dangshan pear test set. Ablation experiments showed that introducing the CGLU, FADC, and Dysample modules each improved detection performance, confirming their effectiveness. Compared with YOLOv8n, Gold-YOLO-N, and YOLOv12n, the mean average precision (mAP) was higher by 4.7, 5.3, and 6.3 percentage points, respectively. Compared with the baseline Mamba-YOLO-T, the mAP increased by 3.4 percentage points and the frames per second (FPS) improved by 10.8%.
Furthermore, in comparative experiments with larger models from the same Mamba-YOLO series, the proposed algorithm retained clear advantages: its parameter count was only 41.7% of that of Mamba-YOLO-B and 15.7% of that of Mamba-YOLO-L, and its FLOPs were only 57.1% and 18.1% of those of the respective models, yet it achieved increases in mAP@0.5 of 3.2% and 1.4% and increases in mAP@0.5:0.95 of 3.1% and 2.6%, respectively. [Conclusions] This study developed a high-precision, lightweight algorithm for detecting surface defects on Dangshan pears. It achieved a better balance between detection accuracy and inference speed than the compared lightweight benchmarks, and it was more efficient than the larger models within its own family. This work can provide reliable algorithmic support for research on lightweight detection of pear surface defects.
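
The claim in [Methods] that a larger dilation rate narrows a kernel's bandwidth can be illustrated with a small numerical sketch. This is not the FADC module itself: it assumes a 1-D, 3-tap averaging kernel as a stand-in for a convolution kernel, and the helper names (dilate, magnitude, half_power_bandwidth) are hypothetical. Dilating by a rate d inserts d - 1 zeros between taps, which compresses the frequency response by roughly a factor of d.

```python
import cmath
import math


def dilate(kernel, rate):
    """Insert (rate - 1) zeros between the taps of a 1-D kernel."""
    out = []
    for i, tap in enumerate(kernel):
        out.append(tap)
        if i < len(kernel) - 1:
            out.extend([0.0] * (rate - 1))
    return out


def magnitude(kernel, omega):
    """Magnitude of the discrete-time Fourier transform at frequency omega."""
    return abs(sum(tap * cmath.exp(-1j * omega * n)
                   for n, tap in enumerate(kernel)))


def half_power_bandwidth(kernel, steps=4096):
    """First omega in (0, pi] where |H| falls below |H(0)| / sqrt(2)."""
    threshold = magnitude(kernel, 0.0) / math.sqrt(2.0)
    for k in range(1, steps + 1):
        omega = math.pi * k / steps
        if magnitude(kernel, omega) < threshold:
            return omega
    return math.pi


# Illustrative assumption: a 3-tap averaging (low-pass) kernel.
base = [1 / 3, 1 / 3, 1 / 3]
bandwidths = {rate: half_power_bandwidth(dilate(base, rate))
              for rate in (1, 2, 3)}

for rate, bw in bandwidths.items():
    print(f"dilation rate {rate}: half-power bandwidth = {bw:.4f} rad/sample")
```

Under these assumptions the half-power bandwidth shrinks as the dilation rate grows, which is the motivation the abstract gives for using smaller kernels in high-frequency regions and larger ones in low-frequency regions.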

Key words: Dangshan pear, lightweight YOLO, defect detection, image recognition

CLC Number: