
Smart Agriculture


Chilli-YOLO: An Intelligent Maturity Detection Algorithm for Field-Grown Chilli Based on Improved YOLOv10

SI Chaoguo1,2, LIU Mengchen2,3, WU Huarui2,4, MIAO Yisheng2,4, ZHAO Chunjiang2

  1. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
  2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
  3. School of Automation, Beijing Information Science & Technology University, Beijing 100096, China
  4. Key Laboratory of Digital Village Technology, Ministry of Agriculture and Rural Affairs, Beijing 100097, China
  • Received: 2024-10-27 Online: 2025-03-24
  • Foundation items: National Key Research and Development Program Project (2023YFD2001205); National Modern Agricultural Industry Technology System (CARS-23-D07); Beijing Position Expert Task (BAIC10-2024-E10)
  • About author: SI Chaoguo, master's student; research interest: computer vision.
  • Corresponding author: ZHAO Chunjiang, Ph.D., research professor, academician of the Chinese Academy of Engineering; research interest: smart agriculture.

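Summary:

[Objective/Significance] To determine the optimal harvesting time for field-grown chillies and enable intelligent sorting, and to address the low efficiency and limited accuracy of existing object detection models in chilli maturity detection, an object detection algorithm, Chilli-YOLO, is proposed. It aims to detect the maturity of chilli fruits quickly and accurately, supporting intelligent harvesting and maturity detection of field-grown chillies. [Methods] Field-grown chillies against complex backgrounds were taken as the research object and divided into four maturity levels: immature, transitional, mature, and dried. On this basis, YOLOv10s (You Only Look Once version 10 small) was optimized. First, the backbone network was optimized with Ghost convolution: standard convolutions were replaced with GhostConv, and C2f was replaced with C2f_Ghost, to reduce computational redundancy. Second, the PSA (Partial Self-Attention) module was replaced with the SOCA (Second-Order Channel Attention) mechanism, which introduces higher-order feature correlations to capture fine-grained chilli features. Finally, the XIoU (Extended Intersection over Union) loss function was introduced to strengthen localization precision and improve model accuracy. [Results and Discussion] Experiments on a self-built chilli maturity detection dataset showed that Chilli-YOLO reached a computational load of 18.3 GFLOPs, a parameter count of 6.37 M, and a model size of 12.6 M, with an inference time of 7.3 ms. The model's mean average precision (mAP), accuracy, and recall reached 88.9%, 90.7%, and 82.4%, respectively, which are 2.8, 2.6, and 2.8 percentage points higher than the baseline model. The results were also compared with the mainstream Faster RCNN (Faster Region-based Convolutional Neural Network), SSD (Single Shot MultiBox Detector), and several versions of the YOLO series, confirming that the overall performance of the proposed method is better than that of the other algorithms. [Conclusion] The proposed Chilli-YOLO model can accurately distinguish the different maturity levels of field-grown chillies. It improves detection accuracy while effectively reducing computational overhead, and provides an effective technical reference for intelligent chilli harvesting.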
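The Ghost convolution step above is described only as reducing computational redundancy. A rough way to see where the saving comes from is to count weights. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a ratio-2 Ghost module (half of the output channels come from a regular convolution, the other half from a cheap depthwise convolution over that first half), a 5×5 depthwise kernel, and no bias or normalization weights. The function names and the channel sizes are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a bias-free 2-D convolution with a k x k kernel."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int = 3, cheap_k: int = 5) -> int:
    """Weight count of a ratio-2 Ghost module (assumed layout, see lead-in).

    A primary convolution produces half of the output channels; a depthwise
    'cheap' convolution derives the remaining half from those channels.
    """
    half = c_out // 2
    primary = conv_params(c_in, half, k)
    cheap = conv_params(half, half, cheap_k, groups=half)  # depthwise
    return primary + cheap


# Hypothetical layer: 128 input channels, 256 output channels, 3x3 kernel.
standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
print(f"standard conv: {standard} weights")
print(f"ghost conv:    {ghost} weights")
print(f"reduction:     {standard / ghost:.2f}x")
```

Under these assumptions the Ghost layer needs roughly half the weights of the standard layer, because the depthwise branch adds only a small number of parameters per channel.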


Abstract:

[Objective] In modern agriculture, rapid and accurate detection of chillies at different maturity stages is a critical step in determining the optimal harvesting time and achieving intelligent sorting of field-grown chillies. However, existing object detection models are limited in both efficiency and accuracy when applied to chilli maturity detection, which restricts their use in practical applications. To address these challenges, a new algorithm, Chilli-YOLO, was proposed for efficient and precise detection of chilli maturity in complex environments. [Methods] This research focused on field-grown chillies cultivated at the national precision agriculture base in Changping District, Beijing, China. An image dataset was collected that captured chillies under diverse and realistic agricultural conditions, including varying lighting, camera angles, and background complexity. The images were categorized into four maturity stages: immature, transitional, mature, and dried. To ensure data quality, the initial image pool was screened manually and then annotated with bounding boxes delineating individual chillies. Data augmentation was then applied to expand the dataset and improve the model's generalization. The YOLOv10s object detection network was chosen as the base architecture, and its backbone network was optimized. Specifically, standard convolutional layers were replaced with Ghost convolutions, which generate more feature maps from fewer parameters, reducing computation and improving processing speed without compromising feature extraction quality.
Additionally, the C2f module was replaced with the more computationally efficient C2f_Ghost module, further reducing redundancy. To improve the model's ability to discern subtle visual cues indicative of maturity, particularly under occlusion, uneven lighting, or complex backgrounds, the Partial Self-Attention (PSA) module within YOLOv10s was replaced with the Second-Order Channel Attention (SOCA) mechanism. SOCA leverages higher-order feature correlations to capture fine-grained characteristics of the chillies more effectively, enabling the model to focus on relevant feature channels and identify subtle maturity-related features even under significant visual noise and interference. Finally, to refine target localization and reduce bounding box errors, the Extended Intersection over Union (XIoU) loss function was integrated into model training. XIoU extends the traditional IoU loss by considering factors such as the aspect ratio difference and the normalized distance between the predicted and ground-truth bounding boxes. Optimizing for these factors improved localization accuracy, giving a more precise delineation of chillies in the images and contributing to overall detection performance. Together, these improvements aimed to classify the maturity level of chillies correctly in the complex environment of a real-world farm. [Results and Discussion] The experimental results on the custom-built chilli maturity detection dataset showed that the Chilli-YOLO model performed well across multiple evaluation metrics. The model achieved an accuracy of 90.7%, a recall of 82.4%, and a mean average precision (mAP) of 88.9%.
Additionally, the model's computational load, parameter count, model size, and inference time were 18.3 GFLOPs, 6.37 M, 12.6 M, and 7.3 ms, respectively. Compared with the baseline model, Chilli-YOLO improved accuracy by 2.6 percentage points, recall by 2.8 percentage points, and mAP by 2.8 percentage points. At the same time, the computational load decreased by 6.2 GFLOPs, the parameter count by 1.67 M, and the model size by 3.9 M. These results indicated that Chilli-YOLO strikes a good balance between accuracy and efficiency, making it capable of fast and precise detection of chilli maturity in complex agricultural environments. Moreover, compared with earlier versions of the YOLO model, Chilli-YOLO improved accuracy by 2.7, 4.8, and 5.0 percentage points over YOLOv5s, YOLOv8n, and YOLOv9s, respectively. Recall was higher by 1.1, 0.3, and 2.3 percentage points, and mAP by 1.2, 1.7, and 2.3 percentage points, respectively. Chilli-YOLO also outperformed YOLOv5 in parameter count, model size, and inference time, and it avoided the lower accuracy of YOLOv8n, which could not meet the precise detection needs of complex outdoor environments. Compared with the traditional two-stage network Faster RCNN, Chilli-YOLO showed significant improvements across all evaluation metrics. Compared with the one-stage network SSD, Chilli-YOLO achieved substantial gains in accuracy, recall, and mAP, with increases of 16.6%, 12.1%, and 16.8%, respectively, and also improved markedly in memory usage, model size, and inference time. These results highlighted the superior overall performance of the Chilli-YOLO model in both memory consumption and detection accuracy, confirming its advantages for chilli maturity detection.
[Conclusion] By optimizing the network structure and the loss function, the proposed Chilli-YOLO model improves detection accuracy while reducing computational overhead, making it better suited to resource-constrained agricultural production environments. The model provides a reliable technical reference for the intelligent harvesting of chillies in such environments. By improving both performance and efficiency, Chilli-YOLO contributes to agricultural automation and precision farming.
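The abstract reports Chilli-YOLO's final figures and its differences from the baseline, but not the baseline figures themselves. They can be recovered by simple subtraction, as sketched below. The baseline values and the relative reductions computed here are derived from the abstract's numbers, not reported by the authors, and the dictionary keys are hypothetical names chosen for illustration.

```python
# Figures reported for Chilli-YOLO in the abstract.
chilli_yolo = {
    "accuracy": 90.7, "recall": 82.4, "mAP": 88.9,      # percent
    "GFLOPs": 18.3, "params_M": 6.37, "size_M": 12.6,   # units as in the abstract
}

# Reported change relative to the baseline model
# (percentage points for the first three, absolute units for the rest).
delta = {
    "accuracy": 2.6, "recall": 2.8, "mAP": 2.8,
    "GFLOPs": -6.2, "params_M": -1.67, "size_M": -3.9,
}

# Derived baseline figures: value minus change.
baseline = {k: round(chilli_yolo[k] - delta[k], 2) for k in chilli_yolo}

# Derived relative reduction in cost, as a percentage of the baseline.
reduction_pct = {
    k: round(100 * -delta[k] / baseline[k], 1)
    for k in ("GFLOPs", "params_M", "size_M")
}

for name in chilli_yolo:
    print(f"{name:>9}: baseline {baseline[name]:>6} -> Chilli-YOLO {chilli_yolo[name]:>6}")
print(reduction_pct)
```

Read this way, the reported differences correspond to roughly 25% less computation, 21% fewer parameters, and a 24% smaller model than the implied baseline.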

Key words: YOLOv10, chilli, maturity, SOCA, Ghost
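The abstract says the XIoU loss extends IoU with terms such as the normalized distance between predicted and ground-truth boxes, but it does not give the formula. The sketch below therefore does not implement XIoU. It shows plain IoU plus the normalized center-distance penalty used by the well-known DIoU loss, as a stand-in for the kind of term the abstract describes. The aspect-ratio term is omitted, and the function names and box coordinates are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center_distance_penalty(a, b):
    """Squared center distance divided by the squared diagonal of the
    smallest box enclosing both a and b (the DIoU penalty term)."""
    cxa, cya = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    cxb, cyb = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    dist2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return dist2 / diag2 if diag2 > 0 else 0.0


def diou_loss(pred, gt):
    """DIoU loss: 1 - IoU + normalized center-distance penalty."""
    return 1.0 - iou(pred, gt) + center_distance_penalty(pred, gt)


pred = (0.0, 0.0, 2.0, 2.0)   # hypothetical predicted box
gt = (1.0, 1.0, 3.0, 3.0)     # hypothetical ground-truth box
print(f"IoU  = {iou(pred, gt):.4f}")
print(f"loss = {diou_loss(pred, gt):.4f}")
```

Unlike plain IoU, the distance penalty still gives a useful gradient signal when two boxes do not overlap at all, which is one reason losses of this family are used for bounding box regression.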

CLC number: