
Smart Agriculture


CD-YOLO: A Field Carrot Seedling Detection Method Based on Improved YOLOv11s

LIU Haoran1,2, WANG Yu1, ZHAO Xueguan2,4, WU Huarui3, FU Hao2,4, PANG Shujie5, ZHAI Changyuan2,4

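    1. School of Mechanical Engineering and Automation, University of Science and Technology Liaoning, Anshan, Liaoning 114051, China
    2. Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    4. National Engineering Research Center of Intelligent Equipment for Agriculture, Beijing 100097, China
    5. Chinese Academy of Agricultural Mechanization Sciences Group Co., Ltd., Beijing 100083, China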
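  • Received: 2025-11-09 Published: 2026-01-21
  • Funding:
    Beijing Academy of Agriculture and Forestry Sciences Innovation Capacity Building Project (KJCX20230409); National Natural Science Foundation of China (32201647); Reform and Development Project (GGFZ20250205)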
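  • About the author:

    LIU Haoran, master's student; research focus: development of intelligent pesticide application equipment. E-mail:

  • Corresponding author:
    ZHAI Changyuan, Ph.D., research fellow; research focus: development of intelligent pesticide application equipment. E-mail: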

CD-YOLO: A Method for Detecting Carrot Seedlings in Fields Based on an Improved YOLOv11s

LIU Haoran1,2, WANG Yu1, ZHAO Xueguan2,4, WU Huarui3, FU Hao2,4, PANG Shujie5, ZHAI Changyuan2,4

    1. School of Mechanical Engineering and Automation, University of Science and Technology Liaoning, Anshan 114051, China
    2. Intelligent Equipment Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
    4. National Engineering Research Center of Intelligent Equipment for Agriculture (NERCIEA), Beijing 100097, China
    5. Chinese Academy of Agricultural Mechanization Sciences Group Co., Ltd., Beijing 100083, China
  • Received: 2025-11-09 Online: 2026-01-21
  • Foundation items: Beijing Academy of Agriculture and Forestry Sciences Innovation Capacity Building Project (KJCX20230409); National Natural Science Foundation of China (32201647); Reform and Development Project (GGFZ20250205)
  • About the author:

    LIU Haoran, E-mail:

  • Corresponding author:
    ZHAI Changyuan, E-mail:

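Summary:

[Objective/Significance] Occlusion by branches and leaves is the main challenge in accurately identifying carrot seedlings in the field. To address this, a lightweight and efficient detection model was developed for accurate and robust recognition of carrot seedlings under field occlusion. [Methods] A lightweight detection method, CD-YOLO (Carrot Detection-YOLO), was proposed based on an improved YOLOv11s (You Only Look Once version 11 small). First, to reduce model complexity, some of the standard convolutions (Convolution-BatchNorm-SiLU, CBS) in the backbone network were replaced with depthwise separable convolutions (DWConv) to reduce floating-point operations (FLOPs) and the number of parameters. Second, the efficient multi-scale attention (EMA) mechanism was introduced into the C3k2 module to build a C3k2_EMA module, which strengthened the model's dynamic perception of key features and thereby suppressed background and occlusion noise. Finally, a dynamic detection head (DynamicHead) was adopted to strengthen the fusion and perception of features at different scales, further improving detection robustness. [Results and Discussions] Compared with YOLOv11s, the computational cost and size of the CD-YOLO model were reduced by 6.2 GFLOPs and 4.8 MB, respectively; single-image inference was 4.7 ms faster; and precision, recall, and mean average precision (mAP0.5) increased by 3.0, 1.5, and 2.4 percentage points, respectively. Under branch and leaf occlusion, the missed detection rate of CD-YOLO was 13.4%, 5.7 percentage points lower than that of YOLOv11s. When the model was deployed on an edge computing device and tested on randomly selected carrot seedling images, the missed detection and false detection rates were 5.1% and 2.7%, respectively. [Conclusions] While remaining lightweight, the CD-YOLO model effectively improved the accuracy and speed of field carrot seedling detection and performed particularly well under branch and leaf occlusion. The study can provide reliable technical support for emergence rate statistics and precise management in automated carrot seedling raising.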

Keywords: carrot seedlings, occlusion, object detection, CD-YOLO, lightweight

Abstract:

[Objective] In natural field environments, leaf occlusion and mutual shading between plants make accurate identification of carrot seedlings difficult. Furthermore, practical agricultural applications often rely on edge devices with limited computational power, which calls for a detection model that combines lightweight design, high accuracy, and robust anti-occlusion capability. This study aimed to develop a robust recognition method for carrot seedlings suited to complex field conditions, thereby improving the accuracy and efficiency of seedling emergence statistics in automated seedling raising and providing reliable technical support for precise farm management. [Methods] CD-YOLO (Carrot Detection-YOLO), a lightweight detection model based on an improved YOLOv11s, was proposed. First, to reduce model complexity, some standard convolutions (Convolution-BatchNorm-SiLU, CBS) in the backbone network were replaced with depthwise separable convolutions (DWConv), decreasing floating-point operations (FLOPs) and the number of parameters and establishing a lightweight foundation for edge deployment. Second, the efficient multi-scale attention (EMA) mechanism was embedded into the key feature extraction module C3k2 to construct a C3k2_EMA module. Through its parallel multi-branch structure, this module enhanced dynamic perception of local key features and reconstructed the cross-scale contextual dependencies broken by occlusion, effectively suppressing background and occlusion noise. Finally, the DynamicHead detection head was introduced. Leveraging its scale-aware and spatial-aware mechanisms, it achieved dynamic fusion of multi-level features and adaptive weight adjustment, further improving the robustness of the model's decisions in complex scenes. To comprehensively evaluate model performance, a carrot seedling dataset covering various field scenarios was independently constructed.
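As a rough illustration of why replacing standard convolutions with depthwise separable ones lowers the parameter count and FLOPs, the sketch below compares the two for a single layer. The layer dimensions are hypothetical, bias and normalization terms are ignored, and the textbook depthwise-plus-pointwise factorization is assumed; the abstract does not give the exact layer configuration used in CD-YOLO.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and multiply-accumulates (MACs) of a k x k standard convolution."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Parameters and MACs of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


# Hypothetical backbone layer: 128 -> 256 channels, 3 x 3 kernel, 40 x 40 output map.
c_in, c_out, k, h_out, w_out = 128, 256, 3, 40, 40
std_params, std_macs = standard_conv_cost(c_in, c_out, k, h_out, w_out)
dws_params, dws_macs = depthwise_separable_cost(c_in, c_out, k, h_out, w_out)

# The cost ratio reduces to 1/c_out + 1/k^2, independent of the feature-map size.
ratio = dws_params / std_params
print(f"standard:  {std_params} params, {std_macs} MACs")
print(f"separable: {dws_params} params, {dws_macs} MACs")
print(f"ratio:     {ratio:.4f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why swapping even a few backbone layers can noticeably reduce the total cost.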
Through offline data augmentation, the original 1 274 images were expanded to 4 796, which were then split into training, validation, and test sets at a ratio of 8:1:1. Meanwhile, to systematically quantify the model's anti-occlusion performance, an occlusion severity criterion based on the overlapping area of bounding boxes was proposed, under which targets were categorized into three occlusion levels: mild, moderate, and severe. On this basis, a dedicated "Occlusion Test Subset" was separated from the main test set, providing an objective and reproducible benchmark for evaluating anti-occlusion capability. [Results and Discussions] Experimental results on the custom dataset demonstrated that CD-YOLO improved detection performance across the board while remaining lightweight. Compared to the baseline model YOLOv11s, CD-YOLO reduced computational load by 6.2 GFLOPs (a 28.8% decrease), decreased model size by 4.8 MB (a 25.0% reduction), and shortened single-image inference time by 4.7 ms, to 9.6 ms. Concurrently, precision, recall, and mean average precision (mAP0.5) increased by 3.0, 1.5, and 2.4 percentage points, respectively, reaching 81.2%, 76.4%, and 84.0%. In comparisons with other lightweight backbone networks such as MobileNetV3 and ShuffleNetV2, CD-YOLO consistently outperformed them on the combined accuracy-speed metric, validating the effectiveness of its improvement strategies. In occlusion performance tests, the missed detection rate of CD-YOLO on the occlusion test subset was 13.4%, 5.7 percentage points lower than that of YOLOv11s. Its mAP0.5 on the occlusion subset reached 80.6%, 5.1 percentage points higher than the baseline, whereas the improvement on the regular subset was 1.8 percentage points, confirming that the gains were largest in occlusion scenarios.
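The occlusion severity criterion based on bounding-box overlap can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact overlap measure or the level boundaries, so the fraction used here (largest single overlap divided by the target's own box area) and the thresholds `mild_max=0.3` and `moderate_max=0.6` are hypothetical placeholders.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two axis-aligned boxes (0 if they are disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def occluded_fraction(target: Box, others: Iterable[Box]) -> float:
    """Largest share of the target box covered by any single neighbouring box.

    Simplification (assumption): overlaps from several neighbours are not merged.
    """
    area = (target[2] - target[0]) * (target[3] - target[1])
    if area <= 0:
        raise ValueError("target box must have positive area")
    return max((intersection_area(target, o) / area for o in others), default=0.0)


def occlusion_level(fraction: float, mild_max: float = 0.3, moderate_max: float = 0.6) -> str:
    """Map an occluded fraction to one of three levels (thresholds are hypothetical)."""
    if fraction < mild_max:
        return "mild"
    if fraction < moderate_max:
        return "moderate"
    return "severe"


target = (0.0, 0.0, 10.0, 10.0)
for neighbour in [(8.0, 0.0, 18.0, 10.0), (5.0, 0.0, 15.0, 10.0), (2.0, 0.0, 12.0, 10.0)]:
    fraction = occluded_fraction(target, [neighbour])
    print(neighbour, round(fraction, 2), occlusion_level(fraction))
```

Test images whose targets fall in the moderate or severe levels could then be collected into a separate occlusion subset, mirroring the split described above.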
After the model was deployed on an NVIDIA Jetson Orin NX edge device and accelerated with TensorRT, the inference frame rate increased to 32.5 FPS. On random test images, CD-YOLO achieved missed detection and false detection rates of 5.1% and 2.7%, respectively, representing decreases of 7.7% and 2.6% compared to YOLOv11s and demonstrating promising potential for practical application. Ablation studies and feature map visualizations further indicated that DWConv, C3k2_EMA, and DynamicHead formed a synergistic optimization loop: DWConv compressed computation, freeing up computational budget for subsequent modules; C3k2_EMA enhanced local perception and contextual reconstruction of occluded targets during feature extraction; and DynamicHead performed dynamic fusion of multi-scale features at the decision-making end. Together, they ensured high-precision detection of incomplete targets under limited computational resources. [Conclusions] Through the synergistic design of "lightweighting, feature enhancement, and dynamic fusion", the CD-YOLO model achieved an excellent balance among computational efficiency, detection accuracy, and anti-occlusion capability. The model not only significantly reduced reliance on the computational power of edge devices but also effectively improved robustness and adaptability in complex field environments through structured attention and dynamic fusion mechanisms.
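For readers who want to reproduce missed detection and false detection rates of the kind reported above, the following minimal sketch shows one common way to compute them. The definitions are assumptions, since the abstract does not state them: detections are greedily matched to ground-truth boxes at an IoU threshold of 0.5, the missed detection rate is taken as FN / (TP + FN), and the false detection rate as FP / (TP + FP). The boxes in the example are made up.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(ground_truth, detections, iou_threshold=0.5):
    """Greedy matching in descending confidence order; returns (tp, fp, fn).

    `detections` is a list of (box, confidence) pairs.
    """
    unmatched = list(range(len(ground_truth)))
    tp = fp = 0
    for box, _conf in sorted(detections, key=lambda d: d[1], reverse=True):
        best = max(unmatched, key=lambda i: iou(box, ground_truth[i]), default=None)
        if best is not None and iou(box, ground_truth[best]) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box is matched at most once
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)


def error_rates(tp, fp, fn):
    """Missed detection rate and false detection rate under the assumed definitions."""
    missed = fn / (tp + fn) if tp + fn else 0.0
    false = fp / (tp + fp) if tp + fp else 0.0
    return missed, false


# Made-up example: three seedlings, two found, one spurious detection.
ground_truth = [(0, 0, 10, 10), (20, 0, 30, 10), (40, 0, 50, 10)]
detections = [((1, 0, 11, 10), 0.9), ((21, 1, 31, 11), 0.8), ((60, 0, 70, 10), 0.7)]
tp, fp, fn = count_matches(ground_truth, detections)
missed_rate, false_rate = error_rates(tp, fp, fn)
print(tp, fp, fn, round(missed_rate, 3), round(false_rate, 3))
```

Other papers normalize the false detection rate by the number of ground-truth targets instead, so the denominator should be checked against the full text before comparing numbers.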

Keywords: carrot seedlings, occlusion, object detection, CD-YOLO, lightweight

CLC number: