
Smart Agriculture


YOLOv8n-SSND: An Improved Lightweight Model for Aerial Quinoa Spike Target Detection

WU Tingting1, GUO Junrui1, TAO Qiujie1, CHEN Shihua2, GUO Shanli2()   

  1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China
    2. College of Life Sciences, Yantai University, Yantai 264006, Shandong, China
  • Received: 2025-08-21 Online: 2025-12-05
  • Foundation items:
    Agricultural Superior Varieties Project of the Shandong Province Key Research and Development Program (2023LZGC011)
  • About author:

    WU Tingting, Ph.D., associate professor; research interests: crop phenotyping detection technology and equipment. E-mail:

  • Corresponding author:
    GUO Shanli, Ph.D., professor; research interests: plant genetics and breeding, plant gene cloning and functional studies. E-mail:

YOLOv8n-SSND: An Improved Lightweight Model for Aerial Chenopodium quinoa Willd Spike Target Detection

WU Tingting1, GUO Junrui1, TAO Qiujie1, CHEN Shihua2, GUO Shanli2()   

  1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China
    2. College of Life Sciences, Yantai University, Yantai 264006, China
  • Received:2025-08-21 Online:2025-12-05
  • Foundation items:The Key Research Project of Shandong Province for Agricultural Superior Varieties(2023LZGC011)
  • About author:

    WU Tingting, E-mail:

  • Corresponding author:
    GUO Shanli, E-mail:

Abstract:

[Objective/Significance] The quinoa spike is one of the key indicators for quinoa yield estimation. To improve the accuracy and efficiency of quinoa spike target detection, a lightweight target recognition model suitable for UAV deployment, YOLOv8n-SSND (YOLOv8n with Switchable Atrous Convolution, Slim Neck, and Deformable Attention), is proposed. [Methods] With YOLOv8n and YOLOv11n as baseline models, and considering the varied size and complex structure of quinoa spikes, a switchable atrous convolution (SAC) module was added to the Backbone to strengthen the detection of complex features; a Slim-Neck feature fusion layer was introduced to lighten the network; and a deformable attention (DA) mechanism was added so that the model can dynamically recognize the complex features of quinoa spikes while maintaining high inference efficiency. [Results and Discussions] The YOLOv8n-SSND model achieved a mean average precision at 50% intersection over union (mAP50) of 94.3%, improvements of 0.7, 0.9, 2.1, 1.4, 2.0, 23.1, 19.6, and 1.8 percentage points over YOLOv11n-SSND, YOLOv11n, YOLOv12n, YOLOv7, YOLOv5s, SSD (Single Shot MultiBox Detector), Fast R-CNN (Fast Region-Based Convolutional Neural Network), and YOLOv8n, respectively. Its inference speed reached 166.7 FPS, 26.7% higher than the baseline model, and its computational cost was 6.8 GFLOPs, 16.0% lower than the baseline model. [Conclusions] The YOLOv8n-SSND model delivers higher accuracy, faster inference, and fewer floating-point operations for quinoa spike recognition, providing a feasible method for UAV-based quinoa spike detection and an efficient technical solution for quinoa yield estimation and intelligent agricultural management.

Keywords: quinoa spike, UAV, YOLOv8n, object detection, deformable attention

Abstract:

[Objective] The quinoa panicle serves as a critical phenotypic indicator for estimating crop yield and evaluating the growth condition of quinoa plants. Accurate and efficient recognition of quinoa panicles in complex field environments is therefore of great significance for intelligent agriculture, yield prediction, and automatic crop management. However, unmanned aerial vehicle (UAV)-acquired field imagery often exhibits complex characteristics such as diverse panicle morphology, uneven illumination, overlapping occlusion, and background interference, which pose substantial challenges for conventional target detection algorithms. To address these issues, a novel lightweight target detection model, named YOLOv8n-SSND (YOLOv8n with Switchable Atrous Convolution, Slim Neck, and Deformable Attention), is proposed and specifically optimized for UAV-based quinoa panicle identification. The main goal of this study is to improve the detection accuracy and inference efficiency for quinoa panicles while maintaining low computational cost and real-time performance suitable for embedded UAV deployment. [Methods] The proposed model was constructed on the YOLOv8n and YOLOv11n frameworks and incorporated several improvements tailored for small-object agricultural detection tasks. To enhance the ability to capture multi-scale and high-dimensional semantic features, a switchable atrous convolution (SAC) module was embedded into the Backbone network. This module dynamically adjusted its receptive field according to spatial context, enabling more precise extraction of local and global texture details of quinoa panicles. To reduce redundant parameters while maintaining high computational efficiency, a Slim-Neck lightweight feature fusion layer was designed, which effectively strengthened the integration of shallow spatial information and deep semantic features, allowing the network to maintain high accuracy without increasing model complexity.
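The core idea of switchable atrous convolution — one set of convolution weights applied at two dilation rates, blended per position by a learned switch — can be sketched in 1-D with NumPy. This is an illustrative toy, not the paper's implementation: the function names, the 1-D setting, and the fixed scalar switch are assumptions for clarity.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    # 'same'-padded 1-D convolution with a dilated, centered kernel
    k = len(w)
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)
    return np.array([sum(w[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))])

def switchable_atrous_conv(x, w, switch):
    # SAC idea: the SAME weights are evaluated at two dilation rates,
    # and a switch in [0, 1] (learned in the real model, a constant here)
    # blends the small- and large-receptive-field responses per input.
    y_small = dilated_conv1d(x, w, dilation=1)
    y_large = dilated_conv1d(x, w, dilation=3)
    return switch * y_small + (1 - switch) * y_large
```

In the actual module the switch is itself produced by a convolution over the input, so the effective receptive field adapts to spatial context; here a constant stands in for that gate.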
Additionally, a deformable attention (DA) mechanism was introduced to enable adaptive focus on regions with rich panicle-related features while suppressing irrelevant background noise. This attention mechanism assigned dynamic weights across both spatial and channel dimensions, improving the model's robustness against occlusions, illumination variations, and complex field textures commonly encountered in UAV images. [Results and Discussions] Comprehensive field experiments were conducted using UAV images of quinoa plots collected under different environmental conditions and growth stages. The results demonstrated that the proposed YOLOv8n-SSND model achieved a mean average precision (mAP50) of 94.3%, a remarkable improvement over multiple baseline and comparative models. Specifically, compared with YOLOv11n-SSND, YOLOv11n, YOLOv12n, YOLOv7, YOLOv5s, SSD (Single Shot MultiBox Detector), Fast R-CNN (Fast Region-Based Convolutional Neural Network), and YOLOv8n, the proposed model achieved improvements of 0.7, 0.9, 2.1, 1.4, 2.0, 23.1, 19.6, and 1.8 percentage points, respectively. In terms of computational efficiency, the inference speed reached 166.7 frames per second (FPS), a 26.7% increase over the YOLOv8n baseline, ensuring real-time detection capability for UAV-mounted onboard processors. Moreover, the total floating-point operation count was reduced to 6.8 GFLOPs, a 16.0% reduction compared with the baseline model, demonstrating the improved efficiency of the proposed architecture. The experimental comparison also indicated that the integration of SAC enhanced the model's sensitivity to complex spatial patterns, while the DA module effectively improved feature selectivity and prevented overfitting to background textures. The Slim-Neck design contributed significantly to reducing parameter redundancy and facilitated smooth feature propagation across layers.
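The relative figures quoted above imply baseline values that the abstract does not state explicitly. A quick back-calculation (reported values from the abstract; the derived baselines are inferences, not reported numbers) puts the YOLOv8n baseline at roughly 131.6 FPS and 8.1 GFLOPs:

```python
# Back out the unstated baseline figures from the reported relative changes.
fps_ssnd = 166.7            # reported inference speed of YOLOv8n-SSND
fps_gain = 0.267            # "26.7% increase over the YOLOv8n baseline"
baseline_fps = fps_ssnd / (1 + fps_gain)

gflops_ssnd = 6.8           # reported computation cost of YOLOv8n-SSND
gflops_cut = 0.160          # "16.0% reduction compared with the baseline"
baseline_gflops = gflops_ssnd / (1 - gflops_cut)

print(round(baseline_fps, 1), round(baseline_gflops, 1))  # → 131.6 8.1
```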
[Conclusions] The YOLOv8n-SSND model effectively balances detection accuracy, inference speed, and computational cost, making it well-suited for real-time UAV-based agricultural monitoring. The experimental outcomes confirm that the model not only provides high-precision detection of quinoa panicles but also offers superior inference efficiency with minimal computational resources. These characteristics make it a promising solution for UAV-deployed intelligent agricultural systems, where power and processing capacity are limited. Furthermore, the proposed method provides a technical foundation for large-scale, automated monitoring of quinoa growth, enabling accurate yield estimation, phenotypic analysis, and precision crop management. The results of this study demonstrate that lightweight deep learning architectures with adaptive attention mechanisms can achieve strong performance in complex agricultural detection tasks. This work contributes to the advancement of UAV-based intelligent sensing and establishes a reference framework for developing future lightweight object detection models applicable to various crop types and field conditions.

Key words: quinoa panicle, UAV, YOLOv8n, object detection, deformable attention

CLC number: