
Smart Agriculture ›› 2024, Vol. 6 ›› Issue (6): 121-131. DOI: 10.12133/j.smartag.SA202407008

• Topic: Agricultural Knowledge Intelligent Service and Smart Unmanned Farms (Part 1) •


Grape Recognition and Localization Method Based on 3C-YOLOv8n and Depth Camera

LIU Chang, SUN Yu, YANG Jing, WANG Fengchao, CHEN Jin

  1. College of Sciences, Shanghai Institute of Technology, Shanghai 201418, China
  • Received: 2024-07-09 Online: 2024-11-30
  • Foundation items: Shanghai Sailing Program, China (20YF1447600); Research Start-up Project of Shanghai Institute of Technology (YJ2021-60); Collaborative Innovation Project of Shanghai Institute of Technology (XTCX2023-22); Science and Technology Talent Development Fund for Young and Middle-aged Teachers at Shanghai Institute of Technology (ZQ2022-6)
  • About author:
    LIU Chang, research interests: machine vision. E-mail:
  • Corresponding authors:
    CHEN Jin, Ph.D., Associate Professor, research interests: machine vision and embedded system development. E-mail:
    WANG Fengchao, Ph.D., Associate Professor, research interests: optoelectronic application system development. E-mail:


Abstract:

[Objective] Grape picking is a key link in increasing grape production, but it still demands large amounts of manpower and material resources, which makes the picking process complex and slow. To improve harvesting efficiency and enable automated grape harvesting, an improved YOLOv8n object detection model named 3C-YOLOv8n was proposed and combined with the RealSense D415 depth camera for grape recognition and localization. [Methods] The proposed 3C-YOLOv8n inserted a convolutional block attention module (CBAM) between the first C2f module and the third Conv module of the backbone network, and embedded a coordinate attention (CA) module at the end of the backbone, yielding a new 2C-C2f backbone architecture. This design enabled the model to sequentially infer attention maps along two independent dimensions (channel and spatial) and to refine features using both inter-channel relationships and positional information, while keeping the network flexible and lightweight. Furthermore, the content-aware reassembly of features (CARAFE) upsampling operator, which reassembles neighboring pixels with instance-specific, content-aware kernels, replaced the nearest-neighbor interpolation operator in the YOLOv8n neck network. This enlarged the receptive field and guided the reconstruction process from the input features while keeping the parameter count and computational complexity low, forming the 3C-YOLOv8n model. For localization, the pyrealsense2 library was used to obtain pixel position information for the target area from the Intel RealSense D415 camera. The depth camera captured the images, the detection algorithm located the grapes, and the camera's depth sensor provided the three-dimensional point cloud of the grapes, from which the distance from a pixel to the camera was calculated and the three-dimensional coordinates of the center of the target's bounding box in the camera coordinate system were determined, thereby achieving grape recognition and localization.
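The localization step above maps a detected pixel to camera-frame coordinates. Below is a minimal sketch of that step using the pyrealsense2 API; the stream settings and the helper name locate_grape are illustrative assumptions rather than the authors' exact configuration, and the bounding-box centre (u, v) is assumed to come from the 3C-YOLOv8n detector.

```python
import pyrealsense2 as rs

# Illustrative sketch: resolution and frame rate are assumptions, not the paper's settings.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)  # align depth pixels to the color image seen by the detector

def locate_grape(u, v):
    """Hypothetical helper: deproject the bbox centre (u, v) to camera-frame XYZ in metres."""
    frames = align.process(pipeline.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    depth_m = depth_frame.get_distance(u, v)  # distance from the camera to the grape pixel
    intrin = depth_frame.profile.as_video_stream_profile().intrinsics
    # Combine the 2-D pixel coordinates with depth to obtain 3-D camera coordinates
    return rs.rs2_deproject_pixel_to_point(intrin, [u, v], depth_m)
```

In practice the detector's bounding-box centre would be passed in after each inference, and the returned (X, Y, Z) point would be transformed into the robot or world frame for the picking arm.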
[Results and Discussions] In comparative and ablation experiments, the 3C-YOLOv8n model achieved a mean average precision (mAP) of 94.3% at an intersection-over-union threshold of 0.5 (IoU=0.5), 1% higher than the YOLOv8n model. Precision (P) and recall (R) were 91.6% and 86.4%, increases of 0.1% and 0.7%, respectively, and the F1-Score improved by 0.4%, showing that the improved network met the experimental precision and recall requirements. In terms of loss, the 3C-YOLOv8n algorithm performed best: its loss values dropped rapidly with minimal fluctuation and converged to the lowest value, indicating that the improved algorithm reached convergence quickly and improved both accuracy and convergence speed. The ablation experiments showed that the original YOLOv8n model yielded a mAP of 93.3%; integrating the CBAM or CA attention mechanism into the backbone each raised the mAP to 93.5%, and adding the CARAFE upsampling operator to the neck raised it by 0.5% to 93.8%. Combinations of the three improvement strategies increased the mAP by 0.3%, 0.7%, and 0.8%, respectively, over the YOLOv8n model. Overall, the 3C-YOLOv8n model demonstrated the best detection performance, achieving the highest mAP of 94.3%, and the ablation results confirmed the positive impact of the proposed improvement strategies. Compared with other mainstream YOLO-series algorithms, all evaluation metrics improved, and 3C-YOLOv8n had the lowest missed-detection and false-detection rates among all tested algorithms, underscoring its practical advantage in detection tasks. [Conclusions] By addressing the inefficiency of manual labor, the 3C-YOLOv8n network model not only improves the precision of grape recognition and localization but also markedly improves overall harvesting efficiency. Its superior precision, recall, mAP, and F1-Score, together with the lowest recorded loss values among the YOLO-series algorithms, indicate a clear advance in model convergence and operational effectiveness. Furthermore, the model's high accuracy in grape target recognition lays the groundwork for automated harvesting systems and enables complementary intelligent operations.
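As a quick consistency check on the metrics cited above, the F1-Score follows directly from the reported precision and recall under the standard definition (the harmonic mean of the two); the values below are taken from the abstract:

```python
# F1 is the harmonic mean of precision (P) and recall (R); values from the abstract.
P, R = 0.916, 0.864
f1 = 2 * P * R / (P + R)
print(f"F1-Score = {f1:.3f}")  # prints F1-Score = 0.889
```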

Key words: machine vision, YOLOv8n, object detection, grape, CBAM, depth camera

CLC number: