
Smart Agriculture ›› 2022, Vol. 4 ›› Issue (4): 84-104.doi: 10.12133/j.smartag.SA202210004


Detection of On-Tree Peaches Based on Improved YOLOv5s and Multi-Modal Images


  1. College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, Anhui, China
    2. Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, Anhui, China
    3. Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, Anhui, China
  • Received: 2022-10-30 Online: 2022-12-30

Multi-Class on-Tree Peach Detection Using Improved YOLOv5s and Multi-Modal Images

LUO Qing1,2,3(), RAO Yuan1,2,3(), JIN Xiu1,2,3, JIANG Zhaohui1,2,3, WANG Tan1,2,3, WANG Fengyi1,2,3, ZHANG Wu1,2,3   

  1. College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China
    2.Key Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Hefei 230036, China
    3.Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment, Hefei 230036, China
  • Received: 2022-10-30 Online: 2022-12-30
  • About author:LUO Qing (1997-), male, graduate student, research interest: smart agriculture. E-mail: tsing.omg@gmail.com
  • Supported by:
    The Anhui Provincial Key Laboratory of Smart Agricultural Technology and Equipment(APKLSATE2021X004);The International Cooperation Project of Ministry of Agriculture and Rural Affairs(125A0607);The Key Research and Development Plan of Anhui Province(201904a06020056);The Natural Science Major Project for Anhui Provincial University(2022AH040125);The Natural Science Foundation of Anhui Province, China(2008085MF203)

Abstract:

Accurate detection of fruits such as peaches is a prerequisite for mechanized, intelligent agronomic management. However, detecting peaches in orchards, especially bagged peaches, has long been challenging due to uneven illumination and severe occlusion. This study proposed an accurate multi-class peach detection method oriented toward mechanical harvesting, based on an improved YOLOv5s and multi-modal visual data. Specifically, an RGB-D dataset of naked and bagged peaches with multi-class labels was constructed, comprising 4127 sets of pixel-aligned color, depth, and infrared images acquired by a consumer-grade RGB-D camera. Subsequently, an improved lightweight YOLOv5s (small depth) model was proposed by introducing a direction-aware and position-sensitive attention mechanism, which captures long-range dependencies along one spatial direction while preserving precise positional information along the other, improving peach detection accuracy. Meanwhile, by decomposing the convolution operation into a convolution in the depth direction and convolutions along the width and height directions, depthwise separable convolution was used to reduce the model's computation and its training and inference time while maintaining detection accuracy. Experimental results showed that, under complex illumination and severe occlusion, the improved YOLOv5s model using multi-modal visual data achieved a mean average precision (mAP) of 98.6% and 88.9% on naked and bagged peaches, respectively, 5.3% and 16.5% higher than using RGB images alone, and 2.8% and 6.2% higher than YOLOv5s. For bagged peach detection, the mAP of the improved YOLOv5s was 16.3%, 8.1%, and 4.5% higher than that of YOLOX-Nano, PP-YOLO-Tiny, and EfficientDet-D0, respectively. In addition, both the multi-modal images and the improvements to YOLOv5s contributed to the accurate detection of naked and bagged peaches in natural orchards, and the proposed improved YOLOv5s model also outperformed conventional methods in detecting Fuji apples and kiwifruits in public datasets, verifying its good generalization ability. Finally, on a mainstream mobile hardware platform, the improved YOLOv5s model achieved a detection speed of 19 images per second with five-channel multi-modal images, enabling real-time peach detection. These results demonstrate the application potential of the improved YOLOv5s network and multi-modal visual data with multi-class labels for realizing visual intelligence in automated fruit harvesting systems.
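The direction-aware, position-sensitive attention described above factorizes global pooling into two 1-D pooling operations, one per spatial axis. The following is a minimal NumPy sketch of only that directional pooling step, assuming a C×H×W feature array; a full coordinate-style attention block would additionally apply shared 1×1 convolutions and sigmoid gating, which are omitted here:

```python
import numpy as np

def coordinate_pool(feature):
    """Direction-aware pooling: average along one spatial axis at a
    time, so long-range dependencies along one direction are captured
    while positional information along the other is preserved.
    feature: array of shape (C, H, W)."""
    pool_h = feature.mean(axis=2)  # (C, H): one value per row
    pool_w = feature.mean(axis=1)  # (C, W): one value per column
    return pool_h, pool_w

# Toy feature map with 2 channels, height 4, width 3.
x = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
h_vec, w_vec = coordinate_pool(x)
print(h_vec.shape, w_vec.shape)  # (2, 4) (2, 3)
```

Because each pooled vector keeps its full length along one axis, the attention weights derived from it can still localize targets along that axis, which is what helps with heavily occluded fruit.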

Keywords: multi-class detection, YOLOv5s, multi-modal visual data, mechanical harvesting, deep learning

Abstract:

Accurate peach detection is a prerequisite for automated agronomic management, e.g., peach mechanical harvesting. However, due to uneven illumination and ubiquitous occlusion, it is challenging to detect peaches, especially when they are bagged in orchards. To this end, an accurate multi-class peach detection method was proposed in this paper by improving YOLOv5s and using multi-modal visual data for mechanical harvesting. An RGB-D dataset with multi-class annotations of naked and bagged peaches was constructed, including 4127 multi-modal image sets of pixel-aligned color, depth, and infrared images acquired with a consumer-level RGB-D camera. Subsequently, an improved lightweight YOLOv5s (small depth) model was put forward by introducing a direction-aware and position-sensitive attention mechanism, which could capture long-range dependencies along one spatial direction and preserve precise positional information along the other, helping the network accurately detect peach targets. Meanwhile, depthwise separable convolution was employed to reduce the model computation by decomposing the convolution operation into a convolution in the depth direction and convolutions in the width and height directions, which helped to speed up the training and inference of the network while maintaining accuracy. Comparison experiments demonstrated that the improved YOLOv5s using multi-modal visual data, with 5.05 M model parameters, recorded a detection mAP of 98.6% and 88.9% on naked and bagged peaches in complex illumination and severe occlusion environments, 5.3% and 16.5% higher than using RGB images alone, and 2.8% and 6.2% higher than YOLOv5s. Compared with other networks in detecting bagged peaches, the improved YOLOv5s performed best in terms of mAP, which was 16.3%, 8.1% and 4.5% higher than YOLOX-Nano, PP-YOLO-Tiny, and EfficientDet-D0, respectively. In addition, the proposed improved YOLOv5s model offered better results than other methods in detecting Fuji apples and Hayward kiwifruits, verifying its effectiveness on different fruit detection tasks. Further investigation revealed the contribution of each imaging modality, as well as of the proposed improvements to YOLOv5s, to the favorable detection of both naked and bagged peaches in natural orchards. Additionally, on a popular mobile hardware platform, it was found that the improved YOLOv5s model could perform 19 detections per second with the considered five-channel multi-modal images, offering real-time peach detection. These promising results demonstrate the potential of the improved YOLOv5s and multi-modal visual data with multi-class annotations to achieve visual intelligence in automated fruit harvesting systems.
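The computational saving from depthwise separable convolution can be illustrated with a simple parameter count: a standard k×k convolution mixes all input and output channels at once, while the separable form uses a per-channel k×k depthwise convolution followed by a 1×1 pointwise convolution. A minimal sketch (the layer sizes below are illustrative, not taken from the paper's architecture):

```python
def conv_params(c_in, c_out, k):
    # Parameters of a standard k x k convolution (bias omitted).
    return c_in * c_out * k * k

def dws_conv_params(c_in, c_out, k):
    # Depthwise separable convolution: a k x k filter applied per
    # input channel, then a 1 x 1 pointwise convolution that mixes
    # channels.
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Example: a 3 x 3 layer with 128 input and 128 output channels.
std = conv_params(128, 128, 3)      # 147456 parameters
dws = dws_conv_params(128, 128, 3)  # 1152 + 16384 = 17536 parameters
print(round(std / dws, 1))          # roughly an 8.4x reduction
```

The same decomposition reduces multiply-accumulate operations proportionally, which is what makes the improved model practical on mobile hardware.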

Key words: multi-class detection, YOLOv5s, multi-modal visual data, mechanical harvesting, deep learning

CLC Number: