In Situ Identification Method of Maize Stalk Width Based on Binocular Vision and Improved YOLOv8

doi:10.12133/j.smartag.SA202309004

Abstract

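Summary:

[Objective/Significance] Maize stalk width is an important indicator of lodging resistance. Manual measurement of stalk width is cumbersome, and automatic acquisition devices suffer from large recognition errors, so an in-situ, high-precision method for detecting and identifying maize stalk width has significant practical value. [Methods] A ZED2i binocular camera was fixed in the field to capture left-eye and right-eye images of maize stalks in real time, and the raw images were augmented. YOLOv8 was used to detect the stalks, and detection accuracy was further improved by adding the coordinate attention (CA) module multiple times and by replacing the loss function with Efficient IoU Loss (EIoU). The stalks were then reconstructed in three dimensions to obtain the three-dimensional world coordinates of the boundary points of the recognition box, and stalk width was calculated with the distance formula. Finally, the improved YOLOv8 model was compared with the original YOLOv8, YOLOv7, YOLOv5, Faster RCNN, and SSD to verify its recognition accuracy and precision. [Results and Discussion] The precision P, recall R, mAP_0.5, and mAP_0.5:0.95 of the improved YOLOv8 model reached 96.8%, 94.1%, 96.6%, and 77.0%, respectively. For the in-situ width calculation, the linear regression coefficient of determination R², root mean square error RMSE, and mean absolute error MAE were 0.373, 0.265 cm, and 0.244 cm, respectively, which meets the accuracy required for stalk width measurement in practical production. [Conclusions] The proposed in-situ identification method based on the improved YOLOv8 model identifies maize stalks accurately in place, addresses the time and labor cost of manual measurement and the poor accuracy of machine vision recognition, and provides a theoretical basis for practical production applications.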

Abstract:

[Objective] The width of maize stalks is an important indicator of the lodging resistance of maize. Measuring stalk width remains problematic: manual collection is cumbersome, and automatic equipment shows large errors in collection and recognition. A method for in-situ detection and high-precision identification of maize stalk width therefore has great application value. [Methods] A ZED2i binocular camera was fixed in the field to capture real-time left and right images of maize stalks. The image acquisition system was based on the NVIDIA Jetson TX2 NX development board, which was programmed to photograph the maize from both views at timed intervals. Original maize images were collected and a dataset was established. To expose more features of the target area and improve the generalization ability of the trained model, the original images were processed with five methods (saturation, brightness, contrast, sharpness, and horizontal flipping), which expanded the dataset to 3500 images. YOLOv8 was used as the baseline model for identifying maize stalks against a complex background. The coordinate attention (CA) mechanism can bring large gains to downstream tasks on top of lightweight networks: the attention block captures long-distance relationships along one direction while retaining spatial information along the other, so position information is preserved in the generated attention map, which helps the network focus on the area of interest and locate the target more accurately. The CA module was added multiple times: it was fused with the C2f module in the original Backbone by replacing the Bottleneck in the original C2f module with the CA module, which yielded the redesigned C2fCA network module.
The loss function was replaced with Efficient IoU Loss (EIoU), which splits the aspect-ratio loss term into separate width and height difference terms, each scaled by the corresponding side of the minimum enclosing box. This accelerated the convergence of the prediction box, improved its regression accuracy, and further improved the recognition accuracy of maize stalks. The binocular camera was then calibrated so that the left and right cameras were on the same three-dimensional plane. Next, the maize stalks were reconstructed in three dimensions, and the recognition frames from the left and right cameras were matched algorithmically. First, the algorithm checked whether the number of recognition frames detected in the two images was equal; if not, the binocular image was re-imported. If the numbers were equal, the coordinate information and the width and height of the bounding boxes in the left and right images were compared to determine whether their difference was less than a given threshold T_a. If the difference was greater than T_a, the image was re-imported; if it was less than T_a, the confidence level of the recognition frames was checked against a given threshold T_b. If it was greater than T_b, the image was re-imported; if it was less than T_b, the recognition frames were taken to show the same maize plant in the left and right images. Once these conditions were met, corresponding-point matching in the binocular image was complete. After three-dimensional reconstruction of the binocular image, the three-dimensional coordinates (A_x, A_y, A_z) and (B_x, B_y, B_z) of the upper-left and upper-right corners of the recognition box in the world coordinate system were obtained, and the distance between the two points was taken as the width of the maize stalk.
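The matching and width steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Box` type, the pairing of boxes in left-to-right order, the exclusion of the x coordinate from the T_a check (it differs by the stereo disparity), and the reading of T_b as a bound on the confidence difference between the two boxes are all hypothetical choices, and the sample values are made up.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Recognition box in pixel coordinates with its detection confidence."""
    x: float     # top-left corner, horizontal
    y: float     # top-left corner, vertical
    w: float     # box width
    h: float     # box height
    conf: float  # detector confidence in [0, 1]


def match_boxes(left, right, t_a, t_b):
    """Pair left/right boxes; return None when the image pair should be re-imported."""
    # Step 1: both images must contain the same number of recognition boxes.
    if len(left) != len(right):
        return None
    # Assumption: boxes are paired in left-to-right order within each image.
    pairs = list(zip(sorted(left, key=lambda b: b.x),
                     sorted(right, key=lambda b: b.x)))
    for l_box, r_box in pairs:
        # Step 2: position and size must differ by less than T_a.
        # Assumption: x is skipped because it differs by the stereo disparity.
        geometry_diff = max(abs(l_box.y - r_box.y),
                            abs(l_box.w - r_box.w),
                            abs(l_box.h - r_box.h))
        if geometry_diff >= t_a:
            return None
        # Step 3 (assumed interpretation): confidences must differ by less than T_b.
        if abs(l_box.conf - r_box.conf) >= t_b:
            return None
    return pairs


def stalk_width(a, b):
    """Euclidean distance between corner points A and B in world coordinates."""
    return math.dist(a, b)


# Made-up sample data: one stalk seen by both cameras.
left = [Box(310, 200, 42, 180, 0.93)]
right = [Box(268, 201, 41, 179, 0.91)]
pairs = match_boxes(left, right, t_a=5.0, t_b=0.1)

# Made-up world coordinates (cm) of the upper-left and upper-right corners.
width_cm = stalk_width((1.0, 2.0, 50.0), (3.3, 2.0, 50.0))
```

Returning `None` stands in for "re-import the binocular image"; a real pipeline would loop back to acquisition at that point.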
Finally, the improved YOLOv8 model was compared with the original YOLOv8 model, YOLOv7, YOLOv5, faster region convolutional neural networks (Faster R-CNN), and the single shot multibox detector (SSD) to verify the recognition accuracy and precision of the model. [Results and Discussions] The precision (P), recall (R), mean average precision mAP_0.5, and mean average precision mAP_0.5:0.95 of the improved YOLOv8 model reached 96.8%, 94.1%, 96.6%, and 77.0%, respectively. These values were higher than those of YOLOv7 by 1.3, 1.3, 1.0, and 11.6 percentage points; higher than those of YOLOv5 by 1.8, 2.1, 1.2, and 15.8 percentage points; higher than those of Faster R-CNN by 31.1, 40.3, 46.2, and 37.6 percentage points; and higher than those of SSD by 20.6, 23.8, 20.9, and 20.1 percentage points, respectively. For the width measurement, the linear regression coefficient of determination R², root mean square error RMSE, and mean absolute error MAE were 0.373, 0.265 cm, and 0.244 cm, respectively. The proposed method can meet the requirements of actual production for the measurement accuracy of maize stalk width. [Conclusions] The in-situ recognition method of maize stalk width based on the improved YOLOv8 model achieves accurate in-situ identification of maize stalks, addresses the problems of time-consuming and laborious manual measurement and poor machine vision recognition accuracy, and provides a theoretical basis for practical production applications.

Key words: YOLOv8, attention mechanism, binocular vision, maize stalk width detection, three-dimensional reconstruction
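To make the EIoU term mentioned in the abstract concrete, the following is a minimal pure-Python sketch of the loss for a single pair of boxes, following the commonly published EIoU definition (IoU term, center-distance term, and separate width and height terms scaled by the minimum enclosing box). The function name, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration; the abstract does not specify how the loss is wired into the model's training code.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the minimum enclosing box.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, scaled by the squared enclosing-box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each scaled by the enclosing-box side.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


# Identical boxes give a loss near zero; shifted boxes are penalized.
same = eiou_loss((0, 0, 2, 2), (0, 0, 2, 2))
shifted = eiou_loss((0, 0, 2, 2), (1, 1, 3, 3))
```

Because width and height are penalized directly rather than through a combined aspect-ratio term, a box whose shape is wrong in only one dimension receives a gradient for that dimension alone.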

ZUO Haoxuan, HUANG Qicheng, YANG Jiahao, MENG Fanjia, LI Sien, LI Li. In Situ Identification Method of Maize Stalk Width Based on Binocular Vision and Improved YOLOv8[J]. Smart Agriculture, 2023, 5(3): 86-95.

Model	CA	EIoU	P/%	R/%	mAP_0.5/%	mAP_0.5:0.95/%	FPS
YOLOv8	×	×	94.7	92.5	94.4	62.6	69.0
YOLOv8	√	×	96.2	93.5	96.1	70.5	57.0
YOLOv8	×	√	95.3	92.5	95.9	68.8	58.0
YOLOv8	√	√	96.8	94.1	96.6	77.0	56.0

Maize plant No.	True value/cm	Detected value/cm	Deviation/cm
1	2.34	2.25	0.09
2	2.82	2.36	0.46
3	2.47	2.30	0.17
4	2.32	2.29	0.03
5	2.56	2.28	0.28
6	2.15	2.32	-0.17
7	2.23	2.30	-0.07
8	1.93	2.25	-0.32
9	2.54	2.33	0.21
10	2.02	2.29	-0.27
R²	0.373
RMSE/cm	0.265
MAE/cm	0.244
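For readers who want to check how the deviation column and the error metrics relate, the sketch below recomputes them from the ten listed plants using the standard definitions (deviation = true value minus detected value, RMSE = square root of the mean squared deviation, MAE = mean absolute deviation). The variable and function names are illustrative. Run on these ten rows alone, it gives roughly 0.241 cm for RMSE and 0.207 cm for MAE rather than the reported 0.265 cm and 0.244 cm; the reported figures may have been computed on a different or larger sample, which is an assumption here and not something the table states.

```python
import math

# Values copied from the ten rows of the table above (cm).
true_cm = [2.34, 2.82, 2.47, 2.32, 2.56, 2.15, 2.23, 1.93, 2.54, 2.02]
detected_cm = [2.25, 2.36, 2.30, 2.29, 2.28, 2.32, 2.30, 2.25, 2.33, 2.29]

# Deviation as tabulated: true value minus detected value, to two decimals.
deviations = [round(t - d, 2) for t, d in zip(true_cm, detected_cm)]


def rmse(errors):
    """Root mean square error of a list of deviations."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error of a list of deviations."""
    return sum(abs(e) for e in errors) / len(errors)


rmse_cm = rmse(deviations)
mae_cm = mae(deviations)
print(f"RMSE = {rmse_cm:.3f} cm, MAE = {mae_cm:.3f} cm")
```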

Algorithm	P/%	R/%	mAP_0.5/%	mAP_0.5:0.95/%
YOLOv8 (improved)	96.8	94.1	96.6	77.0
YOLOv7	95.5	92.8	95.6	65.4
YOLOv5	95.0	92.0	95.4	61.2
Faster R-CNN	65.7	53.8	50.4	39.4
SSD	76.2	70.3	75.7	56.9
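The improvements quoted in the abstract are differences in percentage points between the first row of this table and each comparison model. A short sketch, with the table values typed in by hand and a hypothetical `gains` helper, reproduces them:

```python
# Metrics from the comparison table: (P, R, mAP_0.5, mAP_0.5:0.95), in percent.
results = {
    "improved YOLOv8": (96.8, 94.1, 96.6, 77.0),
    "YOLOv7": (95.5, 92.8, 95.6, 65.4),
    "YOLOv5": (95.0, 92.0, 95.4, 61.2),
    "Faster R-CNN": (65.7, 53.8, 50.4, 39.4),
    "SSD": (76.2, 70.3, 75.7, 56.9),
}


def gains(baseline, best="improved YOLOv8"):
    """Percentage-point gain of `best` over `baseline` for each metric."""
    return tuple(round(b - a, 1) for b, a in zip(results[best], results[baseline]))


for name in results:
    if name != "improved YOLOv8":
        print(name, gains(name))
```

The differences are absolute (percentage points), not relative percentages: 77.0% against 65.4% is a gain of 11.6 points, which would be about 17.7% in relative terms.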
