
Smart Agriculture ›› 2023, Vol. 5 ›› Issue (3): 86-95. doi: 10.12133/j.smartag.SA202309004

• Special Issue: Crop Information Monitoring Technology •

In Situ Identification Method of Maize Stalk Width Based on Binocular Vision and Improved YOLOv8

ZUO Haoxuan1, HUANG Qicheng1, YANG Jiahao2, MENG Fanjia2, LI Sien3, LI Li1

  1. Key Laboratory of Agricultural Information Acquisition Technology, Ministry of Agriculture and Rural Affairs, China Agricultural University, Beijing 100083, China
  2. Key Laboratory of Smart Agriculture System Integration, Ministry of Education, China Agricultural University, Beijing 100083, China
  3. College of Water Resources and Civil Engineering, China Agricultural University, Beijing 100083, China
  • Received: 2023-09-01 Online: 2023-09-30
  • Supported by:
    National Key Research and Development Program of China (2022YFD1900801)
  • About the author:
    ZUO Haoxuan, whose research focuses on precision agriculture system integration.
  • Corresponding author:
    LI Li, associate professor and doctoral supervisor, whose research focuses on smart agriculture system integration and agricultural information acquisition technology.

Abstract:

[Objective] Maize stalk width is an important indicator of lodging resistance. Manual measurement is cumbersome, and automated equipment still produces large recognition errors, so a method for in situ detection and high-precision identification of maize stalk width has considerable practical value. [Methods] A ZED2i binocular camera was fixed in the field to capture real-time left-camera and right-camera images of maize stalks. The image acquisition system was built on an NVIDIA Jetson TX2 NX development board, which was programmed to capture both views at timed intervals. Original maize images were collected to establish a dataset. To expose more features of the target area and improve the generalization ability of the trained model, the original images were augmented with five operations (saturation, brightness, contrast, and sharpness adjustment, plus horizontal flipping), which expanded the dataset to 3500 images. YOLOv8 served as the baseline model for identifying maize stalks against a complex background. The coordinate attention (CA) mechanism suits lightweight networks: the attention block captures long-range dependencies along one spatial direction while retaining position information along the other, so the generated attention maps preserve position information, focus on the region of interest, and help the network locate the target more accurately. The CA module was added at multiple points: it was fused with the C2f module in the original Backbone, replacing the Bottleneck inside the original C2f module, to form a redesigned C2fCA network module.
Replacing the loss function with Efficient IoU Loss (EIoU) splits the aspect-ratio loss term into separate width and height terms, namely the differences between the predicted and target widths and heights, each normalized by the width or height of the minimum enclosing box. This accelerated the convergence of the prediction box, improved its regression accuracy, and further improved the recognition accuracy of maize stalks. The binocular camera was then calibrated so that the left and right cameras lay in the same plane. Next, the maize stalks were reconstructed in three dimensions, and the recognition boxes from the left and right cameras were matched by the following algorithm. First, the numbers of recognition boxes detected in the two images were compared; if they differed, a new binocular image pair was acquired. If they were equal, the coordinates, widths, and heights of the bounding boxes in the left and right images were compared to determine whether their differences were less than a given threshold Ta. If a difference exceeded Ta, the image pair was re-imported; if it was less than Ta, the confidence of the recognition boxes was then tested against a given threshold Tb. If it exceeded Tb, the image pair was re-imported; if it was less than Tb, the recognition boxes were taken to mark the same maize plant in the left and right images. When all of these conditions were met, corresponding-point matching in the binocular image was complete. After three-dimensional reconstruction of the binocular image, the three-dimensional coordinates (Ax, Ay, Az) and (Bx, By, Bz) of the upper-left and upper-right corners of the recognition box in the world coordinate system were obtained, and the distance between the two points was taken as the width of the maize stalk.
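
The matching rules and the width calculation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the `Box` class and all function names are hypothetical, the geometric test is assumed to compare the vertical position, width, and height of each box pair against Ta (the horizontal position is skipped because it shifts with stereo disparity), and the Tb test is read here as a bound on the difference between the left and right confidence scores.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """One detection: top-left corner (x, y), size (w, h), confidence score."""
    x: float
    y: float
    w: float
    h: float
    conf: float


def boxes_match(left, right, ta, tb):
    """Return True if two boxes plausibly frame the same stalk.

    Assumptions (not from the source): y, w and h differences are each
    compared with ta, and tb bounds the left/right confidence difference.
    """
    geometry_ok = all(
        abs(a - b) < ta
        for a, b in ((left.y, right.y), (left.w, right.w), (left.h, right.h))
    )
    return geometry_ok and abs(left.conf - right.conf) < tb


def match_pairs(left_boxes, right_boxes, ta, tb):
    """Pair detections in left-to-right order.

    Returns a list of (left, right) pairs, or None when the image pair
    fails a check and should be acquired again.
    """
    if len(left_boxes) != len(right_boxes):
        return None
    left_sorted = sorted(left_boxes, key=lambda b: b.x)
    right_sorted = sorted(right_boxes, key=lambda b: b.x)
    pairs = list(zip(left_sorted, right_sorted))
    if all(boxes_match(lb, rb, ta, tb) for lb, rb in pairs):
        return pairs
    return None


def stalk_width(a, b):
    """Euclidean distance between 3D points A and B (same unit as the inputs)."""
    return math.dist(a, b)
```

With world coordinates in metres, `stalk_width((Ax, Ay, Az), (Bx, By, Bz))` returns the width in metres; the thresholds `ta` and `tb` would have to be tuned for the camera setup.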
Finally, a comparative analysis was conducted among the improved YOLOv8 model, the original YOLOv8 model, YOLOv7, YOLOv5, faster region-based convolutional neural networks (Faster R-CNN), and the single shot multibox detector (SSD) to verify the recognition accuracy and precision of the model. [Results and Discussions] The precision (P), recall (R), mean average precision mAP0.5, and mean average precision mAP0.5:0.95 of the improved YOLOv8 model reached 96.8%, 94.1%, 96.6%, and 77.0%, respectively. These values were 1.3%, 1.3%, 1.0%, and 11.6% higher than those of YOLOv7; 1.8%, 2.1%, 1.2%, and 15.8% higher than those of YOLOv5; 31.1%, 40.3%, 46.2%, and 37.6% higher than those of Faster R-CNN; and 20.6%, 23.8%, 20.9%, and 20.1% higher than those of SSD. For the in situ stalk width calculation, the linear regression coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) were 0.373, 0.265 cm, and 0.244 cm, respectively. The proposed method can meet the requirements of actual production for the measurement accuracy of maize stalk width. [Conclusions] The in situ recognition method for maize stalk width based on the improved YOLOv8 model can accurately identify maize stalks in situ. It addresses the time-consuming, laborious nature of manual measurement and the poor accuracy of machine vision recognition, and it provides a theoretical basis for practical production applications.
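
The three error measures reported for the width calculation can be computed as in the sketch below. It is a minimal illustration, assuming that R² refers to an ordinary least-squares line fitted between the computed and manually measured widths (for which R² equals the squared Pearson correlation); the function names are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between paired measurements."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(measured, predicted):
    """Mean absolute error between paired measurements."""
    absolute = [abs(m - p) for m, p in zip(measured, predicted)]
    return sum(absolute) / len(absolute)


def r_squared_linear_fit(measured, predicted):
    """R^2 of a least-squares line relating predicted to measured values.

    For a straight-line fit with an intercept, this equals the squared
    Pearson correlation coefficient of the two series.
    """
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_m * var_p)
```

RMSE and MAE carry the unit of the inputs (cm in the abstract), whereas R² is dimensionless.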

Key words: YOLOv8, attention mechanism, binocular vision, maize stalk width detection, three-dimensional reconstruction
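
The EIoU loss named in the Methods can be illustrated for a single pair of boxes. The sketch below follows the commonly published form of EIoU (an IoU term, a center-distance term, and separate width and height terms, each normalized by the minimum enclosing box); it is not taken from the authors' code. The function name, the `(x1, y1, x2, y2)` corner format, and the assumption that both boxes have positive area are illustrative choices.

```python
def eiou_loss(pred, target):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the squared enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)

    # Separate width and height terms take the place of an aspect-ratio penalty.
    width_term = (pw - tw) ** 2 / (cw * cw)
    height_term = (ph - th) ** 2 / (ch * ch)

    return 1.0 - iou + center_term + width_term + height_term
```

The loss is zero for identical boxes and grows as the overlap shrinks, the centers drift apart, or the widths and heights diverge.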