
Smart Agriculture, 2024, Vol. 6, Issue (6): 109-120. doi: 10.12133/j.smartag.SA202407007

• Special Topic: Agricultural Knowledge Intelligent Services and Smart Unmanned Farms (Part 1) •

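Image Segmentation Method for Chinese Yam Leaves in Complex Backgrounds Based on Improved ENet

LU Bibo1, LIANG Di1, YANG Jie2, SONG Aiqing2, HUANGFU Shangwei2

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, Henan, China
  2. Institute of Characteristic Agriculture, Jiaozuo Academy of Agriculture and Forestry Sciences, Jiaozuo 454150, Henan, China
  • Received: 2024-07-05 Published: 2024-11-30
  • Funding:
    General Program of the National Natural Science Foundation of China (42272178); 2024 Key Scientific Research Project of Colleges and Universities in Henan Province (24B520013); 2022 Henan Provincial Key R&D and Promotion Special Project (Science and Technology Research) (222102210131); Henan Polytechnic University Fundamental Research Funds Special Project (Natural Sciences) (NSFRF240508)
  • About the author:
    LU Bibo, research interests: image processing and artificial intelligence. E-mail:
  • Corresponding author:
    YANG Jie, master's degree, agronomist, works on research into and promotion of breeding and cultivation techniques for specialty crops such as Chinese yam and Rehmannia. E-mail: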

Image Segmentation Method of Chinese Yam Leaves in Complex Background Based on Improved ENet

LU Bibo1, LIANG Di1, YANG Jie2, SONG Aiqing2, HUANGFU Shangwei2

  1. School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454003, China
    2. Institute of Characteristic Agriculture, Jiaozuo Academy of Agriculture and Forestry Sciences, Jiaozuo 454150, China
  • Received: 2024-07-05 Online: 2024-11-30
  • Foundation items: National Natural Science Foundation of China (42272178); 2024 Key Scientific Research Project of Colleges and Universities in Henan Province (24B520013); 2022 Henan Provincial Key R&D and Promotion Special Project (222102210131); Henan Polytechnic University Fundamental Research Funds Special Project (Natural Sciences) (NSFRF240508)
  • About author:
    LU Bibo, E-mail:
  • Corresponding author:
    YANG Jie, E-mail:

Abstract (translated from the Chinese):

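[Objective/Significance] Crop leaf area is an important indicator of photosynthetic efficiency and growth status. Building a variety-rich Chinese yam image dataset and proposing a deep learning-based segmentation method for Chinese yam leaf images enables real-time measurement of leaf area and addresses the low efficiency of traditional measurement. [Methods] A lightweight segmentation network based on an improved ENet was proposed. Starting from ENet, the third stage was pruned to reduce redundant computation. The conventional convolutions inside the bottleneck structures were replaced with PConv to form P-Bottleneck, which reduced the number of parameters and sped up inference. The transposed convolution in the upsampling module was replaced with bilinear interpolation, which improved segmentation accuracy and reduced the number of parameters. Finally, a CA attention module was added in the encoding stage to strengthen the extraction of semantic features at leaf edges. Training used the Adam optimizer, which adaptively adjusts the learning rate from historical gradient information, accelerating convergence and improving generalization. [Results and Discussions] The improved model was evaluated on an indoor image dataset covering 40 Chinese yam varieties and on an outdoor dataset. The mean intersection over union (mIoU) and mean pixel accuracy (mPA) reached 98.61% and 99.32%, respectively. The number of parameters fell by 51%, the floating-point operations fell by 49%, and the network speed rose by 38%. Compared with the original model, the improved model markedly reduced parameters and floating-point operations while maintaining segmentation accuracy, ran faster, and used fewer resources, making it better suited to agricultural monitoring devices. [Conclusions] The improved algorithm segments Chinese yam leaves accurately and quickly, providing a reference for research on Chinese yam leaf area in complex backgrounds.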
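
The mIoU and mPA values reported in the abstract are commonly computed from a per-class confusion matrix. The following is a minimal sketch under that common definition. The abstract does not give the paper's exact formulas, and the function names and the toy masks are illustrative assumptions, not the authors' code or data.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] = number of pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN); empty classes are skipped."""
    ious = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                   # true class k, predicted otherwise
        fp = sum(row[k] for row in cm) - tp    # predicted k, true class differs
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(cm):
    """Mean over classes of TP / (pixels whose true class is k)."""
    accs = [cm[k][k] / sum(cm[k]) for k in range(len(cm)) if sum(cm[k])]
    return sum(accs) / len(accs)


# Made-up flattened masks: 0 = background, 1 = leaf.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 1]

cm = confusion_matrix(truth, pred, num_classes=2)
miou = mean_iou(cm)
mpa = mean_pixel_accuracy(cm)
print(f"mIoU = {miou:.4f}, mPA = {mpa:.4f}")
```

In this toy case one background pixel is mislabeled as leaf, which lowers the IoU of both classes but only the pixel accuracy of the background class.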

Keywords: Chinese yam, image segmentation, deep learning, ENet, partial convolution, CA attention mechanism

Abstract:

[Objective] Crop leaf area is an important indicator of photosynthetic efficiency and growth conditions. This paper established a diverse Chinese yam image dataset and proposed a deep learning-based method for Chinese yam leaf image segmentation. The method can be used for real-time measurement of Chinese yam leaf area, addressing the inefficiency of traditional measurement techniques. It provides more reliable data support for research on the genetic breeding, growth, and development of Chinese yam, and promotes the development of the Chinese yam industry. [Methods] A lightweight segmentation network based on an improved ENet was proposed. Firstly, the third stage of ENet was pruned to reduce redundant computation in the model. This improved computational efficiency and running speed and provided a good basis for real-time applications. Secondly, PConv replaced the conventional convolution in both the downsampling bottleneck structure and the conventional bottleneck structure, and the improved bottleneck structure was named P-Bottleneck. PConv applied conventional convolution to only a portion of the input channels and left the remaining channels unchanged, which reduced memory accesses and redundant computation and made spatial feature extraction more efficient. PConv thus reduced the model's computation while increasing the number of floating-point operations per second achieved on the hardware device, resulting in lower latency. Additionally, the transposed convolution in the upsampling module was replaced with bilinear interpolation to enhance model accuracy and reduce the number of parameters. Bilinear interpolation processed images more smoothly, making the processed images more realistic and clear. Finally, a coordinate attention (CA) module was added to the encoder to introduce the attention mechanism, and the resulting model was named CBPA-ENet.
The CA mechanism not only attended to channel information but also captured direction-aware and position-sensitive information. Position information was embedded into the channel attention to encode spatial information globally, capturing channel information along one spatial direction while retaining position information along the other. The network could therefore attend more effectively to important regions of the image, improving the quality and interpretability of the segmentation results. [Results and Discussions] Pruning the third stage reduced floating-point operations (FLOPs) by 28% and parameters by 41%, and raised the frame rate (FPS) by 9 f/s. Replacing the upsampling method with bilinear interpolation not only reduced FLOPs and parameters but also slightly improved segmentation accuracy, and raised FPS by 4 f/s. Using P-Bottleneck in place of the downsampling bottleneck structure and the conventional bottleneck structure lowered the mean intersection over union (mIoU) by only 0.04% while reducing FLOPs by 22% and parameters by 16% and raising FPS by 8 f/s. Adding the CA mechanism to the encoder added only a small number of FLOPs and parameters while improving the accuracy of the segmentation network. To verify the effectiveness of the improved segmentation algorithm, the classic semantic segmentation networks UNet, DeepLabV3+, and PSPNet and the real-time semantic segmentation networks LinkNet and DABNet were trained and validated for comparison. All six algorithms, these five and the improved one, achieved high segmentation accuracy. UNet had the best mIoU and mean pixel accuracy (mPA), but its model size was too large. The improved algorithm required only 1% of the FLOPs and 0.41% of the parameters of UNet, while its mIoU and mPA were essentially the same. Other classic semantic segmentation algorithms, such as DeepLabV3+, had accuracy similar to that of the improved algorithm, but their large model size and slow inference speed were not conducive to embedded development.
Although the real-time semantic segmentation algorithm LinkNet had a slightly higher mIoU, its FLOPs and parameter count were still far greater than those of the improved algorithm. Although the PSPNet model was relatively small, it was still much larger than the improved algorithm, and its mIoU and mPA were lower. The experimental results showed that the improved model achieved an mIoU of 98.61%. Compared with the original model, the number of parameters and the FLOPs decreased significantly: the number of model parameters decreased by 51%, the FLOPs decreased by 49%, and the network operation speed increased by 38%. [Conclusions] The improved algorithm can segment Chinese yam leaves accurately and quickly, providing not only a more accurate means of determining Chinese yam phenotype data but also a new approach for embedded applications in Chinese yam research. With the model, morphological feature data of Chinese yam leaves can be obtained more efficiently, providing a reliable foundation for further research and analysis.
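
The computation saving attributed to PConv in the Methods can be illustrated with simple arithmetic. The sketch below assumes a stride-1 convolution with "same" padding and no bias, and it counts multiply-accumulate operations (MACs). The partial ratio of 1/4 and the layer size are hypothetical values chosen for illustration, since the abstract does not state the settings used in P-Bottleneck.

```python
def conv_cost(height, width, kernel, c_in, c_out):
    """Return (MACs, parameters) of a stride-1, same-padded conv without bias."""
    params = kernel * kernel * c_in * c_out
    macs = height * width * params  # one kernel application per output position
    return macs, params


def pconv_cost(height, width, kernel, channels, ratio):
    """Cost of PConv: only the first `channels * ratio` channels are convolved."""
    c_part = int(channels * ratio)  # the remaining channels pass through unchanged
    return conv_cost(height, width, kernel, c_part, c_part)


# Hypothetical layer: 64 x 64 feature map, 3 x 3 kernel, 128 channels, ratio 1/4.
full_macs, full_params = conv_cost(64, 64, 3, 128, 128)
part_macs, part_params = pconv_cost(64, 64, 3, 128, 0.25)

print(f"conventional: {full_macs} MACs, {full_params} parameters")
print(f"PConv:        {part_macs} MACs, {part_params} parameters")
print(f"MAC ratio:    {part_macs / full_macs}")
```

With a ratio of 1/4, the convolved part costs 1/16 of the conventional layer, because the input and output channel counts of the convolution both shrink by the same factor.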

Key words: Chinese yam, image segmentation, deep learning, ENet, partial convolution, CA mechanism
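
The directional encoding that the abstract describes for the CA mechanism can be sketched on a single channel: the feature map is averaged along one spatial axis at a time, so each pooled vector keeps position information along the other axis. This is a simplified illustration, not the authors' module. The learned convolution and normalization layers of a real CA block are omitted and replaced here by a plain sigmoid, which is an assumption made to keep the example short, and all names and values are hypothetical.

```python
import math


def directional_pool(feature):
    """Average one H x W channel along the width and along the height."""
    height, width = len(feature), len(feature[0])
    pooled_h = [sum(row) / width for row in feature]  # one value per row
    pooled_w = [sum(feature[r][c] for r in range(height)) / height
                for c in range(width)]                # one value per column
    return pooled_h, pooled_w


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def reweight(feature, pooled_h, pooled_w):
    """Scale each pixel by a row gate and a column gate (stand-in for learned layers)."""
    gate_h = [sigmoid(v) for v in pooled_h]
    gate_w = [sigmoid(v) for v in pooled_w]
    return [[feature[r][c] * gate_h[r] * gate_w[c] for c in range(len(gate_w))]
            for r in range(len(gate_h))]


# Made-up 2 x 3 channel with activity concentrated in the second row.
feature = [
    [0.0, 0.0, 0.0],
    [0.0, 4.0, 2.0],
]
pooled_h, pooled_w = directional_pool(feature)
attended = reweight(feature, pooled_h, pooled_w)
print(pooled_h, pooled_w)
```

Because the two gates are indexed by row and by column separately, a pixel is emphasized only when both its row and its column carry strong responses.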

CLC number: