Identification Method of Kale Leaf Ball Based on Improved UperNet

Smart Agriculture, 2024, Vol. 6, Issue 3: 128-137. doi: 10.12133/j.smartag.SA202401020

ZHU Yiping^1,2, WU Huarui^1,2,3,4, GUO Wang^2,3,4, WU Xiaoyan^2

1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
2. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
4. Key Laboratory of Digital Village Technology, Ministry of Agriculture and Rural Affairs, Beijing 100097, China

Received: 2023-01-17; Published online: 2024-05-30
Foundation items: National Key Research and Development Program of the 14th Five-Year Plan (2022YFD1600602); Ministry of Finance and Ministry of Agriculture and Rural Affairs: Funding for the National Modern Agricultural Industry Technology System (CARS-23-D07)
About the author: ZHU Yiping, research interests: deep learning and computer vision. E-mail: 1052046559@qq.com
Corresponding author: WU Huarui, Ph.D., research fellow, research interests: agricultural intelligent systems and intelligent services for agricultural big data. E-mail: wuhr@nercita.org.cn

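Abstract

Summary:

[Objective/Significance] The leaf ball is an important part of kale, and its growth and development are crucial for field management. Leaf ball segmentation is hampered by complex field backgrounds, uneven illumination, and similar leaf textures. To address these problems, a semantic segmentation algorithm, UperNet-ESA, is proposed to segment the outer leaves and leaf balls of kale in field scenes quickly and accurately, enabling intelligent field management of kale. [Methods] First, the Unified Perceptual Parsing Network (UperNet) is adopted as an efficient semantic segmentation framework and its backbone is replaced with the advanced ConvNeXt, so that the model improves segmentation accuracy while keeping model complexity low. Second, the Efficient Channel Attention (ECA) mechanism is integrated into every stage of the feature extraction network to capture more image detail. Finally, the Feature Selection Module (FSM) and the Feature Alignment Module (FAM) are integrated into the feature pyramid framework to obtain more precise predictions of target boundaries. [Results and Discussions] In experiments on a self-built kale image dataset, compared with the mainstream UNet, PSPNet, and DeepLabV3+ semantic segmentation models, the improved UperNet achieved a mean intersection over union (mIoU) of 92.45%, a mean pixel accuracy (mPA) of 94.32%, and an inference speed of 16.6 f/s, giving the best accuracy-speed balance. [Conclusions] The results can provide a theoretical reference for intelligent monitoring of kale growth and have important application prospects for the development of the kale industry.

Key words: kale, semantic segmentation, leaf ball identification, attention mechanism, feature selection, feature alignment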
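The two accuracy metrics reported above can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions of mean pixel accuracy and mean intersection over union; the paper's own formulas are not given on this page, and the class ids and toy labels below are hypothetical.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """mPA: mean over classes of (correctly labelled pixels / true pixels)."""
    accs = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(accs) / len(accs)


def mean_iou(cm):
    """mIoU: mean over classes of intersection / union of truth and prediction."""
    ious = []
    for i in range(len(cm)):
        tp = cm[i][i]
        true_i = sum(cm[i])                 # pixels whose true class is i
        pred_i = sum(row[i] for row in cm)  # pixels predicted as class i
        union = true_i + pred_i - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical flattened labels: 0 = background, 1 = outer leaf, 2 = leaf ball.
truth = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
pred = [0, 0, 1, 1, 2, 2, 2, 2, 1, 0]
cm = confusion_matrix(truth, pred, 3)
print(f"mPA = {mean_pixel_accuracy(cm):.4f}, mIoU = {mean_iou(cm):.4f}")
```

In this toy case one outer-leaf pixel is labelled as leaf ball and one leaf-ball pixel as outer leaf, which is the kind of confusion between similar textures that the summary describes. mIoU penalizes each such error in both classes, so it is never higher than mPA for the same prediction.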

Abstract:

[Objective] Kale is an important bulk vegetable crop worldwide, and its main growth characteristics are its outer leaves and leaf balls. Leaf ball traits are crucial for adjusting water and fertilizer parameters in the field to achieve maximum yield. However, in practical field conditions, factors such as soil quality, light exposure, leaf overlap, and shading affect kale growth, and the similarity in color and texture between leaf balls and outer leaves makes segmentation difficult for existing recognition models. This paper addresses the segmentation of kale outer leaves and leaf balls against complex field backgrounds, using pixel values to determine leaf ball size for intelligent field management. A semantic segmentation algorithm, UperNet-ESA, is proposed that uses the morphological features of kale leaf balls and outer leaves to segment them efficiently and accurately in field scenes. [Methods] UperNet-ESA uses the unified perceptual parsing network (UperNet) as an efficient semantic segmentation framework. Because UperNet integrates semantic information across different scales, it is well suited to extracting crop features in complex environments. The backbone network, which is responsible for feature extraction, was replaced with ConvNeXt. The baseline network achieved low accuracy because the similarity between kale leaf balls and outer leaves, together with leaf overlap, hampers accurate localization of target contours. ConvNeXt combines the strengths of convolutional neural networks (CNNs) and Transformers: it builds on ResNet-50 and borrows design principles from Swin Transformer to create a highly effective network structure.
The simplicity of the ConvNeXt design enhances segmentation accuracy while keeping model complexity low, and it places ConvNeXt among the top-performing CNN architectures. In this study, the ConvNeXt-B version was chosen based on computational complexity and the background characteristics of the kale image dataset. To enhance the model's perceptual capacity, the block ratio across the four stages was set to 3:3:27:3, with 128, 256, 512, and 1024 channels, respectively. Given the visual similarity between kale leaf balls and outer leaves, an efficient channel attention (ECA) mechanism was integrated into the backbone network to improve feature extraction in the leaf ball region. Attention weights were incorporated into the feature maps through residual inversion, and because the attention parameters were trained within every block, each block produced attention-weighted feature maps, which enhanced the capture of global feature information. In the feature pyramid network, directly adding up-sampled features to local features pixel by pixel can misalign context in the feature maps and cause misclassification at kale leaf boundaries. To address this, a feature alignment module (FAM) and a feature selection module (FSM) were introduced into the feature pyramid network to refine the extraction of target boundary information and improve segmentation accuracy. [Results and Discussions] The UperNet-ESA model outperformed the mainstream UNet, PSPNet, and DeepLabV3+ models in segmentation accuracy: the mean intersection over union (mIoU) and mean pixel accuracy (mPA) reached 92.45% and 94.32%, respectively, at an inference speed of 16.6 frames per second (f/s).
The mPA was 11.52, 13.56, 8.68, 4.31, and 6.21 percentage points higher than that of the UNet model, the PSPNet model, and the DeepLabV3+ models with ResNet-50, MobileNetV2, and Xception backbones, respectively. Similarly, the mIoU was 12.21, 13.04, 10.65, 3.26, and 7.11 percentage points higher than that of the same five models. These gains can be attributed to the ECA module and the improved feature pyramid network, which strengthen the discrimination of target features at each stage and yield effective global contextual information. Although the PSPNet model had the fastest inference speed, its overall accuracy was too low for kale semantic segmentation. The proposed model was faster than all of the other compared models. [Conclusions] The experimental results show that the proposed UperNet-ESA model outperforms the original network and achieves the best accuracy-speed balance among the mainstream semantic segmentation networks compared. In future work, the model will be further optimized and the kale dataset will be expanded to cover a wider range of leaf ball samples, in order to provide a more robust and comprehensive theoretical foundation for intelligent kale field management.

Key words: kale, semantic segmentation, leaf ball identification, attention mechanism, feature selection, feature alignment
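The channel attention step described in the abstract can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: it assumes the usual ECA recipe (global average pooling, a 1D convolution across neighbouring channels, a sigmoid gate) and the adaptive kernel-size rule from the ECA-Net paper with gamma = 2 and b = 1. The convolution weights and the toy feature map are hypothetical, and the residual integration into the ConvNeXt block is not modelled.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size (rule from the ECA-Net paper, assumed here)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca(feature_map, conv_weights):
    """Apply channel attention to a feature map stored as [channel][row][col].

    conv_weights is a 1D kernel of odd length shared by all channels.
    Returns the rescaled feature map and the per-channel gates.
    """
    pad = len(conv_weights) // 2
    # 1. Global average pooling: one descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # 2. 1D convolution across neighbouring channels, zero padded at both ends.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[c + j] for j, w in enumerate(conv_weights))
        for c in range(len(pooled))
    ]
    # 3. Sigmoid gate, then rescale every channel by its attention weight.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Kernel sizes for the four ConvNeXt-B stage widths named in the abstract.
print([eca_kernel_size(c) for c in (128, 256, 512, 1024)])

# Hypothetical 3-channel, 2x2 feature map and hand-picked (not learned) weights.
toy = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
scaled, gates = eca(toy, conv_weights=[0.25, 0.5, 0.25])
print([round(g, 3) for g in gates])
```

Under the assumed rule, all four stage widths map to a kernel of size 5, so each channel's gate depends only on itself and its four nearest channel neighbours. This local interaction is what keeps ECA cheap compared with fully connected channel attention.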

ZHU Yiping, WU Huarui, GUO Wang, WU Xiaoyan. Identification Method of Kale Leaf Ball Based on Improved UperNet[J]. Smart Agriculture, 2024, 6(3): 128-137.


Backbone	mPA/%	mIoU/%	Segmentation speed/(frames/s)
ResNet-50	88.86	88.72	17.6
Swin Transformer	89.81	89.19	15.5
ConvNeXt	90.17	90.12	17.4

Model	ECA	FAM+FSM	mPA/%	mIoU/%
Model 1	×	×	90.17	90.12
Model 2	√	×	92.88	90.14
Model 3	√	√	94.32	92.45
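The contribution of each module in the ablation table can be read off as stepwise differences. The short sketch below only redoes that arithmetic on the values copied from the table; the function name and tuple layout are illustrative choices.

```python
from decimal import Decimal

# Rows copied from the ablation table: (model, ECA, FAM+FSM, mPA %, mIoU %).
ablation = [
    ("Model 1", False, False, Decimal("90.17"), Decimal("90.12")),
    ("Model 2", True, False, Decimal("92.88"), Decimal("90.14")),
    ("Model 3", True, True, Decimal("94.32"), Decimal("92.45")),
]


def stepwise_gains(rows):
    """Return (model, mPA gain, mIoU gain) of each row over the previous row."""
    gains = []
    for prev, curr in zip(rows, rows[1:]):
        gains.append((curr[0], curr[3] - prev[3], curr[4] - prev[4]))
    return gains


for name, d_mpa, d_miou in stepwise_gains(ablation):
    print(f"{name}: mPA +{d_mpa}, mIoU +{d_miou} percentage points")
```

Read this way, adding ECA (Model 2) raises mPA by 2.71 points but mIoU by only 0.02, while adding FAM+FSM (Model 3) contributes a further 1.44 points of mPA and 2.31 points of mIoU.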

Network	Backbone	mPA/%	mIoU/%	Segmentation speed/(frames/s)
UNet	ResNet-50	82.80	80.14	15.2
PSPNet	ResNet-50	80.76	79.41	20.3
DeepLabV3+	ResNet-50	85.64	81.80	16.2
DeepLabV3+	MobileNetV2	90.01	89.19	14.4
DeepLabV3+	Xception	88.11	85.34	15.7
UperNet+FAM+FSM	ConvNeXt-B+ECA	94.32	92.45	16.6
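One way to make the "accuracy-speed balance" in this comparison concrete is to recompute the mPA gains and check which models are not beaten on both mIoU and speed at once. The sketch below uses only the values in the table. The Pareto-style check is an illustrative reading of the comparison, not a criterion stated by the authors, and the function names are hypothetical.

```python
from decimal import Decimal

# Rows copied from the comparison table:
# (network, backbone, mPA %, mIoU %, frames/s).
results = [
    ("UNet", "ResNet-50", "82.80", "80.14", "15.2"),
    ("PSPNet", "ResNet-50", "80.76", "79.41", "20.3"),
    ("DeepLabV3+", "ResNet-50", "85.64", "81.80", "16.2"),
    ("DeepLabV3+", "MobileNetV2", "90.01", "89.19", "14.4"),
    ("DeepLabV3+", "Xception", "88.11", "85.34", "15.7"),
    ("UperNet+FAM+FSM", "ConvNeXt-B+ECA", "94.32", "92.45", "16.6"),
]
rows = [(n, b, Decimal(a), Decimal(i), Decimal(s)) for n, b, a, i, s in results]
proposed = rows[-1]


def mpa_gains(rows, proposed):
    """mPA gain of the proposed model over every other row, in percentage points."""
    return [
        (n, b, proposed[2] - mpa)
        for n, b, mpa, _, _ in rows
        if (n, b) != proposed[:2]
    ]


def pareto_front(rows):
    """Names of models that no other model matches or beats on both mIoU and speed."""
    front = []
    for r in rows:
        dominated = any(
            o[3] >= r[3] and o[4] >= r[4] and (o[3] > r[3] or o[4] > r[4])
            for o in rows
        )
        if not dominated:
            front.append(r[0])
    return front


for name, backbone, gain in mpa_gains(rows, proposed):
    print(f"vs {name} ({backbone}): mPA +{gain}")
print("Not dominated on (mIoU, speed):", pareto_front(rows))
```

On these numbers only PSPNet and the proposed model remain: PSPNet is the fastest, and the proposed model has higher mIoU and higher speed than each of the other four baselines.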

