
Smart Agriculture ›› 2025, Vol. 7 ›› Issue (3): 131-142. doi: 10.12133/j.smartag.SA202502008

• Information Processing and Decision Making •


U-Net Greenhouse Sweet Cherry Image Segmentation Method Integrating PDE Plant Temporal Image Contrastive Learning and GCN Skip Connections

HU Lingyan1, GUO Ruiya1, GUO Zhanjun2, XU Guohui1, GAI Rongli1, WANG Zumin1(), ZHANG Yumeng1, JU Bowen1, NIE Xiaoyu1()   

  1. School of Information Engineering, Dalian University, Dalian 116622, China
    2. Dalian Modern Agricultural Production Development Service Center, Dalian 116021, China
  • Received: 2025-02-11  Online: 2025-05-30
  • Foundation items: Key Project of the Liaoning Provincial Science and Technology Plan (2022020655-JH1/109); Dalian Science and Technology Innovation Fund Project (2022JJ12SN052)
  • About author:

    HU Lingyan, Ph.D., associate professor, research interests: smart agriculture, intelligent sensing and data analysis. E-mail:

  • Corresponding author:
    WANG Zumin, Ph.D., professor, research interests: Internet of Things technology and digital agriculture. E-mail:
    NIE Xiaoyu, M.S., assistant experimentalist, research interests: smart agriculture and plant pest and disease diagnosis. E-mail:


Abstract:

[Objective] In plant phenotyping feature extraction, the accurate delineation of small-target boundaries and the recovery of spatial details during upsampling have long been significant obstacles. To address these limitations, an improved U-Net architecture was proposed for greenhouse sweet cherry image segmentation. [Methods] Taking temporal phenotypic images of sweet cherries as the research subject, a U-Net segmentation model was employed to delineate the specific organ regions of the plant. The architecture was referred to as the U-Net integrating a self-supervised contrastive learning method for plant time-series images with priori distance embedding (PDE) pre-training and graph convolutional network (GCN) skip connections. To accelerate model convergence, the pre-trained weights derived from the PDE plant temporal image contrastive learning method were transferred to the semantic segmentation task. Concurrently, a GCN local feature fusion layer was incorporated as a skip connection to optimize feature fusion, providing robust technical support for the image segmentation task. Pre-training with the PDE plant temporal image contrastive learning method required the construction of image pairs corresponding to different phenological periods, and a classification distance loss function incorporating prior knowledge was employed to construct an Encoder with adjusted parameters. The pre-trained weights were then transferred and applied to the semantic segmentation task, enabling the network to accurately learn the semantic information and detailed textures of the various sweet cherry organs. The Encoder module performed multi-scale feature extraction through convolutional and pooling layers, hierarchically processing the semantic information embedded in the input image to construct representations that progress from low-level texture features to high-level semantic features. This allowed consistent extraction of semantic features from images across various scales and abstraction of the underlying information, enhancing feature discriminability and optimizing the modeling of complex targets. The Decoder module performed upsampling operations, which fused features from diverse scales and restored the original image resolution, enabling the network to reconstruct spatial details and significantly improving the efficiency of model optimization. At the interface between the Encoder and Decoder modules, a GCN layer designed for local feature fusion was integrated as a skip connection, enabling the network to better capture the local features of multi-scale images. [Results and Discussions] The model was rigorously assessed using accuracy, precision, recall, and F1-Score. The improved U-Net model achieved superior performance in semantic segmentation of sweet cherry images, with an accuracy of up to 0.955 0. Ablation experiments further showed that the proposed method attained a precision of 0.932 8, a recall of 0.927 4, and an F1-Score of 0.912 8.
The accuracy of the improved U-Net was 0.069 9, 0.028 8, and 0.042 higher than that of the original U-Net, the U-Net with the PDE plant temporal image contrastive learning method, and the U-Net with GCN skip connections, respectively; the F1-Score was 0.078 3, 0.033 8, and 0.043 8 higher, respectively. In comparative experiments against the DeepLabV3, Swin Transformer, and Segment Anything Model segmentation methods, the proposed model surpassed these models by 0.022 2, 0.027 6, and 0.042 2 in accuracy; 0.063 7, 0.147 1, and 0.107 7 in precision; 0.035 2, 0.065 4, and 0.050 8 in recall; and 0.076 8, 0.127 5, and 0.103 4 in F1-Score. [Conclusions] The PDE plant temporal image contrastive learning method and GCN techniques were combined to develop an enhanced U-Net architecture specifically designed for sweet cherry plant phenotyping analysis. The results demonstrate that the proposed method effectively addresses the boundary blurring and detail loss associated with small targets in complex orchard scenarios. It enables precise segmentation of the primary organs and background regions in sweet cherry images, thereby improving the segmentation accuracy of the original model. This improvement provides a solid foundation for subsequent crop modeling research and holds significant practical importance for the advancement of agricultural intelligence.
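The Methods describe pre-training on image pairs drawn from different phenological periods with a distance loss that incorporates prior knowledge. Below is a minimal PyTorch sketch of one plausible reading of such an objective: the distance between the embeddings of a pair is pulled toward a prior distance derived from the gap between their growth stages. The linear stage-gap prior, the squared-error form, and all names (PriorDistanceLoss, num_stages, the toy encoder) are illustrative assumptions, not the authors' exact formulation.

```python
# Hedged sketch of a PDE-style pre-training objective (assumptions noted above):
# embeddings of an image pair are regressed toward a prior distance
# proportional to the phenological-stage gap between the two images.
import torch
import torch.nn as nn

class PriorDistanceLoss(nn.Module):
    def __init__(self, num_stages: int):
        super().__init__()
        self.num_stages = num_stages

    def forward(self, z_a, z_b, stage_a, stage_b):
        # z_a, z_b: (B, D) L2-normalised embeddings of an image pair;
        # stage_a, stage_b: (B,) integer phenological-stage indices.
        dist = (z_a - z_b).norm(dim=1)                       # embedding distance
        prior = (stage_a - stage_b).abs().float() / (self.num_stages - 1)
        return ((dist - prior) ** 2).mean()                  # pull distance to prior

# Usage with a hypothetical toy encoder producing normalised embeddings:
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
loss_fn = PriorDistanceLoss(num_stages=5)
x_a, x_b = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
z_a = nn.functional.normalize(encoder(x_a), dim=1)
z_b = nn.functional.normalize(encoder(x_b), dim=1)
loss = loss_fn(z_a, z_b, torch.randint(0, 5, (8,)), torch.randint(0, 5, (8,)))
```

After such pre-training, the Encoder weights would be transferred to the segmentation network, as the abstract describes.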

Key words: priori distance embedding, transfer learning, GCN, U-Net, skip connection, plant phenotype
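The GCN skip connection described in the Methods joins the Encoder and Decoder at matching scales. As a hedged sketch of how a GCN layer can refine an encoder feature map before fusion, the code below treats each spatial location as a graph node on a 4-neighbourhood grid and applies one propagation step H' = ReLU(Â H W), with Â = D^{-1/2}(A + I)D^{-1/2}. The dense adjacency, the single GCN layer, fusion by concatenation, and the names (GCNSkip, grid_adjacency) are assumptions for illustration; the paper's exact layer and fusion rule may differ.

```python
# Hedged sketch: a GCN layer applied to a U-Net skip connection, assuming a
# 4-neighbourhood pixel graph over the encoder feature map (see lead-in).
import torch
import torch.nn as nn
import torch.nn.functional as F

def grid_adjacency(h, w):
    """Normalised adjacency A_hat = D^-1/2 (A + I) D^-1/2 for an h*w 4-neighbour grid."""
    n = h * w
    idx = torch.arange(n).reshape(h, w)
    right = torch.stack([idx[:, :-1].flatten(), idx[:, 1:].flatten()])
    down = torch.stack([idx[:-1, :].flatten(), idx[1:, :].flatten()])
    e = torch.cat([right, down], dim=1)
    a = torch.zeros(n, n)
    a[e[0], e[1]] = 1.0
    a = a + a.t() + torch.eye(n)          # undirected edges + self-loops
    d = a.sum(dim=1).rsqrt()              # D^-1/2 as a vector
    return d.unsqueeze(1) * a * d.unsqueeze(0)

class GCNSkip(nn.Module):
    """One GCN propagation step over an encoder feature map used as a skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.weight = nn.Linear(channels, channels, bias=False)

    def forward(self, x):                           # x: (B, C, H, W)
        b, c, h, w = x.shape
        a_hat = grid_adjacency(h, w).to(x.device)
        nodes = x.flatten(2).transpose(1, 2)        # (B, H*W, C) node features
        out = F.relu(a_hat @ self.weight(nodes))    # H' = ReLU(A_hat H W)
        return out.transpose(1, 2).reshape(b, c, h, w)

# Illustrative use at one skip connection: refine the encoder feature map
# before concatenating it with the upsampled decoder features.
enc_feat = torch.randn(1, 64, 16, 16)   # hypothetical encoder output
up_feat = torch.randn(1, 64, 16, 16)    # hypothetical upsampled decoder output
fused = torch.cat([up_feat, GCNSkip(64)(enc_feat)], dim=1)  # (1, 128, 16, 16)
```

A dense n x n adjacency is only practical for the small feature maps near the U-Net bottleneck; higher-resolution skip connections would need a sparse graph representation.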

CLC number: