
Smart Agriculture ›› 2024, Vol. 6 ›› Issue (5): 119-127. DOI: 10.12133/j.smartag.SA202403016

• Technology and Method •

Lightweight Tea Shoot Picking Point Recognition Model Based on Improved DeepLabV3+

HU Chengxi1, TAN Lixin1,2, WANG Wenyin1, SONG Min1

  1. College of Information and Intelligence, Hunan Agricultural University, Changsha 410125, China
    2. School of Electrical and Electronic Engineering, Hunan College of Information, Changsha 410200, China
  • Received: 2024-03-13 Online: 2024-09-30
  • Foundation items:
    Innovation Fund for University-Industry Cooperation in China - Supported Project for New Generation Information Technology Innovation(2022IT82); Hunan Provincial Educational Science Planning Project(XJK24BZY037)
  • About author:
    HU Chengxi, E-mail:
  • Corresponding author:
    TAN Lixin, E-mail:

Abstract:

[Objective] The picking of famous and high-quality tea is a crucial link in the tea industry. Identifying and locating the tender buds of famous and high-quality tea for picking is an important component of the modern tea-picking robot. Traditional neural network methods suffer from issues such as large model size, long training times, and difficulties in dealing with complex scenes. In this study, based on the actual scenario of the Xiqing Tea Garden in Hunan Province, a novel deep learning algorithm was proposed to solve the precise segmentation challenge of famous and high-quality tea picking points. [Methods] The primary technical innovation resided in the combination of a lightweight network architecture, MobileNetV2, with an attention mechanism known as the efficient channel attention network (ECANet), alongside optimization modules including atrous spatial pyramid pooling (ASPP). Initially, MobileNetV2 was employed as the feature extractor, substituting traditional convolution operations with depthwise separable convolutions. This led to a notable reduction in the model's parameter count and expedited the model training process. Subsequently, the innovative fusion of the ECANet and ASPP modules constituted the ECA_ASPP module, with the intention of bolstering the model's capacity for fusing multi-scale features, especially pertinent to the intricate recognition of tea shoots. This fusion strategy enabled the model to capture more nuanced features of delicate shoots, thereby augmenting segmentation accuracy. The specific implementation steps entailed feeding image inputs through the improved network, whereupon MobileNetV2 was utilized to extract both shallow and deep features. Deep features were then fused via the ECA_ASPP module for multi-scale feature integration, reinforcing the model's resilience to intricate backgrounds and variations in tea shoot morphology.
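To illustrate why replacing standard convolutions with depthwise separable convolutions reduces the parameter count so sharply, the two factorizations can be compared directly. This is a minimal sketch; the kernel and channel sizes below are hypothetical and not taken from the paper's actual configuration:

```python
# Parameter-count comparison: standard vs. depthwise separable convolution.
# Layer sizes here are illustrative only, not the paper's configuration.

def standard_conv_params(k, c_in, c_out):
    # A standard conv learns one k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel,
    # followed by a 1x1 pointwise conv that mixes channels.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)        # 294,912 parameters
sep = depthwise_separable_params(k, c_in, c_out)  # 33,920 parameters
print(std, sep, round(std / sep, 1))              # roughly an 8.7x reduction
```

Applied across every convolutional layer of the backbone, this per-layer saving is what drives the overall reduction from tens of millions of parameters to a few million.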
Conversely, shallow features proceeded directly to the decoding stage, undergoing channel-reduction processing before being integrated with upsampled deep features. This divide-and-conquer strategy effectively harnessed the benefits of features at differing levels of abstraction and heightened the model's recognition performance through meticulous feature fusion. Ultimately, through a sequence of convolutional operations and upsampling procedures, a prediction map matching the resolution of the original image was generated, enabling the precise demarcation of tea shoot harvesting points. [Results and Discussions] The experimental outcomes indicated that the enhanced DeepLabV3+ model achieved a mean Intersection over Union (IoU) of 93.71% and a mean pixel accuracy of 97.25% on the tea shoot dataset. Compared to the original model based on Xception, the parameter count decreased substantially, from 54.714 million to a mere 5.818 million, effectively accomplishing a significant lightweight redesign of the model. Further comparisons with other prevalent semantic segmentation networks revealed that the improved model exhibited remarkable advantages on pivotal metrics such as the number of parameters, training duration, and mean IoU, highlighting its efficacy and precision in the domain of tea shoot recognition. This considerable decrease in parameter count not only facilitated more resource-economical deployment but also shortened training periods, rendering the model highly suitable for real-time implementations in tea garden ecosystems. The elevated mean IoU and pixel accuracy attested to the model's capacity for precise demarcation and identification of tea shoots, even amidst intricate and varied datasets, demonstrating resilience and adaptability in pragmatic contexts.
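The two reported metrics, mean IoU and pixel accuracy, can both be computed from a per-class confusion matrix over all evaluated pixels. The sketch below shows the standard definitions; the two-class counts used here are made up for illustration and do not reproduce the paper's results:

```python
import numpy as np

def mean_iou_and_pixel_accuracy(conf):
    """conf[i, j] = number of pixels of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)   # correctly classified pixels per class
    fp = conf.sum(axis=0) - tp         # predicted as class i, but wrong
    fn = conf.sum(axis=1) - tp         # true class i pixels that were missed
    iou = tp / (tp + fp + fn)          # per-class intersection over union
    pixel_acc = tp.sum() / conf.sum()  # overall fraction of correct pixels
    return iou.mean(), pixel_acc

# Hypothetical 2-class example (background vs. tea shoot).
conf = np.array([[900, 20],
                 [30, 50]])
miou, pa = mean_iou_and_pixel_accuracy(conf)
print(round(miou, 4), round(pa, 4))
```

Note that mean IoU weights every class equally, so a small foreground class such as tea shoots can pull it well below pixel accuracy, which is dominated by the background; this is why both metrics are reported together.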
[Conclusions] This study effectively implements an efficient and accurate tea shoot recognition method through targeted model improvements and optimizations, furnishing crucial technical support for the practical application of intelligent tea picking robots. The introduction of lightweight DeepLabV3+ not only substantially enhances recognition speed and segmentation accuracy, but also mitigates hardware requirements, thereby promoting the practical application of intelligent picking technology in the tea industry.

Key words: lightweight model, DeepLabV3+, attention mechanism, tender tea buds, ECANet, famous quality tea, ASPP

CLC Number: