Image Segmentation Method Combined with VoVNetv2 and Shuffle Attention Mechanism for Fish Feeding in Aquaculture

doi:10.12133/j.smartag.SA202310003

Abstract

[Objective] Intelligent feeding methods are important for improving breeding efficiency and reducing water pollution in aquaculture. Segmenting feeding images of fish schools is a critical step in extracting the distribution characteristics of the schools and quantifying their feeding behavior, both of which are needed to develop intelligent feeding methods. However, an applicable approach is lacking because images from practical aquaculture environments are difficult to segment: boundaries are blurred and individual fish look alike. This study proposes a high-precision segmentation method for fish school feeding images to provide technical support for the quantitative analysis of fish school feeding behavior. [Methods] The proposed method combines VoVNetv2 with an attention mechanism named Shuffle Attention. Firstly, a fish feeding segmentation dataset was built. The data were collected at the intensive aquaculture base of Laizhou Mingbo Company in Shandong Province, with Oplegnathus punctatus as the research target. Cameras captured videos of the fish school before, during, and after feeding. The images were annotated at the pixel level using Labelme software. According to the distribution characteristics of fish in the feeding and non-feeding stages, the annotations were divided into two semantic categories: fish that are neither occluded nor aggregated (fish1) and fish that are occluded or aggregated (fish2). In the preprocessing stage, data cleaning and image augmentation were used to improve the quality and diversity of the dataset. First, data cleaning rules were established based on the distribution of annotated areas within the dataset, and images with outlier annotations were removed, which improved the overall quality of the dataset.
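The abstract does not state the cleaning rule itself. As a minimal sketch of how a rule based on the distribution of annotated areas might look, the following assumes a Tukey interquartile-range fence on per-instance areas (in pixels) and drops every image that contains an out-of-range annotation. The function names, the multiplier k = 1.5, and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import quantiles


def area_fences(areas, k=1.5):
    """Return (low, high) Tukey fences for a list of annotated areas."""
    q1, _, q3 = quantiles(areas, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clean_dataset(images, k=1.5):
    """Keep only images whose annotated areas all lie inside the fences.

    `images` maps an image name to the list of its instance areas (pixels).
    The fences are computed over all instances in the dataset (assumption).
    """
    all_areas = [a for areas in images.values() for a in areas]
    low, high = area_fences(all_areas, k)
    return {
        name: areas
        for name, areas in images.items()
        if all(low <= a <= high for a in areas)
    }


# Hypothetical toy data: one image carries an implausibly large annotation.
toy = {
    "img_001.jpg": [900, 1100, 1000],
    "img_002.jpg": [950, 1050, 40000],
    "img_003.jpg": [980, 1020],
}
kept = clean_dataset(toy)
print(sorted(kept))
```

Dropping the whole image rather than the single annotation follows the wording above ("images with outlier annotations were removed"); the threshold itself would need to be tuned on the real area distribution.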
Subsequently, to reduce the risk of overfitting, five data augmentation techniques (random translation, random flip, brightness variation, random noise injection, and random point addition) were applied in combination, which increased the diversity of the dataset. Through augmentation, the dataset was expanded to three times its original size. Finally, the dataset was divided into a training set and a testing set at a ratio of 8:2, giving 1 612 training images and 404 testing images. In total, there were 116 328 instances of fish1 and 20 924 instances of fish2. Secondly, a fish feeding image segmentation method was proposed. Specifically, VoVNetv2 was used as the backbone network of the Mask R-CNN model to extract image features. VoVNetv2 is a backbone network with strong computational capabilities. Its unique feature aggregation structure fuses features from different levels and extracts diverse feature representations. This helps capture fish schools of different sizes and shapes in feeding images and supports accurate identification and segmentation of targets. To maximize feature mappings with limited resources, the channel attention mechanism in the one-shot aggregation (OSA) module of VoVNetv2 was replaced with a more lightweight and efficient attention mechanism named Shuffle Attention. This change allowed the network to concentrate more on the location of fish in the image, reducing the impact of irrelevant information, such as noise, on the segmentation results. Finally, experiments were conducted on the fish segmentation dataset to test the performance of the proposed method.
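To make the Shuffle Attention step more concrete, the following is a forward-pass sketch in NumPy of a shuffle-attention unit as generally described for SA-Net: channels are split into groups, each group is halved into a channel-attention branch and a spatial-attention branch, and the result is mixed by a channel shuffle. It is not the authors' implementation and does not show how the unit is wired into the OSA module; the function names, parameter shapes, and initial values are assumptions for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_shuffle(x, groups):
    """Interleave channels across `groups` (x has shape N, C, H, W)."""
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)


def shuffle_attention(x, groups, cweight, cbias, sweight, sbias, eps=1e-5):
    """Forward pass of a shuffle-attention unit (sketch, no training)."""
    n, c, h, w = x.shape
    half = c // (2 * groups)
    # Fold the groups into the batch axis so each group is handled alike.
    x = x.reshape(n * groups, c // groups, h, w)
    x0, x1 = x[:, :half], x[:, half:]

    # Channel branch: global average pooling, then a per-channel gate.
    pooled = x0.mean(axis=(2, 3), keepdims=True)
    xc = x0 * sigmoid(cweight * pooled + cbias)

    # Spatial branch: per-channel normalisation, then a per-pixel gate.
    mu = x1.mean(axis=(2, 3), keepdims=True)
    var = x1.var(axis=(2, 3), keepdims=True)
    normed = (x1 - mu) / np.sqrt(var + eps)
    xs = x1 * sigmoid(sweight * normed + sbias)

    out = np.concatenate([xc, xs], axis=1).reshape(n, c, h, w)
    # Shuffle so information flows between the two branches.
    return channel_shuffle(out, 2)


# Illustrative input: batch of 2 feature maps, 16 channels, 4 groups.
rng = np.random.default_rng(0)
feat = rng.standard_normal((2, 16, 8, 8))
shape = (1, 16 // (2 * 4), 1, 1)
out = shuffle_attention(
    feat, 4,
    cweight=np.zeros(shape), cbias=np.ones(shape),
    sweight=np.zeros(shape), sbias=np.ones(shape),
)
print(out.shape)
```

Each branch adds only two small parameter tensors per group half, which is consistent with the unit being described as lightweight.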
[Results and Discussions] After data cleaning, the average segmentation accuracy of the Mask R-CNN network reached 63.218%, 7.018 percentage points higher than on the original dataset. With both data cleaning and augmentation, the network reached 67.284%, which is 11.084 percentage points above the original dataset and 4.066 percentage points above cleaning alone. These results show that data preprocessing improved segmentation accuracy. The backbone ablation experiments showed that replacing the ResNet50 backbone of Mask R-CNN with VoVNetv2-39 improved accuracy by 2.511 percentage points. Improving VoVNetv2 with the Shuffle Attention mechanism raised accuracy by a further 1.219 percentage points while reducing the number of model parameters by 7.9%, balancing accuracy against a lightweight design. Compared with the classic segmentation networks SOLOv2, BlendMask, and CondInst, the proposed model achieved the highest segmentation accuracy across various target scales. On the fish feeding segmentation dataset, its average segmentation accuracy exceeded that of BlendMask, CondInst, and SOLOv2 by 3.982, 12.068, and 18.258 percentage points, respectively. Although the proposed method segmented fish feeding images effectively, it still had limitations, such as missed detections, segmentation errors, and misclassification. [Conclusions] The proposed instance segmentation algorithm (SA_VoVNetv2_RCNN) accurately segmented fish feeding images. It can be used to count the number of instances and the number of pixels of the two types of fish in fish feeding videos, which facilitates quantitative analysis of fish feeding behavior and provides technical support for it. In future research, the limitations noted above will be addressed to further improve the accuracy of fish feeding image segmentation.
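The counting step mentioned in the conclusions can be sketched as follows. The example assumes that each predicted instance arrives as a class label (fish1 or fish2) plus a binary mask stored as a 2-D list of 0/1 values; this input format, the function name, and the toy frame are hypothetical and are not the output format of any particular framework.

```python
from collections import defaultdict


def summarize_frame(instances):
    """Count instances and mask pixels per class for one frame.

    `instances` is an iterable of (label, mask) pairs, where `mask` is a
    2-D list of 0/1 values (assumed format).
    """
    counts = defaultdict(int)
    pixels = defaultdict(int)
    for label, mask in instances:
        counts[label] += 1
        pixels[label] += sum(sum(row) for row in mask)
    return dict(counts), dict(pixels)


# Hypothetical frame with two fish1 instances and one fish2 instance.
frame = [
    ("fish1", [[0, 1, 1], [0, 1, 0]]),
    ("fish1", [[1, 0, 0], [1, 0, 0]]),
    ("fish2", [[1, 1, 1], [1, 1, 1]]),
]
counts, pixels = summarize_frame(frame)
print(counts, pixels)
```

Applied frame by frame, the two dictionaries give time series of instance counts and pixel areas for each class, which is the kind of quantity a feeding-behavior analysis could build on.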

Key words: deep learning, instance segmentation, Mask R-CNN, attention mechanism, VoVNetv2

WANG Herong, CHEN Yingyi, CHAI Yingqian, XU Ling, YU Huihui. Image Segmentation Method Combined with VoVNetv2 and Shuffle Attention Mechanism for Fish Feeding in Aquaculture[J]. Smart Agriculture, 2023, 5(4): 137-149.

Table 1 The network configuration of VoVNetv2

Stage	VoVNetv2-39	VoVNetv2-57	VoVNetv2-99
Stem (stage 1)	3×3 conv, 64, s=2; 3×3 conv, 64, s=1; 3×3 conv, 128, s=1	3×3 conv, 64, s=2; 3×3 conv, 64, s=1; 3×3 conv, 128, s=1	3×3 conv, 64, s=2; 3×3 conv, 64, s=1; 3×3 conv, 128, s=1
OSA module (stage 2)	[3×3 conv, 128, ×5; concat & 1×1 conv, 256] ×1	[3×3 conv, 128, ×5; concat & 1×1 conv, 256] ×1	[3×3 conv, 128, ×5; concat & 1×1 conv, 256] ×1
OSA module (stage 3)	[3×3 conv, 160, ×5; concat & 1×1 conv, 512] ×1	[3×3 conv, 160, ×5; concat & 1×1 conv, 512] ×1	[3×3 conv, 160, ×5; concat & 1×1 conv, 512] ×3
OSA module (stage 4)	[3×3 conv, 192, ×5; concat & 1×1 conv, 768] ×2	[3×3 conv, 192, ×5; concat & 1×1 conv, 768] ×4	[3×3 conv, 192, ×5; concat & 1×1 conv, 768] ×9
OSA module (stage 5)	[3×3 conv, 224, ×5; concat & 1×1 conv, 1 024] ×2	[3×3 conv, 224, ×5; concat & 1×1 conv, 1 024] ×3	[3×3 conv, 224, ×5; concat & 1×1 conv, 1 024] ×3
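As a quick consistency check on Table 1, the sketch below counts convolution layers per variant from the table entries alone: three stem convolutions, plus six convolutions per OSA module (five 3×3 and one 1×1), times the number of module repeats in each stage. The totals come out equal to the numbers in the variant names. The constant and function names are illustrative, and layers inside any attention block are deliberately not counted.

```python
STEM_CONVS = 3          # three 3x3 convolutions in the stem
CONVS_PER_OSA = 5 + 1   # five 3x3 convolutions plus one 1x1 convolution

# OSA module repeats for stages 2 to 5, read from Table 1.
OSA_REPEATS = {
    "VoVNetv2-39": (1, 1, 2, 2),
    "VoVNetv2-57": (1, 1, 4, 3),
    "VoVNetv2-99": (1, 3, 9, 3),
}


def conv_layer_count(repeats):
    """Total convolution layers implied by the stem and the OSA stages."""
    return STEM_CONVS + CONVS_PER_OSA * sum(repeats)


depths = {name: conv_layer_count(r) for name, r in OSA_REPEATS.items()}
print(depths)
```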

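Metric	Description
mAP	AP averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05
AP50	AP at IoU = 0.50
AP75	AP at IoU = 0.75
APs	AP for small objects (area < 32²)
APm	AP for medium objects (32² < area < 96²)
APl	AP for large objects (area > 96²)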
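The quantities behind these metrics can be sketched in a few lines: the list of IoU thresholds used for mAP, the size bucket of an object, and the IoU of two binary masks. This is a minimal illustration, not an evaluation tool. Masks are assumed to be 2-D lists of 0/1 values, the function names are hypothetical, and assigning an area of exactly 32² or 96² to the larger bucket is an assumption, since the table uses strict inequalities on both sides.

```python
def iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used when averaging AP."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def size_bucket(area):
    """Map an object area in pixels to the small/medium/large bucket."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def mask_iou(a, b):
    """Intersection over union of two equally sized binary masks."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    inter = sum(x & y for x, y in pairs)
    union = sum(x | y for x, y in pairs)
    return inter / union if union else 0.0


pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
print(iou_thresholds())
print(size_bucket(5000), mask_iou(pred, truth))
```

A prediction counts as a match at a given threshold only if its mask IoU with a ground-truth instance reaches that threshold, so AP75 is a stricter test of boundary quality than AP50.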

Preprocessing	mAP	AP50	AP75	APs	APm	APl
None	56.200	79.421	67.694	29.384	57.247	62.929
Data cleaning	63.218	85.584	75.698	67.920	63.628	68.854
Data cleaning + augmentation	67.284	93.265	83.317	35.457	68.135	75.056

Backbone	mAP	AP50	AP75	APs	APm	APl	Parameters/M
ResNet50	67.284	93.265	83.317	35.457	68.135	75.056	44.3
VoVNetv2-39	69.795	93.382	85.457	35.878	70.792	75.716	45.7
VoVNetv2-57	70.624	93.828	86.959	37.708	71.447	77.152	62.0
VoVNetv2-99	71.580	94.151	88.369	36.168	72.363	77.860	90.0
SA_VoVNetv2-39	71.014	93.864	87.081	38.231	71.967	76.095	42.1
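The accuracy and parameter trade-off in this table can be recomputed directly from its rows. The sketch below reports the mAP change in percentage points and the relative change in parameter count between two backbones; the numbers are copied from the table, while the helper name and output format are illustrative.

```python
# (mAP in %, parameters in millions), copied from the table above.
rows = {
    "ResNet50": (67.284, 44.3),
    "VoVNetv2-39": (69.795, 45.7),
    "VoVNetv2-57": (70.624, 62.0),
    "VoVNetv2-99": (71.580, 90.0),
    "SA_VoVNetv2-39": (71.014, 42.1),
}


def compare(new, base):
    """Return (mAP gain in points, relative parameter change in %)."""
    (map_new, par_new), (map_base, par_base) = rows[new], rows[base]
    gain = round(map_new - map_base, 3)
    param_change = round(100 * (par_new - par_base) / par_base, 1)
    return gain, param_change


vov_vs_resnet = compare("VoVNetv2-39", "ResNet50")
sa_vs_vov = compare("SA_VoVNetv2-39", "VoVNetv2-39")
print(vov_vs_resnet, sa_vs_vov)
```

On these figures, SA_VoVNetv2-39 comes within about 0.6 points of VoVNetv2-99 in mAP with fewer than half of its parameters.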

Network	mAP	AP50	AP75	APs	APm	APl
SOLOv2	52.756	85.905	63.905	16.737	53.644	69.141
CondInst	58.946	92.196	73.463	23.803	60.100	71.053
BlendMask	67.032	93.261	82.548	34.583	67.962	76.676
SA_VoVNetv2-39_RCNN	71.014	93.864	87.081	38.231	71.967	76.095
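The comparison can be checked the same way. The sketch below computes the mAP margin of SA_VoVNetv2-39_RCNN over each other network and finds the best network per metric, using only the values in the table above; the variable names are illustrative. On these values the proposed model leads on every metric except APl, where BlendMask is marginally ahead (76.676 against 76.095).

```python
METRICS = ("mAP", "AP50", "AP75", "APs", "APm", "APl")

# Values copied from the table above, in the order given by METRICS.
results = {
    "SOLOv2": (52.756, 85.905, 63.905, 16.737, 53.644, 69.141),
    "CondInst": (58.946, 92.196, 73.463, 23.803, 60.100, 71.053),
    "BlendMask": (67.032, 93.261, 82.548, 34.583, 67.962, 76.676),
    "SA_VoVNetv2-39_RCNN": (71.014, 93.864, 87.081, 38.231, 71.967, 76.095),
}
proposed = "SA_VoVNetv2-39_RCNN"

# mAP margin of the proposed model over each baseline, in percentage points.
margins = {
    net: round(results[proposed][0] - vals[0], 3)
    for net, vals in results.items()
    if net != proposed
}

# Network with the highest score for each metric.
best = {
    metric: max(results, key=lambda net: results[net][i])
    for i, metric in enumerate(METRICS)
}
print(margins)
print(best)
```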
