
Smart Agriculture


An Underwater In Situ Mass Estimation Method for Chinese Mitten Crab Based on Binocular Vision and Improved YOLOv11-pose

LI Aoqiang1,2,3, DAI Hangyu1,2,3, GUO Ya1,2,3

  1. International Joint Research Center for Intelligent Optical Sensing and Applications at Jiangnan University, Wuxi 214122, Jiangsu, China
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, Jiangsu, China
    3. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
  • Received: 2025-05-19 Online: 2025-07-23
  • Foundation items:
    International Cooperation and Exchange Program of the National Natural Science Foundation of China (51961125102); Jiangsu Provincial Modern Agriculture - Key and General Programs (BE2022366)
  • About author:

    LI Aoqiang, master's student; research interest: machine vision. E-mail:

  • Corresponding author:
    GUO Ya, Ph.D., professor; research interests: system modeling and control, sensors and instruments. E-mail:

An Underwater In Situ Mass Estimation Method for Chinese Mitten Crab Based on Binocular Vision and Improved YOLOv11-pose

LI Aoqiang1,2,3, DAI Hangyu1,2,3, GUO Ya1,2,3

  1. International Joint Research Center for Intelligent Optical Sensing and Applications at Jiangnan University, Wuxi 214122, China
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, China
    3. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
  • Received: 2025-05-19 Online: 2025-07-23
  • Foundation items: International Cooperation and Exchange Program of the National Natural Science Foundation of China (51961125102); Jiangsu Provincial Modern Agriculture - Key and General Programs (BE2022366)
  • About author:

    LI Aoqiang, E-mail:

  • Corresponding author:
    GUO Ya, E-mail:

Abstract:

[Objective/Significance] Accurate estimation of the body mass of Chinese mitten crabs is of great importance for precision crab farming, but traditional manual measurement of crab mass is inefficient and easily injures the crab, and research on underwater in situ mass estimation methods for Chinese mitten crabs is very scarce. An efficient, non-destructive underwater in situ mass estimation method for Chinese mitten crabs was therefore developed. [Methods] A mass estimation method fusing keypoint detection with three-dimensional measurement was established on the basis of binocular vision, overcoming the bottleneck of insufficient keypoint localization accuracy of existing 3D reconstruction techniques in dynamic underwater scenes and thereby enabling accurate underwater in situ mass estimation of Chinese mitten crabs. First, the YOLOv11 framework was improved: the C3K2 feature extraction module was reconstructed by fusing MBConv modules with the EffectiveSE attention mechanism, and a spatially dynamic feature fusion mechanism, the Surface Detail Fusion Module (SDFM), was introduced to build a crab detection model adapted to the turbid water of aquaculture ponds. Second, a keypoint detection and matching algorithm was established to replace traditional global stereo matching and, combined with the principle of triangulation, to accurately measure the three-dimensional parameters of the carapace. Finally, a two-layer back-propagation (BP) neural network was constructed to fuse carapace morphological parameters with sex features for mass regression prediction. [Results and Discussion] The improved object detection model achieved a mean average precision (mAP) of 97.2% at an intersection-over-union threshold of 0.5, and keypoint detection reached an mAP50 of 96.7%; the mean absolute percentage error (MAPE) of the three-dimensional carapace measurements was only 2.68%; and the MAPE of the overall mass prediction of the system was 7.1%. [Conclusions] The established non-contact measurement method can accurately acquire the morphological parameters of underwater Chinese mitten crabs and estimate their mass, providing key technical parameters for intelligent monitoring in crab farming and having significant theoretical and application value.
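As a rough illustration of the C3K2 rework mentioned above, the following PyTorch sketch shows an EffectiveSE channel-attention gate inside an MBConv-style inverted-residual block. The channel sizes, activation choices, and the way such a block would be wired into YOLOv11's C3K2 module are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch: EffectiveSE channel attention inside an MBConv-style block.
# Layer sizes and wiring are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class EffectiveSE(nn.Module):
    """Channel attention with a single 1x1 conv and hard-sigmoid gate."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Hardsigmoid()

    def forward(self, x):
        s = x.mean(dim=(2, 3), keepdim=True)        # global average pooling
        return x * self.gate(self.fc(s))             # re-weight channels

class MBConvES(nn.Module):
    """MBConv-style block: 1x1 expand -> 3x3 depthwise -> eSE -> 1x1 project."""
    def __init__(self, channels: int, expand: int = 4):
        super().__init__()
        mid = channels * expand
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.SiLU(),
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),  # depthwise
            nn.BatchNorm2d(mid), nn.SiLU(),
            EffectiveSE(mid),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)                     # residual connection

x = torch.randn(1, 64, 80, 80)
print(MBConvES(64)(x).shape)                         # torch.Size([1, 64, 80, 80])
```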

Key words: Chinese mitten crab, keypoint detection, binocular vision, YOLOv11, mass estimation

Abstract:

[Objective] With the accelerated development of large-scale and intelligent aquaculture, accurate estimation of the body mass of individual Chinese mitten crabs is critical for tasks such as precise feeding, disease prevention, and optimization of harvest decisions. Traditional methods of manually catching and weighing crabs are time-consuming, labor-intensive, and can cause stress or injury to the crabs, while also failing to provide real-time monitoring. To address the challenges posed by turbid water conditions in aquaculture, which lead to poor image quality and difficulty in feature extraction, a method is proposed for estimating the body mass of Chinese mitten crabs that combines binocular vision with deep learning–based keypoint detection. This approach achieves high-precision detection of anatomical keypoints on the crab, providing new technical support for precision aquaculture and intelligent management. [Methods] Based on a lightweight YOLOv11 framework, MBConv depthwise-separable convolutions were incorporated into its C3K2 module to significantly reduce computational complexity and improve feature extraction efficiency. An EffectiveSE channel attention mechanism was introduced to adaptively emphasize important channel-wise features. To further enhance cross-scale information fusion, a spatial dynamic feature fusion module (SDFM) was added. The SDFM adaptively weighted and fused local spatial attention with global channel attention, enabling detailed extraction of crab shell edges and anatomical keypoints. The improved YOLOv11-ES model could simultaneously output the crab's bounding box, the positions of four anatomical keypoints, and the crab's sex classification in a single forward pass. In the 3D reconstruction stage, calibrated stereo camera parameters were used, and a sparse keypoint matching strategy guided by the crab's sex and spatial geometric constraints was employed. High-confidence keypoint pairs were selected from the left and right views, and the true 3D coordinates of the carapace keypoints were computed by triangulation, yielding the carapace length and width. Finally, the obtained carapace length, width, and sex label data were fed into a two-layer back-propagation (BP) neural network to perform a regression prediction of the individual crab's mass. [Results and Discussion] To validate the effectiveness and robustness of the proposed method, a dataset of Chinese mitten crab images with annotated keypoints was constructed under varying water turbidity and lighting conditions, and both ablation and comparative experiments were conducted. YOLOv11-ES achieved a mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP@50) of 97.2% on the test set, 4.4 percentage points higher than that of the original YOLOv11 model. The keypoint detection component reached an mAP@50 of 96.7%, 3.6 percentage points higher than that of the original YOLOv11 model. In comparative experiments, YOLOv11-ES also demonstrated significant advantages over other models in the same series. Moreover, in a full-system evaluation using images of 30 individual crabs, the mean absolute percentage error (MAPE) for carapace width measurements was only 2.68%, and for carapace length it was 1.48%. The Pearson correlation coefficients between the measured and manually obtained true values for both carapace length and width exceeded 0.977, indicating high accuracy in the 3D reconstruction and minimal measurement error.
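The carapace measurement step described above relies on standard stereo triangulation. A minimal sketch follows, assuming rectified and calibrated cameras, already-matched left/right keypoints, and illustrative projection matrices and keypoint coordinates (not the authors' calibration or data).

```python
# Minimal sketch of triangulating matched carapace keypoints from a stereo pair.
# Camera intrinsics, baseline, and keypoint positions are illustrative assumptions.
import numpy as np
import cv2

def triangulate_keypoints(pts_left, pts_right, P1, P2):
    """Triangulate matched 2-D keypoints (N x 2) into 3-D points (N x 3)."""
    pts_l = np.asarray(pts_left, dtype=np.float64).T    # 2 x N
    pts_r = np.asarray(pts_right, dtype=np.float64).T   # 2 x N
    pts_4d = cv2.triangulatePoints(P1, P2, pts_l, pts_r)  # 4 x N homogeneous
    return (pts_4d[:3] / pts_4d[3]).T                    # N x 3 Euclidean

# Hypothetical rectified stereo calibration: fx = fy = 1200 px, 60 mm baseline.
K = np.array([[1200.0, 0.0, 640.0],
              [0.0, 1200.0, 360.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.06], [0.0], [0.0]])])

# Four illustrative carapace keypoints (front, rear, left, right edges) per view,
# assumed already matched under the epipolar/geometric constraints described above.
left_kpts  = [[640, 360], [660, 520], [560, 440], [740, 430]]
right_kpts = [[600, 360], [622, 520], [522, 440], [700, 430]]

xyz = triangulate_keypoints(left_kpts, right_kpts, P1, P2)
carapace_length = np.linalg.norm(xyz[0] - xyz[1])        # front-to-rear distance (m)
carapace_width  = np.linalg.norm(xyz[2] - xyz[3])        # left-to-right distance (m)
print(f"length = {carapace_length * 1000:.1f} mm, width = {carapace_width * 1000:.1f} mm")
```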
Experiments analyzing the influence of image quality on measurement accuracy showed that when the underwater image quality measure (UIQM) reached at least 1.5, the combined MAPE of the carapace length and width measurements could be kept below 5%. When UIQM reached at least 2.2, the MAPE dropped to about 1.9%. These results confirmed the robustness of the method against variations in water turbidity and lighting conditions. For mass regression prediction, the BP network trained on carapace length, width, and sex features achieved a mean absolute error (MAE) of 2.39 g and a MAPE of 7.1% on an independent test set, demonstrating high-precision estimation of individual crab mass. [Conclusions] The proposed method, which combines an improved YOLOv11 object detection network, binocular sparse keypoint matching, and a two-layer BP regression network, enabled high-precision, low-error, real-time, non-contact estimation of Chinese mitten crab mass in complex turbid aquatic environments. This approach featured a lightweight model, high computational efficiency, excellent measurement accuracy, and strong adaptability to varying environmental conditions. It provided key technical parameters for intelligent Chinese mitten crab farming. In the future, this approach could be extended to other aquaculture species and complex farming scenarios. Combined with transfer learning and online adaptive calibration techniques, its generalization capability could be further improved, and the method could be integrated with intelligent monitoring platforms to achieve large-scale, all-weather underwater crab mass estimation, contributing to the sustainable development of smart aquaculture.
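The final regression stage can be sketched as a small two-layer BP network mapping carapace length, width, and sex to body mass. The hidden size, optimizer settings, feature scaling, and the synthetic samples below are assumptions for illustration; the paper's actual architecture details and dataset are not reproduced here.

```python
# Minimal sketch of a two-layer BP regression network for crab mass estimation.
# Hyperparameters and training data are illustrative assumptions only.
import torch
import torch.nn as nn

class MassRegressor(nn.Module):
    """Two-layer BP network: (carapace length, carapace width, sex) -> mass in grams."""
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),   # hidden layer
            nn.Linear(hidden, 1),              # output: predicted mass (g)
        )

    def forward(self, x):
        return self.net(x)

# Synthetic samples: [carapace length mm, carapace width mm, sex (0 female / 1 male)].
scale = torch.tensor([100.0, 100.0, 1.0])      # crude feature scaling for the sketch
X = torch.tensor([[55., 60., 0.], [62., 68., 1.], [48., 52., 0.], [70., 76., 1.]]) / scale
y = torch.tensor([[28.], [42.], [21.], [55.]])

model = MassRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for _ in range(500):                           # plain back-propagation training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

mape = ((model(X) - y).abs() / y).mean() * 100  # MAPE, the metric reported in the abstract
print(f"training MAPE: {mape.item():.1f}%")

sample = torch.tensor([[58., 63., 1.]]) / scale
print(f"predicted mass: {model(sample).item():.1f} g")
```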

Key words: Chinese mitten crab, keypoint detection, binocular vision, YOLOv11, weight estimation

CLC number: