
Smart Agriculture


An Underwater In Situ Mass Estimation Method for Chinese Mitten Crab Based on Binocular Vision and Improved YOLOv11-pose

LI Aoqiang1,2,3, DAI Hangyu1,2,3, GUO Ya1,2,3

  1. International Joint Research Center for Intelligent Optical Sensing and Applications at Jiangnan University, Wuxi 214122, Jiangsu, China
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, Jiangsu, China
    3. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
  • Received: 2025-05-19 Online: 2025-07-23
  • Foundation items:
    International Cooperation and Exchange Program of the National Natural Science Foundation of China (51961125102); Jiangsu Provincial Modern Agriculture - Key and General Programs (BE2022366)
  • About author:

    LI Aoqiang, master's student; research interest: machine vision. E-mail:

  • Corresponding author:
    GUO Ya, Ph.D., professor; research interests: system modeling and control, sensors and instruments. E-mail:

An Underwater In Situ Mass Estimation Method for Chinese Mitten Crab Based on Binocular Vision and Improved YOLOv11-pose

LI Aoqiang1,2,3, DAI Hangyu1,2,3, GUO Ya1,2,3

  1. International Joint Research Center for Intelligent Optical Sensing and Applications at Jiangnan University, Wuxi 214122, China
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, China
    3. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
  • Received: 2025-05-19 Online: 2025-07-23
  • Foundation items: International Cooperation and Exchange Program of the National Natural Science Foundation of China (51961125102); Jiangsu Provincial Modern Agriculture - Key and General Programs (BE2022366)
  • About author:

    LI Aoqiang, E-mail:

  • Corresponding author:
    GUO Ya, E-mail:

Abstract:

[Objective/Significance] Accurate estimation of the body mass of Chinese mitten crabs is of great importance for precision crab farming, but traditional manual measurement of crab mass is inefficient and easily injures the crab, and research on underwater in situ mass estimation methods for Chinese mitten crabs is very scarce. An efficient, non-destructive underwater in situ mass estimation method for Chinese mitten crabs was therefore developed. [Methods] A mass estimation method fusing keypoint detection with three-dimensional measurement was established on the basis of binocular vision, overcoming the bottleneck of insufficient keypoint localization accuracy of existing 3D reconstruction techniques in dynamic underwater scenes and thereby enabling accurate underwater in situ mass estimation of Chinese mitten crabs. First, the YOLOv11 framework was improved: the C3K2 feature extraction module was reconstructed by fusing MBConv modules with the EffectiveSE attention mechanism, and a spatially dynamic feature fusion mechanism, the Surface Detail Fusion Module (SDFM), was introduced to build a crab detection model adapted to the turbid water of aquaculture ponds. Second, a keypoint detection and matching algorithm was established to replace traditional global stereo matching and, combined with the principle of triangulation, to accurately measure the three-dimensional parameters of the carapace. Finally, a two-layer back-propagation (BP) neural network was constructed to fuse carapace morphological parameters with sex features for mass regression prediction. [Results and Discussion] The improved object detection model achieved a mean average precision (mAP) of 97.2% at an intersection-over-union threshold of 0.5, and keypoint detection reached an mAP50 of 96.7%; the mean absolute percentage error (MAPE) of the three-dimensional carapace measurements was only 2.68%; and the MAPE of the overall mass prediction of the system was 7.1%. [Conclusions] The established non-contact measurement method can accurately acquire the morphological parameters of underwater Chinese mitten crabs and estimate their mass, providing key technical parameters for intelligent monitoring in crab farming and having significant theoretical and application value.
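As a rough illustration of the C3K2 rework mentioned above, the following PyTorch sketch shows an EffectiveSE channel-attention gate inside an MBConv-style inverted-residual block. The channel sizes, activation choices, and the way such a block would be wired into YOLOv11's C3K2 module are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch: EffectiveSE channel attention inside an MBConv-style block.
# Layer sizes and wiring are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class EffectiveSE(nn.Module):
    """Channel attention with a single 1x1 conv and hard-sigmoid gate."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Hardsigmoid()

    def forward(self, x):
        s = x.mean(dim=(2, 3), keepdim=True)        # global average pooling
        return x * self.gate(self.fc(s))             # re-weight channels

class MBConvES(nn.Module):
    """MBConv-style block: 1x1 expand -> 3x3 depthwise -> eSE -> 1x1 project."""
    def __init__(self, channels: int, expand: int = 4):
        super().__init__()
        mid = channels * expand
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.SiLU(),
            nn.Conv2d(mid, mid, 3, padding=1, groups=mid, bias=False),  # depthwise
            nn.BatchNorm2d(mid), nn.SiLU(),
            EffectiveSE(mid),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)                     # residual connection

x = torch.randn(1, 64, 80, 80)
print(MBConvES(64)(x).shape)                         # torch.Size([1, 64, 80, 80])
```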

Key words: Chinese mitten crab, keypoint detection, binocular vision, YOLOv11, mass estimation

Abstract:

[Objective] With the accelerated development of large-scale and intelligent aquaculture, accurate estimation of the body mass of individual Chinese mitten crabs is critical for tasks such as precise feeding, disease prevention, and optimization of harvest decisions. Traditional methods of manually catching and weighing crabs are time-consuming, labor-intensive, and can cause stress or injury to the crabs, while also failing to provide real-time monitoring. To address the challenges posed by turbid water conditions in aquaculture, which lead to poor image quality and difficulty in feature extraction, a method is proposed for estimating the body mass of Chinese mitten crabs that combines binocular vision with deep learning–based keypoint detection. This approach achieves high-precision detection of anatomical keypoints on the crab, providing new technical support for precision aquaculture and intelligent management. [Methods] Based on a lightweight YOLOv11 framework, MBConv depthwise-separable convolutions were incorporated into its C3K2 module to significantly reduce computational complexity and improve feature extraction efficiency. An EffectiveSE channel attention mechanism was introduced to adaptively emphasize important channel-wise features. To further enhance cross-scale information fusion, a spatial dynamic feature fusion module (SDFM) was added. The SDFM adaptively weighted and fused local spatial attention with global channel attention, enabling detailed extraction of crab shell edges and anatomical keypoints. The improved YOLOv11-ES model could simultaneously output the crab's bounding box, the positions of four anatomical keypoints, and the crab's sex classification in a single forward pass. In the 3D reconstruction stage, calibrated stereo camera parameters were used, and a sparse keypoint matching strategy guided by the crab's sex and spatial geometric constraints was employed. High-confidence keypoint pairs were selected from the left and right views, and the true 3D coordinates of the carapace keypoints were computed by triangulation, yielding the carapace length and width. Finally, the obtained carapace length, width, and sex label data were fed into a two-layer back-propagation (BP) neural network to perform a regression prediction of the individual crab's mass. [Results and Discussion] To validate the effectiveness and robustness of the proposed method, a dataset of Chinese mitten crab images with annotated keypoints was constructed under varying water turbidity and lighting conditions, and both ablation and comparative experiments were conducted. YOLOv11-ES achieved a mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP@50) of 97.2% on the test set, 4.4 percentage points higher than that of the original YOLOv11 model. The keypoint detection component reached an mAP@50 of 96.7%, 3.6 percentage points higher than that of the original YOLOv11 model. In comparative experiments, YOLOv11-ES also demonstrated significant advantages over other models in the same series. Moreover, in a full-system evaluation using images of 30 individual crabs, the mean absolute percentage error (MAPE) for carapace width measurements was only 2.68%, and for carapace length it was 1.48%. The Pearson correlation coefficients between the measured and manually obtained true values for both carapace length and width exceeded 0.977, indicating high accuracy in the 3D reconstruction and minimal measurement error.
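The carapace measurement step described above relies on standard stereo triangulation. A minimal sketch follows, assuming rectified and calibrated cameras, already-matched left/right keypoints, and illustrative projection matrices and keypoint coordinates (not the authors' calibration or data).

```python
# Minimal sketch of triangulating matched carapace keypoints from a stereo pair.
# Camera intrinsics, baseline, and keypoint positions are illustrative assumptions.
import numpy as np
import cv2

def triangulate_keypoints(pts_left, pts_right, P1, P2):
    """Triangulate matched 2-D keypoints (N x 2) into 3-D points (N x 3)."""
    pts_l = np.asarray(pts_left, dtype=np.float64).T    # 2 x N
    pts_r = np.asarray(pts_right, dtype=np.float64).T   # 2 x N
    pts_4d = cv2.triangulatePoints(P1, P2, pts_l, pts_r)  # 4 x N homogeneous
    return (pts_4d[:3] / pts_4d[3]).T                    # N x 3 Euclidean

# Hypothetical rectified stereo calibration: fx = fy = 1200 px, 60 mm baseline.
K = np.array([[1200.0, 0.0, 640.0],
              [0.0, 1200.0, 360.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.06], [0.0], [0.0]])])

# Four illustrative carapace keypoints (front, rear, left, right edges) per view,
# assumed already matched under the epipolar/geometric constraints described above.
left_kpts  = [[640, 360], [660, 520], [560, 440], [740, 430]]
right_kpts = [[600, 360], [622, 520], [522, 440], [700, 430]]

xyz = triangulate_keypoints(left_kpts, right_kpts, P1, P2)
carapace_length = np.linalg.norm(xyz[0] - xyz[1])        # front-to-rear distance (m)
carapace_width  = np.linalg.norm(xyz[2] - xyz[3])        # left-to-right distance (m)
print(f"length = {carapace_length * 1000:.1f} mm, width = {carapace_width * 1000:.1f} mm")
```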
Experiments analyzing the influence of image quality on measurement accuracy showed that when the underwater image quality measure (UIQM) reached at least 1.5, the combined MAPE of the carapace length and width measurements could be kept below 5%. When UIQM reached at least 2.2, the MAPE dropped to about 1.9%. These results confirmed the robustness of the method against variations in water turbidity and lighting conditions. For mass regression prediction, the BP network trained on carapace length, width, and sex features achieved a mean absolute error (MAE) of 2.39 g and a MAPE of 7.1% on an independent test set, demonstrating high-precision estimation of individual crab mass. [Conclusions] The proposed method, which combines an improved YOLOv11 object detection network, binocular sparse keypoint matching, and a two-layer BP regression network, enabled high-precision, low-error, real-time, non-contact estimation of Chinese mitten crab mass in complex turbid aquatic environments. This approach featured a lightweight model, high computational efficiency, excellent measurement accuracy, and strong adaptability to varying environmental conditions. It provided key technical parameters for intelligent Chinese mitten crab farming. In the future, this approach could be extended to other aquaculture species and complex farming scenarios. Combined with transfer learning and online adaptive calibration techniques, its generalization capability could be further improved, and the method could be integrated with intelligent monitoring platforms to achieve large-scale, all-weather underwater crab mass estimation, contributing to the sustainable development of smart aquaculture.
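The final regression stage can be sketched as a small two-layer BP network mapping carapace length, width, and sex to body mass. The hidden size, optimizer settings, feature scaling, and the synthetic samples below are assumptions for illustration; the paper's actual architecture details and dataset are not reproduced here.

```python
# Minimal sketch of a two-layer BP regression network for crab mass estimation.
# Hyperparameters and training data are illustrative assumptions only.
import torch
import torch.nn as nn

class MassRegressor(nn.Module):
    """Two-layer BP network: (carapace length, carapace width, sex) -> mass in grams."""
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),   # hidden layer
            nn.Linear(hidden, 1),              # output: predicted mass (g)
        )

    def forward(self, x):
        return self.net(x)

# Synthetic samples: [carapace length mm, carapace width mm, sex (0 female / 1 male)].
scale = torch.tensor([100.0, 100.0, 1.0])      # crude feature scaling for the sketch
X = torch.tensor([[55., 60., 0.], [62., 68., 1.], [48., 52., 0.], [70., 76., 1.]]) / scale
y = torch.tensor([[28.], [42.], [21.], [55.]])

model = MassRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for _ in range(500):                           # plain back-propagation training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

mape = ((model(X) - y).abs() / y).mean() * 100  # MAPE, the metric reported in the abstract
print(f"training MAPE: {mape.item():.1f}%")

sample = torch.tensor([[58., 63., 1.]]) / scale
print(f"predicted mass: {model(sample).item():.1f} g")
```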

Key words: Chinese mitten crab, keypoint detection, binocular vision, YOLOv11, weight estimation

CLC number: