Cotton Maturity Detection Algorithm Based on Improved RT-DETR

doi:10.12133/j.smartag.SA202512013

Abstract

[Objective] Cotton's water and fertilizer demands differ markedly across developmental stages, so precision management of cotton growth depends on knowing which stage the crop has reached. Cotton maturity assessment is therefore a vital task in precision agriculture, supporting timely irrigation, fertilization, and harvesting decisions. However, traditional monitoring approaches are time-consuming and labor-intensive, and current deep learning-based models often struggle to recognize cotton bolls at different maturity stages, especially in complex field environments with dense foliage, occlusion, and illumination changes. To address these challenges, a high-accuracy, lightweight computer vision model was proposed for cotton maturity detection, with the aim of providing reliable technical support for precise water and fertilizer regulation and quality enhancement in cotton production.

[Methods] An enhanced detection framework named Cotton Maturity-Detection Transformer (CM-DETR) was proposed, based on an improved RT-DETR architecture. CM-DETR incorporated three architectural changes intended to improve both detection accuracy and computational efficiency. First, to construct a lightweight and efficient backbone, a novel feature extraction module named RGCSPELAN (Re-parameterized Group Convolution Spatial Enhancement Lightweight Attention Network) was introduced. This module integrated Progressive Convolution, which captured hierarchical and local contextual features, with Re-parameterized Convolution (RepConv), which reduced computational complexity during inference by transforming multi-branch structures into a single-path representation. The combination enhanced the model's feature representation and gradient propagation while reducing the number of parameters and FLOPs. Furthermore, RGCSPELAN was designed with a scalable architecture, allowing its computational capacity to be adjusted via a scaling factor. This ensured compatibility with both small and large models, facilitating flexible deployment on resource-constrained edge devices and high-performance systems alike. Second, to address the loss of small-target features, a new module termed Deep Robust Feature Downsampling (DRFD) was proposed. DRFD employed a multi-scale feature fusion strategy that integrated multiple downsampling branches (e.g., convolutional, cut-based, and max-pooling pathways). This design enabled the model to retain fine-grained spatial details while expanding its receptive field. Third, the original loss function in RT-DETR was replaced with Focaler-CIoU, and an adaptive regression optimization strategy integrating sample reweighting and geometric constraints was implemented to improve bounding box localization under complex conditions.

[Results and Discussions] Experimental results demonstrated that CM-DETR achieved mAP50 and mAP50-95 scores of 80.8% and 51.1%, respectively, outperforming the baseline model by 3.7 and 1.8 percentage points. Meanwhile, CM-DETR reduced the parameter count and computational cost by 31.7% and 22.8%, respectively, indicating a favorable trade-off between detection accuracy and model efficiency. The incorporation of the DRFD module enhanced the model's sensitivity to small and subtly distinct features related to cotton maturity, improved robustness under diverse field conditions, and enabled more precise detection of cotton bolls at different growth stages. Moreover, the optimized regression strategy contributed to more stable bounding box prediction in scenarios involving occlusion, scale variation, and dense foliage. Overall, the proposed architectural improvements strengthened feature representation while keeping the model lightweight, demonstrating practical applicability in real-time agricultural environments.

[Conclusions] The proposed CM-DETR model provides an efficient and scalable solution for automated cotton maturity detection. By enhancing multi-stage feature recognition, improving small-target sensitivity, and reducing the demand on computational resources, CM-DETR can serve as a reliable tool for intelligent decision-making in precision agriculture. Its practical deployment can support more accurate timing of irrigation, fertilization, and harvesting, thereby contributing to improved crop management and yield optimization.
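
The Methods describe RepConv as collapsing a multi-branch structure into a single path at inference time. The following is a minimal one-dimensional sketch of that idea, assuming a three-branch block (a 3-tap convolution, a 1-tap convolution, and an identity shortcut) with no batch normalization. The function names and kernel values are hypothetical and are not taken from the paper; the actual RGCSPELAN module is not reproduced here.

```python
def conv1d_same(x, kernel):
    """Cross-correlate x with an odd-length kernel, zero-padded so the output keeps len(x)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def multi_branch(x, k3, k1):
    """Training-time form: 3-tap branch + 1-tap branch + identity branch, summed."""
    y3 = conv1d_same(x, k3)
    y1 = conv1d_same(x, k1)
    return [a + b + c for a, b, c in zip(y3, y1, x)]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1-tap and identity branches into the centre tap."""
    fused = list(k3)
    fused[1] += k1[0] + 1.0  # the 1-tap weight and the identity (weight 1) act on the centre sample
    return fused


x = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.2, -0.5, 0.1]  # hypothetical 3-tap weights
k1 = [0.7]             # hypothetical 1-tap weight

y_train = multi_branch(x, k3, k1)                 # three convolution-like passes
y_infer = conv1d_same(x, fuse_branches(k3, k1))   # one pass with the fused kernel
max_diff = max(abs(a - b) for a, b in zip(y_train, y_infer))
print("largest difference between the two forms:", max_diff)
```

Because convolution is linear, the fused kernel gives the same output as the sum of the branches up to floating-point rounding, which is why the extra branches cost nothing at inference.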

Key words: cotton maturity, RT-DETR, object detection, RGCSPELAN, DRFD, Focaler-CIoU loss function

CLC Number: TP391

SHI Qimeng, WANG Jun, XU Xiaofeng, ZHANG Weiyi. Cotton Maturity Detection Algorithm Based on Improved RT-DETR[J]. Smart Agriculture, doi: 10.12133/j.smartag.SA202512013.

Figures/Tables (16): Fig. 1, Table 1, Fig. 2, Fig. 3, Fig. 4, Fig. 5, Fig. 6, Table 2, Table 3, Table 4, Fig. 7, Table 5, Table 6, Fig. 8, Table 7, Fig. 9

Cotton maturity category	Number of images
Flower bud stage	1,378
Flowering stage	602
Early boll stage	3,667
Boll-cracking stage	400
Mature boll stage	2,757

RT-DETR	RGCSPELAN	DRFD	Focaler-CIoU	P/%	R/%	F1/%	mAP50/%	mAP50-95/%	Params/M	FLOPs/G	FPS/(frames/s)
√	×	×	×	79.5	74.5	76.9	77.1	49.3	19.9	57.0	77.3
√	√	×	×	81.4	75.3	78.2	78.2	50.3	13.8	44.5	83.8
√	×	√	×	83.3	74.6	78.7	77.8	49.3	19.6	56.5	75.9
√	×	×	√	80.2	76.8	78.5	78.9	49.8	19.9	57.0	76.4
√	√	√	×	82.2	76.6	79.3	78.8	49.4	13.6	44.0	83.3
√	√	×	√	79.8	77.7	78.7	79.7	50.1	13.8	44.5	87.6
√	×	√	√	79.7	74.9	77.2	78.3	49.9	19.6	56.5	74.6
√	√	√	√	82.1	77.4	79.7	80.8	51.1	13.6	44.0	83.6
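
The headline figures in the abstract can be recomputed from the first row (baseline RT-DETR) and the last row (all three modules enabled) of the ablation table above. The short check below uses values copied from those two rows; the helper and variable names are illustrative only.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline * 100.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2.0 * precision * recall / (precision + recall)


# Values copied from the ablation table: baseline row and full CM-DETR row.
baseline = {"P": 79.5, "R": 74.5, "mAP50": 77.1, "mAP50_95": 49.3,
            "params_M": 19.9, "flops_G": 57.0}
cm_detr = {"P": 82.1, "R": 77.4, "mAP50": 80.8, "mAP50_95": 51.1,
           "params_M": 13.6, "flops_G": 44.0}

param_cut = relative_reduction(baseline["params_M"], cm_detr["params_M"])
flops_cut = relative_reduction(baseline["flops_G"], cm_detr["flops_G"])
map50_gain = cm_detr["mAP50"] - baseline["mAP50"]
map50_95_gain = cm_detr["mAP50_95"] - baseline["mAP50_95"]
f1_base = f1_score(baseline["P"], baseline["R"])
f1_full = f1_score(cm_detr["P"], cm_detr["R"])

print(f"parameter reduction: {param_cut:.1f}%")            # 31.7%
print(f"FLOPs reduction:     {flops_cut:.1f}%")            # 22.8%
print(f"mAP50 gain:          {map50_gain:.1f} points")     # 3.7 points
print(f"mAP50-95 gain:       {map50_95_gain:.1f} points")  # 1.8 points
print(f"F1 baseline / full:  {f1_base:.1f} / {f1_full:.1f}")  # 76.9 / 79.7
```

The recomputed values agree with the 31.7% and 22.8% reductions and the 3.7 and 1.8 percentage-point gains stated in the abstract, and with the F1 column of the table.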

Model	P/%	R/%	mAP50/%	mAP50-95/%	Params/M	FLOPs/G	FPS/(frames/s)
RT-DETR (baseline)	79.5	74.5	77.1	49.3	19.9	57.0	42.8
+RGCSPELAN	81.4	75.3	78.2	50.3	13.8	44.5	52.8
+EMBSFPN	81.6	72.7	76.6	49.2	17.9	48.6	33.9
+PACAPN	81.1	75.1	76.4	48.9	19.9	38.8	38.8
+Context Guided	80.1	73.7	77.1	48.7	16.5	47.6	43.7
+CGRFPN	81.7	75.2	76.0	48.9	19.2	48.2	41.0

α value	P/%	R/%	F1/%	mAP50/%	mAP50-95/%
0.250	81.5	76.7	79.0	79.4	49.9
0.375	80.9	75.6	78.2	78.1	48.7
0.500	82.1	77.4	79.7	80.8	51.1
0.625	80.2	75.7	77.9	80.2	50.6
0.750	79.5	75.1	77.2	78.5	49.5

d value	u value	P/%	R/%	F1/%	mAP50/%	mAP50-95/%
0.20	0.90	80.8	76.0	78.3	79.9	50.2
0.25	0.70	81.7	75.6	78.5	77.9	48.7
0.40	0.85	80.2	72.6	76.2	76.1	48.2
0.55	0.75	82.3	73.4	77.6	78.5	49.2
0.30	0.80	77.4	71.3	74.2	74.6	47.4
0.50	0.60	82.1	77.4	79.7	80.8	51.1
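
The table above varies two parameters, d and u. Assuming these are the lower and upper bounds of the linear interval mapping used by Focaler-IoU, the loss can be sketched as below. The sketch follows the commonly stated formulation (the IoU is rescaled to 0 below d, to 1 above u, and linearly in between, and the Focaler-CIoU loss is the CIoU loss plus IoU minus the rescaled IoU). It is not the authors' implementation, it omits the adaptive reweighting strategy mentioned in the abstract, and the function names and example boxes are hypothetical. The defaults d = 0.50 and u = 0.60 are taken from the best-performing row of the table.

```python
import math


def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def ciou_loss(pred, gt):
    """CIoU loss: 1 - IoU + centre-distance penalty + aspect-ratio penalty."""
    iou = iou_xyxy(pred, gt)
    # Squared distance between box centres.
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((gt[2] - gt[0]) / (gt[3] - gt[1]))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    ) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v


def focaler_iou(iou, d, u):
    """Linear interval mapping: 0 below d, 1 above u, linear in between."""
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_ciou_loss(pred, gt, d=0.50, u=0.60):
    """Focaler-CIoU loss, assumed here to be L_CIoU + IoU - IoU_focaler."""
    iou = iou_xyxy(pred, gt)
    return ciou_loss(pred, gt) + iou - focaler_iou(iou, d, u)


pred_box = (0.0, 0.0, 2.0, 2.0)  # hypothetical predicted box
gt_box = (1.0, 1.0, 3.0, 3.0)    # hypothetical ground-truth box
print("IoU:", iou_xyxy(pred_box, gt_box))
print("CIoU loss:", ciou_loss(pred_box, gt_box))
print("Focaler-CIoU loss:", focaler_ciou_loss(pred_box, gt_box))
```

Narrowing or shifting the interval [d, u] changes which IoU range contributes a gradient through the rescaled term, which is one plausible reason the detection metrics in the table are sensitive to the choice of d and u.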