
Smart Agriculture

   

Cotton Maturity Detection Algorithm Based on Improved RT-DETR

SHI Qimeng, WANG Jun, XU Xiaofeng, ZHANG Weiyi

  1. School of Computer and Information Science, Anhui Polytechnic University, Wuhu 241000, China
  • Received: 2025-12-12; Online: 2026-03-13
  • Foundation items: National Natural Science Foundation of China (62406004); Collaborative Innovation Project of Anhui Higher Education Institutions (GXXT-2019-020)
  • About author:

    SHI Qimeng, E-mail:

  • corresponding author:
    WANG Jun, E-mail:

Abstract:

[Objective] Precision management of cotton growth must account for the large differences in water and fertilizer demand across developmental stages. Cotton maturity assessment is therefore a key task in precision agriculture, supporting timely irrigation, fertilization, and harvesting decisions. However, traditional monitoring approaches are time-consuming and labor-intensive, and current deep learning-based models often struggle to recognize cotton bolls at varying maturity stages, especially in complex field environments with dense foliage, occlusion, and illumination changes. To address these challenges, a high-accuracy, lightweight computer vision model was proposed for cotton maturity detection, intended to provide reliable technical support for precise water and fertilizer regulation and quality enhancement in cotton production. [Methods] An enhanced detection framework named Cotton Maturity-Detection Transformer (CM-DETR) was proposed, based on an improved RT-DETR architecture. CM-DETR incorporated three architectural changes aimed at improving both detection accuracy and computational efficiency. First, to construct a lightweight and efficient backbone, a novel feature extraction module named RGCSPELAN (Re-parameterized Group Convolution Spatial Enhancement Lightweight Attention Network) was introduced. This module integrated Progressive Convolution, which captured hierarchical and local contextual features, with Re-parameterized Convolution (RepConv), which reduced computational complexity during inference by transforming multi-branch structures into a single-path representation. The combination enhanced the model's feature representation and gradient propagation while reducing the number of parameters and FLOPs.
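The re-parameterization idea behind RepConv can be illustrated with a minimal sketch. It assumes a single channel, no batch normalization, and three parallel branches (a 3x3 convolution, a 1x1 convolution, and an identity shortcut); the branch layout, the function names, and the toy values are illustrative assumptions, not the configuration used in CM-DETR. Because convolution is linear, the three branches can be folded into one 3x3 kernel that gives the same output with a single pass.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse_kernels(k3, k1, with_identity=True):
    """Fold a 1x1 kernel (and optionally an identity branch) into a 3x3 kernel.

    The 1x1 kernel and the identity mapping both act only on the centre
    pixel, so they are added to the centre weight of the 3x3 kernel.
    """
    fused = [row[:] for row in k3]
    fused[1][1] += k1[0][0]
    if with_identity:
        fused[1][1] += 1.0
    return fused


# Toy input and weights (hypothetical values for illustration only).
image = [
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 3.0, 2.0],
    [2.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 2.0],
]
k3 = [[0.1, -0.2, 0.0], [0.3, 0.5, -0.1], [0.0, 0.2, 0.1]]
k1 = [[0.7]]

# Training-time form: three branches evaluated separately, then summed.
multi_branch = add_maps(
    add_maps(conv2d_same(image, k3), conv2d_same(image, k1)), image
)

# Inference-time form: one convolution with the fused kernel.
single_path = conv2d_same(image, fuse_kernels(k3, k1))

max_diff = max(
    abs(a - b)
    for ra, rb in zip(multi_branch, single_path)
    for a, b in zip(ra, rb)
)
```

In this sketch `max_diff` is zero up to floating-point rounding, which is why the fused form can replace the multi-branch form at inference. A practical implementation would also fold batch-normalization statistics into the fused weights and bias and would operate on multi-channel tensors.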
Furthermore, RGCSPELAN was designed with a scalable architecture, allowing its computational capacity to be adjusted via a scaling factor. This ensured compatibility with both small and large models, facilitating flexible deployment on resource-constrained edge devices and high-performance systems alike. Second, to address the loss of small-target features, a new module termed Deep Robust Feature Downsampling (DRFD) was proposed. DRFD employed a multi-scale feature fusion strategy that integrated multiple downsampling branches (e.g., convolutional, cut-based, and max-pooling pathways). This design enabled the model to retain fine-grained spatial details while expanding its receptive field. Third, the original loss function in RT-DETR was replaced with Focaler-CIoU, and an adaptive regression optimization strategy integrating sample reweighting and geometric constraints was implemented to improve bounding box localization under complex conditions. [Results and Discussions] Experimental results demonstrated that CM-DETR achieved mAP50 and mAP50-95 scores of 80.8% and 51.1%, respectively, outperforming the baseline model by 3.7 and 1.8 percentage points, respectively. Meanwhile, CM-DETR reduced the parameter count and computational cost by 31.7% and 22.8%, respectively, indicating a favorable trade-off between detection accuracy and model efficiency. The incorporation of the DRFD module enhanced the model's sensitivity to small and subtly distinct features related to cotton maturity, improved robustness under diverse field conditions, and enabled more precise detection of cotton bolls at different growth stages. Moreover, the optimized regression strategy contributed to more stable bounding box predictions in scenarios involving occlusion, scale variation, and dense foliage.
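The regression loss named in the abstract can be sketched as follows, assuming the commonly published formulation of Focaler-IoU, in which the IoU is linearly rescaled over an interval [d, u] and the Focaler-CIoU loss is the CIoU loss plus IoU minus the rescaled IoU. The thresholds `d` and `u`, the box format (x1, y1, x2, y2), and the function name are assumptions for illustration; the abstract does not state the values or the exact reweighting used in CM-DETR.

```python
import math


def focaler_ciou_loss(pred, target, d=0.0, u=0.95, eps=1e-9):
    """Focaler-CIoU loss for two boxes given as (x1, y1, x2, y2).

    d and u are assumed interval bounds for the linear IoU remapping.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # CIoU terms: normalized centre distance and aspect-ratio consistency.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = (((px1 + px2) - (tx1 + tx2)) ** 2
            + ((py1 + py2) - (ty1 + ty2)) ** 2) / 4.0
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)
    ciou_loss = 1.0 - iou + rho2 / c2 + alpha * v

    # Focaler remapping: 0 below d, 1 above u, linear in between.
    iou_focaler = min(max((iou - d) / (u - d), 0.0), 1.0)

    return ciou_loss + iou - iou_focaler


# Toy boxes (hypothetical values for illustration only).
gt = (0.0, 0.0, 2.0, 2.0)
loss_same = focaler_ciou_loss(gt, gt)
loss_shifted = focaler_ciou_loss((1.0, 0.0, 3.0, 2.0), gt)
loss_disjoint = focaler_ciou_loss((5.0, 5.0, 7.0, 7.0), gt)
```

Changing `d` and `u` shifts which samples contribute most to the gradient: a narrow interval near high IoU emphasizes easy samples, while an interval near low IoU emphasizes hard ones. This is one way to realize sample reweighting on top of the geometric constraints of CIoU.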
Overall, the proposed architectural improvements strengthened feature representation while keeping the model lightweight, demonstrating practical applicability in real-time agricultural environments. [Conclusions] The proposed CM-DETR model provides an efficient and scalable solution for automated cotton maturity detection. By enhancing multi-stage feature recognition, improving small-target sensitivity, and reducing the demand on computational resources, CM-DETR serves as a reliable tool for intelligent decision-making in precision agriculture. Its practical deployment can support more accurate timing of irrigation, fertilization, and harvesting, thereby contributing to improved crop management and yield optimization.
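The multi-branch downsampling idea behind DRFD, which the abstract credits for the improved small-target sensitivity, can be sketched on a single-channel feature map. The sketch assumes three branches that each halve the resolution: a cut-based branch (splitting the map into four sub-maps by pixel parity), a max-pooling branch, and a stride-2 3x3 convolution. The function names, the averaging kernel, and the plain channel concatenation are illustrative assumptions; the actual module would use learned multi-channel layers and fuse the concatenated branches with further convolutions.

```python
def cut_downsample(x):
    """Cut-based branch: four half-resolution maps, one per pixel parity.

    No value is discarded, so fine spatial detail is preserved.
    """
    return [
        [row[0::2] for row in x[0::2]],
        [row[1::2] for row in x[0::2]],
        [row[0::2] for row in x[1::2]],
        [row[1::2] for row in x[1::2]],
    ]


def maxpool_downsample(x):
    """Max-pooling branch: 2x2 window, stride 2."""
    h, w = len(x), len(x[0])
    return [
        [max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def conv_downsample(x, kernel):
    """Convolutional branch: 3x3 kernel, stride 2, zero padding 1."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            acc = 0.0
            for di in range(3):
                for dj in range(3):
                    y, xx = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= xx < w:
                        acc += x[y][xx] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def drfd_like(x, kernel):
    """Concatenate all branches along the channel axis (a list of 2D maps)."""
    return cut_downsample(x) + [maxpool_downsample(x), conv_downsample(x, kernel)]


# Toy 4x4 feature map and an averaging kernel (hypothetical values).
feature = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
avg_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

channels = drfd_like(feature, avg_kernel)
```

Here `channels` holds six 2x2 maps. The cut branch keeps every input value, the max-pooling branch keeps the strongest local responses, and the convolutional branch aggregates a wider neighbourhood, which matches the stated goal of retaining fine-grained detail while expanding the receptive field.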

Key words: cotton maturity, RT-DETR, object detection, RGCSPELAN, DRFD, Focaler-CIoU loss function

CLC Number: