Welcome to Smart Agriculture 中文

Smart Agriculture

   

Lodging Region Detection in Flax Based on a Lightweight Improved YOLOv11n-seg Model

SU Yujie1(), LI Yue1,2(), WEI Linjing1, WU Bing2,3, GUO Linhai1, YAN Bin2,4, ZHOU Hui1, GAO Yuhong2,4, KANG Lianghe1, LIU Huan1, SU Shunchang1   

  1. 1. College of Information Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
    2. State Key Laboratory of Crop Biology in Dryland, Lanzhou 730070, China
    3. College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
    4. College of Agronomy, Gansu Agricultural University, Lanzhou 730070, China
  • Received:2025-08-14 Online:2025-11-14
  • Foundation items:National Natural Science Foundation of China (NSFC) Projects(32460443); Key Project of Gansu Provincial Science and Technology Plan-Natural Science Foundation(23JRRA1403); National Foreign Experts Project of the Ministry of Science and Technology(G2022042005L); Industry Support Project of Higher Education Institutions in Gansu Province(2023CYZC-54); Key R&D Program of Gansu Province(23YFWA0013); High-Level Foreign Experts Recruitment Program of Gansu Province(25RCKA015); National Technology System for Specialty Oil Crops(CARS-14-1-16)
  • corresponding author:
    LI Yue, E-mail:

Abstract:

[Objective] Lodging is a major agronomic constraint that adversely affects both yield and quality in field crops, with flax (Linum usitatissimum L.) being especially vulnerable due to its slender stems and susceptibility to wind and rainfall. Precise delineation of lodged areas from field imagery remains a significant challenge owing to the complex and heterogeneous morphology of lodging patterns, irregular and blurred boundaries, and substantial background interference from upright plants, weeds, and soil textures. These factors necessitate the development of a segmentation framework that combines high precision and strong boundary adherence with computational efficiency, enabling deployment on resource-constrained agricultural monitoring platforms. In response to this need, a lightweight yet accurate lodging segmentation approach based on an improved YOLOv11n-seg architecture was proposed. The proposed method is specifically designed to enhance fine-grained feature sensitivity, multi-scale representation capability, and boundary precision, while markedly reducing parameter count,giga floating-point operations (GFLOPs), and model size. [Methods] The proposed architecture integrated targeted modifications across the backbone, neck, and output stages. In the backbone, standard C3k2 modules were replaced with C3k2_SDW blocks, which combined a StarBlock structure with depthwise separable convolutions to reduce redundancy and computation without sacrificing spatial and contextual representational capacity. To counteract potential reductions in channel discrimination resulting from light-weighting, a multi-scale efficient channel attention (MS-ECA) mechanism was embedded within selected backbone layers, yielding C3k2_SDW_MS-ECA modules. These modules incorporated parallel convolution branches with varying kernel sizes to capture channel-wise dependencies across multiple receptive fields, thereby adaptively recalibrating lodging-related features with minimal computational overhead. In the neck, a bidirectional feature pyramid network (BiFPN) was introduced to facilitate efficient bidirectional information exchange between scales. By assigning normalized, trainable fusion weights, the BiFPN adaptively balanced contributions from low- and high-level feature maps, while a multi-stage semantic fusion strategy further enriched the integration of spatial details and contextual semantics, thereby improving the detection of small and fragmented lodged patches. At the output stage, a boundary refinement procedure was applied to the predicted masks, improving contour sharpness, enhancing boundary compactness, and mitigating false detections in complex visual environments.The experimental dataset comprised unmanned aerial vehicle (UAV) RGB imagery at a resolution of 4 032×2 268 pixels, acquired from flax fields in Dingxi, Gansu province. Lodged regions were manually annotated with polygonal masks. To increase robustness against variability in illumination, background complexity, and lodging morphology, data augmentation techniques—including random rotation, brightness and contrast adjustment, and blurring—were employed, expanding the dataset to 3 852 images. The dataset was divided into training, validation, and testing subsets in a 75%, 15% and 10% split. Model training was conducted with 640×640 pixel inputs for 300 epochs using stochastic gradient descent (initial learning rate 0.01, momentum 0.937, weight decay 0.0005) in PyTorch 2.0.0. Evaluation involved comparison with YOLACT, YOLOv7-seg, YOLOv8n-seg, and the original YOLOv11n-seg using precision (P), recall (R), mAP@0.5, mAP50–95, parameter count, GFLOPs, and model size. [Results and Discussions] Ablation experiments demonstrated the incremental contributions of each architectural component. Substituting C3k2 with C3k2_SDW reduced parameters from 2.83 M to 2.14 M and computation from 10.2 to 8.1 GFLOPs, with slight performance improvements. Incorporating BiFPN further lowered complexity to 1.68 M parameters and 7.7 GFLOPs, accompanied by notable gains in detection metrics. The addition of MS-ECA attention achieved the highest performance, delivering P of 92.6%, R of 92.0%, and mAP@0.5 of 95.2%, corresponding to improvements of 3.7 percentage points in Precision and 2.1 percentage points in mAP@0.5 over the YOLOv11n-seg baseline, without increasing model size. Qualitative Grad-CAM visualizations revealed more precise focus on lodging regions and reduced false activations in upright stems and non-lodged soil areas. Generalization capability was further validated on the public WE3DS agricultural segmentation dataset, where the proposed model achieved average improvements of 4.3, 1.9, and 2.6 percentage points in precision, recall, and mAP@0.5, respectively, compared to the baseline. [Conclusions] The improved YOLOv11n-seg architecture achieves a superior balance between accuracy and efficiency for flax lodging segmentation. By combining the C3k2_SDW_MS-ECA backbone, BiFPN with multi-stage semantic fusion in the neck, and output boundary refinement. This combination of high accuracy, lightweight design, and robust boundary delineation renders the model highly applicable to real-time, in-field deployment for intelligent lodging monitoring and precision agriculture. The results further suggest that the approach is transferable to broader agricultural segmentation tasks, providing a practical and scalable solution for modern smart farming applications.

Key words: flax, image segmentation, lightweight model, lodging detection, YOLOv11n-seg

CLC Number: