[Objective] Manual sorting in the post-harvest processing of Euryale ferox suffers from high labor cost, low efficiency, and poor consistency. To provide technical support for intelligent processing with a high-precision, lightweight, real-time sorting model, an appearance defect detection algorithm based on an improved YOLOv11n model is proposed. [Methods] YOLOv11n was improved in three respects: feature extraction, the downsampling mechanism, and the loss function. First, the Universal Perception Large‑Kernel ConvNet Block (UniRepLKNetBlock) was integrated into the C3k2 structure of the neck network to construct a feature extraction module named CURK (C3k2‑UniRepLKNetBlock). This module used a depthwise large‑kernel convolution as the main branch, with four parallel convolutional branches alongside it. After training, these branches were merged into a single convolutional layer via structural re‑parameterization, which enlarged the effective receptive field and enhanced the model's representation capability. Second, a Depth Lightweight Adaptive Extraction (DLAE) module replaced the standard convolutional downsampling layer at the 8th layer of the backbone. DLAE adopted a parallel two‑branch design: a feature extraction branch based on depthwise separable convolution (DWConv) captured local texture details, and an attention branch generated spatial attention weights through global average pooling and a 1×1 DWConv, followed by Softmax normalization. The attention weights were then multiplied channel‑wise with the features, which reduced the computational load while adaptively enhancing key defect regions and suppressing background noise. Third, the Scale Dynamic IoU Loss (SDIoU) was introduced to replace the original loss function. 
In the BBox branch, SDIoU calculated a dynamic coefficient from the ratio of the target bounding box area to a preset maximum target area, combined with the feature map compression ratio (ROC); the coefficient was clipped at a threshold and used to adjust the weights of the scale loss and the location loss automatically. A similar mechanism was applied in the Mask branch. An in-house Euryale ferox appearance defect image dataset was constructed. A total of 2 537 original images were collected, and data augmentation expanded the dataset to 5 354 images covering five categories: qualified, surface scratch, broken, dark (overripe), and shell. The dataset was randomly divided into training, validation, and test sets in a 7:1:2 ratio. [Results and Discussions] Ablation experiments showed that the combination of CURK, DLAE, and SDIoU achieved the best overall performance while remaining lightweight: precision was 95.4%, recall was 92.8%, and mAP50 reached 97.4%. Compared with the baseline YOLOv11n, recall increased by 2.9 percentage points and mAP50 by 0.4 percentage points. The model weight file size was reduced to 4.9 MB, the parameter count to 2.31 M, and the computational cost to 6.1 GFLOPs. The inference speed reached 189.2 f/s, meeting real‑time detection requirements. In comparative experiments with mainstream models, the proposed model achieved the highest mAP50 (97.4%) together with the lowest parameter count and computational cost. Heatmap visualization indicated that after integrating CURK, DLAE, and SDIoU, the model's attention on Euryale ferox target regions became more concentrated, background interference was suppressed, and missed and false detections were reduced. A physical visual detection validation platform was built, and 402 samples each from Tianchang city and Funan county were tested. 
The overall accuracy was 94.5% for the Tianchang samples and 90.0% for the Funan samples, an average of 92.25%, indicating that the model generalizes well and remains usable in an engineering setting under different imaging conditions and geographical origins. [Conclusions] Through the combined effects of the CURK large‑kernel re‑parameterization module, the DLAE lightweight adaptive downsampling module, and the SDIoU scale‑dynamic loss function, the improved YOLOv11n model maintains high detection accuracy while remaining lightweight and real‑time capable, demonstrating good potential for engineering application. It provides an efficient, accurate, and lightweight technical reference for the intelligent sorting of Euryale ferox.
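
The structural re‑parameterization step described in [Methods] rests on the linearity of convolution: parallel branches whose outputs are summed can be folded into one larger kernel. The following is a minimal one-dimensional sketch of that idea in plain Python, not the authors' implementation. The kernel sizes, dilation rates, weights, and all function names are assumptions chosen for illustration, and batch normalization fusion, which a full re-parameterization would also need, is omitted.

```python
def conv1d_same(x, kernel, dilation=1):
    """Cross-correlate x with kernel, zero-padded so the output keeps len(x)."""
    half = dilation * (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j * dilation - half
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def dilated_to_dense(kernel, dilation):
    """Insert (dilation - 1) zeros between taps: an equivalent non-dilated kernel."""
    dense = []
    for j, w in enumerate(kernel):
        dense.append(w)
        if j < len(kernel) - 1:
            dense.extend([0.0] * (dilation - 1))
    return dense


def pad_to(kernel, size):
    """Zero-pad an odd-length kernel symmetrically up to an odd target size."""
    pad = (size - len(kernel)) // 2
    return [0.0] * pad + list(kernel) + [0.0] * pad


def merge_branches(main_kernel, branches):
    """Fold (kernel, dilation) branches into the large main kernel."""
    merged = list(main_kernel)
    for kernel, dilation in branches:
        dense = pad_to(dilated_to_dense(kernel, dilation), len(main_kernel))
        merged = [m + d for m, d in zip(merged, dense)]
    return merged


def multi_branch_forward(x, main_kernel, branches):
    """Training-time form: sum the outputs of the main branch and each parallel branch."""
    total = conv1d_same(x, main_kernel)
    for kernel, dilation in branches:
        y = conv1d_same(x, kernel, dilation)
        total = [t + v for t, v in zip(total, y)]
    return total


if __name__ == "__main__":
    # Hypothetical weights: one 7-tap main kernel plus four smaller branches.
    x = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0, 4.0, -2.0, 0.75]
    main = [0.1, -0.2, 0.3, 0.4, -0.1, 0.2, 0.05]
    branches = [
        ([0.3, -0.1, 0.2, 0.1, -0.4], 1),
        ([0.5, 0.25, -0.5], 2),
        ([0.2, -0.3, 0.1], 3),
        ([1.0, -1.0, 0.5], 1),
    ]
    multi = multi_branch_forward(x, main, branches)
    single = conv1d_same(x, merge_branches(main, branches))
    assert all(abs(a - b) < 1e-9 for a, b in zip(multi, single))
```

The merged kernel reproduces the summed multi-branch output, so inference needs only one convolution. In the two-dimensional depthwise case the same zero-insertion and padding would be applied per channel along both spatial axes.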
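
The scale-dynamic weighting in the BBox branch can be illustrated with a short sketch. The abstract states only that a coefficient is computed from the box-to-maximum area ratio and ROC, clipped at a threshold, and used to rebalance the scale and location losses; it does not give the formula. The form below, including the threshold `delta`, the complementary weights `1 - delta + beta` and `1 + delta - beta`, and the function name, is therefore an assumption for illustration rather than the paper's definition.

```python
def scale_dynamic_weights(box_area, max_area, roc, delta=0.5):
    """Return (scale_weight, location_weight) for one target box.

    Assumed form (not taken from the abstract):
        beta = min(box_area / max_area * roc * delta, delta)
        scale_weight = 1 - delta + beta
        location_weight = 1 + delta - beta
    """
    if max_area <= 0:
        raise ValueError("max_area must be positive")
    # Dynamic coefficient, clipped at the threshold delta.
    beta = min(box_area / max_area * roc * delta, delta)
    scale_weight = 1.0 - delta + beta
    location_weight = 1.0 + delta - beta
    return scale_weight, location_weight


def combined_box_loss(scale_loss, location_loss, box_area, max_area, roc, delta=0.5):
    """Weighted sum of a scale (IoU-type) loss and a location loss for one box."""
    w_scale, w_loc = scale_dynamic_weights(box_area, max_area, roc, delta)
    return w_scale * scale_loss + w_loc * location_loss
```

Under these assumptions, a very small box shifts weight toward the location loss, where overlap-based terms are least stable, while a box at or above the clipping point receives equal weights for both terms.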