0 Introduction
1 Materials and methods
1.1 Dataset
Fig. 1 Camera deployment and installation
Table 1 Definitions of 5 typical Hu sheep behaviors
| Behavior ID | Typical behavior | Description |
|---|---|---|
| 0 | Drinking | Standing with limbs outstretched and drinking water while looking down at the pipe |
| 1 | Eating | Standing with limbs extended and head lowered to eat in front of the feeding trough |
| 2 | Lying | Body resting on the ground, with limbs bent or tucked |
| 3 | Licking | Standing with limbs outstretched and licking the salt brick |
| 4 | Standing | Standing on all limbs with the body supported, and not actively engaged in eating, drinking, or licking |
Fig. 2 Examples of the 5 annotated Hu sheep behaviors
1.2 Dataset features and analysis
Fig. 3 Distribution of behavior categories in SheepDB
Fig. 4 Representative examples of SheepDB under different illumination conditions and object densities
1.3 Method
1.3.1 Object detection algorithm (YOLO)
1.3.2 Adaptive detail-contextual attention (DCAttention)
Fig. 5 DCAttention computation process
1.3.3 DCAC3K2
Fig. 6 DCAC3K2 structure
1.3.4 Comprehensive classification-quality focal loss (CQFL)
1.3.5 Bounding box similarity soft-NMS (BS-NMS)
Fig. 7 Pseudocode for BS-NMS
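For orientation, the following is a minimal sketch of standard Gaussian soft-NMS, the family of methods that BS-NMS extends by name. It is not the BS-NMS procedure of Fig. 7: the bounding box similarity measure used by BS-NMS is not reproduced in this outline, so plain IoU stands in for it here as an assumption. The function names `iou` and `gaussian_soft_nms` and the parameters `sigma` and `score_thresh` are illustrative choices, not taken from the paper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def gaussian_soft_nms(
    boxes: List[Box],
    scores: List[float],
    sigma: float = 0.5,          # assumed decay width
    score_thresh: float = 0.001,  # assumed pruning threshold
) -> List[Tuple[int, float]]:
    """Return (original index, rescored confidence) in selection order.

    Instead of discarding boxes that overlap the current best box,
    their scores are decayed by exp(-overlap^2 / sigma).
    """
    remaining = [[box, score, idx] for idx, (box, score) in enumerate(zip(boxes, scores))]
    kept: List[Tuple[int, float]] = []
    while remaining:
        # Select the highest-scoring box still in play.
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_box, best_score, best_idx = remaining.pop(best)
        kept.append((best_idx, best_score))
        # Decay the scores of the others according to their overlap with it.
        for item in remaining:
            overlap = iou(best_box, item[0])  # IoU is a stand-in similarity
            item[1] *= math.exp(-(overlap ** 2) / sigma)
        # Drop boxes whose score has decayed below the threshold.
        remaining = [item for item in remaining if item[1] >= score_thresh]
    return kept
```

A similarity-aware variant would replace the `iou` call with its own box similarity term while keeping the same select, decay, and prune loop.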
1.3.6 Light encoder
Fig. 8 Architecture of the Light Encoder and its Classification Head
1.3.7 Overall architecture
Fig. 9 DC-YOLO structure
1.4 Experimental environment and evaluation metrics
1.4.1 Experimental environment
Table 2 Experimental environment
| Configuration item | Value |
|---|---|
| CPU | Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz |
| GPU | NVIDIA GeForce RTX 4090 |
| OS | Ubuntu 20.04 |
| Programming language | Python 3.8 |
| Deep learning framework | PyTorch 2.0.0 |
| CUDA version | CUDA 11.8 |
| System RAM | 120 GB |
1.4.2 Evaluation metrics
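The results tables report precision (P), recall (R), and mean average precision (mAP). As a reminder of how these quantities are conventionally defined, here is a minimal sketch, assuming detections have already been matched to ground truth at a fixed IoU threshold. The function names are illustrative, and the all-point interpolation used for AP is one common convention; the exact interpolation scheme used in the paper's experiments is not stated in this outline.

```python
from typing import List, Sequence, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(is_tp: Sequence[bool], num_gt: int) -> float:
    """All-point interpolated AP for one class.

    is_tp: one flag per detection, sorted by descending confidence,
           True if the detection matched a ground-truth box.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0 or not is_tp:
        return 0.0
    tp = fp = 0
    precisions: List[float] = []
    recalls: List[float] = []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision with the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Integrate precision over recall.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

Under this convention, mAP50 uses an IoU threshold of 0.5 for matching, and mAP50:95 averages the result over IoU thresholds from 0.5 to 0.95 in steps of 0.05.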
2 Results and analysis
2.1 Ablation experiments and analysis
Table 3 Results of ablation experiments of DC-YOLO
| Model | DCAC3K2 | CQFL | BS-NMS | A2C2f | P/% | R/% | mAP50/% | mAP50:95/% | Parameters/M | t_infer/ms | t_post/ms |
|---|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv12 | × | × | × | √ | 81.6 | 78.5 | 83.6 | 64.9 | 2.51 | 172.8 | 0.37 |
| 1 | √ | × | × | √ | 79.0 | 87.7 | 86.7 | 66.0 | 2.17 | 130.3 | 0.31 |
| 2 | × | √ | × | √ | 85.9 | 74.5 | 85.5 | 66.9 | 2.51 | 165.3 | 0.39 |
| 3 | × | × | √ | √ | 83.7 | 82.8 | 87.6 | 68.6 | 2.51 | 168.2 | 1.60 |
| 4 | √ | √ | × | √ | 86.5 | 86.2 | 88.0 | 71.5 | 2.17 | 143.2 | 0.37 |
| 5 | √ | √ | √ | √ | 91.0 | 86.8 | 90.1 | 73.8 | 2.17 | 133.5 | 1.74 |
| 6 | √ | √ | × | × | 85.9 | 86.1 | 88.9 | 73.5 | 2.29 | 163.7 | 1.71 |
| DC-YOLO | √ | √ | √ | × | 91.0 | 86.6 | 91.4 | 75.9 | 2.29 | 115.5 | 1.69 |
Table 4 Results of ablation experiments based on Light Encoder and pretrained model of DC-YOLO
| Model | Pre-training method | mAP/% (daytime natural) | mAP/% (nighttime) | mAP/% (abnormal lighting) | mAP/% (ALL) | P/% | R/% |
|---|---|---|---|---|---|---|---|
| YOLOv11 | YOLOv11n | 93.0 | 82.9 | 84.5 | 87.3 | 88.1 | 83.9 |
| YOLOv11 | Random initial weights | 91.9 | 82.1 | 83.9 | 86.8 | 83.4 | 81.8 |
| YOLOv11 | Light-Encoder | 92.8 | 83.4 | 88.4 | 88.9 | 84.9 | 77.6 |
| YOLOv12 | YOLOv12n | 92.5 | 80.8 | 78.7 | 84.6 | 79.5 | 82.7 |
| YOLOv12 | Random initial weights | 91.8 | 80.1 | 77.3 | 83.6 | 81.6 | 78.5 |
| YOLOv12 | Light-Encoder | 92.4 | 82.9 | 80.1 | 85.7 | 77.4 | 85.9 |
| DC-YOLO (ours) | YOLOv11n | 94.1 | 84.3 | 87.4 | 90.6 | 90.0 | 86.1 |
| DC-YOLO (ours) | YOLOv12n | 94.0 | 84.4 | 86.9 | 90.5 | 83.7 | 85.1 |
| DC-YOLO (ours) | Random initial weights | 93.1 | 83.7 | 85.9 | 89.9 | 83.4 | 87.6 |
| DC-YOLO (ours) | Light-Encoder | 93.9 | 85.1 | 89.3 | 91.4 | 91.0 | 86.6 |
2.2 Comparative experiments
Table 5 Comparison of performance among different models on SheepDB
| Model | P/% | R/% | mAP/% | Parameters/M | GFLOPs | FPS/(f/s) |
|---|---|---|---|---|---|---|
| YOLOv8 | 81.4 | 81.6 | 85.8 | 2.69 | 6.8 | 8.56 |
| YOLOv9 | 83.5 | 78.0 | 84.3 | 2.43 | 6.4 | 7.03 |
| YOLOv10 | 79.4 | 76.4 | 82.2 | 2.27 | 6.5 | 7.51 |
| YOLOv11 | 83.4 | 81.8 | 86.8 | 2.59 | 6.4 | 7.58 |
| YOLOv12 | 81.6 | 78.5 | 83.6 | 2.51 | 6.9 | 5.71 |
| YOLOv13 | 77.9 | 83.8 | 83.9 | 2.45 | 6.1 | 4.29 |
| DINO | 62.7 | 67.6 | 68.7 | 47.00 | 279.0 | 2.58 |
| DAB-DETR | 43.2 | 51.7 | 50.4 | 43.00 | 195.0 | 2.52 |
| RT-DETR | 63.1 | 73.6 | 71.4 | 42.77 | 130.5 | 1.21 |
| DC-YOLO | 91.0 | 86.6 | 91.4 | 2.29 | 6.2 | 8.50 |
2.3 Performance of DC-YOLO
2.3.1 DC-YOLO's performance in behavior detection
Table 6 Comparison of performance with YOLOv12 in different categories
| Behavior | DC-YOLO P/% | DC-YOLO R/% | DC-YOLO mAP/% | YOLOv12 P/% | YOLOv12 R/% | YOLOv12 mAP/% |
|---|---|---|---|---|---|---|
| Drinking | 90.8 | 98.7 | 95.9 | 76.8 | 76.2 | 80.7 |
| Eating | 96.1 | 96.0 | 97.8 | 95.4 | 95.0 | 97.2 |
| Lying | 93.8 | 86.3 | 92.8 | 91.1 | 85.7 | 92.8 |
| Licking | 80.5 | 63.6 | 76.5 | 54.3 | 51.5 | 54.0 |
| Standing | 93.8 | 86.9 | 94.4 | 91.2 | 84.2 | 93.1 |
| ALL | 91.0 | 86.6 | 91.4 | 81.6 | 78.5 | 83.6 |
Fig. 10 Normalized confusion matrix
2.3.2 DC-YOLO's performance under different illumination conditions
Fig. 11 Comparison of detection results of different models under normal daylight conditions
Fig. 12 Comparison of detection results of different models in a dim evening scenario
Fig. 13 Comparison of detection results of different models under daytime illumination with light filtered by blue insulation panels
Fig. 14 Comparison of detection results of different models under nighttime conditions with artificial light spot interference
Fig. 15 Comparison of detection results of different models under nighttime infrared illumination
3 Discussion
3.1 Limitations in dataset and model
3.2 Tracking extensions and future directions
Fig. 16 Visualization of behavior detection and tracking results for Hu sheep





