Rice Leaf Disease Image Enhancement Based on Improved CycleGAN
Received date: 2024-07-18
Online published: 2024-11-21
Supported by
Sub-project of the National Key Research and Development Program(2022YFD2001801-3);National Natural Science Foundation of China Project(32201665)
[Objective/Significance] To address the difficulties of dataset acquisition, insufficient samples, and imbalanced sample distributions across disease categories in rice disease image recognition, a data augmentation method for rice leaf disease images based on an improved CycleGAN (Cycle-Consistent Generative Adversarial Network) is proposed. [Methods] Taking CycleGAN as the base framework, a CBAM (Convolutional Block Attention Module) attention mechanism is embedded into the residual modules of the generator to strengthen CycleGAN's ability to extract disease features, enabling the network to more accurately capture small disease targets and features with subtle inter-domain differences. A perceptual image similarity loss is introduced into the loss function to guide the model toward generating high-quality sample images during training and to improve training stability. Based on the generated rice disease samples, transfer training is conducted on different object detection models, and the effectiveness of the generated disease image data is verified by comparing model performance before and after transfer learning. [Results and Discussion] The rice leaf disease images generated by the improved CycleGAN network are of higher quality than those of the original CycleGAN, with more distinct visual features in the lesion regions; the structural similarity (SSIM) metric improves by about 3.15% and the peak signal-to-noise ratio (PSNR) by about 8.19%. After transfer learning on the generated dataset, the detection performance of all three models tested (YOLOv5s, YOLOv7-tiny, and YOLOv8s) improved; for example, the disease detection accuracy of YOLOv5s rose from 79.7% to 93.8%. [Conclusions] The proposed method effectively alleviates the scarcity of rice disease image datasets and provides reliable data support for training rice disease recognition models.
YAN Congkuan, ZHU Dequan, MENG Fankai, YANG Yuqing, TANG Qixing, ZHANG Aifang, LIAO Juan. Rice leaf disease image enhancement based on improved CycleGAN[J]. Smart Agriculture, 2024: 1-13. DOI: 10.12133/j.smartag.SA202407019
[Objective] Rice diseases significantly impact both the yield and quality of rice production. Automatic recognition of rice diseases using computer vision is crucial for ensuring high yields, quality, and efficiency. However, the task of rice disease image recognition faces challenges such as limited availability of datasets, insufficient sample sizes, and imbalanced sample distributions across different disease categories. To address these challenges, a data augmentation method for rice leaf disease images is proposed based on an improved CycleGAN model. The method aims to expand disease image datasets by generating disease features, thereby alleviating the burden of collecting real disease data and providing more comprehensive and diverse data to support automatic rice disease recognition. [Methods] The proposed approach built upon the CycleGAN framework, with a key modification being the integration of a convolutional block attention module (CBAM) into the generator's residual module. This enhancement strengthened the network's ability to extract both local key features and global contextual information pertaining to rice disease-affected areas. By improving the attention mechanism across both the channel and spatial dimensions, the model increased its sensitivity to small-scale disease targets and subtle variations between healthy and diseased domains. This design effectively mitigated the potential loss of critical feature information during the image generation process, ensuring higher fidelity in the resulting images. Additionally, skip connections were introduced between the residual modules and the CBAM. These connections facilitated improved information flow between different layers of the network, addressing common issues such as vanishing gradients during the training of deep networks. Furthermore, a perceptual similarity loss function, designed to align with the human visual system, was incorporated into the overall loss function.
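The CBAM-augmented residual block described above can be sketched as follows. The paper does not publish exact layer configurations, so the channel counts, reduction ratio, kernel sizes, and the ReflectionPad/InstanceNorm layout (typical of CycleGAN generators) are assumptions in this minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Channel attention: shared MLP over global avg- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-wise avg and max maps.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))          # (b, c) from average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))           # (b, c) from max pooling
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s_avg = x.mean(dim=1, keepdim=True)         # (b, 1, h, w)
        s_max = x.amax(dim=1, keepdim=True)         # (b, 1, h, w)
        return x * torch.sigmoid(self.conv(torch.cat([s_avg, s_max], dim=1)))

class CBAMResBlock(nn.Module):
    """CycleGAN-style residual block with CBAM; the identity skip connection
    around the attention path helps gradients flow through deep generators."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(channels, channels, 3),
            nn.InstanceNorm2d(channels),
        )
        self.cbam = CBAM(channels)

    def forward(self, x):
        return x + self.cbam(self.body(x))  # skip connection preserves input features
```

The skip connection means the block only has to learn a residual correction, so adding CBAM cannot destroy information already present in the input feature map.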
This addition enabled the deep learning model to more accurately measure perceptual differences between the generated images and real images, thereby guiding the network towards producing higher-quality samples. This adjustment also helped to reduce visual artifacts and excessive smoothing, while concurrently improving the stability of the model during the training process. To comprehensively evaluate the quality of the rice disease images generated by the proposed model and to assess its impact on disease recognition performance, both subjective and objective evaluation metrics were utilized. These included user perception evaluation (UPE), structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), and the performance of disease recognition within object detection frameworks. Comparative experiments were conducted across multiple GAN models, enabling a thorough assessment of the proposed model's performance in generating rice disease images. Additionally, different attention mechanisms, including efficient channel attention (ECA), coordinate attention (CA), and CBAM, were individually embedded into the generator's residual module. These variations allowed for a detailed comparison of the effects of different attention mechanisms on network performance and the visual quality of the generated images. Ablation studies were further performed to validate the effectiveness of the CBAM residual module and the perceptual similarity loss function in the network's overall architecture. Based on the generated rice disease samples, transfer learning experiments were conducted using various object detection models. By comparing the performance of these models before and after transfer learning, the effectiveness of the generated disease image data in enhancing the performance of object detection models was empirically verified.
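A perceptual similarity loss compares images in a deep feature space rather than pixel space, which correlates better with human judgment. A minimal LPIPS-style sketch follows; the small randomly initialized conv stack is a hypothetical stand-in for the pretrained feature extractor (e.g. VGG or AlexNet) that such losses use in practice, kept here only so the example is self-contained:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptualLoss(nn.Module):
    """LPIPS-style loss: mean squared distance between channel-normalized
    deep features of the generated and the real image. The frozen feature
    stack below is a stand-in for a pretrained network."""
    def __init__(self):
        super().__init__()
        layers, ch = [], 3
        for out in (16, 32, 64):  # three downsampling stages (assumed sizes)
            layers += [nn.Conv2d(ch, out, 3, stride=2, padding=1), nn.ReLU()]
            ch = out
        self.features = nn.Sequential(*layers)
        for p in self.features.parameters():
            p.requires_grad_(False)  # the feature extractor is not trained

    def forward(self, fake, real):
        f = F.normalize(self.features(fake), dim=1)  # unit-norm channel vectors
        r = F.normalize(self.features(real), dim=1)
        return ((f - r) ** 2).mean()
```

In a CycleGAN training loop this term would be added, with a weighting coefficient, to the adversarial and cycle-consistency losses; because it penalizes feature-level rather than pixel-level differences, it discourages the over-smoothed outputs that pure pixel losses tend to produce.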
[Results and Discussions] Experimental results demonstrated that the rice disease images generated by the improved CycleGAN model surpassed those produced by other GAN variants in terms of image detail clarity and the prominence of disease-specific features. In terms of objective quality metrics, the proposed model exhibited a 3.15% improvement in SSIM and an 8.19% enhancement in PSNR compared to the original CycleGAN model, underscoring its significant advantage in structural similarity and signal-to-noise ratio. The comparative experiments involving different attention mechanisms and ablation studies revealed that embedding the CBAM into the generator effectively increased the network's focus on critical disease-related features, resulting in more realistic and clearly defined disease-affected regions in the generated images. Furthermore, the introduction of the perceptual similarity loss function substantially enhanced the network's ability to perceive and represent disease-related information, thereby improving the visual fidelity and realism of the generated images. Additionally, transfer learning applied to object detection models such as YOLOv5s, YOLOv7-tiny, and YOLOv8s led to significant improvements in disease detection performance on the augmented dataset. Notably, the detection accuracy of the YOLOv5s model increased from 79.7% to 93.8%, representing a considerable enhancement in both generalization ability and robustness. This improvement also effectively reduced the rates of false positives and false negatives, resulting in more stable and reliable performance in rice disease detection tasks. [Conclusions] In conclusion, the rice leaf disease image generation method based on the improved CycleGAN model, as proposed in this study, effectively transforms images of healthy leaves into those depicting disease symptoms.
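The SSIM and PSNR figures reported above follow standard definitions, sketched below in NumPy. Note the SSIM shown is the global single-window form for brevity; published results, including presumably those here, average SSIM over local (e.g. 11x11 Gaussian) windows:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(ref, img, peak=255.0):
    """Global (single-window) SSIM with the standard stabilizing constants."""
    x, y = ref.astype(np.float64), img.astype(np.float64)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images give SSIM 1.0 and infinite PSNR; the reported relative gains (about 3.15% SSIM, 8.19% PSNR) compare generated images against real disease images under these metrics.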
By addressing the challenge of insufficient disease samples, this method significantly improves the disease recognition capabilities of object detection models. Therefore, it holds considerable application potential in the domain of rice leaf disease image augmentation and offers a promising new direction for expanding datasets of disease images for other crops.
Key words: rice leaf disease; data augmentation; CycleGAN; CBAM; perceptual similarity loss