
Smart Agriculture ›› 2023, Vol. 5 ›› Issue (3): 75-85. doi: 10.12133/j.smartag.SA202309013

• Special Issue: Crop Information Monitoring Technology •

Identification Method of Wheat Field Lodging Area Based on Deep Learning Semantic Segmentation and Transfer Learning

ZHANG Gan1, YAN Haifeng1, HU Gensheng1, ZHANG Dongyan1,2, CHENG Tao1,2, PAN Zhenggao1,3, XU Haifeng1,3, SHEN Shuhao1,3, ZHU Keyu1

  1. National-Local Joint Engineering Research Center for Agro-Ecological Big Data Analysis and Application Technology, Anhui University, Hefei 230039, Anhui, China
    2. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China
    3. School of Information Engineering, Suzhou University, Suzhou 234000, Anhui, China
  • Received: 2023-09-11 Online: 2023-09-30
  • Supported by:
    University Research Project of Anhui Provincial Department of Education (Natural Science) (2023AH052246); Suzhou University Doctoral Research Foundation (2021BSK043); National Natural Science Foundation of China (42271364)
  • About author:
    ZHANG Gan, research interests: agricultural remote sensing. E-mail:
  • Corresponding author:
    HU Gensheng, Ph.D., Professor, research interests: machine learning and image vision. E-mail:

Identification Method of Wheat Field Lodging Area Based on Deep Learning Semantic Segmentation and Transfer Learning

ZHANG Gan1, YAN Haifeng1, HU Gensheng1, ZHANG Dongyan1,2, CHENG Tao1,2, PAN Zhenggao1,3, XU Haifeng1,3, SHEN Shuhao1,3, ZHU Keyu1

  1. National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei 230039, China
    2. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China
    3. School of Information Engineering, Suzhou University, Suzhou 234000, China
  • Received: 2023-09-11 Online: 2023-09-30
  • Corresponding author: HU Gensheng, E-mail:
  • About author:ZHANG Gan, E-mail:zhanggan@ahu.edu.cn
  • Supported by:
    University Research Project of Anhui Provincial Department of Education (Natural Science)(2023AH052246); Suzhou University Doctoral Research Foundation(2021BSK043); National Natural Science Foundation of China(42271364)

Abstract:

[Objective/Significance] Using low-altitude unmanned aerial vehicle (UAV) technology combined with deep learning semantic segmentation models to precisely extract crop lodging areas is an efficient means of monitoring lodging disasters. In practice, however, various field constraints (different UAV flight altitudes below 120 m, multiple study areas, varying weather during key growth stages, etc.) mean that the number of UAV images remains small and is insufficient for training high-accuracy deep learning models. This study aims to explore a method for precisely extracting lodging areas when crop growth periods and study areas are limited. [Methods] Healthy and lodged wheat were taken as the research objects, and field images were collected during the grain-filling and maturity stages. Two flight altitudes (40 and 80 m) were used, and digital orthophoto maps (DOM) of three study areas were acquired and stitched for 2019, 2020, 2021 and 2023. On the basis of the Swin-Transformer deep learning semantic segmentation framework, three training methods were applied: training with the 40 m set alone, mixed training with the 40 and 80 m sets, and pre-training with the 40 m set followed by transfer learning with the 80 m set, yielding a control model, a mixed-training model and a transfer-learning model. Comparative experiments assessed the accuracy of the three models in segmenting 80 m prediction-set images and evaluated model performance. [Results and Discussions] The transfer-learning model achieved the highest lodging-area extraction accuracy: its mean intersection over union (IoU), accuracy, precision, recall and F1-score were 85.37%, 94.98%, 91.30%, 92.52% and 91.84%, respectively, 1.08%–3.19% higher than those of the control model, with a mean weighted frame rate of 738.35 fps/m², 183.12 fps/m² higher than that of the 40 m images. [Conclusions] Pre-training a semantic segmentation model with images from a low flight altitude (40 m) and then applying transfer learning on images from a higher flight altitude (80 m) is a feasible way to extract lodged wheat areas. It provides an effective solution when, under airspace altitude restrictions, the limited number of images at 80 m and above cannot meet the training requirements of a semantic segmentation model.
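The three training strategies compared above (40 m-only control, 40 + 80 m mixed training, and 40 m pre-training followed by 80 m transfer learning) can be summarized as dataset schedules. The sketch below is purely illustrative; the function and dataset names are assumptions, not the authors' code:

```python
# Illustrative sketch of which UAV image sets feed which training stage
# under the three strategies compared in the paper. "train_40" / "train_80"
# stand for the 40 m and 80 m altitude training sets.

def training_schedule(strategy, train_40, train_80):
    """Return a list of (stage_name, dataset) pairs for a given strategy."""
    if strategy == "control":      # control model: 40 m images only
        return [("train", train_40)]
    if strategy == "mixed":        # mixed-training model: 40 m and 80 m pooled
        return [("train", train_40 + train_80)]
    if strategy == "transfer":     # transfer-learning model: pre-train on 40 m,
        return [("pretrain", train_40), ("finetune", train_80)]  # fine-tune on 80 m
    raise ValueError(f"unknown strategy: {strategy}")

stages = training_schedule("transfer", ["img40_001.tif"], ["img80_001.tif"])
```

Under the transfer strategy, the backbone weights learned from the plentiful 40 m images are reused as the starting point for fine-tuning on the scarce 80 m images, rather than training from scratch.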

Keywords: lodging detection, agricultural remote sensing, UAV image, transfer learning, semantic segmentation, Swin-Transformer

Abstract:

[Objective] Lodging constitutes a severe crop-related catastrophe, resulting in reduced photosynthesis intensity, diminished nutrient absorption efficiency, diminished crop yield, and compromised crop quality. The utilization of unmanned aerial vehicles (UAV) to acquire agricultural remote sensing imagery, despite providing high-resolution details and clear indications of crop lodging, encounters limitations related to the size of the study area and the duration of the specific growth stages of the plants. This limitation hinders the acquisition of an adequate quantity of low-altitude remote sensing images of wheat fields, thereby detrimentally affecting the performance of the monitoring model. The aim of this study is to explore a method for precise segmentation of lodging areas under limited crop growth periods and research areas. [Methods] Compared with images captured at lower flight altitudes, images taken by UAVs at higher altitudes cover a larger area; consequently, for the same area, fewer images are taken at higher altitudes than at lower ones. However, the training of deep learning models requires a huge supply of images. To make up for the insufficient quantity of high-altitude UAV-acquired images for training the lodging area monitoring model, a transfer learning strategy was proposed. To verify the effectiveness of this strategy, based on the Swin-Transformer framework, a control model, a hybrid training model and a transfer learning model were obtained by training on UAV images from 4 years (2019, 2020, 2021, 2023) and 3 study areas (Shucheng, Guohe, Baihu) at 2 flight altitudes (40 and 80 m). To test the models' performance, a comparative experimental approach was adopted to assess the accuracy of the three models in segmenting 80 m altitude images.
The assessment relied on five metrics: intersection over union (IoU), accuracy, precision, recall, and F1-score. [Results and Discussions] The transfer learning model showed the highest accuracy in lodging area detection. Specifically, the mean IoU, accuracy, precision, recall, and F1-score reached 85.37%, 94.98%, 91.30%, 92.52% and 91.84%, respectively. Notably, the accuracy of lodging area detection for images acquired at a 40 m altitude surpassed that of images captured at an 80 m altitude when employing a training dataset composed solely of images obtained at the 40 m altitude. However, when adopting the mixed training and transfer learning strategies and augmenting the training dataset with images acquired at an 80 m altitude, the accuracy of lodging area detection for 80 m altitude images improved, albeit at the expense of reduced accuracy for 40 m altitude images. The performance of the mixed training model and the transfer learning model in lodging area detection was closely matched for both 40 and 80 m altitude images. In a cross-study-area comparison of the mean values of the model evaluation indices, lodging area detection accuracy was slightly higher for images obtained in the Baihu area than in the Shucheng area, while accuracy for Shucheng images surpassed that of Guohe. These variations could be attributed to the different wheat varieties cultivated in the Guohe area by drill seeding. The high planting density of wheat in Guohe resulted in substantial lodging areas, accounting for 64.99% during the late maturity period. The prevalence of semi-lodged wheat further exacerbated the issue, potentially leading to misidentification of non-lodging areas.
Consequently, this reduced the recall (mean recall for Guohe images was 89.77%, which was 4.88% and 3.57% lower than that for Baihu and Shucheng, respectively) and IoU (mean IoU for Guohe images was 80.38%, which was 8.80% and 3.94% lower than that for Baihu and Shucheng, respectively). The accuracy, precision, and F1-score for Guohe were also lower than those for Baihu and Shucheng. [Conclusions] This study examined the efficacy of a strategy aimed at mitigating the insufficient number of high-altitude images for semantic segmentation model training. By pre-training the semantic segmentation model with low-altitude images and subsequently employing high-altitude images for transfer learning, improvements of 1.08% to 3.19% were achieved in mean IoU, accuracy, precision, recall, and F1-score, alongside a notable mean weighted frame rate enhancement of 555.23 fps/m². The approach proposed in this study holds promise for improving lodging monitoring accuracy and the speed of image segmentation. In practical applications, a substantial quantity of 40 m altitude UAV images collected from diverse study areas with various wheat varieties can be leveraged for pre-training; subsequently, a limited set of 80 m altitude images acquired in a specific study area can be employed for transfer learning, facilitating the development of a targeted lodging detection model. Future research will explore the use of UAV images captured at even higher flight altitudes to further enhance lodging area detection efficiency.
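The five evaluation metrics above can all be computed per pixel from a predicted lodging mask and a ground-truth mask. A minimal pure-Python sketch (an assumption for illustration, with masks flattened to 0/1 sequences and both classes assumed present):

```python
def segmentation_metrics(pred, truth):
    """Per-pixel IoU, accuracy, precision, recall and F1 for binary masks.

    pred, truth: equal-length sequences of 0/1 (1 = lodging pixel).
    Assumes at least one positive prediction and one positive label,
    so no denominator below is zero.
    """
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))  # true positives
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))  # false positives
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))  # false negatives
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))  # true negatives
    iou = tp / (tp + fp + fn)                 # intersection over union
    accuracy = (tp + tn) / len(pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return iou, accuracy, precision, recall, f1

# toy example on a 6-pixel mask
iou, acc, prec, rec, f1 = segmentation_metrics([1, 1, 0, 0, 1, 0],
                                               [1, 0, 0, 1, 1, 0])
```

In the study, these metrics are averaged over the 80 m altitude prediction-set images to compare the three models.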

Key words: lodging detection, agricultural remote sensing, UAV image, transfer learning, semantic segmentation, Swin-Transformer