Smart Agriculture ›› 2024, Vol. 6 ›› Issue (5): 74-87. doi: 10.12133/j.smartag.SA202406008

• Technical Methods •

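ReluformerN: A Lightweight High-Low Frequency Enhanced Method for Hyperspectral Agricultural Land Cover Classification

LIU Yi1,2, ZHANG Yanjun2

  1. Department of Automation, Taiyuan Institute of Technology, Taiyuan 030008, Shanxi, China
  2. School of Instrument and Electronics, North University of China, Taiyuan 030001, Shanxi, China
  • Received: 2024-06-17 Published: 2024-09-30
  • Funding:
    Science and Technology Innovation Project of Higher Education in Shanxi Province; Youth Project of the Basic Research Program of Shanxi Province (202103021223352); University-Level Funded Project (24020502)
  • Corresponding author:
    LIU Yi, lecturer; research interest: hyperspectral image processing. E-mail: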

ReluformerN: A Lightweight High-Low Frequency Enhanced Method for Hyperspectral Agricultural Land Cover Classification

LIU Yi1,2, ZHANG Yanjun2

  1. Department of Automation, Taiyuan Institute of Technology, Taiyuan 030008, China
  2. School of Instrument and Electronics, North University of China, Taiyuan 030001, China
  • Received: 2024-06-17 Online: 2024-09-30
  • Foundation items: Science and Technology Innovation Project of Higher Education in Shanxi Province; The Youth Project of Basic Research Program of Shanxi Province (202103021223352); University-Level Funded Project (24020502)
  • Corresponding author:
    LIU Yi, E-mail:

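Summary:

[Objective/Significance] To intelligently monitor the distribution of agricultural land cover types, hyperspectral data are typically collected by an unmanned aerial vehicle carrying a hyperspectral camera and then classified, so that crop distribution maps can be drawn automatically. However, different crops look alike, and the same crop differs considerably across growth stages, which places high demands on network models for agricultural land cover classification. Network models with high classification accuracy are often highly complex and cannot be deployed on hardware systems. To address these problems, this study proposes a lightweight, high-low frequency enhanced Reluformer network (ReluformerN) for agricultural land cover classification. [Methods] First, an adaptive octave convolution was proposed, which extracts the spatial and spectral frequency-domain features of hyperspectral images while mitigating the influence of manually set internal parameters. Second, because low-frequency information can capture global features, Reluformer was proposed for global feature extraction. Compared with the transformer, Reluformer has linear computational complexity, which helps keep the network lightweight while retaining the ability to extract global features. The network was tested on three public hyperspectral datasets for fine-grained crop variety classification and compared with five popular classification networks. [Results and Discussion] ReluformerN performed well on accuracy metrics such as overall accuracy (OA) and average accuracy (AA). On the model complexity metrics, namely the number of parameters (Parameters) and the computational cost (FLOPs), ReluformerN had the fewest parameters and the lowest computational cost. [Conclusion] The proposed ReluformerN network achieves a good balance between crop variety classification accuracy and model complexity, and is expected to be deployed on resource-limited hardware systems to achieve real-time land cover classification.

Keywords: hyperspectral images, agricultural land cover classification, lightweight network, linear transformer, deep learning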

Abstract:

[Objective] To intelligently monitor the distribution of agricultural land cover types, hyperspectral cameras are usually mounted on drones to collect hyperspectral data, which are then classified to draw crop distribution maps automatically. Different crops have similar shapes, and the same crop differs significantly across growth stages, so network models for agricultural land cover classification must be highly accurate. However, network models with high classification accuracy are often complex and cannot be deployed on hardware systems. To address this problem, a lightweight high-low frequency enhanced Reluformer network (ReluformerN) was proposed in this research. [Methods] Firstly, an adaptive octave convolution was proposed, which used the softmax function to automatically adjust the spectral dimensions of the high-frequency and low-frequency features. This effectively alleviated the influence of setting the spectral dimensions manually and benefited the subsequent extraction of spatial and spectral domain features of hyperspectral images. Secondly, a Reluformer was proposed to extract global features, taking advantage of the fact that low-frequency information can capture global features. Reluformer replaced the softmax function, which entails quadratic computational complexity, with another activation function. Through theoretical and graphical analysis, the ReLU, LeakyReLU, and GELU functions were compared, and it was found that the ReLU function, like the softmax function, is non-negative and can therefore be used for feature relevance analysis. Meanwhile, the ReLU function has a linearization property, which makes it more suitable for self-relevance analysis. Therefore, the ReLU self-attention mechanism was proposed, which used the ReLU function to perform feature self-attention analysis.
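As a minimal sketch of why replacing softmax with a non-negative function such as ReLU can lead to linear complexity, the code below applies ReLU to the queries and keys and then regroups the matrix products, in the style of kernel-based linear attention. This is an illustration under stated assumptions, not the formulation used in the paper: the row-sum normalization, the `eps` constant, and all function names are hypothetical.

```python
# Illustrative sketch only: ReLU as a non-negative feature map in
# kernel-style attention. The paper's exact formulation may differ.

def relu(m):
    """Element-wise ReLU on a matrix stored as a list of rows."""
    return [[max(0.0, x) for x in row] for row in m]

def transpose(m):
    """Transpose a non-empty list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def quadratic_relu_attention(q, k, v, eps=1e-6):
    """Builds the full N x N similarity matrix: cost grows with N squared."""
    qf, kf = relu(q), relu(k)
    scores = matmul(qf, transpose(kf))        # N x N, all entries >= 0
    out = []
    for row in scores:
        denom = sum(row) + eps                # assumed row-sum normalization
        weighted = matmul([row], v)[0]
        out.append([x / denom for x in weighted])
    return out

def linear_relu_attention(q, k, v, eps=1e-6):
    """Same result, regrouped as relu(Q) (relu(K)^T V): cost grows with N."""
    qf, kf = relu(q), relu(k)
    kv = matmul(transpose(kf), v)             # d x d_v summary, computed once
    k_sum = [sum(col) for col in zip(*kf)]    # column sums of relu(K), length d
    out = []
    for qi in qf:
        denom = sum(a * b for a, b in zip(qi, k_sum)) + eps
        numer = matmul([qi], kv)[0]
        out.append([x / denom for x in numer])
    return out
```

Both functions return the same values up to floating-point error. The regrouping is possible because no softmax has to be applied to the N x N score matrix, and the non-negativity of ReLU keeps the normalizing sums non-negative.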
To extract deep global features, multi-scale feature fusion was used, and the multi-head ReLU self-attention mechanism was constructed with the ReLU self-attention mechanism as its core. Similar to the transformer architecture, the Reluformer structure was built by combining the multi-head ReLU self-attention mechanism, feedforward layers, and normalization layers. With Reluformer as the core, the Reluformer network (ReluformerN) was proposed. The network treated image features from the perspective of frequency. Because high-frequency information reflects the local features of an image, depthwise separable convolution was used to design a lightweight network for fine-grained feature extraction from high-frequency information. Because low-frequency information represents the global features of an image, Reluformer was used to extract global features from it. ReluformerN was tested on three public hyperspectral datasets for fine-grained crop variety classification (Indian Pines, WHU-Hi-LongKou, and Salinas) and compared with five popular classification networks (2D-CNN, HybridSN, ViT, CTN, and LSGA-VIT). [Results and Discussion] ReluformerN performed best in overall accuracy (OA), average accuracy (AA), and other accuracy metrics. In the model complexity metrics, namely the number of model parameters and the computational cost (FLOPs), ReluformerN had the fewest parameters (less than 0.3 M) and the lowest computational cost. In the visual comparison, the classification maps produced by ReluformerN had clearer edges, more complete morphological structures, and fewer classification errors. The validity of the adaptive octave convolution was verified by comparing it with the traditional octave convolution: its classification accuracy was 0.1% higher.
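To illustrate why depthwise separable convolution suits a lightweight high-frequency branch, the sketch below compares its weight count with that of a standard convolution, using the usual textbook formulas with bias terms ignored. The channel and kernel sizes in the usage note are hypothetical and are not taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

def reduction_ratio(c_in, c_out, k):
    """Separable weights divided by standard weights: 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)
```

For example, with 64 input channels, 128 output channels, and a 3 x 3 kernel (assumed values), the standard layer has 73728 weights and the separable one has 8768, roughly 12% of the original.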
When the manually set parameters were given different values, the maximum and minimum classification accuracies of the traditional octave convolution were about 0.3% apart, while those of the adaptive octave convolution were only 0.05% apart. This showed that the adaptive octave convolution not only had the highest classification accuracy but was also less sensitive to the manual parameter setting, effectively overcoming its influence on the classification result. To validate the Reluformer module, it was compared with transformer, LeakReluformer, and Linformer in terms of accuracy metrics such as OA and AA. Reluformer achieved the highest classification accuracy and the lowest model parameter count among these models, indicating that it not only effectively extracted global features but also reduced computational complexity. Finally, the effectiveness of the high-frequency and low-frequency feature extraction branches was verified. The feature distributions after high-frequency feature extraction, after high-low frequency feature extraction, and after the classifier were visualized with 2D t-SNE and compared with the original feature distribution. After high-frequency feature extraction, similar features were generally clustered together, but the spacing between different features was small, and some features overlapped. After low-frequency feature extraction, similar features were clearly clustered more tightly. After high-low frequency feature fusion and after the classifier, similar features were clustered and different types of features were clearly separated, indicating that high-low frequency feature extraction enhanced the classification effect.
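The OA and AA metrics used in these comparisons can be computed from a confusion matrix. The sketch below uses the standard definitions and assumes that rows index the true class and columns the predicted class; the function names and the matrix in the usage note are made up for illustration and are not results from the paper.

```python
def overall_accuracy(cm):
    """OA: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def average_accuracy(cm):
    """AA: mean of per-class accuracies (recall), skipping empty classes."""
    per_class = [cm[i][i] / sum(cm[i])
                 for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(per_class) / len(per_class)
```

With the hypothetical matrix `[[50, 0, 0], [5, 40, 5], [0, 2, 8]]`, OA is 98/110 while AA is (1.0 + 0.8 + 0.8)/3. AA weights every class equally, so it is lower than OA here because the two smaller classes are classified less accurately than the large one.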
[Conclusion] This network achieves a good balance between crop variety classification accuracy and model complexity, and is expected to be deployed on resource-limited hardware systems in the future to achieve real-time classification.

Key words: hyperspectral images, agricultural object classification, lightweight network, linear transformer, deep learning

CLC number: