
Smart Agriculture, 2025, Vol. 7, Issue (5): 101-113. doi: 10.12133/j.smartag.SA202505013

• Special Issue: Innovative Technologies and Applications of Opto-Intelligent Agriculture

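Hyperspectral Viability Detection of Imbalanced Naturally Aged Soybean Germplasm Based on a Semi-Supervised Deep Convolutional Generative Adversarial Network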

LI Fei1,2,3, WANG Ziqiang2,3, WU Jing2,3, XIN Xia2,3, LI Chunmei1, XU Hubo2,3

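  1. School of Computer Technology and Application, Qinghai University, Xining 810016, China
  2. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  3. Key Laboratory of Grain Crop Genetic Resources Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
  • Received: 2025-05-14  Published: 2025-09-30
  • Funding:
    National Key Research and Development Program of China (2024YFD1200100); National Natural Science Foundation of China (62166033); Beijing Natural Science Foundation (6254042); Central Public-interest Scientific Institution Basal Research Fund (S2025QH24)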
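  • About the author:
    LI Fei, master's student; research interest: intelligent non-destructive detection technology for germplasm resources.
  • Corresponding authors:
    LI Chunmei, M.S., professor; research interest: artificial intelligence and data mining.
    XU Hubo, Ph.D., assistant research fellow; research interest: intelligent non-destructive detection technology for germplasm resources.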

Imbalanced Hyperspectral Viability Detection of Naturally Aged Soybean Germplasm Based on Semi-Supervised Deep Convolutional Generative Adversarial Network

LI Fei1,2,3, WANG Ziqiang2,3, WU Jing2,3, XIN Xia2,3, LI Chunmei1, XU Hubo2,3

  1. School of Computer Technology and Application, Qinghai University, Xining 810016, China
  2. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  3. Key Laboratory of Grain Crop Genetic Resources Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
  • Received: 2025-05-14  Online: 2025-09-30
  • Foundation items: National Key Research and Development Program of China (2024YFD1200100); National Natural Science Foundation of China (62166033); Beijing Natural Science Foundation (6254042); Central Public-interest Scientific Institution Basal Research Fund (S2025QH24)
  • Corresponding authors:
    LI Chunmei
    XU Hubo

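Abstract (translated from the Chinese version):

[Objective/Significance] Germplasm resources are the "chips" of high-quality breeding, and assessing the viability of soybean germplasm is essential for the safe conservation of germplasm resources and the healthy development of the soybean industry. Traditional viability testing methods are time-consuming, labor-intensive, and consume seeds, so research on non-destructive, intelligent, and high-throughput detection is urgently needed. Hyperspectral imaging combined with deep learning offers a new route to rapid, non-destructive detection of soybean germplasm viability. Compared with artificially aged samples, naturally aged samples more faithfully reflect the compositional changes that occur as germplasm viability declines, but the imbalance between the numbers of viable and non-viable samples limits the generalization ability of viability prediction models.

[Methods] To address these problems, this study proposes a semi-supervised deep convolutional generative adversarial network (SDCGAN) to generate high-quality hyperspectral data with viability labels, and constructs a spectral score fusion network (SSFNet) for hyperspectral detection of soybean germplasm viability. SDCGAN was used to augment the original spectra, yielding three datasets: original spectra, generated spectra, and mixed spectra. SSFNet adaptively weights the spectral channels to highlight viability-related features and suppress redundant noise. Transfer generalization experiments were conducted to verify the robustness of the model.

[Results and Discussion] Compared with the original and generated spectral datasets, SSFNet achieved its highest viability classification accuracy, 93.33%, on the mixed spectral dataset, which contains more samples and balanced numbers of viable and non-viable samples. Compared with four other models, SSFNet had the best predictive classification performance. In generalization tests on unseen soybean varieties, SSFNet achieved a viability prediction accuracy of 73.67%, significantly better than previous models.

[Conclusions] The results will provide a methodological reference and technical support for developing intelligent, non-destructive seed viability detection technology under sample-imbalance conditions.

Keywords: soybean germplasm, hyperspectral imaging, viability detection, generative adversarial network, sample imbalance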

Abstract:

[Objective] Germplasm resources are regarded as the "chips" of high-quality breeding, and evaluating the viability of soybean germplasm is essential for the secure preservation of genetic resources and the healthy development of the soybean industry. Traditional viability detection methods are time-consuming, labor-intensive, and seed-consuming, which creates an urgent need for non-destructive, intelligent, and high-throughput detection technologies. Hyperspectral imaging combined with deep learning offers a promising approach for the rapid, non-destructive assessment of soybean germplasm viability. Compared with artificially aged samples, naturally aged samples more accurately reflect the compositional changes associated with the decline in germplasm viability. However, the imbalance between the numbers of viable and non-viable samples limits the generalization performance of viability prediction models.

[Methods] To address these challenges, a semi-supervised deep convolutional generative adversarial network (SDCGAN) was proposed to generate high-quality hyperspectral data with associated viability labels. The SDCGAN framework consisted of three main components: a generator, a discriminator, and a classifier. The generator progressively transformed low-dimensional latent representations into hyperspectral data through four one-dimensional transposed convolutional layers, ensuring that the output matched the dimensionality of real spectra. The discriminator adopted an optimization strategy based on the Wasserstein distance, replacing the Jensen-Shannon divergence used in traditional GANs, thereby mitigating training instability and vanishing gradients. Additionally, a gradient penalty term was introduced to further stabilize model training. In the classifier, a unilateral margin loss function was employed to penalize only those samples near the decision boundary, which avoided overfitting on well-separated samples and improved training efficiency. Furthermore, a spectral score fusion network (SSFNet) was developed for hyperspectral detection of soybean seed viability. SSFNet comprised two core modules: a spectral residual network and a spectral score fusion module. The spectral residual network extracted shallow-level features from the hyperspectral data, capturing local patterns within spectral sequences. The spectral score fusion module adaptively reweighted spectral channels to emphasize viability-related features and suppress redundant noise. Finally, the quality of the SDCGAN-generated spectra was evaluated using root mean square error (RMSE), while the viability detection performance of SSFNet was assessed using test accuracy, precision, area under the curve (AUC), and F1-score.

[Results and Discussions] In the performance analysis of SDCGAN, the model progressively learned and captured the key spectral features that distinguished viable from non-viable soybean seeds during training. The generated spectra gradually evolved from initial noisy fluctuations to smoother curves that closely resembled real spectra, demonstrating strong nonlinear modeling capability. Compared with other generative adversarial models, SDCGAN achieved the best performance in enhancing viability detection, and its generated data showed low error in the RMSE analysis. By applying SDCGAN for data augmentation, three types of datasets were constructed: original spectra, generated spectra, and mixed spectra. With the multiplicative scatter correction, Savitzky-Golay, and StandardScaler (MSC-SG-SS) preprocessing strategy, SSFNet achieved its highest viability detection accuracies on all three datasets, reaching 89.50%, 90.83%, and 93.33%, respectively. In comparison with other viability detection models, SSFNet consistently outperformed the alternative algorithms on all four evaluation metrics across all datasets. On the mixed dataset in particular, SSFNet performed best, achieving a test accuracy of 93.33%, precision of 95.17%, AUC of 92.58%, and F1-score of 94.83%. Notably, all models trained on the mixed dataset containing SDCGAN-generated samples performed better than those trained on either the original or the generated dataset alone. This improvement was likely due to the increased sample diversity and balanced class distribution of the mixed dataset, which provided more comprehensive viability-related features, facilitated model convergence, and reduced overfitting. In transfer experiments, SSFNet also exhibited better generalization than four baseline algorithms: support vector machine (SVM), extreme gradient boosting (XGBoost), one-dimensional convolutional neural network (1D-CNN), and Transformer, achieving the highest classification accuracy of 73.67% on the mixed dataset.

[Conclusions] This research constructs an integrated SDCGAN-SSFNet framework for robust viability detection of naturally aged soybean germplasm under imbalanced sample conditions. The SDCGAN component accurately learns the underlying distributional characteristics of real hyperspectral data from soybean seeds and generates realistic synthetic samples, effectively augmenting the spectral data of non-viable seeds and improving data diversity. Meanwhile, SSFNet explores inter-band spectral correlations to adaptively enhance features that are highly relevant to viability classification while suppressing redundant and noisy information. This integrated approach enables rapid, non-destructive, and high-precision detection of soybean seed viability under challenging sample imbalance scenarios, providing an efficient and reliable method for seed quality assessment and agricultural decision-making.
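
The MSC-SG-SS preprocessing chain named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of the mean spectrum as the MSC reference, the Savitzky-Golay settings (`window_length=11`, `polyorder=2`, smoothing rather than a derivative), the edge padding, and the synthetic spectra are all assumptions made for illustration, since the abstract does not state them.

```python
import numpy as np


def msc(spectra, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a reference
    (assumed here to be the mean spectrum) and remove the fitted offset and slope."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, deg=1)  # row ~ slope * ref + intercept
        corrected[i] = (row - intercept) / slope
    return corrected


def savitzky_golay(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing along the band axis (edges padded by repetition)."""
    if window_length % 2 == 0 or window_length <= polyorder:
        raise ValueError("window_length must be odd and larger than polyorder")
    half = window_length // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse holds the weights that evaluate the
    # least-squares polynomial at the centre of the window.
    weights = np.linalg.pinv(design)[0]
    spectra = np.asarray(spectra, dtype=float)
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.apply_along_axis(
        lambda row: np.convolve(row, weights[::-1], mode="valid"), 1, padded
    )


def standardize(spectra):
    """Scale every band to zero mean and unit variance across samples."""
    mean = spectra.mean(axis=0)
    std = spectra.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant bands unscaled
    return (spectra - mean) / std


def msc_sg_ss(spectra, window_length=11, polyorder=2):
    """Apply MSC, then Savitzky-Golay smoothing, then per-band standardization."""
    return standardize(savitzky_golay(msc(spectra), window_length, polyorder))


# Synthetic stand-in for seed spectra: one base curve with random gain,
# offset, and noise per sample (hypothetical data, 30 samples x 200 bands).
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0.0, 3.0, 200)) + 2.0
gain = rng.uniform(0.8, 1.2, size=(30, 1))
offset = rng.uniform(-0.1, 0.1, size=(30, 1))
raw = gain * base + offset + rng.normal(0.0, 0.01, size=(30, 200))
processed = msc_sg_ss(raw)
```

In a real train/test workflow the MSC reference and the standardization statistics would normally be estimated on the training spectra only and then reused for the test spectra. Library routines such as `scipy.signal.savgol_filter` are the usual choice for the smoothing step; the NumPy version above only keeps the sketch dependency-light.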

Key words: soybean germplasm, hyperspectral imaging, viability detection, generative adversarial network, sample imbalance

CLC number: