Smart Agriculture

Bi-Intentional Modeling and Knowledge Graph Diffusion for Rice Variety Selection and Breeding Recommendation

QIAO Lei1, CHEN Lei1,2, YUAN Yuan1,2

  1. School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 230601, China
  2. Institute of Intelligent Machines, HFIPS, Chinese Academy of Sciences, Hefei 230031, China
  • Received: 2024-12-31 Online: 2025-03-03
  • Foundation items: Subtopic of National Key Research and Development Program of China (2023YFD2000101-01); National Natural Science Foundation of China (32271981, 32471988)
  • About author:
    QIAO Lei, master's student, research interests: knowledge graph construction and applications. E-mail:
  • Corresponding author:
    YUAN Yuan, Ph.D., associate researcher, research interests: applications of machine learning theory and methods in computer vision. E-mail:

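Abstract (translated from the Chinese):

[Objective] To meet users' needs in rice variety selection and breeding, and to advance the informatization and intelligence of rice breeding work. [Methods] Trait information on rice varieties was first collected to build a rice knowledge graph of a certain scale. The rice varieties that had been planted were then collected region by region, and random sampling was used to construct rice variety selection interaction data. A Bi-intentional Modeling and Knowledge Graph Diffusion (BMKGD) model was then proposed. The BMKGD model considers both the intent factors in interaction behavior and the denoising of the knowledge graph. Intents are divided into two kinds, independence and conformity, and a corresponding intent space is modeled for each. Noise in the knowledge graph is removed over multiple iterations with a diffusion model. Finally, cross-view contrastive learning over the item representations in the different views lets the model fully learn the information in both views and complete the recommendation. [Results and Discussions] On the constructed rice variety selection dataset, the BMKGD model achieved the best performance. Its Recall and Normalized Discounted Cumulative Gain (NDCG) improved by 2.9% and 3.7%, respectively, over the best-performing baseline, Bilateral Intent-guided Graph Collaborative Filtering (BIGCF), which validates the effectiveness of the method. Model variants with key components removed all performed worse than the full model, indicating that the modules of BMKGD have a measurable influence on overall recommendation performance. [Conclusions] The proposed BMKGD model completes the recommendation task well and is expected to help users select suitable rice varieties in subsequent rice breeding work.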

Abstract:

[Objective] Selecting rice varieties requires weighing several factors, such as yield, fertility, disease resistance, and lodging resistance. There are many rice varieties worldwide, each with different traits, so users must spend considerable time retrieving and comparing variety information before making a selection, which increases their workload. To meet users' rice variety selection needs, help them quickly find the varieties they need, and further promote the informatization and intelligence of rice breeding work, a Bi-intentional Modeling and Knowledge Graph Diffusion (BMKGD) model was proposed. [Methods] The research was carried out at two levels: data and methodology. At the data level, because data supporting rice variety selection and breeding recommendation were lacking, a recommendation dataset was constructed. The dataset consisted of two parts: interaction data and a knowledge graph. For the interaction data, the rice varieties that had been planted in each region were collected region by region, and a batch of simulated users was generated for each region. Rice varieties from the region were then assigned to the generated users by random sampling to construct the user-item interaction data. For the knowledge graph, detailed text descriptions of rice varieties were first collected, and information was then extracted from them to construct triples covering multiple varietal characteristics, such as breeding institution, variety type, disease resistance, and cold tolerance. At the methodological level, the BMKGD model was proposed.
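The data construction described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names `build_interactions` and `build_triples`, the region and variety identifiers, the number of users per region, and the choice of sampling without replacement are all assumptions made for the example.

```python
import random


def build_interactions(region_varieties, users_per_region, items_per_user, seed=0):
    """Simulate users per region and assign each a random sample of that region's varieties.

    All parameter names and sizes are hypothetical; the source only states that
    users are generated per region and varieties are assigned by random sampling.
    """
    rng = random.Random(seed)  # fixed seed so the simulated data is reproducible
    interactions = []
    for region in sorted(region_varieties):
        varieties = sorted(region_varieties[region])
        for i in range(users_per_region):
            user = f"{region}_user{i}"
            k = min(items_per_user, len(varieties))
            # Sampling without replacement: a user interacts with each variety at most once.
            for variety in rng.sample(varieties, k):
                interactions.append((user, variety))
    return interactions


def build_triples(variety_traits):
    """Flatten per-variety trait dictionaries into (head, relation, tail) triples."""
    return [
        (variety, relation, value)
        for variety, traits in variety_traits.items()
        for relation, value in traits.items()
    ]


# Hypothetical toy inputs, not data from the paper.
region_varieties = {
    "region_A": ["variety_1", "variety_2", "variety_3"],
    "region_B": ["variety_4", "variety_5"],
}
variety_traits = {
    "variety_1": {"disease_resistance": "resistant", "cold_tolerance": "strong"},
}

interactions = build_interactions(region_varieties, users_per_region=2, items_per_user=2, seed=42)
triples = build_triples(variety_traits)
```

Each simulated user only ever receives varieties planted in that user's own region, which is the constraint the region-by-region collection is meant to enforce.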
The BMKGD model takes into account both the intent factors in interaction behavior and the denoising of the knowledge graph. Intents were considered from two perspectives, individual independence and conformity, and the model builds a dual intent space to represent both. To handle noisy data in the knowledge graph, denoising was carried out following the idea of the diffusion model: random noise was introduced to corrupt the original structure of the knowledge graph, and the original structure was then restored through iterative learning, which completed the denoising. Cross-view contrastive learning was then performed between the two views. [Results and Discussions] The proposed method achieved the best performance on the rice variety selection dataset, with Recall and Normalized Discounted Cumulative Gain (NDCG) improved by 2.9% and 3.7%, respectively, over the second-best model, Bilateral Intent-guided Graph Collaborative Filtering (BIGCF). This improvement validated the effectiveness of the method to some extent and indicated that the BMKGD model was better suited to rice variety recommendation. The Recall of the BMKGD model on the rice variety selection dataset was 0.3276, meeting the basic requirements of a recommendation system. This indicated that the proposed method could be used in practical rice variety selection and breeding work to reduce users' workload in information retrieval and to assist their decision-making. In contrast, traditional knowledge-aware recommendation models did not perform well on the rice variety selection and breeding dataset, underperforming even models without knowledge graph integration. The analysis suggested that the collaborative signals in the interaction data played the major role, while the quality of the constructed knowledge graph still had room for improvement.
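For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute Recall@K and NDCG@K for a single user with binary relevance. The cutoff K, the function names, and the toy ranking are assumptions for illustration; the abstract does not state which cutoff or exact metric variant was used.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k divided by the DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(position + 2)  # position 0 gets discount log2(2) = 1
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for one user; only "variety_1" from the held-out set is in the top 3.
ranked = ["variety_3", "variety_1", "variety_7", "variety_2"]
relevant = {"variety_1", "variety_2", "variety_9"}

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
```

Dataset-level scores are usually the average of these per-user values, so a reported Recall of 0.3276 would mean that, on average, about a third of each user's held-out varieties appear in the top-K list.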
The model variants with key components removed all performed worse than the full model, which validated the effectiveness of the modules. The size of the performance drop differed across the variants, indicating that the components played different roles. The drop for the variant without the cross-view contrastive learning module was small, indicating that this module still has room for improvement in fully exploiting the collaborative relationship between the two views. [Conclusions] The proposed BMKGD model achieves good performance on the rice variety selection dataset and accomplishes the recommendation task well, showing that it can support rice variety selection and breeding work and help users select suitable rice varieties. In addition, the constructed rice variety selection and breeding dataset provides data support for subsequent rice variety recommendation work, and the improvements in modeling methods offer ideas for follow-up research. The results can be applied to rice variety selection and breeding to reduce users' workload in information retrieval, and can also provide technical support for scientific breeding.
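The cross-view contrastive learning module discussed above can be illustrated with an InfoNCE-style loss, a widely used contrastive objective. This is a sketch under stated assumptions, not the loss defined in the paper: the function names, the cosine similarity, the temperature value, and the toy embeddings are all hypothetical. The representation of the same item in the two views forms the positive pair, and all other items in the second view serve as negatives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def info_nce(view_a, view_b, temperature=0.2):
    """Average InfoNCE loss; view_a[i] and view_b[i] embed the same item in two views."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]  # negative log-probability of the positive pair
    return total / n


# Hypothetical two-item example: aligned views versus views with the items swapped.
view_interaction = [[1.0, 0.0], [0.0, 1.0]]
view_kg_aligned = [[1.0, 0.0], [0.0, 1.0]]
view_kg_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce(view_interaction, view_kg_aligned)
loss_swapped = info_nce(view_interaction, view_kg_swapped)
```

The loss is low when each item's two representations agree and high when they do not, so minimizing it pulls the two views of the same rice variety together while pushing different varieties apart.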

Key words: rice breeding, knowledge graph, intent modeling, contrastive learning, recommender systems

CLC number: