Smart Agriculture, 2024, Vol. 6, Issue 6: 63-71. doi: 10.12133/j.smartag.SA202410008

• Special Topic: Intelligent Agricultural Knowledge Services and Smart Unmanned Farms (Part 1)

Research on a Vegetable Crop Growth Model for Digital Twin Platforms Based on Large Language Model Inference

ZHAO Chunjiang, LI Jingchen, WU Huarui, YANG Yusen

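  1. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100079, China
  • Received: 2024-10-11 Published: 2024-11-30
  • Funding:
    National Key R&D Program of China (2021ZD0113604); China Agriculture Research System of the Ministry of Finance and the Ministry of Agriculture and Rural Affairs (CARS-23-D07); Central Government Guiding Fund for Local Science and Technology Development (2023ZY1-CGZY-01)
  • Corresponding author:
    ZHAO Chunjiang, researcher, academician of the Chinese Academy of Engineering; research interests: large language models and agricultural knowledge services. E-mail: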

Vegetable Crop Growth Modeling in Digital Twin Platform Based on Large Language Model Inference

ZHAO Chunjiang, LI Jingchen, WU Huarui, YANG Yusen

  1. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100079, China
  • Received: 2024-10-11 Online: 2024-11-30
  • Foundation items: National Key R&D Program of China under Grant (2021ZD0113604); China Agriculture Research System of MOF and MARA Grant (CARS-23-D07); Central Guiding Local Science and Technology Development Fund Projects under Grant (2023ZY1-CGZY-01)
  • Corresponding author:
    ZHAO Chunjiang, E-mail:

Abstract (from the Chinese-language version):

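[Objective/Significance] Using digital twin technology to achieve real-time supervision and virtual-physical mapping control of unmanned farms is one of the core requirements of next-generation agricultural information technology. However, because vegetable growth models are highly complex, it is difficult to build simulations of expected crop growth in a digital twin platform, so automatic crop growth modeling with artificial intelligence has become a key technology that the field urgently needs. [Methods] Large language model (LLM) technology was introduced into a digital twin platform for vegetable crops, and the reasoning ability of pre-trained LLMs was used to achieve accurate simulation of vegetable crop growth within the platform. To give the LLM more knowledge of, and reasoning ability about, vegetable crop growth, a large volume of continuous vegetable growth data was first collected for pre-training and instruction fine-tuning. A staged ensemble of LLM agents was then designed, consisting of one managerial agent that predicts the vegetable growth stage and agents responsible for the individual stages, to model vegetable crop growth from the real-time data provided by the digital twin platform. [Results and Discussion] From vegetable growth state information such as climate, soil, irrigation, fertilization, pests and diseases, and growth date, the model predicts the crop's growth status for the following day, and, driven by the crop management simulation of the digital twin platform, it can forecast growth over several days or even several months. Ten-fold cross-validation showed that the method reached 98% accuracy in vegetable crop growth modeling and 99.7% accuracy in growth stage identification. [Conclusions] The study shows that, after fine-tuning on domain-specific data, LLMs can perform general reasoning about crop growth in a digital twin platform and can transition smoothly between different crop growth stages.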
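The step from next-day prediction to forecasts spanning days or months can be read as an iterated rollout: each predicted state is fed back in together with the next day's simulated management action. The sketch below illustrates that reading only. The function names, the state and action fields, and the toy growth rule are all assumptions made for illustration; in the described system a fine-tuned LLM would take the place of `toy_step`.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Action = Dict[str, float]

def rollout(state: State, schedule: List[Action],
            step: Callable[[State, Action], State]) -> List[State]:
    """Apply a next-day predictor repeatedly over a management schedule."""
    trajectory = [state]
    for action in schedule:
        state = step(state, action)  # prediction becomes the next input
        trajectory.append(state)
    return trajectory

def toy_step(state: State, action: Action) -> State:
    """Hypothetical placeholder for the next-day predictor (not the paper's model)."""
    # Assumed toy rule: a base daily gain plus a small irrigation effect.
    gain = 0.5 + 0.02 * action.get("irrigation_mm", 0.0)
    return {"day": state["day"] + 1, "height_cm": state["height_cm"] + gain}

start = {"day": 0.0, "height_cm": 5.0}
plan = [{"irrigation_mm": 10.0}] * 3   # three simulated days of management
trajectory = rollout(start, plan, toy_step)
```

Because every step consumes the previous prediction, errors can compound over long horizons, which is one reason per-step accuracy matters for multi-month forecasts.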

Keywords: agricultural informatics, digital agriculture, digital twin, large language model, vegetable growth

Abstract:

[Objective] In the era of digital agriculture, real-time monitoring and predictive modeling of crop growth are paramount, especially in autonomous farming systems. Modeling vegetable crop growth within digital twin platforms has historically been hindered by the complex interactions among biotic and abiotic factors, and traditional crop growth models, often constrained by their reliance on static, rule-based methods, fail to capture the dynamic and multifactorial nature of that growth. This research addresses these challenges by leveraging the reasoning capabilities of pre-trained large language models (LLMs) to simulate and predict vegetable crop growth accurately and reliably. [Methods] The methodology was structured in several distinct phases. First, a comprehensive dataset was curated covering vegetable crop growth cycles, environmental conditions, and management practices. The dataset incorporates continuous data streams such as soil moisture, nutrient levels, climate variables, pest occurrence, and historical growth records. Combining these sources gave the model the information needed to infer the complex interdependencies inherent in crop growth processes. Then, pre-training and fine-tuning techniques were employed to adapt the LLMs to the domain-specific requirements of vegetable crop modeling. A staged intelligent agent ensemble was designed to work within the digital twin platform, consisting of a central managerial agent and multiple stage-specific agents. The managerial agent was responsible for identifying transitions between distinct growth stages of the crop, while the stage-specific agents were tailored to handle the unique characteristics of each growth phase. This modular architecture enhanced the model's adaptability and precision, ensuring that each phase of growth received specialized attention and analysis.
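The staged ensemble described above amounts to a dispatch pattern: a managerial agent picks the current growth stage, and the matching stage-specific agent produces the prediction. The following minimal sketch shows only that control flow. Everything in it is hypothetical: the stage names, thresholds, state fields, and growth rules are placeholders, and plain Python functions stand in for the fine-tuned LLM agents.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class CropState:
    day: int              # days since transplanting (assumed field)
    height_cm: float      # plant height
    leaf_count: int
    soil_moisture: float  # volumetric fraction, 0..1 (assumed field)

StageAgent = Callable[[CropState], CropState]

def managerial_agent(state: CropState) -> str:
    """Stand-in for the managerial agent: identify the growth stage."""
    # Assumed thresholds, chosen only to make the example concrete.
    if state.leaf_count < 8:
        return "seedling"
    if state.leaf_count < 20:
        return "rosette"
    return "heading"

def make_stage_agent(height_gain: float, leaf_gain: int) -> StageAgent:
    """Build a toy stage-specific agent that predicts the next-day state."""
    def agent(state: CropState) -> CropState:
        # Toy rule: growth is halved when the soil is dry.
        factor = 1.0 if state.soil_moisture >= 0.3 else 0.5
        return CropState(
            day=state.day + 1,
            height_cm=state.height_cm + height_gain * factor,
            leaf_count=state.leaf_count + leaf_gain,
            soil_moisture=state.soil_moisture,
        )
    return agent

STAGE_AGENTS: Dict[str, StageAgent] = {
    "seedling": make_stage_agent(0.4, 1),
    "rosette": make_stage_agent(0.8, 1),
    "heading": make_stage_agent(0.2, 0),
}

def predict_next_day(state: CropState) -> CropState:
    """Route the state to the agent for its current growth stage."""
    return STAGE_AGENTS[managerial_agent(state)](state)

today = CropState(day=10, height_cm=5.0, leaf_count=7, soil_moisture=0.5)
tomorrow = predict_next_day(today)
```

Keeping stage selection separate from stage-level prediction means each stage agent can be specialized or replaced without touching the others, which matches the modularity the abstract attributes to the design.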
[Results and Discussions] The method was validated experimentally in a controlled agricultural setting at the Xiaotangshan Modern Agricultural Demonstration Park in Beijing. Cabbage (Zhonggan 21) was selected as the test crop because of its significance in agricultural production and the availability of comprehensive historical growth data. The dataset, collected over five years, comprised 4 300 detailed records documenting parameters such as plant height, leaf count, soil conditions, irrigation schedules, fertilization practices, and pest management interventions. This dataset was used to train the LLM-based system and to evaluate its performance with ten-fold cross-validation. The experimental results demonstrated the efficacy of the proposed system in addressing the complexities of vegetable crop growth modeling. The LLM-based model achieved 98% accuracy in predicting crop growth degrees and 99.7% accuracy in identifying growth stages. These results significantly outperformed traditional machine learning approaches, including long short-term memory (LSTM), XGBoost, and LightGBM models. The superior performance of the LLM-based system highlights its ability to reason over heterogeneous data inputs and make precise predictions, setting a new benchmark for crop modeling technologies. Beyond accuracy, the LLM-powered system can also simulate growth trajectories over extended periods, enabling farmers and agricultural managers to anticipate potential challenges and make proactive decisions. For example, by integrating real-time sensor data with historical patterns, the system can predict how changes in irrigation or fertilization practices will affect crop health and yield. This predictive capability is valuable for optimizing resource allocation and mitigating risks associated with climate variability and pest outbreaks.
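For readers unfamiliar with the evaluation protocol, ten-fold cross-validation splits the records into ten parts, holds each part out once for testing, and averages the ten scores. The sketch below is a generic standard-library illustration, not the authors' evaluation code. The helper names are hypothetical, and the shuffled split is an assumption: the abstract does not say how folds were formed, and for time-ordered growth records a chronological split may be more appropriate.

```python
import random
from typing import Callable, List, Sequence, Tuple

def k_fold_indices(n: int, k: int = 10,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits over n shuffled records."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k near-equal, disjoint folds
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits

def cross_validate(records: Sequence, labels: Sequence,
                   fit: Callable[[list, list], Callable], k: int = 10) -> float:
    """Mean accuracy over k folds; fit(train_x, train_y) returns a predictor."""
    scores = []
    for train, test in k_fold_indices(len(records), k):
        predict = fit([records[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(records[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# With 4 300 records, each of the ten test folds holds 430 records.
splits = k_fold_indices(4300, k=10)
```

Any model can be plugged in through `fit`, so the same harness could compare an LLM-based predictor against baselines on identical folds.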
[Conclusions] The study emphasizes the importance of high-quality data in achieving reliable and generalizable models. The comprehensive dataset used in this research not only captures the nuances of cabbage growth but also provides a blueprint for extending the model to other crops. In conclusion, this research demonstrates the transformative potential of combining large language models with digital twin technology for vegetable crop growth modeling. By addressing the limitations of traditional modeling approaches and harnessing the reasoning capabilities of LLMs, the proposed system sets a new standard for precision agriculture. Several avenues are also proposed for future work, including expanding the dataset, refining the model architecture, and developing multi-crop and multi-region capabilities.

Key words: agricultural informatics, digital agriculture, digital twin, large language model, vegetable growth

CLC number: