
Smart Agriculture


Research on a Method for Extracting Key Factors of Agricultural User Demand Based on Large Language Models

LI Runteng (栗润腾)1,3,4, WANG Yiqun (王一群)1, LI Hongda (李宏达)2,3,4, LI Jingchen (李静晨)3,4, CHEN Wenbai (陈雯柏)1

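  1. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
  2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
  3. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
  4. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100079, China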
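  • Received: 2025-09-02 Published: 2025-12-11
  • Funding:
    Ministry of Science and Technology 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0113603); National Natural Science Foundation of China (62276028); Major Research Plan of the National Natural Science Foundation of China (92267110)
  • About the author:

    LI Runteng, master's student; research interests: large language models and multi-agent collaboration. E-mail:

  • Corresponding author:
    CHEN Wenbai, Ph.D., professor; research interests: pattern recognition and intelligent systems, and intelligent science and technology. E-mail: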

Research on Key Factor Extraction of Agricultural User Demand Based on Large Language Models

LI Runteng1,3,4, WANG Yiqun1, LI Hongda2,3,4, LI Jingchen3,4, CHEN Wenbai1

  1. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
  2. School of Electronic Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
  3. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
  4. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100079, China
  • Received: 2025-09-02 Online: 2025-12-11
  • Foundation items: Major Project of Scientific and Technological Innovation 2030 (2021ZD0113603); National Natural Science Foundation of China (62276028); National Natural Science Foundation of China Major Research Plan (92267110)
  • About the author:

    LI Runteng, E-mail:

  • Corresponding author:
    CHEN Wenbai, E-mail:

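Abstract (from the Chinese-language version):

[Objective/Significance] In agriculture, user demand texts are the core basis for precise agricultural technology extension and targeted policy services. However, their specialized agricultural terminology, variation across scenarios, and dynamically updated content cause traditional methods to suffer from large errors in recognizing specialized terms, ambiguous classification of demands across scenarios, and lagging responses to changing demands, which makes efficient demand analysis difficult. This study therefore builds an intelligent agricultural demand analysis method based on large language models that targets these weaknesses. The method aims to achieve accurate identification, structured extraction, and efficient parsing of agricultural user demands, to provide high-quality demand data for agricultural technology extension and policy services, and to support the precise implementation of agricultural digital transformation. [Methods] Large language models were introduced into agricultural demand analysis, using their semantic understanding and reasoning capabilities to parse user demands accurately. A "three-stage training + multi-agent collaboration" technical route was proposed. First, 80 000 agricultural texts and 22 800 annotated samples were collected for domain pretraining and instruction fine-tuning to strengthen the model's domain adaptability. A multi-agent collaborative framework was then designed, in which a manager agent handles task scheduling and quality control while task agents perform demand classification, key-factor extraction, and explanation generation, respectively, enabling structured analysis of agricultural user demands. [Results and Discussions] The proposed method, the agricultural demand analysis agent framework (Agri-NeedAgent), achieved a demand-type matching accuracy of 84.6%, a key-factor extraction F1 score of 85.2%, an interface compliance rate of 94.2%, and an interpretability score of 90.2, all significantly better than general-purpose large language models without domain fine-tuning and baseline methods without the multi-agent collaboration mechanism. [Conclusions] After domain adaptation and fine-tuning, large language models can parse agricultural user demands in depth and extract key factors accurately, and the multi-agent collaboration mechanism improves the completeness and interpretability of the analysis, providing high-quality data support for decision-support systems in smart agriculture.

Key words: agricultural demand analysis, multi-agent, large language model, key factor extraction, domain fine-tuning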

Abstract:

[Objective] In the agricultural domain, user demand texts are essential primary sources for agricultural extension, production management, and policy services. However, these texts typically contain highly specialized terminology, use non-standard, colloquial, and diverse expressions, present fragmented semantics, and rely heavily on contextual reasoning. These characteristics make them difficult to parse accurately with traditional rule-based approaches or shallow machine learning models, which often leads to biased demand classification and incomplete extraction of key factors and thereby constrains the quality of data available for intelligent agricultural decision-making. To address these challenges, this study aims to develop a robust, domain-adapted, and highly interpretable structured analysis method for agricultural user demands. [Methods] Agri-NeedAgent, an agricultural user demand analysis framework, was proposed based on a "three-stage training + multi-agent collaboration" paradigm. First, in the domain knowledge pretraining stage, 80 000 agriculture-related texts (including crop cultivation manuals, pest and disease control guides, agricultural policy documents, and farmer consultation records) were used to build domain-specific semantic understanding, enhancing the model's ability to interpret agricultural terminology, dialectal expressions, contextual logic, and implicit semantics. Second, in the instruction fine-tuning stage, 6 320 annotated samples in an "instruction–input–output" format were used to establish an explicit mapping from raw demand texts to structured outputs. Third, in the agricultural knowledge low-rank adaptation stage, Low-Rank Adaptation (LoRA) was applied to perform lightweight parameter tuning on task-specific agents, enabling targeted adaptation for the demand classification and key-factor extraction tasks.
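As a rough illustration of the "instruction–input–output" format used in the instruction fine-tuning stage, one annotated sample might be serialized as a single JSON line. This is only a sketch: the class and function names, the instruction wording, the demand label, and the key-factor fields are hypothetical and are not taken from the paper's dataset.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InstructionSample:
    """One annotated sample in instruction-input-output form (illustrative only)."""
    instruction: str  # task description given to the model
    input: str        # raw user demand text
    output: str       # structured target, stored here as a JSON string


def make_sample(demand_text: str, demand_type: str, key_factors: dict) -> InstructionSample:
    # The target pairs a demand category with its extracted key factors.
    target = {"demand_type": demand_type, "key_factors": key_factors}
    return InstructionSample(
        instruction="Classify the agricultural user demand and extract its key factors as JSON.",
        input=demand_text,
        output=json.dumps(target, ensure_ascii=False, sort_keys=True),
    )


def to_jsonl_line(sample: InstructionSample) -> str:
    # One sample per line is a common layout for fine-tuning corpora.
    return json.dumps(asdict(sample), ensure_ascii=False)


sample = make_sample(
    demand_text="My greenhouse tomato leaves have yellow spots. What should I spray?",
    demand_type="pest_and_disease_control",  # hypothetical label
    key_factors={"crop": "tomato", "setting": "greenhouse", "symptom": "yellow leaf spots"},
)
line = to_jsonl_line(sample)
```

Storing the target as a JSON string keeps the mapping from raw text to structured output explicit, which is the role the abstract assigns to this stage.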
Building on this training process, a multi-agent collaborative framework was constructed in which a manager agent is responsible for task scheduling and quality control, while task agents perform demand classification, key-factor extraction, and explanation generation, respectively. Through this division of labor, the framework achieves efficient and structured analysis of agricultural user demands. [Results and Discussions] Experimental results showed that Agri-NeedAgent achieved a demand classification accuracy of 84.6%, a key-factor extraction F1-score of 85.2%, a structured interface compliance rate of 94.2%, and an interpretability score of 90.2. These results represent clear improvements over traditional deep learning models such as Bidirectional Encoder Representations from Transformers (BERT) and over general-purpose large language models (LLMs) without domain adaptation. The findings confirmed the critical role of domain knowledge injection, explicit task alignment, and multi-agent specialization in enhancing semantic understanding and structured analysis of agricultural texts. Ablation experiments further validated the effectiveness of each component. Removing domain pretraining or LoRA fine-tuning substantially degraded classification and key-factor extraction performance, indicating that domain adaptation and task-specific optimization are necessary for handling non-standard agricultural expressions. Moreover, removing the manager agent or the Reasoning and Acting (ReAct) mechanism significantly reduced structured interface compliance and interpretability, highlighting the importance of task coordination, intermediate verification, and multi-step reasoning for ensuring logical consistency and output completeness.
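The division of labor between a manager agent and task agents can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the three task agents are keyword-based stubs standing in for fine-tuned model calls, and `ManagerAgent`, `REQUIRED_KEYS`, and the retry policy are hypothetical names and choices introduced for illustration.

```python
from typing import Callable, Dict, List

# A task agent maps the demand text plus shared context to a partial result.
TaskAgent = Callable[[str, Dict[str, object]], Dict[str, object]]

REQUIRED_KEYS = ("demand_type", "key_factors", "explanation")  # assumed output interface


def classify(text: str, ctx: Dict[str, object]) -> Dict[str, object]:
    # Stub: a real agent would query a fine-tuned model instead of matching keywords.
    label = "pest_and_disease_control" if "spots" in text else "other"
    return {"demand_type": label}


def extract_factors(text: str, ctx: Dict[str, object]) -> Dict[str, object]:
    # Stub: pick out a few known crop names from the text.
    crops = [c for c in ("tomato", "wheat", "maize") if c in text.lower()]
    return {"key_factors": {"crop": crops}}


def explain(text: str, ctx: Dict[str, object]) -> Dict[str, object]:
    # The explanation agent reads what the earlier agents produced.
    return {"explanation": f"Classified as {ctx['demand_type']} for crops {ctx['key_factors']['crop']}."}


class ManagerAgent:
    """Schedules task agents in order and checks the merged output."""

    def __init__(self, agents: List[TaskAgent], max_retries: int = 1):
        self.agents = agents
        self.max_retries = max_retries

    def is_compliant(self, result: Dict[str, object]) -> bool:
        # Quality control: every required field must be present and non-empty.
        return all(result.get(k) for k in REQUIRED_KEYS)

    def run(self, text: str) -> Dict[str, object]:
        result: Dict[str, object] = {}
        for _ in range(self.max_retries + 1):
            result = {}
            for agent in self.agents:
                result.update(agent(text, result))
            if self.is_compliant(result):
                break
        result["compliant"] = self.is_compliant(result)
        return result


manager = ManagerAgent([classify, extract_factors, explain])
report = manager.run("My tomato leaves have yellow spots.")
```

With deterministic stubs a retry cannot change the outcome; the retry loop only marks where a manager could re-dispatch work when a model-backed agent returns an incomplete result.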
Additionally, removing the external knowledge base reduced the interpretability score from 90.2 to 77.6, underscoring its essential role in providing theoretical grounding, reasoning support, and professional explanations. Although multi-agent collaboration introduced an additional inference overhead of approximately 140 ms, the overall per-sample inference time remained within 225 ms, meeting the real-time requirements of agricultural consultation scenarios. [Conclusions] Supported by the "three-stage training + multi-agent collaboration" framework, LLMs can effectively address the challenges posed by non-standard expressions, semantic fragmentation, and multi-factor reasoning in agricultural user demand texts. The proposed method significantly improves demand classification, key-factor extraction, structured output compliance, and interpretability, providing high-quality and traceable structured data for intelligent agricultural decision-making. After domain adaptation and task-specific tuning, the model gains enhanced capability for deep semantic analysis of agricultural user demands, and multi-agent coordination ensures the completeness and interpretability of its outputs. The current workflow still requires optimization in data preparation, staged training, and knowledge-base updating. Future work will focus on expanding region-specific and emerging-technology-related demand data, developing a dynamically updated agricultural knowledge system, improving multi-agent coordination efficiency, and exploring cross-lingual agricultural demand analysis to further promote the application and deployment of agricultural large models across broader scenarios.
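The abstract reports an F1-score for key-factor extraction but does not state how matches are counted. One common convention, assumed here purely for illustration, is a micro-averaged F1 over exact-match (field, value) pairs. The function names and the toy data below are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def to_pairs(factors: Dict[str, str]) -> Set[Pair]:
    # Normalize case and whitespace so trivial differences do not count as errors.
    return {(k.strip().lower(), v.strip().lower()) for k, v in factors.items()}


def micro_f1(gold: List[Dict[str, str]], pred: List[Dict[str, str]]) -> Dict[str, float]:
    # Accumulate true positives, false positives, and false negatives over all samples.
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g_pairs, p_pairs = to_pairs(g), to_pairs(p)
        tp += len(g_pairs & p_pairs)
        fp += len(p_pairs - g_pairs)
        fn += len(g_pairs - p_pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


gold = [{"crop": "tomato", "symptom": "yellow leaf spots"}, {"crop": "wheat"}]
pred = [{"crop": "Tomato", "symptom": "leaf curl"}, {"crop": "wheat", "stage": "heading"}]
scores = micro_f1(gold, pred)
```

In this toy example two of the four predicted pairs are correct and one gold pair is missed, so precision is 0.5, recall is 2/3, and F1 is 4/7.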

Key words: agricultural demand analysis, multi-agent, large language model, key factor extraction, domain fine-tuning

CLC number: