Welcome to Smart Agriculture 中文

Smart Agriculture

   

Research on Key Factor Extraction of Agricultural User Demand Based on Large Language Models

LI Runteng1,3,4, WANG Yiqun1, LI Hongda2,3,4, LI Jingchen3,4, CHEN Wenbai1()   

  1. 1. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
    2. School of Electronic Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
    3. National Engineering Research Center for Information Technology in Agriculture, Beijing 100097, China
    4. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100079, China
  • Received:2025-09-02 Online:2025-12-11
  • Foundation items:Major Project of Scientific and Technological Innovation 2030(2021ZD0113603); National Natural Science Foundation of China(62276028); National Natural Science Foundation of China Major Research Plan(92267110)
  • About author:

    LI Runteng, E-mail:

  • corresponding author:
    CHEN Wenbai, E-mail:

Abstract:

[Objective] In the agricultural domain, user demand texts serve as essential primary sources for agricultural extension, production management, and policy services. However, these texts typically contain highly specialized terminology, exhibit non-standard, colloquial, and diverse linguistic expressions, present fragmented semantics, and rely heavily on contextual reasoning. Such characteristics make them difficult to parse accurately using traditional rule-based approaches or shallow machine learning models. Consequently, these limitations often lead to biased demand classification and incomplete extraction of key factors, thereby constraining the quality of data available for intelligent agricultural decision-making. To address these challenges, the aim is to develop a robust, domain-adapted, and highly interpretable structured analysis method for agricultural user demands. [Methods] Agri-NeedAgent, an agricultural user demand analysis framework was proposed based on a "three-stage training + multi-agent collaboration" paradigm. First, during the domain knowledge pretraining stage, 80 000 agriculture-related texts—including crop cultivation manuals, pest and disease control guides, agricultural policy documents, and farmer consultation records—were used to construct domain-specific semantic understanding, thereby enhancing the model's capability to interpret agricultural terminology, dialectal expressions, contextual logic, and implicit semantics. Second, in the instruction fine-tuning stage, 6 320 annotated samples in an "instruction–input–output" format were employed to establish an explicit mapping from raw demand texts to structured outputs. Third, in the agricultural knowledge low-rank adaptation stage, Low-rank Adaptation (LoRA)was applied to perform lightweight parameter tuning on task-specific agents, enabling targeted adaptation for demand classification and key-factor extraction tasks. Built upon the above training process, a multi-agent collaborative framework was constructed, in which the manager agent is responsible for task scheduling and quality control, while task agents are designed to perform demand classification, key-factor extraction, and explanation generation, respectively. Through this division of labor and collaborative mechanism, the framework achieves efficient and structured analysis of agricultural user demands. [Results and Discussions] Experimental results demonstrate that the proposed Agri-NeedAgent achieves a demand classification accuracy of 84.6%, a key-factor extraction F1-Score of 85.2%, a structured interface compliance rate of 94.2%, and an interpretability score of 90.2.These results show clear improvements over traditional deep learning models such as Bidirectional Encoder Representations from Transformers (BERT) as well as general-purpose large language models (LLMs) without domain adaptation The findings confirmed the critical role of domain knowledge injection, explicit task alignment, and multi-agent specialization in enhancing semantic understanding and structured analysis of agricultural texts. Ablation experiments further validated the effectiveness of each component. Removing domain pretraining or LoRA fine-tuning resulted in substantial performance degradation in classification and key-factor extraction, indicating the necessity of domain adaptation and task-specific optimization for handling non-standard agricultural expressions. Moreover, eliminating the manager agent or the Reasoning and Acting(ReAct) mechanism significantly reduced structured interface compliance and interpretability, highlighting the importance of task coordination, intermediate verification, and multi-step reasoning for ensuring logical consistency and output completeness. Additionally, removing the external knowledge base reduced the interpretability score from 90.2 to 77.6, underscoring its essential role in providing theoretical grounding, reasoning support, and professional explanations. Although the multi-agent collaboration introduced an additional inference overhead of approximately 140 ms, the overall per-sample inference time remained within 225 ms, meeting the real-time requirements of agricultural consultation scenarios. [Conclusions] Supported by a "three-stage training + multi-agent collaboration" framework, LLMs can effectively address challenges posed by non-standard expressions, semantic fragmentation, and multi-factor reasoning in agricultural user demand texts. The proposed method demonstrates significant improvements in demand classification, key-factor extraction, structured output compliance, and interpretability, providing high-quality and traceable structured data for intelligent agricultural decision-making. After domain adaptation and task-specific tuning, the model not only gains enhanced capability for deep semantic analysis of agricultural user demands but also ensures the completeness and interpretability of outputs through multi-agent coordination. Although the current workflow still requires optimization in terms of data preparation, staged training, and knowledge-base updating, future work will focus on expanding region-specific and emerging-technology-related demand data, developing a dynamically updated agricultural knowledge system, improving multi-agent coordination efficiency, and exploring cross-lingual agricultural demand analysis to further promote the application and deployment of agricultural large models across broader scenarios.

Key words: agricultural demand analysis, multi-agent, large language model, key factor extraction, domain fine-tuning

CLC Number: