Applications Research Progress and Prospects of Multi-Agent Large Language Models in Agricultural

doi:10.12133/j.smartag.SA202503026

Abstract

[Significance] With the rapid advancement of large language models (LLMs) and multi-agent systems, their integration, multi-agent LLMs, is emerging as a transformative force in modern agriculture. Agricultural production involves complex, sequential, and highly environment-dependent processes, including tillage, planting, management, and harvesting. Traditional intelligent systems often struggle with the diversity, uncertainty, and coordination demands of these stages. Multi-agent LLMs offer a new paradigm for agricultural intelligence by combining deep semantic understanding with distributed collaboration and adaptive coordination. Through role specialization, real-time perception, and cooperative decision-making, they can decompose complex workflows, adapt to changing conditions, and enable robust, full-process automation, making them well suited to the challenges of modern agriculture. More importantly, their application marks a critical step toward the digital transformation, precision management, and sustainable development of agriculture. By enabling intelligent decision-making across the entire agricultural lifecycle, they provide both theoretical foundations and practical tools for building next-generation smart and unmanned farming systems.
[Progress] The core concepts of multi-agent LLMs are first elucidated, covering the composition and characteristics of multi-agent systems as well as the development and training pipelines of LLMs. The overall architecture of multi-agent systems is then presented, encompassing both the environments in which agents operate and their internal structures. Next, the collaborative patterns of multi-agent LLMs are examined in terms of coordination structures and temporal organization. Following this, interaction mechanisms are discussed from multiple dimensions, including interactions between agents and the external environment, inter-agent communication, communication protocol frameworks, and communication security. To demonstrate the varying task specializations of different multi-agent frameworks, a comparative benchmark survey table is provided by synthesizing benchmark tasks and results reported in existing studies. The results show that different multi-agent LLM architectures tend to perform better on specific types of tasks, reflecting the influence of framework design characteristics such as role assignment strategies, communication protocols, and decision-making mechanisms. Furthermore, several representative multi-agent LLM architectures proposed in existing studies are briefly reviewed, and their potential applicability to agricultural scenarios is discussed on the basis of their design features. Finally, current research progress and practical applications of LLMs, multimodal large models, and multi-agent LLMs in the agricultural domain are surveyed. The application architecture of agricultural LLMs is summarized, using rice cultivation as a representative scenario to illustrate the collaborative process of an LLM-powered multi-agent system. This process involves data acquisition agents, data processing agents, task allocation and coordination agents, task execution agents, and feedback and optimization agents. The roles and functions of each type of agent in enabling automated and intelligent operations throughout the entire agricultural lifecycle, including tillage, planting, management, and harvesting, are comprehensively described. In addition, drawing on existing research on multimodal data processing, pseudocode is provided to illustrate the basic logic of the data processing agents.
[Conclusions and Prospects] Multi-agent LLM technology holds great promise in agriculture but still faces several challenges. First, limited model interpretability, stemming from opaque internal reasoning and high-dimensional parameter mappings, hinders decision transparency, traceability, user trust, and debugging efficiency. Second, model hallucination remains a significant problem: probabilistic generation may deviate from facts, leading to erroneous environmental perception and decisions that cause resource waste or crop damage. Third, multimodal agricultural data acquisition and processing remain complex because of non-uniform equipment standards, heterogeneous data, and insufficient cross-modal reasoning, all of which complicate data fusion and decision-making. Future directions include: (1) enhancing interpretability via chain-of-thought techniques to improve reasoning transparency and traceability; (2) reducing hallucinations by integrating knowledge bases, retrieval-augmented generation, and verification mechanisms to bolster decision reliability; and (3) standardizing data formats to strengthen cross-modal fusion and reasoning. These measures are expected to improve system stability and efficiency, providing solid support for the advancement of smart agriculture.

Key words: multi-agent, large language models, agricultural applications, intelligent decision-making, deep learning, smart agriculture

CLC Number: TP18; S126

ZHAO Yingping, LIANG Jinming, CHEN Beizhang, DENG Xiaoling, ZHANG Yi, XIONG Zheng, PAN Ming, MENG Xiangbao. Applications Research Progress and Prospects of Multi-Agent Large Language Models in Agricultural[J]. Smart Agriculture, 2025, 7(5): 37-51.

Figures/Tables (10): Fig. 1, Fig. 2, Fig. 3, Fig. 4, Fig. 5, Table 1, Fig. 6, Fig. 7, Fig. 8, Table 2

Table 2

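Pseudocode for multimodal data fusion process

Input:

Image data $I$ (e.g., remote sensing images, crop growth images)

Sensor data $S$ (e.g., temperature and humidity, light intensity, soil moisture)

Text data $T$ (e.g., farm management logs, weather reports)

Steps:

(1) Data preprocessing

Normalize, crop, or augment the image data $I$

Denoise, interpolate, and standardize the units of the sensor data $S$

Tokenize the text data $T$ and remove stop words

(2) Feature extraction

Extract image features with an image model (e.g., ViT):

$F_i = \mathrm{ImageEncoder}(I)$

Extract sensor features with a time-series model (e.g., long short-term memory, LSTM):

$F_s = \mathrm{SensorEncoder}(S)$

Extract text features with a text model (e.g., BERT):

$F_t = \mathrm{TextEncoder}(T)$

(3) Feature alignment

Map the features of the different modalities into a space of the same dimension:

$F_i' = \mathrm{Align}(F_i)$

$F_s' = \mathrm{Align}(F_s)$

$F_t' = \mathrm{Align}(F_t)$

(4) Feature fusion

$F_{\mathrm{all}} = \mathrm{Fusion}(F_i', F_s', F_t')$

Output:

The fused, unified feature representation $F_{\mathrm{all}}$, which serves as the input for downstream tasks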
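
The four steps of this pseudocode (preprocessing, feature extraction, alignment, fusion) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions: the encoders are toy stand-ins for ViT, LSTM, and BERT, the alignment is a fixed random linear projection rather than a learned layer, fusion is plain concatenation, and all function names and the stop-word list are hypothetical rather than taken from the paper.

```python
import random
from typing import List, Optional, Sequence

Vector = List[float]
STOPWORDS = {"the", "a", "an", "of", "and", "in", "on", "is"}  # illustrative only


def preprocess_image(image: Sequence[Sequence[float]]) -> List[Vector]:
    """Step 1: scale 8-bit pixel values to the range [0, 1]."""
    return [[px / 255.0 for px in row] for row in image]


def preprocess_sensor(series: Sequence[Optional[float]]) -> Vector:
    """Step 1: forward-fill missing readings (a simple stand-in for interpolation).

    A missing first reading is filled with 0.0.
    """
    cleaned: Vector = []
    last = 0.0
    for value in series:
        last = value if value is not None else last
        cleaned.append(last)
    return cleaned


def preprocess_text(text: str) -> List[str]:
    """Step 1: lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


def image_encoder(image: List[Vector]) -> Vector:
    """Step 2, toy stand-in for a ViT: one mean intensity per image row."""
    return [sum(row) / len(row) for row in image]


def sensor_encoder(series: Vector) -> Vector:
    """Step 2, toy stand-in for an LSTM: mean, min, and max of the series."""
    return [sum(series) / len(series), min(series), max(series)]


def text_encoder(tokens: List[str], bins: int = 8) -> Vector:
    """Step 2, toy stand-in for BERT: hashed bag-of-words counts."""
    counts = [0.0] * bins
    for tok in tokens:
        counts[sum(ord(ch) for ch in tok) % bins] += 1.0
    return counts


def make_projection(in_dim: int, out_dim: int, seed: int) -> List[Vector]:
    """Fixed random (out_dim x in_dim) matrix; assumed in place of a learned Align layer."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def align(features: Vector, weights: List[Vector]) -> Vector:
    """Step 3: linearly map a feature vector into the shared space."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def fuse(*aligned: Vector) -> Vector:
    """Step 4: concatenate the aligned features into one representation."""
    return [x for vec in aligned for x in vec]


def fuse_modalities(image, series, text, dim: int = 4) -> Vector:
    """Run steps 1 to 4 and return the fused feature vector."""
    f_i = image_encoder(preprocess_image(image))
    f_s = sensor_encoder(preprocess_sensor(series))
    f_t = text_encoder(preprocess_text(text))
    aligned = [
        align(f, make_projection(len(f), dim, seed))
        for seed, f in enumerate((f_i, f_s, f_t))
    ]
    return fuse(*aligned)


if __name__ == "__main__":
    f_all = fuse_modalities(
        image=[[0, 128, 255], [64, 64, 64]],
        series=[21.5, None, 23.0],
        text="Irrigation of the rice paddy is scheduled",
    )
    print(len(f_all))  # 3 modalities x 4 shared dimensions = 12
```

Concatenation keeps the sketch short; attention-based or gated fusion are common alternatives when modalities differ in reliability.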


Method	MMLU	HumanEval	SRDD	CommonGen	Average
Test set size (items)	14,042	164	1,201	1,497	-
GPTSwarm	0.2368	0.4969	0.7096	0.6222	0.5163
AgentVerse	0.2977	0.7256	0.7587	0.5399	0.5805
MacNet (chain structure)	0.6632	0.3720	0.8056	0.5903	0.6078
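
The table can be read programmatically to see which framework leads on each benchmark. The sketch below assumes that the last column (average score) is the unweighted mean of the four benchmark scores; the source does not state how it is computed, so the check only confirms agreement to within rounding. The helper names are hypothetical.

```python
from typing import Dict, List

BENCHMARKS: List[str] = ["MMLU", "HumanEval", "SRDD", "CommonGen"]

# Scores copied from the table, in the order given by BENCHMARKS.
SCORES: Dict[str, List[float]] = {
    "GPTSwarm": [0.2368, 0.4969, 0.7096, 0.6222],
    "AgentVerse": [0.2977, 0.7256, 0.7587, 0.5399],
    "MacNet (chain structure)": [0.6632, 0.3720, 0.8056, 0.5903],
}

# Averages as reported in the last column of the table.
REPORTED_AVERAGE: Dict[str, float] = {
    "GPTSwarm": 0.5163,
    "AgentVerse": 0.5805,
    "MacNet (chain structure)": 0.6078,
}


def unweighted_mean(values: List[float]) -> float:
    """Arithmetic mean, ignoring the differing test set sizes (an assumption)."""
    return sum(values) / len(values)


def best_per_benchmark(scores: Dict[str, List[float]]) -> Dict[str, str]:
    """Return the top-scoring method for each benchmark column."""
    best: Dict[str, str] = {}
    for idx, name in enumerate(BENCHMARKS):
        best[name] = max(scores, key=lambda method: scores[method][idx])
    return best


if __name__ == "__main__":
    for method, values in SCORES.items():
        print(method, round(unweighted_mean(values), 4), REPORTED_AVERAGE[method])
    print(best_per_benchmark(SCORES))
```

No single framework wins every column, which matches the observation that different architectures tend to perform better on specific types of tasks.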