Smart Agriculture ›› 2025, Vol. 7 ›› Issue (2): 26-40. doi: 10.12133/j.smartag.SA202501004

• Topic--Development and Application of the Big Data Platform for Grain Production

Knowledge Graph Driven Grain Big Data Applications: Overview and Perspective

YANG Chenxue, LI Xian, ZHOU Qingbo

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  • Received: 2025-01-01 Online: 2025-03-30
  • Foundation items:
    The National Key Research and Development Program of China (2023YFD2000102)
  • About author:
    YANG Chenxue, E-mail:
  • Corresponding author:
    ZHOU Qingbo, E-mail:

Abstract:

[Significance] Grain production spans multiple stages and involves numerous heterogeneous factors, including agronomic inputs, natural resources, environmental conditions, and socio-economic variables. However, the data generated throughout the production process, from cultivation planning to harvest evaluation, remains highly fragmented, unstructured, and semantically diverse. This data complexity, combined with the lack of integrated core algorithms to support decision-making, has severely limited the potential of big data to drive innovation in grain production. Knowledge graph technology addresses these challenges by offering structured, semantically rich representations of complex data: it enables the integration of multi-source heterogeneous data, enhances semantic mining and reasoning, and provides intelligent, knowledge-driven support for sustainable grain production.
[Progress] This paper systematically reviewed the research and application progress of knowledge graphs in grain production big data. A comprehensive knowledge graph driven framework was proposed, based on a hybrid paradigm that combines data-driven modeling with domain knowledge guidance, to support the entire grain production lifecycle and to address three primary dimensions of data complexity: structural diversity, relational heterogeneity, and semantic ambiguity. The key techniques for constructing multimodal knowledge graphs and performing temporal reasoning for grain production were described. First, an agricultural ontology system for grain production was designed, incorporating domain-specific concepts, hierarchical relationships, and attribute constraints. This ontology provided the semantic foundation for knowledge modeling and alignment. Second, multimodal named entity recognition (NER) techniques were employed to extract entities such as crops, varieties, weather conditions, operations, and equipment from structured and unstructured data sources, including satellite imagery, agronomic reports, Internet of Things (IoT) sensor data, and historical statistics. Advanced deep learning models, such as bidirectional encoder representations from transformers (BERT) and vision-language transformers, were used to enhance recognition accuracy across text and image modalities. Third, the system implemented multimodal entity linking and disambiguation, which connected identical or semantically similar entities across different data sources by leveraging graph embeddings, semantic similarity measures, and rule-based matching. Finally, temporal reasoning modules were constructed using temporal knowledge graphs and logical rules to support dynamic inference over time-sensitive knowledge, such as crop growth stages, climate variations, and policy interventions. The proposed knowledge graph driven system enabled the development of intelligent applications across multiple stages of grain production. In the pre-production stage, knowledge graphs supported decision-making in resource allocation, crop variety selection, and planting schedule optimization based on historical data patterns and predictive inference. During the in-production stage, the system facilitated precision operations, such as real-time fertilization and irrigation, by reasoning over current field status, real-time sensor inputs, and historical trends. In the post-production stage, it enabled yield assessment and economic evaluation by integrating production outcomes, environmental factors, and policy constraints.
[Conclusions and Prospects] Knowledge graph technologies offer a scalable and semantically enhanced approach to unlocking the full potential of grain production big data. By integrating heterogeneous data sources, representing domain knowledge explicitly, and supporting intelligent reasoning, knowledge graphs can provide visualization, explainability, and decision support across various spatial scales, including national, provincial, county-level, and large-scale farm contexts. These technologies are of great scientific and practical significance in supporting China's national food security strategy and advancing the goals of "storing grain in the land" and "storing grain in technology". Future directions include the construction of cross-domain agricultural knowledge fusion systems, dynamic ontology evolution mechanisms, and federated knowledge graph platforms for multi-region data collaboration under data privacy constraints.
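
As an illustration of the temporal reasoning idea summarized in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that time-sensitive knowledge is stored as facts with a validity interval and that a hand-written logical rule is evaluated against the facts valid on a given day. All names (TemporalFact, needs_irrigation, field_A, the relation labels, and the set of water-sensitive stages) are hypothetical and chosen only for demonstration; the rule is not agronomic guidance.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalFact:
    """A (subject, relation, object) triple that holds from start to end, inclusive."""
    subject: str
    relation: str
    obj: str
    start: date
    end: date


def facts_at(facts, day):
    """Return only the facts whose validity interval contains the given day."""
    return [f for f in facts if f.start <= day <= f.end]


def holds(facts, day, subject, relation, obj):
    """Check whether a specific triple is valid on the given day."""
    return any(
        f.subject == subject and f.relation == relation and f.obj == obj
        for f in facts_at(facts, day)
    )


# Hypothetical set of growth stages treated as water-sensitive in this sketch.
WATER_SENSITIVE_STAGES = {"jointing", "heading"}


def needs_irrigation(facts, field, day):
    """Illustrative rule: a field in a water-sensitive growth stage
    with low soil moisture on the same day is flagged for irrigation."""
    in_sensitive_stage = any(
        holds(facts, day, field, "growth_stage", stage)
        for stage in WATER_SENSITIVE_STAGES
    )
    return in_sensitive_stage and holds(facts, day, field, "soil_moisture", "low")


# Hypothetical example facts for one field.
FACTS = [
    TemporalFact("field_A", "planted_with", "winter_wheat", date(2024, 10, 10), date(2025, 6, 10)),
    TemporalFact("field_A", "growth_stage", "jointing", date(2025, 3, 20), date(2025, 4, 15)),
    TemporalFact("field_A", "soil_moisture", "low", date(2025, 4, 1), date(2025, 4, 5)),
]

if __name__ == "__main__":
    for day in (date(2025, 3, 25), date(2025, 4, 3), date(2025, 4, 10)):
        print(day.isoformat(), needs_irrigation(FACTS, "field_A", day))
```

In this sketch the rule fires only on days when both conditions overlap in time, which is the property that distinguishes reasoning over a temporal knowledge graph from querying a static one.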

Key words: grain production big data, knowledge representation, multimodal knowledge graph, named entity recognition, entity linking, temporal reasoning

CLC Number: