
Smart Agriculture

   

Knowledge Graph Driven Grain Big Data Applications: Overview and Perspective

YANG Chenxue, LI Xian, ZHOU Qingbo

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China

Abstract:

[Significance] Grain production in China spans multiple stages and involves numerous heterogeneous factors, including agronomic inputs, natural resources, environmental conditions, and socio-economic variables. However, the data generated throughout the entire production process, from cultivation planning to harvest evaluation, remain highly fragmented, unstructured, and semantically diverse. This data complexity, combined with the lack of integrated core algorithms to support decision-making, has severely limited the potential of big data to drive innovation in grain production. Knowledge graph (KG) technology, by offering structured, semantically rich representations of complex data, provides a promising approach to address these challenges. KGs enable the integration of multi-source and heterogeneous data, enhance semantic mining and reasoning capabilities, and offer intelligent, knowledge-driven support for sustainable grain production. [Progress] This paper systematically reviewed the current state of research and application of knowledge graphs in the domain of grain production big data. A comprehensive KG-driven framework was proposed based on a hybrid paradigm combining data-driven modeling and domain knowledge guidance. The framework was designed to support the entire grain production lifecycle and addressed three primary dimensions of data complexity: structural diversity, relational heterogeneity, and semantic ambiguity. The key techniques for constructing multimodal knowledge graphs and performing temporal reasoning for grain production were described. First, an agricultural ontology system for grain production was designed, incorporating domain-specific concepts, hierarchical relationships, and attribute constraints. This ontology provided the semantic foundation for knowledge modeling and alignment.
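As an illustration of what an ontology with concepts, hierarchical relationships, and attribute constraints can look like, the following is a minimal sketch and not the authors' ontology. The class `Concept`, the concept names, and the numeric ranges are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A hypothetical ontology concept with an optional parent (is-a link)."""
    name: str
    parent: Optional["Concept"] = None
    # attribute name -> (min, max) allowed numeric range; values are illustrative
    constraints: dict = field(default_factory=dict)

    def is_a(self, other: "Concept") -> bool:
        """Walk up the hierarchy to test whether this concept specializes `other`."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def all_constraints(self) -> dict:
        """Collect constraints from the root down, so children override ancestors."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        merged = {}
        for node in reversed(chain):
            merged.update(node.constraints)
        return merged

    def validate(self, attributes: dict) -> list:
        """Return a list of constraint violations for an instance's attributes."""
        errors = []
        for key, (lo, hi) in self.all_constraints().items():
            if key not in attributes:
                errors.append(f"missing attribute: {key}")
            elif not lo <= attributes[key] <= hi:
                errors.append(f"{key}={attributes[key]} outside [{lo}, {hi}]")
        return errors


# Illustrative hierarchy: WinterWheat is-a Cereal is-a Crop (ranges are made up).
crop = Concept("Crop", constraints={"growth_days": (30, 400)})
cereal = Concept("Cereal", parent=crop)
winter_wheat = Concept("WinterWheat", parent=cereal,
                       constraints={"sowing_month": (9, 11)})

print(winter_wheat.is_a(crop))
print(winter_wheat.validate({"growth_days": 240, "sowing_month": 5}))
```

Inheriting constraints along the is-a chain means a record extracted from any data source can be checked against one shared definition, which is one way an ontology can serve as the basis for alignment.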
Second, multimodal named entity recognition (NER) techniques were employed to extract entities such as crops, varieties, weather conditions, operations, and equipment from structured and unstructured data sources, including satellite imagery, agronomic reports, IoT sensor data, and historical statistics. Advanced deep learning models, such as BERT and vision-language transformers, were used to enhance recognition accuracy across text and image modalities. Third, the system implemented multimodal entity linking and disambiguation, which connected identical or semantically similar entities across different data sources by leveraging graph embeddings, semantic similarity measures, and rule-based matching. Finally, temporal reasoning modules were constructed using temporal KGs and logical rules to support dynamic inference over time-sensitive knowledge, such as crop growth stages, climate variations, and policy interventions. The proposed KG-driven system enabled the development of intelligent applications across multiple stages of grain production. In the pre-production stage, knowledge graphs supported decision-making in resource allocation, crop variety selection, and planting schedule optimization based on past data patterns and predictive inference. During the in-production stage, the system facilitated precision operations, such as real-time fertilization and irrigation, by reasoning over current field status, real-time sensor inputs, and historical trends. In the post-production stage, it enabled yield assessment and economic evaluation through integration of production outcomes, environmental factors, and policy constraints. [Conclusions and Prospects] Knowledge graph technologies offer a scalable and semantically enhanced approach for unlocking the full potential of grain production big data.
By integrating heterogeneous data sources, representing domain knowledge explicitly, and supporting intelligent reasoning, KGs can provide visualization, explainability, and decision support across various spatial scales, including national, provincial, county-level, and large-scale farm contexts. These technologies are of great scientific and practical significance in supporting China's national food security strategy and advancing the goals of "storing grain in the land" and "storing grain in technology". Future directions include the construction of cross-domain agricultural knowledge fusion systems, dynamic ontology evolution mechanisms, and federated KG platforms for multi-region data collaboration under data privacy constraints.
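The temporal reasoning described in the abstract (temporal KGs combined with logical rules) can be sketched as follows. This is a minimal illustration under stated assumptions, not the system reviewed in the paper. Facts are stored as quadruples with a validity interval, and a single hand-written rule derives a risk fact when a growth stage overlaps a weather event. The names `TemporalFact`, `SENSITIVITY`, and `infer_risks`, as well as the fields, stages, and dates, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalFact:
    """A (subject, predicate, object) triple valid from `start` to `end`, inclusive."""
    subject: str
    predicate: str
    obj: str
    start: date
    end: date


def overlap(a: TemporalFact, b: TemporalFact):
    """Return the shared (start, end) interval of two facts, or None if disjoint."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start <= end else None


def facts_at(facts, day: date):
    """Snapshot query: all facts that hold on a given day."""
    return [f for f in facts if f.start <= day <= f.end]


# Hypothetical rule table: (growth stage, weather event) -> derived risk label.
SENSITIVITY = {("flowering", "heat_wave"): "heat_stress_risk"}


def infer_risks(facts):
    """Rule: in_stage(F, S) and experiences(F, E) overlap in time, and S is
    sensitive to E  =>  has_risk(F, R) over the overlapping interval."""
    stages = [f for f in facts if f.predicate == "in_stage"]
    events = [f for f in facts if f.predicate == "experiences"]
    derived = []
    for s in stages:
        for e in events:
            if s.subject != e.subject:
                continue
            risk = SENSITIVITY.get((s.obj, e.obj))
            window = overlap(s, e)
            if risk and window:
                derived.append(TemporalFact(s.subject, "has_risk", risk, *window))
    return derived


# Made-up example data for two fields.
facts = [
    TemporalFact("field_A", "in_stage", "tillering", date(2024, 3, 1), date(2024, 4, 14)),
    TemporalFact("field_A", "in_stage", "flowering", date(2024, 4, 15), date(2024, 5, 5)),
    TemporalFact("field_A", "experiences", "heat_wave", date(2024, 5, 1), date(2024, 5, 10)),
    TemporalFact("field_B", "in_stage", "flowering", date(2024, 4, 20), date(2024, 4, 28)),
]

for fact in infer_risks(facts):
    print(fact)
```

Attaching the interval to the derived fact keeps the inference time-bounded: the risk applies only while the sensitive stage and the event coincide, which is the kind of time-sensitive conclusion a static triple store cannot express directly.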

Key words: grain production big data, knowledge representation, multimodal knowledge graph, named entity recognition, entity linking, temporal reasoning

CLC Number: