Smart Agriculture

   

ADON-R: A Method for Constructing an Agricultural Ontology Network Based on Semantic Similarity and Rule-Based Reasoning

CHEN XiaoJing1,2, LI Wei1,2, LI Ruihang1,2, LIN Jia2, YAO Qiong2, WU Wendi2, FAN Jingchao1,2, YAN Shen4, WANG Jian1,2, ZHANG Jianhua1,2, ZHOU Guomin1,2,3,5

    1. Institute of Agricultural Information, Chinese Academy of Agricultural Sciences/Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs/National Agricultural Science Data Center, Beijing 100081, China
    2. Nanfan Research Institute, Chinese Academy of Agricultural Sciences, Sanya 572024, China
    3. Western Research Institute, Chinese Academy of Agricultural Sciences, Changji 831100, China
    4. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
    5. Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing 210014, China
  • Received: 2025-10-15; Online: 2026-01-20
  • Foundation items: National Key R&D Program of China (2022YFF0711800); Nanfan Special Project of Sanya Academy of Chinese Academy of Agricultural Sciences (YBXM2448); Central Public-interest Scientific Institution Basal Research Fund (JBYW-AII-2025-05); National Agricultural Science Data Center Project (NASDC2025XM11)
  • About author: CHEN XiaoJing; LI Wei
  • Corresponding author: ZHOU Guomin; ZHANG Jianhua

Abstract:

[Objective] The agricultural knowledge ecosystem has long been hindered by "knowledge silos" arising from heterogeneous, multi-source ontologies, a critical bottleneck for the advancement of smart agriculture. Existing ontology reasoning approaches are typically confined to a single semantic dimension or rely solely on formal logical rules, which leaves them unable to capture the intricate biological relationships and cross-disciplinary knowledge structures of agricultural domains. To address this challenge, the agricultural deep ontology network-reasoner (ADON-R) is proposed, a framework that integrates semantic similarity with rule-based inference to construct a unified agricultural ontology network. The aim is to establish a hybrid reasoning architecture that combines logical rigor with semantic discovery, enabling the systematic integration of 28 internationally recognized agricultural ontologies into a richly interconnected, structurally refined, and reliability-quantified knowledge network. [Methods] A dual-track inference architecture was designed, comprising a basic relational reasoning module and a graded relevance-based reasoning module. In the basic module, ten biologically plausible transitivity rules (R1-R10) were manually formulated based on four core relations defined by the Open Biomedical Ontologies (OBO) Foundry: is_a, part_of, has_part, and regulates. These rules were implemented as explicit SPARQL (SPARQL Protocol and RDF Query Language) queries using the Apache Jena library to complete missing triples, thereby establishing a logically consistent backbone for the knowledge network. This strategy prioritized interpretability and controllability, avoiding the rule conflicts and redundant derivations commonly introduced by generic reasoners.
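
The idea of completing missing triples with hand-written transitivity rules can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list rules R1-R10, and the paper runs SPARQL queries through Apache Jena rather than Python. The composition table `COMPOSITION`, the function `complete_triples`, and the example terms are all hypothetical, chosen only to show how a small, explicit rule set is applied until no new triples appear.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical composition table: (first relation, second relation) -> inferred relation.
# These four entries are illustrative only; they are NOT the paper's rules R1-R10.
COMPOSITION: Dict[Tuple[str, str], str] = {
    ("is_a", "is_a"): "is_a",
    ("part_of", "part_of"): "part_of",
    ("is_a", "part_of"): "part_of",
    ("part_of", "is_a"): "part_of",
}


def complete_triples(triples: Set[Triple]) -> Set[Triple]:
    """Apply the composition rules to a fixed point and return only the new triples."""
    known: Set[Triple] = set(triples)
    while True:
        # Index outgoing edges by subject so two-step paths are cheap to find.
        by_subject: Dict[str, List[Tuple[str, str]]] = {}
        for s, r, o in known:
            by_subject.setdefault(s, []).append((r, o))

        new: Set[Triple] = set()
        for s, r1, mid in known:
            for r2, o in by_subject.get(mid, []):
                rel = COMPOSITION.get((r1, r2))
                # Skip unlisted relation pairs and self-loops.
                if rel is not None and s != o and (s, rel, o) not in known:
                    new.add((s, rel, o))

        if not new:
            return known - set(triples)
        known |= new


if __name__ == "__main__":
    asserted = {
        ("wheat_leaf", "part_of", "wheat_plant"),
        ("wheat_plant", "is_a", "plant"),
        ("plant", "is_a", "organism"),
    }
    for triple in sorted(complete_triples(asserted)):
        print(triple)
```

Restricting inference to an explicit table like this is what makes every derived triple traceable to a named rule, which is the interpretability argument made above for preferring hand-written rules over a generic reasoner.
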
In the graded relevance-based module, a five-dimensional evidence framework was introduced, encompassing definition-based similarity, semantic similarity, biological network proximity, functional trait alignment, and taxonomic co-reference. Semantic similarity was computed by embedding term definitions with the BioBERT pre-trained language model, followed by large-scale approximate nearest neighbor search via FAISS (Facebook AI Similarity Search) across more than 130 000 definition texts. To validate BioBERT's efficacy, comparative experiments were conducted on STS-B (Semantic Textual Similarity Benchmark), with performance evaluated using Spearman's rank correlation coefficient. The remaining four evidence dimensions were derived through string matching, Jena rule execution, and traversal of specific relation paths (e.g., regulates, subClassOf, only_in_taxon). Newly inferred relations were classified into four confidence tiers (I-IV) according to the number of independent supporting evidence types: Tier I required ≥3 heterogeneous evidence sources; Tier II required exactly 2; Tier III relied solely on definition-based similarity; and Tier IV covered associations supported by a single non-definition evidence type. To mitigate error propagation, only Tier I–III relations were permitted to participate in subsequent transitive inference, and only under constrained conditions; Tier IV relations were excluded from further chaining because of insufficient evidential support. [Results and Discussion] The experimental pipeline integrated approximately 167 887 terms and 249 603 initial relations from 28 agricultural ontologies. ADON-R generated 1 305 312 new ontology relations. The basic reasoning module contributed 182 779 triples, substantially expanding the high-confidence is_a and part_of hierarchies through transitive closure. Within the graded relevance module, definition-based similarity yielded the largest volume of inferences (557 825 relations).
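
The tiering scheme described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the short evidence labels in `EVIDENCE_TYPES` and the function names `assign_tier` and `may_chain` are hypothetical, and the "constrained conditions" that govern chaining are not specified in the abstract, so `may_chain` models only the tier cutoff.

```python
from typing import Iterable

# Hypothetical short labels for the five evidence dimensions named in the abstract.
EVIDENCE_TYPES = frozenset({"definition", "semantic", "network", "trait", "taxon"})


def assign_tier(evidence: Iterable[str]) -> str:
    """Map the set of distinct evidence types supporting a relation to a tier I-IV."""
    kinds = set(evidence)  # duplicates of the same type count once
    unknown = kinds - EVIDENCE_TYPES
    if unknown:
        raise ValueError(f"unknown evidence types: {sorted(unknown)}")
    if not kinds:
        raise ValueError("a relation needs at least one supporting evidence type")

    if len(kinds) >= 3:
        return "I"    # three or more heterogeneous evidence sources
    if len(kinds) == 2:
        return "II"   # exactly two evidence sources
    if kinds == {"definition"}:
        return "III"  # definition-based similarity only
    return "IV"       # a single non-definition evidence type


def may_chain(tier: str) -> bool:
    """Tier cutoff only: Tiers I-III may feed further transitive inference.

    The additional constraints mentioned in the abstract are not modeled here.
    """
    return tier in {"I", "II", "III"}


if __name__ == "__main__":
    for ev in (["definition", "semantic", "taxon"], ["definition", "network"],
               ["definition"], ["trait"]):
        tier = assign_tier(ev)
        print(ev, "-> Tier", tier, "| may chain:", may_chain(tier))
```

Counting distinct evidence types rather than raw hits is what lets a single strong signal stay in Tier III or IV however many times it fires.
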
Notably, only four Tier I relations were produced, reflecting the method's "high precision, low recall" design principle, which prioritizes consensus among multiple orthogonal evidence streams. Tier II comprised 3 539 relations, while Tiers III and IV comprised 557 825 and 561 165 relations, respectively, together forming a spectrum of inferred associations ranging from highly reliable to exploratory. On the STS-B test set, BioBERT achieved a Spearman correlation of 0.852 0, slightly below general-domain BERT (0.868 1) but above the specialized biomedical models ClinicalBERT (0.844 2) and BlueBERT (0.818 0), indicating its suitability for domain-specific semantic understanding. Case studies in a graph database further illustrated ADON-R's capacity to uncover deep, cross-ontology connections. [Conclusions] The ADON-R framework constructs a large-scale, structurally granular, and reliability-stratified agricultural ontology network, mitigating knowledge fragmentation across heterogeneous sources. By combining logical rule inference with deep semantic modeling, ADON-R preserves the logical integrity of core ontological structures while substantially enhancing the discovery of latent cross-domain associations. Its evidence-grading mechanism attaches actionable confidence labels to automatically inferred relations, improving the adaptability and robustness of the knowledge network in real-world applications. Although rigorous empirical validation remains pending, ADON-R provides a methodological foundation for knowledge infrastructure in smart agriculture.
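
As a consistency check on the reported counts, and assuming the four tiers partition the output of the graded relevance module, the basic-module triples and the tier counts add up to the stated total of new relations:

```latex
\begin{aligned}
N_{\text{graded}} &= \underbrace{4}_{\text{Tier I}} + \underbrace{3\,539}_{\text{Tier II}} + \underbrace{557\,825}_{\text{Tier III}} + \underbrace{561\,165}_{\text{Tier IV}} = 1\,122\,533,\\
N_{\text{total}} &= \underbrace{182\,779}_{\text{basic module}} + N_{\text{graded}} = 1\,305\,312.
\end{aligned}
```

The Tier III count also equals the 557 825 relations attributed to definition-based similarity, which matches the definition of Tier III as relations supported by that evidence type alone.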

Key words: ontology reasoning, pre-trained language model, vector retrieval, rule-based reasoning, hierarchical evidence, agricultural ontology network
