
Smart Agriculture ›› 2021, Vol. 3 ›› Issue (1): 118-128. doi: 10.12133/j.smartag.2021.3.1.202012-SA001

• Information Processing and Decision Making •

Agricultural Named Entity Recognition Based on Semantic Aggregation and Model Distillation

LI Liangde1,2, WANG Xiujuan1,3, KANG Mengzhen1,2, HUA Jing1,4, FAN Menghan1,2

  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  3. Beijing Engineering Research Center of Intelligent Systems and Technology, Beijing 100190, China
  4. Qingdao Smart AgriTech., Ltd, Qingdao 266000, China
  • Received: 2020-12-17 Revised: 2021-02-09 Online: 2021-03-30 Published: 2021-06-01
  • Corresponding author: Mengzhen KANG, E-mail: liliangde2018@ia.ac.cn


With the development of smart agriculture, automatic question answering (Q&A) over agricultural knowledge is needed to improve the efficiency of agricultural information acquisition. Agricultural named entity recognition plays a key role in an automatic Q&A system: it helps extract information, understand agricultural questions, and retrieve answers from a knowledge graph. Due to the scarcity of labeled agricultural named entity data, some existing open agricultural entity recognition models rely on manual features, which can reduce the accuracy of entity recognition. In this work, a model distillation approach was proposed to recognize agricultural named entities. Firstly, massive agricultural data were collected from the Internet and an agricultural knowledge graph (AgriKG) was constructed. To overcome the scarcity of labeled agricultural named entity data, weak named entity recognition labels were generated for agricultural texts crawled from the Internet with the help of AgriKG. This approach was inspired by distant supervision, which had previously been used to alleviate the scarcity of labeled relation extraction data. Given the lack of labeled data, a pretrained language model was introduced and fine-tuned with the existing labeled data. Secondly, the large-scale pretrained language model BERT was used for agricultural named entity recognition, providing good initial parameters that encode substantial basic language knowledge. Considering that agricultural named entity recognition relies heavily on low-level semantic features but only slightly on high-level ones, an attention-based layer aggregation mechanism for BERT (BERT-ALA) was designed in this research to adaptively aggregate the outputs of BERT's multiple hidden layers. On top of BERT-ALA, a bidirectional LSTM (BiLSTM) and a conditional random field (CRF) were coupled to further improve recognition precision, giving a BERT-ALA+BiLSTM+CRF model. The BiLSTM compensates for BERT's limited ability to learn relative position features, while the CRF models the dependencies among entity labels. Thirdly, since the BERT-ALA+BiLSTM+CRF model is difficult to serve online due to its extremely high time and space complexity, a BiLSTM+CRF model was used as the student model to distill the BERT-ALA+BiLSTM+CRF teacher, fitting the teacher's outputs at the BiLSTM layer and the CRF layer. Experiments on the dataset constructed in this research, as well as on two open datasets, showed that (1) the macro-F1 of the BERT-ALA+BiLSTM+CRF model improved by 1% over the baseline BERT+BiLSTM+CRF model, and (2) compared with the same model trained on the original data, the macro-F1 of the distilled student BiLSTM+CRF model increased by an average of 3.3%, while prediction time was reduced by 33% and storage space by 98%. The experimental results verify the effectiveness of BERT-ALA and knowledge distillation for agricultural entity recognition.
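The layer aggregation idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes one learnable scalar score per encoder layer, softmax-normalizes the scores into attention weights, and returns the weighted sum of the layers' hidden states. All names (`LayerAggregation`, `scores`) are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class LayerAggregation:
    """Hypothetical sketch of attention-based layer aggregation over the
    hidden layers of an encoder such as BERT."""
    def __init__(self, num_layers, seed=0):
        rng = np.random.default_rng(seed)
        # one learnable score per layer; trained jointly with the task in practice
        self.scores = rng.normal(size=num_layers)

    def __call__(self, hidden_states):
        # hidden_states: (num_layers, seq_len, hidden_dim)
        w = softmax(self.scores)                       # attention weights over layers
        return np.tensordot(w, hidden_states, axes=1)  # (seq_len, hidden_dim)

# toy usage: 12 encoder layers, a 5-token sequence, hidden dimension 8
layers = np.ones((12, 5, 8))
agg = LayerAggregation(12)
out = agg(layers)
print(out.shape)  # (5, 8)
```

Because the weights sum to one, layers that carry the low-level features the task depends on can receive most of the mass, while uninformative layers are down-weighted.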
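The distillation step, in which the student fits the teacher's BiLSTM-layer and CRF-layer outputs, could take roughly the following form. This is an assumed sketch of such an objective (the paper's exact loss is not given in the abstract): mean-squared error on the intermediate BiLSTM features plus mean-squared error on the pre-CRF emission scores, balanced by a hypothetical weight `alpha`.

```python
import numpy as np

def distill_loss(t_bilstm, t_logits, s_bilstm, s_logits, alpha=0.5):
    """Hypothetical distillation objective: the student matches the teacher's
    BiLSTM-layer features and its CRF-layer emission scores via MSE."""
    feat = np.mean((s_bilstm - t_bilstm) ** 2)   # intermediate-feature term
    emit = np.mean((s_logits - t_logits) ** 2)   # output-score term
    return alpha * feat + (1 - alpha) * emit

# toy usage: (seq_len, dim) features and (seq_len, num_labels) emission scores
t_feat, t_emit = np.ones((5, 8)), np.zeros((5, 3))
loss = distill_loss(t_feat, t_emit, t_feat, t_emit)
print(loss)  # 0.0 when the student reproduces the teacher exactly
```

Matching an intermediate layer as well as the final scores gives the small student a denser training signal than the hard labels alone, which is consistent with the reported macro-F1 gain of the distilled BiLSTM+CRF over the same model trained on the original data.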

Key words: distant supervision, agriculture knowledge graph, agriculture Q&A system, named entity recognition, knowledge distillation, deep learning, BERT, Bi-LSTM
