Agricultural Named Entity Recognition Based on Semantic Aggregation and Model Distillation

doi:10.12133/j.smartag.2021.3.1.202012-SA001

Abstract

With the development of smart agriculture, automatic question answering (Q&A) over agricultural knowledge is needed to make agricultural information easier to obtain. Agricultural named entity recognition plays a key role in automatic Q&A systems: it helps to obtain information, understand agricultural questions, and provide answers from a knowledge graph. Because labeled agricultural named entity (ANE) data are scarce, some existing agricultural entity recognition models rely on manually designed features, which can reduce recognition accuracy. In this work, an approach based on model distillation was proposed for agricultural named entity recognition. Firstly, an agriculture knowledge graph (AgriKG) was constructed from massive agricultural data collected from the Internet. To overcome the scarcity of labeled ANE data, weak named entity recognition labels were built for agricultural texts crawled from the Internet with the help of AgriKG. This approach was derived from distant supervision, which has been used to address the scarcity of labeled data for relation extraction. Given the lack of labeled data, a pretrained language model was also introduced and fine-tuned on the existing labeled data. Secondly, the large-scale pretrained language model BERT was used for agricultural named entity recognition; it provided good initial parameters that already encode a large amount of basic language knowledge. Considering that agricultural named entity recognition relies heavily on low-level semantic features and only slightly on high-level semantic features, an Attention-based Layer Aggregation mechanism for BERT (BERT-ALA) was designed to adaptively aggregate the outputs of multiple hidden layers of BERT. On top of BERT-ALA, a bidirectional LSTM (BiLSTM) and a conditional random field (CRF) were added to further improve recognition precision, giving the BERT-ALA+BiLSTM+CRF model. The BiLSTM compensated for BERT's limited ability to learn relative position features, while the CRF modeled the dependencies between entity recognition labels. Thirdly, because the BERT-ALA+BiLSTM+CRF model was difficult to serve online owing to its extremely high time and space complexity, a BiLSTM+CRF model was used as the student model to distill it. The student was trained to fit the outputs of the BiLSTM layer and the CRF layer of the BERT-ALA+BiLSTM+CRF model. Experiments on the dataset constructed in this research and on two open datasets showed that (1) the macro-F1 of the BERT-ALA+BiLSTM+CRF model was 1% higher than that of the baseline BERT+BiLSTM+CRF model, and (2) compared with the same model trained on the original data, the distilled BiLSTM+CRF student model increased macro-F1 by an average of 3.3%, reduced prediction time by 33%, and reduced storage space by 98%. The experimental results verify the effectiveness of BERT-ALA and knowledge distillation for agricultural entity recognition.
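
The weak labeling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that AgriKG can be flattened into a dictionary from entity name to entity type, that texts are tagged character by character with BIO labels, and that overlapping candidates are resolved by greedy longest match. The function name `weak_label`, the type names `CROP` and `DISEASE`, and the sample sentence are all hypothetical.

```python
from typing import Dict, List, Tuple


def weak_label(text: str, entity_types: Dict[str, str]) -> List[Tuple[str, str]]:
    """Tag each character of `text` with a BIO label by greedy longest match.

    `entity_types` stands in for the knowledge graph: it maps an entity
    name to its type (an assumption made for this sketch).
    """
    max_len = max((len(name) for name in entity_types), default=0)
    tagged: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate span first so longer names win.
        for span in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + span]
            if candidate in entity_types:
                match = candidate
                break
        if match is None:
            # No known entity starts here: label the character as outside.
            tagged.append((text[i], "O"))
            i += 1
        else:
            label = entity_types[match]
            tagged.append((match[0], "B-" + label))
            tagged.extend((ch, "I-" + label) for ch in match[1:])
            i += len(match)
    return tagged


if __name__ == "__main__":
    # Hypothetical entries: 水稻 (rice) and 稻瘟病 (rice blast).
    graph_entities = {"水稻": "CROP", "稻瘟病": "DISEASE"}
    for char, tag in weak_label("水稻易感染稻瘟病", graph_entities):
        print(char, tag)
```

Labels produced this way are noisy by construction: in this sketch, any entity missing from the dictionary is tagged `O`, and any string that happens to match an entity name is tagged regardless of context, which is why such data are treated as weakly labeled.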

Keywords: distant supervision, agriculture knowledge graph, agriculture Q&A system, named entity recognition, knowledge distillation, deep learning, BERT, BiLSTM

CLC Number: TP391

LI Liangde, WANG Xiujuan, KANG Mengzhen, HUA Jing, FAN Menghan. Agricultural Named Entity Recognition Based on Semantic Aggregation and Model Distillation[J]. Smart Agriculture, 2021, 3(1): 118-128.

