
Smart Agriculture ›› 2021, Vol. 3 ›› Issue (1): 118-128. doi: 10.12133/j.smartag.2021.3.1.202012-SA001

• Information Processing and Decision Making •

Agricultural Named Entity Recognition Based on Semantic Aggregation and Model Distillation

LI Liangde1,2(), WANG Xiujuan1,3, KANG Mengzhen1,2, HUA Jing1,4, FAN Menghan1,2   

  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
    3. Beijing Engineering Research Center of Intelligent Systems and Technology, Beijing 100190, China
    4. Qingdao Smart AgriTech Co., Ltd., Qingdao 266000, China
  • Received: 2020-12-17 Revised: 2021-02-09 Online: 2021-03-30
  • Supported by:
    Strategic Priority Research Program of the Chinese Academy of Sciences (Class A) (XDA20030102); General Program of the National Natural Science Foundation of China (62076239); Joint Research Project of the Chinese Academy of Sciences and the National Science and Technology Development Agency of Thailand (GJHZ2076)
  • About the author: LI Liangde (1996-), male, master's student; research interest: natural language processing for agriculture. E-mail: liliangde2018@ia.ac.cn
  • Corresponding author:

Agricultural Named Entity Recognition Based on Semantic Aggregation and Model Distillation

LI Liangde1,2(), WANG Xiujuan1,3, KANG Mengzhen1,2, HUA Jing1,4, FAN Menghan1,2   

  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
    2.School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
    3.Beijing Engineering Research Center of Intelligent Systems and Technology, Beijing 100190, China
    4. Qingdao Smart AgriTech Co., Ltd., Qingdao 266000, China
  • Received: 2020-12-17 Revised: 2021-02-09 Online: 2021-03-30

Abstract (Chinese):

Labeled data for agricultural named entity recognition is currently scarce, and some open agricultural entity recognition models rely on hand-crafted features, resulting in low recognition accuracy. Although some deep-learning-based agricultural entity recognition models achieve better recognition performance, they suffer from high inference latency and large parameter counts. This study proposed an agricultural entity recognition method based on knowledge distillation. First, an agricultural knowledge graph was constructed from massive agricultural data on the Internet, and weakly labeled corpora were then obtained from it through distant supervision. Second, tailored to the characteristics of entity recognition, an attention-based BERT layer aggregation model (BERT-ALA) was proposed to fuse semantic features from different layers; combined with a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF), the resulting BERT-ALA+BiLSTM+CRF model served as the teacher model. Finally, a BiLSTM+CRF model was used as the student model to distill the teacher model, so that the model's prediction time and parameter count met the requirements of online serving. Experiments on the agricultural entity recognition dataset constructed in this study and on two public datasets showed that the macro-F1 of the BERT-ALA+BiLSTM+CRF model improved by an average of 1% over the baseline BERT+BiLSTM+CRF model. The macro-F1 of the distilled student model BiLSTM+CRF improved by an average of 3.3% over the same model trained on the original data, while prediction time decreased by 33% and storage space by 98%. The experimental results verify the effectiveness of the attention-based BERT layer aggregation model and of knowledge distillation for agricultural entity recognition.
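The layer aggregation at the heart of BERT-ALA can be pictured as an attention-weighted sum over BERT's per-layer hidden states. The sketch below is an illustrative, framework-agnostic version of that idea; the function and parameter names are hypothetical and not taken from the paper.

```python
import numpy as np

def aggregate_layers(hidden_states, layer_scores):
    """Attention-weighted aggregation of per-layer representations.

    hidden_states: array of shape (num_layers, seq_len, hidden_size),
                   e.g. the stacked hidden states of all BERT layers.
    layer_scores:  array of shape (num_layers,), learnable attention
                   scores (here supplied directly for illustration).
    Returns the weighted sum over the layer axis: (seq_len, hidden_size).
    """
    # softmax over the layer axis turns scores into aggregation weights
    w = np.exp(layer_scores - layer_scores.max())
    w = w / w.sum()
    # contract the layer axis: sum_l w[l] * hidden_states[l]
    return np.tensordot(w, hidden_states, axes=(0, 0))
```

In a trained model the `layer_scores` would be learned parameters, letting the model adaptively emphasize the lower layers that, as the abstract notes, matter most for entity recognition.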

Key words (Chinese): distant supervision, agricultural knowledge graph, agricultural Q&A system, named entity recognition, knowledge distillation, deep learning, BERT, bidirectional long short-term memory network

Abstract:

With the development of smart agriculture, automatic question answering (Q&A) over agricultural knowledge is needed to improve the efficiency of agricultural information acquisition. Agricultural named entity recognition plays a key role in an automatic Q&A system: it helps obtain information, understand agricultural questions, and retrieve answers from the knowledge graph. Because labeled agricultural named entity data are scarce, some existing open agricultural entity recognition models rely on manual features, which reduces the accuracy of entity recognition. In this work, a model distillation approach was proposed for agricultural named entity recognition. Firstly, massive agricultural data were collected from the Internet and an agricultural knowledge graph (AgriKG) was constructed. To overcome the scarcity of labeled agricultural entity data, weak named entity labels were produced on agricultural texts crawled from the Internet with the help of AgriKG. This approach derives from distant supervision, which was originally introduced to address the scarcity of labeled relation extraction data. Secondly, a large-scale pretrained language model, BERT, was used for agricultural named entity recognition; it provides good initial parameters encoding substantial basic language knowledge and was fine-tuned with the existing labeled data. Considering that agricultural named entity recognition relies heavily on low-level semantic features but only slightly on high-level ones, an attention-based layer aggregation mechanism for BERT (BERT-ALA) was designed to adaptively aggregate the outputs of BERT's multiple hidden layers. On top of BERT-ALA, a bidirectional LSTM (BiLSTM) and a conditional random field (CRF) were coupled to further improve recognition precision, giving the BERT-ALA+BiLSTM+CRF model. The BiLSTM compensates for BERT's insufficient modeling of relative position features, while the CRF models the dependencies among entity labels.
Thirdly, since the BERT-ALA+BiLSTM+CRF model is difficult to serve online because of its high time and space complexity, a BiLSTM+CRF model was used as the student model to distill the BERT-ALA+BiLSTM+CRF teacher, fitting the teacher's outputs at both the BiLSTM layer and the CRF layer. Experiments on the dataset constructed in this research, as well as on two open datasets, showed that (1) the macro-F1 of the BERT-ALA+BiLSTM+CRF model improved by 1% over the baseline BERT+BiLSTM+CRF model, and (2) compared with the same model trained on the original data, the macro-F1 of the distilled student model BiLSTM+CRF increased by an average of 3.3%, prediction time was reduced by 33%, and storage space was reduced by 98%. The experimental results verify the effectiveness of BERT-ALA and knowledge distillation in agricultural entity recognition.
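The two-part distillation objective described above, in which the student fits both the teacher's BiLSTM-layer output and its label distribution, can be sketched as a combined loss. This is a minimal illustration under stated assumptions: the use of mean squared error for the hidden states, cross-entropy against teacher soft targets for the labels, and the `alpha` balancing weight are choices made here for clarity, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_hidden, teacher_hidden,
                      student_logits, teacher_logits, alpha=0.5):
    """Combined distillation loss (illustrative sketch).

    student_hidden / teacher_hidden: (seq_len, hidden_size) BiLSTM outputs.
    student_logits / teacher_logits: (seq_len, num_labels) label scores.
    alpha: hypothetical weight balancing the two terms.
    """
    # term 1: match the teacher's intermediate BiLSTM representation
    mse = np.mean((student_hidden - teacher_hidden) ** 2)
    # term 2: match the teacher's soft label distribution
    p_teacher = softmax(teacher_logits)
    log_p_student = np.log(softmax(student_logits) + 1e-12)
    ce = -np.mean(np.sum(p_teacher * log_p_student, axis=-1))
    return alpha * mse + (1 - alpha) * ce
```

Because the student only needs to imitate these two outputs, the small BiLSTM+CRF model can absorb much of the teacher's knowledge while keeping the prediction latency and storage footprint reported in the abstract.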

Key words: distant supervision, agriculture knowledge graph, agriculture Q&A system, named entity recognition, knowledge distillation, deep learning, BERT, Bi-LSTM

CLC number: