Progressive Convolutional Net Based Method for Agricultural Named Entity Recognition

doi:10.12133/j.smartag.SA202303001

Abstract

Pre-training refers to training the parameters of a deep neural network on a large corpus before the model is applied to a specific task. Downstream tasks can then fine-tune the pre-trained parameters on a small amount of labeled data instead of training a new model from scratch. Agricultural text poses particular challenges for named entity recognition (NER), such as complex entity naming conventions and fuzzy entity boundaries, yet current NER research based on pre-trained language models (PLMs) uses only the output of the last PLM layer as the sentence representation and ignores the rich information held in the model's internal layers. To address this, a named entity recognition method based on progressive convolutional networks is proposed. The method feeds a natural sentence through the PLM and stores the output representation of every layer. Using the progressive convolutional network module proposed in this research, the intermediate outputs are then convolved sequentially to extract shallow feature information that would otherwise be overlooked: starting from the first layer, the representations of two adjacent layers are convolved, and the fused result is convolved with the next layer in turn, so that the resulting enhanced sentence embedding covers the information of all model layers. The method requires no external information, yet yields a richer sentence representation. Previous research has shown that the sentence embeddings output by layers near the input contain more fine-grained information, such as phrase-level features, which can assist NER in the agricultural field. Because these intermediate outputs are already computed during the forward pass, the method enhances the sentence embedding by making full use of computation that has already been spent. Finally, a conditional random field (CRF) model is used to generate the globally optimal label sequence.
On a constructed agricultural dataset containing four types of agricultural entities, the comprehensive F₁ score of the proposed method was 3.61 percentage points higher than that of the basic BERT (Bidirectional Encoder Representations from Transformers) model. On the open dataset MSRA, the F₁ score reached 94.96%, indicating that the progressive convolutional network can enhance the model's ability to represent natural language and has advantages in NER tasks.
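The layer-by-layer fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each pair of representations is stacked as two input channels and fused by a single zero-padded 2-D convolution back into one map of the same shape, and it replaces real PLM outputs and trained weights with random numbers. The names `conv2d_same` and `progressive_fusion` are hypothetical.

```python
import random

def conv2d_same(channels, kernel):
    """Cross-correlate a stack of 2-D maps with one multi-channel kernel
    (the deep-learning sense of "convolution"), zero-padding the borders so
    the output keeps the height and width of each input map.

    channels: list of C maps, each a [seq_len][hidden] list of floats
    kernel:   list of C grids, each k x k with k odd
    """
    height, width = len(channels[0]), len(channels[0][0])
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for c, grid in enumerate(channels):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        # Positions outside the map act as zero padding.
                        if 0 <= ii < height and 0 <= jj < width:
                            total += kernel[c][di][dj] * grid[ii][jj]
            out[i][j] = total
    return out

def progressive_fusion(layer_outputs, kernels):
    """Fuse layer 1 with layer 2, then the fused map with layer 3, and so on.

    Assumption: the running result and the next layer form a 2-channel input.
    One kernel is needed per fusion step, i.e. len(layer_outputs) - 1 kernels.
    """
    fused = layer_outputs[0]
    for next_layer, kernel in zip(layer_outputs[1:], kernels):
        fused = conv2d_same([fused, next_layer], kernel)
    return fused

# Toy stand-ins for PLM layer outputs: 4 layers, 6 tokens, hidden size 8.
random.seed(0)
num_layers, seq_len, hidden, k = 4, 6, 8, 5
layer_outputs = [
    [[random.gauss(0, 1) for _ in range(hidden)] for _ in range(seq_len)]
    for _ in range(num_layers)
]
# Random 2-channel 5x5 kernels in place of trained weights.
kernels = [
    [[[random.gauss(0, 0.1) for _ in range(k)] for _ in range(k)] for _ in range(2)]
    for _ in range(num_layers - 1)
]
enhanced = progressive_fusion(layer_outputs, kernels)
print(len(enhanced), len(enhanced[0]))  # 6 8
```

The enhanced map keeps the shape of a single layer output, so under these assumptions it could be passed to a CRF layer in place of the last-layer representation.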

Key words: agricultural named entity recognition (NER), pre-trained language model (PLM), convolutional net, representation aggregation, deep learning

CLC Number: TP391.1

JI Jie, JIN Zhou, WANG Rujing, LIU Haiyan, LI Zhiyuan. Progressive Convolutional Net Based Method for Agricultural Named Entity Recognition[J]. Smart Agriculture, 2023, 5(1): 122-131.

Figures/Tables (15): Fig. 1 to Fig. 4; Table 1 to Table 11

Dataset	Type	Training set (count)	Test set (count)
PeopleDaily	Sentences	20864	4346
PeopleDaily	Entities	33992	7707
MSRA	Sentences	45000	3442
MSRA	Entities	7559	6192

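Item	Training set (count)	Validation set (count)	Test set (count)	Total (count)
Sentences	5050	1682	1682	8414
Entities	4351	1456	1449	7256
Agricultural product class	157	35	38	230
Agricultural product instance	1140	387	373	1900
Disease and pest class	103	33	22	158
Disease and pest instance	1099	406	372	1877
Administrative region	1852	615	644	3111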

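Entity type	Tag for first character of entity	Tag for other characters of entity
Agricultural product class	B-Product-Class	I-Product-Class
Agricultural product instance	B-Product-Instance	I-Product-Instance
Disease and pest class	B-DP-Class	I-DP-Class
Disease and pest instance	B-DP-Instance	I-DP-Instance
Administrative region	B-Region-Instance	I-Region-Instance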
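
The tagging scheme in the table above can be applied to a sentence character by character. The sketch below shows one way to turn entity spans into such tags. The example sentence, its spans, and the function name `spans_to_tags` are hypothetical, and the `O` tag for characters outside any entity is an assumption borrowed from the usual BIO convention, since the table lists only the B- and I- tags.

```python
def spans_to_tags(text, spans):
    """Convert (start, end, label) character spans into per-character tags.

    `end` is exclusive. Characters outside every span receive "O", which is
    an assumed convention and is not listed in the table.
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = "B-" + label          # first character of the entity
        for pos in range(start + 1, end):
            tags[pos] = "I-" + label        # remaining characters
    return tags

# Hypothetical example: "安徽" (region), "小麦" (product), "赤霉病" (disease).
sentence = "安徽小麦赤霉病"
spans = [
    (0, 2, "Region-Instance"),
    (2, 4, "Product-Instance"),
    (4, 7, "DP-Instance"),
]
for char, tag in zip(sentence, spans_to_tags(sentence, spans)):
    print(char, tag)
```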

Operating system	Windows 10
CPU model	Intel Xeon CPU E5-1630 v4 @ 3.70 GHz
GPU model	Titan X
Python version	3.7
TensorFlow version	1.14
Memory size	64 GB

Parameter	Value
Maximum sequence length	128
Batch size	32
Learning rate	0.00005
Dropout rate	0.5
Convolution kernel size	5×5
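
For reproduction, the hyperparameters in the table above can be collected in one configuration object. The sketch below is only an illustration: the class name `TrainingConfig` and its field names are hypothetical, and the steps-per-epoch figure assumes the 5050 training sentences of the agricultural dataset with the last partial batch kept.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the hyperparameter table; field names are illustrative."""
    max_seq_length: int = 128
    batch_size: int = 32
    learning_rate: float = 5e-5
    dropout_rate: float = 0.5
    kernel_size: Tuple[int, int] = (5, 5)

config = TrainingConfig()

# Assumption: 5050 training sentences, final partial batch included.
train_sentences = 5050
steps_per_epoch = math.ceil(train_sentences / config.batch_size)
print(steps_per_epoch)  # 158
```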