Yield Prediction Models in Guangxi Sugarcane Planting Regions Based on Machine Learning Methods
Received date: 2023-04-08
Online published: 2023-06-25
Supported by: Guangxi Science and Technology Major Project (Guike AA22117004); National Natural Science Foundation of China project (31760342)
[Objective] Accurate prediction of sugarcane yield in Guangxi can inform government policy and guide farmers' planting decisions, thereby improving sugarcane yield and quality and promoting the development of the sugarcane industry. This research explored the relationship between sugarcane yield and meteorological factors in the main sugarcane-producing areas of Guangxi Zhuang Autonomous Region, in order to provide scientific data support for sugar factories and related management departments.
[Methods] The study area comprised five sugarcane planting regions located in five different counties in Guangxi, China. The average yield per hectare of each planting region was provided by Guangxi Sugar Industry Group, which controls the sugar refineries of each planting region. Daily meteorological data covering 14 meteorological factors from 2002 to 2019 were acquired from the National Data Center for Meteorological Sciences to analyze their influence on sugarcane yield. Because meteorological factors can affect sugarcane growth differently during different time spans, a new kind of factor combining a meteorological factor with a time span was defined, such as the average precipitation in August or the average temperature from February to April. The inter-correlations among all meteorological factors of different time spans, and their correlations with yield, were then analyzed to identify the key meteorological factors of sensitive time spans. After that, four algorithms, namely BP neural network (BPNN), support vector machine (SVM), random forest (RF), and long short-term memory (LSTM), were employed to establish sugarcane apparent yield prediction models for each planting region. Corresponding reference models based on the annual meteorological factors were also built.
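The time-span factor idea described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `span_mean` and `screen_factors`, the 0.5 correlation cutoff, and the toy data are all assumptions made for illustration, and only the correlation with yield is screened here (the inter-correlation check among factors is omitted).

```python
import numpy as np

def span_mean(years, months, values, start_month, end_month):
    """Mean of a meteorological variable over months start..end (inclusive), per year."""
    years, months, values = (np.asarray(a) for a in (years, months, values))
    in_span = (months >= start_month) & (months <= end_month)
    unique_years = np.unique(years)
    means = np.array([values[in_span & (years == y)].mean() for y in unique_years])
    return unique_years, means

def screen_factors(factors, yields, threshold=0.5):
    """Return {name: r} for factors whose |Pearson r| with yield reaches the
    threshold. The 0.5 cutoff is an assumed value, not taken from the paper."""
    kept = {}
    for name, series in factors.items():
        r = float(np.corrcoef(series, yields)[0, 1])
        if abs(r) >= threshold:
            kept[name] = r
    return kept

# Toy data (hypothetical): one record per month for six years.
years = np.repeat(np.arange(2002, 2008), 12)
months = np.tile(np.arange(1, 13), 6)
year_idx = years - 2002
precip = np.where(months == 8, 100.0 + 10.0 * year_idx, 50.0)  # August rises yearly
temp = np.where(year_idx % 2 == 0, 20.0, 22.0)                 # alternates by year
yields = 60.0 + 2.0 * np.arange(6)                             # t/ha, rising trend

_, precip_aug = span_mean(years, months, precip, 8, 8)
_, temp_feb_apr = span_mean(years, months, temp, 2, 4)
kept = screen_factors(
    {"precip_Aug": precip_aug, "temp_Feb-Apr": temp_feb_apr}, yields
)
print(kept)
```

In this toy case only the August precipitation factor passes the cutoff, because the alternating temperature series is only weakly correlated with the rising yield.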
Additionally, the meteorological yield of each planting region was extracted by HP filtering, and general meteorological yield prediction models were built on the data of all five planting regions using RF, SVM, BPNN, and LSTM, respectively.
[Results and Discussions] The correlation analysis showed that different planting regions had different sensitive meteorological factors and key time spans. The most representative meteorological factors were mainly sunshine hours, precipitation, and atmospheric pressure. In Region 1, the strongest negative correlation with yield was observed for sunshine hours during October and November, while the strongest positive correlation was found for the minimum relative humidity in November. In Region 2, the strongest positive correlation with yield was observed for the average vapor pressure during February and March, whereas the strongest negative correlation was associated with the precipitation in August and September. In Region 3, the strongest positive correlation with yield was found for the 20‒20 precipitation during August and September, while the strongest negative correlation was related to sunshine hours in the same period. In Region 4, the strongest positive correlation with yield was observed for the 20‒20 precipitation from March to December, whereas the strongest negative correlation was associated with the highest atmospheric pressure from August to December. In Region 5, the strongest positive correlation with yield was found for the average vapor pressure from June to August, whereas the strongest negative correlation was related to the lowest atmospheric pressure in February and March.
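The HP-filter step used to extract meteorological yield can be sketched as follows. This is a hedged illustration, not the authors' implementation: the Hodrick-Prescott filter splits a series into a smooth trend and a residual, and the residual is taken here as the meteorological yield (actual yield minus trend yield). The smoothing parameter `lamb=100` is a common choice for annual data and is an assumption, since the abstract does not state the value used; the yield series is synthetic.

```python
import numpy as np

def hp_filter(y, lamb=100.0):
    """Hodrick-Prescott filter: return (trend, cycle) with y = trend + cycle.

    The trend minimizes sum((y - trend)**2) + lamb * sum(second_diff(trend)**2),
    which has the closed-form solution (I + lamb * D'D) trend = y.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator D with shape (n - 2, n): rows are [1, -2, 1].
    D = np.diff(np.eye(n), n=2, axis=0)
    trend = np.linalg.solve(np.eye(n) + lamb * (D.T @ D), y)
    return trend, y - trend

# Synthetic 18-year yield series (hypothetical values, t/ha):
# a slow upward trend plus a year-to-year fluctuation.
t = np.arange(18, dtype=float)
actual_yield = 60.0 + 0.8 * t + 3.0 * np.sin(t)

trend_yield, meteorological_yield = hp_filter(actual_yield)
print(np.round(trend_yield, 2))
print(np.round(meteorological_yield, 2))
```

A larger `lamb` gives a smoother trend and pushes more of the variation into the residual, so the choice of `lamb` directly affects the meteorological yield that the downstream models are trained on.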
For each specific planting region, the accuracy of the apparent yield prediction model based on sensitive meteorological factors during key time spans was clearly better than that of the model based on the annual average meteorological values. The LSTM model performed significantly better than the widely used classic BPNN, SVM, and RF models for both kinds of meteorological factors (under sensitive time spans or annually). The overall root mean square error (RMSE) and mean absolute percentage error (MAPE) of the LSTM model under key time spans were 10.34 t/ha and 6.85%, respectively, with a coefficient of determination (R²) of 0.8489 between the predicted values and true values. For the general prediction models of meteorological yield across multiple sugarcane planting regions, the RF, SVM, and BPNN models achieved good results, and the best prediction performance was achieved by the BPNN model, with an RMSE of 0.98 t/ha, MAPE of 9.59%, and R² of 0.965. The RMSE and MAPE of the LSTM model were 0.25 t/ha and 39.99%, respectively, and the R² was 0.77.
[Conclusions] Sensitive meteorological factors under key time spans were more significantly correlated with the yields than the annual average meteorological factors. The LSTM model performed better at apparent yield prediction for a specific planting region than the classic BPNN, SVM, and RF models, but the BPNN model performed better than the other models in predicting meteorological yield over multiple sugarcane planting regions.
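For reference, the three reported metrics can be computed as in the sketch below. The function names and sample values are hypothetical, and R² is written here as 1 − SS_res/SS_tot; the abstract does not say whether the authors used this form or the squared Pearson correlation, so that choice is an assumption.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the yield (e.g. t/ha)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if any true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot (assumed form)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical true and predicted yields (t/ha), not data from the study.
y_true = [80.0, 100.0, 120.0]
y_pred = [84.0, 95.0, 120.0]
print(rmse(y_true, y_pred), mape(y_true, y_pred), r_squared(y_true, y_pred))
```

Because MAPE divides each error by the true value, it grows quickly when true values are close to zero, whereas RMSE depends only on the absolute size of the errors; the two metrics can therefore rank models differently.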
Key words: meteorological factor; HP filter; sugarcane yield; BPNN model; LSTM model; machine learning
SHI Jiefeng, HUANG Wei, FAN Xieyang, LI Xiuhua, LU Yangxu, JIANG Zhuhui, WANG Zeping, LUO Wei, ZHANG Muqing. Yield Prediction Models in Guangxi Sugarcane Planting Regions Based on Machine Learning Methods[J]. Smart Agriculture, 2023, 5(2): 82-92. DOI: 10.12133/j.smartag.SA202304004