Smart Agriculture, 2024, Vol. 6, Issue (1): 89-100. doi: 10.12133/j.smartag.SA202311032

• Topic--Intelligent Agricultural Sensor Technology

Using a Portable Visible-near Infrared Spectrometer and Machine Learning to Distinguish and Quantify Mold Contamination in Wheat

JIA Wenshen1,2, LYU Haolin1, ZHANG Shang1, QIN Yingdong2, ZHOU Wei3

    1. College of Computer and Information Technology, China Three Gorges University, Yichang 443002, China
    2. Institute of Quality Standards and Testing Technology, Beijing Academy of Agricultural and Forestry Sciences, Beijing 100097, China
    3. Food Inspection and Research Institute, Hebei Food Safety Key Laboratory, Shijiazhuang 050000, China
  • Received: 2023-11-27 Online: 2024-01-30
  • corresponding author:
    ZHANG Shang, E-mail:
  • Supported by:
    Key Research and Development Projects of Hebei Province (21375501D); Innovation and Capacity Building Project of Beijing Academy of Agriculture and Forestry Sciences (KJCX20230438); National Natural Science Foundation of China (31801634)

Abstract:

Objective Traditional methods for detecting mold in wheat are time-consuming, labor-intensive, and vulnerable to environmental influences, so a rapid, accurate, and reliable detection approach is needed. Visible-near infrared (Visible-NIR) spectroscopy has been used for the non-destructive, rapid assessment of wheat moisture content, crude protein content, concealed pests, starch content, dry matter, weight, hardness, origin, and other attributes. However, most of these studies rely on research-grade Visible-NIR spectrometers typically found in laboratories. While these instruments offer superior detection accuracy and stability, their bulk, lack of portability, and high cost hinder their adoption across agricultural product distribution channels.
Methods A low-resolution Visible-NIR spectrometer (VNIAPD, with a resolution of 1.6 nm) was used to collect wheat spectra. The aim was to improve the accuracy of moldy wheat detection by identifying, for each algorithm, a suitable spectral preprocessing method. A high-resolution Visible-NIR spectrometer (SINO2040, with a resolution of 0.19 nm) served as a control to validate the effectiveness of the instrument and method. The Zhoumai (No. 22) wheat variety was used, with a total of 100 samples prepared. The fresh wheat samples were scanned and then placed in a constant temperature chamber at 35 °C, a condition favorable to mold growth, to accelerate the reproduction of mold naturally present in the wheat. The degree of mold was categorized by cultivation time in the chamber: wheat was classified as mildly, moderately, or severely moldy after 3, 6, and 9 days of cultivation, respectively. A total of 400 wheat spectra were collected: 100 each for fresh wheat and for wheat cultured for 3, 6, and 9 days.
Preprocessing methods including standard deviation normalization (SDN), standard normal variate (SNV), mean centering (MC), first-order derivatives (1ST), Savitzky-Golay smoothing (SG), and multiplicative scatter correction (MSC) were applied to the spectral data. Outliers were identified and eliminated using the local outlier factor (LOF) method. The successive projections algorithm (SPA) and the least absolute shrinkage and selection operator (LASSO) were then used to extract characteristic wavelengths from the preprocessed spectra. Six algorithms, namely k-nearest neighbors (KNN), support vector machines (SVM), random forests (RF), Naïve-Bayes, back propagation neural networks (BPNN), and deep neural networks (DNN), were then used to model the characteristic-wavelength spectra, distinguishing moldy from fresh wheat and classifying the degree of mold. Models were evaluated on accuracy, modeling time, and model size to aid in selecting the most suitable model for a given application scenario.
Results and discussions Regarding accuracy, with the computationally slower and more memory-demanding neural network models BPNN and DNN, both the VNIAPD and the SINO2040 achieved 100% accuracy in the binary classification task of distinguishing fresh from moldy wheat, and 100% accuracy in the ternary classification task of distinguishing the three degrees of mold. With the faster and more memory-efficient shallow models (KNN, SVM, RF, and Naïve-Bayes), the VNIAPD reached its highest binary-classification test set accuracy, 97.72%, with RF, whereas the SINO2040 achieved 100% with Naïve-Bayes. In ternary classification, the VNIAPD reached 100% accuracy with both KNN and RF, while the SINO2040 reached 97.72% with KNN and SVM.
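As a rough illustration of the scatter-correction steps named in the Methods, the sketch below implements SNV, MC, MSC, and a first-order difference in plain Python. It is not the authors' code: the function names, the toy spectra, and the use of the sample standard deviation in SNV are assumptions made for illustration, and SDN and Savitzky-Golay smoothing are left out.

```python
from statistics import fmean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum on its own mean and
    scale it by its own (sample) standard deviation."""
    m = fmean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def mean_center(spectra):
    """Mean centering: subtract the per-wavelength mean across all spectra."""
    col_means = [fmean(col) for col in zip(*spectra)]
    return [[x - m for x, m in zip(row, col_means)] for row in spectra]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.
    Each spectrum x is fitted as x ~ a + b * ref by least squares and
    then corrected to (x - a) / b."""
    ref = [fmean(col) for col in zip(*spectra)]
    ref_mean = fmean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for row in spectra:
        row_mean = fmean(row)
        b = sum((r - ref_mean) * (x - row_mean) for r, x in zip(ref, row)) / ref_ss
        a = row_mean - b * ref_mean
        corrected.append([(x - a) / b for x in row])
    return corrected


def first_derivative(spectrum):
    """First-order derivative approximated by adjacent-point differences."""
    return [nxt - cur for cur, nxt in zip(spectrum, spectrum[1:])]


# Hypothetical toy data: one underlying band shape seen with three
# different additive offsets and multiplicative gains (scatter effects).
shape = [0.2, 0.4, 0.9, 0.5, 0.3]
toy_spectra = [[a + b * x for x in shape]
               for a, b in [(0.0, 1.0), (0.1, 1.5), (-0.05, 0.8)]]

snv_spectra = [snv(row) for row in toy_spectra]
msc_spectra = msc(toy_spectra)
centered = mean_center(toy_spectra)
derivs = [first_derivative(row) for row in toy_spectra]
```

Because the three toy spectra differ only by an offset and a positive gain, SNV and MSC each map them onto a single common curve, which is the scatter removal these transforms are meant to provide.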
Regarding modeling speed, the shallow machine learning algorithms (KNN, SVM, RF, and Naïve-Bayes) trained faster, with Naïve-Bayes the fastest at just 3 ms. In contrast, the neural network algorithms BPNN and DNN took 3 293 and 18 614 ms, respectively. Regarding memory footprint, BPNN had the largest model size, 4 028 KB, whereas SVM was the most memory-efficient at only 4 KB. Overall, the VNIAPD matched the SINO2040 in detection accuracy despite its coarser optical resolution (1.6 nm versus the SINO2040's 0.19 nm) and lower cost, highlighting its cost-effectiveness for this task.
Conclusions In this study, comparing different preprocessing methods for the spectral data identified the most suitable preprocessing choice for each algorithm. As a result, the low-resolution VNIAPD spectrometer performed on par with the high-resolution SINO2040 in detecting moldy wheat, providing a new option for low-cost, non-destructive detection of wheat mold and its severity based on Visible-NIR spectroscopy.
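The three evaluation criteria (accuracy, modeling time, and model size) can be measured with a small harness like the one sketched below. Everything in it is an assumption for illustration: the data are synthetic rather than wheat spectra, the Gaussian Naïve-Bayes class is a minimal hand-written stand-in used only to avoid external dependencies, and model size is approximated by the pickled size of the fitted parameters, which may differ from how the study measured it.

```python
import math
import pickle
import random
import time
from statistics import fmean, pvariance


class TinyGaussianNB:
    """Minimal Gaussian Naive Bayes (hypothetical stand-in classifier)."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            self.stats[label] = (
                math.log(len(rows) / len(X)),         # log prior
                [fmean(c) for c in cols],             # per-feature mean
                [pvariance(c) + 1e-9 for c in cols],  # per-feature variance
            )
        return self

    def _log_posterior(self, x, label):
        prior, means, variances = self.stats[label]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda lab: self._log_posterior(x, lab))
                for x in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return test accuracy, training time in ms, and parameter size in KB."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_ms = (time.perf_counter() - start) * 1000
    preds = model.predict(X_test)
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    size_kb = len(pickle.dumps(vars(model))) / 1024
    return {"accuracy": accuracy, "train_ms": train_ms, "size_kb": size_kb}


def make_synthetic(n_per_class, n_features, rng):
    """Two well-separated Gaussian classes standing in for fresh vs. moldy."""
    X, y = [], []
    for label, center in enumerate([0.0, 1.0]):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 0.3) for _ in range(n_features)])
            y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_synthetic(n_per_class=100, n_features=10, rng=rng)
order = list(range(len(X)))
rng.shuffle(order)
split = int(0.8 * len(order))  # 80/20 train/test split (an assumption)
train_idx, test_idx = order[:split], order[split:]

report = evaluate(
    TinyGaussianNB(),
    [X[i] for i in train_idx], [y[i] for i in train_idx],
    [X[i] for i in test_idx], [y[i] for i in test_idx],
)
print(report)
```

Running the same `evaluate` function over several candidate models gives a comparable table of the three criteria, which is the kind of trade-off the abstract describes between shallow models and neural networks.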

Key words: Visible-NIR spectroscopy, wheat mold, machine learning, nondestructive detection, food safety, neural networks