Phenotype Analysis of Pleurotus Geesteranus Based on Improved Mask R-CNN

ZHOU Huamao; WANG Jing; YIN Hua; CHEN Qi

doi:10.12133/j.smartag.SA202309024

Smart Agriculture, 2023, Vol. 5, Issue 4: 117-126

Special Issue--Artificial Intelligence and Robot Technology for Smart Agriculture

ZHOU Huamao¹, WANG Jing¹, YIN Hua², CHEN Qi²

1. College of Engineering, Jiangxi Agricultural University, Nanchang 330000, China
2. College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330000, China

CHEN Qi, E-mail: 37914448@qq.com

Received date: 2023-09-22

Online published: 2023-12-20

Supported by: National Natural Science Foundation of China (62362039)

Abstract

[Objective] Pleurotus geesteranus is a rare edible mushroom that is popular with consumers for its fresh taste and rich nutritional content. Its phenotype is an important determinant of overall quality, expressing both its intrinsic characteristics and its adaptation to different cultivation environments. Selecting varieties with good shape, integrity, and resistance to cracking is crucial in breeding. However, automated methods for measuring these phenotype parameters are still lacking. Manual measurement is time-consuming, labor-intensive, and subjective, which leads to inconsistent and inaccurate results, so the traditional approach cannot meet the demands of the rapidly developing Pleurotus geesteranus industry.

[Methods] To solve these problems, an industrial camera (Daheng MER-500-14GM) and a common smartphone (Redmi K40) were first used to capture high-resolution images at the DongSheng mushroom industry (Jiujiang, Jiangxi province). After blurred and repetitive images were discarded, 344 images remained, covering two common varieties, Taixiu 57 and Gaoyou 818. Data augmentation, including rotation, flipping, mirroring, and blurring, was applied to build a Pleurotus geesteranus image dataset of 3 440 images, which was divided into training and testing sets at a ratio of 8:2. Second, building on the classical Mask R-CNN, an enhanced version tailored to Pleurotus geesteranus phenotype recognition, named PG-Mask R-CNN (Pleurotus geesteranus-Mask Region-based Convolutional Neural Network), was designed. The network was refined in three ways: 1) after careful analysis and comparison, the SimAM attention mechanism was integrated into the third layer of the ResNet101 feature extraction network, which enhanced performance without increasing the number of network parameters; 2) because the feature pyramid path of Mask R-CNN is so long that low-level and high-level features are separated, which may impair the semantic information of the high-level features and lose the positioning information of the low-level features, an improved feature pyramid network was used for multi-scale fusion so that information from multiple levels is combined for prediction; 3) because the IoU (Intersection over Union) bounding-box loss considers only the overlapping area between the predicted box and the target box and ignores the non-overlapping area, the GIoU (Generalized Intersection over Union) loss was introduced, which improved the calculation of overlap and the performance of the model. Furthermore, to evaluate the crack state of Pleurotus geesteranus more objectively and accurately, the damage rate, calculated as the proportion of cracks in the complete pileus, was introduced as a new crack quantification method, and the MRE (Mean Relative Error) was used to quantify the error of the measured damage rate. Third, PG-Mask R-CNN was trained and tested on the Pleurotus geesteranus image dataset, and measurement and accuracy verification were carried out on the detection and segmentation results. Finally, because the ground truth is difficult to determine for the varied shapes of Pleurotus geesteranus, the same method was applied to four standard blocks of different specifications to verify the rationality of the proposed method.

[Results and Discussions] In the comparative analysis, PG-Mask R-CNN outperformed the GrabCut algorithm and four instance segmentation models: YOLACT (You Only Look At CoefficienTs), InstaBoost, QueryInst, and Mask R-CNN. In object detection, PG-Mask R-CNN achieved a mAP (mean Average Precision) of 84.8% and a mAR (mean Average Recall) of 87.7%, both higher than those of the compared methods. The MRE of its instance segmentation results was 0.90%, lower than that of the other instance segmentation models. In terms of model size, PG-Mask R-CNN had 51.75 M parameters, slightly more than the unimproved Mask R-CNN but fewer than the other instance segmentation models. For the pileus and cracks, the MRE of the segmentation results was 1.30% and 7.54%, respectively, and the MAE (Mean Absolute Error) of the measured damage rate was 0.14%.

[Conclusions] The proposed PG-Mask R-CNN model identifies and segments the stipe, pileus, and cracks of Pleurotus geesteranus with high accuracy. It can therefore support automated measurement of the phenotype of Pleurotus geesteranus, laying a technical foundation for subsequent intelligent breeding, smart cultivation, and grading.

Key words: Pleurotus geesteranus; Mask R-CNN; SimAM attention mechanism; ResNet101; phenotype analysis; improved feature pyramid network

Cite this article

ZHOU Huamao, WANG Jing, YIN Hua, CHEN Qi. Phenotype Analysis of Pleurotus Geesteranus Based on Improved Mask R-CNN[J]. Smart Agriculture, 2023, 5(4): 117-126. DOI: 10.12133/j.smartag.SA202309024

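0 Introduction

The Xiuzhen mushroom (Pleurotus geesteranus) is widely appreciated for the fresh, crisp taste of its fruiting body and its richness in nutrients required by the human body [1]. To meet the demand for high-quality Pleurotus geesteranus, it is imperative to breed new varieties that have an intact shape, a short growth cycle, and resistance to damage during transport. The phenotype is the result of the combined action of genotype and envirotype [2]; it is also the focus of consumer attention and the basis on which growers adjust cultivation strategies and set grades and prices. According to the relevant standard [3], stipe thickness, pileus size, and the number of cracks all reflect the quality of Pleurotus geesteranus. However, limited by current technology, these parameters still have to be measured manually during breeding and production, which is labor-intensive, inefficient, and to some extent subjective. This in turn hinders the rapid development of high-throughput breeding and smart cultivation of Pleurotus geesteranus.

With the continuing development of computer vision, machine-vision methods for analyzing crop morphology have gradually matured. For edible fungi, however, phenotype analysis has received little attention, and the few existing studies concentrate on measuring and acquiring pileus morphology [4]. Wang et al. [5] proposed an in situ measurement technique for brown mushrooms based on an SR300 structured-light depth camera, which can segment adhering mushrooms and meets the requirements of real-time picking robots in factory production. Liu et al. [6] proposed an improved YOLOX method for inspecting the surface texture of shiitake mushrooms, reaching an mAP of 99.96%, and applied it effectively to rapid quality classification in shiitake production. Yin et al. [7] proposed a method for estimating the cap diameter of Oudemansiella raphanipies based on YOLOv4 and a distance filter, with a recognition accuracy of 95.36%. Huang et al. [8] discussed methods for identifying deformed Pleurotus geesteranus but did not quantify the cracks.

To meet the need in breeding and production for rapid, quantitative phenotype analysis of Pleurotus geesteranus, this study built the PG-Mask R-CNN instance segmentation model according to the morphological characteristics of the mushroom, proposed a method for evaluating the degree of cracking, developed algorithms for automatically measuring stipe thickness, pileus area, and crack rate, and verified the effectiveness of the algorithms.

1 Data acquisition and preprocessing

The experimental data were collected at Jiujiang Dongsheng Biotechnology Co., Ltd. (Jiangxi, China). The subjects were two common Pleurotus geesteranus varieties, Taixiu 57 and Gaoyou 818. Images were captured on an assembly-line device with a green conveyor belt. A camera (Daheng MER-500-14GM, resolution 2 592×1 944) and a lens (HN-0619-5M) were fixed above the conveyor belt, and an infrared through-beam switch (Omron E3X-NA) was installed close to the belt. The camera, the switch, and the conveyor controller were connected by data cables to an industrial PC (磐仪 FPC-7703) and controlled by the host program. To avoid motion blur caused by shooting while the belt is running, which would impair crack assessment, the whole system worked in a "move-stop-move" manner: when a mushroom arrived directly under the camera it triggered the through-beam switch, the industrial PC stopped the belt and captured the image, and the belt resumed once the capture was complete.

To increase the number of images in the dataset and improve the robustness of the system, still images taken with a smartphone (Redmi K40, resolution 4 000×2 250) were added as a supplement. In total, 344 original front-view images of Pleurotus geesteranus were collected for model training, 168 from the camera and 176 from the smartphone, and they were divided into training and test sets at a ratio of 8∶2. Before training, the data were augmented tenfold by applying random brightness increases and decreases, image flipping, and the addition of different types of noise, yielding 3 440 images, of which 2 752 were in the training set and 688 in the test set. In addition, 50 newly selected mushroom samples were photographed for verifying the phenotype parameter measurements, 25 with the smartphone and 25 with the camera. The polygon annotation tool in LabelMe was used for manual labeling, marking the pileus, stipe, and cracks of each mushroom, as shown in Fig. 1.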
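The tenfold augmentation described above (random brightness change, flipping, added noise) can be sketched as follows. This is only a minimal illustration on a grayscale image stored as nested lists; the function names, the parameter ranges, and the choice of Gaussian noise are assumptions made for the example and are not taken from the authors' pipeline.

```python
import random

def adjust_brightness(img, delta):
    """Add delta to every pixel and clip to the 8-bit range [0, 255]."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma, then clip."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]

def augment(img, copies=10, seed=0):
    """Return `copies` randomly perturbed variants of one image.

    The ranges below (brightness +/-40, flip probability 0.5, sigma 0 to 10)
    are illustrative assumptions.
    """
    rng = random.Random(seed)
    variants = []
    for _ in range(copies):
        aug = adjust_brightness(img, rng.randint(-40, 40))
        if rng.random() < 0.5:
            aug = flip_horizontal(aug)
        aug = add_gaussian_noise(aug, rng.uniform(0, 10), rng)
        variants.append(aug)
    return variants

image = [[10, 200, 30], [40, 50, 250]]  # toy 2x3 grayscale image
variants = augment(image)
print(len(variants))  # 10 variants per source image, so 344 images give 3 440
```

When an image is flipped, the polygon annotations for pileus, stipe, and cracks have to be flipped in the same way so that labels and pixels stay aligned.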

Fig. 1 Annotation sample from the Pleurotus geesteranus dataset

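2 Methods

Because individual mushrooms differ and shadows arise from placement position and angle, traditional morphological methods give large errors. This study therefore proposes an improved model to segment the stipe, pileus, and cracks of Pleurotus geesteranus, and calculates phenotype parameters such as pileus area, stipe length and width, and crack rate from the segmentation results. The workflow is shown in Fig. 2.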

Fig. 2 Workflow of the phenotype analysis method for Pleurotus geesteranus based on PG-Mask R-CNN

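2.1 PG-Mask R-CNN model construction

Mask R-CNN is an instance segmentation algorithm [9]. Compared with other algorithms it offers fast detection and high accuracy, and is therefore widely used in agricultural product image recognition. Although the original Mask R-CNN achieves good results on public datasets, it is not entirely suitable here: the color contrast between cracks and pileus is low, some cracks are very small, and shadows in the image background interfere with crack recognition. The improvements made in this study are therefore: 1) because the ResNet101 backbone does not distinguish the importance of features in each spatial dimension during feature fusion, the SimAM attention module is added to the conv3 layer of its residual blocks to strengthen feature extraction; 2) a new bottom-up feature fusion path (Dual Feature Pyramid Networks, DFPN) is introduced into the FPN (Feature Pyramid Network) to avoid the loss of high-level semantic information; 3) the GIoU (Generalized Intersection over Union) [10] bounding-box regression loss replaces the original IoU (Intersection over Union) loss, improving the calculation of overlap and further optimizing the model. The structure of the improved model is shown in Fig. 3.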

Fig. 3 Structure of PG-Mask R-CNN network

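2.1.1 SimAM attention module

Because of placement and lighting angle, the junction between stipe and pileus, and the junction between some cracks and the pileus, have similar colors, and conventional models struggle to give satisfactory results there. An attention module is introduced so that the network focuses more on the details of these regions. Many attention modules exist [11-13]; unlike the others, the SimAM attention mechanism [14] is simple in principle and introduces no additional model parameters. The energy function it defines for the n-th neuron, $e_n$, is given in Eq. (1).

$$e_n = \frac{4(\partial^2 + \varphi)}{(t_n - \hat{\mu})^2 + 2\partial^2 + 2\varphi} \tag{1}$$

where $\varphi$ is the regularization term; $\partial^2$ is the variance of all neurons in a single channel; $t_n$ is the n-th neuron of the input feature map in a single channel; and $\hat{\mu}$ is the mean of all neurons in that channel.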

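When the SimAM module operates, the feature map is first fed in to obtain the weights of all neurons; the weights are then normalized with the Sigmoid function and multiplied with the original features, and the resulting feature map is output. Existing studies indicate that embedding SimAM after the convolution layers of the Bottleneck in the ResNet101 residual block gives good results, but there is no fixed rule for where to insert it; after repeated experiments, this study embedded the SimAM module at the conv3 layer.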
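The weighting step can be sketched for a single channel as below: each neuron's energy follows Eq. (1), its importance is taken as the inverse energy (as in the SimAM paper [14]), and a sigmoid squashes that importance before it rescales the input. The function name `simam_channel`, the default regularization value, and the use of an unbiased variance are assumptions for illustration; a real implementation would operate on batched tensors inside the backbone.

```python
import math

def simam_channel(channel, reg=1e-4):
    """Apply SimAM-style attention to one channel (a 2-D list of floats).

    `reg` plays the role of the regularization term in Eq. (1); the default
    of 1e-4 is an assumed value. The channel needs at least two elements.
    """
    values = [v for row in channel for v in row]
    count = len(values)
    mean = sum(values) / count
    # Channel variance; dividing by count - 1 is an assumption.
    var = sum((v - mean) ** 2 for v in values) / (count - 1)

    def weight(t):
        # Eq. (1): a low energy marks a neuron that stands out from its channel.
        energy = 4 * (var + reg) / ((t - mean) ** 2 + 2 * var + 2 * reg)
        # Importance is the inverse energy, normalized by the sigmoid.
        return 1 / (1 + math.exp(-1 / energy))

    return [[t * weight(t) for t in row] for row in channel]

feature = [[1.0, 1.0, 1.0],
           [1.0, 5.0, 1.0],
           [1.0, 1.0, 1.0]]
refined = simam_channel(feature)
# The outlying centre neuron keeps a larger share of its value than the rest.
print(refined[1][1] / 5.0 > refined[0][0] / 1.0)
```

Because the weights are computed from channel statistics alone, the module adds no learnable parameters, which matches the unchanged parameter count reported for the SimAM-only variant in Table 2.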

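2.1.2 DFPN path

The Feature Pyramid Network (FPN), the feature fusion network of Mask R-CNN, extracts features at multiple scales, improving accuracy and feature extraction speed. The original FPN has only a top-down feature connection; the path is so long that low-level features cannot influence high-level features, leaving the high-level features with weak semantic information and without the localization information of the low-level features. As Fig. 4 shows, the feature map P5 generated by the original FPN carries only the semantic information of its own level, whereas P2, P3, and P4 receive top-down semantic information. In crack detection, crack areas are far smaller than the pileus area, and some cracks are irregular in shape, have rough edges, and lie in shadow. Weak semantic information therefore leads to missed and false crack detections [15]. To improve crack detection and small-object detection accuracy, a bottom-up feature fusion path, DFPN, is added to the original FPN for multi-scale feature fusion, strengthening the semantic information. As shown in Fig. 4, the original feature map P2 is first taken as D2; D2 is passed through a 3×3 convolution to match the resolution of P3 and is laterally connected with P3 after a 1×1 convolution to give D3, and D4 and D5 are generated bottom-up in the same way. The new feature maps D2, D3, D4, and D5 carry richer semantic information, further improving the fusion performance of the feature pyramid.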

Fig. 4 Structure of DFPN network
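The data flow of the bottom-up path (D2 = P2, then each D level built from the previous D level and the matching P level) can be sketched on toy single-channel maps. This sketch makes several simplifying assumptions that are not from the paper: 2×2 mean pooling stands in for the learned 3×3 convolution, the 1×1 lateral convolution is treated as an identity, and the lateral connection is taken to be element-wise addition.

```python
def downsample2x(fmap):
    """Halve height and width by 2x2 mean pooling.

    Stand-in (assumption) for the learned 3x3 convolution on the bottom-up path.
    """
    h, w = len(fmap) // 2, len(fmap[0]) // 2
    return [[(fmap[2 * i][2 * j] + fmap[2 * i][2 * j + 1]
              + fmap[2 * i + 1][2 * j] + fmap[2 * i + 1][2 * j + 1]) / 4
             for j in range(w)] for i in range(h)]

def add_maps(a, b):
    """Element-wise sum of two equally sized maps (assumed fusion operator)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def bottom_up_path(pyramid):
    """pyramid: [P2, P3, P4, P5], each level half the size of the previous one.

    Returns [D2, D3, D4, D5] following the recursion described in the text.
    """
    fused = [pyramid[0]]  # D2 is P2 itself
    for lateral in pyramid[1:]:
        fused.append(add_maps(downsample2x(fused[-1]), lateral))
    return fused

def ones(size):
    return [[1.0] * size for _ in range(size)]

p_levels = [ones(8), ones(4), ones(2), ones(1)]  # toy P2..P5
d_levels = bottom_up_path(p_levels)
print([len(d) for d in d_levels])  # spatial sizes of D2..D5
```

With all-ones inputs, the value at each D level counts how many P levels have been accumulated, which makes it easy to see that D5 receives information from every lower level.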

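2.1.3 Improved bounding-box regression loss function

The bounding-box regression loss measures the error between the ground truth and the prediction and is used to predict the coordinates of the target. Mask R-CNN uses IoU as the bounding-box regression loss, computing the intersection over union of the predicted box and the target box to optimize the model iteratively. When the predicted box and the ground-truth box do not intersect, IoU is 0 whatever the distance between them, so the loss provides no gradient, the model cannot be optimized, and the network cannot be trained. Moreover, the IoU loss considers only the overlapping region of the predicted and target boxes and ignores the non-overlapping region, which makes the detected coordinates of the mushroom parts inaccurate. To address this, following reference [16], the non-overlapping region of the target is taken into account and the bounding-box regression loss is changed from IoU to GIoU, which reduces the influence of other features, improves the calculation of overlap, and further improves the detection of each part of the mushroom. As shown in Fig. 5, B and G denote the ground-truth box and the predicted box, respectively; S denotes the smallest enclosing box of B and G; and a denotes the union $B \cup G$ covered by the predicted and ground-truth boxes.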

Fig. 5 Schematic diagram of GIoU bounding box loss function

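IoU is defined in Eq. (2).

$$IoU = \frac{B \cap G}{B \cup G} \tag{2}$$

GIoU is defined in Eq. (3).

$$GIoU = IoU - \frac{S - B \cup G}{S} \tag{3}$$

where S is the area of the smallest box enclosing the ground-truth and predicted boxes; $B \cup G$ is the area of the union covered by the predicted and ground-truth boxes; and GIoU takes values in the interval [-1, 1].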
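Eqs. (2) and (3) can be checked numerically with a few axis-aligned boxes. The sketch below assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function names and the example boxes are illustrative only.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have zero area."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iou_and_giou(b, g):
    """Return (IoU, GIoU) for ground-truth box b and predicted box g."""
    inter = box_area((max(b[0], g[0]), max(b[1], g[1]),
                      min(b[2], g[2]), min(b[3], g[3])))
    union = box_area(b) + box_area(g) - inter
    # S in Eq. (3): area of the smallest box enclosing both b and g.
    enclosing = box_area((min(b[0], g[0]), min(b[1], g[1]),
                          max(b[2], g[2]), max(b[3], g[3])))
    iou = inter / union
    giou = iou - (enclosing - union) / enclosing
    return iou, giou

truth = (0, 0, 1, 1)
print(iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3)))  # IoU = 1/7, GIoU = 1/7 - 2/9
print(iou_and_giou(truth, (2, 0, 3, 1)))         # disjoint, close: IoU = 0, GIoU = -1/3
print(iou_and_giou(truth, (9, 0, 10, 1)))        # disjoint, far: IoU = 0, GIoU = -0.8
```

The last two cases show the point made in the text: IoU is 0 for both disjoint predictions, whereas GIoU decreases as the prediction moves further away, so a loss of the form 1 - GIoU (the form used in reference [10]) still distinguishes them.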

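2.2 Measurement of phenotype parameters of Pleurotus geesteranus

2.2.1 Measurement indicators

After instance segmentation by the PG-Mask R-CNN model, the stipe, pileus, and cracks are identified and their contours obtained, but they still need to be quantified. Because the shape of Pleurotus geesteranus is irregular, area is used as the quantitative index for the pileus and cracks, while length and thickness are used for the stipe, as follows.

Pileus area (mm²): the area occupied by the pileus in the image, measured with the mushroom placed face up in its natural state.

Stipe length (mm): the length of the minimum bounding rectangle of the stipe.

Stipe width (mm): the width of the minimum bounding rectangle of the stipe.

Number of cracks: the number of cracks on the fruiting body.

Pileus damage rate: the proportion of the complete pileus area occupied by cracks. Following the idea of reference [17], it is defined in Eq. (4).

$$\partial = \frac{S_{crack}}{S_{pileus} + S_{crack}} \times 100\% \tag{4}$$

where $S_{pileus}$ and $S_{crack}$ are the areas of the pileus and the cracks, respectively, mm²; and $\partial$ is the pileus damage rate, %, with smaller values indicating a more intact pileus.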
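Eq. (4) translates directly into a short function. The areas in the usage line are made-up values chosen for easy arithmetic, not measurements from this study, and the function name is hypothetical.

```python
def damage_rate(pileus_area, crack_area):
    """Eq. (4): percentage of the complete pileus that is occupied by cracks.

    Both areas must use the same unit (mm^2 or pixel counts).
    """
    complete_area = pileus_area + crack_area
    if complete_area <= 0:
        raise ValueError("total area must be positive")
    return crack_area / complete_area * 100.0

# Hypothetical mushroom: 990 mm^2 of intact pileus and 10 mm^2 of cracks.
print(damage_rate(990.0, 10.0))
```

Because the rate is a ratio of two areas, it can be computed from mask pixel counts without converting to millimetres first.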

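2.2.2 Measurement method

The phenotype parameters of Pleurotus geesteranus are calculated as follows:

1) Calibrate the camera to remove lens distortion and obtain the pixel scale [18].

2) Use the PG-Mask R-CNN model to obtain the segmentation results for the pileus, cracks, and stipe, together with their pixel counts.

3) Because some stipes are tilted, an axis-aligned bounding rectangle cannot describe the phenotype parameters accurately, whereas principal component analysis (PCA) yields the component along which the samples vary most (the principal direction) and is widely used [19, 20]. PCA is therefore used to extract the principal direction of the stipe mask, which is then rotated to horizontal.

4) Fit the minimum bounding rectangle and take its length and width as the length and thickness of the stipe; at the same time, compute the areas of the pileus and cracks to obtain the final results.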
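Steps 3) and 4) can be sketched for a stipe mask given as a list of pixel coordinates. The sketch computes the principal direction from the 2×2 covariance matrix in closed form, projects the pixels onto the principal and perpendicular axes (equivalent to rotating the mask to horizontal), and takes the extents as length and width. The function name, the `mm_per_pixel` scale, and the convention of measuring between extreme pixel centres are assumptions for illustration.

```python
import math

def stipe_size(pixels, mm_per_pixel):
    """Estimate (length_mm, width_mm) of a mask from its (x, y) pixel coordinates."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    syy = sum((y - mean_y) ** 2 for _, y in pixels) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Orientation of the first principal component of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Coordinates in the rotated frame whose first axis is the principal direction.
    along = [(x - mean_x) * c + (y - mean_y) * s for x, y in pixels]
    across = [-(x - mean_x) * s + (y - mean_y) * c for x, y in pixels]
    # Extents between extreme pixel centres (assumed convention).
    length_px = max(along) - min(along)
    width_px = max(across) - min(across)
    return length_px * mm_per_pixel, width_px * mm_per_pixel

# Toy stipe: a 10 x 3 pixel block, with an assumed scale of 0.5 mm per pixel.
block = [(x, y) for x in range(10) for y in range(3)]
print(stipe_size(block, 0.5))  # (4.5, 1.0)
```

Rotating the same block by any angle before calling `stipe_size` gives the same length and width up to rounding, which is the reason for aligning the mask with its principal direction before fitting the rectangle.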

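2.3 Evaluation methods

To verify the performance of PG-Mask R-CNN on the Pleurotus geesteranus image dataset, three indicators were used: inference time, mAP, and mAR. Inference time is the time the model needs to process one image. mAP is the mean AP over all classes when the IoU threshold between prediction and ground truth ranges from 0.5 to 0.95, calculated as in Eq. (5); mAP@0.5 is the mean AP over all classes at an IoU threshold of 0.5; mAR is the mean AR over all classes when the IoU threshold ranges from 0.5 to 0.95, calculated as in Eq. (6).

$$mAP = \frac{\sum_{i=1}^{C} AP_i}{C} \tag{5}$$

$$mAR = \frac{\sum_{i=1}^{C} AR_i}{C} \tag{6}$$

where C is the number of classes; AP is the average precision, calculated as in Eq. (7); and AR is the average recall, calculated as in Eq. (8).

$$AP = \sum_{j=1}^{N} P(j)\,\Delta R(j) \tag{7}$$

$$AR = 2\int_{0.5}^{1} R(o)\,\mathrm{d}o \tag{8}$$

where N is the total number of data points; j is the index of each sample point; P is the precision and R is the recall; and o is the IoU between the predicted mask and the ground truth.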
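Eqs. (5) to (8) can be sketched with plain lists. The precision-recall points and per-class values below are invented for illustration, and approximating the integral in Eq. (8) by a rectangle rule over IoU thresholds 0.50, 0.55, ..., 0.95 is an assumption about how the integral is discretized.

```python
def average_precision(precisions, recalls):
    """Eq. (7): sum of P(j) * delta R(j); recalls must be non-decreasing."""
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - previous_recall)
        previous_recall = r
    return ap

def average_recall(recall_per_threshold, step=0.05):
    """Eq. (8) by a rectangle rule: 2 * sum(R(o) * step) over o in [0.5, 1)."""
    return 2 * sum(recall_per_threshold) * step

def mean_over_classes(per_class_values):
    """Eqs. (5) and (6): arithmetic mean over the C classes."""
    return sum(per_class_values) / len(per_class_values)

# Hypothetical precision-recall points for one class.
print(average_precision([1.0, 0.5], [0.5, 1.0]))  # 1.0*0.5 + 0.5*0.5 = 0.75
# Hypothetical per-class AP values for pileus, stipe, and crack.
print(mean_over_classes([0.9, 0.8, 0.7]))
```

With ten thresholds and a step of 0.05, the rectangle rule in `average_recall` reduces to the plain mean of the ten recall values.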

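To describe the instance segmentation results quantitatively, the mean relative error (MRE) and the mean absolute error (MAE) were used; both are defined on the difference in pixel counts between the prediction and the manual annotation, as in Eqs. (9) and (10).

$$MRE = \frac{1}{n}\sum_{k=1}^{n} \frac{|y_k - x_k|}{x_k} \tag{9}$$

$$MAE = \frac{1}{n}\sum_{k=1}^{n} |y_k - x_k| \tag{10}$$

where n is the number of mushrooms; $y_k$ is the pixel count of the k-th predicted Pleurotus geesteranus instance; and $x_k$ is the pixel count of the k-th annotated instance.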
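The two error measures in Eqs. (9) and (10) can be written as short functions over paired pixel counts. The counts in the usage example are invented, and the function names are hypothetical.

```python
def mean_relative_error(predicted, annotated):
    """Eq. (9): mean of |y_k - x_k| / x_k over all instances."""
    return sum(abs(y - x) / x for y, x in zip(predicted, annotated)) / len(annotated)

def mean_absolute_error(predicted, annotated):
    """Eq. (10): mean of |y_k - x_k| over all instances."""
    return sum(abs(y - x) for y, x in zip(predicted, annotated)) / len(annotated)

# Hypothetical pixel counts for two instances.
predicted_px = [1020, 1960]
annotated_px = [1000, 2000]
print(mean_relative_error(predicted_px, annotated_px) * 100)  # MRE in percent
print(mean_absolute_error(predicted_px, annotated_px))        # MAE in pixels
```

Multiplying the MRE by 100 gives the percentage form in which the results are reported.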

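2.4 Experimental environment and model training

The PG-Mask R-CNN model was built and tested in the PyCharm integrated development environment. The hardware platform had an Intel Xeon Silver 4310 CPU and an NVIDIA GeForce RTX 3090 GPU; training and development were carried out under Ubuntu 18.04.0 with Python 3.7 and PyTorch 1.10.0. The model was trained for 100 epochs with a learning rate of 0.002, a momentum factor of 0.9, a weight decay of 0.000 1, and a batch size of 4.

3 Results and analysis

3.1 Analysis of the SimAM insertion position

SimAM is a plug-and-play module, and there is as yet no established method for choosing where to insert it; the position has to be found by trial. The SimAM module was embedded after the first (conv1), second (conv2), and third (conv3) layers of the Bottleneck in the ResNet101 residual block, and the results were compared. As Table 1 shows, with SimAM embedded after conv3 the mAP was 84.8%, the mAR was 87.7%, and the inference time was 0.069 ms. Detection accuracy was higher than with the other positions while the inference time differed by no more than 0.009 ms, so SimAM was finally embedded after the conv3 layer.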

Table 1 Model training results after adding SimAM

Module	mAP/%	mAR/%	Inference time/ms
conv1+SimAM	84.1	87.6	0.067
conv2+SimAM	84.4	87.7	0.060
conv3+SimAM	84.8	87.7	0.069

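3.2 Ablation experiments

Ablation experiments were conducted to examine the effectiveness of PG-Mask R-CNN; the results are given in Table 2. With only the SimAM attention added, the mAP was 84.2%, 2 percentage points higher than the original Mask R-CNN. Introducing the DFPN alone raised the mAP by 0.4 percentage points and increased the parameters by 5.91 M. With GIoU replacing the IoU bounding-box regression loss and SimAM and DFPN both added, the mAP was 2.2 percentage points higher and the mAR 1.6 percentage points higher than with DFPN alone (Exp No.3), the parameter count was unchanged, and the inference time increased by only 0.009 ms. Adding these modules thus improves the accuracy of the model and gives a better balance between accuracy and recognition efficiency.

Table 2 Results of ablation experiments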

Module	Exp No.1	Exp No.2	Exp No.3	Exp No.4
SimAM	×	√	×	√
DFPN	×	×	√	√
GIoU	×	×	×	√
mAP/%	82.2	84.2	82.6	84.8
mAR/%	86.4	87.4	86.1	87.7
Inference time/ms	0.050	0.063	0.060	0.069
Parameters/M	45.84	45.84	51.75	51.75

Note: √ indicates that the module is added to the model; × indicates that it is not.

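3.3 Comparison with other segmentation methods

GrabCut is a foreground segmentation algorithm. The user supplies a rectangle interactively, no threshold selection is needed, and the segmentation result is obtained directly; it has been widely used in many scenarios [21]. YOLACT is a model widely used for real-time instance segmentation, with short training time, fast recognition, and support for multiple instances. InstaBoost is a crop-paste-based instance segmentation method that improves performance in complex scenes and raises detection accuracy and robustness. QueryInst is a new query-based instance segmentation method that copes better than other instance segmentation models with occluded and densely packed instances. Because YOLACT [22], InstaBoost [23], and QueryInst [24] achieve higher segmentation accuracy on the COCO (Microsoft Common Objects in Context) dataset than conventional instance segmentation algorithms (such as Mask R-CNN, Cascade R-CNN, SOLO V2, and CondInst), they were compared with PG-Mask R-CNN to verify the effectiveness of the improved algorithm. The results are given in Table 3. The masks produced by GrabCut had unclear edges and the algorithm could not cope with complex backgrounds, so it is not listed in the table. For object detection, PG-Mask R-CNN achieved an mAP, mAP@0.5, and mAR of 84.8%, 95.3%, and 87.7%, respectively, all better than YOLACT, InstaBoost, QueryInst, and Mask R-CNN; its MRE was 0.90%, lower than those of InstaBoost, Mask R-CNN, QueryInst, and YOLACT. Overall, PG-Mask R-CNN segments Pleurotus geesteranus images more accurately than the other models: its predictions are closest to the ground truth, with clearer boundaries and more complete regions.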

Table 3 Comparison of results for different instance segmentation methods

Algorithm (Bbox)	mAP@0.5/%	mAP/%	mAR/%	Parameters/M	MRE/%
Mask R-CNN	95.2	82.2	86.4	45.84	1.24
YOLACT	91.7	72.5	77.2	53.73	1.41
InstaBoost	81.0	60.4	67.1	62.75	4.13
QueryInst	93.0	76.4	84.2	191.27	7.49
PG-Mask R-CNN (ours)	95.3	84.8	87.7	51.75	0.90

Note: Bold indicates the best result; a wavy underline indicates the second-best result.

Fig. 6 shows the segmentation results of the different methods. YOLACT, InstaBoost, and QueryInst produced missed and false detections, and their masks had obvious defects; Mask R-CNN could segment the crack masks, but with less completeness and smoothness than PG-Mask R-CNN.

Fig. 6 Comparison of segmentation results for Pleurotus geesteranus using GrabCut and different instance segmentation methods

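3.4 Analysis of phenotype parameter measurement results

The test images were fed into the PG-Mask R-CNN model to obtain the mask of each part, and the ground truth of the same images was annotated with LabelMe; the pileus area, crack area, and damage rate ∂ of both results were computed from pixel counts. The 50 validation samples were used for the experiment: the pileus, cracks, and damage rate were measured and the errors calculated, with the results shown in Fig. 7. The MRE was 1.30% for the pileus and 7.54% for the cracks, and the MAE of the damage rate was 0.14%. Fig. 7 shows, however, that the crack MRE is large for some samples. This is because, owing to improper shooting angle and shadows, the model fails to detect some fine cracks on large pilei, resulting in missed detections, as shown in Fig. 8. In practice, though, cracks that are small relative to the pileus are mostly ignored. The damage rate is therefore a better measure of the quality of Pleurotus geesteranus.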
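A small numerical illustration of this argument: when half of a fine crack on a large pileus is missed, the relative error of the crack area is large, while the damage rate changes only slightly. All pixel counts below are invented, and the pileus pixel count is assumed to be the same in prediction and annotation.

```python
def damage_rate(pileus_px, crack_px):
    """Damage rate in percent, following Eq. (4), from pixel counts."""
    return crack_px / (pileus_px + crack_px) * 100.0

pileus_px = 10_000            # large pileus (hypothetical)
true_crack_px = 50            # annotated fine crack (hypothetical)
predicted_crack_px = 25       # half of the crack is missed (hypothetical)

# Relative error of the crack area, in percent.
crack_relative_error = abs(predicted_crack_px - true_crack_px) / true_crack_px * 100

# Absolute error of the damage rate, in percentage points.
rate_true = damage_rate(pileus_px, true_crack_px)
rate_predicted = damage_rate(pileus_px, predicted_crack_px)
rate_absolute_error = abs(rate_predicted - rate_true)

print(crack_relative_error)           # 50.0
print(round(rate_absolute_error, 3))  # about a quarter of a percentage point
```

In this toy case a 50% error in crack area corresponds to a damage-rate error of roughly 0.25 percentage points, which is why the damage rate is less sensitive to missed small cracks.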

Fig. 7 Damage rate of Pleurotus geesteranus measured with the PG-Mask R-CNN model compared with the ground truth

Fig. 8 Missed and false detections in crack detection of Pleurotus geesteranus

Note: Blue boxes indicate detected cracks; red boxes indicate falsely detected or missed cracks.

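3.5 Quantification of measurement results and rationality analysis of the quantification method

The measurement results were quantified, yielding parameters such as the areas of the pileus and cracks and the length and thickness of the stipe, as listed in Table 4.

Table 4 Results of the phenotype measurement of Pleurotus geesteranus by the PG-Mask R-CNN model

No.	Pileus area/mm²	Stipe length/mm	Stipe width/mm	Number of cracks	Pileus damage rate/%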
0	551.18	35.06	12.58	1	1.48
1	589.12	32.08	13.71	2	1.98
2	977.29	43.77	14.09	1	0.63
3	602.26	39.23	12.22	1	0.29
4	1 000.49	43.98	14.24	2	1.01
5	797.17	36.00	14.80	3	1.58
6	754.48	46.11	14.66	1	0.66
7	462.24	45.12	12.99	1	0.94
8	557.78	28.39	13.13	1	2.17
9	328.67	40.59	9.42	0	0.00
10	796.91	53.51	15.40	1	0.47
11	1 664.96	52.04	19.29	3	3.10
12	561.90	27.71	15.16	2	1.80
13	716.65	39.11	14.12	1	0.45
14	1 650.42	49.41	16.82	2	2.11

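Because the irregular shape of Pleurotus geesteranus makes its true dimensions hard to measure, the rationality of the algorithm was verified with standard blocks, following the method of reference [25]: four standard gauge blocks were measured in place of mushrooms under the same conditions, and their true dimensions were obtained with a vernier caliper; the image-based measurements were then compared with the caliper measurements. The computed and measured length, width, and area of the blocks are given in Table 5. The MRE of length and width was 1.02%~1.96% and the MAE was 0.33~1.05 mm; for area, the MRE was 2.30% and the MAE was 36.06 mm². The reference-object experiment thus indicates that the proposed algorithm can reach similar accuracy when measuring the pileus and cracks of Pleurotus geesteranus, which meets practical requirements.

Table 5 Comparison between image-based and actual measurements of the reference objects for Pleurotus geesteranus

Gauge block	Length				Width				Area
Sample	Measured/mm	Image-measured/mm	MRE/%	MAE/mm	Measured/mm	Image-measured/mm	MRE/%	MAE/mm	Measured/mm²	Image-measured/mm²	MRE/%	MAE/mm²
a	80.03±0.02	82.11±1.02	2.60	2.08	34.87±0.01	34.47±0.58	1.21	0.42	2 790.91±1.23	2 830.92±82.25	1.80	50.32
b	60.01±0.01	60.63±0.94	1.14	0.68	34.94±0.03	34.79±0.67	0.59	0.21	2 096.53±1.58	2 109.60±72.69	1.26	26.32
c	40.02±0.01	40.58±0.95	1.63	0.65	35.06±0.01	35.18±0.78	1.05	0.37	1 402.96±0.89	1 428.03±66.02	2.65	47.16
d	34.67±0.02	35.44±0.91	2.26	0.79	25.00±0.01	25.22±0.64	1.22	0.31	866.93±0.42	893.80±44.34	3.51	30.45
Mean	‒	‒	1.96	1.05	‒	‒	1.02	0.33	‒	‒	2.30	36.06

Note: ‒ indicates that no mean is calculated for this value.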

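4 Conclusions

This study proposed an improved model, PG-Mask R-CNN, to segment the parts of Pleurotus geesteranus and measure its phenotype parameters.

The experiments show that PG-Mask R-CNN achieved an mAP of 84.8% and an mAR of 87.7% for object detection, both higher than the mainstream YOLACT, InstaBoost, QueryInst, and Mask R-CNN models. In pixel terms and relative to manual annotation, the MRE was 1.30% for the pileus and 7.54% for the cracks, and the MAE of the damage rate was 0.14%, indicating high accuracy. The algorithm can therefore be used for quantitative phenotype analysis of Pleurotus geesteranus.

The algorithm nevertheless has limitations in some cases. For example, when the stipe is strongly curved, computing its length and width from the minimum bounding rectangle clearly gives a large error; how to measure curved stipes accurately remains to be solved. In addition, the algorithm currently targets a single mushroom and cannot measure several mushrooms in the field of view at once: with several mushrooms in view, their distances to the camera differ, which introduces measurement error, and placing them all directly under the camera tends to cause occlusion that prevents measurement. Future work will continue to improve the algorithm, raise detection accuracy, and validate it in a wider range of scenarios to strengthen the robustness and generality of the model.

Conflict of interest statement

The authors declare no conflict of interest related to this research or its published results.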

References

1	LIU L Y, ZHOU Y, CHEN H, et al. Research progress of Pleurotus geesteranus[J]. Microbiology China, 2020, 47(11): 3650-3657.

2	XU Y B. Envirotyping and its applications in crop science[J]. Scientia agricultura sinica, 2015, 48(17): 3354-3371.

3	T/GXEFA 0002—2022. Technical code of practice for the production of selenium-enriched Pleurotus geesteranus[S]. Guangxi: Guangxi Edible Fungi Association, 2022.

4	YIN H, YI W L, HU D M. Computer vision and machine learning applied in the mushroom industry: A critical review[J]. Computers and electronics in agriculture, 2022, 198: ID 107015.

5	WANG L, XU W, DU K W, et al. Portabella mushrooms measurement in situ based on SR300 depth camera[J]. Transactions of the Chinese society for agricultural machinery, 2018, 49(12): 13-19, 108.

6	LIU Q, FANG M, LI Y S, et al. Deep learning based research on quality classification of shiitake mushrooms[J]. LWT, 2022, 168: ID 113902.

7	YIN H A, XU J L, WANG Y L, et al. A novel method of situ measurement algorithm for Oudemansiella raphanipies caps based on YOLOv4 and distance filtering[J]. Agronomy, 2022, 13(1): ID 134.

8	HUANG X Y, JIANG S, CHEN Q S, et al. Identification of defect Pleurotus geesteranus based on computer vision[J]. Transactions of the Chinese society of agricultural engineering, 2010, 26(10): 350-354.

9	HE K M, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[J]. IEEE transactions on pattern analysis and machine intelligence, 2020, 42(2): 386-397.

10	REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: A metric and a loss for bounding box regression[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, New Jersey, USA: IEEE, 2019: 658-666.

11	HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2020, 42(8): 2011-2023.

12	HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021.

13	WANG Q L, WU B G, ZHU P F, et al. ECA-net: Efficient channel attention for deep convolutional neural networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, New Jersey, USA: IEEE, 2020.

14	YANG L X, ZHANG R Y, LI L D, et al. SimAM: A simple, parameter-free attention module for convolutional neural networks[C]// International Conference on Machine Learning. New York, USA: PMLR, 2021: 11863-11874.

15	LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway, New Jersey, USA: IEEE, 2018.

16	YU L S, CHEN Z G. Lightweight traffic sign detection network with fused foreground attention[J]. Journal of electronic measurement and instrumentation, 2023, 37(1): 21-31.

17	ZHENG Z Z, HU Y H, YANG H B, et al. AFFU-Net: Attention feature fusion U-net with hybrid loss for winter jujube crack detection[J]. Computers and electronics in agriculture, 2022, 198: ID 107049.

18	ZHANG Z Y. A flexible new technique for camera calibration[J]. IEEE transactions on pattern analysis and machine intelligence, 2000, 22(11): 1330-1334.

19	YANG S, ZHENG L H, YANG H J, et al. A synthetic datasets based instance segmentation network for high-throughput soybean pods phenotype investigation[J]. Expert systems with applications, 2022, 192: ID 116403.

20	ZHOU L, FENG B M, GUAN Y, et al. Correcting distorted document images on smartphones[J]. Computer engineering & science, 2022, 44(1): 102-109.

21	ROTHER C, KOLMOGOROV V, BLAKE A. "GrabCut": Interactive foreground extraction using iterated graph cuts[J]. ACM transactions on graphics, 2004, 23(3): 309-314.

22	BOLYA D, ZHOU C, XIAO F Y, et al. YOLACT: Real-time instance segmentation[C]// 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway, New Jersey, USA: IEEE, 2019: 9157-9166.

23	FANG H S, SUN J H, WANG R Z, et al. InstaBoost: Boosting instance segmentation via probability map guided copy-pasting[C]// 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway, New Jersey, USA: IEEE, 2019: 682-691.

24	FANG Y X, YANG S S, WANG X G, et al. QueryInst: Parallelly supervised mask query for instance segmentation[EB/OL]. arXiv: 2105.01928, 2021.

25	ZHU Y H, ZHANG X B, SHEN Y Y, et al. High-throughput phenotyping collection and analysis of Flammulina filiformis based on image recognition technology[J]. Mycosystema, 2021, 40(3): 626-640.
