Electronic Science and Technology, 2022, Vol. 35, Issue (6): 28-34. doi: 10.16180/j.cnki.issn1007-7820.2022.06.005

Semantic Segmentation of Streetscape Based on Improved ExfuseNet

CHEN Jinhong, CHEN Wei, YIN Zhong

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  • Received: 2021-02-04 Online: 2022-06-15 Published: 2022-06-20
  • About the authors: CHEN Jinhong (b. 1996), male, master's student. Research interest: image processing. | CHEN Wei (b. 1964), female, associate professor. Research interests: image processing and pattern recognition. | YIN Zhong (b. 1988), male, associate professor. Research interest: deep learning based on EEG signals.
  • Supported by:
    National Natural Science Foundation of China (61703277)

Abstract:

When the ExfuseNet model is used for semantic segmentation of street scenes, the high background complexity of street-view images leads to an unbalanced area ratio and distribution among the classes of interest. In particular, targets of interest that occupy a small area and appear at low density are increasingly likely to be misclassified in the deeper layers of the network, which ultimately degrades segmentation performance. To address this problem, this paper proposes an improved ExfuseNet model. To capture semantic information at different scales without increasing the number of model parameters, the multi-supervision module adopts atrous (dilated) convolutions with different dilation rates. Immediately after the down-sampled features are fused, a dropout layer is applied to reduce the number of model parameters and improve generalization. Before the main output, a CBAM attention module is used to sample the deep semantic information of the target classes of interest more efficiently, and a class-balance function is applied after the multi-supervision module to mitigate the class imbalance of the CamVid dataset. Experimental results show that the improved ExfuseNet model segments clearly better: its mean intersection over union (mIoU) rises to 68.32%, and the classification accuracy of the Pole class rises to 38.14%.

Key words: street view image, multiple supervision, dilation rate, dilated convolution, dropout layer, generalization, attention mechanism, class balance, mean intersection over union
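The two figures reported in the abstract, mean intersection over union and per-class classification accuracy, can both be read off a pixel-level confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function names, the three toy classes and the label lists are assumptions made only for illustration.

```python
# Illustrative sketch only: standard mIoU and per-class accuracy computed from
# a confusion matrix. Names and toy data are hypothetical, not from the paper.

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def per_class_iou(cm):
    """IoU_c = TP / (TP + FP + FN); None for a class that never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious

def mean_iou(cm):
    """Average IoU over the classes that occur in labels or predictions."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid)

def per_class_accuracy(cm):
    """Fraction of ground-truth pixels of each class that are labeled correctly."""
    return [row[c] / sum(row) if sum(row) else None for c, row in enumerate(cm)]

# Toy flattened label maps; classes 0, 1, 2 stand in for road, building, pole.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 0]
y_pred = [0, 0, 0, 1, 1, 1, 1, 2, 0, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)                          # [[4, 1, 0], [0, 3, 0], [1, 0, 1]]
print(f"{mean_iou(cm):.4f}")       # 0.6389
print(per_class_accuracy(cm))      # [0.8, 1.0, 0.5]
```

In this toy case the rare class 2 has the lowest IoU and accuracy even though only one of its pixels is wrong, which illustrates why small, sparse classes such as Pole are sensitive to misclassification.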

CLC number:

  • TP391.4