Real Time Semantic Segmentation Algorithm based on Encoder-decoder

Authors

  • Shibo Hou
  • Wenwen Zong

DOI:

https://doi.org/10.54691/k80m4t52

Keywords:

Real Time Semantic Segmentation, Convolutional Neural Network, Lightweight Algorithm, Attention Fusion, Context Information.

Abstract

Encoder-decoder networks have demonstrated significant advantages in semantic segmentation tasks. PP-LiteSeg, a lightweight encoder-decoder network, has attracted considerable attention because it strikes a good balance between accuracy and speed. However, its encoder and decoder have clear limitations in extracting semantic information and recovering detailed information, respectively. To address these limitations, we improve PP-LiteSeg and propose MFF-LiteSeg. The improvements include a new encoder built on a parallel dilated convolution short-term dense cascade module and a multi-scale context module. The former helps the encoder capture rich features while keeping the parameter count low. The latter is integrated into the encoder to capture multi-scale contextual information, improving the encoder's ability to understand objects at different scales. In addition, an attention fusion module is applied to the decoder to optimize feature fusion, enabling the decoder to restore detailed feature information more accurately. Experimental results on public datasets demonstrate that MFF-LiteSeg achieves higher segmentation accuracy than PP-LiteSeg while maintaining a fast inference speed of 160 FPS, striking a good balance between inference speed and segmentation accuracy.
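
The abstract's claim that parallel dilated convolutions capture rich features at a low parameter count can be illustrated with a little arithmetic. The sketch below is not the authors' module. It only applies the standard dilated-convolution relation, effective span = k + (k - 1)(d - 1), and shows that the weight count of a convolution does not depend on its dilation rate. The channel widths, kernel size, and dilation rates (64, 3, and 1/2/4) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights in a standard 2-D convolution (bias omitted).

    Dilation is deliberately not a parameter: it spreads the taps apart
    without adding any weights.
    """
    return in_channels * out_channels * kernel_size * kernel_size


def parallel_branch_summary(in_channels, out_channels, kernel_size, dilations):
    """One row per parallel branch: (dilation, effective span, weight count)."""
    weights = conv_weight_count(in_channels, out_channels, kernel_size)
    return [(d, effective_kernel_size(kernel_size, d), weights) for d in dilations]


if __name__ == "__main__":
    # Assumed setting for illustration only: 64 -> 64 channels, 3x3 kernels,
    # three parallel branches with dilation rates 1, 2 and 4.
    for d, span, weights in parallel_branch_summary(64, 64, 3, (1, 2, 4)):
        print(f"dilation={d}: covers {span}x{span} pixels with {weights} weights")
```

Under these assumptions every branch carries the same number of weights, while the span covered grows from 3 to 5 to 9 pixels. This is why running such branches in parallel widens the range of scales an encoder sees without making each branch heavier.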

Published

24-11-2025

Section

Articles