DeepLab: Models

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [pdf]
Liang-Chieh Chen*, George Papandreou*, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. (*equal contribution)
arXiv 2016

Attention to Scale: Scale-aware Semantic Image Segmentation [pdf]
Liang-Chieh Chen, Yi Yang, Jiang Wang, Wei Xu and Alan L Yuille
CVPR 2016

Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform [pdf]
Liang-Chieh Chen, Jonathan T Barron, George Papandreou, and Alan L. Yuille
CVPR 2016

Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation [pdf]
George Papandreou*, Liang-Chieh Chen*, Kevin Murphy, and Alan L. Yuille. (*equal contribution)
ICCV 2015

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs [pdf]
Liang-Chieh Chen*, George Papandreou*, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. (*equal contribution)
ICLR 2015

  • DeepLab_v2: code for (1) our two CVPR'16 papers and (2) DeepLab based on ResNet-101 (Caffe forked in Feb. 2016). This version also supports the DeepLab v1 experiments from our ICLR'15 and ICCV'15 papers; you only need to modify the old prototxt files. For example, our proposed atrous convolution is called dilated convolution in the Caffe framework, so to reproduce the ICLR'15 and ICCV'15 results with the latest code you need to change the convolution parameter "hole" to "dilation" (the usage is otherwise exactly the same); see the prototxt example after this list.
  • DeepLab_v1: code for ICLR'15 and ICCV'15 (an older version of Caffe)
  • FAQ when using the code
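
For reference, below is a minimal sketch of the prototxt change described above. The layer name and field values are placeholders (a typical DeepLab-LargeFOV-style setup); only the rename of "hole" to "dilation" is the point.

    # DeepLab v1 (older Caffe fork): atrous convolution via the "hole" parameter
    layer {
      name: "fc6"
      type: "Convolution"
      bottom: "pool5"
      top: "fc6"
      convolution_param {
        num_output: 1024
        kernel_size: 3
        pad: 12
        hole: 12
      }
    }

    # DeepLab v2 (Caffe forked in Feb. 2016): same layer, with "hole" renamed to "dilation"
    layer {
      name: "fc6"
      type: "Convolution"
      bottom: "pool5"
      top: "fc6"
      convolution_param {
        num_output: 1024
        kernel_size: 3
        pad: 12
        dilation: 12
      }
    }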

Trained models and corresponding prototxt files:

1. Training DeepLab with strong supervision (i.e., pixel-level annotations) in our ICLR'15 paper:

2. Training DeepLab with weakly- or semi-supervised learning in our ICCV'15 paper:

3. Training DeepLab with (1) multi-scale inputs, (2) extra supervision, and (3) the attention model in our CVPR'16 paper (attention to scale):

4. Training DeepLab with a discriminatively trained domain transform in our CVPR'16 paper (domain transform):

5. DeepLabv2 models reported in our latest arXiv paper:

  • DeepLabv2_VGG16: (1) re-purposed VGG-16 via atrous convolution, (2) multi-scale inputs, and (3) atrous spatial pyramid pooling (ASPP).
  • DeepLabv2_ResNet101: (1) re-purposed ResNet-101 via atrous convolution, (2) multi-scale inputs, and (3) atrous spatial pyramid pooling (ASPP); a minimal ASPP sketch follows below.
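
To illustrate item (3), the sketch below shows the ASPP idea in prototxt form: parallel atrous convolution branches with different dilation rates applied to the same feature map and fused by summation. The layer names, rates, and fusion point here are placeholders; the released prototxt files use four branches (e.g., rates 6, 12, 18, and 24 for the VGG-16 model), each with its own subsequent layers, so consult them for the exact definitions.

    # ASPP sketch (placeholder names and values): two of the parallel atrous branches
    # over a shared feature map, fused by element-wise summation.
    layer {
      name: "aspp_rate6"
      type: "Convolution"
      bottom: "pool5"
      top: "aspp_rate6"
      convolution_param { num_output: 1024 kernel_size: 3 pad: 6 dilation: 6 }
    }
    layer {
      name: "aspp_rate12"
      type: "Convolution"
      bottom: "pool5"
      top: "aspp_rate12"
      convolution_param { num_output: 1024 kernel_size: 3 pad: 12 dilation: 12 }
    }
    # ... branches with larger rates (e.g., 18 and 24) are added analogously ...
    layer {
      name: "aspp_sum"
      type: "Eltwise"
      bottom: "aspp_rate6"
      bottom: "aspp_rate12"
      top: "aspp_sum"
      eltwise_param { operation: SUM }
    }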