DeepLab-MSc explores a multi-scale prediction method to improve boundary localization accuracy. Specifically, we attach a two-layer MLP (first layer: 128 3×3 convolutional filters; second layer: 128 1×1 convolutional filters) to the input image and to the output of each of the first four max pooling layers. The feature map of each MLP is concatenated to the main network’s last layer feature map. We train only the newly added weights and keep the other network parameters fixed.
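
The branch layout described above can be summarized with a small bookkeeping sketch. It only counts channels and does not build or run a network. The names `ConvLayer`, `BRANCH_SOURCES`, `concatenated_channels`, and the labels `pool1` to `pool4` are hypothetical and chosen for illustration. The channel count of the main network's last layer is not stated here, so the caller has to supply it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """One convolutional layer of a branch MLP (illustrative type)."""
    filters: int
    kernel: int


# Two-layer MLP attached to every branch source, as described in the text:
# 128 filters of size 3x3 followed by 128 filters of size 1x1.
BRANCH_MLP = (ConvLayer(filters=128, kernel=3), ConvLayer(filters=128, kernel=1))

# The five places where a branch is attached. The pool labels are
# hypothetical names for the first four max pooling layers.
BRANCH_SOURCES = ("input image", "pool1", "pool2", "pool3", "pool4")


def branch_output_channels(mlp=BRANCH_MLP):
    """Channels produced by one branch, i.e. the filters of its last layer."""
    return mlp[-1].filters


def concatenated_channels(main_channels):
    """Channels after concatenating all branch outputs to the main feature map.

    main_channels is an assumption supplied by the caller; the text above
    does not state the width of the main network's last layer.
    """
    return main_channels + len(BRANCH_SOURCES) * branch_output_channels()
```

Under these assumptions the five branches together contribute 5 × 128 = 640 extra channels, so `concatenated_channels(main_channels)` returns `main_channels + 640`.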


After DenseCRF post-processing, the model achieves 67.1% mean IOU on the PASCAL VOC 2012 test set.

CRF parameters: bi_w = 3, bi_xy_std = 95, bi_rgb_std = 3, pos_w = 3, pos_xy_std = 3.
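
To make the role of these parameters concrete, here is a minimal sketch of the pairwise kernel commonly used in fully connected CRFs: a bilateral (appearance) term plus a position-only (smoothness) term. It assumes that `bi_w`, `bi_xy_std`, and `bi_rgb_std` are the weight, spatial standard deviation, and color standard deviation of the bilateral term, and that `pos_w` and `pos_xy_std` are the weight and spatial standard deviation of the smoothness term. The function name `pairwise_kernel` is hypothetical, and this is not the DenseCRF implementation itself.

```python
import math

# Values copied from the CRF parameter line above.
CRF_PARAMS = {
    "bi_w": 3.0,
    "bi_xy_std": 95.0,
    "bi_rgb_std": 3.0,
    "pos_w": 3.0,
    "pos_xy_std": 3.0,
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pairwise_kernel(pos_i, pos_j, rgb_i, rgb_j, params=CRF_PARAMS):
    """Kernel value between two pixels (assumed parameter mapping, see lead-in).

    pos_* are (x, y) pixel coordinates, rgb_* are (r, g, b) color values.
    """
    d_pos = squared_distance(pos_i, pos_j)
    d_rgb = squared_distance(rgb_i, rgb_j)

    # Bilateral term: large for nearby pixels with similar color.
    bilateral = params["bi_w"] * math.exp(
        -d_pos / (2.0 * params["bi_xy_std"] ** 2)
        - d_rgb / (2.0 * params["bi_rgb_std"] ** 2)
    )

    # Smoothness term: depends on spatial proximity only.
    smoothness = params["pos_w"] * math.exp(
        -d_pos / (2.0 * params["pos_xy_std"] ** 2)
    )
    return bilateral + smoothness
```

With this reading, two identical pixels at the same location get the maximum kernel value `bi_w + pos_w`, and the value decays as the pixels move apart or differ in color. The large `bi_xy_std` relative to `pos_xy_std` means the bilateral term acts over a much longer spatial range than the smoothness term.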

Pretrained models and corresponding prototxt files

Please download from here