DeepLab-LargeFOV-Semi-Bbox-EM-Fixed is trained on PASCAL with mixed annotations: some strong pixel-level labels and many weak bounding box annotations. The EM-Fixed algorithm is employed during training (see our provided dataset, which contains the bounding box segmentations used by this model). Please also see our paper, Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation. The model is based on DeepLab-LargeFOV.


After DenseCRF post-processing, the model trained with 1.4K strong labels achieves 64.8% mean IOU on the PASCAL VOC 2012 val set, and the model trained with 2.9K strong labels achieves 69.0% mean IOU on the PASCAL VOC 2012 test set.

CRF parameters: bi_w = 6, bi_x_std = 59, bi_r_std = 5, pos_w = 3, pos_x_std = 3
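
To make the role of these parameters concrete, here is a minimal sketch of the pairwise kernel they would control, assuming they map onto the standard fully connected CRF formulation (a bilateral appearance term plus a position-only smoothness term). The mapping of bi_w, bi_x_std, bi_r_std, pos_w and pos_x_std onto that formula is an assumption made for illustration, and the function name pairwise_kernel is hypothetical; it is not part of the released code.

```python
import math

# CRF parameters copied from the line above.
BI_W, BI_X_STD, BI_R_STD = 6.0, 59.0, 5.0
POS_W, POS_X_STD = 3.0, 3.0


def pairwise_kernel(pos_i, pos_j, rgb_i, rgb_j):
    """Pairwise weight between two pixels (assumed standard dense-CRF form).

    pos_* are (x, y) pixel coordinates; rgb_* are (r, g, b) intensities.
    """
    # Squared Euclidean distances in position and in color space.
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_rgb = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))

    # Bilateral term: large only for pixels that are nearby AND similar in color.
    bilateral = BI_W * math.exp(
        -d_pos / (2 * BI_X_STD ** 2) - d_rgb / (2 * BI_R_STD ** 2)
    )
    # Position-only term: encourages smoothness among close neighbors.
    smoothness = POS_W * math.exp(-d_pos / (2 * POS_X_STD ** 2))
    return bilateral + smoothness


same = pairwise_kernel((0, 0), (1, 0), (10, 10, 10), (10, 10, 10))
edge = pairwise_kernel((0, 0), (1, 0), (10, 10, 10), (200, 10, 10))
print(same > edge)  # neighbors with similar color are coupled more strongly
```

Under this reading, the large bi_x_std (59) lets the appearance term reach far across the image, while the small bi_r_std (5) makes it drop off quickly across color edges. Two identical pixels at the same location receive the maximum weight, bi_w + pos_w = 9.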

Pretrained models and corresponding prototxt files

Please download from this link.

Note: please change the variable TRAIN_SET_WEAK to TRAIN_SET_WEAK_BBOX so that you can train the model with the provided list of bounding box annotations.