DeepLab-LargeFOV-Semi-Bbox-Seg is trained on PASCAL using mixed annotations (some strong pixel-level labels and many weak bounding box annotations). Note that DenseCRF has already been applied to perform foreground/background segmentation within each bounding box annotation (see our provided dataset; the refined segmentations are the ones used to train this model). Please also see our paper, Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation. The model is based on DeepLab-LargeFOV.


After DenseCRF post-processing, the model trained with 1.4K strong labels yields 65.1% mean IOU on the PASCAL VOC 2012 val set, and the model trained with 2.9K strong labels yields 68.0% mean IOU on the PASCAL VOC 2012 test set.
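
For reference, PASCAL VOC segmentation scores are conventionally reported as mean intersection-over-union (IOU) over the 21 classes (20 objects plus background); we assume that is the metric behind the percentages above. The sketch below shows how such a score could be computed from a confusion matrix. The function name and the matrix layout are illustrative assumptions, not part of the released code.

```python
def mean_iou(confusion):
    """Mean IOU from a square confusion matrix.

    Assumed layout (illustrative): confusion[g][p] is the number of
    pixels whose ground-truth class is g and predicted class is p.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(confusion[r][c] for r in range(num_classes)) - true_pos
        denom = true_pos + false_pos + false_neg
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(true_pos / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example: each class has 3 correct pixels, 1 missed, 1 false alarm.
toy_confusion = [[3, 1],
                 [1, 3]]
toy_score = mean_iou(toy_confusion)
```

In the toy example each class has IOU 3 / (3 + 1 + 1), so the mean is also 3/5.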

CRF parameters: bi_w = 5, bi_x_std = 63, bi_r_std = 5, pos_w = 3, pos_x_std = 3
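
As a rough illustration of what these parameters control, the sketch below evaluates the pairwise kernel of a fully connected CRF with a bilateral (position and color) term and a position-only term. The mapping of names is our reading and should be treated as an assumption: bi_w and pos_w as the two kernel weights, bi_x_std and pos_x_std as spatial standard deviations in pixels, and bi_r_std as the color standard deviation. The function pairwise_kernel is hypothetical and is not the DenseCRF implementation used with the model.

```python
import math

# CRF parameters quoted above.
CRF_PARAMS = {
    "bi_w": 5.0,       # weight of the bilateral (appearance) kernel
    "bi_x_std": 63.0,  # spatial std of the bilateral kernel (assumed: pixels)
    "bi_r_std": 5.0,   # color std of the bilateral kernel (assumed: RGB units)
    "pos_w": 3.0,      # weight of the position-only (smoothness) kernel
    "pos_x_std": 3.0,  # spatial std of the position-only kernel (assumed: pixels)
}


def pairwise_kernel(pos_i, pos_j, rgb_i, rgb_j, params=CRF_PARAMS):
    """Kernel value between two pixels (hypothetical helper).

    k = bi_w  * exp(-|p_i - p_j|^2 / (2 bi_x_std^2) - |I_i - I_j|^2 / (2 bi_r_std^2))
      + pos_w * exp(-|p_i - p_j|^2 / (2 pos_x_std^2))
    """
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_rgb = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))
    bilateral = params["bi_w"] * math.exp(
        -d_pos / (2 * params["bi_x_std"] ** 2)
        - d_rgb / (2 * params["bi_r_std"] ** 2)
    )
    positional = params["pos_w"] * math.exp(
        -d_pos / (2 * params["pos_x_std"] ** 2)
    )
    return bilateral + positional


# Identical position and color: both exponentials equal 1, so k = bi_w + pos_w.
k_same = pairwise_kernel((0, 0), (0, 0), (10, 10, 10), (10, 10, 10))
# Distant pixels with very different colors contribute almost nothing.
k_far = pairwise_kernel((0, 0), (200, 200), (0, 0, 0), (255, 255, 255))
```

Under this reading, the large bi_x_std lets similarly colored pixels influence each other across long distances, while the small bi_r_std makes that influence fall off quickly as colors diverge.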

Pretrained models and corresponding prototxt files

Please download from this link.

Note: please change the variable TRAIN_SET_WEAK to TRAIN_SET_WEAK_BBOX so that you can train the model with the provided list of bounding box annotations.