performance improvement in evaluate_kitti.py + some minor fixes

Jiwoong Choi committed Mar 6, 2019
1 parent 07b8e7c commit 7fd7b56
Showing 13 changed files with 45 additions and 27 deletions.
37 changes: 23 additions & 14 deletions README.md
@@ -34,9 +34,8 @@ This repository includes:
```

![alt text](assets/unet.gif)
-![alt text](assets/kitti0.png)
-![alt text](assets/kitti1.png)
-![alt text](assets/coco0.png)
+![alt text](assets/kitti.png)
+![alt text](assets/coco.png)

# Installation

@@ -79,24 +78,25 @@ This repository includes:
# How to run examples
Please read the instructions in the README.md file in each example folder
1. [Custom Backbone](https://github.com/nearthlab/image-segmentation/tree/master/examples/custom_backbone) <br/>
-This example illustrates how to build MaskRCNN with your custom backbone CNN. In particular, I adopted [matterport's implementation of ResNet](https://github.com/matterport/Mask_RCNN/blob/1ad9feaae3d87b52495413e6c8ea0e92f0e5bc34/mrcnn/model.py#L171), which is slightly different from [qubvel's](https://github.com/qubvel/classification_models/blob/e223c492477030b80bdc56b53471df39c4e090ea/classification_models/resnet/builder.py#L24). Moreover, you can run the inference using the pretrained [MaskRCNN_coco.h5](https://github.com/nearthlab/image-segmentation/releases). (I slightly modified the 'mask_rcnn_coco.h5' in [matterport/Mask_RCNN/releases](https://github.com/matterport/Mask_RCNN/releases) to make this file due to some differences in layer names)
+This example illustrates how to build MaskRCNN with your custom backbone architecture. In particular, I adopted [matterport's implementation of ResNet](https://github.com/matterport/Mask_RCNN/blob/1ad9feaae3d87b52495413e6c8ea0e92f0e5bc34/mrcnn/model.py#L171), which is slightly different from [qubvel's](https://github.com/qubvel/classification_models/blob/e223c492477030b80bdc56b53471df39c4e090ea/classification_models/resnet/builder.py#L24). Moreover, you can run inference using the pretrained [MaskRCNN_coco.h5](https://github.com/nearthlab/image-segmentation/releases). (I slightly modified the 'mask_rcnn_coco.h5' from [matterport/Mask_RCNN/releases](https://github.com/matterport/Mask_RCNN/releases) to make this example work: only some layer names differ)

2. [Imagenet Classification](https://github.com/nearthlab/image-segmentation/tree/master/examples/imagenet) <br/>
This example shows the imagenet classification results for various backbone architectures.

3. [Create KITTI Label](https://github.com/nearthlab/image-segmentation/tree/master/examples/create_kitti_label) <br/>
-This example is a code that I used to simplify some of the object class labels in KITTI dataset. (For instance, I merged the 5 separate classes 'car', 'truck', 'bus', 'caravan' and 'trailer' into one single class called 'vehicle')
+This example is the code I used to simplify some of the object class labels in the KITTI dataset. (For instance, I merged the 5 separate classes 'car', 'truck', 'bus', 'caravan' and 'trailer' into a single class called 'vehicle')

4. [Configurations](https://github.com/nearthlab/image-segmentation/tree/master/examples/configs) <br/>
Some example cfg files that describe the segmentation models and training processes

# How to train your own FPN / LinkNet / PSPNet / UNet model on KITTI dataset

i. Download the modified KITTI dataset from the [release page](https://github.com/nearthlab/image-segmentation/releases)
-(or make your own dataset into the same format) and place it under [datasets](https://github.com/nearthlab/image-segmentation/tree/master/datasets) folder. [Note that the KITTI dataset is a public dataset available [online](http://www.cvlibs.net/datasets/kitti/eval_semseg.php?benchmark=semantics2015).
-I simply splitted the dataset into training and validation sets and simplified the labels using [create_kitti_label.py](https://github.com/nearthlab/image-segmentation/blob/master/examples/create_kitti_label/create_kitti_label.py).]
+(or make your own dataset into the same format) and place it under the [datasets](https://github.com/nearthlab/image-segmentation/tree/master/datasets) folder.
+* The KITTI dataset is a public dataset available [online](http://www.cvlibs.net/datasets/kitti/eval_semseg.php?benchmark=semantics2015).
+I simply split the dataset into training and validation sets and simplified the labels using [create_kitti_label.py](https://github.com/nearthlab/image-segmentation/blob/master/examples/create_kitti_label/create_kitti_label.py).

-* Note that KITTI dataset is a very small dataset containing only 180 training images and 20 validation images. If you want to train a model for a serious purpose, you should consider using much more larger dataset.
+* Note that this dataset is very small, containing only 180 training images and 20 validation images. If you want to train a model for a serious purpose, you should consider using a much larger dataset.

ii. Choose your model and copy the corresponding cfg files from examples/configs. For example, if you want to train a Unet model,
```bash
@@ -105,11 +105,11 @@ Some example cfg files that describes the segmentation models and training proce
cp examples/configs/unet/*.cfg plans/unet
```

-iii. [Optional] Tune some model and training parameters in the config files that you have just copied. Read the comments in the example config files for what each parameter does.
+iii. [Optional] Tune some model and training parameters in the config files that you have just copied. Read the comments in the example config files for what each parameter means.
[Note that you have to declare variables in the .cfg files in the format
```{type}-{VARIABLE_NAME} = {value}```]
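For instance, the [LOSS] section of the example configs touched by this commit declares typed variables like this (values copied from the cfg diffs below):
```
[LOSS]
# Weight decay for l2 regularization
float-WEIGHT_DECAY = 5.0

# Loss weights for more precise optimization.
dict-LOSS_WEIGHTS = {"cce_loss": 1.0, "jaccard_loss": 1.0, "dice_loss": 1.0}
```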

-iv. Run the training command
+iv. Run train.py:
```bash
cd /path/to/image-segmentation
python train.py -s plans/unet -d datasets/KITTI \
@@ -122,6 +122,11 @@ Some example cfg files that describes the segmentation models and training proce
<br/><br/>
Once the training is done, you can find three files, 'class_names.json', 'infer.cfg' and 'best_model.h5',
which you can use later for [inference](https://github.com/nearthlab/image-segmentation/blob/master/README.md#how-to-visualize-inference).

+v. KITTI Evaluation:
+```bash
+python evaluate_kitti.py -c /path/to/infer.cfg -w /path/to/best_model.h5 -l /path/to/class_names.json
+```
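evaluate_kitti.py builds a per-class confusion matrix (see the evaluate_kitti.py hunk below). The commit does not show how that matrix is summarized, so here is a hedged sketch of one standard reduction, per-class IoU, assuming ground truth on the rows as in compute_confusion_matrix (iou_from_confusion is a hypothetical helper, not part of the repository):
```python
import numpy as np

def iou_from_confusion(cm):
    # cm[i][j] counts pixels of ground-truth class i predicted as class j,
    # matching confusion_matrix[gt_cls][pr_cls] += 1 in compute_confusion_matrix
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp   # predicted as class c but actually another class
    fn = cm.sum(axis=1) - tp   # class c pixels predicted as something else
    union = tp + fp + fn
    # classes absent from both ground truth and prediction yield NaN
    return np.where(union > 0, tp / np.maximum(union, 1), np.nan)

# mean IoU over classes that actually occur:
# mean_iou = np.nanmean(iou_from_confusion(cm))
```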

# How to train your own MaskRCNN model on COCO dataset

@@ -138,12 +143,12 @@ Some example cfg files that describes the segmentation models and training proce
cp examples/configs/maskrcnn/*.cfg plans/maskrcnn
```

-iii. [Optional] Tune some model and training parameters in the config files that you have just copied. Read the comments in the example config files for what each parameter does.
+iii. [Optional] Tune some model and training parameters in the config files that you have just copied. Read the comments in the example config files for what each parameter means.
[Note that you have to declare variables in the .cfg files in the format
```{type}-{VARIABLE_NAME} = {value}```]


-iv. Run the training command
+iv. Run train.py:
```bash
cd /path/to/image-segmentation
python train.py -s plans/maskrcnn -d datasets/coco \
@@ -155,15 +160,19 @@ Some example cfg files that describes the segmentation models and training proce
Likewise, you can find the three files 'class_names.json', 'infer.cfg' and 'best_model.h5',
which you can use later for [inference](https://github.com/nearthlab/image-segmentation/blob/master/README.md#how-to-visualize-inference).


+v. COCO Evaluation:
+```bash
+python evaluate_coco.py -c /path/to/infer.cfg -w /path/to/best_model.h5 -l /path/to/class_names.json
+```

# How to visualize inference

You can visualize your model's inference in a pop-up window:
```bash
python infer_gui.py -c=/path/to/infer.cfg -w=/path/to/best_model.h5 -l=/path/to/class_names.json \
-i=/path/to/a/directory/containing/image_files
```
-or save the result as image files [This will create a directory named 'results' under the directory you provided in -i option, and write the viusalized inference images in it]:
+or save the results as image files [This will create a directory named 'results' under the directory you provided in the -i option, and write the visualized inference images in it]:
```bash
python infer.py -c=/path/to/infer.cfg -w=/path/to/best_model.h5 -l=/path/to/class_names.json \
-i=/path/to/a/directory/containing/image_files
File renamed without changes
Binary file added assets/kitti.png
Binary file removed assets/kitti0.png
Binary file removed assets/kitti1.png
8 changes: 5 additions & 3 deletions evaluate_kitti.py
@@ -7,11 +7,13 @@
from data_generators.kitti import load_image_gt, KittiDataset

def compute_confusion_matrix(gt_mask, pr_mask, num_classes):
+    gt_mask = np.max(gt_mask * np.arange(1, num_classes), axis=-1)
+    pr_mask = np.max(pr_mask * np.arange(1, num_classes), axis=-1)

     confusion_matrix = np.zeros((num_classes, num_classes))
-    for row, col, cls in np.ndindex(gt_mask.shape):
-        gt_cls = (cls + 1) * gt_mask[row][col][cls]
-        pr_cls = (cls + 1) * pr_mask[row][col][cls]
+    for row, col in np.ndindex(gt_mask.shape):
+        gt_cls = gt_mask[row][col]
+        pr_cls = pr_mask[row][col]
         confusion_matrix[gt_cls][pr_cls] += 1

     return confusion_matrix
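The patched loop above is already far cheaper than iterating over every (row, col, cls) triple, but it still visits each pixel in Python. A fully vectorized sketch, under the same assumption that the masks are one-hot arrays whose last axis indexes the num_classes - 1 foreground classes (the function name is hypothetical):
```python
import numpy as np

def compute_confusion_matrix_vectorized(gt_mask, pr_mask, num_classes):
    # collapse one-hot masks to per-pixel class indices (0 = background),
    # exactly as the patched code above does
    gt_cls = np.max(gt_mask * np.arange(1, num_classes), axis=-1).astype(np.int64)
    pr_cls = np.max(pr_mask * np.arange(1, num_classes), axis=-1).astype(np.int64)
    # count (gt, pr) pairs in one pass with bincount instead of a Python loop
    pairs = gt_cls.ravel() * num_classes + pr_cls.ravel()
    counts = np.bincount(pairs, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)
```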
2 changes: 1 addition & 1 deletion examples/configs/linknet/train_linknet_all.cfg
@@ -39,7 +39,7 @@
[LOSS]
# Weight decay for l2 regularization
# Set this value so that the regularization loss is roughly in the same range as the other losses
-float-WEIGHT_DECAY = 1.0
+float-WEIGHT_DECAY = 5.0

# Loss weights for more precise optimization.
dict-LOSS_WEIGHTS = {"cce_loss": 1.0, "jaccard_loss": 1.0, "dice_loss": 1.0}
4 changes: 2 additions & 2 deletions examples/configs/pspnet/pspnet.cfg
@@ -19,8 +19,8 @@
int-NUM_CLASSES = 13

# The width and height of the input tensor
-int-IMAGE_WIDTH = 1280
-int-IMAGE_HEIGHT = 384
+int-IMAGE_WIDTH = 960
+int-IMAGE_HEIGHT = 288

# Number of images to train with on each GPU. A 12GB GPU can typically
# handle 2 images of 1024x1024px.
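(A guess at the motivation, not stated in the commit: PSPNet's pyramid pooling typically requires input dimensions divisible by 48, and the new size satisfies this with 960 = 48 × 20 and 288 = 48 × 6, whereas the old width 1280 is not a multiple of 48.)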
2 changes: 1 addition & 1 deletion examples/configs/pspnet/train_pspnet_all.cfg
@@ -39,7 +39,7 @@
[LOSS]
# Weight decay for l2 regularization
# Set this value so that the regularization loss is roughly in the same range as the other losses
-float-WEIGHT_DECAY = 1.0
+float-WEIGHT_DECAY = 5.0

# Loss weights for more precise optimization.
dict-LOSS_WEIGHTS = {"cce_loss": 1.0, "jaccard_loss": 1.0, "dice_loss": 1.0}
2 changes: 1 addition & 1 deletion examples/configs/unet/train_unet_all.cfg
@@ -39,7 +39,7 @@
[LOSS]
# Weight decay for l2 regularization
# Set this value so that the regularization loss is roughly in the same range as the other losses
-float-WEIGHT_DECAY = 1.0
+float-WEIGHT_DECAY = 5.0

# Loss weights for more precise optimization.
dict-LOSS_WEIGHTS = {"cce_loss": 1.0, "jaccard_loss": 1.0, "dice_loss": 1.0}
8 changes: 7 additions & 1 deletion examples/custom_backbone/README.md
@@ -10,8 +10,14 @@ To run example inference code on image files in images folder, download MaskRCNN
# or python infer.py to save the results in the images/results folder, which will be automatically created
```

-To train your model with this example custom backbone, run:
+To train your model with this example custom backbone, run train.py:
```bash
cd /path/to/image-segmentation/examples/custom_backbone
python train.py -d /path/to/coco
```

+To evaluate the model, run evaluate.py:
+```bash
+cd /path/to/image-segmentation/examples/custom_backbone
+python evaluate.py -d /path/to/coco
+```
3 changes: 2 additions & 1 deletion image-segmentation/data_generators/utils.py
@@ -210,7 +210,8 @@ def unmold_mask(mask, bbox, image_shape, threshold=0.5):
     '''
     y1, x1, y2, x2 = bbox
     mask = resize(mask, (y2 - y1, x2 - x1))
-    mask = np.where(mask >= threshold, 1, 0).astype(np.bool)
+    if threshold:
+        mask = np.where(mask >= threshold, 1, 0).astype(np.bool)

     # Put the mask in the right location.
     full_mask = np.zeros(image_shape[:2], dtype=np.bool)
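An editorial caveat, not part of this commit: full_mask is allocated as np.bool, so even when threshold is falsy the soft mask is coerced to booleans on assignment. A sketch that actually preserves probabilities, reusing the code shown in this hunk (the final placement line follows matterport's implementation and is an assumption here):
```python
# sketch: pick the output dtype based on whether we binarize
y1, x1, y2, x2 = bbox
mask = resize(mask, (y2 - y1, x2 - x1))
if threshold:
    mask = np.where(mask >= threshold, 1, 0).astype(bool)

# keep float probabilities when no threshold was applied
full_mask = np.zeros(image_shape[:2], dtype=bool if threshold else mask.dtype)
full_mask[y1:y2, x1:x2] = mask
```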
@@ -115,11 +115,11 @@ def predict(self, image, threshold=0.5):
         res = self.model.predict(input, batch_size=1)[0]

         num_channels = res.shape[-1]
-        final_result = np.zeros((height, width, num_channels))
+        final_result = np.zeros((height, width, num_channels), dtype=np.bool)
         for i in range(num_channels):
             resized_mask = unresize_image(res[:, :, i], window, (height, width))
-            resized_mask[resized_mask > threshold] = 1.0
-            resized_mask[resized_mask <= threshold] = 0.0
+            if threshold:
+                resized_mask = np.where(resized_mask >= threshold, 1, 0).astype(bool)
             final_result[:, :, i] = resized_mask

         return final_result
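The same editorial caveat applies here: final_result is now allocated as np.bool, so a falsy threshold still yields boolean masks because the float output of unresize_image is coerced on assignment. A sketch that would return per-pixel probabilities in that case, using only the names visible in this hunk:
```python
# sketch: allocate the output dtype conditionally so that a falsy
# threshold really returns soft masks instead of coerced booleans
out_dtype = bool if threshold else np.float32
final_result = np.zeros((height, width, num_channels), dtype=out_dtype)
for i in range(num_channels):
    resized_mask = unresize_image(res[:, :, i], window, (height, width))
    if threshold:
        resized_mask = np.where(resized_mask >= threshold, 1, 0).astype(bool)
    final_result[:, :, i] = resized_mask
```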
