# Train a Mask R-CNN model with the Tensorflow Object Detection API

## 1. Installation

### Clone the repository and install dependencies

First, we need to clone the Tensorflow models repository. This can be done either by downloading the repository directly from GitHub or by typing **git clone https://github.com/tensorflow/models --single-branch --branch r1.13.0** inside a command line.

After cloning the repository, it is a good idea to install all the dependencies. This can be done by typing:

```bash
pip install --user Cython
pip install --user contextlib2
pip install --user pillow
pip install --user lxml
pip install --user jupyter
pip install --user matplotlib
```

Also, make sure to use Tensorflow 1.x, since training a custom model doesn't work with Tensorflow 2 yet.
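
If you are not sure which version is installed, a quick check from Python (a minimal sanity check of my own, nothing specific to this repository) looks like this:

```python
import tensorflow as tf

# Training with the Object Detection API described here expects a 1.x release
print(tf.__version__)  # e.g. 1.15.0 - a version starting with 2. won't work here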

### Install the COCO API

COCO is a large image dataset designed for object detection, segmentation, person keypoints detection, stuff segmentation, and caption generation. If you want to use the dataset and its evaluation metrics, you need to clone the cocoapi repository and copy the pycocotools subfolder to the tensorflow/models/research directory.

```bash
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
cp -r pycocotools <path_to_tensorflow>/models/research/
```

Using make won't work on Windows. To install the cocoapi on Windows, the following command can be used instead:

```bash
pip install "git+https://github.com/philferriere/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"
```

### Protobuf Installation/Compilation

The Tensorflow Object Detection API uses .proto files. These files need to be compiled into .py files in order for the Object Detection API to work properly. Google provides a program called Protobuf that can compile these files.

Protobuf can be downloaded from the [protobuf releases page](https://github.com/protocolbuffers/protobuf/releases). After downloading, you can extract the folder to a directory of your choice.

After extracting the folder, you need to go into models/research and use protoc to generate Python files from the .proto files in the object_detection/protos directory.

The official installation guide uses protoc like this:

```bash
./bin/protoc object_detection/protos/*.proto --python_out=.
```

But the *, which stands for all files, didn't work for me, so I wrote a little Python script that executes the command for each .proto file.

```python
import os
import sys

# Usage: python use_protobuf.py <path to directory> <path to protoc file>
args = sys.argv
directory = args[1]
protoc_path = args[2]

# Run protoc on every .proto file inside the given directory
for file in os.listdir(directory):
    if file.endswith(".proto"):
        os.system(protoc_path + " " + directory + "/" + file + " --python_out=.")
```

This file needs to be saved inside the research folder, and I named it use_protobuf.py. Now we can use it by going into the console and typing:

```bash
python use_protobuf.py <path to directory> <path to protoc file>

# Example:
python use_protobuf.py object_detection/protos C:/Users/Gilbert/Downloads/bin/protoc
```

### Add necessary environment variables and finish Tensorflow Object Detection API installation

Lastly, we need to add the research and research/slim folders to our PYTHONPATH environment variable and run the setup.py file.

To add the paths to the environment variable on Linux, you need to type:

```bash
export PYTHONPATH=$PYTHONPATH:<PATH_TO_TF>/TensorFlow/models/research
export PYTHONPATH=$PYTHONPATH:<PATH_TO_TF>/TensorFlow/models/research/object_detection
export PYTHONPATH=$PYTHONPATH:<PATH_TO_TF>/TensorFlow/models/research/slim
```

On Windows, you need to add the paths of the research folder and the research/slim folder to your PYTHONPATH environment variable (see Environment Setup).

To run the setup.py file, we need to navigate to tensorflow/models/research and run:

```bash
# From within TensorFlow/models/research/
python setup.py build
python setup.py install
```

This completes the installation of the Object Detection API. To test if everything is working correctly, run the object_detection_tutorial.ipynb notebook from the object_detection folder. If your installation works correctly, you should see detection results on the notebook's example images as output.
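
As an additional quick check (a minimal sketch of my own, assuming the research and slim folders were added to the PYTHONPATH as described above), you can try importing the package directly from a Python shell:

```python
# If these imports succeed, the protos were compiled and the package is on the path
import object_detection
from object_detection.utils import label_map_util

print("Object Detection API is importable")
```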
### Run the Tensorflow Object Detection API with Docker

Installing the Tensorflow Object Detection API can be hard because there are lots of errors that can occur depending on your operating system. Docker makes it easy to set up the Tensorflow Object Detection API because you only need to download the files inside the [docker folder](docker/) and run **docker-compose up**.

After running the command, Docker should automatically download and install everything needed for the Tensorflow Object Detection API and open Jupyter on port 8888. If you also want access to a bash shell for training models, you can run **docker exec -it CONTAINER_ID bash**. For more information, check out [Docker's documentation](https://docs.docker.com/).

If you experience any problems with the docker files, be sure to let me know.

## 2. Gathering data

Now that the Tensorflow Object Detection API is ready to go, we need to gather the images needed for training.

To train a robust model, we need lots of pictures that should vary as much as possible from each other. That means that they should have different lighting conditions, different backgrounds, and lots of random objects in them.

You can either take the pictures yourself, or you can download pictures from the internet. For my microcontroller detector, I have four different objects I want to detect (Arduino Nano, ESP8266, Raspberry Pi 3, Heltec ESP32 Lora).

I took about 25 pictures of each individual microcontroller and 25 pictures containing multiple microcontrollers using my smartphone. After taking the pictures, make sure to transform them to a resolution suitable for training (I used 800x600).

You can use the [resize_images script](resize_images.py) to resize the images to the wanted resolution.

```bash
python resize_images.py -d images/ -s 800 600
```
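
If you prefer not to use the script from the repository, a minimal resizing sketch with Pillow could look like the one below. This is my own example, not the repository's resize_images.py, and it assumes all images are .jpg files directly inside the images/ folder.

```python
import glob
import os

from PIL import Image

# Resize every .jpg inside images/ to 800x600 in place
for path in glob.glob(os.path.join("images", "*.jpg")):
    image = Image.open(path)
    image = image.resize((800, 600), Image.LANCZOS)
    image.save(path)
```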

After you have all the images, move about 80% to the object_detection/images/train directory and the other 20% to the object_detection/images/test directory. Make sure that the images in both directories have a good variety of classes.
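
The split can be done by hand or with a small script. The sketch below is my own example, assuming the images are .jpg files inside an images/ folder:

```python
import glob
import os
import random
import shutil

# Shuffle the images and move roughly 80% to images/train and 20% to images/test
images = glob.glob("images/*.jpg")
random.shuffle(images)
split_index = int(0.8 * len(images))

os.makedirs("images/train", exist_ok=True)
os.makedirs("images/test", exist_ok=True)

for image_path in images[:split_index]:
    shutil.move(image_path, "images/train")
for image_path in images[split_index:]:
    shutil.move(image_path, "images/test")
```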

## 3. Labeling data

After you have gathered enough images, it's time to label them, so your model knows what to learn. In order to label the data, you will need to use some kind of labeling software.

For object detection, we used [LabelImg](https://github.com/tzutalin/labelImg), an excellent image annotation tool supporting both PascalVOC and Yolo format. For image segmentation / instance segmentation there are multiple great annotation tools available, including [VGG Image Annotation Tool](http://www.robots.ox.ac.uk/~vgg/software/via/), [labelme](https://github.com/wkentaro/labelme), and [PixelAnnotationTool](https://github.com/abreheret/PixelAnnotationTool). I chose labelme because it is simple to both install and use.
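
labelme saves one .json annotation file next to each image. As a quick way to see which labels you have used so far, a small sketch like this one can help (my own example; it assumes the annotation files end up inside images/train and images/test and that each file contains the usual labelme "shapes" list):

```python
import glob
import json
from collections import Counter

# Count how often each label occurs across all labelme annotation files
label_counts = Counter()
for path in glob.glob("images/*/*.json"):
    with open(path) as f:
        annotation = json.load(f)
    for shape in annotation["shapes"]:
        label_counts[shape["label"]] += 1

print(label_counts)
```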

## 4. Generating Training data

With the images labeled, we need to create TFRecords that can serve as input data for training the model. Before we create the TFRecord files, we'll convert the labelme labels into COCO format. This can be done with the [labelme2coco.py script](images/labelme2coco.py).

```bash
python labelme2coco.py train train.json
python labelme2coco.py test test.json
```

Now we can create the TFRecord files using the [create_coco_tf_record.py script](create_coco_tf_record.py).

```bash
python create_coco_tf_record.py --logtostderr --train_image_dir=images/train --test_image_dir=images/test --train_annotations_file=images/train.json --test_annotations_file=images/test.json --output_dir=./
```

After executing this command, you should have a train.record and a test.record file inside your object_detection folder.
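
To quickly verify that the records are not empty, you can count the examples they contain. This is a small check of my own, using the TF 1.x record iterator:

```python
import tensorflow as tf

# Count the number of examples in each TFRecord file (TF 1.x API)
for record_file in ["train.record", "test.record"]:
    count = sum(1 for _ in tf.python_io.tf_record_iterator(record_file))
    print(record_file, count)
```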

## 5. Getting ready for training

The last thing we need to do before training is to create a label map and a training configuration file.

### 5.1 Creating a label map

The label map maps an id to a name. We will put it in a folder called training, which is located in the object_detection directory. The labelmap for my detector can be seen below.

```bash
item {
    id: 1
    name: 'Arduino'
}
item {
    id: 2
    name: 'ESP8266'
}
item {
    id: 3
    name: 'Heltec'
}
item {
    id: 4
    name: 'Raspberry'
}
```

The id number of each item should match the ids inside the train.json and test.json files.

```json
"categories": [
  {
    "supercategory": "Arduino",
    "id": 0,
    "name": "Arduino"
  },
  {
    "supercategory": "ESP8266",
    "id": 1,
    "name": "ESP8266"
  },
  {
    "supercategory": "Heltec",
    "id": 2,
    "name": "Heltec"
  },
  {
    "supercategory": "Raspberry",
    "id": 3,
    "name": "Raspberry"
  }
],
```
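
A quick way to compare the two is to print both the label map and the categories from the generated COCO files. This is a small sketch of my own, assuming the file locations used in this tutorial:

```python
import json

from object_detection.utils import label_map_util

# Name -> id mapping from the label map used for training
print(label_map_util.get_label_map_dict("training/labelmap.pbtxt"))

# Category ids and names written by labelme2coco.py
with open("images/train.json") as f:
    categories = json.load(f)["categories"]
print([(category["id"], category["name"]) for category in categories])
```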

### 5.2 Creating the training configuration

Lastly, we need to create a training configuration file. The Tensorflow Object Detection API provides four Mask R-CNN model options:

From the [Tensorflow Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md):

| Model name | Speed (ms) | COCO mAP | Outputs |
| ------------ | :--------------: | :--------------: | :-------------: |
| [mask_rcnn_inception_resnet_v2_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz) | 771 | 36 | Masks |
| [mask_rcnn_inception_v2_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_inception_v2_coco_2018_01_28.tar.gz) | 79 | 25 | Masks |
| [mask_rcnn_resnet101_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet101_atrous_coco_2018_01_28.tar.gz) | 470 | 33 | Masks |
| [mask_rcnn_resnet50_atrous_coco](http://download.tensorflow.org/models/object_detection/mask_rcnn_resnet50_atrous_coco_2018_01_28.tar.gz) | 343 | 29 | Masks |

For this tutorial I chose to use mask_rcnn_inception_v2_coco, because it's a lot faster than the other options. You can find the [mask_rcnn_inception_v2_coco.config file](https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/mask_rcnn_inception_v2_coco.config) inside the samples/configs folder. Copy the config file to the training directory. Then open it with a text editor and make the following changes:

* Line 10: change the number of classes to the number of objects you want to detect (4 in my case)

* Line 126: change fine_tune_checkpoint to the path of the model.ckpt file:

  * ```fine_tune_checkpoint: "<path>/models/research/object_detection/training/mask_rcnn_inception_v2_coco_2018_01_28/model.ckpt"```

* Line 142: change input_path to the path of the train.record file:

  * ```input_path: "<path>/models/research/object_detection/train.record"```

* Line 158: change input_path to the path of the test.record file:

  * ```input_path: "<path>/models/research/object_detection/test.record"```

* Lines 144 and 160: change label_map_path to the path of the label map:

  * ```label_map_path: "<path>/models/research/object_detection/training/labelmap.pbtxt"```

* Line 150: change num_examples to the number of images in your test folder (a quick way to count them is shown below).
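
A quick way to get this number, assuming the test images are .jpg files inside images/test as in this tutorial:

```python
import glob

# Number of images in the test folder, to use for the num_examples field
print(len(glob.glob("images/test/*.jpg")))
```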

## 6. Training the model

To train the model, execute the following command in the command line:

```bash
python model_main.py --logtostderr --model_dir=training/ --pipeline_config_path=training/mask_rcnn_inception_v2_coco.config
```

Every few minutes, the current loss gets logged to Tensorboard. Open Tensorboard by opening a second command line, navigating to the object_detection folder and typing:

```bash
tensorboard --logdir=training
```

The training script saves checkpoints every few minutes. Train the model until it reaches a satisfying loss, then terminate the training process by pressing Ctrl+C.

### Training in Google Colab

If your computer doesn't have a good enough GPU to train the model locally, you can train it on Google Colab. For this, I recommend creating a folder that has the data as well as all the config files in it and putting it on Google Drive. That way, you can then load all the custom files into Google Colab.

You can find an example inside the [Tensorflow_Object_Detection_API_Instance_Segmentation_in_Google_Colab.ipynb notebook](Tensorflow_Object_Detection_API_Instance_Segmentation_in_Google_Colab.ipynb).

## 7. Exporting the inference graph

Now that we have a trained model, we need to generate an inference graph, which can be used to run the model. To do so, we first need to find out the highest saved step number. For this, we navigate to the training directory and look for the model.ckpt file with the biggest index.
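
If you don't want to look this up by hand, a small helper of my own like the one below prints the highest step number, assuming the checkpoints are saved inside the training/ folder:

```python
import glob
import re

# Find the largest XXXX among the training/model.ckpt-XXXX.* checkpoint files
steps = [int(re.search(r"model\.ckpt-(\d+)", path).group(1))
         for path in glob.glob("training/model.ckpt-*.index")]
print(max(steps))
```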

Then we can create the inference graph by typing the following command in the command line:

```bash
python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/mask_rcnn_inception_v2_coco.config --trained_checkpoint_prefix training/model.ckpt-XXXX --output_directory inference_graph
```

XXXX represents the highest checkpoint number found in the previous step.