Detectron2 Benchmark


By Facebook Research | Compiled by Flin | Source: GitHub

Benchmarking

Here we benchmark the training speed of Mask R-CNN in Detectron2, along with some other popular open source implementations of Mask R-CNN.
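
The numbers reported below are training throughputs in images per second. As a rough illustration of how such a figure can be obtained (this is a sketch, not the actual benchmark harness; train_one_iter and BATCH_SIZE are hypothetical placeholders), one times a window of training iterations after a warmup phase and divides the images processed by the elapsed time:

    import time

    # Minimal sketch of measuring training throughput (img/s).
    # `train_one_iter` and BATCH_SIZE are hypothetical placeholders.
    BATCH_SIZE = 16          # e.g. 2 images per GPU on 8 GPUs
    WARMUP_ITERS = 100       # let cuDNN autotuning, data-loading caches, etc. settle
    MEASURE_ITERS = 400

    def measure_throughput(train_one_iter):
        for _ in range(WARMUP_ITERS):
            train_one_iter()
        start = time.perf_counter()
        for _ in range(MEASURE_ITERS):
            train_one_iter()
        elapsed = time.perf_counter() - start
        return MEASURE_ITERS * BATCH_SIZE / elapsed   # images per second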

Setup

Main results

| Implementation | Throughput (img/s) |
| --- | --- |
| Detectron2 | 59 |
| maskrcnn-benchmark | 51 |
| tensorpack | 50 |
| mmdetection | 41 |
| simpledet | 39 |
| Detectron | 19 |
| matterport/Mask_RCNN | 14 |

Details of each implementation:

  • Detectron2: run the following (a minimal Python-API sketch of an equivalent launch appears after this list)

    python tools/train_net.py  --config-file configs/Detectron1-Comparisons/mask_rcnn_R_50_FPN_noaug_1x.yaml --num-gpus 8
    
  • maskrcnn-benchmark: use commit 0ce8f6f with sed -i 's/torch.uint8/torch.bool/g' **/*.py to make it compatible with the latest PyTorch (a short snippet illustrating the uint8-to-bool change appears after this list). Then, run

    python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml
    

    The speed we observed is faster than the one reported in its model zoo, which may be due to different software versions.

  • tensorpack: at commit caafda, export TF_CUDNN_USE_AUTOTUNE=0, then run

    mpirun -np 8 ./train.py --config DATA.BASEDIR=/data/coco TRAINER=horovod BACKBONE.STRIDE_1X1=True TRAIN.STEPS_PER_EPOCH=50 --load ImageNet-R50-AlignPadding.npz
    
  • mmdetection: at commit 4d9a5f, apply the following diff, then run

    ./tools/dist_train.sh configs/mask_rcnn_r50_fpn_1x.py 8
    

    The speed we observed is faster than the one reported in its model zoo, which may be due to different software versions.

    <details> <summary> (diff to make it use the same hyperparameters - click to expand) </summary>

    diff --git i/configs/mask_rcnn_r50_fpn_1x.py w/configs/mask_rcnn_r50_fpn_1x.py
    index 04f6d22..ed721f2 100644
    --- i/configs/mask_rcnn_r50_fpn_1x.py
    +++ w/configs/mask_rcnn_r50_fpn_1x.py
    @@ -1,14 +1,15 @@
     # model settings
     model = dict(
         type='MaskRCNN',
    -    pretrained='torchvision://resnet50',
    +    pretrained='open-mmlab://resnet50_caffe',
         backbone=dict(
             type='ResNet',
             depth=50,
             num_stages=4,
             out_indices=(0, 1, 2, 3),
             frozen_stages=1,
    -        style='pytorch'),
    +        norm_cfg=dict(type="BN", requires_grad=False),
    +        style='caffe'),
         neck=dict(
             type='FPN',
             in_channels=[256, 512, 1024, 2048],
    @@ -115,7 +116,7 @@ test_cfg = dict(
     dataset_type = 'CocoDataset'
     data_root = 'data/coco/'
     img_norm_cfg = dict(
    -    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    +    mean=[123.675, 116.28, 103.53], std=[1.0, 1.0, 1.0], to_rgb=False)
     train_pipeline = [
         dict(type='LoadImageFromFile'),
         dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    

    </details>

  • SimpleDet: at commit 9187a1, run

    python detection_train.py --config config/mask_r50v1_fpn_1x.py
    
  • Detectron: run

    python tools/train_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml
    

    Note that many of its operations run on the CPU, so performance is limited.

  • matterport/Mask_RCNN: at commit 3deaec, apply the following diff, export TF_CUDNN_USE_AUTOTUNE=0, then run

    python coco.py train --dataset=/data/coco/ --model=imagenet
    

Note that many of the small details in this implementation may differ from the Detectron standard.

<details> <summary> (diff to make it use the same hyperparameters - click to expand) </summary>

diff --git i/mrcnn/model.py w/mrcnn/model.py
index 62cb2b0..61d7779 100644
--- i/mrcnn/model.py
+++ w/mrcnn/model.py
@@ -2367,8 +2367,8 @@ class MaskRCNN():
             epochs=epochs,
             steps_per_epoch=self.config.STEPS_PER_EPOCH,
             callbacks=callbacks,
-            validation_data=val_generator,
-            validation_steps=self.config.VALIDATION_STEPS,
+            #validation_data=val_generator,
+            #validation_steps=self.config.VALIDATION_STEPS,
             max_queue_size=100,
             workers=workers,
             use_multiprocessing=True,
diff --git i/mrcnn/parallel_model.py w/mrcnn/parallel_model.py
index d2bf53b..060172a 100644
--- i/mrcnn/parallel_model.py
+++ w/mrcnn/parallel_model.py
@@ -32,6 +32,7 @@ class ParallelModel(KM.Model):
         keras_model: The Keras model to parallelize
         gpu_count: Number of GPUs. Must be > 1
         """
+        super().__init__()
         self.inner_model = keras_model
         self.gpu_count = gpu_count
         merged_outputs = self.make_parallel()
diff --git i/samples/coco/coco.py w/samples/coco/coco.py
index 5d172b5..239ed75 100644
--- i/samples/coco/coco.py
+++ w/samples/coco/coco.py
@@ -81,7 +81,10 @@ class CocoConfig(Config):
     IMAGES_PER_GPU = 2
 
     # Uncomment to train on 8 GPUs (default is 1)
-    # GPU_COUNT = 8
+    GPU_COUNT = 8
+    BACKBONE = "resnet50"
+    STEPS_PER_EPOCH = 50
+    TRAIN_ROIS_PER_IMAGE = 512
 
     # Number of classes (including background)
     NUM_CLASSES = 1 + 80  # COCO has 80 classes
@@ -496,29 +499,10 @@ if __name__ == '__main__':
         # *** This training schedule is an example. Update to your needs ***
 
         # Training - Stage 1
-        print("Training network heads")
         model.train(dataset_train, dataset_val,
                     learning_rate=config.LEARNING_RATE,
                     epochs=40,
-                    layers='heads',
-                    augmentation=augmentation)
-
-        # Training - Stage 2
-        # Finetune layers from ResNet stage 4 and up
-        print("Fine tune Resnet stage 4 and up")
-        model.train(dataset_train, dataset_val,
-                    learning_rate=config.LEARNING_RATE,
-                    epochs=120,
-                    layers='4+',
-                    augmentation=augmentation)
-
-        # Training - Stage 3
-        # Fine tune all layers
-        print("Fine tune all layers")
-        model.train(dataset_train, dataset_val,
-                    learning_rate=config.LEARNING_RATE / 10,
-                    epochs=160,
-                    layers='all',
+                    layers='3+',
                     augmentation=augmentation)
 
     elif args.command == "evaluate":

</details>
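
For reference, here is a minimal sketch of how the Detectron2 command above maps onto its Python API (assuming detectron2 is installed, the command is run from the repo root so the config path resolves, and the COCO dataset is set up at the expected location). This is a sketch of an equivalent launch, not the exact benchmark entry point; tools/train_net.py adds more argument handling.

    # Rough Python-API equivalent of the Detectron2 command above (sketch only).
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultTrainer, launch

    def main():
        cfg = get_cfg()
        # Assumption: run from the detectron2 repo root so this path resolves.
        cfg.merge_from_file("configs/Detectron1-Comparisons/mask_rcnn_R_50_FPN_noaug_1x.yaml")
        cfg.freeze()
        trainer = DefaultTrainer(cfg)
        trainer.resume_or_load(resume=False)
        return trainer.train()

    if __name__ == "__main__":
        launch(main, num_gpus_per_machine=8, dist_url="auto")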

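The maskrcnn-benchmark item above rewrites torch.uint8 to torch.bool because recent PyTorch deprecates uint8 masks. The small snippet below (an illustration, not code from that repository) shows the behaviour the sed command works around:

    import torch

    # Boolean masking: indexing with a uint8 mask triggers a deprecation
    # warning on recent PyTorch; torch.bool masks are the supported form,
    # which is what the sed rewrite above changes in bulk.
    x = torch.arange(6)
    mask_uint8 = torch.tensor([1, 0, 1, 0, 1, 0], dtype=torch.uint8)
    mask_bool = mask_uint8.to(torch.bool)
    print(x[mask_bool])   # tensor([0, 2, 4])
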
Original link: https://detectron2.readthedocs.io/notes/benchmarks.html

