Train your dataset with yolov5 and deploy yolov5 through flash

Keywords: AI Pytorch Object Detection Flask

Use yolov5 to train your own dataset (detailed process) and deploy yolov5 through flash

github project address

Use yolov5 to train your own data set (detailed process) and deploy it through flash

1. Prepare data set

PASCAL VOC

In this paper PASCAL VOC extraction code: 07wp
Take the dataset as an example. Put the dataset under the project dataset directory. The dataset structure is as follows:

---VOC2012
--------Annotations
---------------xml0
---------------xml1
--------JPEGImages
---------------img0
---------------img1
--------pascal_voc_classes.txt

Annotations are all xml files, JPEGImages are all image files, pascal_voc_classes.txt is a category file.

Get label file

The format of yolo tag file is as follows:

102 0.682813 0.415278 0.237500 0.502778
102 0.914844 0.396528 0.168750 0.451389

first place label´╝îIs the category of objects in the picture
 The last four digits are the positions of the objects in the picture( xcenter, ycenter, h, w)That is, the relative coordinates and relative height and width of the central position of the target object
 There are two targets in the figure above

If you already have the above label file, you can skip to the next step.
If there is no label file above, it can be used labelimg extraction code dbi2
Label. Generate a label file in xml format and then convert it to a label file in yolo format. The use of labelimg is very simple and will not be repeated here.
The label file in xml format is converted to yolo format,
Run the get_train_val.py file under the project.

python xml_yolo.py

pascal_voc_classes.txt is the json file corresponding to your category. The following is the category format of VOC dataset.

["aeroplane","bicycle", "bird","boat","bottle","bus","car","cat","chair","cow","diningtable","dog","horse","motorbike","person","pottedplant","sheep","sofa","train", "tvmonitor"]

Path structure after running the above code

---VOC2012
--------Annotations
--------JPEGImages
--------pascal_voc_classes.json
---yolodata
--------images
--------labels

2. Divide training set and test set

The division of training set and test set is very simple. The original data is scrambled, and then divided into training set and test set according to 9:1.
Run the get_train_val.py file under the project.

python get_train_val.py
Running the above code will generate the following path structure
---VOC2012
--------Annotations
--------JPEGImages
--------pascal_voc_classes.json
---yolodata
--------images
--------labels
---traindata
--------images
----------------train
----------------val
--------labels
----------------train
----------------val
Traintata is the last training file needed

3. Training model

The training of yolov5 is very simple. The code has been simplified in this paper. The code structure is as follows:

dataset             # data set
------traindata     # Training data set
inference           # Input / output interface
------inputs        # input data
------outputs       # output data
config              # configuration file
------score.yaml    # Training profile
------yolov5l.yaml  # Model profile
models              # Model code
runs	            # log file
utils               # Code file
weights             # Model saving path, last.pt, best.pt
train.py            # Training code
detect.py           # Test code

score.yaml is explained as follows:

# train and val datasets (image directory)
train: ./datasets/traindata/images/train/
val: ./datasets/traindata/images/val/
# number of classes
nc: 2
# class names
names: ['Apple','Banana']

Train: refers to the train address of image data
val: the address of the image data
nc: number of categories
names: the name corresponding to the category

yolov5l.yaml is explained as follows:

nc: 2 # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 1-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 2-P2/4
   [-1, 3, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 4-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 6-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]], # 8-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 6, BottleneckCSP, [1024]],  # 10
  ]
head:
  [[-1, 3, BottleneckCSP, [1024, False]],  # 11
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # 12 (P5/32-large)
   [-2, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 3, BottleneckCSP, [512, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # 17 (P4/16-medium)
   [-2, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 3, BottleneckCSP, [256, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # 22 (P3/8-small)
   [[], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

nc: the number of target categories
depth_multiple and width_multiple: control the depth and width of the model. Different parameters correspond to: s, m, l, x model.
anchors: it is the basic box generated by k-means clustering for the input target box, which is used to predict the target box.
yolov5 will automatically generate anchors. yolov5 uses Euclidean distance for k-means clustering, and then uses genetic algorithm to do a series of mutation to get the final anchors. However, the effect of using Euclidean distance for k-means clustering is not as good as that of using 1 - iou for k-means clustering. If you want the source code of 1 - iou for k-means clustering, please talk to me in private. But the effect is the same A few.
backbone: the network structure of the image feature extraction part.
head: the network structure for the final prediction part

The train.py configuration is very simple:

We only need to modify the following parameters

epoch:         Controls the number of training iterations
batch_size     Enter the number of pictures for the iteration
cfg:           Configure network model path
data:          Training profile path
weights:       Load the model and continue training

Terminal operation

 python train.py

Start training.

Training process


Training results

3. Test model


Three parameters need to be changed

source:     To be tested images/videos route
out: 		Path to save results
weights:    The path of the trained model weight file

Terminal operation

 python detect.py

The test can be started.

You can also use the weight files on the coco dataset to test and put them in the weights folder

Extraction code: hhbb

detection result


4. Deploy through flash

The deployment of flash is very simple. If you don't understand, please refer to my previous blog or the comments below.

Deploying Python and flash projects on alicloud ECS is easy to understand without nginx and uwsgi

web platform for target detection and multi-target tracking based on yolov3 deep sort flash

Terminal operation

 python app.py

You can start to jump to the web page for image or video upload detection.

Posted by sentback on Sat, 23 Oct 2021 02:48:39 -0700