YOLO Learning Series (V): K-means Dimensional Clustering
1. Dimensional Clustering
1.1 Clustering Purpose
Run voc_label.py under Ubuntu to generate the training-set and test-set list files; under Windows, encoding errors will occur.
- When training YOLO on your own dataset, the first step is to replace the anchors with sizes suited to your data; the defaults were computed for the 20-class VOC and 80-class COCO datasets.

Dimensional clustering iterates over the training-set labels to find representative bounding-box widths and heights for the dataset. Following the YOLOv2 paper, the distance between a box and a cluster centre is 1 - IoU(box, centroid) rather than Euclidean distance, so large boxes do not dominate the result. The clustering results are shown first:
```
***************** n_anchors = 5 *****************
k-means result: (9.503125000000356, 8.046875000000572)
k-means result: (4.367264851484575, 6.158106435643181)
k-means result: (7.935546874999882, 4.897786458333437)
k-means result: (6.179457720588269, 7.807674632352704)
k-means result: (3.7825989208630717, 3.5274280575536627)
***************** n_anchors = 6 *****************
k-means result: (6.195027372262894, 7.749087591240633)
k-means result: (9.131578947368867, 9.546875000000345)
k-means result: (4.324420103092257, 6.20972938144295)
k-means result: (9.331640624999876, 5.898828124999875)
k-means result: (3.7665307971012534, 3.5264945652170954)
k-means result: (7.337187499999761, 4.57375000000048)
***************** n_anchors = 9 *****************
k-means result: (2.6570723684221056, 3.3848684210523685)
k-means result: (3.6341911764701336, 5.45174632352828)
k-means result: (5.790958737863881, 8.205703883494753)
k-means result: (7.290719696969482, 4.354640151515531)
k-means result: (5.568824404761428, 5.959821428571835)
k-means result: (9.106445312500655, 9.826171875000409)
k-means result: (9.356770833333087, 5.37803819444432)
k-means result: (8.429036458334373, 7.22656250000046)
k-means result: (4.455696202530808, 3.311313291139051)
```
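With results for several values of n_anchors, the usual way to pick one (as in the YOLOv2 paper) is to compare the average IoU between the boxes and their nearest centroid: more clusters always raises it, so you look for the point of diminishing returns. Below is a minimal, self-contained sketch of that measurement, assuming `boxes` and `centroids` are lists of relative (w, h) pairs; the names are illustrative and not part of the script in the next section.

```python
def wh_iou(a, b):
    # IoU of two boxes that share the same centre, described only by (w, h)
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def avg_iou(boxes, centroids):
    # Mean IoU between every box and its best-matching centroid;
    # compare this value across n_anchors = 5, 6, 9 to choose k.
    return sum(max(wh_iou(b, c) for c in centroids) for b in boxes) / len(boxes)
```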
1.2 Source Code
```python
# coding=utf-8
# k-means++ for YOLOv2 anchors
# Obtain the anchor sizes required by YOLOv2 via the k-means++ algorithm
import numpy as np


# The Box class describes the coordinates of a bounding box
class Box():
    def __init__(self, x, y, w, h):
        self.x = x
        self.y = y
        self.w = w
        self.h = h


# Compute the overlap of two boxes along one axis
# x1 is the center coordinate of box 1 on this axis
# len1 is the length of box 1 on this axis
# x2 is the center coordinate of box 2 on this axis
# len2 is the length of box 2 on this axis
# The return value is the length of the overlap on this axis
def overlap(x1, len1, x2, len2):
    len1_half = len1 / 2
    len2_half = len2 / 2
    left = max(x1 - len1_half, x2 - len2_half)
    right = min(x1 + len1_half, x2 + len2_half)
    return right - left


# Compute the intersection area of box a and box b
# a and b are both instances of Box
# The return value area is the intersection area of a and b
def box_intersection(a, b):
    w = overlap(a.x, a.w, b.x, b.w)
    h = overlap(a.y, a.h, b.y, b.h)
    if w < 0 or h < 0:
        return 0
    area = w * h
    return area


# Compute the union area of box a and box b
# a and b are both instances of Box
# The return value u is the union area of a and b
def box_union(a, b):
    i = box_intersection(a, b)
    u = a.w * a.h + b.w * b.h - i
    return u


# Compute the IoU of box a and box b
# a and b are both instances of Box
# The return value is the IoU of a and b
def box_iou(a, b):
    return box_intersection(a, b) / box_union(a, b)


# Initialize the centroids with k-means++ to reduce the influence of random
# initialization on the final result
# boxes is a list of Box objects for all bounding boxes
# n_anchors is the k of k-means
# The return value centroids is the list of n_anchors initialized centroids
def init_centroids(boxes, n_anchors):
    centroids = []
    boxes_num = len(boxes)

    centroid_index = np.random.choice(boxes_num)
    centroids.append(boxes[centroid_index])
    print(centroids[0].w, centroids[0].h)

    for centroid_index in range(0, n_anchors - 1):
        sum_distance = 0
        distance_thresh = 0
        distance_list = []
        cur_sum = 0

        for box in boxes:
            min_distance = 1
            for centroid_i, centroid in enumerate(centroids):
                distance = (1 - box_iou(box, centroid))
                if distance < min_distance:
                    min_distance = distance
            sum_distance += min_distance
            distance_list.append(min_distance)

        distance_thresh = sum_distance * np.random.random()

        for i in range(0, boxes_num):
            cur_sum += distance_list[i]
            if cur_sum > distance_thresh:
                centroids.append(boxes[i])
                print(boxes[i].w, boxes[i].h)
                break

    return centroids


# Run one k-means iteration and recompute the centroids
# boxes is a list of Box objects for all bounding boxes
# n_anchors is the k of k-means
# centroids are the current cluster centers
# The return value new_centroids is the list of recomputed cluster centers
# The return value groups is the list of boxes assigned to each of the n_anchors clusters
# The return value loss is the sum of the distances from all boxes to their nearest centroid
def do_kmeans(n_anchors, boxes, centroids):
    loss = 0
    groups = []
    new_centroids = []
    for i in range(n_anchors):
        groups.append([])
        new_centroids.append(Box(0, 0, 0, 0))

    for box in boxes:
        min_distance = 1
        group_index = 0
        for centroid_index, centroid in enumerate(centroids):
            distance = (1 - box_iou(box, centroid))
            if distance < min_distance:
                min_distance = distance
                group_index = centroid_index
        groups[group_index].append(box)
        loss += min_distance
        new_centroids[group_index].w += box.w
        new_centroids[group_index].h += box.h

    for i in range(n_anchors):
        new_centroids[i].w /= len(groups[i])
        new_centroids[i].h /= len(groups[i])

    return new_centroids, groups, loss


# Compute n_anchors centroids for the given bounding boxes
# label_path is the path of the training-set list file
# n_anchors is the number of anchors
# loss_convergence is the minimum allowed change of the loss
# grid_size * grid_size is the number of grid cells
# iterations_num is the maximum number of iterations
# plus = 1 enables k-means++ initialization of the centroids
def compute_centroids(label_path, n_anchors, loss_convergence, grid_size, iterations_num, plus):
    boxes = []
    label_files = []
    f = open(label_path)
    for line in f:
        label_path = line.rstrip().replace('images', 'labels')
        label_path = label_path.replace('JPEGImages', 'labels')
        label_path = label_path.replace('.jpg', '.txt')
        label_path = label_path.replace('.JPEG', '.txt')
        label_files.append(label_path)
    f.close()

    for label_file in label_files:
        f = open(label_file)
        for line in f:
            temp = line.strip().split(" ")
            if len(temp) > 1:
                boxes.append(Box(0, 0, float(temp[3]), float(temp[4])))

    if plus:
        centroids = init_centroids(boxes, n_anchors)
    else:
        centroid_indices = np.random.choice(len(boxes), n_anchors)
        centroids = []
        for centroid_index in centroid_indices:
            centroids.append(boxes[centroid_index])

    # iterate k-means
    centroids, groups, old_loss = do_kmeans(n_anchors, boxes, centroids)
    iterations = 1
    while True:
        centroids, groups, loss = do_kmeans(n_anchors, boxes, centroids)
        iterations = iterations + 1
        print("loss = %f" % loss)
        if abs(old_loss - loss) < loss_convergence or iterations > iterations_num:
            break
        old_loss = loss

        for centroid in centroids:
            print(centroid.w * grid_size, centroid.h * grid_size)

    # print the final result
    for centroid in centroids:
        print("k-means result: \n")
        print(centroid.w * grid_size, centroid.h * grid_size)


label_path = "/home/chris/workspace/2007_train.txt"
n_anchors = 5
loss_convergence = 1e-6
grid_size = 13
iterations_num = 100
plus = 0
compute_centroids(label_path, n_anchors, loss_convergence, grid_size, iterations_num, plus)
```
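The list file passed as `label_path` (here `2007_train.txt`) contains image paths; the script maps each path to its darknet label file, where every line has the form `<class> <x_center> <y_center> <width> <height>` with all values normalized to [0, 1]. That is why `temp[3]` and `temp[4]` are the relative box width and height. Multiplying the clustered centroids by `grid_size = 13` expresses them in units of YOLOv2's 13x13 feature-map cells. A small illustrative check (the sample values below are made up):

```python
sample_label_line = "11 0.481 0.634 0.478 0.663"   # class x_c y_c w h (made-up values)
_, _, _, w, h = sample_label_line.split(" ")
print(float(w) * 13, float(h) * 13)                # width/height in 13x13 grid-cell units
```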
1.3 Running Results
```
k-means result: (8.979910714285644, 5.140624999999976)
k-means result: (4.5747282608690005, 7.813858695652043)
k-means result: (2.2546296296290005, 7.7939814814810005)
k-means result: (11.235351562499998, 9.699218750000407)
k-means result: (2.442095588236353, 3.5698529411762943)
```
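These five (width, height) pairs are already in 13x13-grid-cell units, which is the unit the YOLOv2 `[region]` layer expects in its `anchors=` entry. A small sketch that turns the result above into that line (values rounded to two decimals; paste the output into your cfg and keep `num=` equal to the number of pairs):

```python
# Centroids from the n_anchors = 5 run above, rounded for readability
centroids = [(8.98, 5.14), (4.57, 7.81), (2.25, 7.79), (11.24, 9.70), (2.44, 3.57)]
print("anchors = " + ", ".join("%.2f,%.2f" % wh for wh in centroids))
```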
1.4 Reference Resources
YOLOv2 obtains anchor boxes through k-means
2. Caffe-SSD Aspect Ratio Clustering
This section uses two programs: get_w_h.py, which extracts the width and height of every target in the dataset (this part of the reference program did not work well, hence the separate script), and the clustering and plotting program from the reference link.

get_w_h.py outputs the target widths and heights; arrange them into two columns (width and height) with Excel and save the result as data1.txt. Put data1.txt and K-means.py in the same directory, then run the Python file.
```python
# Line 67 of get_w_h.py: print the target width or height
# print(ob_w)
print(ob_h)
```
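K-means.py from the reference link is not reproduced here; the idea it implements can be sketched as follows, assuming data1.txt holds two whitespace-separated columns (target width and height, as produced above) and that scikit-learn is available. The file name and column layout follow the description above; everything else is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# data1.txt: one object per line, two columns -> width, height
data = np.loadtxt("data1.txt")
ratios = (data[:, 0] / data[:, 1]).reshape(-1, 1)   # aspect ratio w/h of each box

# Cluster the aspect ratios; the cluster centres are candidate values for the
# aspect_ratio settings of the SSD prior boxes
kmeans = KMeans(n_clusters=3, random_state=0).fit(ratios)
print(sorted(kmeans.cluster_centers_.flatten()))
```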