[programming art] Analysis of the darknet load_weights interface

Keywords: C AI Deep Learning source code analysis


  O_o   >_<   o_O   O_o   ~_~   o_O

  This article analyzes darknet's load_weights interface, which is used to load model weights.

1. darknet data loading process

    The previous articles introduced darknet's data loading process for object detection, covering the loading of the .data, .names and .cfg files.

    Next up is the load_weights interface, which loads the model weights from the .weights file.

2. load_weights interface

    Let's take a look at the interface call:

load_weights(&net, weightfile);

    Here net is the network instance and weightfile is the path to the weights file. Take a look at the implementation of load_weights:

/// parser.c
void load_weights(network *net, char *filename)
{
    load_weights_upto(net, filename, net->n);
}
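
    Note that load_weights passes cutoff = net->n, i.e. "load all layers". The loop inside load_weights_upto stops at i < cutoff, so calling it directly with a smaller cutoff loads only a prefix of the network, for example a pre-trained backbone. A minimal sketch (the file name and layer count are hypothetical):

// Load weights for the first 137 layers only; layers beyond the
// cutoff keep whatever initialization the .cfg parser gave them.
load_weights_upto(&net, "yolov4.conv.137", 137);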

     Now take a look at the load_weights_upto function itself:

/// parser.c
void load_weights_upto(network *net, char *filename, int cutoff)
{
#ifdef GPU
    if(net->gpu_index >= 0){
        cuda_set_device(net->gpu_index);                         // Set gpu_index
    } 
#endif
    fprintf(stderr, "Loading weights from %s...", filename);
    fflush(stdout);                                             // Force immediate output
    FILE *fp = fopen(filename, "rb");
    if(!fp) file_error(filename);

    int major;
    int minor;
    int revision;
    fread(&major, sizeof(int), 1, fp);                          // Loading of some flag bits
    fread(&minor, sizeof(int), 1, fp);
    fread(&revision, sizeof(int), 1, fp);
    if ((major * 10 + minor) >= 2) {
        printf("\n seen 64");
        uint64_t iseen = 0;
        fread(&iseen, sizeof(uint64_t), 1, fp);
        *net->seen = iseen;
    }
    else {
        printf("\n seen 32");
        uint32_t iseen = 0;
        fread(&iseen, sizeof(uint32_t), 1, fp);
        *net->seen = iseen;
    }
    *net->cur_iteration = get_current_batch(*net);
    printf(", trained: %.0f K-images (%.0f Kilo-batches_64) \n", (float)(*net->seen / 1000), (float)(*net->seen / 64000));
    int transpose = (major > 1000) || (minor > 1000);

    int i;
    for(i = 0; i < net->n && i < cutoff; ++i){                     // Identify different operators for weight loading
        layer l = net->layers[i];
        if (l.dontload) continue;
        if(l.type == CONVOLUTIONAL && l.share_layer == NULL){
            load_convolutional_weights(l, fp);
        }
        if (l.type == SHORTCUT && l.nweights > 0) {
            load_shortcut_weights(l, fp);
        }
        if (l.type == IMPLICIT) {
            load_implicit_weights(l, fp);
        }
        if(l.type == CONNECTED){
            load_connected_weights(l, fp, transpose);
        }
        if(l.type == BATCHNORM){
            load_batchnorm_weights(l, fp);
        }
        if(l.type == CRNN){
            load_convolutional_weights(*(l.input_layer), fp);
            load_convolutional_weights(*(l.self_layer), fp);
            load_convolutional_weights(*(l.output_layer), fp);
        }
        if(l.type == RNN){
            load_connected_weights(*(l.input_layer), fp, transpose);
            load_connected_weights(*(l.self_layer), fp, transpose);
            load_connected_weights(*(l.output_layer), fp, transpose);
        }
        if(l.type == GRU){
            load_connected_weights(*(l.input_z_layer), fp, transpose);
            load_connected_weights(*(l.input_r_layer), fp, transpose);
            load_connected_weights(*(l.input_h_layer), fp, transpose);
            load_connected_weights(*(l.state_z_layer), fp, transpose);
            load_connected_weights(*(l.state_r_layer), fp, transpose);
            load_connected_weights(*(l.state_h_layer), fp, transpose);
        }
        if(l.type == LSTM){
            load_connected_weights(*(l.wf), fp, transpose);
            load_connected_weights(*(l.wi), fp, transpose);
            load_connected_weights(*(l.wg), fp, transpose);
            load_connected_weights(*(l.wo), fp, transpose);
            load_connected_weights(*(l.uf), fp, transpose);
            load_connected_weights(*(l.ui), fp, transpose);
            load_connected_weights(*(l.ug), fp, transpose);
            load_connected_weights(*(l.uo), fp, transpose);
        }
        if (l.type == CONV_LSTM) {
            if (l.peephole) {
                load_convolutional_weights(*(l.vf), fp);
                load_convolutional_weights(*(l.vi), fp);
                load_convolutional_weights(*(l.vo), fp);
            }
            load_convolutional_weights(*(l.wf), fp);
            if (!l.bottleneck) {
                load_convolutional_weights(*(l.wi), fp);
                load_convolutional_weights(*(l.wg), fp);
                load_convolutional_weights(*(l.wo), fp);
            }
            load_convolutional_weights(*(l.uf), fp);
            load_convolutional_weights(*(l.ui), fp);
            load_convolutional_weights(*(l.ug), fp);
            load_convolutional_weights(*(l.uo), fp);
        }
        if(l.type == LOCAL){
            int locations = l.out_w*l.out_h;
            int size = l.size*l.size*l.c*l.n*locations;
            fread(l.biases, sizeof(float), l.outputs, fp);
            fread(l.weights, sizeof(float), size, fp);
#ifdef GPU
            if(gpu_index >= 0){
                push_local_layer(l);
            }
#endif
        }
        if (feof(fp)) break;
    }
    fprintf(stderr, "Done! Loaded %d layers from weights-file \n", i);
    fclose(fp);
}
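
     For context, this interface is typically paired with the .cfg parser. A sketch of the usual call sequence (file paths are placeholders, and parse_network_cfg is the parser entry point in parser.c):

network net = parse_network_cfg("cfg/yolov4.cfg");  // build the layer list from the .cfg
load_weights(&net, "yolov4.weights");               // fill each layer from the .weights file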

     A few parts of the above are not obvious at first glance, for example this section:

int major;
int minor;
int revision;
fread(&major, sizeof(int), 1, fp);
fread(&minor, sizeof(int), 1, fp);
fread(&revision, sizeof(int), 1, fp);

     This is best understood alongside the interface that saves weights: load_weights is the decoding counterpart of save_weights. Take a look at the first part of save_weights_upto:

void save_weights_upto(network net, char *filename, int cutoff, int save_ema)
{
#ifdef GPU
    if(net.gpu_index >= 0){
        cuda_set_device(net.gpu_index);
    }
#endif
    fprintf(stderr, "Saving weights to %s\n", filename);
    FILE *fp = fopen(filename, "wb");
    if(!fp) file_error(filename);

    int major = MAJOR_VERSION;
    int minor = MINOR_VERSION;
    int revision = PATCH_VERSION;
    fwrite(&major, sizeof(int), 1, fp);              // write major first
    fwrite(&minor, sizeof(int), 1, fp);              // then minor
    fwrite(&revision, sizeof(int), 1, fp);           // then revision
    (*net.seen) = get_current_iteration(net) * net.batch * net.subdivisions; // remove this line, when you will save to weights-file both: seen & cur_iteration
    fwrite(net.seen, sizeof(uint64_t), 1, fp);       // finally net.seen
......
}

  From the save_weights interface above we can see the layout of a darknet weights file: a header of several fields (major, minor, revision and net.seen), followed by the weight data of each layer stored consecutively. With that in mind, the decoding in load_weights is straightforward. These version parameters are defined as macros:

/// version.h
#define MAJOR_VERSION 0
#define MINOR_VERSION 2
#define PATCH_VERSION 5
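
     With MAJOR_VERSION 0 and MINOR_VERSION 2, major * 10 + minor evaluates to 2, so files written by current darknet take the uint64_t branch for seen. Below is a minimal standalone sketch (not part of darknet) that decodes only this header, mirroring the fread calls in load_weights_upto:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file.weights\n", argv[0]); return 1; }
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) { perror("fopen"); return 1; }

    int major = 0, minor = 0, revision = 0;
    fread(&major, sizeof(int), 1, fp);
    fread(&minor, sizeof(int), 1, fp);
    fread(&revision, sizeof(int), 1, fp);

    uint64_t seen = 0;
    if (major * 10 + minor >= 2) {                   // 0.2.x and later: 64-bit seen
        fread(&seen, sizeof(uint64_t), 1, fp);
    } else {                                         // older files: 32-bit seen
        uint32_t seen32 = 0;
        fread(&seen32, sizeof(uint32_t), 1, fp);
        seen = seen32;
    }
    printf("version %d.%d.%d, seen %llu images\n",
           major, minor, revision, (unsigned long long)seen);
    fclose(fp);
    return 0;
}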

     Back in load_weights, after these header fields are read, the weights of each layer are loaded. For convolutional layers, the loading logic splits into two cases:

    (1) plain conv: fread loads the biases and weights with their exact sizes;

    (2) conv + batchnorm: fread loads the biases, scales, rolling_mean, rolling_variance and weights with their exact sizes.
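
     To make the sizes concrete, here is a small sketch (not part of darknet) that counts how many floats one convolutional layer occupies in the file, assuming the nweights formula used by make_convolutional_layer in the darknet source:

#include <stddef.h>

/* Floats one convolutional layer occupies in a .weights file, in read
   order: biases, optional batchnorm statistics, then the weights.
   Assumes nweights = (c / groups) * size * size * n. */
size_t conv_floats_in_file(int n, int c, int size, int groups, int batch_normalize)
{
    size_t count = n;                                 // biases
    if (batch_normalize) count += 3 * (size_t)n;      // scales, rolling_mean, rolling_variance
    count += (size_t)(c / groups) * size * size * n;  // weights
    return count;
}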

     Look at the implementation:

/// parser.c
void load_convolutional_weights(layer l, FILE *fp)
{
    if(l.binary){
        //load_convolutional_weights_binary(l, fp);
        //return;
    }
    int num = l.nweights;
    int read_bytes;
    read_bytes = fread(l.biases, sizeof(float), l.n, fp);               // load biases
    if (read_bytes > 0 && read_bytes < l.n) printf("\n Warning: Unexpected end of wights-file! l.biases - l.index = %d \n", l.index);
    //fread(l.weights, sizeof(float), num, fp); // as in connected layer
    if (l.batch_normalize && (!l.dontloadscales)){
        read_bytes = fread(l.scales, sizeof(float), l.n, fp);          // load scales
        if (read_bytes > 0 && read_bytes < l.n) printf("\n Warning: Unexpected end of wights-file! l.scales - l.index = %d \n", l.index);
        read_bytes = fread(l.rolling_mean, sizeof(float), l.n, fp);    // load rolling_mean
        if (read_bytes > 0 && read_bytes < l.n) printf("\n Warning: Unexpected end of wights-file! l.rolling_mean - l.index = %d \n", l.index);
        read_bytes = fread(l.rolling_variance, sizeof(float), l.n, fp);   // load rolling_variance
        if (read_bytes > 0 && read_bytes < l.n) printf("\n Warning: Unexpected end of wights-file! l.rolling_variance - l.index = %d \n", l.index);
        if(0){
            int i;
            for(i = 0; i < l.n; ++i){
                printf("%g, ", l.rolling_mean[i]);
            }
            printf("\n");
            for(i = 0; i < l.n; ++i){
                printf("%g, ", l.rolling_variance[i]);
            }
            printf("\n");
        }
        if(0){
            fill_cpu(l.n, 0, l.rolling_mean, 1);
            fill_cpu(l.n, 0, l.rolling_variance, 1);
        }
    }
    read_bytes = fread(l.weights, sizeof(float), num, fp);          // load weights
    if (read_bytes > 0 && read_bytes < l.n) printf("\n Warning: Unexpected end of wights-file! l.weights - l.index = %d \n", l.index);
    //if(l.adam){
    //    fread(l.m, sizeof(float), num, fp);
    //    fread(l.v, sizeof(float), num, fp);
    //}
    //if(l.c == 3) scal_cpu(num, 1./256, l.weights, 1);
    if (l.flipped) {
        transpose_matrix(l.weights, (l.c/l.groups)*l.size*l.size, l.n);
    }
    //if (l.binary) binarize_weights(l.weights, l.n, (l.c/l.groups)*l.size*l.size, l.weights);
#ifdef GPU
    if(gpu_index >= 0){
        push_convolutional_layer(l);
    }
#endif
}
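
     For a feel of the numbers, take a hypothetical layer: a 3x3 convolution with 16 input channels, 32 filters, groups = 1 and batch normalization enabled (a worked example, not from any real .cfg):

#include <stdio.h>

int main(void)
{
    int c = 16, n = 32, size = 3, groups = 1;
    int nweights = (c / groups) * size * size * n;   // 16*3*3*32 = 4608
    int total = n + 3 * n + nweights;                // biases + bn stats + weights = 4736
    printf("%d floats, %zu bytes\n", total, total * sizeof(float)); // 4736 floats, 18944 bytes
    return 0;
}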

    A word on fread, which the framework relies on throughout its data-reading code. Its C signature is:

size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream)

    Parameter description:

  • ptr: pointer to a block of memory at least size * nmemb bytes long;
  • size: the size in bytes of each element to be read;
  • nmemb: the number of elements, each size bytes long;
  • stream: pointer to the FILE object that specifies the input stream.

  Return value: the number of elements successfully read, as a size_t. On success this equals nmemb; a smaller value means a read error occurred or the end of the file was reached.
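
  That return-value convention is exactly what the warnings in load_convolutional_weights rely on. A minimal sketch of checked reading in the same style:

#include <stdio.h>
#include <stdlib.h>

/* Read `count` floats, comparing fread's return value against the
   element count requested, as darknet's loaders do. */
static void read_floats_checked(float *dst, size_t count, FILE *fp)
{
    size_t got = fread(dst, sizeof(float), count, fp);
    if (got < count) {
        if (feof(fp))        fprintf(stderr, "unexpected end of file\n");
        else if (ferror(fp)) fprintf(stderr, "read error\n");
        exit(1);
    }
}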

  Well, that wraps up the analysis of darknet's load_weights interface and the layout of the weights file. Combined with the previous articles, this completes the walkthrough of the data loading part of darknet object detection. I hope my sharing can be of some help to your learning.
