[Dlib] Using dlib_face_recognition_resnet_model_v1.dat for fine-tuning

Keywords: network

1. Problem description

dlib's official face recognition model is a ResNet trained on about 3 million face images. The trained network parameters are saved in dlib_face_recognition_resnet_model_v1.dat.
On the LFW benchmark the model reaches 99.13% accuracy. On my own data, however, the accuracy was noticeably lower, so I wanted to fine-tune the model on my own data.
After a lot of trial and error, I ultimately failed (I am an AI beginner).

2. Cause analysis

The reason is that the training network and the test network are different types. dlib_face_recognition_resnet_model_v1.dat stores the serialized parameters of the test network.
The training network and the test network are defined as follows:

template <template <int,template<typename>class,int,typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual = add_prev1<block<N,BN,1,tag1<SUBNET>>>;

template <template <int,template<typename>class,int,typename> class block, int N, template<typename>class BN, typename SUBNET>
using residual_down = add_prev2<avg_pool<2,2,2,2,skip1<tag2<block<N,BN,2,tag1<SUBNET>>>>>>;

template <int N, template <typename> class BN, int stride, typename SUBNET> 
using block  = BN<con<N,3,3,1,1,relu<BN<con<N,3,3,stride,stride,SUBNET>>>>>;


template <int N, typename SUBNET> using res       = relu<residual<block,N,bn_con,SUBNET>>;
template <int N, typename SUBNET> using ares      = relu<residual<block,N,affine,SUBNET>>;
template <int N, typename SUBNET> using res_down  = relu<residual_down<block,N,bn_con,SUBNET>>;
template <int N, typename SUBNET> using ares_down = relu<residual_down<block,N,affine,SUBNET>>;

// ----------------------------------------------------------------------------------------

template <typename SUBNET> using level0 = res_down<256,SUBNET>;
template <typename SUBNET> using level1 = res<256,res<256,res_down<256,SUBNET>>>;
template <typename SUBNET> using level2 = res<128,res<128,res_down<128,SUBNET>>>;
template <typename SUBNET> using level3 = res<64,res<64,res<64,res_down<64,SUBNET>>>>;
template <typename SUBNET> using level4 = res<32,res<32,res<32,SUBNET>>>;

template <typename SUBNET> using alevel0 = ares_down<256,SUBNET>;
template <typename SUBNET> using alevel1 = ares<256,ares<256,ares_down<256,SUBNET>>>;
template <typename SUBNET> using alevel2 = ares<128,ares<128,ares_down<128,SUBNET>>>;
template <typename SUBNET> using alevel3 = ares<64,ares<64,ares<64,ares_down<64,SUBNET>>>>;
template <typename SUBNET> using alevel4 = ares<32,ares<32,ares<32,SUBNET>>>;


// training network type
using net_type = loss_metric<fc_no_bias<128,avg_pool_everything<
                            level0<
                            level1<
                            level2<
                            level3<
                            level4<
                            max_pool<3,3,2,2,relu<bn_con<con<32,7,7,2,2,
                            input_rgb_image
                            >>>>>>>>>>>>;

// testing network type (replaced batch normalization with fixed affine transforms)
using anet_type = loss_metric<fc_no_bias<128,avg_pool_everything<
                            alevel0<
                            alevel1<
                            alevel2<
                            alevel3<
                            alevel4<
                            max_pool<3,3,2,2,relu<affine<con<32,7,7,2,2,
                            input_rgb_image
                            >>>>>>>>>>>>;

The main difference between the training network net_type and the test network anet_type is that affine replaces bn_con.

If dlib_face_recognition_resnet_model_v1.dat is deserialized into net_type, an error is reported and the program crashes:

terminate called after throwing an instance of 'dlib::serialization_error'
  what():  An error occurred while trying to read the first object from the file dlib_face_recognition_resnet_model_v1.dat.
ERROR: Unexpected version 'affine_' found while deserializing dlib::bn_.

Aborted (core dumped)

The error message shows that dlib_face_recognition_resnet_model_v1.dat stores affine_ layers, i.e. the test network.
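To make the failure mode concrete, here is a minimal sketch of a type-tag check during deserialization. It is not dlib's code: `toy_affine`, `toy_bn`, the tag strings, and `load_affine_bytes_as_bn` are all hypothetical names invented for illustration. The only assumption is that each layer writes an identifying tag before its parameters and that the reader rejects a tag it does not expect, which is what the error text above suggests.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Toy stand-ins for two layer types (hypothetical, NOT dlib's classes).
struct toy_affine { float gamma = 1.0f; float beta = 0.0f; };
struct toy_bn     { float gamma = 1.0f; float beta = 0.0f; };

// The writer stores a type tag first, then the parameters.
void serialize(const toy_affine& layer, std::ostream& out) {
    out << "affine_ " << layer.gamma << ' ' << layer.beta << ' ';
}

// The reader refuses any stream whose tag is not the one it expects.
void deserialize(toy_bn& layer, std::istream& in) {
    std::string version;
    in >> version;
    if (version != "bn_")
        throw std::runtime_error("Unexpected version '" + version +
                                 "' found while deserializing toy_bn.");
    in >> layer.gamma >> layer.beta;
}

// Writes a toy_affine and tries to read the same bytes back as a toy_bn.
// Returns the exception text, or an empty string if loading succeeded.
std::string load_affine_bytes_as_bn() {
    std::stringstream file;
    serialize(toy_affine{}, file);
    toy_bn target;
    try {
        deserialize(target, file);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

Calling `load_affine_bytes_as_bn()` returns a message of the same shape as the one in the crash log: the bytes were written by one layer type and read by another.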

The official dlib example dnn_metric_learning_on_images_ex.cpp contains this line:

anet_type testing_net = net;

This shows that the training network net_type can be converted automatically into the test network anet_type.
The reason is that the dlib source code contains a constructor that converts bn_ to affine_:

    class affine_
    {
    public:
        template <
            layer_mode bnmode
            >
        affine_(
            const bn_<bnmode>& item
        )
     ...
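The following sketch shows what such a one-way conversion typically computes. It assumes the conversion folds the batch-norm running statistics into a fixed per-channel scale and shift, which is the usual way to freeze batch normalization for inference. The constructor body is elided in the excerpt above and is not reproduced here; `toy_bn`, `toy_affine`, and `bn_inference` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy per-channel batch-norm state (hypothetical, NOT dlib's bn_).
struct toy_bn {
    std::vector<float> gamma, beta;   // learned scale and shift
    std::vector<float> running_mean;  // statistics collected during training
    std::vector<float> running_var;
    float eps = 1e-5f;
};

// Toy fixed affine layer: y = gamma * x + beta per channel.
struct toy_affine {
    std::vector<float> gamma, beta;

    toy_affine() = default;

    // Converting constructor, bn -> affine only. Folding:
    //   gamma' = gamma / sqrt(var + eps)
    //   beta'  = beta - gamma' * mean
    toy_affine(const toy_bn& bn) : gamma(bn.gamma.size()), beta(bn.beta.size()) {
        for (std::size_t c = 0; c < gamma.size(); ++c) {
            gamma[c] = bn.gamma[c] / std::sqrt(bn.running_var[c] + bn.eps);
            beta[c]  = bn.beta[c] - gamma[c] * bn.running_mean[c];
        }
    }

    float apply(float x, std::size_t c) const { return gamma[c] * x + beta[c]; }
};

// Batch norm in inference mode, using the stored running statistics.
float bn_inference(const toy_bn& bn, float x, std::size_t c) {
    return bn.gamma[c] * (x - bn.running_mean[c]) /
               std::sqrt(bn.running_var[c] + bn.eps) + bn.beta[c];
}
```

With this constructor, `toy_affine frozen = some_bn;` compiles in the same way as `anet_type testing_net = net;`, while the opposite assignment does not, because no constructor exists in that direction. Note that the folding discards information: four vectors go in and two come out.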

Deserializing dlib_face_recognition_resnet_model_v1.dat into the test network anet_type works normally. If the test network anet_type could be converted back into the training network net_type, fine-tuning would be possible.
As a beginner, I haven't found a way to do that...
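Purely as arithmetic, the reverse direction is not unique, but one choice reproduces the affine layer exactly in inference mode: keep gamma and beta, set the running mean to 0 and the running variance to 1 - eps. The sketch below only demonstrates that identity on toy types. Everything in it is hypothetical (`toy_affine`, `toy_bn`, `bn_from_affine`), and whether dlib exposes a supported way to write these values into its own bn_ layers is not something this post establishes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy layer types (hypothetical, NOT dlib's affine_ / bn_).
struct toy_affine {
    std::vector<float> gamma, beta;
};

struct toy_bn {
    std::vector<float> gamma, beta, running_mean, running_var;
    float eps = 1e-5f;
};

// One possible inverse of the bn -> affine folding. With mean = 0 and
// var = 1 - eps, sqrt(var + eps) is 1, so the batch-norm inference formula
// reduces to gamma * x + beta, i.e. the original affine layer.
toy_bn bn_from_affine(const toy_affine& a, float eps = 1e-5f) {
    toy_bn bn;
    bn.eps = eps;
    bn.gamma = a.gamma;
    bn.beta = a.beta;
    bn.running_mean.assign(a.gamma.size(), 0.0f);
    bn.running_var.assign(a.gamma.size(), 1.0f - eps);
    return bn;
}

// Fixed affine transform for channel c.
float affine_apply(const toy_affine& a, float x, std::size_t c) {
    return a.gamma[c] * x + a.beta[c];
}

// Batch norm in inference mode, using the stored running statistics.
float bn_inference(const toy_bn& bn, float x, std::size_t c) {
    return bn.gamma[c] * (x - bn.running_mean[c]) /
               std::sqrt(bn.running_var[c] + bn.eps) + bn.beta[c];
}
```

The two layers agree only while the batch-norm layer uses its stored statistics. Once training resumes, batch normalization normalizes with the statistics of each mini-batch, so the outputs would differ from the affine network at the start of fine-tuning.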

Posted by scast on Mon, 11 Nov 2019 06:54:38 -0800