http://blog.csdn.net/runatworld/article/details/50774215
Today we talk about the BP (Back Propagation) neural network, one of the most basic neural-network models in machine learning. It is widely used in many fields, such as function approximation, pattern recognition, classification, data compression and data mining. Below, the principle and an implementation of the BP neural network are introduced.
Contents
1. Understanding the BP Neural Network
2. Choosing the Hidden Layer
3. The Forward Transfer Subprocess
4. The Reverse Transfer Subprocess
5. Notes on the BP Neural Network
6. C++ Implementation of the BP Neural Network
1. Understanding the BP Neural Network
The training of a Back Propagation (BP) neural network is divided into two sub-processes:
(1) the forward transmission sub-process of the working signal
(2) the backward transmission sub-process of the error signal
In a BP neural network, a single sample has $m$ inputs and $n$ outputs, and there are usually several hidden layers between the input layer and the output layer. In 1989, Robert Hecht-Nielsen proved that a BP network with a single hidden layer can approximate any continuous function on a closed interval; this is the universal approximation theorem. So a three-layer BP network can accomplish an arbitrary mapping from $m$ dimensions to $n$ dimensions. The three layers are the input layer (I), the hidden layer (H), and the output layer (O), as shown in the following illustration.
2. Choosing the Hidden Layer
In a BP neural network, the numbers of nodes in the input layer and in the output layer are fixed by the problem, while the number of nodes in the hidden layer is not. How many should be used? In fact, the number of hidden layer nodes affects the performance of the network, and an empirical formula is often used to choose it:

$$h = \sqrt{m + n} + a$$

where $h$ is the number of hidden layer nodes, $m$ the number of input layer nodes, $n$ the number of output layer nodes, and $a$ an adjustment constant between 1 and 10.
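For illustration, a minimal sketch of this formula in C++ (the function name and the choice $a = 5$ are ours; the implementation in section 6 makes the same choice in GetNums()):
- #include <cmath>
- // Empirical hidden-layer size: h = sqrt(m + n) + a.
- // The adjustment constant a = 5 is only one reasonable choice
- // (the implementation in section 6 uses the same value).
- int HiddenNodes(int in_num, int ou_num, int a = 5)
- {
-     return (int)std::sqrt((double)(in_num + ou_num)) + a;
- }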
3. The Forward Transfer Subprocess
Denote by $w_{ij}$ the weight between node $i$ and node $j$, by $b_j$ the threshold of node $j$, and by $x_j$ the output value of node $j$. The output value of each node is computed from the output values of all nodes in the previous layer, the weights between the current node and those nodes, the threshold of the current node, and the activation function. The specific calculation is

$$S_j = \sum_{i=0}^{m-1} w_{ij}\, x_i + b_j, \qquad x_j = f(S_j)$$

where $f$ is the activation function, generally chosen as an S-type (sigmoid) function or a linear function.

The forward transfer process is relatively simple and can be computed layer by layer according to the formula above. Note that in a BP neural network the input layer nodes have no threshold.
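As a small stand-alone sketch of this computation (the names are hypothetical; the class in section 6 does the same thing in ForwardTransfer()):
- #include <cmath>
- #include <vector>
- // Forward step for one layer: S_j = sum_i w[i][j]*x[i] + b[j], out_j = f(S_j).
- // f is the S-type function A/(1+exp(-x/B)); A and B are illustrative values here.
- std::vector<double> ForwardLayer(const std::vector<double>& x,
-                                  const std::vector<std::vector<double>>& w,
-                                  const std::vector<double>& b,
-                                  double A = 30.0, double B = 10.0)
- {
-     std::vector<double> out(b.size());
-     for (size_t j = 0; j < b.size(); j++)
-     {
-         double s = b[j];                       // threshold of node j
-         for (size_t i = 0; i < x.size(); i++)  // weighted sum over previous layer
-             s += w[i][j] * x[i];
-         out[j] = A / (1.0 + std::exp(-s / B)); // activation f(S_j)
-     }
-     return out;
- }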
4. The Reverse Transfer Subprocess
Back-propagation of the error signal is the more involved sub-process of a BP neural network; it is based on the Widrow-Hoff learning rule. Suppose the outputs of the output layer are $y_j$ and the corresponding expected outputs are $d_j$. The error function is

$$E(w, b) = \frac{1}{2}\sum_{j=0}^{n-1} \left(d_j - y_j\right)^2$$
The main purpose of BP training is to revise the weights and thresholds repeatedly so as to minimize this error function. The Widrow-Hoff learning rule continuously adjusts the weights and thresholds of the network along the steepest-descent direction of the sum of squared errors; according to the gradient descent method, the correction of a weight is proportional to the gradient of $E(w,b)$ at the current position. For the $j$-th output node,

$$\Delta w_{ij} = -\eta \,\frac{\partial E(w,b)}{\partial w_{ij}}$$
Suppose the chosen activation function is

$$f(x) = \frac{A}{1 + e^{-x/B}}$$

Differentiating the activation function gives

$$f'(x) = \frac{A\, e^{-x/B}}{B\left(1 + e^{-x/B}\right)^2} = \frac{f(x)\left(A - f(x)\right)}{A B}$$
Next, for the weight $w_{ij}$ between the $i$-th hidden node and the $j$-th output node, we have

$$\frac{\partial E(w,b)}{\partial w_{ij}} = (y_j - d_j)\, f'(S_j)\, x_i = \delta_{ij}\, x_i$$

where

$$\delta_{ij} = (y_j - d_j)\, f'(S_j) = (y_j - d_j)\,\frac{y_j\,(A - y_j)}{A B}$$

Similarly, for the threshold $b_j$ we have

$$\frac{\partial E(w,b)}{\partial b_j} = (y_j - d_j)\, f'(S_j) = \delta_{ij}$$
This is the famous $\delta$ (delta) learning rule: reduce the error between the actual and the expected output of the system by adjusting the connection weights between neurons. It is also called the Widrow-Hoff learning rule or the error-correction learning rule.
The above computes the adjustments of the weights between the hidden layer and the output layer and of the thresholds of the output layer. Computing the adjustments of the weights between the input layer and the hidden layer, and of the thresholds of the hidden layer, is more involved. Suppose $v_{ki}$ is the weight between the $k$-th node of the input layer and the $i$-th node of the hidden layer; then

$$\frac{\partial E(w,b)}{\partial v_{ki}} = \delta_{ki}\, x_k$$

where

$$\delta_{ki} = \left(\sum_{j=0}^{n-1} \delta_{ij}\, w_{ij}\right) f'(S_i) = \left(\sum_{j=0}^{n-1} \delta_{ij}\, w_{ij}\right) \frac{x_i\,(A - x_i)}{A B}$$
This is the $\delta$ learning rule applied one layer deeper: the output-layer error is propagated back to the hidden layer through the output-layer weights.
With the above formulas, according to the gradient descent method, the weights and thresholds between the hidden layer and the output layer are adjusted as

$$w_{ij} \leftarrow w_{ij} - \eta_1\, \delta_{ij}\, x_i, \qquad b_j \leftarrow b_j - \eta_2\, \delta_{ij}$$

and the weights between the input layer and the hidden layer and the thresholds of the hidden layer are adjusted in the same way:

$$v_{ki} \leftarrow v_{ki} - \eta_1\, \delta_{ki}\, x_k, \qquad b_i \leftarrow b_i - \eta_2\, \delta_{ki}$$
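As a sketch of this update step (a hypothetical stand-alone function; UpdateNetWork() in section 6 performs the same corrections):
- #include <vector>
- // Gradient-descent step for one layer:
- //   w[i][j] -= eta_w * delta[j] * x[i];   b[j] -= eta_b * delta[j];
- void UpdateLayer(std::vector<std::vector<double>>& w, std::vector<double>& b,
-                  const std::vector<double>& x, const std::vector<double>& delta,
-                  double eta_w, double eta_b)
- {
-     for (size_t i = 0; i < x.size(); i++)
-         for (size_t j = 0; j < delta.size(); j++)
-             w[i][j] -= eta_w * delta[j] * x[i]; // weight correction
-     for (size_t j = 0; j < delta.size(); j++)
-         b[j] -= eta_b * delta[j];               // threshold correction
- }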
So far, the principle of the BP neural network has basically been covered.
5. Notes on the BP Neural Network
A BP neural network is generally used for classification or function approximation. For classification, the activation function usually chosen is the sigmoid function or a hard-limit function; for function approximation, the output layer nodes use a linear activation function, i.e. $f(x) = x$.
When training on data, a BP neural network can use either incremental learning or batch learning. Incremental learning requires the input patterns to be presented in a sufficiently random order and is sensitive to noise in the input patterns, that is, it trains poorly on input patterns that change drastically; it is suitable for online processing. Batch learning has no input-ordering problem and better stability, but it is only suitable for offline processing.
Defects of the standard BP neural network:
(1) It easily falls into a local minimum and fails to reach the global optimum.
The error surface of a BP network has many local minima, so training easily gets stuck in one. This requires the initial weights and thresholds to be sufficiently random; in practice the network is trained several times with different random initializations (a sketch of such an initialization is given after this list).
(2) The more training iterations, the lower the learning efficiency and the slower the convergence.
(3) The choice of the hidden layer lacks theoretical guidance.
(4) During training, the network tends to forget old samples while learning new ones.
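For defect (1), a minimal sketch of a random initialization of InitNetWork() is shown below; the version in section 6 simply zeros the arrays, and the range [-0.5, 0.5] used here is just one common choice:
- #include <cstdlib>
- #include <ctime>
- // Initialize weights and thresholds with small random values in [-0.5, 0.5]
- // instead of zeros; repeating training with different seeds helps avoid
- // poor local minima.
- void BP::InitNetWork()
- {
-     srand((unsigned)time(NULL));
-     for (int l = 0; l < LAYER; l++)
-         for (int i = 0; i < NUM; i++)
-         {
-             b[l][i] = rand() / (Type)RAND_MAX - 0.5;
-             for (int j = 0; j < NUM; j++)
-                 w[l][i][j] = rand() / (Type)RAND_MAX - 0.5;
-         }
- }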
Improvements to the BP algorithm:
(1) Adding a momentum term
A momentum term is introduced to accelerate the convergence of the algorithm, i.e.

$$\Delta w(n) = -\eta \,\frac{\partial E}{\partial w} + \alpha\, \Delta w(n-1)$$

where the momentum factor $\alpha$ is generally chosen between 0 and 1 (a common choice is around 0.9).
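A sketch of a weight update with a momentum term (not part of the implementation in section 6; the previous correction has to be stored, and the parameter defaults here are illustrative):
- #include <vector>
- // w(n+1) = w(n) - eta * dE/dw + alpha * (w(n) - w(n-1))
- // prev_dw stores the previous correction; alpha is the momentum factor.
- void UpdateWithMomentum(std::vector<double>& w, const std::vector<double>& grad,
-                         std::vector<double>& prev_dw,
-                         double eta = 0.0035, double alpha = 0.9)
- {
-     for (size_t i = 0; i < w.size(); i++)
-     {
-         double dw = -eta * grad[i] + alpha * prev_dw[i]; // add momentum term
-         w[i] += dw;
-         prev_dw[i] = dw;                                 // remember for next step
-     }
- }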
(2) Adaptive adjustment of the learning rate
(3) Introducing a steepness factor
Normally, a BP neural network normalizes the data before training, mapping the data into a smaller interval such as [0,1] or [-1,1].
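For example, a minimal min-max normalization sketch mapping a vector of values into [0,1] (a hypothetical helper, not part of the code below):
- #include <algorithm>
- #include <vector>
- // Map each value linearly from [min, max] to [0, 1].
- void Normalize01(std::vector<double>& v)
- {
-     double lo = *std::min_element(v.begin(), v.end());
-     double hi = *std::max_element(v.begin(), v.end());
-     if (hi == lo) return;                 // avoid division by zero
-     for (double& x : v)
-         x = (x - lo) / (hi - lo);
- }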
6. C++ Implementation of the BP Neural Network
The C++ implementation of the BP neural network consists of the following files.
BP.h:
- #ifndef _BP_H_
- #define _BP_H_
- #include <vector>
- #define LAYER 3 // Three-Layer Neural Network
- #define NUM 10 // Maximum number of nodes per layer
- #define A 30.0
- #define B 10.0 // A and B are parameters of the S-type (sigmoid) function
- #define ITERS 1000 // Maximum number of training iterations
- #define ETA_W 0.0035 // Weight adjustment rate
- #define ETA_B 0.001 // Threshold adjustment rate
- #define ERROR 0.002 // Permissible error for a single sample
- #define ACCU 0.005 // Permissible error per iteration
- #define Type double
- #define Vector std::vector
- struct Data
- {
- Vector<Type> x; //Input data
- Vector<Type> y; //output data
- };
- class BP{
- public:
- void GetData(const Vector<Data>);
- void Train();
- Vector<Type> ForeCast(const Vector<Type>);
- private:
- void InitNetWork(); //Initialize the network
- void GetNums(); //Get the number of input, output, and hidden layer nodes
- void ForwardTransfer(); //Forward Propagator Subprocess
- void ReverseTransfer(int); //Reverse Propagation Subprocess
- void CalcDelta(int); //Calculate the adjustments of w and b
- void UpdateNetWork(); //Update weights and thresholds
- Type GetError(int); //Calculating the Error of a Single Sample
- Type GetAccu(); //Calculating the Accuracy of All Samples
- Type Sigmoid(const Type); //Calculate the value of Sigmoid
- private:
- int in_num; //Number of input layer nodes
- int ou_num; //Number of Output Layer Nodes
- int hd_num; //Number of Hidden Layer Nodes
- Vector<Data> data; //Input and output data
- Type w[LAYER][NUM][NUM]; //Weights of BP Network
- Type b[LAYER][NUM]; //Threshold of BP Network Node
- Type x[LAYER][NUM]; //The output value of each neuron is transformed by S-type function, and the input layer is the original value.
- Type d[LAYER][NUM]; //Record the value of delta in delta learning rules
- };
- #endif //_BP_H_
BP.cpp:
- #include <string.h>
- #include <stdio.h>
- #include <math.h>
- #include <assert.h>
- #include "BP.h"
- //Obtain all training sample data
- void BP::GetData(const Vector<Data> _data)
- {
- data = _data;
- }
- //Start training
- void BP::Train()
- {
- printf("Begin to train BP NetWork!\n");
- GetNums();
- InitNetWork();
- int num = data.size();
- for(int iter = 0; iter <= ITERS; iter++)
- {
- for(int cnt = 0; cnt < num; cnt++)
- {
- //Layer 1 input node assignment
- for(int i = 0; i < in_num; i++)
- x[0][i] = data.at(cnt).x[i];
- while(1)
- {
- ForwardTransfer();
- if(GetError(cnt) < ERROR) //If the error is small, jump out of the loop for a single sample
- break;
- ReverseTransfer(cnt);
- }
- }
- printf("This is the %d th trainning NetWork !\n", iter);
- Type accu = GetAccu();
- printf("All Samples Accuracy is %lf\n", accu);
- if(accu < ACCU) break;
- }
- printf("The BP NetWork train End!\n");
- }
- //Predict the output value according to the trained network
- Vector<Type> BP::ForeCast(const Vector<Type> data)
- {
- int n = data.size();
- assert(n == in_num);
- for(int i = 0; i < in_num; i++)
- x[0][i] = data[i];
- ForwardTransfer();
- Vector<Type> v;
- for(int i = 0; i < ou_num; i++)
- v.push_back(x[2][i]);
- return v;
- }
- //Get the number of network nodes
- void BP::GetNums()
- {
- in_num = data[0].x.size(); //Get the number of input layer nodes
- ou_num = data[0].y.size(); //Get the number of output layer nodes
- hd_num = (int)sqrt((in_num + ou_num) * 1.0) + 5; //Getting Number of Hidden Layer Nodes
- if(hd_num > NUM) hd_num = NUM; //The number of hidden layers should not exceed the maximum setting
- }
- //Initialize the network
- void BP::InitNetWork()
- {
- memset(w, 0, sizeof(w)); //Initialize weights and thresholds to 0; small random values could also be used
- memset(b, 0, sizeof(b));
- }
- //Forward transmission sub-process of working signal
- void BP::ForwardTransfer()
- {
- //Calculating the Output Values of the Nodes in the Hidden Layer
- for(int j = 0; j < hd_num; j++)
- {
- Type t = 0;
- for(int i = 0; i < in_num; i++)
- t += w[1][i][j] * x[0][i];
- t += b[1][j];
- x[1][j] = Sigmoid(t);
- }
- //Calculate the output value of each node in the output layer
- for(int j = 0; j < ou_num; j++)
- {
- Type t = 0;
- for(int i = 0; i < hd_num; i++)
- t += w[2][i][j] * x[1][i];
- t += b[2][j];
- x[2][j] = Sigmoid(t);
- }
- }
- //Calculating the Error of a Single Sample
- Type BP::GetError(int cnt)
- {
- Type ans = 0;
- for(int i = 0; i < ou_num; i++)
- ans += 0.5 * (x[2][i] - data.at(cnt).y[i]) * (x[2][i] - data.at(cnt).y[i]);
- return ans;
- }
- //Error Signal Reverse Transfer Subprocess
- void BP::ReverseTransfer(int cnt)
- {
- CalcDelta(cnt);
- UpdateNetWork();
- }
- //Calculating the Accuracy of All Samples
- Type BP::GetAccu()
- {
- Type ans = 0;
- int num = data.size();
- for(int i = 0; i < num; i++)
- {
- int m = data.at(i).x.size();
- for(int j = 0; j < m; j++)
- x[0][j] = data.at(i).x[j];
- ForwardTransfer();
- int n = data.at(i).y.size();
- for(int j = 0; j < n; j++)
- ans += 0.5 * (x[2][j] - data.at(i).y[j]) * (x[2][j] - data.at(i).y[j]);
- }
- return ans / num;
- }
- //Calculate adjustment
- void BP::CalcDelta(int cnt)
- {
- //Calculating delta Value of Output Layer
- for(int i = 0; i < ou_num; i++)
- d[2][i] = (x[2][i] - data.at(cnt).y[i]) * x[2][i] * (A - x[2][i]) / (A * B);
- //Calculating delta Value of Implicit Layer
- for(int i = 0; i < hd_num; i++)
- {
- Type t = 0;
- for(int j = 0; j < ou_num; j++)
- t += w[2][i][j] * d[2][j];
- d[1][i] = t * x[1][i] * (A - x[1][i]) / (A * B);
- }
- }
- //Adjust BP network according to the calculated adjustment amount
- void BP::UpdateNetWork()
- {
- //Weight and Threshold Adjustment Between Implicit Layer and Output Layer
- for(int i = 0; i < hd_num; i++)
- {
- for(int j = 0; j < ou_num; j++)
- w[2][i][j] -= ETA_W * d[2][j] * x[1][i];
- }
- for(int i = 0; i < ou_num; i++)
- b[2][i] -= ETA_B * d[2][i];
- //Weight and Threshold Adjustment Between Input Layer and Implicit Layer
- for(int i = 0; i < in_num; i++)
- {
- for(int j = 0; j < hd_num; j++)
- w[1][i][j] -= ETA_W * d[1][j] * x[0][i];
- }
- for(int i = 0; i < hd_num; i++)
- b[1][i] -= ETA_B * d[1][i];
- }
- //Calculating the value of Sigmoid function
- Type BP::Sigmoid(const Type x)
- {
- return A / (1 + exp(-x / B));
- }
Test.cpp:
- #include <iostream>
- #include <string.h>
- #include <stdio.h>
- #include "BP.h"
- using namespace std;
- double sample[41][4]=
- {
- {0,0,0,0},
- {5,1,4,19.020},
- {5,3,3,14.150},
- {5,5,2,14.360},
- {5,3,3,14.150},
- {5,3,2,15.390},
- {5,3,2,15.390},
- {5,5,1,19.680},
- {5,1,2,21.060},
- {5,3,3,14.150},
- {5,5,4,12.680},
- {5,5,2,14.360},
- {5,1,3,19.610},
- {5,3,4,13.650},
- {5,5,5,12.430},
- {5,1,4,19.020},
- {5,1,4,19.020},
- {5,3,5,13.390},
- {5,5,4,12.680},
- {5,1,3,19.610},
- {5,3,2,15.390},
- {1,3,1,11.110},
- {1,5,2,6.521},
- {1,1,3,10.190},
- {1,3,4,6.043},
- {1,5,5,5.242},
- {1,5,3,5.724},
- {1,1,4,9.766},
- {1,3,5,5.870},
- {1,5,4,5.406},
- {1,1,3,10.190},
- {1,1,5,9.545},
- {1,3,4,6.043},
- {1,5,3,5.724},
- {1,1,2,11.250},
- {1,3,1,11.110},
- {1,3,3,6.380},
- {1,5,2,6.521},
- {1,1,1,16.000},
- {1,3,2,7.219},
- {1,5,3,5.724}
- };
- int main()
- {
- Vector<Data> data;
- for(int i = 0; i < 41; i++)
- {
- Data t;
- for(int j = 0; j < 3; j++)
- t.x.push_back(sample[i][j]);
- t.y.push_back(sample[i][3]);
- data.push_back(t);
- }
- BP *bp = new BP();
- bp->GetData(data);
- bp->Train();
- while(1)
- {
- Vector<Type> in;
- for(int i = 0; i < 3; i++)
- {
- Type v;
- scanf("%lf", &v);
- in.push_back(v);
- }
- Vector<Type> ou;
- ou = bp->ForeCast(in);
- printf("%lf\n", ou[0]);
- }
- return 0;
- }
Makefile:
- Test : BP.h BP.cpp Test.cpp
- g++ BP.cpp Test.cpp -o Test
- clean:
- rm Test