# 1, Multilayer perceptron model based on BP algorithm

The multilayer perceptron trained with the BP (backpropagation) algorithm is the most widely used neural network to date. In practice, the single-hidden-layer network shown in Figure 3-15 is the most common form. A single-hidden-layer feedforward network is customarily called a three-layer perceptron; the three layers are the input layer, the hidden layer and the output layer. The algorithm adjusts the weights by gradient descent on the output error; the detailed derivation is omitted here.
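As a rough illustration of the three-layer structure, here is a minimal sketch of a forward pass. The layer sizes, the `tanh` activation, the weight range and the name `forward` are assumptions chosen for the example, not taken from the figure.

```python
import numpy as np


def forward(x, w_hidden, w_out):
    """Forward pass of a single-hidden-layer (three-layer) perceptron."""
    hidden = np.tanh(w_hidden @ x)   # hidden layer activations
    return np.tanh(w_out @ hidden)   # output layer activations


rng = np.random.default_rng(0)
x = np.array([0.5, -0.2, 0.1])                # 3 input nodes (hypothetical)
w_hidden = rng.uniform(-0.1, 0.1, (4, 3))     # 4 hidden nodes (hypothetical)
w_out = rng.uniform(-0.1, 0.1, (2, 4))        # 2 output nodes (hypothetical)

y = forward(x, w_hidden, w_out)
print(y.shape)
```

Training then consists of repeatedly comparing `y` with the desired output and moving the weights a small step against the gradient of the error.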

# 2, Program implementation flow of BP algorithm

# 3, Improvement of standard BP algorithm -- adding momentum term

When adjusting the weights, the standard BP algorithm follows only the gradient descent direction of the error at time t and ignores the gradient directions before time t, which often makes training oscillate and converge slowly. To speed up training, a momentum term can be added to the weight adjustment formula. If W denotes the weight matrix of a layer and X the input vector of that layer, the weight adjustment with the momentum term is

$$\Delta W(t) = \eta\,\delta X + \alpha\,\Delta W(t-1)$$

where η is the learning rate and δ is the error signal of the layer. Adding the momentum term means taking part of the previous weight adjustment and adding it to the current one. α is called the momentum coefficient and generally satisfies α ∈ (0,1). The momentum term reflects the adjustment experience accumulated so far and damps the adjustment at time t. When the error surface fluctuates suddenly, it reduces the tendency to oscillate and so speeds up training. The momentum term is now used so widely that BP with momentum has become the new standard algorithm.
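The idea of carrying part of the previous weight adjustment into the current one can be sketched as follows. This is a minimal illustration on a toy one-dimensional error function; the function name `momentum_step`, the toy error E(w) = 0.5·w² and the values of `lr` and `alpha` are assumptions made for the example.

```python
import numpy as np


def momentum_step(w, grad_step, prev_delta, alpha=0.9):
    """Apply one weight update with a momentum term.

    grad_step  : plain gradient-descent adjustment for this step
    prev_delta : weight adjustment applied at the previous step
    alpha      : momentum coefficient, in (0, 1)
    Returns the new weights and the adjustment that was applied.
    """
    delta = grad_step + alpha * prev_delta
    return w + delta, delta


# Toy problem (assumed for illustration): minimise E(w) = 0.5 * w**2, so dE/dw = w
w = np.array([1.0])
delta = np.zeros_like(w)
lr = 0.1
for _ in range(3):
    grad_step = -lr * w   # step along the negative gradient
    w, delta = momentum_step(w, grad_step, delta, alpha=0.5)
print(w)
```

Because `delta` is passed back in on every iteration, each step is a blend of the current gradient direction and the direction taken previously, which is what smooths out oscillation.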

# 4, Implementing BP neural network and its learning algorithm with Python

To show how the algorithm is used, a brief example is given here (no normalization or standardization is required).

Input: X = -1:0.1:1 (from -1 to 1 in steps of 0.1)
Output: D = .... (see the data in the code for details)
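The range notation for X is MATLAB-style. A minimal sketch of one way to build the same 21 points in NumPy is shown below; using `np.linspace` is a choice made here because it fixes the number of points exactly, which a floating-point step size does not always guarantee.

```python
import numpy as np

# 21 evenly spaced points from -1 to 1 inclusive, i.e. a step of 0.1
X = np.linspace(-1, 1, 21)
print(len(X), X[0], X[-1])
```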

To inspect the results, we plot them. The yellow line is the network output after training and the blue line is the given data D.

# 5, The program is as follows:

```python
# -*- coding: utf-8 -*-
import random

import numpy as np
import matplotlib.pyplot as plt


# Generate a random number in the interval [a, b]
def random_number(a, b):
    return (b - a) * random.random() + a


# Generate an m*n matrix filled with `fill` (zero matrix by default)
def makematrix(m, n, fill=0.0):
    return np.full((m, n), fill, dtype=float)


# Activation function: bipolar sigmoid, equal to tanh(x/2).
# np.tanh(x) with derivative 1 - np.tanh(x)**2 can be used instead.
def sigmoid(x):
    return (1 - np.exp(-x)) / (1 + np.exp(-x))


# Derivative of the bipolar sigmoid above
def derived_sigmoid(x):
    return 2 * np.exp(-x) / (1 + np.exp(-x)) ** 2


# Three-layer BP network
class BPNN:
    def __init__(self, num_in, num_hidden, num_out):
        # Number of nodes in the input, hidden and output layers
        self.num_in = num_in + 1          # add a bias node
        self.num_hidden = num_hidden + 1  # add a bias node
        self.num_out = num_out

        # Node activations; index 0 of the input and hidden layers is the bias node (-1)
        self.active_in = np.array([-1.0] * self.num_in)
        self.active_hidden = np.array([-1.0] * self.num_hidden)
        self.active_out = np.array([1.0] * self.num_out)

        # Weight matrices
        self.wight_in = makematrix(self.num_in, self.num_hidden)
        self.wight_out = makematrix(self.num_hidden, self.num_out)

        # Small random initial weights. A degenerate range such as (0.1, 0.1)
        # would make every weight identical, so all hidden nodes would stay equal.
        for i in range(self.num_in):
            for j in range(self.num_hidden):
                self.wight_in[i][j] = random_number(-0.1, 0.1)
        for i in range(self.num_hidden):
            for j in range(self.num_out):
                self.wight_out[i][j] = random_number(-0.1, 0.1)
        # Bias weights (row 0 of each matrix) start at 0.1
        self.wight_in[0, :] = 0.1
        self.wight_out[0, :] = 0.1

        # Previous weight adjustments, used by the momentum term
        self.ci = makematrix(self.num_in, self.num_hidden)
        self.co = makematrix(self.num_hidden, self.num_out)

    # Forward propagation of the signal
    def update(self, inputs):
        if len(inputs) != self.num_in - 1:
            raise ValueError('Inconsistent with the number of input layer nodes')
        # Input layer (index 0 stays at -1 as the bias node)
        self.active_in[1:self.num_in] = inputs

        # Hidden layer
        self.sum_hidden = np.dot(self.wight_in.T, self.active_in.reshape(-1, 1))
        self.active_hidden = sigmoid(self.sum_hidden)
        self.active_hidden[0] = -1  # keep the hidden bias node fixed at -1

        # Output layer
        self.sum_out = np.dot(self.wight_out.T, self.active_hidden)
        self.active_out = sigmoid(self.sum_out)
        return self.active_out

    # Error back propagation; lr is the learning rate, m the momentum coefficient
    def errorbackpropagate(self, targets, lr, m):
        targets = np.asarray(targets, dtype=float).reshape(-1, 1)
        if len(targets) != self.num_out:
            raise ValueError('Inconsistent with the number of output layer nodes!')
        # Squared error of this pattern
        diff = targets - self.active_out
        error = 0.5 * float(np.sum(diff ** 2))

        # Output layer error signal
        self.error_out = diff * derived_sigmoid(self.sum_out)
        # Hidden layer error signal (uses the output weights before they are updated)
        self.error_hidden = np.dot(self.wight_out, self.error_out) * derived_sigmoid(self.sum_hidden)

        # Weight update: current gradient step plus m times the previous adjustment
        # Hidden -> output weights
        delta_out = lr * np.dot(self.error_out, self.active_hidden.reshape(1, -1)).T + m * self.co
        self.wight_out = self.wight_out + delta_out
        self.co = delta_out
        # Input -> hidden weights
        delta_in = lr * np.dot(self.error_hidden, self.active_in.reshape(1, -1)).T + m * self.ci
        self.wight_in = self.wight_in + delta_in
        self.ci = delta_in
        return error

    # Run every pattern through the network; return the output for the last one
    def test(self, patterns):
        result = None
        for i in patterns:
            result = self.update(i[0:self.num_in - 1])
            print(i[0:self.num_in - 1], '->', result)
        return result

    # Print the weights
    def weights(self):
        print("Input layer weights")
        print(self.wight_in)
        print("Output layer weights")
        print(self.wight_out)

    def train(self, pattern, itera=100, lr=0.2, m=0.1):
        for i in range(itera):
            error = 0.0
            for j in pattern:
                inputs = j[0:self.num_in - 1]
                targets = j[self.num_in - 1:]
                self.update(inputs)
                error = error + self.errorbackpropagate(targets, lr, m)
            if i % 10 == 0:
                print('error %-.5f  iteration %d' % (error, i))


# Example
X = list(np.linspace(-1, 1, 21))  # 21 points from -1 to 1 in steps of 0.1
D = [-0.96, -0.577, -0.0729, 0.017, -0.641, -0.66, -0.11, 0.1336, -0.201, -0.434, -0.5, -0.393, -0.1647, 0.0988, 0.3072, 0.396, 0.3449, 0.1816, -0.0312, -0.2183, -0.3201]
A = X + D  # one pattern: 21 inputs followed by 21 targets
patt = np.array([A] * 2)
# Create the neural network: 21 input nodes, 21 hidden layer nodes and 21 output layer nodes
n = BPNN(21, 21, 21)
# Train the neural network
n.train(patt)
# Test the neural network
d = n.test(patt)
# View the weights
n.weights()

plt.plot(X, D)
plt.plot(X, d.ravel())
plt.show()
```

Han Liqun. Artificial Neural Network Theory and Application [M]. Beijing: China Machine Press, 2016.

Posted by Eratimus on Tue, 30 Nov 2021 21:35:36 -0800