Summary of Andrew Ng's machine learning code and related knowledge points -- ex3 (1. Neural Networks)

Keywords: network Python MATLAB

Analysis of Python matrix multiplication (multiply / matmul / * / @)
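Since the code below mixes these operators, here is a minimal sketch of the difference (not from the exercise itself): for NumPy arrays, * and np.multiply are element-wise, while @ and np.matmul are matrix products.

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
A * B              # element-wise: array([[ 5, 12], [21, 32]])
np.multiply(A, B)  # same as *
A @ B              # matrix product: array([[19, 22], [43, 50]])
np.matmul(A, B)    # same as @ for 2-D arrays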

1. Load data

import numpy as np
import scipy.io as sio
import scipy.optimize as opt
import matplotlib
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report

def load_data(path, transpose=True):
    data = sio.loadmat(path)
    y = data.get('y')  # before reshape: (5000, 1)
    y = y.reshape(y.shape[0])  # flatten to (5000,)
    X = data.get('X')  # (5000, 400): each row is a flattened 20x20 image
    if transpose:
        # the .mat file stores images in MATLAB's column-major order,
        # so transpose each 20x20 image to get the correct orientation
        X = np.array([im.reshape((20, 20)).T for im in X])
        X = np.array([im.reshape(400) for im in X])  # flatten each sample back to (400,)
    return X, y
X, y = load_data('code/ex3-neural network/ex3data1.mat')
print(X.shape)
print(y.shape)

(5000, 400)
(5000,)

2. Plotting

def plot_an_image(image):
    fig,ax=plt.subplots(figsize=(1,1))
    ax.matshow(image.reshape((20, 20)), cmap=matplotlib.cm.binary)
    plt.xticks(np.array([]))  # remove the axis tick marks
    plt.yticks(np.array([]))
pick_one = np.random.randint(0, 5000)
plot_an_image(X[pick_one,:])
plt.show()
print('this should be {}'.format(y[pick_one]))
  • matplotlib.pyplot.matshow visualizes a matrix: it plots a 2-D array as an image.
  • In matplotlib, a tick has two parts: the tick location (locs) and the tick label. Since the x- and y-axes are continuous, tick locations can be placed anywhere along them, and each location's label can be replaced correspondingly (see the sketch after this list):
    xticks() returns two objects, the locations and the labels
    locs, labels = xticks()
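For example, a minimal sketch of the three usual calls (query, set, and clear):

locs, labels = plt.xticks()             # query the current tick locations and labels
plt.xticks([0, 1, 2], ['a', 'b', 'c'])  # place ticks at 0, 1, 2 and label them
plt.xticks(np.array([]))                # pass an empty array to remove the ticks entirely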
def plot_100_image(X):
    # plotting helper: draw 100 randomly chosen images in a 10x10 grid
    size = int(np.sqrt(X.shape[1]))  # 20 for 400-pixel images
    sample_idx = np.random.choice(np.arange(X.shape[0]), 100, replace=False)  # 100 distinct samples out of 5000
    sample_images = X[sample_idx, :]
    fig, ax_array = plt.subplots(nrows=10, ncols=10, sharex=True, sharey=True)
    for r in range(10):
        for c in range(10):
            ax_array[r, c].matshow(sample_images[10 * r + c].reshape((size, size)))
            plt.xticks(np.array([]))
            plt.yticks(np.array([]))
  • np.arange() returns evenly spaced values with a fixed step over the half-open interval [start, stop); note that the stop value itself is excluded, so np.arange(1, 6) gives [1, 2, 3, 4, 5].
    The function accepts one, two, or three parameters:
    1) One parameter: it is the stop value; start defaults to 0 and step defaults to 1.
    2) Two parameters: the first is the start, the second is the stop; step takes the default value of 1.
    3) Three parameters: start, stop, and step, where the step may be a decimal.
    Original link: https://blog.csdn.net/qq_41550480/article/details/89390579
  • np.random.choice() selects randomly: given an int n it draws from np.arange(n), given a 1-D array it draws its elements, and it returns the draws as an array. Both functions are demonstrated in the sketch below.
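A minimal sketch of both functions (the random outputs are illustrative):

np.arange(5)           # array([0, 1, 2, 3, 4])          one argument: stop
np.arange(1, 6)        # array([1, 2, 3, 4, 5])          two arguments: start, stop
np.arange(0, 1, 0.25)  # array([0.  , 0.25, 0.5 , 0.75]) three arguments: step may be a decimal
np.random.choice(5, 3)                           # e.g. array([0, 3, 3]): 3 draws from np.arange(5)
np.random.choice(np.array([10, 20, 30]), 2, replace=False)  # e.g. array([30, 10]): distinct draws from a 1-D array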

Preparing the data

raw_X, raw_y = load_data('code/ex3-neural network/ex3data1.mat')
print(raw_X.shape)
print(raw_y.shape)

(5000, 400)
(5000,)

# Add x0 = 1
X = np.insert(raw_X, 0, values=np.ones(raw_X.shape[0]), axis=1)  # insert a column of all ones as the first column (the intercept term), as sketched below
X.shape

(5000, 401)
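A toy sketch of what np.insert does here (inserting a column of ones at index 0 along axis=1):

M = np.array([[1, 2],
              [3, 4]])
np.insert(M, 0, values=np.ones(M.shape[0]), axis=1)
# array([[1, 1, 2],
#        [1, 3, 4]])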
Vectorizing the labels

y_matrix=[]
for k in range(1,11):
    y_matrix.append((raw_y==k).astype(int))
# the last classifier is k == 10, which is digit 0; move it to the front
y_matrix = [y_matrix[-1]] + y_matrix[:-1]  # y_matrix[-1] has shape (5000,); [y_matrix[-1]] is a 1-element list; y_matrix[:-1] is the remaining 9-element list
y = np.array(y_matrix)
y.shape

(10, 5000)
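To see what (raw_y == k).astype(int) produces for each classifier, here is a toy sketch with made-up labels:

toy_y = np.array([10, 1, 2, 10])   # hypothetical labels; 10 encodes the digit 0
(toy_y == 10).astype(int)          # array([1, 0, 0, 1]) -- binary target for the "digit 0" classifier
(toy_y == 1).astype(int)           # array([0, 1, 0, 0]) -- binary target for the "digit 1" classifier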

Training a single binary classifier

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
def cost(theta,X,y):
    return np.mean(-y * np.log(sigmoid(X @ theta)) - (1 - y) * np.log(1 - sigmoid(X @ theta)))
def regularized_cost(theta, X, y, lr=1):
    theta_j1_to_n = theta[1:]  # exclude theta_0 from the regularization penalty
    regularized_term = (lr / (2 * len(X))) * np.sum(np.power(theta_j1_to_n, 2))
    return cost(theta, X, y) + regularized_term
def gradient(theta, X, y):
    '''just 1 batch gradient'''
    return (1 / len(X)) * X.T @ (sigmoid(X @ theta) - y)

def regularized_gradient(theta, X, y, lr=1):
    theta_j1_to_n = theta[1:]
    regularized_theta = (lr / len(X)) * theta_j1_to_n
    regularized_term = np.concatenate([np.array([0]), regularized_theta])
    return gradient(theta, X, y) + regularized_term
    
def logistic_regression(X, y, lr=1):
    theta = np.zeros(X.shape[1])
    res = opt.minimize(fun=regularized_cost, x0=theta, args=(X, y, lr),
                       method='TNC', jac=regularized_gradient, options={'disp': True})
    final_theta = res.x
    return final_theta
def predict(x, theta):
    prob = sigmoid(x @ theta)
    return (prob >= 0.5).astype(int)
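For reference, the functions above implement the standard regularized logistic-regression cost and gradient, with m = len(X), λ = lr, and θ₀ excluded from the penalty:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\Big[-y^{(i)}\log h_\theta(x^{(i)}) - \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

$$\nabla_\theta J = \frac{1}{m}X^T\big(\mathrm{sigmoid}(X\theta) - y\big) + \frac{\lambda}{m}\begin{bmatrix}0 & \theta_1 & \cdots & \theta_n\end{bmatrix}^T$$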

t0=logistic_regression(X,y[0])
print(t0.shape)
y_pred = predict(X, t0)
print('Accuracy={}'.format(np.mean(y[0] == y_pred)))

(401,)
Accuracy=0.9974

  • NumPy provides the numpy.concatenate((a1, a2, ...), axis=0) function, which joins several arrays along an existing axis in a single call; a1, a2, ... are array-like parameters.

a=np.array([1,2,3])
b=np.array([11,22,33])
c=np.array([44,55,66])
np.concatenate((a,b,c),axis=0)  # axis=0 is the default, so it can be omitted
array([ 1,  2,  3, 11, 22, 33, 44, 55, 66])  # for 1-D arrays, the value of axis does not affect the result

Training the multiclass model

k_theta = np.array([logistic_regression(X, y[k]) for k in range(10)])
print(k_theta.shape)

(10, 401)

prob_matrix = sigmoid(X @ k_theta.T)
np.set_printoptions(suppress=True)
print(prob_matrix.shape)
print(prob_matrix)

(5000, 10)
[[0.99577353 0. 0.00053536 ... 0.0000647 0.00003916 0.00172426]
[0.99834639 0.0000001 0.00005611 ... 0.00009681 0.00000291 0.00008494]
[0.99139822 0. 0.00056824 ... 0.00000655 0.02655352 0.00197512]
...
[0.00000068 0.04144103 0.00321037 ... 0.00012724 0.00297365 0.707625 ]
[0.00001843 0.00000013 0.00000009 ... 0.00164807 0.0680994 0.86118731]
[0.02879745 0. 0.00012979 ... 0.36617606 0.00498225 0.14829291]]

y_pred = np.argmax(prob_matrix, axis=1)  # index of the maximum value in each row; index k is digit k thanks to the reordering above
print(y_pred)

[0 0 0 ... 9 9 7]

y_answer = raw_y.copy()
y_answer[y_answer==10] = 0
print(classification_report(y_answer, y_pred))
precision    recall  f1-score   support

      0       0.97      0.99      0.98       500
      1       0.95      0.99      0.97       500
      2       0.95      0.92      0.93       500
      3       0.95      0.91      0.93       500
      4       0.95      0.95      0.95       500
      5       0.92      0.92      0.92       500
      6       0.97      0.98      0.97       500
      7       0.95      0.95      0.95       500
      8       0.93      0.92      0.92       500
      9       0.92      0.92      0.92       500

avg / total 0.94 0.94 0.94 5000

Feedforward prediction of neural network


Next, we evaluate the model by loading the existing weights:

def load_weight(path):
    data=sio.loadmat(path)
    return data["Theta1"],data["Theta2"]
theta1, theta2 = load_weight('code/ex3-neural network/ex3weights.mat')

theta1.shape, theta2.shape

((25, 401), (10, 26))

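These shapes encode the network architecture: 400 input pixels plus a bias unit feed 25 hidden units, which plus a bias unit feed 10 output classes. A quick sanity check:

assert theta1.shape == (25, 401)  # 400 input pixels + 1 bias unit -> 25 hidden units
assert theta2.shape == (10, 26)   # 25 hidden units + 1 bias unit  -> 10 output classes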
Because load_data transposes each image by default, the transposed data is not compatible with the given weights: they were trained on the original (untransposed) data. So, to apply the given weights, we need to load the raw data without transposing:

X, y = load_data('code/ex3-neural network/ex3data1.mat',transpose=False)

X = np.insert(X, 0, values=np.ones(X.shape[0]), axis=1)  # intercept

print(X.shape)
print(y.shape)

(5000, 401)
(5000,)

Feedforward prediction

First layer:

a1 = X  # input layer: the data, including the bias column
z2 = a1 @ theta1.T  # (5000, 401) @ (401, 25) -> (5000, 25)
z2.shape

(5000, 25)

Second layer:

z2 = np.insert(z2, 0, values=np.ones(z2.shape[0]), axis=1)  # add a bias column; note it is added before the sigmoid, so the bias unit becomes sigmoid(1) ≈ 0.73 rather than exactly 1 (kept as-is to match the results below)
a2 = sigmoid(z2)
a2.shape

(5000, 26)

z3 = a2 @ theta2.T  # (5000, 26) @ (26, 10) -> (5000, 10)
a3 = sigmoid(z3)  # output-layer activations
a3

array([[0.00013825, 0.0020554 , 0.00304012, ..., 0.00049102, 0.00774326,
0.99622946],
[0.00058776, 0.00285027, 0.00414688, ..., 0.00292311, 0.00235617,
0.99619667],
[0.00010868, 0.0038266 , 0.03058551, ..., 0.07514539, 0.0065704 ,
0.93586278],
...,
[0.06278247, 0.00450406, 0.03545109, ..., 0.0026367 , 0.68944816,
0.00002744],
[0.00101909, 0.00073436, 0.00037856, ..., 0.01456166, 0.97598976,
0.00023337],
[0.00005908, 0.00054172, 0.0000259 , ..., 0.00700508, 0.73281465,
0.09166961]])

y_pred = np.argmax(a3, axis=1) + 1  # numpy is 0-based; +1 maps back to the 1..10 MATLAB labels (argmax with axis=1 returns the index of each row's maximum)
y_pred.shape
print(classification_report(y, y_pred))

precision    recall  f1-score   support

      1       0.97      0.98      0.97       500
      2       0.98      0.97      0.97       500
      3       0.98      0.96      0.97       500
      4       0.97      0.97      0.97       500
      5       0.98      0.98      0.98       500
      6       0.97      0.99      0.98       500
      7       0.98      0.97      0.97       500
      8       0.98      0.98      0.98       500
      9       0.97      0.96      0.96       500
     10       0.98      0.99      0.99       500

avg / total 0.98 0.98 0.98 5000
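The layer-by-layer steps above can be collected into one helper. This is a minimal sketch (the name feed_forward is mine, not from the exercise; it assumes sigmoid and the weights loaded earlier, and keeps the quirk of adding the bias before the sigmoid so that it reproduces the results above):

def feed_forward(X, theta1, theta2):
    # X: (5000, 401), already carrying the bias column
    a1 = X
    z2 = a1 @ theta1.T  # (5000, 25)
    z2 = np.insert(z2, 0, values=np.ones(z2.shape[0]), axis=1)  # bias added before sigmoid, as above
    a2 = sigmoid(z2)  # (5000, 26)
    z3 = a2 @ theta2.T  # (5000, 10)
    return sigmoid(z3)

y_pred = np.argmax(feed_forward(X, theta1, theta2), axis=1) + 1  # back to the 1..10 MATLAB labels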


Posted by zdzislaw on Fri, 28 Feb 2020 03:47:24 -0800