# Summary of Wu Enda's machine learning code and related knowledge points -- ex3 (1. Neural network)

Keywords: network Python MATLAB

```def load_data(path,transpose=True):
y=data.get("y")#before reshape:(5000, 1)
y=y.reshape(y.shape[0])
X=data.get("X")#(5000,400),(400,)
if transpose:
#Vectorization, so that each column becomes a sample and each row is a sample?
X=np.array([im.reshape((20,20)).T for im in X])
X=np.array([im.reshape(400) for im in X])#Expand the sample to the original (400,)
return X,y
```
```X, y = load_data('code/ex3-neural network/ex3data1.mat')
print(X.shape)
print(y.shape)
```

(5000, 400)
(5000,)

## 2. drawing

```def plot_an_image(image):
fig,ax=plt.subplots(figsize=(1,1))
ax.matshow(image.reshape((20, 20)), cmap=matplotlib.cm.binary)
plt.xticks(np.array([]))  # Just get rid of marks
plt.yticks(np.array([]))
```
```pick_one = np.random.randint(0, 5000)
plot_an_image(X[pick_one,:])
plt.show()
print('this should be {}'.format(y[pick_one]))
```
• Maplotlib.pyplot.mattshow matrix visualization, plot a matrix or an array as an image.
• In matplotlib, ticks represent scale, and scale has two meanings, one is locks, the other is tick labels. In drawing, the x-axis and y-axis are continuous, so the marking can be specified at will, that is, to find the position on the continuous variable, and the scale label can be replaced correspondingly:
xticks() returns two objects, one is the locks and the other is the scale labels
locs, labels = xticks()
```def plot_100_image(X):
size=int(np.sqrt(X.shape[1]))
sample_idx=np.random.choice(np.arange(X.shape[0]),100)#100 out of 5000 samples
sample_images=X[sample_idx,:]
fig,ax_array=plt.subplots(nrows=10,ncols=10,sharex=True,sharey=True)
for r in range(10):
for c in range(10):
ax_array[r,c].matshow(sample_images[10*r+c].reshape((size,size)))
plt.xticks(np.array([]))
plt.yticks(np.array([]))
#Drawing function, drawing 100 pictures
```
• np.arange()
The function returns a fixed step arrangement with an end point and a start point, such as [1,2,3,4,5]. The start point is 1, the end point is 5, and the step is 1.
The number of parameters: NP. Range() function is divided into one parameter, two parameters, three parameters and three cases
1) For a parameter, the parameter value is the end point, the start point is 0 by default, and the step length is 1 by default.
2) When there are two parameters, the first parameter is the starting point, the second parameter is the end point, and the step length takes the default value of 1.
3) For the three parameters, the first parameter is the starting point, the second parameter is the end point, and the third parameter is the step size. Where step size supports decimal
• random.choice() randomly selects content: you can randomly select content from an int number or a 1-D array, and return the selection result in an n-D array.

## Preparation data

```raw_X, raw_y = load_data('code/ex3-neural network/ex3data1.mat')
print(raw_X.shape)
print(raw_y.shape)
```

(5000, 400)
(5000,)

```# Add x0=1
X = np.insert(raw_X, 0, values=np.ones(raw_X.shape[0]), axis=1)#First column inserted (all 1)
X.shape
```

(5000, 401)
Vectorization label

```y_matrix=[]
for k in range(1,11):
y_matrix.append((raw_y==k).astype(int))
# last one is k==10, it's digit 0, bring it to the first position
y_matrix = [y_matrix[-1]] + y_matrix[:-1]#Y matrix [- 1] is (5000,), [y matrix [- 1]] is (15000), y matrix [: - 1] shape is (95000)
y = np.array(y_matrix)
y.shape

```

(10, 5000)

## Training one-dimensional model

```def sigmoid(z):
return 1 / (1 + np.exp(-z))
```
```def cost(theta,X,y):
return np.mean(-y * np.log(sigmoid(X @ theta)) - (1 - y) * np.log(1 - sigmoid(X @ theta)))
```
```def regularized_cost(theta, X, y, lr=1):
theta_ji_to_n=theta[1:]
regularized_term=(lr/(2*len(X)))*np.sum(np.power(theta_ji_to_n,2))
return cost(theta,X,y)+regularized_term
```
```def gradient(theta, X, y):
return (1 / len(X)) * X.T@ (sigmoid(X @ theta) - y)

```
```def regularized_gradient(theta, X, y, lr=1):
theta_j1_to_n = theta[1:]
regularized_theta = (lr / len(X)) * theta_j1_to_n
regularized_term = np.concatenate([np.array([0]), regularized_theta])
return gradient(theta, X, y) + regularized_term

```
```def logistic_regression(X,y,lr=1):
theta=np.zeros(X.shape[1])
final_theta = res.x
return final_theta
```
```def predict(x, theta):
prob = sigmoid(x @ theta)
return (prob >= 0.5).astype(int)

```
```t0=logistic_regression(X,y[0])
print(t0.shape)
y_pred = predict(X, t0)
print('Accuracy={}'.format(np.mean(y[0] == y_pred)))
```

(401,)
Accuracy=0.9974

• Numpy provides * * numpy.concatenate((a1,a2,...) ), axis=0) * * function. Can complete the splicing of multiple arrays at one time. Among them, a1,a2 Is an array type parameter

a=np.array([1,2,3])
b=np.array([11,22,33])
c=np.array([44,55,66])
np.concatenate((a,b,c),axis=0) # By default, axis=0 can be left blank
array([ 1, 2, 3, 11, 22, 33, 44, 55, 66]) #For one-dimensional array splicing, the value of axis does not affect the final result

## Training multidimensional data

```k_theta = np.array([logistic_regression(X, y[k]) for k in range(10)])
print(k_theta.shape)
```

(10, 401)

```prob_matrix = sigmoid(X @ k_theta.T)
np.set_printoptions(suppress=True)
print(prob_matrix.shape)
print(prob_matrix)
```

(5000, 10)
[[0.99577353 0. 0.00053536 ... 0.0000647 0.00003916 0.00172426]
[0.99834639 0.0000001 0.00005611 ... 0.00009681 0.00000291 0.00008494]
[0.99139822 0. 0.00056824 ... 0.00000655 0.02655352 0.00197512]
...
[0.00000068 0.04144103 0.00321037 ... 0.00012724 0.00297365 0.707625 ]
[0.00001843 0.00000013 0.00000009 ... 0.00164807 0.0680994 0.86118731]
[0.02879745 0. 0.00012979 ... 0.36617606 0.00498225 0.14829291]]

```y_pred=np.argmax(prob_matrix,axis=1)#Returns the index value of the maximum value of each row
print(y_pred)
```

[0 0 0 ... 9 9 7]

```y_answer = raw_y.copy()
```
```precision    recall  f1-score   support

0       0.97      0.99      0.98       500
1       0.95      0.99      0.97       500
2       0.95      0.92      0.93       500
3       0.95      0.91      0.93       500
4       0.95      0.95      0.95       500
5       0.92      0.92      0.92       500
6       0.97      0.98      0.97       500
7       0.95      0.95      0.95       500
8       0.93      0.92      0.92       500
9       0.92      0.92      0.92       500
```

avg / total 0.94 0.94 0.94 5000

## Feedforward prediction of neural network

```def load_weight(path):
return data["Theta1"],data["Theta2"]
```
```theta1, theta2 = load_weight('code/ex3-neural network/ex3weights.mat')

theta1.shape, theta2.shape
```

((25, 401), (10, 26))
Because in the data loading function, the original data is transposed, however, the transposed data is not compatible with the given parameters, because these parameters are trained by the original data. So in order to apply the given parameters, I need to use the raw data (not transpose)

```X, y = load_data('code/ex3-neural network/ex3data1.mat',transpose=False)

X = np.insert(X, 0, values=np.ones(X.shape[0]), axis=1)  # intercept

print(X.shape)
print(y.shape)
```

(5000, 401)
(5000,)

## feed forward prediction

First floor:

```a1=X#input
z2=a1@theta1.T
z2.shape
```

(5000, 401)
(5000,)

The second floor:

```z2 = np.insert(z2, 0, values=np.ones(z2.shape[0]), axis=1)#Add a column of bias
a2=sigmoid(z2)
a2.shape
```

(5000, 26)

```z3=a2@theta2.T
a3=sigmoid(z3)
a3
```

array([[0.00013825, 0.0020554 , 0.00304012, ..., 0.00049102, 0.00774326,
0.99622946],
[0.00058776, 0.00285027, 0.00414688, ..., 0.00292311, 0.00235617,
0.99619667],
[0.00010868, 0.0038266 , 0.03058551, ..., 0.07514539, 0.0065704 ,
0.93586278],
...,
[0.06278247, 0.00450406, 0.03545109, ..., 0.0026367 , 0.68944816,
0.00002744],
[0.00101909, 0.00073436, 0.00037856, ..., 0.01456166, 0.97598976,
0.00023337],
[0.00005908, 0.00054172, 0.0000259 , ..., 0.00700508, 0.73281465,
0.09166961]])

```y_pred = np.argmax(a3, axis=1) + 1  # numpy is 0 base index, +1 for matlab convention, returns the index of axis maximum value along the axis, axis=1 represents the row
y_pred.shape
```
```print(classification_report(y, y_pred))
```

precision recall f1-score support

```      1       0.97      0.98      0.97       500
2       0.98      0.97      0.97       500
3       0.98      0.96      0.97       500
4       0.97      0.97      0.97       500
5       0.98      0.98      0.98       500
6       0.97      0.99      0.98       500
7       0.98      0.97      0.97       500
8       0.98      0.98      0.98       500
9       0.97      0.96      0.96       500
10       0.98      0.99      0.99       500
```

avg / total 0.98 0.98 0.98 5000

Published 5 original articles, won praise 0, visited 82

Posted by zdzislaw on Fri, 28 Feb 2020 03:47:24 -0800