Programming assignment for the second week of Andrew Ng's (Wu Enda's) course

Keywords: Python neural networks Deep Learning Data Mining

Problem description

Given a training set of pictures labeled as cat or non-cat, let's build a simple neural network that identifies cats.

Dataset description

There are 209 pictures in the training set, and the shape of each picture is (64, 64, 3)
There are 50 pictures in the test set, and the shape of each picture is (64, 64, 3)
classes stores two byte strings, namely [b'non-cat', b'cat']
The classification label is in {0, 1}: 0 means not a cat, 1 means a cat
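
As a small illustration of how the byte strings in classes relate to the 0/1 labels, the sketch below decodes a label into a readable name. The array is a hand-made stand-in for the real classes array, and label_name is a hypothetical helper, not part of the course code.

```python
import numpy as np

# Stand-in for the `classes` array described above (an assumption;
# the real array is read from the .h5 file)
classes = np.array([b'non-cat', b'cat'])

def label_name(label, classes):
    """Hypothetical helper: map a 0/1 label to a readable class name."""
    return classes[int(label)].decode("utf-8")

print(label_name(0, classes))  # non-cat
print(label_name(1, classes))  # cat
```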

Dataset Download

Github address: For reference only (including complete code and dataset resources)

Code implementation

Build the dataset-loading function

The code for loading the data sets is provided in Wu Enda's course

import numpy as np
import h5py
def load_dataset():
    # A relative path is given here. If the datasets folder sits next to the current code file, nothing needs to change. Otherwise, write the actual path to the dataset
    train_dataset = h5py.File('./datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File('./datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    # classes saves two string data in bytes, namely: [b'non-cat', b'cat']
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

Load dataset

x_train, y_train, x_test, y_test, classes = load_dataset()

Convert data to array format

X_train = np.array(x_train)
Y_train = np.array(y_train)
X_test = np.array(x_test)
Y_test = np.array(y_test)

print(X_train.shape) # The first dimension is the number of pictures; the remaining dimensions are the shape of each picture

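View pictures in the dataset

import matplotlib.pyplot as plt
index = (25, 26, 27, 28)
fig, axes = plt.subplots(1, 4, figsize=(20, 10))
for i in range(4):
    # The loop body is missing from the source; showing each picture with its label is an assumed reconstruction
    axes[i].imshow(x_train[index[i]])
    axes[i].set_title("y = " + str(y_train[0, index[i]]))
plt.show()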


Image array processing

To make the later computations easier, we flatten each picture of shape (64, 64, 3) into a column vector of shape (64x64x3, 1), so that every data set becomes a matrix with one column per picture

x_train_flatten = x_train.reshape(x_train.shape[0], -1).T
# shape[0] keeps the number of pictures as the first dimension of the new array, and -1 merges the remaining dimensions into a single one; .T then turns each picture into a column
x_test_flatten = x_test.reshape(x_test.shape[0], -1).T

View new array shapes
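
The code for this step is missing from the article. A minimal sketch of the check might look like the following; it uses zero-filled arrays with the dataset's shapes as stand-ins (an assumption, so that the snippet runs without the .h5 files).

```python
import numpy as np

# Zero-filled stand-ins with the same shapes as the real data (assumption)
x_train = np.zeros((209, 64, 64, 3), dtype=np.uint8)
x_test = np.zeros((50, 64, 64, 3), dtype=np.uint8)

x_train_flatten = x_train.reshape(x_train.shape[0], -1).T
x_test_flatten = x_test.reshape(x_test.shape[0], -1).T

# 64 * 64 * 3 = 12288 values per picture, one column per picture
print(x_train_flatten.shape)  # (12288, 209)
print(x_test_flatten.shape)   # (12288, 50)
```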




Normalizing the picture data

We know that each pixel value lies between 0 and 255. To put all features on a common scale for the later processing, we divide by 255 so that the normalized data lies in [0, 1]

X_train = x_train_flatten / 255.0
X_test = x_test_flatten / 255.0
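
A quick way to see the effect of this division is to apply it to a few hand-picked pixel values. The values below are made up for illustration.

```python
import numpy as np

# Three sample pixel values stored as unsigned 8-bit integers, as in image data
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by a float converts the result to float64 and maps 0..255 onto 0..1
scaled = pixels / 255.0
print(scaled.dtype, scaled.min(), scaled.max())  # float64 0.0 1.0
```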

Building logistic regression with gradient descent

Sigmoid function

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s
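
As a sanity check, the sigmoid should return 0.5 at z = 0, approach 0 and 1 for large negative and positive inputs, and satisfy sigmoid(-z) = 1 - sigmoid(z). The sketch below repeats the definition so that it runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))  # 0.5

# Works element-wise on arrays; large |z| gives values close to 0 or 1
z = np.array([-10.0, 10.0])
print(sigmoid(z))

# Symmetry property: sigmoid(-z) == 1 - sigmoid(z)
print(np.allclose(sigmoid(-z), 1 - sigmoid(z)))
```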

Function to initialize w and b

def initialize_w_b(dim):
    w = np.zeros(shape=(dim,1))
    b = 0
    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int)) # Not strictly required, but the check keeps the code rigorous and catches mistakes early
    return (w, b)

Propagation function
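
For reference, the standard logistic-regression formulas that the propagation step computes can be written as follows. The per-example symbols $x^{(i)}$, $y^{(i)}$ and $a^{(i)}$ are notation assumed here rather than taken from the article; in the code, the activations $a^{(i)}$ are stored in the variable z, and $A$ denotes the row vector of all activations.

```latex
a^{(i)} = \sigma\left(w^{T} x^{(i)} + b\right), \qquad \sigma(t) = \frac{1}{1 + e^{-t}}

J(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right]

\frac{\partial J}{\partial w} = \frac{1}{m} X \left(A - Y\right)^{T}, \qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(a^{(i)} - y^{(i)}\right)
```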

def propagate(w, b, X, Y):
    m = X.shape[1]  # Number of examples
    # Forward propagation: z holds the sigmoid activations (predicted probabilities)
    z = sigmoid(np.dot(w.T, X) + b)
    # Cost function: cross-entropy averaged over the m examples
    cost = (-1 / m) * np.sum(Y * np.log(z) + (1 - Y) * np.log(1 - z))
    # Back propagation
    dw = (1 / m) * np.dot(X, (z - Y).T)
    db = (1 / m) * np.sum(z - Y)
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)  # Remove dimensions of size 1 from the shape (i.e. dimension reduction)
    assert(cost.shape == ())
    # Store dw and db in a dictionary
    grads = {"dw": dw, "db": db}
    return (grads, cost)
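
One way to gain confidence in gradient formulas like these is to compare them with finite differences. The sketch below is such a check, not part of the course code: cost_and_grads is a hypothetical stand-alone function using the averaged cross-entropy cost, and the small data set is made up.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Averaged cross-entropy cost and its analytic gradients."""
    m = X.shape[1]
    a = sigmoid(np.dot(w.T, X) + b)
    cost = (-1 / m) * np.sum(Y * np.log(a) + (1 - Y) * np.log(1 - a))
    dw = (1 / m) * np.dot(X, (a - Y).T)
    db = (1 / m) * np.sum(a - Y)
    return cost, dw, db

# Tiny made-up problem: 2 features, 3 examples
w = np.array([[0.1], [-0.2]])
b = 0.05
X = np.array([[1.0, 2.0, -1.0], [3.0, 4.0, -3.2]])
Y = np.array([[1, 0, 1]])

cost, dw, db = cost_and_grads(w, b, X, Y)

# Central finite differences, one parameter at a time
eps = 1e-5
num_dw = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    cost_plus = cost_and_grads(w + step, b, X, Y)[0]
    cost_minus = cost_and_grads(w - step, b, X, Y)[0]
    num_dw[j, 0] = (cost_plus - cost_minus) / (2 * eps)
num_db = (cost_and_grads(w, b + eps, X, Y)[0]
          - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)

# Both comparisons should agree to within a small tolerance
print(np.allclose(dw, num_dw, atol=1e-6), np.isclose(db, num_db, atol=1e-6))
```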

Optimization function

def optimizer(w, b, X, Y, num_iterations, learning_rate):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        dw = grads['dw']
        db = grads['db']
        # Update w, b
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(cost)  # Record the value of the cost function every 100 iterations
            print("Number of iterations: %i, error value: %f" % (i, cost))
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return (params, grads, costs)

In this way, we completed the gradient descent of simple logistic regression
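
To see the update rule at work without the cat data set, the same loop can be run on a tiny made-up problem. The sketch below is self-contained; the data, learning rate and iteration count are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Made-up, linearly separable toy data: 1 feature, 4 examples
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0, 0, 1, 1]])
m = X.shape[1]

w = np.zeros((1, 1))
b = 0.0
learning_rate = 0.1
costs = []
for i in range(200):
    a = sigmoid(np.dot(w.T, X) + b)
    cost = (-1 / m) * np.sum(Y * np.log(a) + (1 - Y) * np.log(1 - a))
    costs.append(cost)
    # Gradient descent step on w and b
    w = w - learning_rate * (1 / m) * np.dot(X, (a - Y).T)
    b = b - learning_rate * (1 / m) * np.sum(a - Y)

# With w = 0 and b = 0 every prediction is 0.5, so the first cost is log(2);
# the cost then decreases as w grows in the separating direction
print(costs[0], costs[-1])
```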

Prediction function

The model outputs a probability in (0, 1) rather than a label. Therefore, we threshold it at 0.5 to obtain a prediction of 0 or 1

def predict(w, b, X):
    m = X.shape[1]  # Number of pictures
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    # Compute the predicted probability that each picture contains a cat
    z = sigmoid(np.dot(w.T, X) + b)
    for i in range(z.shape[1]):
        # Convert the probability z[0, i] into the actual prediction Y_prediction[0, i]
        if z[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
    # Use an assertion to check the shape of Y_prediction
    assert(Y_prediction.shape == (1, m))
    return Y_prediction
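
The same thresholding can also be written without an explicit loop. The sketch below shows one possible vectorized variant; predict_vectorized is a hypothetical name, and the weights and inputs are made up.

```python
import numpy as np

def predict_vectorized(w, b, X):
    """Hypothetical loop-free variant: threshold the probabilities at 0.5."""
    w = w.reshape(X.shape[0], 1)
    probs = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    # The comparison yields booleans; astype(float) turns them into 1.0 / 0.0
    return (probs > 0.5).astype(float)

w = np.array([[1.0], [-1.0]])
b = 0.0
X = np.array([[2.0, 0.0, 1.0], [0.0, 3.0, 1.0]])

# Linear scores are 2, -3 and 0; a probability of exactly 0.5 maps to 0
print(predict_vectorized(w, b, X))  # [[1. 0. 0.]]
```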

We have now written all the functions we need. Finally, we write a function that integrates them

Integration function

def model(X_train, Y_train, X_test, Y_test, num_iterations, learning_rate):
    w , b = initialize_w_b(X_train.shape[0])
    parameters , grads , costs = optimizer(w , b , X_train , Y_train,num_iterations , learning_rate)
    #Retrieve parameters w and b from the dictionary parameters
    w , b = parameters["w"] , parameters["b"]
    #Predict on the test / training set
    Y_prediction_test = predict(w , b, X_test)
    Y_prediction_train = predict(w , b, X_train)
    #Print accuracy after training
    print("Training set accuracy:"  , format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100) ,"%")
    print("Test set accuracy:"  , format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100) ,"%")
    stored = {
            "costs" : costs,
            "Y_prediction_test" : Y_prediction_test,
            "Y_prediction_train" : Y_prediction_train,
            "w" : w,
            "b" : b,
            "learning_rate" : learning_rate,
            "num_iterations" : num_iterations }
    return stored
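
The accuracy expression used in the print statements works because predictions and labels are both 0 or 1, so the mean absolute difference is exactly the fraction of wrong predictions. A small sketch with made-up labels (accuracy_percent is a hypothetical helper):

```python
import numpy as np

def accuracy_percent(Y_prediction, Y):
    """Hypothetical helper: percentage of predictions that match the 0/1 labels."""
    return 100 - np.mean(np.abs(Y_prediction - Y)) * 100

Y_true = np.array([[1, 0, 1, 1]])
Y_pred = np.array([[1.0, 0.0, 0.0, 1.0]])  # one of four predictions is wrong

print(accuracy_percent(Y_pred, Y_true))  # 75.0
```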

Model training

d = model(X_train, y_train, X_test, y_test, num_iterations=2000, learning_rate=0.01)


View the gradient descent learning curve

costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()


I hope this article is helpful to everyone's study!

Posted by HaXoRL33T on Fri, 19 Nov 2021 16:36:49 -0800