TensorFlow: Making Your Own Dataset and Training a CNN on It

Keywords: network, Python, Session

After learning TensorFlow and training a simple MLP, an autoencoder, and a CNN on the MNIST dataset, I wondered whether I could make my own dataset and train a convolutional neural network on it. Looking around online, I found that the standard TFRecords format could be used. However, there were some problems: the TFRecords dataset itself was built without trouble, but an error was reported when running the training, and I could not find a relevant fix anywhere online. Eventually I found a way to solve it myself; if someone has a better way, feel free to share.

1. Preparing data

I prepared two categories of images, cat and dog, stored in the train_data folder on drive D, as follows:

[Figure: Data Set Storage.png]
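
Based on the paths used in make_own_data.py below, the expected layout is as follows (the image file names here are just examples):

D:/train_data/
    cat/
        cat0.jpg
        cat1.jpg
        ...
    dog/
        dog0.jpg
        dog1.jpg
        ...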

2. Making the TFRecords file

The code is saved as make_own_data.py.
Every image in a class folder receives the same integer label, taken from the class's position in the classes list below.

The code is as follows:

# -*- coding: utf-8 -*-
"""
@author: caokai
"""

import os
import tensorflow as tf
from PIL import Image

cwd = 'D:/train_data/'
classes = ['dog', 'cat']  # the two categories; a list keeps the label order deterministic (dog=0, cat=1)
writer = tf.python_io.TFRecordWriter("dog_and_cat_train.tfrecords")  # file to be generated

for index, name in enumerate(classes):
    class_path = cwd + name + '/'
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # full path of each picture

        img = Image.open(img_path)
        img = img.resize((128, 128))
        img_raw = img.tobytes()  # convert the picture to raw bytes
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))  # the Example object encapsulates the label and image data
        writer.write(example.SerializeToString())  # serialize to a string and write

writer.close()

In this way the dog pictures are labeled 0 and the cat pictures 1, and the result is stored as dog_and_cat_train.tfrecords. You will find this file in the folder where your Python code is located.
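
As a quick sanity check (my own addition, not part of the original flow), you can count the records that were written; the file name matches the writer above:

import tensorflow as tf

count = sum(1 for _ in tf.python_io.tf_record_iterator("dog_and_cat_train.tfrecords"))
print("records written:", count)  # should equal the total number of cat and dog images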

3. Reading the TFRecords file

Read the images and labels back out; each image is reshaped to 128x128x3.
The reader code goes in its own file, named ReadMyOwnData.py.

The code is as follows:

# -*- coding: utf-8 -*-
"""
tensorflow : read my own dataset
@author: caokai
"""

import tensorflow as tf

def read_and_decode(filename):  # read dog_and_cat_train.tfrecords
    filename_queue = tf.train.string_input_producer([filename])  # generate a filename queue

    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # return the file name and the serialized example
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })  # take out the image data and the label

    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [128, 128, 3])  # reshape into a 128*128 3-channel image
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # normalize pixel values to [-0.5, 0.5]
    label = tf.cast(features['label'], tf.int32)  # cast the label to int32
    return img, label
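
To make sure decoding works before wiring up the network, here is a minimal smoke test (my own addition); it pulls a single example out of the queue:

import tensorflow as tf
import ReadMyOwnData

img, label = ReadMyOwnData.read_and_decode("dog_and_cat_train.tfrecords")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    i, l = sess.run([img, label])
    print(i.shape, l)  # expect (128, 128, 3) and a label of 0 or 1
    coord.request_stop()
    coord.join(threads)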

4. Training with a Convolutional Neural Network

This part of the Python code is named dog_and_cat_train.py

4.1 Defining the convolutional neural network structure

We import ReadMyOwnData, the reader from the previous step. Weights are initialized with tf.truncated_normal; the network applies two convolutions with ReLU activations, each followed by 4x4 max pooling, and then a fully connected layer; finally y_conv is the softmax output over the two classes. The loss is the cross-entropy, minimized with the Adam optimizer. Since each 4x4 pooling shrinks the spatial size by a factor of 4, the 128x128 input becomes 32x32 after the first pooling and 8x8 after the second, so the flattened dimension entering the fully connected layer is 8*8*64 = 4096.

The convolution code is as follows:

# -*- coding: utf-8 -*-
"""
@author: caokai
"""
import tensorflow as tf
import numpy as np
import ReadMyOwnData

batch_size = 50

#initial weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

#initial bias
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

#convolution layer
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

#max_pool layer
def max_pool_4x4(x):
    return tf.nn.max_pool(x, ksize=[1,4,4,1], strides=[1,4,4,1], padding='SAME')

x = tf.placeholder(tf.float32, [batch_size,128,128,3])
y_ = tf.placeholder(tf.float32, [batch_size,2])  # one-hot labels for the two classes

#first convolution and max_pool layer
W_conv1 = weight_variable([5,5,3,32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
h_pool1 = max_pool_4x4(h_conv1)   # 128x128 -> 32x32

#second convolution and max_pool layer
W_conv2 = weight_variable([5,5,32,64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_4x4(h_conv2)   # 32x32 -> 8x8

#flatten and process with a fully connected layer (an MLP)
reshape = tf.reshape(h_pool2, [batch_size, -1])
dim = reshape.get_shape()[1].value   # 8*8*64 = 4096
W_fc1 = weight_variable([dim, 1024])
b_fc1 = bias_variable([1024])
h_fc1 = tf.nn.relu(tf.matmul(reshape, W_fc1) + b_fc1)

#dropout
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_fc2 = weight_variable([1024,2])
b_fc2 = bias_variable([2])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

#loss function and optimization algorithm
#clip y_conv so that log(0) cannot produce NaNs
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y_conv, 1e-10, 1.0)),
                   reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
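
As an aside (not what this article uses), a more numerically stable formulation keeps the raw logits and lets TensorFlow combine softmax and cross-entropy in a single op:

# alternative loss: fused softmax + cross-entropy on the logits
logits = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))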

4.2 Training

Read data from your own dataset and initialize the session for training:

image, label = ReadMyOwnData.read_and_decode("dog_and_cat_train.tfrecords")
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)  # required, or string_input_producer blocks forever

When feeding data during training, I tried the approaches from other articles online and ran into the following error:

Cannot feed value of shape (128, 128, 3) for Tensor u'Placeholder_12:0', which has shape '(50, 128, 128, 3)'

Or the following error:

TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.

The first error comes from feeding a single 128x128x3 image into a placeholder that expects a whole batch of 50; the second comes from putting the tf.Tensor returned by read_and_decode into feed_dict directly, instead of a numpy array obtained from sess.run. After some trial and error, the following version passed (it fills numpy arrays one example at a time and one-hot encodes each label):

example = np.zeros((batch_size,128,128,3))
l = np.zeros((batch_size,2))

try:
    for i in range(20):
        l.fill(0)  # reset the one-hot labels for this batch
        for epoch in range(batch_size):
            img_val, lbl_val = sess.run([image, label])  # pull one decoded image and label out of the queue
            example[epoch] = img_val
            l[epoch][lbl_val] = 1  # one-hot encode the integer label
        train_step.run(feed_dict={x: example, y_: l, keep_prob: 0.5})
    print(accuracy.eval(feed_dict={x: example, y_: l, keep_prob: 1.0}))  # evaluate on the last batch, with dropout disabled

except tf.errors.OutOfRangeError:
    print('done!')
finally:
    coord.request_stop()
coord.join(threads)

If you have a better or more efficient way to read TFRecords datasets and train a CNN on them, feel free to share.
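
For what it's worth, one such more efficient approach (a sketch, untested against this exact setup) is to let TensorFlow build whole shuffled batches itself with tf.train.shuffle_batch, instead of filling numpy arrays one example at a time:

import tensorflow as tf
import ReadMyOwnData

img, label = ReadMyOwnData.read_and_decode("dog_and_cat_train.tfrecords")
# keep a buffer of decoded examples and dequeue whole shuffled batches
img_batch, label_batch = tf.train.shuffle_batch(
    [img, label], batch_size=50, capacity=2000, min_after_dequeue=1000)
label_batch = tf.one_hot(label_batch, depth=2)  # match the y_ placeholder shape [batch_size, 2]

The numpy arrays for feed_dict would then come from a single sess.run([img_batch, label_batch]) per training step.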

Reference resources:
TensorFlow (2) Making its own TFRecord data set read, display and code details
