Custom data: TensorFlow 2.0 data preprocessing, Keras model building, TensorFlow multi-GPU training

Starting with TensorFlow 2.0, Keras is integrated into every aspect of the workflow, including model building and model training, and tf.keras is the encouraged way to build models. For my company's project, I use TensorFlow itself for data preprocessing rather than Keras's image generator, because one of the preprocessing operations I need is not available there, and tf.data offers more data processing options. For the model side, everything is imported through tf.keras.

For getting started with TensorFlow 2.0, see this article on the relationship between Keras and TensorFlow:
Starting from TensorFlow 2.0, should you use keras or tf.keras?

The whole process is divided into several steps:
1. Data preprocessing
2. Model generation
3. Training (optimizer choice and other hyperparameter settings, which from 2.0 onward also involves running on multiple GPUs)

1. Data preprocessing: the main APIs used are:
tf.io.read_file(filename) (this reads any file, not just images, so be careful about errors caused by reading non-image files)
the many preprocessing methods in tf.image (see the reference article for details)
tf.constant to build tensors, for example to generate the labels
tf.reshape(total_label, (-1, 1)) to keep the dimensions consistent
train_dataset = tf.data.Dataset.from_tensor_slices((total_train, total_label))
train_dataset = train_dataset.map(_image_process)
train_dataset = train_dataset.shuffle(buffer_size=40000)
train_dataset = train_dataset.batch(batch_size)

About tf.data.Dataset.from_tensor_slices: reference link

The preprocessing part mainly covers: the preprocessing operations themselves, label generation, combining labels with data, consuming the combined dataset, generating the validation set, and the use of shuffle and batch size.
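To make the from_tensor_slices behaviour concrete before the full code, here is a minimal self-contained sketch (the filenames and labels are made up): it slices the input tensors along their first dimension into (filename, label) pairs.

import tensorflow as tf

files = tf.constant(['a.jpg', 'b.jpg', 'c.jpg'])   # hypothetical filenames
labels = tf.constant([[0], [1], [0]], dtype=tf.int32)

# Each dataset element is one (filename, label) pair.
ds = tf.data.Dataset.from_tensor_slices((files, labels))
for f, l in ds:
    print(f.numpy(), l.numpy())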

import os
import tensorflow as tf

def _image_process(filename, label):

    image_string = tf.io.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    # Resize a 640x360 source to 126x224 (keeping the aspect ratio), then normalize to [0, 1]
    image_resized = tf.image.resize(image_decoded, [int(360*224/640), 224], method=tf.image.ResizeMethod.BILINEAR) / 255.0
    # Pad to 224x224 with a 49-pixel vertical offset
    image_padding = tf.image.pad_to_bounding_box(image_resized, offset_height=49, offset_width=0, target_height=224, target_width=224)
    image_rotated = tf.image.rot90(image_padding)
    image_horizontal = tf.image.flip_left_right(image_rotated)
    image_vertical = tf.image.flip_up_down(image_horizontal)
    image_brightness = tf.image.random_brightness(image_vertical, max_delta=0.4)
    # No tf.expand_dims here: the batch dimension is added later by dataset.batch()

    return image_brightness, label

def data_generator(train_dir, valid_dir, batch_size, test_batch_size):

    train_covers_dir = train_dir + '/covers/'
    train_nocovers_dir = train_dir + '/nocovers/'
    valid_covers_dir  = valid_dir + '/covers/'
    valid_nocovers_dir = valid_dir + '/nocovers/'

    train_covers_filename = tf.constant([train_covers_dir + filename for filename in os.listdir(train_covers_dir) if filename.endswith('.jpg')])
    train_nocovers_filename = tf.constant([train_nocovers_dir + filename for filename in os.listdir(train_nocovers_dir) if filename.endswith('.jpg')])
    total_train = tf.concat([train_covers_filename,train_nocovers_filename],axis=-1)
    # Label covers as 0 and nocovers as 1
    total_label = tf.concat([tf.zeros(train_covers_filename.shape, dtype=tf.int32),tf.ones(train_nocovers_filename.shape,dtype=tf.int32)],axis=-1)
    total_label = tf.reshape(total_label,(-1, 1)) # Keep the dimensions consistent
    train_dataset = tf.data.Dataset.from_tensor_slices((total_train, total_label))
    train_dataset = train_dataset.map(_image_process)
    train_dataset = train_dataset.shuffle(buffer_size=40000)
    train_dataset = train_dataset.batch(batch_size)


    valid_covers_filename = tf.constant([valid_covers_dir + filename for filename in os.listdir(valid_covers_dir) if filename.endswith('.jpg')])
    valid_nocovers_filename = tf.constant([valid_nocovers_dir + filename for filename in os.listdir(valid_nocovers_dir) if filename.endswith('.jpg')])
    total_valid = tf.concat([valid_covers_filename,valid_nocovers_filename],axis=-1)
    total_valid_label = tf.concat([tf.zeros(valid_covers_filename.shape, dtype=tf.int32),tf.ones(valid_nocovers_filename.shape,dtype=tf.int32)],axis=-1)
    total_valid_label = tf.reshape(total_valid_label,(-1, 1))
    valid_dataset = tf.data.Dataset.from_tensor_slices((total_valid, total_valid_label))
    valid_dataset = valid_dataset.map(_image_process)
    valid_dataset = valid_dataset.shuffle(buffer_size=10000)
    valid_dataset = valid_dataset.batch(test_batch_size)

    return train_dataset, valid_dataset, total_train.shape[0], total_valid.shape[0]
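A quick usage sketch of data_generator, assuming train/ and valid/ directories that each contain covers/ and nocovers/ subfolders (the paths and batch sizes are hypothetical):

train_dataset, valid_dataset, train_len, valid_len = data_generator(
    'data/train', 'data/valid', batch_size=32, test_batch_size=32)
print('train samples:', train_len, 'valid samples:', valid_len)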

2. Model generation:

-Pay attention to the shape of input_tensor, which directly determines the size of the inputs you can feed the model at test time.
-tf.keras is used here to load a pretrained model, extract the output of a chosen layer, attach a new head on top of it, and finally build the model (see the backbone-loading sketch after the snippet below).
-The Model class comes from: from tensorflow.keras import Model

    # base_model is assumed to be a pretrained backbone loaded earlier
    from tensorflow.keras import Model
    from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense

    x = base_model.get_layer('top_activation').output
    x = GlobalAveragePooling2D()(x)
    x = Dropout(0.5)(x)

    predictions = Dense(units=1, activation='sigmoid')(x)

    model = Model(inputs=base_model.input, outputs=predictions)
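The snippet above assumes base_model has already been loaded; the original post does not show that step. The layer name 'top_activation' matches the tf.keras EfficientNet implementations, so here is a minimal, hedged sketch using an assumed EfficientNetB0 backbone (available in tf.keras.applications from TF 2.3):

import tensorflow as tf

# Hypothetical backbone setup, for illustration only; the original post
# does not show how base_model and input_tensor are created.
input_tensor = tf.keras.Input(shape=(224, 224, 3))
base_model = tf.keras.applications.EfficientNetB0(
    include_top=False, weights='imagenet', input_tensor=input_tensor)
# base_model.get_layer('top_activation') is the final activation of this backbone.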

– For the training part (choosing the model's optimizer, choosing the model's loss function, and how to run multi-GPU training), note that the old multi_gpu_model style of writing it is no longer effective for multi-GPU training from 2.0 onward, so the following new method (tf.distribute.MirroredStrategy) is used:

from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

# Log training curves for TensorBoard
logging = TensorBoard(log_dir=log_dir)

# Save the model whenever val_loss improves (period=1 checks every epoch; renamed save_freq in newer TF)
best_val_loss_callback = ModelCheckpoint(save_path_weight, monitor='val_loss', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', period=1)

# Cut the learning rate by 10x when val_loss plateaus for 10 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=1)

# Stop training when val_loss stops improving for 10 epochs
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)

if args.gpu_num > 1:
    # MirroredStrategy replicates the model on each GPU and syncs gradients
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        model = models.init_model(args.Backbone, input_tensor)
        model.compile(optimizer=Optimizers.init_optimizers(), loss=tf.keras.losses.binary_crossentropy, metrics=['accuracy'])
else:
    print('Planning to use a single device (CPU or one GPU)')
    model = models.init_model(args.Backbone, input_tensor)
    model.compile(optimizer=Optimizers.init_optimizers(), loss=tf.keras.losses.binary_crossentropy, metrics=['accuracy'])
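One detail worth knowing about MirroredStrategy: the batch size fed to the dataset is the global batch size, which the strategy splits across replicas. A small sketch of scaling a per-GPU batch size (the value 32 is illustrative, not from the original post):

strategy = tf.distribute.MirroredStrategy()
print('replicas in sync:', strategy.num_replicas_in_sync)

# Scale a per-GPU batch size up to the global batch size used by dataset.batch()
per_replica_batch_size = 32  # hypothetical value
global_batch_size = per_replica_batch_size * strategy.num_replicas_in_sync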


model.fit(
    train_generator,
    epochs=args.max_epoch,
    steps_per_epoch=max(1, train_len // train_batch_size),
    validation_data=valid_generator,
    workers=5,
    validation_steps=max(1, valid_len // test_batch_size),
    callbacks=[logging, best_val_loss_callback, reduce_lr, early_stopping]
)

model.save(save_path_weight)
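Tying back to the earlier note on input_tensor: the saved model only accepts images with the shape it was trained on, so test images must pass through the same preprocessing. A minimal inference sketch, with a hypothetical saved-model path and test image:

import tensorflow as tf

model = tf.keras.models.load_model('saved_model.h5')  # hypothetical path

# Reuse the training-time preprocessing so the test image ends up 224x224
image, _ = _image_process(tf.constant('test.jpg'), tf.constant([0]))
pred = model.predict(tf.expand_dims(image, 0))  # add the batch dimension
print('P(nocover):', float(pred[0][0]))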