Continuing from the previous article: https://blog.csdn.net/SPESEG/article/details/103875916
Use the Douyin (TikTok) video data to test our videos and see how it works.
Strategy: randomly select 20-40 frames, stack all of them, pick the frames by index, feed the whole stack into an InceptionV3 model, and then into the classification model. There is nothing technically difficult here.
Treating the output probability as a similarity score can be understood this way, but I don't think it will work very well.
Considering the time cost, any fast way to get the total frame count of a video would be welcome.
First, 30 frames are selected at random: the total number of frames is obtained with the ffmpeg library, and then the corresponding frames are read by index with cv2 (or another decoder).
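A minimal sketch of this step. Note that I shell out to ffprobe (part of the ffmpeg suite) instead of using a Python ffmpeg binding, and the helper names are mine, not from the original code:

```python
import random
import subprocess

import cv2


def total_frames(path):
    """Ask ffprobe for the frame count instead of decoding the whole video."""
    out = subprocess.check_output([
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-count_frames", "-show_entries", "stream=nb_read_frames",
        "-of", "default=nokey=1:noprint_wrappers=1", path,
    ])
    return int(out.strip())


def sample_frames(path, num=30):
    """Randomly pick `num` frame indices and decode only those frames with cv2."""
    n = total_frames(path)
    idx = sorted(random.sample(range(n), min(num, n)))
    cap = cv2.VideoCapture(path)
    frames = []
    for i in idx:
        cap.set(cv2.CAP_PROP_POS_FRAMES, i)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames  # a list of H x W x 3 BGR arrays, ready to stack and feed to InceptionV3
```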
1 - Feature similarity first
2 - Then a binary classification on the high-level features
(Both steps start from one feature vector per video; a sketch of that shared step follows below.)
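A sketch of how such a per-video vector could be produced, assuming the standard Keras InceptionV3 with 'avg' pooling and the sample_frames helper sketched above; averaging the per-frame vectors into one video vector is my assumption, not something stated in the original:

```python
import cv2
import numpy as np
from keras.applications.inception_v3 import InceptionV3, preprocess_input

# InceptionV3 without the classifier head; 'avg' pooling gives a 2048-dim vector per frame.
backbone = InceptionV3(include_top=False, weights="imagenet", pooling="avg")


def video_feature(frames):
    """Turn a list of BGR frames (from sample_frames) into one 2048-dim video vector."""
    batch = np.stack([
        cv2.resize(cv2.cvtColor(f, cv2.COLOR_BGR2RGB), (299, 299)) for f in frames
    ]).astype("float32")
    feats = backbone.predict(preprocess_input(batch))  # (num_frames, 2048)
    return feats.mean(axis=0)                          # assumption: average over frames
```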
1.1 I basically extracted the avg-pool feature. To calculate similarity, or rather a distance measure, you need the concept of a tree here. Yes, the tree you meet in data structures.
But there are many kinds of such trees. For now we can try the basic k-d tree. The point of the tree structure is to turn linear search complexity into O(log n). sklearn already has a ready-made API for it, so it takes no effort at all.
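As a concrete example, a sketch with sklearn's ready-made API, using random data as a stand-in for the per-video features:

```python
import numpy as np
from sklearn.neighbors import KDTree

# Stand-in for the per-video features (e.g. 2048-dim InceptionV3 avg-pool vectors).
features = np.random.rand(12000, 2048).astype("float32")

tree = KDTree(features, leaf_size=40)              # built once up front
dist, ind = tree.query(features[410:411], k=100)   # 100 nearest neighbours of one video
print(ind[0][:10], dist[0][:10])
```

Note that in very high dimensions a k-d tree degrades towards brute-force search, which is one more reason not to expect too much from it here.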
But the GitHub project I found was packaged with Docker, which I know nothing about, so it is not convenient to pull out the parts I want, even though it is also written in Python.
Many projects are structured like this, which really slows things down: to understand the code you have to work through every script.
I wanted to test on MNIST but couldn't get it done with that k-d tree project. I have to complain, it was just too painful.
The similarity search is done with KNN; the results were mentioned in that blog post. I can only say that the first 50 neighbours may still be correct, and beyond that probably not, or maybe only the first 30.
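To put a number on "the first 30 or 50 may still be correct", a simple precision@k check over the returned neighbours can be used (hypothetical helper; `labels` is assumed to hold the ground-truth class of every video and `ind` comes from the tree query sketched above):

```python
import numpy as np


def precision_at_k(query_label, neighbour_ids, labels, k):
    """Fraction of the first k neighbours that share the query's label."""
    neigh_labels = labels[np.asarray(neighbour_ids[:k])]
    return float(np.mean(neigh_labels == query_label))


# Compare, e.g., precision@30 vs precision@50 vs precision@100 for one query:
# for k in (30, 50, 100):
#     print(k, precision_at_k(labels[410], ind[0], labels, k))
```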
1.2 The feature similarity of 30 random frames (InceptionV3 features) from our company's videos was already mentioned in that article, but no actual queries were run there; from the t-SNE plot produced there we already know the effect will not be very good.
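The t-SNE picture referred to there can be reproduced roughly as follows (a sketch, assuming `features` and `labels` are the per-video feature matrix and class array from above):

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Project the 2048-dim features to 2-D purely for a visual sanity check.
emb = TSNE(n_components=2, perplexity=30, init="pca", random_state=0).fit_transform(features)

plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=3, cmap="tab10")
plt.colorbar()
plt.savefig("tsne_inceptionv3_features.png", dpi=150)
```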
The query results are shown below; they actually don't look too bad [note: labels 0, 1 and 2 count as one category, 3 as the other]. Of course there are also bad results, such as the following:
```
# bad case
query data id 410, label 1
result: {10754, 1540, 7684, 6151, 3080, 5129, 1554, 3093, 1558, 22, 3614, 8737, 8226, 11301, 3625, 6701, 9264, 7223, 573, 62, 3138, 77, 10318, 79, 1106, 595, 2134, 6232, 92, 11360, 1126, 3688, 3189, 5755, 7294, 4740, 3214, 1177, 11941, 1190, 3752, 6648, 7340, 6841, 5313, 196, 3783, 7369, 6865, 9432, 5861, 9966, 11510, 3845, 5415, 8495, 9519, 6963, 3388, 1865, 1867, 332, 3919, 7509, 347, 6493, 8035, 5489, 10610, 3441, 3955, 2933, 381, 7550, 6549, 410, 932, 9128, 3496, 10666, 427, 3500, 429, 1456, 1969, 6579, 7607, 955, 446, 2499, 9172, 7125, 985, 6106, 2019, 8677, 10218, 7661, 7150, 7160}
results' label: [3 2 3 3 3 3 2 3 2 0 3 3 3 3 3 3 3 3 1 0 3 0 3 0 2 1 3 3 0 3 2 3 3 3 3 3 3 2 3 2 3 3 3 3 3 0 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 1 3 3 1 3 3 3 3 3 3 3 1 3 3 1 1 3 3 3 1 3 1 2 2 3 3 1 1 3 3 3 2 3 2 3 3 3 3 3]

query data id 1989, label 2
result: {6658, 7184, 8720, 3091, 1563, 31, 544, 10278, 1575, 1576, 5673, 11307, 2091, 2095, 561, 1589, 5688, 1081, 8762, 5696, 67, 1604, 4166, 3661, 8781, 1617, 594, 8278, 2144, 7778, 10342, 1127, 10857, 118, 10360, 4234, 2188, 11405, 1678, 1691, 1186, 7342, 8879, 7351, 9915, 5822, 7881, 3804, 221, 733, 8926, 1250, 1769, 7404, 11503, 6899, 1278, 1797, 2312, 6409, 6413, 6925, 11030, 1302, 2849, 291, 804, 8486, 7476, 6456, 10563, 327, 340, 9049, 1898, 10630, 4487, 2958, 398, 2449, 1433, 1435, 9627, 926, 1441, 1955, 422, 1453, 941, 1458, 11187, 4026, 1989, 5080, 1497, 2008, 4063, 10217, 8172, 7164}
results' label: [3 3 3 3 2 0 1 3 2 2 3 3 2 3 1 2 3 2 3 3 0 2 3 3 3 2 1 3 3 3 3 2 3 0 3 3 3 3 2 2 2 3 3 3 3 3 3 3 0 1 3 2 2 3 3 3 2 2 3 3 3 3 3 2 3 1 1 3 3 3 3 1 1 3 2 3 3 3 1 3 2 2 3 1 2 2 1 2 1 2 3 3 2 3 2 2 3 3 3 3]

query data id 861, label 1
result: {1, 10246, 10256, 7699, 5665, 9771, 3629, 4653, 5681, 54, 11840, 7238, 2649, 9819, 98, 4724, 8823, 8839, 7310, 8336, 2718, 9896, 6824, 1192, 9395, 7860, 8896, 3780, 10960, 4309, 8418, 9443, 2792, 8426, 5869, 753, 2290, 3324, 9209, 3837, 2310, 7946, 267, 6416, 11025, 6937, 8988, 6432, 5414, 6447, 8506, 316, 9533, 6975, 10053, 8519, 6475, 4429, 2894, 10063, 6480, 5464, 861, 10594, 1380, 5996, 9582, 11120, 7027, 2941, 11138, 4996, 7046, 11655, 2450, 10649, 1949, 3998, 9629, 10657, 5031, 10151, 11178, 6066, 8115, 9153, 3010, 7630, 9166, 4565, 8155, 4064, 7654, 7656, 9194, 11248, 7154, 7667, 7161, 4605}
results' label: [0 3 3 3 3 3 3 3 3 0 3 3 3 3 0 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 1 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 3 3 1 3 2 3 3 3 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3]

query data id 677, label 1
result: {5640, 1033, 3087, 4627, 8224, 10787, 10280, 9258, 3627, 11820, 3630, 10800, 4156, 8765, 4157, 6716, 10812, 8770, 10818, 11335, 1095, 11849, 4168, 6220, 10831, 6738, 95, 96, 8290, 8294, 6248, 2664, 5226, 10861, 9336, 1667, 1670, 8864, 677, 2728, 691, 6845, 11970, 1218, 6862, 3793, 4306, 1749, 1751, 10968, 11487, 5855, 737, 3310, 3314, 3319, 276, 11542, 9495, 6424, 3360, 6436, 8998, 4922, 1343, 2377, 333, 11086, 4438, 1885, 7012, 8565, 7546, 902, 7049, 6548, 3989, 3988, 2979, 1444, 2986, 5554, 435, 1973, 11707, 6590, 10181, 10183, 8137, 8144, 11217, 10704, 2007, 8160, 8161, 9696, 1510, 8167, 11251, 5622}
results' label: [3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 3 3 3 3 3 0 0 3 3 3 3 3 3 3 2 2 3 1 3 1 3 3 2 3 3 3 2 2 3 3 3 1 3 3 3 1 3 3 3 3 3 3 3 2 3 1 3 3 2 3 3 3 1 3 3 3 3 3 2 3 3 1 2 3 3 3 3 3 3 3 3 2 3 3 3 2 3 3 3]
```
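A quick way to read these bad cases is to count the label distribution of the returned neighbours (a hypothetical helper, not from the original code):

```python
from collections import Counter


def neighbour_label_histogram(neighbour_ids, labels):
    """Count how often each class appears among the returned neighbours."""
    return Counter(int(labels[i]) for i in neighbour_ids)


# For the first bad case above (query id 410, true label 1), the histogram is
# dominated by class 3, which is exactly what the raw dump shows.
```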
2 Regarding the binary classification, let's retest with our own data.
The structure of the model is very simple: no complex layers, just the basics, with three Dense layers, which is also the most basic setup in recommendation models.
```
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 128)               262272
_________________________________________________________________
activation_1 (Activation)    (None, 128)               0
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 32)                4128
_________________________________________________________________
dropout_2 (Dropout)          (None, 32)                0
_________________________________________________________________
batch_normalization_1 (Batch (None, 32)                128
_________________________________________________________________
dense_3 (Dense)              (None, 2)                 66
_________________________________________________________________
activation_2 (Activation)    (None, 2)                 0
=================================================================
Total params: 266,594
Trainable params: 266,530
Non-trainable params: 64
```
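A minimal Keras sketch that reproduces this summary (the 2048-dim input is implied by the 262,272 parameters of dense_1; the activations, dropout rates and optimizer are my assumptions, not taken from the original code):

```python
from keras.models import Sequential
from keras.layers import Activation, BatchNormalization, Dense, Dropout

model = Sequential([
    Dense(128, input_dim=2048),   # 2048-dim InceptionV3 avg-pool feature as input
    Activation("relu"),
    Dropout(0.5),
    Dense(32),
    Dropout(0.5),
    BatchNormalization(),
    Dense(2),
    Activation("softmax"),        # two-class output
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()                   # 266,594 params in total, matching the table above
```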
Because our company's data is not cleanly labelled (for example, some videos labelled 1 are not really typical of category 1; they may only be borderline cases, yet they were still put into category 1), the binary classification model generally reaches an accuracy above 75%, with 80% at best. The best model, dense5_14_0.9838.hdf5, is only 6 MB, and it would be even smaller compressed, so a recommendation-style network really is much lighter than an image network.
Model: "sequential_1" ACC=0.8 _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense_1 (Dense) (None, 256) 524544 _________________________________________________________________ activation_1 (Activation) (None, 256) 0 _________________________________________________________________ dropout_1 (Dropout) (None, 256) 0 _________________________________________________________________ dense_2 (Dense) (None, 32) 8224 _________________________________________________________________ dropout_2 (Dropout) (None, 32) 0 _________________________________________________________________ batch_normalization_1 (Batch (None, 32) 128 _________________________________________________________________ dense_3 (Dense) (None, 2) 66 _________________________________________________________________ activation_2 (Activation) (None, 2) 0 ================================================================= Total params: 532,962 Trainable params: 532,898 Non-trainable params: 64
Next steps:
Go through our videos one by one, remove the unsuitable ones, and then extract features again for training.
In addition, you are welcome to join the QQ group to discuss related problems; there is no WeChat group.
QQ group: 868373192
Speech / Image / Video Deep Learning group