non_max_suppression code analysis

NMS here is performed simply according to confidence. def non_max_suppression(boxes, conf_thres=0.5, nms_thres=0.3): detection = boxes # 1. Find the boxes in the image whose score is greater than the confidence threshold. The number of overlapping boxes can be greatly reduced by filtering on score before screen ...
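The excerpt above is cut off, so the following is only a minimal sketch of confidence-based NMS, not the post's actual code. It keeps the signature shown in the excerpt, but the box layout `(x1, y1, x2, y2, score)`, the helper `iou`, and the sample boxes are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes laid out as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, conf_thres=0.5, nms_thres=0.3):
    # 1. Keep only boxes whose score exceeds the confidence threshold
    #    (assumed layout: score stored at index 4).
    detection = [b for b in boxes if b[4] > conf_thres]
    # 2. Sort the survivors by score, highest first.
    detection.sort(key=lambda b: b[4], reverse=True)
    keep = []
    # 3. Repeatedly take the best box and drop every remaining box
    #    that overlaps it by more than nms_thres.
    while detection:
        best = detection.pop(0)
        keep.append(best)
        detection = [b for b in detection if iou(best, b) <= nms_thres]
    return keep


# Hypothetical sample data: two heavily overlapping boxes, one separate box,
# and one low-confidence box.
sample = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 10, 10, 0.8),
    (20, 20, 30, 30, 0.7),
    (0, 0, 10, 10, 0.3),
]
kept = non_max_suppression(sample)
```

With this sample, the low-confidence box is removed in step 1 and the second box is suppressed because it overlaps the highest-scoring box, so two boxes remain.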

Posted by dloeppky on Mon, 08 Nov 2021 03:17:38 -0800

Principles and code analysis of ViT (Vision Transformer): the most thorough walkthrough on the web

Today, let's take a closer look at Vision Transformer, using the timm-based code. 1. Patch Embedding: the Transformer was originally designed for NLP, so ViT's first task is to convert an image into a word-like sequence of tokens. The method adopted here is to divide the image into small patches, as shown in the lower left corner of the figure above. Eac ...
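The patch-splitting step described above can be sketched with plain nested lists. This is an illustration under stated assumptions, not the timm code: the function name `split_into_patches` and the 4x4 single-channel toy image are hypothetical, and the learned projection that maps each flattened patch to an embedding vector is omitted.

```python
def split_into_patches(image, patch_size):
    """Split a 2-D image (a list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    # Walk the image in row-major order, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch_size x patch_size block into one "word".
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical 4x4 image whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = split_into_patches(image, patch_size=2)
```

Here the 4x4 image becomes a sequence of four tokens of four values each, which plays the role that a sequence of word vectors plays in NLP.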

Posted by warran on Fri, 29 Oct 2021 10:38:15 -0700

Second week of CV transformer

6.2.1 Dataset introduction: The data used in this OCR experiment is based on Task 4.3: Word Recognition from the ICDAR 2015 Incidental Scene Text challenge. This is a word recognition task; some images were removed to reduce the difficulty of this experiment. The dataset contains many text regions from natural scene images. The training set in the original data co ...

Posted by ManWithNoName on Sun, 24 Oct 2021 08:41:35 -0700