Fundamentals of digital image processing (OpenCV)

Keywords: Python OpenCV Computer Vision

OpenCV

Window, mouse and keyboard operation

Comprehensive example

Generate a 512 * 512 pure black canvas, create a window, register a mouse callback on the window, and use the window to display the image

import cv2
import numpy as np

# Generate a 512 * 512 pure black canvas
canvas = np.zeros(shape=(512, 512, 3),
                  dtype=np.uint8)

# create a window
cv2.namedWindow(winname='draw circle')

# Write mouse events and draw circles for the canvas
def onMouse(event, x, y, flags, param):
    """Double click the mouse with the left button: draw a circle with the mouse position as the center and the radius"""
    if event == cv2.EVENT_LBUTTONDBLCLK:
        img, radius, thickness = param
        thickness = -1 if thickness <= 0 else thickness
        cv2.circle(img=img,
                   center=(x, y),
                   radius=radius,
                   color=(0, 0, 255),
                   thickness=thickness)  # -1 solid circle

img = canvas.copy()
radius = 100
thickness = -1
# Add mouse events to windows
cv2.setMouseCallback('draw circle', onMouse, param=[img, radius, thickness])

# Display canvas through window
while 1:
    cv2.imshow(winname='draw circle', mat=img)  # Refresh the window each iteration so newly drawn circles appear
    if cv2.waitKey(20) & 0xFF == 27:  # Exit when Esc is pressed
        cv2.destroyWindow(winname='draw circle')
        break

cv2.destroyAllWindows()
  1. Create window cv2.namedWindow(winname, flags=None)

    • notes:

      """
      namedWindow(winname[, flags]) -> None
      .   @brief Creates a window.
      ...
      .   @param winname Name of the window in the window caption that may be used as a window identifier.
      .   @param flags Flags of the window. The supported flags are: (cv::WindowFlags)
      """
      
    • Function: create a window with the specified name

    • Parameters:

      winname: str - window name
      flags - window flags (cv2.WindowFlags)
  2. Destroy window cv2.destroyWindow(winname)

    • notes:

      """
      destroyWindow(winname) -> None
      .   @brief Destroys the specified window.
      .   
      .   The function destroyWindow destroys the window with the given name.
      .   
      .   @param winname Name of the window to be destroyed.
      """
      
    • Function: destroy the window with the specified name

    • Parameters:

      winname: str - window name
  3. Destroy all windows cv2.destroyAllWindows()

    • notes:

      """
      destroyAllWindows() -> None
      .   @brief Destroys all of the HighGUI windows.
      .   
      .   The function destroyAllWindows destroys all of the opened HighGUI windows.
      """
      
    • Function: destroy all open windows

    • Parameter: None

  4. Mouse callback event cv2.setMouseCallback()

    • notes:

      1. """ setMouseCallback(windowName, onMouse [, param]) -> None """
        
    • Function: add mouse events for a specified window

    • Parameters:

      windowName: str - window name
      onMouse: Function(event, x, y, flags, param) - mouse event callback function
    1. Write the mouse callback function onMouse(event, x, y, flags, param)

      def onMouse(event, x, y, flags, param):
          if event == cv2.EVENT_...:  # Judge mouse trigger events
              pass  # Perform the desired action
      
    2. View mouse event

      import cv2
      
      events = [i for i in dir(cv2) if 'EVENT' in i]
      print(events)
      
      ['EVENT_FLAG_ALTKEY',
      'EVENT_FLAG_CTRLKEY', 
      'EVENT_FLAG_LBUTTON', 
      'EVENT_FLAG_MBUTTON', 
      'EVENT_FLAG_RBUTTON', 
      'EVENT_FLAG_SHIFTKEY', 
      'EVENT_LBUTTONDBLCLK',  # Left button double-click
      'EVENT_LBUTTONDOWN',  # Left button down
      'EVENT_LBUTTONUP',  # Left button up
      'EVENT_MBUTTONDBLCLK', 
      'EVENT_MBUTTONDOWN', 
      'EVENT_MBUTTONUP', 
      'EVENT_MOUSEHWHEEL', 
      'EVENT_MOUSEMOVE', 
      'EVENT_MOUSEWHEEL', 
      'EVENT_RBUTTONDBLCLK',  # Right button double-click
      'EVENT_RBUTTONDOWN',  # Right button down
      'EVENT_RBUTTONUP']  # Right button up
      
  5. Keyboard input cv2.waitKey(delay=None) -> int
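
    A minimal usage sketch (assuming an open window): waitKey returns the pressed key code, or -1 if no key was pressed within the delay; the 0xFF mask keeps only the low 8 bits:

      key = cv2.waitKey(delay=20) & 0xFF  # wait up to 20 ms for a key press
      if key == 27:  # 27 is the ASCII code of Esc
          cv2.destroyAllWindows()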

Image reading, writing and display

Comprehensive example

import cv2
# 1. Read image
lena = cv2.imread(filename=r'F:\wuxin_convenient\Pictures\CV\data-img\lena.jpeg', flags=cv2.IMREAD_COLOR)  # Read color image
lena_gray = cv2.imread(filename=r'F:\wuxin_convenient\Pictures\CV\data-img\lena.jpeg', flags=cv2.IMREAD_GRAYSCALE)  # Read gray image

# 2. Display image
cv2.imshow(winname='img', mat=lena)
cv2.imshow(winname='img_gray', mat=lena_gray)
cv2.waitKey()
cv2.destroyAllWindows()

# 3. Write image
cv2.imwrite(filename='data/lena.png', img=lena)
cv2.imwrite(filename='data/lena_gray.png', img=lena_gray)
  1. Image reading cv2.imread(filename, flags=None) -> numpy.ndarray

    • Function: read image data from the specified path using the given read mode

    • Parameters:

      filename: str - image file path
      flags: Union[int, str] - image reading mode:
          1 = cv2.IMREAD_COLOR - BGR color image
          0 = cv2.IMREAD_GRAYSCALE - grayscale image
    1. View image reading mode

      import cv2
      
      flags=[i for i in dir(cv2) if 'IMREAD' in i]
      print(flags)
      
      ['IMREAD_ANYCOLOR', 
       'IMREAD_ANYDEPTH', 
       'IMREAD_COLOR', 
       'IMREAD_GRAYSCALE', 
       'IMREAD_IGNORE_ORIENTATION', 
       'IMREAD_LOAD_GDAL', 
       'IMREAD_REDUCED_COLOR_2', 
       'IMREAD_REDUCED_COLOR_4', 
       'IMREAD_REDUCED_COLOR_8', 
       'IMREAD_REDUCED_GRAYSCALE_2', 
       'IMREAD_REDUCED_GRAYSCALE_4', 
       'IMREAD_REDUCED_GRAYSCALE_8', 
       'IMREAD_UNCHANGED']
      

  2. Image write cv2.imwrite(filename, img, params=None)

    • Function: write image data (memory -> disk)

    • Parameters:

      filename: str - image file path
      img: numpy.ndarray - image data
  3. Image display cv2.imshow(winname, mat)

    • The function imshow displays an image in the specified window

    • Parameters:

      winname: str - window name
      mat: numpy.ndarray - image data

Image color operation

Comprehensive example

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('data/lena.jpeg')

# 1. Color space transformation
img_gray = cv2.cvtColor(src=img,
                        code=cv2.COLOR_BGR2GRAY)  # Image graying (BGR -> gray)
img_hsv = cv2.cvtColor(src=img,
                       code=cv2.COLOR_BGR2HSV)  # BGR -> HSV

# 2. BGR image channel suppression (work on copies so img itself stays intact)
img_b0 = img.copy(); img_b0[:, :, 0] = 0  # Blue channel suppression
img_g0 = img.copy(); img_g0[:, :, 1] = 0  # Green channel suppression
img_r0 = img.copy(); img_r0[:, :, 2] = 0  # Red channel suppression

# 3. HSV image channel adjustment (channel order is H, S, V)
img_hsv[:, :, 0] = ((img_hsv[:, :, 0].astype(np.int16) + 10) % 180).astype(np.uint8)  # Hue (H) offset; OpenCV stores H in [0, 180)
img_hsv[:, :, 1] = np.clip(img_hsv[:, :, 1] * 1.2, 0, 255).astype(np.uint8)  # Saturation (S) enhancement
img_hsv[:, :, 2] = np.clip(img_hsv[:, :, 2].astype(np.int16) + 10, 0, 255).astype(np.uint8)  # Brightness (V) enhancement

# 4. Threshold processing
t, img_binary = cv2.threshold(src=img,
                              thresh=127, maxval=255,
                              type=cv2.THRESH_BINARY)  # Binary thresholding of the BGR color image
t, img_binary_inv = cv2.threshold(src=img,
                                  thresh=127, maxval=255,
                                  type=cv2.THRESH_BINARY_INV)  # Inverse binary thresholding of the BGR color image
t, img_gray_binary = cv2.threshold(src=img_gray,
                                   thresh=127, maxval=255,
                                   type=cv2.THRESH_BINARY)  # Binary thresholding of the gray image
t, img_gray_binary_inv = cv2.threshold(src=img_gray,
                                       thresh=127, maxval=255,
                                       type=cv2.THRESH_BINARY_INV)  # Inverse binary thresholding of the gray image


# 5. Histogram equalization (the brightness channel in HSV is V, i.e. channel 2)
plt.hist(img_hsv[:, :, 2].ravel(), bins=256, range=[0, 255])  # Brightness histogram before equalization
img_hsv[:, :, 2] = cv2.equalizeHist(img_hsv[:, :, 2])  # Histogram equalization of the brightness (V) channel
plt.hist(img_hsv[:, :, 2].ravel(), bins=256, range=[0, 255])  # Brightness histogram after equalization
  1. Color space transformation cv2.cvtColor(src, code, dst=None, dstCn=None) -> ndarray

    • notes:

      """
      cvtColor(src, code[, dst[, dstCn]]) -> dst
      .   @brief Converts an image from one color space to another.
      ...   
      .   @param src input image: 8-bit unsigned, 16-bit unsigned ( CV_16UC... ), or single-precision
      .   floating-point.
      .   @param dst output image of the same size and depth as src.
      .   @param code color space conversion code (see #ColorConversionCodes).
      .   @param dstCn number of channels in the destination image; if the parameter is 0, the number of the
      .   channels is derived automatically from src and code.
      .   
      .   @see @ref imgproc_color_conversions
      """
      
    • Function: color space conversion

    • Parameters:

      src: ndarray - original image data
      code: cv2.COLOR_... - color space conversion code
      dst: ndarray - target image
      dstCn - number of channels of the destination image (0 = derived automatically)
    1. View color space conversion encoding

      import cv2
      
      # codes = [i for i in dir(cv2) if i.startswith('COLOR_')]
      codes = [i for i in dir(cv2) if i.startswith('COLOR_') and i.count('_') < 2 and len(i)<15]
      print(codes)
      
      ['COLOR_BGR2BGRA', 'COLOR_BGR2GRAY',  # BGR->gray
       'COLOR_BGR2HLS', 'COLOR_BGR2HSV',  # BGR->HSV
       'COLOR_BGR2LAB', 'COLOR_BGR2LUV', 'COLOR_BGR2Lab', 'COLOR_BGR2Luv', 'COLOR_BGR2RGB', 'COLOR_BGR2RGBA', 'COLOR_BGR2XYZ', 'COLOR_BGR2YUV', 'COLOR_BGRA2BGR', 'COLOR_BGRA2RGB', 'COLOR_GRAY2BGR',  # gray->BGR
       'COLOR_GRAY2RGB', 'COLOR_HLS2BGR', 'COLOR_HLS2RGB', 'COLOR_HSV2BGR',  # HSV->BGR
       'COLOR_HSV2RGB', 'COLOR_LAB2BGR', 'COLOR_LAB2LBGR', 'COLOR_LAB2LRGB', 'COLOR_LAB2RGB', 'COLOR_LBGR2LAB', 'COLOR_LBGR2LUV', 'COLOR_LBGR2Lab', 'COLOR_LBGR2Luv', 'COLOR_LRGB2LAB', 'COLOR_LRGB2LUV', 'COLOR_LRGB2Lab', 'COLOR_LRGB2Luv', 'COLOR_LUV2BGR', 'COLOR_LUV2LBGR', 'COLOR_LUV2LRGB', 'COLOR_LUV2RGB', 'COLOR_Lab2BGR', 'COLOR_Lab2LBGR', 'COLOR_Lab2LRGB', 'COLOR_Lab2RGB', 'COLOR_Luv2BGR', 'COLOR_Luv2LBGR', 'COLOR_Luv2LRGB', 'COLOR_Luv2RGB', 'COLOR_RGB2BGR', 'COLOR_RGB2BGRA', 'COLOR_RGB2GRAY', 'COLOR_RGB2HLS', 'COLOR_RGB2HSV', 'COLOR_RGB2LAB', 'COLOR_RGB2LUV', 'COLOR_RGB2Lab', 'COLOR_RGB2Luv', 'COLOR_RGB2RGBA', 'COLOR_RGB2XYZ', 'COLOR_RGB2YUV', 'COLOR_RGBA2BGR', 'COLOR_RGBA2RGB', 'COLOR_XYZ2BGR', 'COLOR_XYZ2RGB', 'COLOR_YUV2BGR', 'COLOR_YUV2RGB']
      
    2. Image graying

      import cv2
      
      img = cv2.imread('data/lena.jpeg')
      dst = cv2.cvtColor(src=img, 
                         code=cv2.COLOR_BGR2GRAY)
      

      Several ways of graying an image:

      1. Component method: take the brightness of one of the three components of the color image as the gray value (three gray images are possible; choose one according to the application).
      2. Maximum method: take the maximum of the three brightness components as the gray value.
      3. Mean method: take the average of the three brightness components as the gray value.
      4. Weighted average method: weight the three components by importance or other criteria and average them.
        1. Human-eye weighting: the eye is most sensitive to green and least sensitive to blue; weighting the BGR components accordingly gives $f(i,j)=0.11B(i,j)+0.59G(i,j)+0.30R(i,j)$ (see the sketch below).
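
      As a check, the weighted-average method can be written out by hand and compared with cv2.cvtColor (a sketch; cvtColor's own weights are 0.114/0.587/0.299, so small numerical differences are expected):

      import cv2
      import numpy as np

      img = cv2.imread('data/lena.jpeg')
      b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

      # Weighted-average graying: f = 0.11*B + 0.59*G + 0.30*R
      gray_manual = (0.11 * b + 0.59 * g + 0.30 * r).astype(np.uint8)
      gray_cv = cv2.cvtColor(img, code=cv2.COLOR_BGR2GRAY)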
  2. BGR image channel suppression

    1. Blue channel suppression img_BGR[:,:,0] = 0
    2. Green channel suppression img_BGR[:,:,1] = 0
    3. Red channel suppression img_BGR[:,:,2] = 0
  3. HSV image channel adjustment

    1. Hue channel adjustment img_HSV[:,:,0] += offset
    2. Saturation channel adjustment img_HSV[:,:,1] *= gain
    3. Brightness (value) channel adjustment img_HSV[:,:,2] += offset
  4. Threshold processing cv2.threshold(src, thresh, maxval, type, dst=None) -> retval, ndarray

    • notes:

      """
      threshold(src, thresh, maxval, type[, dst]) -> retval, dst
      .   @brief Applies a fixed-level threshold to each array element.
      ...
      .   @param src input array (multiple-channel, 8-bit or 32-bit floating point).
      .   @param dst output array of the same size  and type and the same number of channels as src.
      .   @param thresh threshold value.
      .   @param maxval maximum value to use with the #THRESH_BINARY and #THRESH_BINARY_INV thresholding
      .   types.
      .   @param type thresholding type (see #ThresholdTypes).
      .   @return the computed threshold value if Otsu's or Triangle methods used.
      .   
      .   @sa  adaptiveThreshold, findContours, compare, min, max
      """
      
    • @brief: applies a fixed level threshold to each array element.

      1. Binary thresholding: pixels with gray value > threshold t are set to maxval; pixels with gray value <= t are set to 0.
      2. Inverse binary thresholding: the opposite of binary thresholding; pixels with gray value > t are set to 0, and pixels with gray value <= t are set to maxval.
    • @param:

      src: ndarray - original image
      thresh: int - threshold t
      maxval: int - maximum gray value
      type: cv2.THRESH_... - thresholding type
    1. Viewing threshold processing types

      import cv2
      
      threshold_types = [i for i in dir(cv2) if i.startswith('THRESH')]
      print(threshold_types)
      
      ['THRESH_BINARY',  # Binarization threshold processing
       'THRESH_BINARY_INV',  # Inverse binarization threshold processing
       'THRESH_MASK', 'THRESH_OTSU', 'THRESH_TOZERO', 'THRESH_TOZERO_INV', 'THRESH_TRIANGLE', 'THRESH_TRUNC']
      
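    2. Otsu threshold selection (a sketch, not in the original notes): THRESH_OTSU from the list above lets OpenCV compute the threshold from the histogram of a gray image; the chosen threshold is returned as t:

      import cv2

      img_gray = cv2.imread('data/lena.jpeg', cv2.IMREAD_GRAYSCALE)

      # thresh=0 is ignored when THRESH_OTSU is set
      t, img_otsu = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
      print(t)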

  5. Histogram equalization cv2.equalizeHist(src, dst=None) -> ndarray
    Gray-level histogram: the frequency of each gray level among the pixels of an image (or one channel). Plotted with gray level on the horizontal axis and frequency on the vertical axis, it describes the relationship between the two; it is an important image feature that reflects the gray distribution of the image.
    $$
    \begin{aligned}
    &\text{For an } M \times N \text{ digital image with gray levels in } [0, L-1] \text{, the histogram is the discrete function } h(r_k)=n_k \\
    &\text{where:} \\
    & \qquad r_k \text{ is the } k\text{-th gray level} \\
    & \qquad n_k \text{ is the number of pixels in the image whose gray level is } r_k \\
    &\text{Normalizing the histogram by the total number of pixels: } p(r_k) = \frac{n_k}{MN}
    \end{aligned}
    $$
    Histogram equalization: an image transformation method based on probability theory. It considers the gray-value distribution of the whole image and remaps the gray value of every pixel so that the gray levels are spread evenly. It effectively enhances images that are too dark or too bright with unclear detail, giving the image high contrast and a wide range of gray tones.

    • notes:

      """
          equalizeHist(src[, dst]) -> dst
          .   @brief Equalizes the histogram of a grayscale image.
      ...
          .   @param src Source 8-bit single channel image. Single channel image
          .   @param dst Destination image of the same size and type as src .
          """
      
    • @brief: equalize the histogram of gray image.

    • @param:

      src: ndarray - original image (single-channel)
      dst: ndarray like src - target image
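
    A complete round trip for color images (a sketch, assuming the lena image from earlier): equalize only the brightness (V) channel in HSV space, then convert back to BGR for display:

      import cv2

      img = cv2.imread('data/lena.jpeg')
      img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

      # Equalize only the V (brightness) channel so hue and saturation are untouched
      img_hsv[:, :, 2] = cv2.equalizeHist(img_hsv[:, :, 2])
      img_eq = cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR)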

Image geometric transformation

Straightness / parallelism

  1. Straightness: after an affine transformation, straight lines in the image remain straight lines.
  2. Parallelism: after an affine transformation, parallel lines remain parallel lines.

Comprehensive example

import cv2
import numpy as np

img = cv2.imread('data/lena.jpeg')

# 1. Mirror (flip)
img_flip0 = cv2.flip(img, 0)  # Flip around the x-axis
img_flip1 = cv2.flip(img, 1)  # Flip around the y-axis
img_flip_minus = cv2.flip(img, -1)  # Flip around both axes

# 2. Affine transformation
translated_img = translate(src=img, x=10, y=20)  # Custom translation transform (defined below)
rotated_img = rotate(src=img, angle=90)  # Custom rotation transform (defined below)

# 3. Perspective transformation (see the perspective() helper defined below)

# 4. Scaling (note: dsize is (width, height))
h, w = img.shape[:2]
equScaleDown_dst = cv2.resize(src=img, dsize=(w // 2, h // 2))  # Proportional shrink
nonEquScaleDown_dst = cv2.resize(src=img, dsize=(w - 50, h - 50))  # Non-proportional shrink
equScaleUp_dst = cv2.resize(src=img, dsize=(w * 2, h * 2))  # Proportional enlarge
nonEquScaleUp_dst = cv2.resize(src=img, dsize=(w + 50, h + 50))  # Non-proportional enlarge

# 5. Cropping
centerCrop_dst = center_crop(im=img, w=50, h=50)  # Custom center crop (defined below)
randomCrop_dst = random_crop(im=img, w=50, h=50)  # Custom random crop (defined below)
  1. Mirror (flip) cv2.flip(src, flipCode, dst=None) -> ndarray

    • notes:

      """
          flip(src, flipCode[, dst]) -> dst
          .   @brief Flips a 2D array around vertical, horizontal, or both axes.
      ...
          .   @param src input array.
          .   @param dst output array of the same size and type as src.
          .   @param flipCode a flag to specify how to flip the array; 0 means
          .   flipping around the x-axis and positive value (for example, 1) means
          .   flipping around y-axis. Negative value (for example, -1) means flipping
          .   around both axes.
          .   @sa transpose , repeat , completeSymm
          """
      
    • @brief: flips a 2D array around the vertical axis, the horizontal axis, or both axes.

    • @param:

      src: ndarray - original image
      flipCode: Union[-1, 0, 1] - flip axis:
          -1 : flip around both the x-axis and the y-axis
           0 : flip around the x-axis
           1 : flip around the y-axis
      dst: ndarray like src - target image
  2. Affine transformation cv2.warpAffine(src, M, dsize, dst=None, flags=None, borderMode=None, borderValue=None)
    Affine transformation: a series of geometric transformations (translation, rotation, etc.) applied to the image that preserve its straightness and parallelism.
    In computer graphics, homogeneous coordinates are introduced so that translation, rotation and scaling can all be represented uniformly as matrices. (A 2x2 matrix cannot describe translation; only the 3x3 matrix form can describe two-dimensional translation, rotation and scaling uniformly. Likewise, a 4x4 matrix is needed to describe three-dimensional transformations uniformly.)
    At the application level, an affine transformation is determined by three fixed vertices.
    (Figure: the three fixed vertices of an affine transformation)

    • notes:

      """
          warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) -> dst
          .   @brief Applies an affine transformation to an image.
      ...
          .   @param src input image.
          .   @param dst output image that has the size dsize and the same type as src .
          .   @param M \f$2\times 3\f$ transformation matrix.
          .   @param dsize size of the output image.
      ...  
          .   @sa  warpPerspective, resize, remap, getRectSubPix, transform
          """
      
    • @brief: apply affine transformation to the image.

    • @param:

      src: ndarrayOriginal image
      M: ndarrayTransformation matrix
      dsize: Union[Tuple, List]The size of the output image
      dst ndarraytarget image
      flags
      borderMode
      borderValue
    1. Fixed three vertex affine transformation

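      A minimal sketch (the three point pairs below are arbitrary illustration values): cv2.getAffineTransform computes the 2x3 affine matrix from three source points and their target positions, and cv2.warpAffine applies it:

      import cv2
      import numpy as np

      img = cv2.imread('data/lena.jpeg')
      h, w = img.shape[:2]

      # Three reference points in the source image and their positions after the transform
      src_points = np.float32([[0, 0], [w - 1, 0], [0, h - 1]])
      dst_points = np.float32([[0, h * 0.3], [w * 0.8, h * 0.2], [w * 0.15, h * 0.7]])

      M = cv2.getAffineTransform(src_points, dst_points)
      affine_dst = cv2.warpAffine(img, M, dsize=(w, h))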

    2. Translation transformation

      def translate(src, x: int, y: int) -> np.ndarray:
          """
          Translate an image by building a translation matrix
          :param src: input image
          :param x: horizontal translation distance
          :param y: vertical translation distance
          :return: translated image
          """
          h, w = src.shape[:2]  # Height and width of the original image (shape = (height, width, channels))
          M = np.float32([[1, 0, x],
                          [0, 1, y]])  # Translation matrix
          # Affine transformation
          shifted_dst = cv2.warpAffine(src=src,
                                       M=M,
                                       dsize=(w, h))  # Note: (width, height)
          return shifted_dst
      
      1. Calculate translation matrix

        • Two-dimensional translation matrix
          (Figure: two-dimensional translation)
          $$
          \begin{aligned}
          &\text{Point } P \text{ is translated by } t_x, t_y \text{ in the } x \text{ and } y \text{ directions to point } P' \text{:} \\
          & \qquad x'=x+t_x \\
          & \qquad y'=y+t_y \\
          &\text{In matrix form:} \\
          & \qquad \begin{bmatrix} x' \\ y' \end{bmatrix} =
          \begin{bmatrix}
          1 & 0 & t_x \\
          0 & 1 & t_y \\
          \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \\
          &\text{Introducing homogeneous coordinates:} \\
          & \qquad \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} =
          \begin{bmatrix}
          1 & 0 & t_x \\
          0 & 1 & t_y \\
          0 & 0 & 1 \\
          \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
          \end{aligned}
          $$

    3. Rotation transformation

      def rotate(src, angle, center=None, scale=1.0):
          """
          Image rotation transformation
          :param src: original image
          :param angle: rotation angle in degrees
          :param center: center of rotation
          :param scale: scale factor
          :return: the rotated image
          """
          h, w = src.shape[:2]  # Height and width of the original image

          # Rotation center (if center is None, default to the image center)
          if center is None:
              center = (w / 2, h / 2)

          # Compute the rotation matrix
          M = cv2.getRotationMatrix2D(center, angle, scale)

          # Affine transformation using the rotation matrix
          rotated_dst = cv2.warpAffine(src, M, (w, h))
          return rotated_dst
      
      1. Calculate the rotation matrix cv2.getRotationMatrix2D(center, angle, scale)

        • notes:

          """
              getRotationMatrix2D(center, angle, scale) -> retval
              .   @brief Calculates an affine matrix of 2D rotation.
          ...
              .   
              .   The transformation maps the rotation center to itself. If this is not the target, adjust the shift.
              .   
              .   @param center Center of the rotation in the source image.
              .   @param angle Rotation angle in degrees. Positive values mean counter-clockwise rotation (the
              .   coordinate origin is assumed to be the top-left corner).
              .   @param scale Isotropic scale factor.
              .   
              .   @sa  getAffineTransform, warpAffine, transform
              """
          
        • Two-dimensional rotation principle: a 2D rotation is performed around a point.

          1. 2D rotation around the origin
            (Figure: two-dimensional rotation around the origin)
            $$
            \begin{aligned}
            &\text{Point } v \text{ rotates by the angle } \theta \text{ around the origin to give point } v'. \\
            &\text{Let } v=(x,y) \text{, let } r \text{ be the distance from the origin to } v \text{, and let } \phi \text{ be the angle between } v \text{ and the } x \text{ axis.} \\
            &\text{Deriving the coordinates } (x', y') \text{ of } v' \text{:} \\
            & \qquad x=r\cos\phi, \quad y=r\sin\phi \\
            & \qquad x'=r\cos(\theta+\phi), \quad y'=r\sin(\theta+\phi) \\
            &\text{Expanding with the angle-sum identities:} \\
            & \qquad x'=r\cos\theta\cos\phi - r\sin\theta\sin\phi \\
            & \qquad y'=r\sin\theta\cos\phi + r\cos\theta\sin\phi \\
            &\text{Substituting the expressions for } x \text{ and } y \text{:} \\
            & \qquad x'=x\cos\theta - y\sin\theta \\
            & \qquad y'=x\sin\theta + y\cos\theta \\
            &\text{In matrix form:} \\
            & \qquad \begin{bmatrix} x' \\ y' \end{bmatrix} =
            \begin{bmatrix}
            \cos\theta & -\sin\theta \\
            \sin\theta & \cos\theta \\
            \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
            \end{aligned}
            $$

          2. 2D rotation around any point

            Idea:
                1. First translate the rotation point to the origin
                2. Perform the rotation around the origin
                3. Translate the rotation point back to its original position

            (Figure: two-dimensional rotation around an arbitrary point)
            $$
            \begin{aligned}
            &\text{To rotate around an arbitrary point, two translations are required.} \\
            &\text{Let } T(t_x,t_y) \text{ be the translation from the origin to the rotation point,} \\
            &T(-t_x,-t_y) \text{ the translation back, and } R(\theta) \text{ the rotation matrix.} \\
            &\text{Describing points as column vectors, the whole computation is:} \\
            & \qquad v' = T(t_x,t_y)\,R(\theta)\,T(-t_x,-t_y)\,v = Mv \\
            &\text{The resulting matrix } M \text{ is:} \\
            & \qquad M =
            \begin{bmatrix}
            1 & 0 & t_x \\
            0 & 1 & t_y \\
            0 & 0 & 1 \\
            \end{bmatrix}
            \begin{bmatrix}
            \cos\theta & -\sin\theta & 0 \\
            \sin\theta & \cos\theta & 0 \\
            0 & 0 & 1 \\
            \end{bmatrix}
            \begin{bmatrix}
            1 & 0 & -t_x \\
            0 & 1 & -t_y \\
            0 & 0 & 1 \\
            \end{bmatrix}
            \end{aligned}
            $$
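
            As a numerical check (a sketch, not from the original notes), the matrix returned by cv2.getRotationMatrix2D can be reproduced by composing the two translations with the rotation. OpenCV's positive angle is counter-clockwise with the origin at the top-left (y pointing down), so the sine terms appear transposed relative to the y-up derivation above:

            import cv2
            import numpy as np

            center, angle, scale = (100.0, 50.0), 30.0, 1.0
            M_cv = cv2.getRotationMatrix2D(center, angle, scale)

            # Compose T(t_x, t_y) * R(theta) * T(-t_x, -t_y) by hand
            theta = np.deg2rad(angle)
            alpha, beta = scale * np.cos(theta), scale * np.sin(theta)
            T_fwd = np.array([[1, 0, center[0]], [0, 1, center[1]], [0, 0, 1]])
            R = np.array([[alpha, beta, 0],
                          [-beta, alpha, 0],
                          [0, 0, 1]])
            T_back = np.array([[1, 0, -center[0]], [0, 1, -center[1]], [0, 0, 1]])
            M_manual = (T_fwd @ R @ T_back)[:2]
            print(np.allclose(M_cv, M_manual))  # True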

  3. Perspective transformation cv2.warpPerspective(src, M, dsize, dst=None, flags=None, borderMode=None, borderValue=None)
    A perspective transformation maps an image based on four fixed vertices.
    (Figure: the four fixed vertices of a perspective transformation)

    • notes:

      """
          warpPerspective(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) -> dst
          .   @brief Applies a perspective transformation to an image.
      ...
          .   @param src input image.
          .   @param dst output image that has the size dsize and the same type as src .
          .   @param M \f$3\times 3\f$ transformation matrix.
          .   @param dsize size of the output image.
      ...  
          .   @sa  warpAffine, resize, remap, getRectSubPix, perspectiveTransform
          """
      
    • @brief: apply perspective transformation to the image.

    • @param:

      src: ndarray - original image
      M: ndarray - 3x3 transformation matrix
      dsize: Union[Tuple, List] - size of the output image
    1. Fixed four vertex perspective transform

      def perspective(src, src_points, dst_points):
          # Compute the perspective transformation matrix from four point pairs (float32 arrays)
          M = cv2.getPerspectiveTransform(src=src_points,
                                          dst=dst_points)
          h, w = src.shape[:2]
          perspective_dst = cv2.warpPerspective(src=src,
                                                M=M,
                                                dsize=(w, h))
          return perspective_dst
      
      1. Calculate the perspective matrix cv2.getPerspectiveTransform(src, dst, solveMethod=None)

        • notes:

          """
              getPerspectiveTransform(src, dst[, solveMethod]) -> retval
              .   @brief Calculates a perspective transform from four pairs of the corresponding points.
          ...
              .   @param src Coordinates of quadrangle vertices in the source image.
              .   @param dst Coordinates of the corresponding quadrangle vertices in the destination image.
              .   @param solveMethod method passed to cv::solve (#DecompTypes)
              .   
              .   @sa  findHomography, warpPerspective, perspectiveTransform
              """
          
  4. Scaling cv2.resize(src, dsize, dst=None, fx=None, fy=None, interpolation=None)

    • notes:

      """
          resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) -> dst
          .   @brief Resizes an image.
         
          .   @param src input image.
          .   @param dst output image; it has the size dsize (when it is non-zero) or the size computed from
          .   src.size(), fx, and fy; the type of dst is the same as of src.
          .   @param dsize output image size; if it equals zero, it is computed as:...
          .   @param interpolation interpolation method, see #InterpolationFlags
          .   
          .   @sa  warpAffine, warpPerspective, remap
          """
      
      1. (non) equal scale reduction

      2. (non) equal scale magnification

        1. View interpolation method

          import cv2
          
          interpolations = [i for i in dir(cv2) if i.startswith('INTER_')]
          print(interpolations)
          
          ['INTER_AREA',  # Area-based resampling (preferred for shrinking)
           'INTER_BITS', 'INTER_BITS2',
           'INTER_CUBIC',  # Bicubic interpolation
           'INTER_LANCZOS4',
           'INTER_LINEAR',  # Bilinear interpolation (default)
           'INTER_LINEAR_EXACT', 'INTER_MAX',
           'INTER_NEAREST',  # Nearest-neighbor interpolation
           'INTER_NEAREST_EXACT', 'INTER_TAB_SIZE', 'INTER_TAB_SIZE2']
          
          1. Bilinear interpolation: the four nearest neighbors are used to estimate the gray level at a given position. Let $(x, y)$ be the position to estimate; the interpolated value has the form $v(x,y)=ax+by+cxy+d$, where the four coefficients $a, b, c, d$ are determined from the four equations written at the four nearest neighbors of $(x, y)$.
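
          A short sketch of choosing interpolation flags when resizing (following the usual OpenCV guidance: INTER_AREA for shrinking, INTER_LINEAR or INTER_CUBIC for enlarging):

          import cv2

          img = cv2.imread('data/lena.jpeg')
          h, w = img.shape[:2]

          # Shrinking: INTER_AREA usually gives the cleanest result
          small = cv2.resize(img, (w // 2, h // 2), interpolation=cv2.INTER_AREA)

          # Enlarging: INTER_CUBIC is slower but sharper than INTER_LINEAR
          big = cv2.resize(img, (w * 2, h * 2), interpolation=cv2.INTER_CUBIC)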

  5. Cropping (array slicing)

    1. Center crop

      def center_crop(im, w, h):
          """Crop a w x h window from the center of the image"""
          # Image center position (shape = (height, width, channels))
          center_y, center_x = im.shape[0] // 2, im.shape[1] // 2

          # Top-left corner of the crop window
          start_x = center_x - w // 2
          start_y = center_y - h // 2
          new_img = im[start_y:start_y + h, start_x:start_x + w, :]
          return new_img
      
    2. Random crop

      def random_crop(im, w, h):
          """Crop a w x h window at a random position"""
          # Random top-left corner; subtract w and h so the window stays inside the image
          start_x = np.random.randint(0, im.shape[1] - w)
          start_y = np.random.randint(0, im.shape[0] - h)

          new_img = im[start_y:start_y + h, start_x:start_x + w, :]
          # im[row, column, channel] = im[height, width, channel]; with im[h, w] all channels are taken by default

          return new_img
      

Image arithmetic calculation

Comprehensive example

import cv2


  1. Image addition (array addition), sketched below
    • Uses:
      1. Watermark overlay
      2. Denoising (averaging over multiple frames)
  2. Image subtraction (array subtraction), sketched below
    • Uses:
      1. Finding differences between images
      2. Background elimination and motion-trajectory detection across consecutive frames
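
A minimal sketch of both operations (the second image path is a placeholder; any image of the same size works). cv2.add and cv2.subtract saturate at 255 and 0 instead of wrapping around like plain numpy uint8 arithmetic, and cv2.addWeighted is the usual tool for watermark overlay:

import cv2

img1 = cv2.imread('data/lena.jpeg')
img2 = cv2.imread('data/watermark.png')  # placeholder path; must match img1's size

added = cv2.add(img1, img2)  # saturated addition
overlaid = cv2.addWeighted(img1, 0.8, img2, 0.2, 0)  # weighted addition (watermark overlay)
diff = cv2.subtract(img1, img2)  # saturated subtraction (image differencing)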

Erosion and dilation (morphological filtering)

Image dilation and erosion are two basic morphological operations, which are mainly used to find the maximum region and minimum region in the image.

import cv2
import numpy as np

img = cv2.imread('data/lena.jpeg')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
t, img_binary = cv2.threshold(img_gray, 127, 255, type=cv2.THRESH_BINARY)

# Define convolution kernel (structuring element)
kernel = np.ones((5, 5), np.uint8)

# 1. Image erosion
eroded_img_binary = cv2.erode(src=img_binary,
                              kernel=kernel,
                              iterations=5)
eroded2_img_binary = cv2.morphologyEx(src=img_binary,
                                      op=cv2.MORPH_ERODE,
                                      kernel=kernel,
                                      iterations=5)

# 2. Image dilation
dilated_img_binary = cv2.dilate(src=img_binary,
                                kernel=kernel,
                                iterations=5)
dilated2_img_binary = cv2.morphologyEx(src=img_binary,
                                       op=cv2.MORPH_DILATE,
                                       kernel=kernel,
                                       iterations=5)

# 3. Image opening operation (segmentation: removes external noise)
open_img_binary = cv2.dilate(src=eroded_img_binary,
                             kernel=kernel,
                             iterations=5)  # erode first, then dilate
open2_img_binary = cv2.morphologyEx(src=img_binary,
                                    op=cv2.MORPH_OPEN,
                                    kernel=kernel,
                                    iterations=5)

# 4. Image closing operation (connectivity: removes internal noise)
close_img_binary = cv2.erode(src=dilated_img_binary,
                             kernel=kernel,
                             iterations=5)  # dilate first, then erode
close2_img_binary = cv2.morphologyEx(src=img_binary,
                                     op=cv2.MORPH_CLOSE,
                                     kernel=kernel,
                                     iterations=5)

# 5. Morphological gradient (edge of the foreground)
gradient_img_binary = dilated_img_binary - eroded_img_binary

# 6. Top hat operation (edge noise)
tophat_img_binary = img_binary - open_img_binary

# 7. Black hat operation (internal noise)
blackhat_img_binary = close_img_binary - img_binary
  1. Generalized morphological operation cv2.morphologyEx(src, op, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
    Description: a single function integrating the advanced morphological transformations

    • notes:

      """
          morphologyEx(src, op, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst
          .   @brief Performs advanced morphological transformations.
      ...
          .   @param src Source image. The number of channels can be arbitrary. The depth should be one of
          .   CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
          .   @param dst Destination image of the same size and type as source image.
          .   @param op Type of a morphological operation, see #MorphTypes
          .   @param kernel Structuring element. It can be created using #getStructuringElement.
          .   @param anchor Anchor position with the kernel. Negative values mean that the anchor is at the
          .   kernel center.
          .   @param iterations Number of times erosion and dilation are applied.
          .   @param borderType Pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @param borderValue Border value in case of a constant border. The default value has a special
          .   meaning.
          .   @sa  dilate, erode, getStructuringElement
          .   @note The number of iterations is the number of times erosion or dilatation operation will be applied.
          .   For instance, an opening operation (#MORPH_OPEN) with two iterations is equivalent to apply
          .   successively: erode -> erode -> dilate -> dilate (and not erode -> dilate -> erode -> dilate).
          """
      
    • @brief: perform advanced morphological transformation.

    • @param:

      src: ndarray - source image
      op: cv2.MORPH_... - morphological operation type
      kernel: ndarray - convolution kernel (structuring element)
      anchor: Tuple - anchor position; the default (-1, -1) means the kernel center
      iterations: int - number of iterations; default 1
    1. View morphological operation types

      import cv2
      
      morphs = [i for i in dir(cv2) if i.startswith('MORPH_')]
      print(morphs)
      
      ['MORPH_BLACKHAT',  # Black hat operation
       'MORPH_CLOSE',  # Closing operation
       'MORPH_CROSS', 'MORPH_DILATE',  # Dilation operation
       'MORPH_ELLIPSE', 'MORPH_ERODE',  # Erosion operation
       'MORPH_GRADIENT',  # Morphological gradient
       'MORPH_HITMISS', 'MORPH_OPEN',  # Opening operation
       'MORPH_RECT', 'MORPH_TOPHAT'  # Top hat operation
      ]
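
    The docstring above refers to getStructuringElement; a short sketch of building kernels of the shapes listed (MORPH_RECT, MORPH_ELLIPSE, MORPH_CROSS) instead of a plain np.ones kernel:

      import cv2

      rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
      ellipse_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
      cross_kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))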
      
  2. Image erosion cv2.erode(src, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
    Description: shrinks and thins the highlighted (white) region of a binary image; the result is smaller than the highlighted region of the original, i.e. the highlight is eroded away. (Color and gray images can also be eroded, but the effect is less obvious.)
    Purpose: "shrink" or "thin" the foreground (highlighted region) of a binary image, for edge denoising and element segmentation.
    Principle:
    $$
    \begin{aligned}
    &\text{Erosion operator } \ominus \text{; erosion is defined as:} \\
    & \qquad A \ominus B = \{x \mid B_x \subseteq A\} \\
    &\text{i.e. image } A \text{ is eroded by template } B \text{: the template is convolved over } A \text{, the minimum} \\
    &\text{pixel value within the region covered by } B \text{ is taken, and it replaces the value at the anchor point.}
    \end{aligned}
    $$
    Erosion process: the template (convolution kernel) scans the original image pixel by pixel with its center point; among the pixels covered by the kernel, if all values are 1 the output pixel is 1, otherwise 0 (0 is dark, 1 is bright). The larger the kernel and the more iterations, the deeper the erosion and the more the highlighted region is eaten away.

    • notes:

      """
          erode(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst
          .   @brief Erodes an image by using a specific structuring element.
      ...
          .   @param src input image; the number of channels can be arbitrary, but the depth should be one of
          .   CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
          .   @param dst output image of the same size and type as src.
          .   @param kernel structuring element used for erosion; if `element=Mat()`, a `3 x 3` rectangular
          .   structuring element is used. Kernel can be created using #getStructuringElement.
          .   @param anchor position of the anchor within the element; default value (-1, -1) means that the
          .   anchor is at the element center.
          .   @param iterations number of times erosion is applied.
          .   @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @param borderValue border value in case of a constant border
          .   @sa  dilate, morphologyEx, getStructuringElement
          """
      
  3. Image dilation cv2.dilate(src, kernel, dst=None, anchor=None, iterations=None, borderType=None, borderValue=None)
    Description: expands the highlighted (white) region of a binary image; the result is larger than the highlighted region of the original, i.e. the highlight is dilated. (Color and gray images can also be dilated, but the effect is less obvious.)
    Purpose: expand the foreground (highlighted region) of a binary image, for internal denoising and element connection (e.g. filling gaps left after image segmentation).
    Principle:
    $$
    \begin{aligned}
    &\text{Dilation operator } \oplus \text{; dilation is defined as:} \\
    & \qquad A \oplus B = \{x \mid B_x \cap A \neq \varnothing\} \\
    &\text{i.e. image } A \text{ is dilated by template } B \text{: the template is convolved over } A \text{, the maximum} \\
    &\text{pixel value within the region covered by } B \text{ is taken, and it replaces the value at the anchor point.}
    \end{aligned}
    $$
    Dilation process: the template (convolution kernel) scans the original image pixel by pixel with its center point; among the pixels covered by the kernel, if at least one value is 1 the output pixel is 1, otherwise 0 (0 is dark, 1 is bright). The larger the kernel and the more iterations, the deeper the dilation and the more the highlighted region expands.

    • notes:

      """
          dilate(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst
          .   @brief Dilates an image by using a specific structuring element.
      ...
          .   @param src input image; the number of channels can be arbitrary, but the depth should be one of
          .   CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
          .   @param dst output image of the same size and type as src.
      .   @param kernel structuring element used for dilation; if element=Mat(), a 3 x 3 rectangular
          .   structuring element is used. Kernel can be created using #getStructuringElement
          .   @param anchor position of the anchor within the element; default value (-1, -1) means that the
          .   anchor is at the element center.
          .   @param iterations number of times dilation is applied.
      .   @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @param borderValue border value in case of a constant border
          .   @sa  erode, morphologyEx, getStructuringElement
          """
      
  4. Image opening operation

    Description: erosion first, then dilation
    Purpose: denoising

    1. Foreground: removes external noise (segmentation);
    2. Background: eliminates small highlighted regions.
      (Figure: image opening operation)
  5. Image closing operation

    Description: dilation first, then erosion
    Purpose: denoising

    1. Foreground: removes internal noise and connects components (eliminates small black holes in the foreground);
      (Figure: image closing operation)
  6. Morphological gradient

    Description: the difference image between the dilation and the erosion

    Purpose: highlights the outline of the highlighted region; a useful starting point for contour detection.

  7. Top hat operation

    Description: the difference image between the original image and its opening (erosion followed by dilation)

    Purpose: background extraction (extracts small bright patches)

  8. Black hat operation

    Description: the difference image between the closing and the original image

    Purpose: extraction of dark holes in the foreground

Image gradient processing

In image enhancement, smoothing algorithms are commonly used to remove noise. Typical image noise includes additive noise, multiplicative noise and quantization noise.
In general, the energy of an image is concentrated in its low-frequency part, while the noise band and the edge information are concentrated in the high-frequency part.
Smoothing the original image therefore blurs its edges and contours. To reduce this side effect, image sharpening is used to make the edges clear again. The purpose of sharpening is to make the edges, contours and details of the image distinct:

  1. The root cause of the blur is that the image has been averaged (integrated), so the inverse operation (differentiation) can make the image clear again. Differentiation computes the rate of change of the signal and, by the differential property of the Fourier transform, strengthens the high-frequency components.
  2. From the frequency-domain point of view, blurring means the high-frequency components have been attenuated, so a high-pass filter can also sharpen the image.

Note, however, that an image worth sharpening must have a high signal-to-noise ratio; otherwise sharpening increases the noise more than the signal. Noise is therefore usually removed or reduced before sharpening.

Original image -> smoothing -> blurred edges/contours -> image sharpening -> clear edges/contours/details

Image gradient: the rate at which pixel values change. At the edges of objects the gray value changes sharply and the gradient is large; in smoother parts the gray change and the gradient are small, and in regions of uniform gray the gradient is zero. (Sharpening increases the gradient; blurring decreases it.)

  • The image can be viewed as a two-dimensional discrete function, and the rate of change of its gray levels is expressed by differentiation.

  • In calculus, the first derivative of a one-dimensional function $f(x)$ and the first-order partial derivatives of a two-dimensional function $f(x,y)$ are defined as:
    $$
    \frac{df(x)}{dx}=\lim_{\Delta x \to 0} \frac{f(x+\Delta x)-f(x)}{\Delta x} \qquad
    \frac{\partial f(x,y)}{\partial x}=\lim_{\Delta x \to 0} \frac{f(x+\Delta x,y)-f(x,y)}{\Delta x} \qquad
    \frac{\partial f(x,y)}{\partial y}=\lim_{\Delta y \to 0} \frac{f(x,y+\Delta y)-f(x,y)}{\Delta y}
    $$

  • The image is a two-dimensional discrete function (sampled per pixel), so $\Delta x$ and $\Delta y$ are at least 1. The image gradient is therefore computed from forward differences along the two axes, giving the gradient magnitude $M$ (see the sketch in the comprehensive example below):
    $$
    \begin{aligned}
    & g_x = f(x+1,y)-f(x,y), \qquad g_y = f(x,y+1)-f(x,y) \\
    & M(x,y)=\sqrt{g_x^2+g_y^2} \approx |g_x|+|g_y|
    \end{aligned}
    $$

Template operation:

  • Template (convolution kernel): a small n * n matrix $w$ (n is usually odd and is called the template size) whose entries are called weights. During computation, the anchor point of the template (usually its center) is aligned with each pixel $p$ of the image in turn, and one output value is computed from the template weights and the image pixel values covered by the template (denote the template $w$ and the input image $I$).

  • Template convolution (weighted summation; see the filter2D sketch below):
    $$
    p=\frac{\sum_{i=1}^n \sum_{j=1}^n w_{i,j} \cdot I_{i,j}}{\sum_{i=1}^n \sum_{j=1}^n w_{i,j}}
    $$

  • Template sorting:

    • Maximum template sort: $p = \max(\mathrm{sorted}(I))$
    • Minimum template sort: $p = \min(\mathrm{sorted}(I))$
    • Median template sort: $p = \mathrm{median}(\mathrm{sorted}(I))$
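
  The weighted-summation formula above is what cv2.filter2D computes for an arbitrary template (a sketch; filter2D is not mentioned in the original notes). With all weights equal to 1 and the sum in the denominator, it reproduces the normalized 3 * 3 mean template of the smoothing section below:

    import cv2
    import numpy as np

    img = cv2.imread('data/lena.jpeg')

    # Normalized 3x3 averaging template: every weight is 1, divided by the sum of the weights
    kernel = np.ones((3, 3), np.float32) / 9

    # Slide the template over the image; ddepth=-1 keeps the source depth
    filtered = cv2.filter2D(img, ddepth=-1, kernel=kernel)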

Comprehensive example

import cv2

img = cv2.imread('data/lena.jpeg')
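
A minimal sketch (not in the original notes) of the finite-difference gradient magnitude |g_x| + |g_y| described above:

import cv2
import numpy as np

img_gray = cv2.imread('data/lena.jpeg', cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Forward differences along each axis (the minimum step in a discrete image is 1 pixel)
g_x = np.zeros_like(img_gray)
g_y = np.zeros_like(img_gray)
g_x[:, :-1] = img_gray[:, 1:] - img_gray[:, :-1]  # f(x+1, y) - f(x, y)
g_y[:-1, :] = img_gray[1:, :] - img_gray[:-1, :]  # f(x, y+1) - f(x, y)

# Gradient magnitude approximated by |g_x| + |g_y|
M = np.clip(np.abs(g_x) + np.abs(g_y), 0, 255).astype(np.uint8)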

Image smoothing (denoising, blur)

Comprehensive example

import cv2

img = cv2.imread('data/lena.jpeg')

# 1. Mean filtering
meanBlur_img = cv2.blur(src=img,
                        ksize=(3, 3))

# 2. Median filtering
medianBlur_img = cv2.medianBlur(src=img,
                                ksize=3)

# 3. Gaussian filtering
gaussianBlur_img = cv2.GaussianBlur(src=img,
                                    ksize=(5, 5),
                                    sigmaX=3)
  1. Mean filtering cv2.blur(src, ksize, dst=None, anchor=None, borderType=None)
    Description: a mean filter is a (normalized) filter whose template weights are all 1; through convolution it outputs the neighborhood average of each pixel.
    $$
    \text{For example, the } 3 \times 3 \text{ mean filter (ksize=(3,3)):} \quad
    K = \frac{1}{9}
    \begin{bmatrix}
    1 & 1 & 1 \\
    1 & 1 & 1 \\
    1 & 1 & 1
    \end{bmatrix}
    $$
    Purpose: smooths the image, but the image blurs as the template size grows

    Characteristics:

    1. Advantages: fast and algorithmically simple
    2. Disadvantages: does not remove noise, only weakens it
    • notes:

      """
          blur(src, ksize[, dst[, anchor[, borderType]]]) -> dst
          .   @brief Blurs an image using the normalized box filter.
      ...
          .   @param src input image; it can have any number of channels, which are processed independently, but
          .   the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
          .   @param dst output image of the same size and type as src.
          .   @param ksize blurring kernel size. : tuple
          .   @param anchor anchor point; default value Point(-1,-1) means that the anchor is at the kernel
          .   center.
          .   @param borderType border mode used to extrapolate pixels outside of the image, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @sa  boxFilter, bilateralFilter, GaussianBlur, medianBlur
          """
      
  2. Median filter cv2.medianBlur(src, ksize, dst=None)
    Description: the median filter is a template-sorting filter; it replaces each pixel with the median of the sorted pixel values in its neighborhood.
    Purpose: denoising (particularly effective against salt-and-pepper noise) while preserving the sharpness of the original image (edges stay sharp with little blur); however, it destroys linear relationships in the image and is not suitable for precision images with many fine point and line details.

    • notes:

      """
          medianBlur(src, ksize[, dst]) -> dst
          .   @brief Blurs an image using the median filter.
      ...
          .   @note The median filter uses #BORDER_REPLICATE internally to cope with border pixels, see #BorderTypes
          .   
          .   @param src input 1-, 3-, or 4-channel image; when ksize is 3 or 5, the image depth should be
          .   CV_8U, CV_16U, or CV_32F, for larger aperture sizes, it can only be CV_8U.
          .   @param dst destination array of the same size and type as src.
          .   @param ksize aperture linear size; it must be odd and greater than 1, for example: 3, 5, 7 ... : integer
          .   @sa  bilateralFilter, blur, boxFilter, GaussianBlur
          """
      
  3. Gaussian filter cv2.GaussianBlur(src, ksize, sigmaX, dst=None, sigmaY=None, borderType=None)
    Description: the template coefficients of a Gaussian filter follow a Gaussian distribution; weights near the center are larger than those at the edges.
    $$
    \text{For example, the } 5 \times 5 \text{ Gaussian filter (ksize=(5,5)):} \quad
    K = \frac{1}{273}
    \begin{bmatrix}
    1 & 4 & 7 & 4 & 1 \\
    4 & 16 & 26 & 16 & 4 \\
    7 & 26 & 41 & 26 & 7 \\
    4 & 16 & 26 & 16 & 4 \\
    1 & 4 & 7 & 4 & 1
    \end{bmatrix}
    $$
    Purpose: smoothing and denoising while reducing the blurring that larger template sizes cause in mean filtering.

    • notes:

      """
          GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType]]]) -> dst
          .   @brief Blurs an image using a Gaussian filter.
          .   
          .   The function convolves the source image with the specified Gaussian kernel. In-place filtering is
          .   supported.
          .   
          .   @param src input image; the image can have any number of channels, which are processed
          .   independently, but the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
          .   @param dst output image of the same size and type as src.
          .   @param ksize Gaussian kernel size. ksize.width and ksize.height can differ but they both must be : tuple
          .   positive and odd. Or, they can be zero's and then they are computed from sigma.
          .   @param sigmaX Gaussian kernel standard deviation in X direction.
          .   @param sigmaY Gaussian kernel standard deviation in Y direction; if sigmaY is zero, it is set to be
          .   equal to sigmaX, if both sigmas are zeros, they are computed from ksize.width and ksize.height,
          .   respectively (see #getGaussianKernel for details); to fully control the result regardless of
          .   possible future modifications of all this semantics, it is recommended to specify all of ksize,
          .   sigmaX, and sigmaY.
          .   @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   
          .   @sa  sepFilter2D, filter2D, blur, boxFilter, bilateralFilter, medianBlur
          """
      
    • @param:

      sigmaX: int - standard deviation of the Gaussian kernel along the x-axis
      sigmaY: int - standard deviation of the Gaussian kernel along the y-axis; defaults to sigmaX

Image sharpening (edge detection)

Sharpening highlights detail (boundaries), so pixels at edges need to be strengthened, for example by using the gradient magnitude directly as the pixel's gray level or RGB components.

Sharpening reduces blur by enhancing the high-frequency components of the image: it strengthens the detail edges and contours and increases gray contrast, which helps later target recognition and processing. Note that sharpening enhances not only the edges but also the noise.

An edge detection operator examines each pixel's neighborhood and quantifies the rate of gray change, usually including its direction. Most operators are convolution methods based on directional templates.

For edge detection, a threshold can then be applied: pixels whose gray level exceeds the threshold are set to 0, and the rest to 255.

Comprehensive example

import cv2

im = cv2.imread('data/lena.png', 0)  # Read as a grayscale image

# First-order differential sharpening: 1. Sobel filtering
im_sobel = cv2.Sobel(im, ddepth=cv2.CV_64F, dx=1, dy=1, ksize=5)
cv2.imshow('im_sobel', im_sobel)

# Second-order differential sharpening: 1. Laplacian filtering
im_lap = cv2.Laplacian(im, cv2.CV_64F)
cv2.imshow('im_lap', im_lap)
cv2.imshow('im_lap-im', im_lap - im)

# Multi-stage algorithm: 1. Canny edge detection
im_canny = cv2.Canny(im, 50, 240)
im_blur_canny = cv2.Canny(cv2.GaussianBlur(im, ksize=(3, 3), sigmaX=0.5),
                          50, 240)

cv2.waitKey()
cv2.destroyAllWindows()

First order differential sharpening

  1. Roberts operator

  2. Sobel operator cv2.Sobel(src, ddepth, dx, dy, dst=None, ksize=None, scale=None, delta=None, borderType=None)
    Description:

    1. The simplest approximation of the gradient is $g_x=(w_7+2w_8+w_9)-(w_1+2w_2+w_3), \quad g_y=(w_3+2w_6+w_9)-(w_1+2w_4+w_7)$
    2. The operators are $\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$ and $\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$
    3. The gradient image is calculated as $M(x,y)=|g_x| + |g_y|$

    Purpose: first-order differential sharpening / edge detection (see the separate-axis sketch after this list).

    • notes:

      """
          Sobel(src, ddepth, dx, dy[, dst[, ksize[, scale[, delta[, borderType]]]]]) -> dst
          .   @brief Calculates the first, second, third, or mixed image derivatives using an extended Sobel operator.
      ...
          .   @param src input image.
          .   @param dst output image of the same size and the same number of channels as src .
          .   @param ddepth output image depth, see @ref filter_depths "combinations"; in the case of
          .       8-bit input images it will result in truncated derivatives.
          .   @param dx order of the derivative x.
          .   @param dy order of the derivative y.
          .   @param ksize size of the extended Sobel kernel; it must be 1, 3, 5, or 7.
          .   @param scale optional scale factor for the computed derivative values; by default, no scaling is
          .   applied (see #getDerivKernels for details).
          .   @param delta optional delta value that is added to the results prior to storing them in dst.
          .   @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @sa  Scharr, Laplacian, sepFilter2D, filter2D, GaussianBlur, cartToPolar
          """
      
  3. Prewitt operator
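
A sketch of the common separate-axis Sobel usage (the comprehensive example above passes dx=1, dy=1, which computes a mixed derivative; the variant below, with illustrative parameter values, computes g_x and g_y separately and combines their absolute values):

import cv2

im = cv2.imread('data/lena.png', 0)  # grayscale, same path as the example above

# Derivative along each axis separately
grad_x = cv2.Sobel(im, ddepth=cv2.CV_64F, dx=1, dy=0, ksize=3)
grad_y = cv2.Sobel(im, ddepth=cv2.CV_64F, dx=0, dy=1, ksize=3)

# M(x,y) ~ |g_x| + |g_y|, scaled back to 8 bits
im_grad = cv2.addWeighted(cv2.convertScaleAbs(grad_x), 0.5,
                          cv2.convertScaleAbs(grad_y), 0.5, 0)

cv2.imshow('im_grad', im_grad)
cv2.waitKey()
cv2.destroyAllWindows()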

Second order differential sharpening

  1. Laplacian operator cv2.Laplacian(src, ddepth, dst=None, ksize=None, scale=None, delta=None, borderType=None)
    Description:

    1. The differential is defined as $\nabla^2f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)-4f(x,y)$
    2. The operator is $\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
    3. The differential image is calculated by correlation / convolution

    Purpose: second-order differential sharpening; enhances fine detail and produces thin double edges.

    • notes:

      """
          Laplacian(src, ddepth[, dst[, ksize[, scale[, delta[, borderType]]]]]) -> dst
          .   @brief Calculates the Laplacian of an image.
      ...
          .   @param src Source image.
          .   @param dst Destination image of the same size and the same number of channels as src .
          .   @param ddepth Desired depth of the destination image.
          .   @param ksize Aperture size used to compute the second-derivative filters. See #getDerivKernels for
          .   details. The size must be positive and odd.
          .   @param scale Optional scale factor for the computed Laplacian values. By default, no scaling is
          .   applied. See #getDerivKernels for details.
          .   @param delta Optional delta value that is added to the results prior to storing them in dst .
          .   @param borderType Pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @sa  Sobel, Scharr
          """
      
  2. LoG edge operator

  3. Gaussian-Laplace (LoG) operator

Multistage algorithm

  1. Canny algorithm cv2.Canny(image, threshold1, threshold2, edges=None, apertureSize=None, L2gradient=None)
    Steps:

    1. Image denoising: prevents noise from being detected as false edges.
    2. Compute the image gradient: obtains the set of all possible edges.
    3. Non-maximum suppression: keeps only the point of maximum gray-level change along the gradient direction within a local range, thinning the edges (multi-pixel-wide edges -> single-pixel-wide edges).
    4. Double-threshold filtering: retains strong edges, completes unclosed edges, and discards weak edges.
    • notes:

      """
          Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]]) -> edges
          .   @brief Finds edges in an image using the Canny algorithm @cite Canny86 .
      ...
          .   @param image 8-bit input image.
          .   @param edges output edge map; single channels 8-bit image, which has the same size as image .
          .   @param threshold1 first threshold for the hysteresis procedure. Low threshold
          .   @param threshold2 second threshold for the hysteresis procedure. High threshold
      ...
          """
      

Custom filtering

Comprehensive example

import numpy as np
import cv2

im = cv2.imread('data/lena.jpg')

# Custom arithmetic average operator (smoothing)
simple_mean_kernel = np.array([
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1]
], dtype=float)/9
# Smooth filtering
im_filter = cv2.filter2D(src=im, 
                         ddepth=-1, 
                         kernel=simple_mean_kernel)

# Custom Laplacian biaxial operator
laplacian_x_y = np.array([
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0]
], dtype=float) / 3.0  # To prevent the image from being too dark
# Sharpening filter (yields the Laplacian edge response; see the sharpening-kernel sketch below)
im_sharpen_2 = cv2.filter2D(im, -1, laplacian_x_y)
  1. Two-dimensional filtering cv2.filter2D(src, ddepth, kernel, dst=None, anchor=None, delta=None, borderType=None)

    • notes:

      """
          filter2D(src, ddepth, kernel[, dst[, anchor[, delta[, borderType]]]]) -> dst
          .   @brief Convolves an image with the kernel.
      ...
          .   @param src input image.
          .   @param dst output image of the same size and the same number of channels as src.
          .   @param ddepth desired depth of the destination image, see @ref filter_depths "combinations"
          .   @param kernel convolution kernel (or rather a correlation kernel), a single-channel floating point
          .   matrix; if you want to apply different kernels to different channels, split the image into
          .   separate color planes using split and process them individually.
          .   @param anchor anchor of the kernel that indicates the relative position of a filtered point within
          .   the kernel; the anchor should lie within the kernel; default value (-1,-1) means that the anchor
          .   is at the kernel center.
          .   @param delta optional value added to the filtered pixels before storing them in dst.
          .   @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
          .   @sa  sepFilter2D, dft, matchTemplate
          """
      

Theory

Spatial domain processing

Operates directly on image pixels; it can be divided into two categories:

  1. Grayscale transformation: operates on single pixels of the image, mainly for contrast enhancement and thresholding (point processing techniques)

  2. Spatial filtering: operations that improve image quality by processing the neighborhood of each pixel, such as sharpening (neighborhood processing techniques)

Spatial domain processing representation

$$
g(x,y) = T[f(x,y)]
$$

where:

  • f(x,y) is the input image
  • g(x,y) is the processed output image
  • T is an operator on f, defined over a neighborhood of the point (x,y); it can be applied to a single image or to a set of images
  • The typical neighborhood is a small rectangle centered on (x,y); when the neighborhood contains pixels outside the image, two processing methods are common:
    1. Ignore the outer neighbors
    2. Pad the outer neighbors (generally with 0)

This process is called spatial filtering, and the neighborhood together with its predefined operation is called a spatial filter (spatial mask / kernel / template / window)

Filtering

The word "filter" is borrowed from frequency-domain processing, where it means accepting (passing) or rejecting certain frequency components. Filters are classified as follows:

  • Low-pass filter: accepts (passes) low frequencies; its net effect is to blur (smooth) an image.
  • High-pass filter: accepts (passes) high frequencies; its net effect is to sharpen an image (enhance edges and detail).

Spatial filter

The spatial filter supports linear and nonlinear operations (whereas the frequency-domain filter cannot do nonlinear filtering):

  • Linear spatial filter: performs linear operations on image pixels. There are two types of operations (convolution is often confused with correlation); a numpy sketch of both follows this list:
    1. Correlation: the process of moving the filter template over the image and, at each position, computing the sum of products.
      • Correlation is a function of the filter displacement.
      • Correlating a filter with a discrete unit impulse (a function that is all zeros except for a single 1) produces a 180°-rotated copy of the filter at the position of the impulse (the single 1).
      • For an image (a two-dimensional function), the correlation of an m*n filter w(x,y) with an image f(x,y) is written, for each pixel (x,y) of the image: $w(x,y) ☆ f(x,y) = \sum_{s=-m//2}^{m//2} \sum_{t=-n//2}^{n//2} w(s,t)f(x+s,y+t)$
    2. Convolution: the filter is first rotated by 180° and then correlated.
      • Because the filter is pre-rotated, convolving the filter with a unit impulse yields an exact copy of the filter at the impulse.
      • The two-dimensional image convolution is written, for each pixel (x,y) of the image: $w(x,y) ★ f(x,y) = \sum_{s=-m//2}^{m//2} \sum_{t=-n//2}^{n//2} w(s,t)f(x-s,y-t)$
  • Nonlinear spatial filter:
    1. Statistical sorting: the response of the filter is based on ranking the pixels inside the region covered by the filter; the value determined by the ranking replaces the value of the central pixel.
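
A minimal numpy sketch of the two operations (an illustration under the zero-padding convention; the function names are hypothetical):

import numpy as np

def correlate2d(w, f):
    """Correlate filter w (odd m*n) with image f, zero-padding the border."""
    m, n = w.shape
    fp = np.pad(f.astype(float), ((m // 2, m // 2), (n // 2, n // 2)))
    out = np.zeros(f.shape, dtype=float)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            out[y, x] = np.sum(w * fp[y:y + m, x:x + n])
    return out

def convolve2d(w, f):
    """Convolution = correlation with the filter rotated by 180 degrees."""
    return correlate2d(np.flip(w), f)

# Correlating a filter with a unit impulse yields a 180-degree rotated copy;
# convolving it with the impulse yields an exact copy.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1
w = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
print(correlate2d(w, impulse))  # rotated copy of w around the center
print(convolve2d(w, impulse))   # exact copy of w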

Generation of spatial filter template

Smoothing spatial filter

Smoothing filters are used for blurring and noise reduction. Blurring is often used in preprocessing tasks, such as removing trivial detail from an image and bridging gaps in lines/curves before extracting large targets (also called denoising). It blends the gray level of smaller objects into the background, while larger objects become easy-to-detect "blobs"; blurring an image also yields a rough description of the ROI.

Smooth linear (mean) spatial filter

The output (response) of a smoothing linear spatial filter is the simple average of the pixels contained in the neighborhood of the filter template, hence the name mean filter. Typical random noise consists of sharp gray-level changes, so replacing each pixel value with the average gray value of its template neighborhood reduces the sharp changes in image gray level. However, image edges also consist of sharp gray-level changes, so mean filtering has the side effect of blurring edges.

Main applications of mean filtering:

  • Remove irrelevant details in the image, where "irrelevant" means pixel regions smaller than the filter template.
  • Get a rough description of ROI, so that the gray level of smaller objects is fused with the background, and larger objects become like "spots" and easy to detect. (the size of the template is determined by the size of the smaller object to be fused by the background)

Classification of mean filter:

  • Arithmetic mean filter
  • Weighted average filter: weights are set according to the distance from the template anchor (center): the farther away, the smaller the weight. This reduces blurring during smoothing. (See the sketch below.)
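
A short sketch of both variants with OpenCV's built-in functions (parameter values are illustrative assumptions):

import cv2

im = cv2.imread('data/lena.jpg')

# Arithmetic mean filter: every weight equals 1/(5*5)
im_mean = cv2.blur(im, (5, 5))

# Weighted average with Gaussian weights: the farther from the center, the smaller the weight
im_weighted = cv2.GaussianBlur(im, (5, 5), 0)
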
Smoothing nonlinear (statistical sorting) spatial filter

Classification of statistical sorting filter:

  • Median filter: replaces the value of a pixel with the median of the gray levels in its neighborhood (see the sketch after this list).
    1. Application: for certain types of random noise (impulse noise / salt-and-pepper noise, superimposed on the image as black and white dots) it has excellent denoising ability, and it blurs considerably less than a linear smoothing filter of the same size.
    2. Method: an m*m median filter removes isolated clusters of pixels that are brighter/darker than their neighbors and whose area is smaller than $m^2/2$; "remove" means forcing them to the median gray level of the neighborhood. Larger clusters are affected far less.
  • Maximum filter
  • Minimum filter
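
A hedged sketch of median filtering against salt-and-pepper noise (the synthetic noise generation is an illustrative assumption):

import cv2
import numpy as np

im = cv2.imread('data/lena.jpg')

# Superimpose synthetic salt-and-pepper noise
noisy = im.copy()
r = np.random.rand(*im.shape[:2])
noisy[r < 0.02] = 0      # pepper (black dots)
noisy[r > 0.98] = 255    # salt (white dots)

# 5*5 median filter: each pixel becomes the median of its neighborhood
im_median = cv2.medianBlur(noisy, 5)

cv2.imshow('noisy', noisy)
cv2.imshow('im_median', im_median)
cv2.waitKey()
cv2.destroyAllWindows()
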
Sharpening Spatial Filters

The main purpose of sharpening is to highlight gray-level transitions. Edges in a digital image often resemble ramp transitions in gray level. (Noise is also a gray-level mutation, so sharpening an image directly, without smoothing it first, easily produces false edges.)

The averaging used in smoothing is analogous to integration; conversely, sharpening is logically analogous to spatial differentiation. Differentiation of any order is a linear operation, so sharpening operators defined by differentials are linear operators.

Sharpening operators are defined and realized by digital differentiation (the response strength of a differential operator is proportional to the degree of mutation at the point where it is applied): image differentiation enhances edges and other mutations (such as noise) and de-emphasizes areas of slowly varying gray level.

The differential operator emphasizes only sudden gray-level changes in the image, not regions where the gray level changes slowly. Differential sharpening produces an image that superimposes light-gray edges and abrupt points on a dark background.

All coefficients of a differential operator template sum to 0, as expected of a differential operator, so the response in constant-gray regions is 0 (black background).

Digital image differential (derivative) definition

First-order differential of a digital image f(x,y)

Basic definition: $\frac{\partial f}{\partial x} = f(x+1,y)-f(x,y), \quad \frac{\partial f}{\partial y} = f(x,y+1)-f(x,y)$

Requirements of the definition:

  1. The differential is zero in areas of constant gray level
  2. The differential is non-zero at the onset of a gray-level step or ramp
  3. The differential is non-zero along gray-level ramps

Second-order differential of a digital image f(x,y)

Basic definition: $\frac{\partial^2 f}{\partial x^2} = f(x+1,y)+f(x-1,y)-2f(x,y), \quad \frac{\partial^2 f}{\partial y^2} = f(x,y+1)+f(x,y-1)-2f(x,y)$

Requirements of the definition:

  1. The differential is zero in constant regions
  2. The differential is non-zero at the onset (and end) of a gray-level step or ramp
  3. The differential is zero along ramps of constant slope

For highlighting edges, the first-order differential is not as good as the second-order differential:

  • First-order differential: because the differential along a ramp is non-zero, the first-order differential of an image produces thick edges.
  • Second-order differential: the non-zero responses at the beginning and end of a gray-level ramp/step are separated by zeros, so the second-order differential produces pixel-wide double edges; it is stronger than the first-order differential at enhancing detail. (See the numeric illustration below.)
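
A tiny numeric illustration of the two properties (the profile values are chosen for illustration):

import numpy as np

# 1-D gray profile: constant, ramp, constant, step
f = np.array([5, 5, 5, 4, 3, 2, 1, 1, 1, 6, 6, 6], dtype=float)

d1 = f[1:] - f[:-1]                # first-order difference
d2 = f[2:] - 2 * f[1:-1] + f[:-2]  # second-order difference

print(d1)  # non-zero all along the ramp -> thick edges
print(d2)  # non-zero only at ramp/step ends -> thin double edges
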
Realization of second-order differential of two-dimensional function and its application in image sharpening

Method: define a discrete formula for the second-order differential, then construct a filter template based on the formula.

Isotropic filter: the response of the filter is independent of the direction of the mutations in the image it acts on (i.e. an isotropic filter is rotation-invariant: rotating the image and then filtering gives the same result as filtering and then rotating).

Laplace differential operator: the simplest isotropic differential operator

  1. For a two-dimensional image f(x,y), the Laplace operator along the x and y axes is defined as: $\nabla^2f=\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}=f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)-4f(x,y)$. The formula is implemented as a filter template (a linear filter with varying weights, still computed by correlation/convolution):
    $$\begin{bmatrix} (x-1,y-1) & (x,y-1) & (x+1,y-1) \\ (x-1,y) & (x,y) & (x+1,y) \\ (x-1,y+1) & (x,y+1) & (x+1,y+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

  2. Adding the two diagonal directions, the Laplace operator is defined as: $\nabla^2f=f(x+1,y)+f(x-1,y)+f(x,y+1)+f(x,y-1)+f(x-1,y-1)+f(x+1,y+1)+f(x+1,y-1)+f(x-1,y+1)-8f(x,y)$

    Filter template implementation (the result is isotropic for rotations in increments of 45°):
    $$\begin{bmatrix} (x-1,y-1) & (x,y-1) & (x+1,y-1) \\ (x-1,y) & (x,y) & (x+1,y) \\ (x-1,y+1) & (x,y+1) & (x+1,y+1) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

  3. Equivalent Laplacian templates with the opposite sign (the sign of the center coefficient decides whether image merging uses addition or subtraction):
    $$
    \begin{bmatrix}
    0 & -1 & 0 \\
    -1 & 4 & -1 \\
    0 & -1 & 0 \\
    \end{bmatrix}
    , \qquad
    \begin{bmatrix}
    -1 & -1 & -1 \\
    -1 & 8 & -1 \\
    -1 & -1 & -1 \\
    \end{bmatrix}
    $$

Basic methods of Laplacian image enhancement (restoring background characteristics and maintaining Laplacian sharpening effect):

  1. Original image - image processed by central negative Laplace filter
  2. Original image + image processed by Laplace filter with positive center value

The method is defined as: $g(x,y)=f(x,y)+c[\nabla^2f(x,y)]$, where c is ±1 depending on the sign of the central coefficient of the Laplace operator (c = -1 for a negative center, c = +1 for a positive center). A sketch follows below.
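
A minimal sketch of method 1 (cv2.Laplacian uses a negative-center template, so the result is subtracted; the path and the clipping step are assumptions):

import cv2
import numpy as np

im = cv2.imread('data/lena.png', 0).astype(float)

# Negative-center Laplacian, so c = -1: g = f - laplacian(f)
im_lap = cv2.Laplacian(im, cv2.CV_64F)
im_enhanced = np.clip(im - im_lap, 0, 255).astype(np.uint8)

cv2.imshow('im_enhanced', im_enhanced)
cv2.waitKey()
cv2.destroyAllWindows()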

Realization of first-order differential of two-dimensional function and its application in (nonlinear) image sharpening

Gradient definition: the first-order differential in image processing is realized via the gradient magnitude. The gradient of f(x,y) at coordinates (x,y) is defined as the two-dimensional vector $\nabla f = grad(f) = \begin{bmatrix}g_x \\ g_y \end{bmatrix} = \begin{bmatrix}\frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix}$, which points in the direction of the maximum rate of change of f at (x,y).

The magnitude (length) of the gradient vector $\nabla f$: $M(x,y)=mag(\nabla f)=\sqrt{g_x^2+g_y^2}$ or $|g_x|+|g_y|$, the value of the rate of change along the gradient direction at (x,y).

At this point we have f(x,y), the original image, and M(x,y), the gradient image (the same size as the original, recording the rate of change of each pixel along the gradient direction; the components of the gradient vector are differentials, so the gradient vector is a linear operator, but the gradient magnitude is not linear because of the squares and square root).

Define the discrete approximation of the formula:

Assume the weight matrix is $\begin{bmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \\ \end{bmatrix}$

  1. Early definition, the simplest approximation of the first-order differential: $g_x=f(x+1,y)-f(x,y)=w_6-w_5, \quad g_y=f(x,y+1)-f(x,y)=w_8-w_5$

  2. Roberts' simplest approximation using cross differences: $g_x=f(x+1,y+1)-f(x,y)=w_9-w_5, \quad g_y=f(x,y+1)-f(x+1,y)=w_8-w_6$
    Calculate the gradient image: $M(x,y)=\sqrt{(w_9-w_5)^2 + (w_8-w_6)^2}$ or $|w_9-w_5| + |w_8-w_6|$

  3. The simplest approximation using a 3 * 3 template: $g_x=(w_7+2w_8+w_9)-(w_1+2w_2+w_3), \quad g_y=(w_3+2w_6+w_9)-(w_1+2w_4+w_7)$
    Calculate the gradient image: $M(x,y)=|(w_7+2w_8+w_9)-(w_1+2w_2+w_3)| + |(w_3+2w_6+w_9)-(w_1+2w_4+w_7)|$

Form filter template:

  1. Roberts cross-gradient operators:
    $$\begin{bmatrix} -1 & 0 \\ 0 & 1 \\ \end{bmatrix} \text{ and } \begin{bmatrix} 0 & -1 \\ 1 & 0 \\ \end{bmatrix}$$
    Because an even-sized template has no center of symmetry, even sizes are difficult to implement.

  2. Sobel (3 * 3 template) operators:
    $$\begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \\ \end{bmatrix} \text{ and } \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \\ \end{bmatrix}$$

Frequency domain processing

Operates on the Fourier transform of the image rather than on the image itself

Image contour

Two basic approaches based on gray-level properties: discontinuity and similarity

  1. Discontinuity: segment based on abrupt changes in gray level
  2. Similarity: segment the image into similar regions according to a set of predefined criteria

Image contour, find contour, draw contour, fit contour

Edge detection can find edges, but the detected edges are often discontinuous. An image contour connects edges into a whole for subsequent computation (obtaining the size, position, orientation, and other information of the target image).

Find and draw profiles

Comprehensive example

import cv2
import numpy as np

im = cv2.imread("data/3.png")
cv2.imshow("orig", im)

gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# 1. Find contour
contours, hierarchy = cv2.findContours(binary,  # Binary processed image
                                            cv2.RETR_EXTERNAL,  # Only the outer contour is detected
                                            cv2.CHAIN_APPROX_NONE)  # Store all contour points

# 2. Draw outline
im_cnt = cv2.drawContours(im,  # Draw an image (draw an outline on the figure)
                          contours,  # Contour point list
                          -1,  # Draw all contours
                          (0, 0, 255),  # Outline color: Red
                          2)  # Outline thickness (- 1 is a solid outline)
cv2.imshow("im_cnt", im_cnt)

cv2.waitKey()
cv2.destroyAllWindows()

Outline: a series of points that represent a curve in an image in some way.

  1. Find contours cv2.findContours(image, mode, method, contours=None, hierarchy=None, offset=None) -> contours: list, hierarchy: ndarray (OpenCV 3.x also returned the image as a first return value)

    • notes:

      """
          findContours(image, mode, method[, contours[, hierarchy[, offset]]]) -> contours, hierarchy
          .   @brief Finds contours in a binary image.
       ...
          .   @param image Source, an 8-bit single-channel image. Non-zero pixels are treated as 1's. Zero
          .   pixels remain 0's, so the image is treated as binary . You can use #compare, #inRange, #threshold ,
          .   #adaptiveThreshold, #Canny, and others to create a binary image out of a grayscale or color one.
          .   If mode equals to #RETR_CCOMP or #RETR_FLOODFILL, the input can also be a 32-bit integer image of labels (CV_32SC1).
          .   @param contours Detected contours. Each contour is stored as a vector of points (e.g.
          .   std::vector<std::vector<cv::Point> >).
          .   @param hierarchy Optional output vector (e.g. std::vector<cv::Vec4i>), containing information about the image topology. It has
          .   as many elements as the number of contours. For each i-th contour contours[i], the elements
          .   hierarchy[i][0] , hierarchy[i][1] , hierarchy[i][2] , and hierarchy[i][3] are set to 0-based indices
          .   in contours of the next and previous contours at the same hierarchical level, the first child
          .   contour and the parent contour, respectively. If for the contour i there are no next, previous,
          .   parent, or nested contours, the corresponding elements of hierarchy[i] will be negative.
          .   @param mode Contour retrieval mode, see #RetrievalModes
          .   @param method Contour approximation method, see #ContourApproximationModes
          .   @param offset Optional offset by which every contour point is shifted. This is useful if the
          .   contours are extracted from the image ROI and then they should be analyzed in the whole image
          .   context.
          """
      
    • @param:

      image: The source image (a grayscale image is treated as binary: non-zero pixels count as 1).
      In practice, preprocess the image in advance (e.g. with thresholding) into a binary image before finding contours.
      The processed image must be a grayscale binary image with a black background and white foreground.
      mode: Contour retrieval mode
      method: Contour approximation (representation) method

      mode values:
        cv2.RETR_EXTERNAL    Only the outer contours are detected
        cv2.RETR_LIST        All contours are detected; no hierarchical relationships are established between them
        cv2.RETR_CCOMP       All contours are retrieved and organized into a two-level hierarchy: the upper level holds the outer boundaries, the lower level the boundaries of inner holes
        cv2.RETR_TREE        All contours are detected and a full hierarchy tree is established

      method values:
        cv2.CHAIN_APPROX_NONE        Store all contour points; the positions of two adjacent points differ by at most one pixel, i.e. $max(abs(x_1-x_2), abs(y_2-y_1)) <= 1$
        cv2.CHAIN_APPROX_SIMPLE      Compressed format: compresses horizontal, vertical and diagonal runs and keeps only their end points
        cv2.CHAIN_APPROX_TC89_L1     One flavor of the Teh-Chin chain approximation algorithm
        cv2.CHAIN_APPROX_TC89_KCOS   One flavor of the Teh-Chin chain approximation algorithm
    • @return:

      image: OpenCV 4.x does not have this return value (in 3.x it was identical to the input image parameter)
      contours: list
      The detected contours. Each contour consists of several points (each contour is represented by a list).
      For example, contours[i] is the i-th contour (indices start from 0), contours[i][j] is the j-th point of the i-th contour, and each point is an (x,y) pair.
      hierarchy: ndarray
      The topological information of the image (the contour hierarchy).
      Contours may lie in different positions; for example, one contour may be inside another. In that case the outer contour is called the parent contour and the inner one the child contour.
      By this relationship, a parent-child hierarchy is established between all contours in the image. Each contour contours[i] corresponds to four elements describing its place in the hierarchy: [Next, Previous, First_Child, Parent], i.e. the index of the next contour at the same level, the previous contour at the same level, the first child contour, and the parent contour, respectively (negative if absent).
  2. Draw contours cv2.drawContours(image, contours, contourIdx, color, thickness=None, lineType=None, hierarchy=None, maxLevel=None, offset=None) -> image: ndarray

    • notes:

      """
          drawContours(image, contours, contourIdx, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]]) -> image
          .   @brief Draws contours outlines or filled contours.
       ...
          .   @param image Destination image.
          .   @param contours All the input contours. Each contour is stored as a point vector.
          .   @param contourIdx Parameter indicating a contour to draw. If it is negative, all the contours are drawn.
          .   @param color Color of the contours.
          .   @param thickness Thickness of lines the contours are drawn with. If it is negative (for example,
          .   thickness=#FILLED ), the contour interiors are drawn.
          .   @param lineType Line connectivity. See #LineTypes
          .   @param hierarchy Optional information about hierarchy. It is only needed if you want to draw only
          .   some of the contours (see maxLevel ).
          .   @param maxLevel Maximal level for drawn contours. If it is 0, only the specified contour is drawn.
          .   If it is 1, the function draws the contour(s) and all the nested contours. If it is 2, the function
          .   draws the contours, all the nested contours, all the nested-to-nested contours, and so on. This
          .   parameter is only taken into account when there is hierarchy available.
          .   @param offset Optional contour shift parameter. Shift all the drawn contours by the specified
          .   \f$\texttt{offset}=(dx,dy)\f$ .
          .   @note When thickness=#FILLED, the function is designed to handle connected components with holes correctly
          .   even when no hierarchy date is provided. This is done by analyzing all the outlines together
          .   using even-odd rule. This may give incorrect results if you have a joint collection of separately retrieved
          .   contours. In order to solve this problem, you need to call #drawContours separately for each sub-group
          .   of contours, or iterate over the collection using contourIdx parameter.
          """
      
    • @param:

      image: The image on which to draw the contours
      contours: The contours to draw; the same type as the contours output of cv2.findContours(), i.e. a list of point lists
      contourIdx: The index of the contour to draw; tells cv2.drawContours() whether to draw one contour or all of them.
      If the parameter is a non-negative integer, the contour with that index in contours is drawn; if it is negative (usually -1), all contours are drawn.
      color: The drawing color, expressed in BGR format (0~255, 0~255, 0~255)
      thickness: Line thickness (negative values fill the contour)
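
A small sketch illustrating the hierarchy return value described above: a white ring produces an outer (parent) contour and an inner (child) contour. The synthetic image and the printed values are illustrative:

import cv2
import numpy as np

# White ring on a black background
im = np.zeros((200, 200), dtype=np.uint8)
cv2.circle(im, (100, 100), 80, 255, -1)
cv2.circle(im, (100, 100), 40, 0, -1)

contours, hierarchy = cv2.findContours(im, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print(len(contours))  # 2
print(hierarchy)      # e.g. [[[-1 -1  1 -1], [-1 -1 -1  0]]]: contour 1 is the first child of contour 0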

Contour fitting

Contour fitting: actual computation on a contour does not require the complete curve; instead, an approximating polygon close to the contour is used in its place.

Comprehensive example

"""

"""
import cv2
import numpy as np

im = cv2.imread('../data/cloud.png', 1)
adp1 = im.copy()
adp2 = im.copy()
im_gray = cv2.cvtColor(im, code=cv2.COLOR_BGR2GRAY)
t, im_binary = cv2.threshold(im_gray, 127, 255, cv2.THRESH_BINARY)

# Contour lookup
contours, hie = cv2.findContours(im_binary, mode=cv2.RETR_LIST, method=cv2.CHAIN_APPROX_NONE)  # OpenCV 4.x returns (contours, hierarchy)

# Generate ellipse positioning information according to contour
ellipse_data = cv2.fitEllipse(contours[0])
print(ellipse_data)  # ((Center) (short radius, long radius) (angle))
cv2.ellipse(im, ellipse_data, (0, 0, 255), 3)
cv2.imshow('im', im)

# Generate rectangular positioning information according to the contour
rectangle_data = cv2.boundingRect(contours[0])
print(rectangle_data)  # (upper left starting point x,y rectangle width, height)
x, y, w, h = rectangle_data
brcnt = np.array([[x, y], [x + w, y], [x + w, y + h], [x, y + h]])  # Build the four corner points in order
cv2.drawContours(im, [brcnt], -1, (0, 255, 0), 3)
cv2.imshow('im', im)

# Generate circular positioning information according to the contour
circle_data = cv2.minEnclosingCircle(contours[0])
print(circle_data)  # ((center x,y) radius)
(x, y), radius = circle_data
center = int(x), int(y)
radius = int(radius)
cv2.circle(im, center, radius, (255, 0, 0), 3)  # center, radius must be integer
cv2.imshow('im', im)

# Generate polygon positioning information according to contour
# Precision 1
# adp1 = im.copy()
epsilon1 = 0.005 * cv2.arcLength(contours[0], True)
approx1 = cv2.approxPolyDP(contours[0], epsilon1, True)
cv2.drawContours(adp1, [approx1], 0, (0, 0, 255), 2)
cv2.imshow('adp1', adp1)
# Precision 2
# adp2= im.copy()
epsilon2 = 0.01 * cv2.arcLength(contours[0], True)
approx2 = cv2.approxPolyDP(contours[0], epsilon2, True)
cv2.drawContours(adp2, [approx2], 0, (255, 0, 0), 2)
cv2.imshow('adp2', adp2)

cv2.waitKey()
cv2.destroyAllWindows()

  1. Rectangular bounding box

    1. Generate rectangle positioning information from the contour cv2.boundingRect(array)

      • notes:

        """
            boundingRect(array) -> retval
            .   @brief Calculates the up-right bounding rectangle of a point set or non-zero pixels of gray-scale image.
            .   
            .   The function calculates and returns the minimal up-right bounding rectangle for the specified point set or
            .   non-zero pixels of gray-scale image.
            .   
            .   @param array Input gray-scale image or 2D point set, stored in std::vector or Mat.
            """
        
    2. Draw a rectangle cv2.rectangle(img, pt1, pt2, color, thickness=None, lineType=None, shift=None) or cv2.drawContours()

      • notes:

        """
            rectangle(img, pt1, pt2, color[, thickness[, lineType[, shift]]]) -> img
            .   @brief Draws a simple, thick, or filled up-right rectangle.
        ... 
            .   @param img Image.
            .   @param pt1 Vertex of the rectangle.
            .   @param pt2 Vertex of the rectangle opposite to pt1 .
            .   @param color Rectangle color or brightness (grayscale image).
            .   @param thickness Thickness of lines that make up the rectangle. Negative values, like #FILLED,
            .   mean that the function has to draw a filled rectangle.
            .   @param lineType Type of the line. See #LineTypes
            .   @param shift Number of fractional bits in the point coordinates.
        ...
            """
        
  2. Minimum enclosing circle

    1. Generate minimum enclosing circle positioning information from the contour cv2.minEnclosingCircle(points)

      • notes:

        """
            minEnclosingCircle(points) -> center, radius
            .   @brief Finds a circle of the minimum area enclosing a 2D point set.
        ...
            .   @param points Input vector of 2D points, stored in std::vector\<\> or Mat
            .   @param center Output center of the circle.
            .   @param radius Output radius of the circle.
            """
        
    2. Draw a circle cv2.circle(img, center, radius, color, thickness=None, lineType=None, shift=None)

      • notes:

        """
            circle(img, center, radius, color[, thickness[, lineType[, shift]]]) -> img
            .   @brief Draws a circle.
         ...
            .   @param img Image where the circle is drawn.
            .   @param center Center of the circle.
            .   @param radius Radius of the circle.
            .   @param color Circle color.
            .   @param thickness Thickness of the circle outline, if positive. Negative values, like #FILLED,
            .   mean that a filled circle is to be drawn.
            .   @param lineType Type of the circle boundary. See #LineTypes
            .   @param shift Number of fractional bits in the coordinates of the center and in the radius value.
            """
        
  3. Optimal fitting ellipse

    1. Generate the optimal fitting ellipse positioning information cv2.fitEllipse(points) according to the contour

      • notes:

        """
            fitEllipse(points) -> retval
            .   @brief Fits an ellipse around a set of 2D points.
        ...
            .   @param points Input 2D point set, stored in std::vector\<\> or Mat
            """
        
    2. Draw ellipse cv2.ellipse(img, center, axes, angle, startAngle, endAngle, color, thickness=None, lineType=None, shift=None)

      • notes:

        """
            ellipse(img, center, axes, angle, startAngle, endAngle, color[, thickness[, lineType[, shift]]]) -> img
            .   @brief Draws a simple or thick elliptic arc or fills an ellipse sector.
        ...
            .   @param img Image.
            .   @param center Center of the ellipse.
            .   @param axes Half of the size of the ellipse main axes.
            .   @param angle Ellipse rotation angle in degrees.
            .   @param startAngle Starting angle of the elliptic arc in degrees.
            .   @param endAngle Ending angle of the elliptic arc in degrees.
            .   @param color Ellipse color.
            .   @param thickness Thickness of the ellipse arc outline, if positive. Otherwise, this indicates that
            .   a filled ellipse sector is to be drawn.
            .   @param lineType Type of the ellipse boundary. See #LineTypes
            .   @param shift Number of fractional bits in the coordinates of the center and values of axes.
        ...
            """
        
  4. Approximating polygon

    1. Generate approximation accuracy based on contour perimeter

      epsilon = 0.005 * cv2.arcLength(contours[0], True)
      
    2. Generate polygon positioning information cv2.approxPolyDP(curve, epsilon, closed, approxCurve=None) according to contour and accuracy

      • notes:

        """
            approxPolyDP(curve, epsilon, closed[, approxCurve]) -> approxCurve
            .   @brief Approximates a polygonal curve(s) with the specified precision.
        ...
            .   @param curve Input vector of a 2D point stored in std::vector or Mat
            .   @param approxCurve Result of the approximation. The type should match the type of the input curve.
            .   @param epsilon Parameter specifying the approximation accuracy. This is the maximum distance
            .   between the original curve and its approximation.
            .   @param closed If true, the approximated curve is closed (its first and last vertices are
            .   connected). Otherwise, it is not closed.
            """
        
    3. Draw polygons using cv2.drawContours()

Contour information

  1. Contour perimeter cv2.arcLength(curve, closed)
  2. Contour area cv2.contourArea(contour, oriented=None)

Image preprocessing in AI

The purpose of image preprocessing is to make the image data more suitable for training AI models

Data enhancement (extended dataset)

Zooming, stretching, noise addition, flipping, rotation, translation, cropping, contrast adjustment, channel transformation (see the sketch below)

Increase the number of images and enhance image quality
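
A hedged sketch of a few of these augmentations with OpenCV (the angles and offsets are illustrative assumptions):

import cv2
import numpy as np

im = cv2.imread('data/lena.jpg')
h, w = im.shape[:2]

# Flip horizontally
flipped = cv2.flip(im, 1)

# Rotate 15 degrees around the center
M_rot = cv2.getRotationMatrix2D(center=(w / 2, h / 2), angle=15, scale=1.0)
rotated = cv2.warpAffine(im, M_rot, (w, h))

# Translate 20 px right and 10 px down
M_shift = np.float32([[1, 0, 20], [0, 1, 10]])
shifted = cv2.warpAffine(im, M_shift, (w, h))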
